lussierc / stockstoryscraper

The Stock Story Scraper (SSS) is a text mining tool that gathers a stock's relevant articles and performs extensive sentiment analysis on them. Fall 2020 Independent Study - Allegheny College.

Python 94.92% Dockerfile 0.78% Shell 4.06% Batchfile 0.24%
predict-stock-swings text-mining stock-news sentiment stocks

stockstoryscraper's Issues

Write Report

Draft a basic report/accompanying document for the project that describes the tool, the project motivations, the work completed, future work that could be completed, the accuracy of the tool, and possible shortcomings of the tool.

Finalize Program Run Methods

Allow users to run the program using:

  • Regular Python
  • Pipenv
  • Docker

Include all the commands necessary in the README.

Clean up CML Printing

The way the CML is currently printing content needs to be improved. This would include:

  • Removing extraneous print statements of program data that the user should not see.
  • Cleaning up prompts and other print statements.
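One possible way to approach this cleanup, as a minimal sketch: send internal diagnostics through Python's standard `logging` module so they stay hidden by default, and reserve returned or printed strings for what the user should actually see. The logger name `sss` and the functions `configure_output` and `summarize_articles` are hypothetical and are not taken from the project's code.

```python
import logging

logger = logging.getLogger("sss")  # hypothetical logger name


def configure_output(verbose: bool = False) -> None:
    """Show debug details only when the user explicitly asks for them."""
    level = logging.DEBUG if verbose else logging.WARNING
    logging.basicConfig(level=level, format="%(levelname)s: %(message)s")


def summarize_articles(articles: list) -> str:
    """Return the user-facing summary; raw program data goes to the logger."""
    logger.debug("raw article data: %r", articles)  # hidden unless verbose
    return f"Analyzed {len(articles)} article(s)."
```

With this split, extraneous data dumps become `logger.debug` calls that can be switched on for troubleshooting rather than deleted outright.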

Finalize Documentation

Finalize the README with information about the tool, how to use it, and its results/accuracy.

Include the run commands for the run methods discussed in #9.

Testing

Implement a basic test suite using Pytest.

Tests could include:

  • Testing that different articles are properly analyzed
  • Testing that calculations are correct using sample data for results
  • Testing that sample articles can be properly scraped
  • More to be added to this issue/the PR for testing as time elapses
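The "calculations are correct using sample data" item could look roughly like the sketch below. `average_sentiment` is a hypothetical stand-in for one of the tool's results calculations, and returning 0.0 for an empty list is an assumption made only for illustration. Pytest collects any function whose name starts with `test_`, so plain `assert` statements are enough.

```python
import math


def average_sentiment(scores):
    """Hypothetical stand-in for one of the tool's results calculations."""
    if not scores:
        return 0.0  # assumed behavior when no articles were scraped
    return sum(scores) / len(scores)


def test_average_sentiment_with_sample_data():
    # Hand-picked sample scores so the expected mean is easy to verify.
    assert math.isclose(average_sentiment([0.5, -0.5, 1.0]), 1 / 3)


def test_average_sentiment_with_no_articles():
    assert average_sentiment([]) == 0.0
```

Saved in a file named like `test_calculations.py`, these tests would run with the `pytest` command.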

Create .gitignore

I need to create a .gitignore file that ignores things like CSV files, pycache files, .DS_Store files, and more.

It should exclude all files that are not necessary on the remote repository, to save space and reduce the time needed for new users to clone the repository.

Add More Results Calculations

Need to improve & refactor current calculations in addition to adding a few new ones.

Possible calculations to be added:

  • Create instability status calculation
  • Allow users to scrape multiple date ranges at once and then compare them
  • Create "Buy Status" calculation which will advise users to either buy, sell, or pass/hold on trading a stock
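As a rough sketch of the "Buy Status" idea, assuming the tool produces an average sentiment score between -1 and 1 for a stock: the function name `buy_status` and the 0.25 / -0.25 thresholds are hypothetical placeholders, not values from the project, and would need tuning against real results.

```python
def buy_status(avg_sentiment: float,
               buy_at: float = 0.25,
               sell_at: float = -0.25) -> str:
    """Map an average sentiment score to 'buy', 'sell', or 'hold'.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if avg_sentiment >= buy_at:
        return "buy"
    if avg_sentiment <= sell_at:
        return "sell"
    return "hold"  # sentiment too close to neutral to advise a trade
```

Keeping the thresholds as parameters makes it easy to test the calculation with sample data and to adjust it later.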

Fix Version Issues

Define versions for Python packages in the requirements.txt file and fix issues in the web app that came up with the introduction of a new streamlit version.

Refactor Code

As I near the completion of the basic version of the tool, I will need to refactor the code of the project to make it more efficient, syntactically correct, easier to test, and readable for others.

With this, there are a few things I know I need to refactor now:

  • CML code
  • Create standalone interface (contains options to run CML or UI)
  • Refactor results calculation code
  • etc.
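The standalone interface item could be sketched with the standard library's `argparse`, as below. The `--mode` flag, the program name `sss`, and the `run_cml` / `run_ui` placeholders are all hypothetical; in the real tool they would call the existing command-line and Streamlit entry points.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a launcher parser with a single, hypothetical --mode option."""
    parser = argparse.ArgumentParser(
        prog="sss", description="Stock Story Scraper launcher (sketch)"
    )
    parser.add_argument(
        "--mode",
        choices=["cml", "ui"],
        default="cml",
        help="run the command-line tool or the web interface",
    )
    return parser


def run_cml() -> str:
    return "running command-line interface"  # placeholder for the real CML


def run_ui() -> str:
    return "running web interface"  # placeholder for the real UI


def main(argv=None) -> str:
    """Parse arguments and dispatch to the selected interface."""
    args = build_parser().parse_args(argv)
    handlers = {"cml": run_cml, "ui": run_ui}
    return handlers[args.mode]()
```

Dispatching through a dictionary keeps the launcher small and lets each interface be tested on its own.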

Implement Interface

Once I finalize the backend, which includes calculating some more results and outputting/inputting them, I will need to implement my Streamlit interface.

This interface will include a main/welcome page which asks the user to input the stocks and websites they want to use. Once this information is downloaded, an overview of each stock will be displayed on the screen, giving insights into its overall "health". Users can then go to different pages which display the articles and their information for each stock. More info from the results will also be displayed.

Improve Display of Results in Web App

Need to improve the display of metrics in the web app. Some of the graphs and the flow of information just don't feel right.

Will add more comments in the future about what work this issue would entail.

Investigate Refactoring Opportunities

Investigate what areas of the code (and project as a whole) can & should be refactored. Make tickets for these areas; these improvements will be included in Version 2.

I already looked into it a bit when creating #12. Start with those code areas first & then look elsewhere. #12 will likely be closed once the other tickets are created & this research is complete.

Also, investigate new feature opportunities.

Improve Stock Well Being Prediction

Try using techniques such as neural networks or linear models to calculate the stock well-being prediction. It needs to be updated from the basic calculation it makes now.
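To make the linear model option concrete, here is a minimal sketch of a one-feature ordinary least squares fit in plain Python. It assumes a single input (for example, an average sentiment score) and a single numeric target; the function names `fit_line` and `predict` are hypothetical, and any numbers passed in are up to the caller. It is not the project's current calculation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept
```

A library such as scikit-learn would be the more practical choice once several features are involved; this version only shows the underlying calculation.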
