stock-sector-analysis's Introduction

stock-clustering

Python project to determine stocks that exhibit similar price action.

Description

Data: OHLCV (open, high, low, close, volume) price data is collected from the SimFin API (free version) using
asynchronous requests, so that many stocks can be fetched concurrently and responses arrive faster. Prices are
pulled for the stocks already stored in the PostgreSQL database. The data is first saved as a CSV file and later
loaded into a SQL table. The database also contains other tables, such as stock symbols and ETF composition.
The data is then read into a pandas DataFrame using dynamic SQL (psycopg2), and an unsupervised clustering
algorithm is applied to it.
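
The concurrent collection step described above can be sketched roughly as follows. This is only an illustration of the asyncio pattern, not the repository's code: `fetch_prices` is a hypothetical stand-in that sleeps instead of calling the SimFin API, and the `max_concurrent` limit is an assumed way to stay within an API rate limit.

```python
import asyncio


async def fetch_prices(ticker):
    """Hypothetical stand-in for one API request; it sleeps instead of using the network."""
    await asyncio.sleep(0.01)
    return {"ticker": ticker, "rows": []}


async def fetch_all(tickers, max_concurrent=5):
    """Fetch all tickers concurrently, with at most max_concurrent requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(ticker):
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch_prices(ticker)

    # gather() returns results in the same order as the input tickers.
    return await asyncio.gather(*(bounded(t) for t in tickers))


if __name__ == "__main__":
    results = asyncio.run(fetch_all(["AAPL", "MSFT", "GOOG"], max_concurrent=2))
    print([r["ticker"] for r in results])
```

In a real script the body of `fetch_prices` would be replaced by an HTTP call made with an asynchronous client library.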

1. Setting up Postgres Instance

Pull the official Postgres image with Docker:

docker pull postgres

Start a Postgres container:

docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres

The Postgres instance name and password set here are needed later to connect to the database with psycopg2.
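
As a rough illustration of that connection step, the sketch below collects the connection settings and hands them to `psycopg2.connect`. It assumes the container's port 5432 is reachable from the host (for example by adding `-p 5432:5432` to the `docker run` command, which the command shown above does not do) and the image's default user and database, both named `postgres`. The function names are hypothetical and not taken from the repository.

```python
def connection_params(password, host="localhost", port=5432,
                      dbname="postgres", user="postgres"):
    """Return keyword arguments suitable for psycopg2.connect()."""
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
    }


def connect(params):
    """Open a connection; psycopg2 is imported here so the helper above works without it."""
    import psycopg2
    return psycopg2.connect(**params)


if __name__ == "__main__":
    # Password taken from the docker run example; use your own value in practice.
    conn = connect(connection_params(password="mysecretpassword"))
    print(conn.closed)  # 0 while the connection is open
    conn.close()
```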

2. Setting up API

Create accounts with SimFin and Alpha Vantage to obtain API keys: https://simfin.com/data/api and
https://www.alphavantage.co/

3. Setup config

Enter the Postgres database credentials and the API keys in the config.py template file in the repository.
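
For orientation, a config module of this kind might look like the sketch below. Every setting name here is an assumption made for illustration, as is the default database name `etfdb`; the template in the repository may use different names, so follow the template where they differ. Reading values from environment variables is one way to keep secrets out of version control.

```python
# Hypothetical layout for config.py; the names are illustrative only.
import os

DB_HOST = os.environ.get("DB_HOST", "localhost")
DB_PORT = int(os.environ.get("DB_PORT", "5432"))
DB_NAME = os.environ.get("DB_NAME", "etfdb")  # assumed name, check etfdb.sql
DB_USER = os.environ.get("DB_USER", "postgres")
DB_PASSWORD = os.environ.get("DB_PASSWORD", "")

SIMFIN_API_KEY = os.environ.get("SIMFIN_API_KEY", "")
ALPHAVANTAGE_API_KEY = os.environ.get("ALPHAVANTAGE_API_KEY", "")

# Settings that must be non-empty before the populate scripts can run.
REQUIRED = ("DB_PASSWORD", "SIMFIN_API_KEY", "ALPHAVANTAGE_API_KEY")


def missing_settings(settings):
    """Return the names of required settings that are absent or empty."""
    return [name for name in REQUIRED if not settings.get(name)]
```

Calling `missing_settings(globals())` at the bottom of the module gives a quick check that nothing was left blank.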

4. Setting the Postgres database

Create the database and the required tables by running the script etfdb.sql, or by copying and pasting its statements into the terminal.

5. Populating the tables

Run the scripts populate-stocks.py and populate-timeseries.py to pull all stocks and their price data for the last ten years using the APIs. This takes a few hours to complete, depending on whether the account associated with the API key is free or premium.
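
The CSV-to-table step that these scripts perform can be pictured with the minimal sketch below. It is not the repository's code: the table name `prices`, its columns, and the CSV header names are all assumptions chosen for illustration. The insert uses psycopg2's `%s` placeholders so that values are passed as parameters rather than formatted into the SQL string.

```python
import csv
import io
from datetime import date

# Hypothetical table and column names.
INSERT_SQL = (
    "INSERT INTO prices (symbol, dt, open, high, low, close, volume) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s)"
)


def rows_from_csv(text_stream, symbol):
    """Yield one tuple per CSV row, converted to the types the table expects."""
    for r in csv.DictReader(text_stream):
        yield (
            symbol,
            date.fromisoformat(r["date"]),
            float(r["open"]),
            float(r["high"]),
            float(r["low"]),
            float(r["close"]),
            int(r["volume"]),
        )


def load_rows(conn, rows):
    """Insert the rows through an open psycopg2 connection and commit."""
    with conn.cursor() as cur:
        cur.executemany(INSERT_SQL, list(rows))
    conn.commit()


if __name__ == "__main__":
    sample = io.StringIO(
        "date,open,high,low,close,volume\n"
        "2020-01-02,10.0,11.0,9.5,10.5,1000\n"
    )
    print(list(rows_from_csv(sample, "ABC")))
```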

6. Stock cluster analysis

The script for training the unsupervised clustering model is train.py. The script app.py is the front end and can be run locally with the following command:

streamlit run app.py

The console displays the address where the web app is hosted: localhost followed by a port (e.g. http://localhost:8502/).

[Image: homepage of the web app]

Here we can select the date range for the cluster analysis and choose the model.

Once the data is pulled from the database, the app shows the first 5 rows.

Click on start training to run the model. Once it completes, the clusters can be viewed in an interactive 3D plot.
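
To give a feel for what "similar price action" can mean numerically, the sketch below groups symbols whose daily returns are highly correlated. It is a deliberately naive, standard-library-only illustration and not the model in train.py, whose algorithm and features are not described here. The function names and the 0.8 threshold are assumptions.

```python
from math import sqrt


def daily_returns(closes):
    """Simple returns between consecutive closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = sqrt(vx * vy)
    return cov / denom if denom else 0.0


def group_by_correlation(closes_by_symbol, threshold=0.8):
    """Greedy grouping: a symbol joins the first group whose first member
    it correlates with at or above the threshold, else it starts a new group."""
    returns = {s: daily_returns(c) for s, c in closes_by_symbol.items()}
    groups = []
    for symbol, r in returns.items():
        for group in groups:
            if pearson(returns[group[0]], r) >= threshold:
                group.append(symbol)
                break
        else:
            groups.append([symbol])
    return groups


if __name__ == "__main__":
    prices = {
        "A": [100, 102, 101, 105, 107],
        "B": [50, 51, 50.5, 52.5, 53.5],   # moves in step with A
        "C": [100, 98, 99, 95, 93],        # moves against A
    }
    print(group_by_correlation(prices))
```

A real clustering model would typically work on many more observations and features, and would not depend on the order in which symbols are visited the way this greedy pass does.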
