enzoampil / fastquant
fastquant — Backtest and optimize your ML trading strategies with only 3 lines of code!
License: MIT License
I can add a financial network analysis example, following Dr. Legara's notebook.
This makes it a lot easier to perform EDA on numerical columns, e.g. age.
Plotting data takes a lot of steps and this can be painful for beginners:
from fastquant import get_stock_data
import pandas as pd
df = get_stock_data('JFC', '2018-01-01', '2019-01-01', format='dc')
# set dt as a datetime object
df['dt'] = pd.to_datetime(df.dt)
# set dt as the index
df = df.set_index('dt')
df.plot()
I propose to make the dt column a pandas datetime object and set it as the index by default in get_stock_data, so the above code can be simplified into:
from fastquant import get_stock_data
df = get_stock_data('JFC', '2018-01-01', '2019-01-01')
df.plot()
Currently, the backtest
function only takes one set of arguments at a time. In practice, we actually want to run results on multiple combinations (e.g. multiple possible values for the slow and fast moving averages).
Utilize cerebro.optstrategy
to perform the above (reference)
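To make the idea concrete, here is a minimal sketch of how a parameter grid could be expanded into every combination before handing each set to a backtest run. The grid keys (fast_period, slow_period) and their values are illustrative assumptions, not the current backtest API; cerebro.optstrategy performs the equivalent expansion internally in backtrader.

```python
from itertools import product

# Hypothetical parameter grid for an SMA crossover strategy
# (names and values are assumptions, not the actual backtest signature).
param_grid = {
    "fast_period": [10, 15, 20],
    "slow_period": [30, 40, 50],
}

# Expand the grid into every (fast, slow) combination, the same way
# cerebro.optstrategy(Strategy, fast_period=..., slow_period=...) would.
keys = list(param_grid)
combos = [dict(zip(keys, values)) for values in product(*param_grid.values())]

print(len(combos))  # 9 combinations (3 x 3)
print(combos[0])    # {'fast_period': 10, 'slow_period': 30}
```

Each dict in combos would then be passed to one backtest run, and the results collected for comparison.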
Problem: it takes a while to pull disclosures from a lot of companies
Solution: store the data into a DB for easy access
I think this should eventually live in a tutorial section on the Sphinx webpage, but this should be good for now. Related to #24 .
The current implementation is naive since it triggers a buy or sell the moment the band is exceeded. An implementation with a configurable allowance would be preferable.
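As a rough illustration of the configurable allowance, the helper below only signals once the price has exceeded a band by a given fraction, rather than the instant the band is touched. This is a standalone sketch, not fastquant's actual strategy code, and the 1% default is an arbitrary assumption.

```python
def band_signal(price, upper, lower, allowance=0.01):
    """Return 'sell', 'buy', or None. The price must exceed the band by a
    configurable fractional allowance before a signal fires, instead of
    triggering the moment the band is touched. (Illustrative helper only.)"""
    if price > upper * (1 + allowance):
        return "sell"  # price broke above the upper band by more than the allowance
    if price < lower * (1 - allowance):
        return "buy"   # price broke below the lower band by more than the allowance
    return None

print(band_signal(101.0, 100.0, 90.0))                 # None: within the 1% allowance
print(band_signal(102.0, 100.0, 90.0))                 # sell
print(band_signal(88.0, 100.0, 90.0, allowance=0.05))  # None: within the 5% allowance
```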
I installed psequant using pip install psequant and it worked alright.
When you pushed an update to get_company_disclosures(), I tried pip install psequant --upgrade, but it didn't seem to have the latest commit.
So I tried to clone your repo and install locally in develop mode using pip install -e ., so I only have to git pull to pick up recent updates.
Doing so threw an error:
Building wheel for psequant (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/jp/miniconda3/envs/py3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-mhaabhka/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-mhaabhka/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-tn2238pz --python-tag cp37
cwd: /tmp/pip-req-build-mhaabhka/
Complete output (11 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/psequant
copying psequant/psequant.py -> build/lib/psequant
copying psequant/__init__.py -> build/lib/psequant
running build_scripts
creating build/scripts-3.7
error: [Errno 21] Is a directory: 'psequant'
----------------------------------------
ERROR: Failed building wheel for psequant
I figured that the path to the script in setup.py should be psequant/psequant
instead. Since that script is still a template, I recommend removing it from setup.py for the meantime.
The current dataset includes weekends and non-trading days, tagging them as rows with all-N/A values. The data could be treated either by removing the non-trading days, or by filling those days with the last value.
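Both treatments are one-liners in pandas. The toy frame below is a made-up example (not the real dataset), with a weekend tagged as all-NaN rows the same way the current data is:

```python
import pandas as pd

# Toy close prices for Fri 2020-01-03 through Tue 2020-01-07;
# Saturday and Sunday are NaN, mimicking the non-trading-day rows.
idx = pd.date_range("2020-01-03", "2020-01-07", freq="D")
df = pd.DataFrame({"close": [10.0, None, None, 11.0, 12.0]}, index=idx)

dropped = df.dropna()  # option 1: remove the non-trading days entirely
filled = df.ffill()    # option 2: carry the last close forward over them

print(len(dropped))                        # 3 trading days remain
print(filled.loc["2020-01-04", "close"])   # 10.0, Friday's close carried over
```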
We need to implement smart caching for get_pse_data
, similar to load_disclosures
. The former only checks for an exact match on the filename of saved stock data before loading; otherwise, it re-downloads everything from scratch even if there's only a 1-day difference between the old and new query. The latter finds any saved disclosures data for that company and appends older and/or newer data depending on the query, so no data is downloaded twice.
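The core of the smart-cache behavior is a small piece of date arithmetic: given the range already cached and the range being queried, download only the uncovered gaps. A sketch under that assumption (the function name and signature are made up, and the real get_pse_data cache is keyed by filename today):

```python
from datetime import date, timedelta

def missing_ranges(cache_start, cache_end, q_start, q_end):
    """Return only the date ranges NOT already covered by the cache, so no
    data is downloaded twice. (Illustrative sketch, not fastquant code.)"""
    gaps = []
    if q_start < cache_start:  # query begins before the cache: need older data
        gaps.append((q_start, cache_start - timedelta(days=1)))
    if q_end > cache_end:      # query ends after the cache: need newer data
        gaps.append((cache_end + timedelta(days=1), q_end))
    return gaps

# Cache covers all of 2018; the query extends one day past it on each side,
# so only those two single days would be re-downloaded.
gaps = missing_ranges(date(2018, 1, 1), date(2018, 12, 31),
                      date(2017, 12, 31), date(2019, 1, 1))
print(gaps)  # two one-day gaps: 2017-12-31 and 2019-01-01
```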
We can use the following sources as alternative data sources:
Problem:
The current Bollinger Band Strategy is naive, since it simply treats the upper and lower bands as resistance and support lines, respectively.
Solution:
We can make this more robust by adding a trend dimension that, e.g., recommends a buy only when a current uptrend additionally exists, and vice versa.
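One minimal way to sketch the trend dimension: gate the raw band signal on the slope of a short simple moving average, passing a buy through only in an uptrend and a sell only in a downtrend. The function below is illustrative only (not fastquant's strategy API), and the 5-bar trend window is an arbitrary assumption:

```python
def trend_filtered_signal(closes, band_signal, trend_window=5):
    """Pass a Bollinger 'buy' through only when recent closes are trending up,
    and a 'sell' only when trending down. Trend is approximated by comparing
    the latest SMA to the previous bar's SMA. (Illustrative sketch only.)"""
    if len(closes) < trend_window + 1:
        return None  # not enough history to estimate a trend
    sma_now = sum(closes[-trend_window:]) / trend_window
    sma_prev = sum(closes[-trend_window - 1:-1]) / trend_window
    uptrend = sma_now > sma_prev
    if band_signal == "buy" and uptrend:
        return "buy"
    if band_signal == "sell" and not uptrend:
        return "sell"
    return None  # band touch against the prevailing trend: ignore it

rising = [10, 11, 12, 13, 14, 15]
print(trend_filtered_signal(rising, "buy"))   # buy
print(trend_filtered_signal(rising, "sell"))  # None: the uptrend blocks the sell
```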
Disclosures can be found on PSE Edge.
Please fix the bug report template below, specifically the second-to-last line:
- lightkurve version (e.g. 1.0b6): <-- We need to take note of fastquant versioning
to something like
- fastquant version (e.g. 1.0)
It'll be better if the version can be printed via the standard means:
import fastquant
fastquant.__version__
but I'm not sure how. Maybe adding a version.py
and importing it in __init__
?
import fastquant
# insert code here ...
Some company disclosures have attachment(s) that provide more details, e.g.
https://edge.pse.com.ph/openDiscViewer.do?edge_no=d01aed5ca14a1ab20de8473cebbd6407
It is not urgent, but it would be useful to scrape these in the future.
Newspaper seems like a good tool to scrape and curate articles related to PSE-listed stocks.
I can imagine using a different tool to search for recent news related to a company, then using newspaper
to scrape each article.
What do you think?
Edit:
Add nlp
module to analyze output of disclosures
module
Examples:
fastquant/fastquant/strategies.py
Line 536 in 36fc7ca
After running backtest, it would be better to return the cerebro object in case the user needs to access its properties (at least I do).
@enzoampil Do you agree? I will do this if so.
Check why investagrams returns empty html:
fastquant/fastquant/disclosures.py
Line 584 in 9286b55
I'm unable to run the "three line code" from lesson 1
Initially, installed fastquant in Anaconda terminal using
pip install git+git://github.com/enzoampil/fastquant.git
Proceeded to open my Jupyter notebook to run the following code:
from fastquant import get_pse_data
df = get_pse_data('JFC', '2018-01-01', '2019-01-01')
df.head()
This error showed:
AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
      1 from fastquant import get_pse_data
----> 2 df = get_pse_data('JFC', '2018-01-01', '2019-01-01')
      3 df.head()

C:\ProgramData\Anaconda3\lib\site-packages\fastquant\fastquant.py in get_pse_data(symbol, start_date, end_date, save, max_straight_nones, format)
    404             )
    405         else:
--> 406             cache = get_pse_data_cache(symbol=symbol)
    407             cache = cache.reset_index()
    408             # oldest_date = cache["dt"].iloc[0]

C:\ProgramData\Anaconda3\lib\site-packages\fastquant\fastquant.py in get_pse_data_cache(symbol, cache_fp, update, verbose)
    303         print("Loaded: ", cache_fp)
    304     errmsg = "Cache does not exist! Try update=True"
--> 305     assert cache_fp.exists(), errmsg
    306     df = pd.read_csv(cache_fp, index_col=0, header=[0, 1])
    307     df.index = pd.to_datetime(df.index)

AssertionError: Cache does not exist! Try update=True
Downloaded the "data" folder from the fastquant repo and pasted it in my installation directory:
C:\ProgramData\Anaconda3\Lib\site-packages\fastquant
Restarted my Jupyter notebook and re-ran:
from fastquant import get_pse_data
df = get_pse_data('JFC', '2018-01-01', '2019-01-01')
df.head()
Same error message.
Current sample in README still reflects the old DCV default.
As seen in the edge query form, only the first 50 results are shown by default. We need to fix get_company_disclosures
to fetch the entire result set.
@enzoampil can you check this? I really don't know how to solve this.
Test that each of the strategies in the module (classes inheriting from BaseStrategy
) will work if they're called by the backtest
function.
Quantopian, mlfinlab, etc. offer advanced quant tools. Make sure to check which strategies/functionalities can be readily implemented here.
The old phisix API endpoint:
http://phisix-api2.appspot.com/stocks/
has been replaced with a new one:
http://1.phisix-api.appspot.com/stocks/
The fix is implemented in #58 .
For companies that have relevant hashtags, we can listen to the tweets about them which can serve as a financial indicator.
PSE Edge / Investagrams may change their APIs at any time. We should archive all disclosures information in a private database.
Will create functions that pull from and listen to specified PSE-related accounts.
Examples are the official PSE and stock brokerage accounts:
https://twitter.com/phstockexchange?lang=en
https://twitter.com/colfinancial?lang=en
https://twitter.com/firstmetrosec?lang=en
https://twitter.com/BPItrade
https://twitter.com/Philstocks_
https://twitter.com/itradeph
https://twitter.com/UTradePH
https://twitter.com/wealthsec
For COL, listening to #COLResearch specifically will filter to the analyst reports.
Note that acronyms preceded by a "$" are stock tickers, so we can use this to identify the company that a tweet is about (e.g. $MWC).
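Extracting those "$"-prefixed tickers from a tweet is a one-line regex. The sample tweet is made up, and the assumed ticker format (2-5 uppercase letters) is an assumption to be refined against actual PSE symbols:

```python
import re

# Hypothetical tweet; the $-prefixed acronyms are PSE stock tickers.
tweet = "Strong results from $MWC and $JFC today, per #COLResearch."

# Cashtags: a '$' followed by 2-5 uppercase letters (assumed ticker format).
tickers = re.findall(r"\$([A-Z]{2,5})\b", tweet)
print(tickers)  # ['MWC', 'JFC']
```

This gives us the company each tweet is about, which can then be joined against the disclosures data.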
We may need to add code formatter and checker tests to our contributing procedure, e.g. using black and flake8, similar to this.
get_company_disclosures()
currently returns a dataframe. It would be useful to have another function that downloads and parses its contents, e.g. a particular press release, by supplying, say, the document's Circular Number. Then it would be easy to run sentiment analysis, etc.
The URL format of company disclosures on PSE Edge is cryptic:
e.g. https://edge.pse.com.ph/openDiscViewer.do?edge_no=8571ce07732abd9643ca035510b6ec2b
Posting a request to https://edge.pse.com.ph/announcements/form.do seems to return only tables, not links to the actual document.
What do you think?
A link to examples, e.g. on nbviewer, can be added to the README.
Please read through the references below and feel free to ask for help in the issues or slack channel!
Tutorials on the fastquant website:
https://enzoampil.github.io/fastquant-blog/
fastquant example notebooks:
https://nbviewer.jupyter.org/github/enzoampil/fastquant/tree/master/examples/
Intro to backtrader Reference:
https://algotrading101.com/learn/backtrader-for-backtesting/
Quickstart backtrader guide:
ERROR: fastquant 0.1.2.5 has requirement pandas==0.25.3, but you'll have pandas 1.0.0 which is incompatible.
I just upgraded to pandas 1.0 and found that the fastquant package is not yet compatible with it. Not a severe issue, since I can just roll back to version 0.25, and pandas 1.0 is very new.
Some notebooks are mature enough to be turned into a blog article. I propose to add a blog page in our gh-pages.
Here is a running list of articles:
get_stock_data(..., format='dcv')
throws an error / quits prematurely when the start_date
input is much earlier than when data is available. This happens because we set 10 consecutive NaNs as the threshold for escaping the loop. The ideal solution is to figure out whether the date inputs are wrong and prescribe the correct date, or just return the available data even if it is far from either date, without raising an exception. I prefer the first solution, since it notifies the user that the requested data on that date is unavailable.
Note: We can also refer to the listing date column as the earliest possible start_date
input, especially for companies newly listed in the last 5 years. Some companies were listed decades ago, but why is stock data available only recently?
from fastquant import get_stock_data
df = get_stock_data("MAH", start_date="2018-01-01", end_date="2020-01-01", format="dcv")
throws error, whereas the one below does not
df = get_stock_data("MAH", start_date="2019-01-01", end_date="2020-01-01", format="dcv")
I found from the data cache that MAH has data only from "2018-06-04". A probable fix is to assert that start_date
is not earlier than the first date entry in the cache, if it exists, and additionally add a note telling the user to use later dates when the exception arises. We may need to add a custom exception handler for this.
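The proposed fix could be sketched as a fail-fast check against the cache's earliest date, raising a custom exception with a helpful message instead of looping over NaNs. The exception class, function name, and signature below are all assumptions for illustration, not the actual fastquant implementation:

```python
from datetime import date

class DataUnavailableError(Exception):
    """Hypothetical custom exception for out-of-range date queries."""

def check_start_date(start_date, earliest_available):
    """Fail fast with a helpful message when start_date predates the
    earliest cached data. (Illustrative sketch, not fastquant code.)"""
    if start_date < earliest_available:
        raise DataUnavailableError(
            f"No data before {earliest_available}; "
            f"please use a later start_date."
        )
    return start_date

# MAH only has cached data from 2018-06-04:
try:
    check_start_date(date(2018, 1, 1), date(2018, 6, 4))
except DataUnavailableError as e:
    print(e)  # No data before 2018-06-04; please use a later start_date.
```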