jerbouma / financedatabase


This is a database of 300.000+ symbols containing Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies and Money Markets.

Home Page: https://www.jeroenbouma.com/projects/financedatabase

License: MIT License

Python 32.96% Jupyter Notebook 67.04%
finance database financials analysis equities etfs funds indices futures currencies

financedatabase's Introduction

Welcome to my profile! 👋

Thank you for taking the time to visit this page!

I’m Jeroen Bouma, a Quantitative Investment Strategist at a.s.r. asset management, one of the largest Dutch insurance companies with over €120 billion AUM. I am responsible for spearheading innovative initiatives within the asset management divisions by utilizing Python, particularly in portfolio analytics and optimization. Furthermore, I conduct Asset Liability Management (ALM) and Strategic Asset Allocation (SAA) analyses, encompassing a spectrum of topics including hedging strategies, liquidity risk management, Solvency II optimization, and asset-only studies.

I joined a.s.r. asset management after working at OpenBB, an innovative open-source company transforming investment research, and PGGM, a prominent Dutch pension fund managing over €300 billion. What ties these experiences together, and reflects my own passion, is the incorporation of (advanced) Python modeling within Quantitative Finance.

My main projects are:

  • Finance Toolkit which gives access to all relevant 100+ financial ratios, indicators and performance measurements which are written down in the most simplistic way allowing for complete transparency of the calculation method. It has over 2K Stars.
  • Finance Database which features 300.000+ symbols containing Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies and Money Markets. It therefore allows you to obtain a broad overview of sectors, industries, types of investments and much more. It has over 2.5k Stars.
  • The Passive Investor which allows you to screen various ETFs and do quick comparisons for the most important metrics. It has nearly 500 Stars.

Furthermore, I maintain a website which offers a comprehensive resume with testimonials, my open-source Python projects related to financial theory (including extensive examples and documentation), all of my public speaking events and the conference talks I attended, and a complete list of the literature I’ve studied to enhance my understanding of the financial world.

financedatabase's People

Contributors

actions-user, carter4242, chinmay7016, cjaniake, colin99d, dependabot[bot], deva2580, devanshkyada27, janumalaakhilendra, jerbouma, omimakhare

financedatabase's Issues

IndexError in technical analysis of biotech ETFs during coronacrisis sample

I was exploring the code in the README and encountered an IndexError: index 4 is out of bounds for axis 0 with size 4 when I tried one of the examples. This arises when the number of unique tickers returned by the bollinger_bands.columns.get_level_values(1).unique() call exceeds the number of subplots available in the specified grid.

for ticker in bollinger_bands.columns.get_level_values(1).unique():
    name = health_care_etfs_in_biotech.loc[health_care_etfs_in_biotech.index == ticker, 'name'].iloc[0]
    
    bollinger_bands.xs(ticker, level=1, axis=1).plot(
        ax=axis[row, column],
        xlabel='',
        title=name,
        legend=False
        )

    column += 1
    if column == 3:
        row += 1
        column = 0

There are a number of ways to do this more elegantly, but I'm thinking that we just break out if row is out of bounds, since it's only for illustrative purposes. I've opened a PR that does this, but let me know if a different solution is required.

thanks!
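
One way to express the guard described above, as a minimal sketch: compute the grid positions up front and stop once the grid is full. The helper name grid_positions is hypothetical, and the 4x3 grid is an assumption (4 rows inferred from the error message, 3 columns from the column == 3 check). It is not necessarily the fix submitted in the PR.

```python
from itertools import islice, product

def grid_positions(n_items, n_rows, n_cols):
    """Return (row, column) pairs in row-major order, capped at the grid size."""
    cells = product(range(n_rows), range(n_cols))
    return list(islice(cells, n_items))

# 14 tickers on a 4x3 grid: only the first 12 get a subplot, so no IndexError.
positions = grid_positions(14, n_rows=4, n_cols=3)
print(len(positions), positions[-1])
```

In the plotting loop, iterating over zip(tickers, positions) would then replace the manual row and column counters, and any tickers beyond the grid are silently skipped.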

Typo error in contributing file

I found a small typo error in the contributing file under updating the database section. The word "tremendelously" is incorrect, and it should be "tremendously." Here's the corrected sentence:

"You can help out tremendously by updating one of the CSV files."

[DATA] - Added Kenya Agriculturals

What did you improve in the Database?
Added new data from Kenya - Agriculturals as shown here
A description of what's data you improved, deleted or added, and a bit of context (why is that).
I did this because Kenya is not on the list. I do not know how to do it well so this is a trial.
There is a local symbol (should I add a prefix?) and the ISIN code - I added 2 rows for every record.

I will work on the rest if this is ok.

Drop the files you have updated below
Did a ZIP (no passwords) because the file is more than 25Mb raw
equities.zip

[FR] List of ETFs and stocks

Hi, this is less of an issue and more of a request.

  • How do I get a list of all ETFs that are in the UK?
  • Can I get a list of which ETF is leveraged and non-leveraged?
  • Can I get a list of all stocks in an index, e.g. the S&P 500?

Thanks!

[TYPO] in https://github.com/JerBouma/FinanceDatabase/blob/main/compression/README.md?plain=1

What's the feature or data that should be improved?
I've identified a small typo in the provided text. The phrase "arrises" should be corrected to "arises."

Describe how you would like the feature or data improved
A description of what the current feature or data is vs what it would be after your suggestion.

Possibly describe the ideal way to improve this
If you have thought about how you would do it, add it here.

Additional information
Add any other information or screenshots about the feature improvement.

Can't import the package

Describe the bug

import FinanceDatabase as fd

Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    import FinanceDatabase as fd
  File "C:\Python27\lib\site-packages\FinanceDatabase\__init__.py", line 2, in <module>
    from .json_picker import select_cryptocurrencies
  File "C:\Python27\lib\site-packages\FinanceDatabase\json_picker.py", line 41
    raise ValueError(f"Not able to find any data for {cryptocurrency}.")
    ^
SyntaxError: invalid syntax

Using Python 2.7

Is it possible that json_picker.py has a 400 invalid URL?

def select_cryptocurrencies(cryptocurrency=None,
                            base_url="https://raw.githubusercontent.com/JerBouma/FinanceDatabase/master/"
                                     "Database/Cryptocurrencies/",
                            use_local_location=False, all_cryptocurrencies_json="_Cryptocurrencies"):

Or is this a me issue? lol

Thanks!
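
For context, the line flagged in the traceback is an f-string, which is syntax introduced in Python 3.6, so the Python 2.7 interpreter is the likely cause rather than the URL. The sketch below shows a version check and a str.format equivalent that parses on both interpreters. The helper names are hypothetical and are not part of the package.

```python
import sys

def supports_fstrings(version_info=sys.version_info):
    """f-strings are only valid syntax on Python 3.6 and later."""
    return tuple(version_info[:2]) >= (3, 6)

def format_missing(name):
    # str.format is accepted by Python 2.7 and 3.x alike, unlike f"...".
    return "Not able to find any data for {}.".format(name)

print(supports_fstrings((2, 7)))
print(format_missing("BTC"))
```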

add domain, state in the Equities

Is your feature request related to a problem? Please describe.
It would be nice to add domain like apple.com and the state: California, and zip code if possible

add product names

Here is an example where we extract products from summary field.
Uses NLP to detect nouns, matches nouns to a product database.
Requires some work to build up the DB, I added most common 3K items.

Company description can be scraped from different places to have wide range of products.
After we have products we can easily find precise competitors.

Extracted products for MSFT ['console', 'server', 'software', 'tablet', 'window', 'windows', 'xbox']
Post filtering can be added.

If anyone has time please pick this up. Install instruction on top.
Would make sense to start with SP500, nasdaq, then rest of the symbols.
products.zip

Adding market-cap offline

Would it be worth adding the market cap to the offline DB?
It doesn't change that often, and it would make market-share calculation much faster.

Failed downloads

I can't download anything, so I think Yahoo has changed something again. Can you (or will you) fix this?

[DATA] Fixed data for FFH.TO

What did you improve in the Database?
Data for the symbol FFH.TO was incorrect. Correct company is Fairfax Financial from Canada, not X Financial from China.

Just wondering: Is there a proper method to validate new changes submitted by users? Otherwise propensity for human induced errors would be quite high.

Drop the files you have updated below (compress to zip if file size too large)
equities.zip

Missing US symbols

Hello,
I found these US stocks missing from "united_states.json"
diff.txt

They seem to be active according to my broker.

Cheers

Hosting the database .json files locally

Describe the solution you'd like
The library always reaches out to the Internet to get the database .json files. This is not ideal for performance or for a program that might not have an Internet connection.

I'd like a way to host the database files on my own local machine or intranet, and specify my own base URL for the .json database files. Perhaps it could be an optional parameter called base_url for the select_ series of methods in FinanceDatabase. I suppose the given base URL would just integrate and override the URLs you have hard-coded in json_picker.py.

Describe alternatives you've considered
Copying the files to my local machine and manually editing json_picker.py to use it.
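
A minimal sketch of the requested override, assuming the file location is simply the base URL plus a file name. The helper build_json_location is hypothetical, the parameter name base_url follows the suggestion above, and the intranet address in the usage line is made up.

```python
DEFAULT_BASE_URL = (
    "https://raw.githubusercontent.com/JerBouma/FinanceDatabase/master/"
    "Database/Cryptocurrencies/"
)

def build_json_location(name, base_url=DEFAULT_BASE_URL):
    """Join a base URL (remote or intranet) with a JSON file name."""
    return base_url.rstrip("/") + "/" + name + ".json"

# Default location versus a self-hosted mirror (hypothetical address).
print(build_json_location("_Cryptocurrencies"))
print(build_json_location("_Cryptocurrencies", base_url="http://localhost:8000/db"))
```

Keeping the default as a keyword argument means existing calls keep working, while offline users only have to pass a different base.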

Error by: core_selection = fd.select_etfs("core_selection_degiro_filtered")

Hi,

As I run the code:
...
core_selection = fd.select_etfs("core_selection_degiro_filtered")
...
I got the following error:

JSONDecodeError Traceback (most recent call last)
~\anaconda3\lib\site-packages\FinanceDatabase\json_picker.py in select_etfs(category)
    105 request = requests.get(json_file)
--> 106 json_data = json.loads(request.text)
    107 except json.decoder.JSONDecodeError:

~\anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    347 parse_constant is None and object_pairs_hook is None and not kw):
--> 348 return _default_decoder.decode(s)
    349 if cls is None:

~\anaconda3\lib\json\decoder.py in decode(self, s, _w)
    339 if end != len(s):
--> 340 raise JSONDecodeError("Extra data", s, end)
    341 return obj

JSONDecodeError: Extra data: line 1 column 4 (char 3)

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
----> 1 core_selection = fd.select_etfs("core_selection_degiro_filtered")
      2 # core_selection = fd.select_etfs("")
      3 # core_selection

~\anaconda3\lib\site-packages\FinanceDatabase\json_picker.py in select_etfs(category)
    106 json_data = json.loads(request.text)
    107 except json.decoder.JSONDecodeError:
--> 108 raise ValueError(f"Not able to find any data for {category}.")
    109 else:
    110 try:

ValueError: Not able to find any data for core_selection_degiro_filtered.
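
One plausible reading of "Extra data: line 1 column 4 (char 3)" is that the response body was plain text starting with a three-digit number, such as "404: Not Found", rather than JSON. That is an assumption, since the actual response is not shown in the issue. The sketch below reproduces the message and shows a hypothetical helper, parse_json_body, that checks the HTTP status before parsing.

```python
import json

def parse_json_body(status_code, text, category):
    """Parse a response body, reporting HTTP failures before JSON errors."""
    if status_code != 200:
        raise ValueError(
            f"Not able to find any data for {category} (HTTP status {status_code})."
        )
    return json.loads(text)

# A plain-text "404: Not Found" body yields the same decoder message:
try:
    json.loads("404: Not Found")
except json.JSONDecodeError as error:
    print(error)  # Extra data: line 1 column 4 (char 3)
```

With requests, the two arguments would come from response.status_code and response.text.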

ValueError: Not able to find the options for equities

Describe the bug
Was trying to view all available equities, but I am getting a ValueError which follows a JSONDecodeError (see screenshot).

To Reproduce
Steps to reproduce the behavior:

import FinanceDatabase as fd

equities = fd.show_options('equities')
print(equities)

Expected behavior
show all available equities

Screenshots
image

Desktop (please complete the following information):

  • OS: OSX El Capitan

Industry the company associated with

Hi JerBouma,

Thanks for this wonderful package. It is a very useful resource indeed. I was wondering whether it is possible to provide the symbol of a company and fish out the industry/sector it is associated with. Some companies have a mix of products, so they would be listed in two different industries. It would be useful to get the associated industry given the symbol. Thanks

industry field anomalies

https://github.com/JerBouma/FinanceDatabase/raw/master/Database/Equities/Countries/United%20States/United%20States.json

"industry" field contains the following anomaly:


"Banks\u2014Diversified",
"Banks\u2014Regional",
"Beverages\u2014Brewers",
"Beverages\u2014Non-Alcoholic",
"Beverages\u2014Wineries & Distilleries",

"Drug Manufacturers\u2014General",
"Drug Manufacturers\u2014Specialty & Generic",
"Insurance\u2014Diversified",
"Insurance\u2014Life",
"Insurance\u2014Property & Casualty",
"Insurance\u2014Reinsurance",
"Insurance\u2014Specialty",
"Real Estate\u2014Development",
"Real Estate\u2014Diversified",

these could use some normalization
"Aerospace & Defense",
"Aerospace/Defense - Major Diversified",
"Aerospace/Defense Products & Services",

It's a lot faster to work on offline database, cheers!
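
For reference, \u2014 is the JSON escape for an em dash (U+2014), which a JSON parser decodes automatically. A minimal normalization sketch follows, assuming the goal is a plain " - " separator. The helper name normalize_industry is hypothetical, and merging the differently worded "Aerospace" labels would still need a hand-written mapping.

```python
import json

def normalize_industry(name):
    """Replace an em dash separator (U+2014) with a spaced ASCII hyphen."""
    return " - ".join(part.strip() for part in name.split("\u2014"))

# The escaped form below is what appears in the raw .json file.
raw = '["Banks\\u2014Diversified", "Insurance\\u2014Property & Casualty"]'
industries = [normalize_industry(item) for item in json.loads(raw)]
print(industries)
```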

[IMPROVE] Dynamic loading of Pickle data

Related: OpenBB-finance/OpenBBTerminal#4422

Not sure if to flag this as a bug or improvement (probably rather a security vulnerability). Either way, I think it's fairly dangerous to have terminals download on-demand Pickle files from random places on the Internet.

Pickle files allow for remote code execution and Python offers no sandbox mechanism. This is not so different from loading a DLL. See the warning at: https://docs.python.org/3/library/pickle.html or https://pandas.pydata.org/docs/reference/api/pandas.read_pickle.html


What's the feature or data that should be improved?

Code should move away from dynamically loading Pickle containers, e.g.:

self.data = pd.read_pickle(the_path, compression="xz")

Describe how you would like the feature improved

Same functionality using a safe serialization format.

Possibly describe the ideal way to improve this

E.g. msgpack, BSON or compressed JSON instead of Pickle.

Additional information

N/A
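
As an illustration of the compressed-JSON option, here is a minimal sketch that keeps the xz compression but swaps Pickle for JSON, using only the standard library. The function names and the sample record are made up. Loading such a file only produces plain dicts, lists, strings and numbers, so no code is executed on load.

```python
import json
import lzma

def dump_records(records):
    """Serialize JSON-compatible records to xz-compressed bytes."""
    return lzma.compress(json.dumps(records).encode("utf-8"))

def load_records(blob):
    """Inverse of dump_records: decompress, then parse the JSON text."""
    return json.loads(lzma.decompress(blob).decode("utf-8"))

records = [{"symbol": "AAPL", "sector": "Technology"}]  # made-up sample data
restored = load_records(dump_records(records))
print(restored == records)
```

The trade-off is that JSON does not preserve pandas dtypes or indexes, so a DataFrame would need to be rebuilt explicitly after loading.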

Download ETF data by countries

Just an expectation for enhancement. I am currently working on analyzing ETFs issued in North America region (roughly America). While it is possible to download these data via selenium, I wish you could add another country option in the function "select_etfs" as well as the function "select_equities". I would appreciate it if you could consider this.

TYPO ERROR

"which then results in approximately 155.000 different symbols."

The correct format for representing this number would be "155,000" with a comma instead of a period. So it should be:

"which then results in approximately 155,000 different symbols."

Please let me know if you need further assistance!

Some industry selectors fail

Describe the bug
The module seems to fail when the retrieved record length is too big!

It's working just fine when using:

fd.select_equities(country='United States', industry='Airlines')
fd.select_equities(country='United States', industry='Railroads')
fd.select_equities(country='United States', industry='Solar')
fd.select_equities(country='United States', industry='Silver')

But it fails when the following is used:

fd.select_equities(country='United States', industry='Biotechnology')

To Reproduce
Change the industry in the example from "Airlines" to "Biotechnology"

Expected behavior
N/A

Screenshots
N/A

Desktop (please complete the following information):

  • OS: Mac OS Big Sur
  • Version 11.2

Additional context
Some differences I've detected are:

Airlines: 30 of 30 completed, 2 failed, 11 shown in graph
Railroads: 27 of 27 completed, 0 failed, 12 shown in graph
Solar: 49 of 49 completed, 1 failed, 31 shown in graph
Silver: 9 of 9 completed, 1 failed, 3 shown in graph
Biotechnology: 979 of 979 completed, 18 failed, NA

Errors reported in the console
Traceback (most recent call last):
Failed in
quick_ratio = airlines_us_fundamentals[symbol]['financialData']['quickRatio']
KeyError: 'financialData'
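
The KeyError suggests that some symbols come back without a 'financialData' entry. A minimal defensive sketch, assuming the fundamentals are a nested dict keyed by symbol as in the failing line. The helper name and the sample data are made up.

```python
def get_quick_ratio(fundamentals, symbol):
    """Return the quick ratio for a symbol, or None when any level is missing."""
    return fundamentals.get(symbol, {}).get("financialData", {}).get("quickRatio")

# Made-up sample: "BBB" has no financialData, "CCC" is absent entirely.
sample = {"AAA": {"financialData": {"quickRatio": 1.2}}, "BBB": {}}
for symbol in ("AAA", "BBB", "CCC"):
    print(symbol, get_quick_ratio(sample, symbol))
```

Symbols that return None can then be skipped or reported instead of aborting the whole run.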

Display Industry set

Hi,

Thank you very much for this great module, I had been trying to do something similar for some time.

I have been trying to query your database for the equity data for a specific industry and I have found out that it can be difficult to know which industries are available without looking into the json files at FinanceDatabase/Database/Categories/.

Have I missed something here and there is already a function to display the industries ?

If not, do you think that it would be a good idea to add a feature that would list the available industries or other sets of values that are available for the different arguments?

As a solution, I would consider something that would just read the existing json files and display their contents as set objects.

Maybe just a summary of the database structure would be ideal as it would encompass everything and seems like the most helpful feature overall.

Thank you.

PS: I'm keen to help with the implementation if it helps.
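
The idea of reading the existing data and displaying the values as sets could look roughly like this. It is a sketch that assumes each record is a dict with fields such as 'sector' and 'industry'. The function name available_options and the sample records are hypothetical.

```python
def available_options(records, fields=("sector", "industry")):
    """Collect the distinct non-empty values found for each field."""
    options = {field: set() for field in fields}
    for record in records.values():
        for field in fields:
            value = record.get(field)
            if value:
                options[field].add(value)
    return options

# Made-up records keyed by symbol.
records = {
    "AAA": {"sector": "Industrials", "industry": "Airlines"},
    "BBB": {"sector": "Industrials", "industry": "Railroads"},
    "CCC": {"sector": "Energy"},
}
print(available_options(records))
```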

AttributeError is raised when calling select_etfs() with no parameter given

Describe the bug
AttributeError is raised when calling select_etfs() with no parameter given.

Expected behavior
Method returns all ETFs.

Actual behavior
The following error is raised:

Traceback (most recent call last):
File "..snip..\lib\site-packages\FinanceDatabase\json_picker.py", line 113, in select_etfs
json_data = json.loads(request.text).decode("UTF-8")
AttributeError: 'dict' object has no attribute 'decode'

Additional context
Version 0.1.9
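
The error follows from the order of operations: json.loads already returns a dict, which has no decode method. If the payload is bytes, the decode has to happen before parsing, as in the sketch below. The helper name is hypothetical. Since request.text is already a str, calling json.loads(request.text) on its own would also work.

```python
import json

def decode_json_bytes(raw_bytes):
    """Decode UTF-8 bytes to text first, then parse the text as JSON."""
    return json.loads(raw_bytes.decode("UTF-8"))

payload = b'{"SPY": {"name": "Example ETF"}}'  # made-up payload
print(decode_json_bytes(payload))
```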

[FR] Multiple criteria while searching

Hi,

Thanks for the project! I was wondering whether or not it would be a good idea to add support for multiple parameters when performing a search function. Currently, this could certainly be performed by joining multiple data frames that are returned via the search function or just by querying the entire dump at once and filter it by ourselves. But I think it would be more convenient if this feature is built in to the function?

For example, to filter for both Large Cap & Mid Cap tech listed in NASDAQ it could be something like:

equities.search(country='United States', sector='Tech', market='NASDAQ', market_cap=['Large Cap', 'Mid Cap'])

Or perhaps there is already something available, if so I apologise.
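
A rough sketch of how such a multi-criteria filter could behave, written against a plain list of dicts rather than the package itself. The function search, its keyword handling and the sample rows are assumptions for illustration. A list value means "any of these", and separate keywords are combined with "and".

```python
def search(records, **criteria):
    """Keep records matching every criterion; list values match any member."""
    def matches(record):
        for field, wanted in criteria.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if record.get(field) not in allowed:
                return False
        return True
    return [record for record in records if matches(record)]

# Made-up rows.
rows = [
    {"symbol": "AAA", "market": "NASDAQ", "market_cap": "Large Cap"},
    {"symbol": "BBB", "market": "NASDAQ", "market_cap": "Small Cap"},
    {"symbol": "CCC", "market": "NYSE", "market_cap": "Mid Cap"},
]
hits = search(rows, market="NASDAQ", market_cap=["Large Cap", "Mid Cap"])
print([row["symbol"] for row in hits])
```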

No module named 'requests'

After pip installing FinanceDatabase I attempt to import it in my Python file, but I get this error:

Traceback (most recent call last):
  File "/Users/usrname/PycharmProjects/stockScreener/main.py", line 1, in <module>
    import FinanceDatabase as fd
  File "/Users/usrname/.virtualenvs/stockScreener/lib/python3.8/site-packages/FinanceDatabase/__init__.py", line 2, in <module>
    from .json_picker import select_cryptocurrencies
  File "/Users/usrname/.virtualenvs/stockScreener/lib/python3.8/site-packages/FinanceDatabase/json_picker.py", line 1, in <module>
    import requests
ModuleNotFoundError: No module named 'requests'

[IMPROVE] remove accented characters from name column

Hi, what an awesome resource you've put together here! Thank you.

I've noticed that some of the entries have accented characters in the name column which don't appear to decode correctly (at least for me), for example:

equities.search(name="telef", exchange="MCE")

image
So neither of the following returns anything:

>>> df = equities.search(name="telefonica", exchange="MCE")
>>> df.empty
True
>>> df = equities.search(name="telefónica", exchange="MCE")
>>> df.empty
True

The following returns a load of instruments because many entries for Telefonica do not include the accented 'o':

>>> df = equities.search(name="telefonica")
>>> len(df)
20

...although it won't include the Madrid listing (and other entries that have the accented o in the name).

To ensure consistent querying I'd suggest replacing all accented characters in the name column with their unaccented equivalents.

If I get a moment (unlikely tbh) I'll contribute the change. Thought I'd raise the issue in the meantime in case anyone else runs into this and has the opportunity to make the changes.

Thanks again for the library!
Marcus
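
The suggested replacement of accented characters can be sketched with the standard library: decompose each character with Unicode normalization, then drop the combining marks. The function names are hypothetical and the sample names are illustrative. Applying the same folding to both the query and the stored name makes the match independent of accents.

```python
import unicodedata

def strip_accents(text):
    """Decompose characters (NFKD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def name_matches(query, name):
    """Case- and accent-insensitive substring match."""
    return strip_accents(query).casefold() in strip_accents(name).casefold()

print(strip_accents("Telefónica"))
print(name_matches("telefonica", "Telefónica, S.A."))
```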

[DATA] Added MBG.DE and DTG.DE

What did you improve in the Database?
Daimler company split in two operating and listed companies: Mercedes-Benz (for cars) and Daimler Trucks. DAI.DE no longer trades and the two new listed stocks are MBG.DE and DTG.DE. I added the two entries at the end of the current file.

Drop the files you have updated below (compress to zip if file size too large)
equities.zip

[DATA] How can we contribute ISIN codes ?

Hi,

you mention you'd like us to contribute with ISIN codes but how to add them? We could theoretically add a column but problem is that a single ticker can have multiple ISINs so that would create multiple columns or duplicate rows of the same ticker with different ISINs.

Thanks for your work!

PS: don't know if that can be relevant but I have an Excel file that, through powerquery, given a ticker, downloads all the historical values from yahoo finance. Problem is that now if the number of rows (tickers) changes, the query requires updating. I'm working on making it work on an array (to have it work no matter how many tickers, as any ticker is an iteration of the same cycle querying yahoo finance). The mix of your DB and this could create a very nice DB !

PPS: if you want to explore yourself here's the syntax:
https://query1.finance.yahoo.com/v7/finance/download/"&Ticker&"?period1=1214956800&period2=20037888000 where :

  • "&Ticker&" is the ticker (who would have guessed?!)
  • period1= start date in UNIX format
  • period2= end date in UNIX format
    this will return a nice csv file with the following columns:
  • Date
  • Open
  • High
  • Low
  • Close
  • Adj Close
  • Volume
  • Ticker
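
The URL pattern above can be assembled as follows. This is a sketch: the helper names are made up, the dates are interpreted as midnight UTC, and whether the endpoint still responds in this form is not guaranteed. The period values in the usage lines are the ones from the example URL.

```python
from datetime import datetime, timezone

BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"

def to_unix(year, month, day):
    """Convert a calendar date (midnight UTC) to a Unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

def build_download_url(ticker, period1, period2):
    """Fill the ticker and the start/end timestamps into the URL pattern."""
    return f"{BASE_URL}{ticker}?period1={period1}&period2={period2}"

start = to_unix(2008, 7, 2)  # the period1 value used in the example URL
print(start)
print(build_download_url("AAPL", start, 20037888000))
```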

some ETF symbols have non-alphabetic characters

Describe the bug
some ETF symbols have non alphabetic characters

To Reproduce
Steps to reproduce the behavior:

import financedatabase as fd

res = fd.select_etfs()

# Print every ETF symbol that contains a non-alphabetic character
for k, v in res.items():
    if not k.isalpha():
        print(k)

Screenshots

image

Typo Error In Readme File

What's the feature or data that should be improved?
There are many typos in the README file.

Describe how you would like the feature or data improved
A description of what the current feature or data is vs what it would be after your suggestion.

Possibly describe the ideal way to improve this
If you have thought about how you would do it, add it here.

Additional information
Add any other information or screenshots about the feature improvement.
