pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.

Home Page: https://pydata.github.io/pandas-datareader/stable/index.html

License: Other

Python 100.00%

data data-analysis dataset econdb economic-data fama-french finance financial-data fred html pandas pydata python stock-data

pandas-datareader's Issues

REL: 0.2.0

Issue to keep track of 0.2.0 release.

Let's do a release with the bug fixes, package structure, and additional features that have been added recently.

Options quotes are missing timezone

I ran into this problem with the 'Quote_Time' field when pulling options quotes using pandas.io.data.Options.get_all_data(). Although 'Quote_Time' is actually in the 'US/Eastern' timezone, it shows up in the DataFrame as timezone-naive, which in my use case was then interpreted as UTC. I would suggest setting the timezone correctly when the data is pulled from Yahoo.

I'm pretty sure that the Expiry field in the index is also timezone-naive (tz = None).

Here is a related issue for further clarification.
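
As a rough illustration of the suggestion above, naive timestamps can be localized with pandas. This is only a sketch: the sample DataFrame is made up, and the real column would come from the options reader.

```python
import pandas as pd

# Hypothetical sample; real quote times would come from the options reader.
quotes = pd.DataFrame(
    {"Quote_Time": pd.to_datetime(["2015-08-21 15:59:00", "2015-08-21 16:00:00"])}
)

# Attach the exchange timezone to the naive timestamps (wall-clock time is kept).
quotes["Quote_Time"] = quotes["Quote_Time"].dt.tz_localize("US/Eastern")

# Convert to UTC only when needed; this shifts the wall-clock time.
quote_time_utc = quotes["Quote_Time"].dt.tz_convert("UTC")
```

Localizing first and converting afterwards avoids the silent "naive means UTC" interpretation described in the report.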

Quandl integration

Hello,

I noticed a PR about Quandl integration: #43

Maybe we should discuss it here instead of in a PR.

What about http://pythonhosted.org/Quandl ?

Moreover, it seems that API v1 will be deprecated:

https://www.quandl.com/help/api

(Note: The guide to v1 of the Quandl API is available here. We encourage users to adopt the current version, v1 is on the path to deprecation.)

The Python source code is available here:
https://github.com/quandl/quandl-python/
They are also using v1 in master, but v3 is in another branch.

pandas_datareader/yahoo/daily.py should be named differently

Hello,

pandas_datareader/google/daily.py provides only daily data

but

pandas_datareader/yahoo/daily.py uses an interval, so it might provide
more than just daily data.

interval : string, default 'd'
    Time interval code, valid values are 'd' for daily, 'w' for weekly,
    'm' for monthly and 'v' for dividend.

So maybe we should name it differently.

Kind regards

improve coverage

At the moment it's .

Note: several tests are skipped atm (mostly these are labelled unreliable), this may partially explain results.

Can't install pandas-datareader using pip or conda

Hello,

I'm trying to install pandas-datareader using pip or conda.

$ conda install pandas-datareader
Fetching package metadata: ....
Error: No packages found in current osx-64 channels matching: pandas-datareader

You can search for this package on Binstar with

    binstar search -t conda pandas-datareader
$ conda --version
conda 3.10.0
$ pip install pandas-datareader
Collecting pandas-datareader
  Could not find any downloads that satisfy the requirement pandas-datareader
  No distributions at all found for pandas-datareader

I'm using Mac OS X.

Any idea ?

Kind regards

Pandas options data not working

I am trying to use the get_all_data() method to download options data. I am getting the following error:
RemoteDataError: Parsed URL 'http://finance.yahoo.com/q/op?s=SPY' has no rootelement
Actual code:
from pandas.io.data import Options
SPYD = Options('spy', 'yahoo')
data = SPYD.get_all_data()

My pandas version is 0.16.2

COMPAT with 0.16.0

If someone would like to put all of the changes accumulated since @hayd pulled this from master back into pandas as a PR, that would be appreciated.

I think a good idea would be to have pandas-datareader 0.1 (or whatever version) released at about the same time as 0.16.0 (targeted for the end of Feb). We'll put up an announcement in the pandas release about this, and about future availability.

Support caching

Are there any plans to add caching of downloaded files?
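
To make the request concrete, here is a minimal sketch of what file-based caching could look like. It is not an existing pandas-datareader feature: `cached_fetch`, its `key` argument, and the cache layout are all hypothetical, and `fetch` stands in for whatever call actually downloads the data.

```python
import hashlib
import pickle
from pathlib import Path


def cached_fetch(key, fetch, cache_dir):
    """Return fetch() and store it on disk; reuse the stored copy next time.

    key is any string identifying the request (hypothetical convention),
    fetch is a zero-argument callable that performs the download.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the key so that arbitrary strings map to safe file names.
    path = cache_dir / (hashlib.sha256(key.encode("utf-8")).hexdigest() + ".pkl")
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = fetch()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

This sketch never expires entries, so a real implementation would also need some invalidation policy for data that changes over time.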

Decide on the package name

Currently, @hayd changed the package name to pandas_datareader. Is everybody satisfied with that? (to be clear: it is about the name that is imported)

Personally, I find it a bit long, and I don't really like the underscore in the name. But of course, that is just personal taste! (I just wanted to bring up the discussion now rather than later, so that we deliberately decide whether to keep the name as it is or to change it.)
What about just datareader, or pddatareader (although that is a bit awkward with the two 'd's)?

Yahoo Finance Options tests raise ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

Hello,

Some Yahoo Finance Options tests raise

ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

I can see this exception using

$ nosetests -s -v

======================================================================
ERROR: test_get_all_data (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 358, in test_get_all_data
    data = self.aapl.get_all_data(put=True)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1197, in get_all_data
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_all_data_calls_only (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 372, in test_get_all_data_calls_only
    data = self.aapl.get_all_data(call=True, put=False)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1197, in get_all_data
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_call_data (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 337, in test_get_call_data
    calls = self.aapl.get_call_data(expiry=self.expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 901, in get_call_data
    expiry = self._try_parse_dates(year, month, expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1061, in _try_parse_dates
    expiry = [self._validate_expiry(expiry)]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1085, in _validate_expiry
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_data_with_list (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 365, in test_get_data_with_list
    data = self.aapl.get_call_data(expiry=self.aapl.expiry_dates)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_expiry_dates (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 351, in test_get_expiry_dates
    dates, _ = self.aapl._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_near_stock_price (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 330, in test_get_near_stock_price
    expiry=self.expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1005, in get_near_stock_price
    expiry = self._try_parse_dates(year, month, expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1061, in _try_parse_dates
    expiry = [self._validate_expiry(expiry)]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1085, in _validate_expiry
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_options_data (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 322, in test_get_options_data
    options = self.aapl.get_options_data(expiry=self.expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 750, in get_options_data
    self.get_call_data)]).sortlevel()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 749, in <listcomp>
    for f in (self.get_put_data,
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 964, in get_put_data
    expiry = self._try_parse_dates(year, month, expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1061, in _try_parse_dates
    expiry = [self._validate_expiry(expiry)]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1085, in _validate_expiry
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_put_data (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 344, in test_get_put_data
    puts = self.aapl.get_put_data(expiry=self.expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 964, in get_put_data
    expiry = self._try_parse_dates(year, month, expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1061, in _try_parse_dates
    expiry = [self._validate_expiry(expiry)]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1085, in _validate_expiry
    expiry_dates = self.expiry_dates
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_get_underlying_price (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 381, in test_get_underlying_price
    url = options_object._yahoo_url_from_expiry(options_object.expiry_dates[0])
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'September 18, 2015' does not match format '%B %d, %Y'

======================================================================
ERROR: test_month_year (test_data.TestYahooOptions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1226, in expiry_dates
    expiry_dates = self._expiry_dates
AttributeError: 'Options' object has no attribute '_expiry_dates'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/tests/test_data.py", line 421, in test_month_year
    data = self.aapl.get_call_data(month=self.month, year=self.year)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 901, in get_call_data
    expiry = self._try_parse_dates(year, month, expiry)
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1075, in _try_parse_dates
    expiry = [expiry for expiry in self.expiry_dates if expiry.year == year and expiry.month == month]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1228, in expiry_dates
    expiry_dates, _ = self._get_expiry_dates_and_links()
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in _get_expiry_dates_and_links
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "/Users/femto/github/others/pandas-datareader/pandas_datareader/data.py", line 1250, in <listcomp>
    expiry_dates = [dt.datetime.strptime(element.text, "%B %d, %Y").date() for element in links]
  File "//anaconda/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "//anaconda/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data 'August 28, 2015' does not match format '%B %d, %Y'

but

$ nosetests -s -v pandas_datareader/tests/test_data.py:TestYahooOptions.test_get_all_data

doesn't raise any error!

Any idea ?
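
One possible explanation, which is an assumption and not confirmed in this report, is that `%B` in `strptime` depends on the active locale, so the same string can fail to parse if an earlier test in the full run switched the locale. A locale-independent parser for this date format could be sketched as follows (`parse_expiry` is a hypothetical helper, not part of the library):

```python
import datetime as dt

# Fixed English month names, so parsing does not depend on the process locale.
_MONTHS = {
    name: number
    for number, name in enumerate(
        [
            "January", "February", "March", "April", "May", "June", "July",
            "August", "September", "October", "November", "December",
        ],
        start=1,
    )
}


def parse_expiry(text):
    """Parse strings such as 'August 28, 2015' into a datetime.date."""
    month_name, day, year = text.replace(",", "").split()
    return dt.date(int(year), _MONTHS[month_name], int(day))
```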

DataReader should raise an exception when a data_source is not implemented

Hello,

I think it would be more explicit if DataReader raised an exception when data_source is not
in `("yahoo", "yahoo-actions", "google", "fred", "ff")`

Maybe something like

if data_source == "yahoo":
    return ...
...
elif data_source == "famafrench":
    return ...
else:
    raise NotImplementedError("data_source=%r is not implemented" % data_source)
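
An alternative to the if/elif chain is a lookup table, sketched below. The reader functions and the `data_reader` name are placeholders invented for this illustration; they are not the library's actual readers.

```python
def _read_yahoo(name):
    # Placeholder for a real reader; returns a marker tuple for illustration.
    return ("yahoo", name)


def _read_fred(name):
    # Placeholder for a real reader; returns a marker tuple for illustration.
    return ("fred", name)


# Map each supported data_source string to the function that handles it.
_READERS = {"yahoo": _read_yahoo, "fred": _read_fred}


def data_reader(name, data_source):
    """Dispatch to a reader, failing loudly for unknown sources."""
    try:
        reader = _READERS[data_source]
    except KeyError:
        raise NotImplementedError(
            "data_source=%r is not implemented" % data_source
        )
    return reader(name)
```

With a table like this, an unsupported source can never fall through silently, and adding a source is a one-line change.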

Kind regards

Idea - Import and export support for DataPackages

DataPackages is a project that offers a standardized and extensible way of packaging up data. There is already a growing community around Data Packages and an increasing number of core datasets which can easily be grabbed from a URL.

It would be great to see easy import and export to and from a Tabular Data Package - see spec here - into a Pandas DataFrame (e.g. pd.from_datapackage(...) & pd.to_datapackage(...)).

Moved here from the main pandas repo on the advice of @TomAugspurger.

Fama french data library url not working anymore

Hi, I use your library extensively, and it seems the Fama-French option doesn't work anymore.

I tried tracing the error. Perhaps the URL has changed (or access is denied):

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/

Is there a way you could fix this? Thanks in advance.

handling unicode in returned index names

pandas-dev/pandas#9026

TST: also test with pandas master

There is now already a broad test range of pandas versions (0.10-0.15), but ideally, there should also be a test against pandas master.

Google CSV API Deprecated

It looks like Google has dropped their finance API.

https://developers.google.com/finance/?csw=1

import pandas.io.data as web
web.DataReader('gs','google')

IOError: after 3 tries, Google did not return a 200 for url 'http://www.google.com/finance/historical?q=gs&startdate=Jan+01%2C+2010&enddate=Mar+24%2C+2015&output=csv'

Anyone else having this issue? Tests that use Google in pandas-datareader are failing now. We may have to remove Google finance support.

ImportError: No module named 'pandas_datareader.google'

Hello,

I'm facing a problem when I try a "clean" install of pandas-datareader.

from pandas_datareader import data

raises

ImportError: No module named 'pandas_datareader.google'

When I uninstall pandas-datareader, I notice that there is no "google" directory (nor "yahoo", nor "tests", but that's less problematic).

pc:~ femto$ pip install git+https://github.com/pydata/pandas-datareader.git
Collecting git+https://github.com/pydata/pandas-datareader.git
  Cloning https://github.com/pydata/pandas-datareader.git to /var/folders/j_/v8b1bst93_94t724ptsswfsr0000gn/T/pip-bzcvj335-build
Requirement already satisfied (use --upgrade to upgrade): pandas in /anaconda/lib/python3.4/site-packages (from pandas-datareader==0.1.1)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil>=2 in /anaconda/lib/python3.4/site-packages (from pandas->pandas-datareader==0.1.1)
Requirement already satisfied (use --upgrade to upgrade): pytz>=2011k in /anaconda/lib/python3.4/site-packages (from pandas->pandas-datareader==0.1.1)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.0 in /anaconda/lib/python3.4/site-packages (from pandas->pandas-datareader==0.1.1)
Requirement already satisfied (use --upgrade to upgrade): six>=1.5 in /anaconda/lib/python3.4/site-packages (from python-dateutil>=2->pandas->pandas-datareader==0.1.1)
Installing collected packages: pandas-datareader
  Running setup.py install for pandas-datareader
Successfully installed pandas-datareader-0.1.1
pc:~ femto$ ipython
Python 3.4.3 |Anaconda 2.3.0 (x86_64)| (default, Mar  6 2015, 12:07:41)
Type "copyright", "credits" or "license" for more information.

IPython 4.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from pandas_datareader import data
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-75f869254055> in <module>()
----> 1 from pandas_datareader import data

//anaconda/lib/python3.4/site-packages/pandas_datareader/data.py in <module>()
      9 from pandas_datareader._utils import _sanitize_dates
     10
---> 11 from pandas_datareader.google.daily import _get_data as get_data_google
     12 from pandas_datareader.google.quotes import _get_data as get_quote_google
     13

ImportError: No module named 'pandas_datareader.google'

In [2]:
Do you really want to exit ([y]/n)? y
pc:~ femto$ pip uninstall pandas_datareader
Uninstalling pandas-datareader-0.1.1:
  /anaconda/lib/python3.4/site-packages/pandas_datareader-0.1.1-py3.4.egg-info
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__init__.py
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/__init__.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/_utils.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/data.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/famafrench.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/fred.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/__pycache__/wb.cpython-34.pyc
  /anaconda/lib/python3.4/site-packages/pandas_datareader/_utils.py
  /anaconda/lib/python3.4/site-packages/pandas_datareader/data.py
  /anaconda/lib/python3.4/site-packages/pandas_datareader/famafrench.py
  /anaconda/lib/python3.4/site-packages/pandas_datareader/fred.py
  /anaconda/lib/python3.4/site-packages/pandas_datareader/wb.py
Proceed (y/n)? y
  Successfully uninstalled pandas-datareader-0.1.1

So I wonder whether anyone else has tried a "fresh" install?
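
A quick way to compare what a source checkout contains with what actually got installed is to list the subpackages found on disk. This is only a diagnostic sketch; `list_subpackages` is a hypothetical helper and the directories passed to it are up to the reader.

```python
from pathlib import Path


def list_subpackages(package_dir):
    """Return names of direct subdirectories that contain an __init__.py."""
    root = Path(package_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "__init__.py").is_file()
    )
```

Running it once on the pandas_datareader directory of the checkout and once on the installed copy in site-packages would show which subpackages were left out of the install.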

Kind regards

Options parsing with commas from yahoo

xref here: pandas-dev/pandas#9430

Ideally a PR should fix it here and be back-ported to pandas.

Options strike price float conversions

pandas issue here: pandas-dev/pandas#9010, with linked PR

method test_get_quote_string is already defined

Hello,

In test_data.py, the method test_get_quote_string is defined twice.
https://github.com/pydata/pandas-datareader/blob/master/pandas_datareader/tests/test_data.py

I found this "issue" using

https://www.codacy.com/

https://landscape.io/ could also help, but they are facing some problems with Heroku for now.

Kind regards

Yahoo Datareader RemoteDataError when end date added in Python 3

See stackoverflow description:

http://stackoverflow.com/questions/29296045/pandas-yahoo-datareader-remotedataerror-when-end-date-added?noredirect=1#comment46788752_29296045

Fama French Factors keys should be descriptive

web.DataReader('F-F_Momentum_Factor', 'famafrench') returns a dict with keys 1 and 2 pointing to the monthly and annual data.
I believe the keys should describe the data they point to.
pandas-dev/pandas#8842

pandas.io.wb.download returned DataFrame does not contain iso_code

The docstring of this method states that the DataFrame contains the iso_code, which is not the case. The column is dropped from the DataFrame before it is returned.

I commented out the above line, but then most values for iso_code are NaN because of the call to convert_objects. I think it would be useful to include the iso_code in the result, but if the conversion is to be kept, it would need to become part of the index.

Not sure what would be the best solution here. If the current behaviour is to be kept, the docstring should be corrected.

I previously reported this here: pandas-dev/pandas#10860

ME breakpoints in fama french factors library

ME_breakpoints.zip is not read because it is a different format.

ME breakpoints contains the breakpoints between the deciles of market cap, for example the values at 5%, 10%, etc.
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html#Breakpoints

TST: Test class TestGoogle never runs due to bug in pandas.util.get_locales

pandas-dev/pandas#9744

World Bank download country default

Should the default be all to avoid the North American-centricness?

DOC: list the (optional) dependencies

For building the docs, I had to put some extra dependencies in the requirements.txt file (besides pandas), namely lxml, html5lib and beautifulsoup4.

I don't know to what extent they are optional or hard dependencies (that depends on which functions use pandas.read_html, I suppose).

But in any case, it should also be documented.

Deprecate data.py

Given the subpackage structure implemented by #69, data.py is really just the api, could move this to the top level so you could import directly from pandas-datareader.

i.e., instead of:

from pandas_datareader import data
data.Options('aapl','yahoo')

you could write:

import pandas_datareader as pdr
pdr.Options('aapl','yahoo')

utils should be _utils

There are no public functions in utils, so the entire module should probably be marked as private.

Options dispatch in data.py doesn't work correctly when None is given as data_source

Hi,

see

def Options(symbol, data_source=None):
    if data_source is None:
        warnings.warn("Options(symbol) is deprecated, use Options(symbol,"
                        " data_source) instead", FutureWarning)
        data_source = "yahoo"
    elif data_source == "yahoo":
        return YahooOptions(symbol)
    else:
        raise NotImplementedError("currently only yahoo supported")

If data_source is None, nothing is returned: the first branch runs, so the elif/else chain is skipped, just as in a nested if construct.

It should be

    if data_source is None:
        warnings.warn("Options(symbol) is deprecated, use Options(symbol,"
                        " data_source) instead", FutureWarning)
        data_source = "yahoo"

    if data_source == "yahoo":
        return YahooOptions(symbol)
    else:
        raise NotImplementedError("currently only yahoo supported")
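A minimal, self-contained sketch of the difference between the two versions. `YahooOptions` below is a stand-in stub, and the names `options_buggy` and `options_fixed` are hypothetical; they are assumptions for illustration, not the library's code.

```python
import warnings


class YahooOptions:
    """Stand-in stub for illustration only; not the real YahooOptions class."""

    def __init__(self, symbol):
        self.symbol = symbol


def options_buggy(symbol, data_source=None):
    # Mirrors the reported code: once the first branch runs, the elif/else
    # chain is skipped, so the function falls off the end and returns None.
    if data_source is None:
        warnings.warn("Options(symbol) is deprecated, use Options(symbol,"
                      " data_source) instead", FutureWarning)
        data_source = "yahoo"
    elif data_source == "yahoo":
        return YahooOptions(symbol)
    else:
        raise NotImplementedError("currently only yahoo supported")


def options_fixed(symbol, data_source=None):
    # Two separate if statements: apply the default first, then dispatch
    # on the (possibly defaulted) value.
    if data_source is None:
        warnings.warn("Options(symbol) is deprecated, use Options(symbol,"
                      " data_source) instead", FutureWarning)
        data_source = "yahoo"

    if data_source == "yahoo":
        return YahooOptions(symbol)
    else:
        raise NotImplementedError("currently only yahoo supported")


# Silence the deprecation warning for the demonstration only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    buggy_result = options_buggy("aapl")
    fixed_result = options_fixed("aapl")

print(buggy_result)                  # None
print(type(fixed_result).__name__)   # YahooOptions
```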

[ENH] Fama/French

pandas crashes when trying to read some files from Fama-French whose format is a bit different:

pd.io.data.get_data_famafrench("Portfolios_Formed_on_ME")

---------------------------------------------------------------------------
...
IndexError: index -1 is out of bounds for axis 0 with size 0

PR #56

0.15.2 causing problems with pandas.io.data.Options

I finally traced a problem I was having with options downloads to changes made between version 0.15.1 and version 0.15.2. Probably easiest is just to link the question I posed on Stack Overflow, because it shows the behavior: http://stackoverflow.com/questions/29182526/trouble-with-http-request-from-google-compute-engine

Weirdly, in 0.15.2, I was consistently able to get the options data for large cap companies ('aapl', 'ge' were my typical test cases) but not for small cap companies such as 'spwr' or 'ddd'. Not sure what was changed, but it looks to me like it might have to do with the list of expiration dates or with the handling of empty tables given an expiration date. Right now, in any case, if you hit the link shown in my stack trace (http://finance.yahoo.com/q/op?s=SPWR&date=1430438400), there's an empty table for puts and only 1 call. That would be something that's more common for smaller companies, too. The other possibility is that the initial Options object isn't getting good links in the newer version.

That's about all I know about it, but reverting to 0.15.1 seems to have solved the problems I was having.

http://pandas.pydata.org/pandas-docs/version/0.16.1/pandas.pdf - 25.2 Yahoo! Finance Options

Is this expected?

from pandas.io.data import Options
aapl = Options('aapl', 'yahoo')
data = aapl.get_all_data()
Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda\lib\site-packages\pandas\io\data.py", line 1115, in get_all_data
expiry_dates = self.expiry_dates
File "C:\Anaconda\lib\site-packages\pandas\io\data.py", line 1146, in expiry_dates
expiry_dates, _ = self._get_expiry_dates_and_links()
File "C:\Anaconda\lib\site-packages\pandas\io\data.py", line 1161, in _get_expiry_dates_and_links
root = self._parse_url(url)
File "C:\Anaconda\lib\site-packages\pandas\io\data.py", line 1193, in _parse_url
"{0!r}".format(url))
RemoteDataError: Unable to parse URL 'http://finance.yahoo.com/q/op?s=AAPL'

Add pandas-data reader to pypi

Add Real Time pricing for Yahoo Data

This is a feature request. I think adding real time price quotes would add a lot of value to the pandas-datareader. Anyone agree?

Use find_packages() to automatically find packages

Hello,

in a recent issue (#97) I noticed a problem with the packages list.

I submitted PR #98, which has been merged by @davidastephens.

It replaced

packages=['pandas_datareader'],

with

packages=['pandas_datareader',
    'pandas_datareader.google', 'pandas_datareader.yahoo'
],

I wonder why we don't do

 from setuptools import setup, find_packages

and

packages=find_packages(exclude=['contrib', 'docs', 'tests*']),

That way, if we add another package, we won't have to modify setup.py.

What is your opinion?

Kind regards

pandas v0.17.0rc1

we have some fairly significant changes and would like you to test and provide any feedback: https://github.com/pydata/pandas/releases/tag/v0.17.0rc1

thanks!

Remove Yahoo in _retry_read_url

Hi,

I noticed in utils.py in _retry_read_url this

    # Yahoo! Finance sometimes does this awesome thing where they
    # return 2 rows for the most recent business day
    if len(rs) > 2 and rs.index[-1] == rs.index[-2]:  # pragma: no cover
        rs = rs[:-1]

As it's Yahoo-specific, it shouldn't be there.

Kind regards

TST: make sure tests work on python 2.6

Retrieving Fama-French results in HTTP 404

I haven't looked into the details because I'm away, but the filenames seem to have changed. It also fails in current pandas master. See pandas-dev/pandas#10591.

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

Pandas issues that may need migration/support

pandas-dev/pandas#9571

pandas is going to deprecate pandas.io.data fully in 0.17.0

see the issue here

We have had a warning in the docs since 0.16.1, IIRC.

It would be great if pandas-datareader could do a release soonish.

0.17.0 will probably release mid-Sept.

Google Finance Options chains

Adding support for Google Finance Options chains would be a great idea.

Here is a Matlab script
http://www.mathworks.com/matlabcentral/fileexchange/50368-stock-option-chain-downloader--via-google-

Some Python code is also available here
https://github.com/femtotrader/pandas_datareaders_unofficial/blob/master/pandas_datareaders_unofficial/datareaders/google_finance_options.py

0.15.2 causing problems with pandas.io.data.Options

Identical to #22, as the issue isn't fixed. Are the changes reflected in a version that I can specify to be sure to pick them up? When I install from pip it looks as if the code hasn't been changed. See my latest comments under #22.

A better url builder

Hello,

I noticed that each data reader has its own way to build URLs.

That's sometimes very ugly.

For example, there are things like (yahoo/components.py)

idx_mod = idx_sym.replace('^', '@%5E')

Using urlencode should probably be preferred.
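As a rough illustration of what standard percent-encoding would produce for such a symbol, here is a small sketch. The symbol `^GSPC` and the query key `s` are assumptions chosen for the example, and the sketch cannot confirm whether the remote service accepts an encoded `@`.

```python
from urllib.parse import quote, urlencode

idx_sym = "^GSPC"  # example index symbol, chosen for illustration

# Hand-rolled replacement as quoted above: it also prepends '@'.
manual = idx_sym.replace('^', '@%5E')

# Standard percent-encoding of the same symbol.
encoded = quote(idx_sym, safe='')

# urlencode percent-encodes every value, including a leading '@'.
query = urlencode({'s': '@' + idx_sym})

print(manual)   # @%5EGSPC
print(encoded)  # %5EGSPC
print(query)    # s=%40%5EGSPC
```

Note that urlencode would turn the `@` into `%40`, so it is not a drop-in replacement for the manual string unless the server treats both forms the same.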

pylint doesn't like this in yahoo/actions.py (tabs/indentation):

url = (_URL + 's=%s' % symbol + \
            '&a=%s' % (start.month - 1) + \
            '&b=%s' % start.day + \
            '&c=%s' % start.year + \
            '&d=%s' % (end.month - 1) + \
            '&e=%s' % end.day + \
            '&f=%s' % end.year + \
            '&g=v')

and many others... (a global lint would be required)

Maybe we should add to utils a function that builds a URL from a base URL (and endpoint) and a dictionary of parameters, and encourage each data reader developer to use it.

Such a function could be

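from urllib.parse import urlencode  # Python 3 location of urlencode

def _encode_url(url, params):
    """
    Return encoded url with parameters
    """
    s_params = urlencode(params)
    # Only append '?' when there is a query string to attach
    if s_params:
        return url + '?' + s_params
    else:
        return url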

So in google/daily.py we could have

params = {
    'q': sym,
    'startdate': start.strftime('%b %d, %Y'),
    'enddate': end.strftime('%b %d, %Y'),
    'output': "csv"
}
url = _encode_url(_URL, params)
return _retry_read_url(url, retry_count, pause, 'Google')

instead of

url = "%s%s" % (_URL,
                urlencode({"q": sym,
                           "startdate": start.strftime('%b %d, ' '%Y'),
                           "enddate": end.strftime('%b %d, %Y'),
                           "output": "csv"}))

(I think '%b %d, ' '%Y' is a typo; it should be '%b %d, %Y')

or in yahoo/daily.py instead of

url = (_URL + 's=%s' % sym +
       '&a=%s' % (start.month - 1) +
       '&b=%s' % start.day +
       '&c=%s' % start.year +
       '&d=%s' % (end.month - 1) +
       '&e=%s' % end.day +
       '&f=%s' % end.year +
       '&g=%s' % interval +
       '&ignore=.csv')

we could have

params = {
    's': sym,
    'a': start.month - 1,
    'b': start.day,
    'c': start.year,
    'd': end.month - 1,
    'e': end.day,
    'f': end.year,
    'g': interval,
    'ignore': '.csv'
}
url = _encode_url(_URL, params)

World Bank API URL creation could also be improved:

in _get_data, instead of

url = ("http://api.worldbank.org/countries/" + countries + "/indicators/" +
       indicator + "?date=" + str(start) + ":" + str(end) +
       "&per_page=25000&format=json")

"http://api.worldbank.org" is repeated often.

Such a URL has 3 parts:

base url: "http://api.worldbank.org"
endpoint: "/countries/US/indicators/NY.GNS.ICTR.GN.ZS" (for example)
parameters: date=2012:2015&per_page=25000&format=json

We shouldn't repeat ourselves
https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

we could have, at module level,

_BASE_URL = 'http://api.worldbank.org'

and in get_data

    endpoint = "/countries/{countries}/indicators/{indicator}"\
        .format(countries=countries, indicator=indicator)
    params = {
        'date': "%s:%s" % (start, end),
        'per_page': 25000,
        'format': 'json'
    }
    url = _encode_url(_BASE_URL + endpoint, params)
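A self-contained sketch of the three-part construction. The function name `build_wb_url` and its `safe` argument are hypothetical and exist only for this example. One detail worth knowing: urlencode percent-encodes the ':' in the date range by default, whereas the hand-built URL sends it literally.

```python
from urllib.parse import urlencode

_BASE_URL = 'http://api.worldbank.org'


def build_wb_url(countries, indicator, start, end, safe=''):
    """Assemble base URL + endpoint + encoded query (hypothetical helper)."""
    endpoint = "/countries/{countries}/indicators/{indicator}"\
        .format(countries=countries, indicator=indicator)
    params = {
        'date': "%s:%s" % (start, end),
        'per_page': 25000,
        'format': 'json'
    }
    # 'safe' lists characters that urlencode should leave unescaped.
    return _BASE_URL + endpoint + '?' + urlencode(params, safe=safe)


default_url = build_wb_url('US', 'NY.GNS.ICTR.GN.ZS', 2012, 2015)
literal_url = build_wb_url('US', 'NY.GNS.ICTR.GN.ZS', 2012, 2015, safe=':')

print(default_url)  # ...?date=2012%3A2015&per_page=25000&format=json
print(literal_url)  # ...?date=2012:2015&per_page=25000&format=json
```

A server that decodes query strings normally should treat both forms alike, but that is an assumption about the remote API; passing safe=':' reproduces the original URL exactly.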

Maybe we should define how data readers "should" build URLs.

Kind regards

Enable Sourcegraph

I want to use Sourcegraph code search and code review with pandas-datareader. A project maintainer needs to enable it to set up a webhook so the code is up-to-date there.

Could you please enable pandas-datareader on @sourcegraph by going to https://sourcegraph.com/github.com/pydata/pandas-datareader and clicking on Settings? (It should only take 15 seconds.)

Thank you!

One file per datareader

Hello,

data.py is a very (too) big file.

I think we should have one file per datareader.

Moreover this will help #48

On my fork I've made a branch where each datareader is its own file.

$ nosetests -s -v

shows

Ran 58 tests in 83.806s
FAILED (SKIP=7, errors=10, failures=3)

That's exactly the same result as on the master branch.

I will send a PR.

Kind regards

Pause in _retry_read_url should only happen from the second attempt

Hi,

I noticed in utils.py in _retry_read_url this

def _retry_read_url(url, retry_count, pause, name):
    """
    Open url (and retry)
    """
    for _ in range(retry_count):
        time.sleep(pause)
        ...

So an unnecessary pause occurs before the very first request.

we might have something like:

def _retry_read_url(url, retry_count, pause, name):
    """
    Open url (and retry)
    """
    for i in range(retry_count):
        if i != 0:
            time.sleep(pause)
        ...
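Since the body of the loop is elided above, the following is only a sketch of the pattern, not the library's implementation. The names `retry_call` and `flaky` and the injectable `sleep` argument are hypothetical; the injection just makes the pauses observable without actually waiting.

```python
import time


def retry_call(fetch, retry_count=3, pause=0.001, sleep=time.sleep):
    """Call fetch() up to retry_count times, sleeping only between attempts."""
    last_error = None
    for i in range(retry_count):
        if i != 0:
            # No pause before the first attempt, only before retries.
            sleep(pause)
        try:
            return fetch()
        except IOError as exc:
            last_error = exc
    raise IOError("failed after %d tries" % retry_count) from last_error


sleeps = []    # records every requested pause
attempts = []  # records every call to flaky()


def flaky():
    """Fail twice, then succeed (simulates a temporarily failing request)."""
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("temporary failure")
    return "ok"


result = retry_call(flaky, retry_count=3, pause=0.5, sleep=sleeps.append)
print(result, len(attempts), sleeps)
```

With three attempts, only two pauses are requested, one before each retry.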

Free intraday stock data with Barcharts

Hello,

http://www.barchart.com/ seems to provide free intraday stock data according to
http://www.blackarbs.com/blog/how-to-get-free-intraday-stock-data-with-python-and-barcharts-ondemand-api/9/22/2015
https://gist.github.com/BlackArbsCEO/2394808dcb1f7c1bdd4e#file-free_intraday_stockdata_test_barchart_api_gist-py

It may be worth adding this data source to Pandas DataReader.

@twiecki might be interested

Kind regards
