claytonjy / tidyquandl

A tidy interface to the Quandl API

Home Page: https://claytonjy.github.io/tidyquandl

License: Other

R 100.00%

r quandl quandl-api api-wrapper tidy-interface

tidyquandl's Issues

Fix Issue template

It's a little too tidyverse-specific, e.g.

Please briefly describe your problem and what output you expect. If you have a question, please don't use this form. Instead, ask on https://stackoverflow.com/ or https://community.rstudio.com/.

Instruct users how to grab latest release

I think I'm going to continue to do "development" on master, so master's latest commit will often have a version like x.y.z.9000, but I don't plan on submitting to CRAN. We need a way to tell users to "install the latest release", hopefully without updating a command like devtools::install_github("claytonjy/tidyquandl@x.y.z") each time we bump the release.

From this SO post, it sounds like we can suggest that users run

devtools::install_github("claytonjy/tidyquandl@*release")

which seems pretty dope. Assuming this works, let's update the README to match.

Replace checkmate-validation with rlang

I love the checkmate package, but I'm becoming less convinced it's a good tool for argument validation inside user-facing functions; the error messages just aren't good.

Here's an example as it appears now, with some alternatives:

x <- c("foo", "bar")

tidyquandl::quandl_datatable(x)
#> Error in withCallingHandlers({: Assertion on 'code' failed: Must have length 1.

stopifnot(rlang::is_string(x))
#> Error: rlang::is_string(x) is not TRUE

if (!rlang::is_string(x)) stop("`code` must be a single string")
#> Error in eval(expr, envir, enclos): `code` must be a single string

Not actually sure which of the last two I prefer; hard to write a super-clear error message, and the stopifnot version provides the user a path to dig deeper (e.g. "what does is_string check for? ... ?rlang::is_string")

Could keep using checkmate in tests, just move it to Suggests, so users don't install it as a dependency.
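
For reference, a minimal sketch of the third option wrapped in a helper. The name check_code() is hypothetical and not part of tidyquandl; it relies on rlang::is_string() and uses call. = FALSE so the message is not prefixed with the calling expression.

```r
# Hypothetical helper, not part of tidyquandl: validate that `code` is one string.
check_code <- function(code) {
  if (!rlang::is_string(code)) {
    # call. = FALSE drops the "Error in ...:" prefix from the message
    stop("`code` must be a single string", call. = FALSE)
  }
  invisible(code)
}

check_code("WIKI/PRICES")         # passes silently
try(check_code(c("foo", "bar")))  # signals the error with the custom message
```

The trade-off from above still applies: the message is clearer, but the user no longer sees which predicate failed.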

Warning from readr when subsetting columns.

Need to subset col_types internally when user subsets columns via qopts.columns.

library(tidyquandl)

quandl_key_set()

quandl_datatable("WIKI/PRICES", ticker = "AAPL", date = "2018-01-02", qopts.columns = c("ticker", "date", "close"))
#> Warning: The following named parsers don't match the column names: open,
#> high, low, volume, ex-dividend, split_ratio, adj_open, adj_high, adj_low,
#> adj_close, adj_volume
#> # A tibble: 1 x 3
#>   ticker date       close
#>   <chr>  <date>     <dbl>
#> 1 AAPL   2018-01-02  172.

Created on 2019-01-13 by the reprex package (v0.2.1)
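
One possible shape for the fix, as a base-R sketch: keep the full type map as a named vector of readr type codes and subset it by the requested columns before reading. The names full_types and subset_col_types() are hypothetical, and the type map is shortened for illustration.

```r
# Hypothetical full type map for a table: column name -> readr compact type code.
full_types <- c(ticker = "c", date = "D", open = "d", close = "d", volume = "d")

# Keep only the parsers for the columns the user asked for.
subset_col_types <- function(col_types, columns = NULL) {
  if (is.null(columns)) {
    return(col_types)
  }
  unknown <- setdiff(columns, names(col_types))
  if (length(unknown) > 0) {
    stop("Unknown columns: ", paste(unknown, collapse = ", "), call. = FALSE)
  }
  col_types[columns]
}

subset_col_types(full_types, c("ticker", "date", "close"))
```

The subsetted vector could then be turned into a column specification, so readr never sees parsers for columns that were not requested.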

Session info

devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  os       Linux Mint 18.3             
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/Detroit             
#>  date     2019-01-13                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                        
#>  assertthat    0.2.0      2017-04-11 [1] CRAN (R 3.4.4)                
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.4.4)                
#>  bindr         0.1.1      2018-03-13 [1] CRAN (R 3.4.4)                
#>  bindrcpp      0.2.2      2018-03-29 [1] CRAN (R 3.4.4)                
#>  callr         3.1.1      2018-12-21 [1] CRAN (R 3.4.4)                
#>  cli           1.0.1      2018-09-25 [1] CRAN (R 3.4.4)                
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.4.4)                
#>  curl          3.3        2019-01-10 [1] CRAN (R 3.4.4)                
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.4.4)                
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.4.4)                
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.4.4)                
#>  dplyr         0.7.8      2018-11-10 [1] CRAN (R 3.4.4)                
#>  evaluate      0.12       2018-10-09 [1] CRAN (R 3.4.4)                
#>  fansi         0.4.0      2018-10-05 [1] CRAN (R 3.4.4)                
#>  fs            1.2.6      2018-08-23 [1] CRAN (R 3.4.4)                
#>  glue          1.3.0      2018-07-17 [1] CRAN (R 3.4.4)                
#>  highr         0.7        2018-06-09 [1] CRAN (R 3.4.4)                
#>  hms           0.4.2.9000 2018-07-03 [1] Github (tidyverse/hms@2e0a39a)
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.4.4)                
#>  httr          1.4.0      2018-12-11 [1] CRAN (R 3.4.4)                
#>  jsonlite      1.6        2018-12-07 [1] CRAN (R 3.4.4)                
#>  knitr         1.21       2018-12-10 [1] CRAN (R 3.4.4)                
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.4.4)                
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.4.4)                
#>  pillar        1.3.1      2018-12-15 [1] CRAN (R 3.4.4)                
#>  pkgbuild      1.0.2      2018-10-16 [1] CRAN (R 3.4.4)                
#>  pkgconfig     2.0.2      2018-08-16 [1] CRAN (R 3.4.4)                
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.4.4)                
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.4.4)                
#>  processx      3.2.1      2018-12-05 [1] CRAN (R 3.4.4)                
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.4.4)                
#>  purrr         0.2.5      2018-05-29 [1] CRAN (R 3.4.4)                
#>  R6            2.3.0      2018-10-04 [1] CRAN (R 3.4.4)                
#>  Rcpp          1.0.0      2018-11-07 [1] CRAN (R 3.4.4)                
#>  readr         1.3.1      2018-12-21 [1] CRAN (R 3.4.4)                
#>  remotes       2.0.2      2018-10-30 [1] CRAN (R 3.4.4)                
#>  rlang         0.3.1      2019-01-08 [1] CRAN (R 3.4.4)                
#>  rmarkdown     1.11       2018-12-08 [1] CRAN (R 3.4.4)                
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.4.4)                
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.4.4)                
#>  stringi       1.2.4      2018-07-20 [1] CRAN (R 3.4.4)                
#>  stringr       1.3.1      2018-05-10 [1] CRAN (R 3.4.4)                
#>  testthat      2.0.1      2018-10-13 [1] CRAN (R 3.4.4)                
#>  tibble        2.0.1      2019-01-12 [1] CRAN (R 3.4.4)                
#>  tidyquandl  * 0.1.2.9000 2019-01-13 [1] local                         
#>  tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.4.4)                
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.4.4)                
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.4.4)                
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.4.4)                
#>  xfun          0.4        2018-10-23 [1] CRAN (R 3.4.4)                
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.4.4)                
#> 
#> [1] /home/claytonjy/R/x86_64-pc-linux-gnu-library/3.4
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

Create batches in quandl_datatable

The Quandl API can only handle a certain number of parameters in a request, which means you can't specify e.g. 1000 tickers as a filter in one call. This should be abstracted from the user so that any number of tickers (or whatever) can be given, and quandl_datatable will break that into multiple requests (batches) and return results as if it were a single request.

Transpose metadata output

Instead of taking one table-name and producing a long list with a data.frame in it, quandl_datatable_meta() should take a vector of table-names and return a tibble with one row per table and nested tibbles as appropriate.

Type inference can fail from guessing

library(tidyquandl)

quandl_key_set()

quandl_datatable("ZACKS/P", ticker = "AAPL")
#> Warning: 16666 parsing failures.
#>  row  col           expected actual         file
#> 2093 high 1/0/T/F/TRUE/FALSE 1.5759 literal data
#> 2093 low  1/0/T/F/TRUE/FALSE 1.5089 literal data
#> 2094 high 1/0/T/F/TRUE/FALSE 1.5848 literal data
#> 2094 low  1/0/T/F/TRUE/FALSE 1.5536 literal data
#> 2095 high 1/0/T/F/TRUE/FALSE 1.5804 literal data
#> .... .... .................. ...... ............
#> See problems(...) for more details.
#> # A tibble: 8,140 x 12
#>    m_ticker ticker comp_name comp_name_2 exchange currency_code date      
#>    <chr>    <chr>  <chr>     <lgl>       <chr>    <chr>         <date>    
#>  1 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-24
#>  2 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-25
#>  3 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-26
#>  4 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-27
#>  5 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-30
#>  6 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-31
#>  7 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-01
#>  8 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-02
#>  9 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-03
#> 10 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-06
#> # ... with 8,130 more rows, and 5 more variables: open <lgl>, high <lgl>,
#> #   low <lgl>, close <dbl>, volume <dbl>

library(Quandl)
#> Loading required package: xts
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric

tibble::as_tibble(Quandl.datatable("ZACKS/P", ticker = "AAPL", paginate = TRUE))
#> # A tibble: 8,140 x 12
#>    m_ticker ticker comp_name comp_name_2 exchange currency_code date      
#>    <chr>    <chr>  <chr>     <chr>       <chr>    <chr>         <date>    
#>  1 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-24
#>  2 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-25
#>  3 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-26
#>  4 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-27
#>  5 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-30
#>  6 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-31
#>  7 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-01
#>  8 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-02
#>  9 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-03
#> 10 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-06
#> # ... with 8,130 more rows, and 5 more variables: open <dbl>, high <dbl>,
#> #   low <dbl>, close <dbl>, volume <dbl>

Created on 2018-09-25 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Detroit             
#>  date     2018-09-25
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                          
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.4.4)                  
#>  backports    1.1.2      2017-12-13 CRAN (R 3.4.4)                  
#>  base       * 3.4.4      2018-03-16 local                           
#>  cli          1.0.0      2017-11-05 CRAN (R 3.4.4)                  
#>  compiler     3.4.4      2018-03-16 local                           
#>  crayon       1.3.4      2017-09-16 CRAN (R 3.4.4)                  
#>  curl         3.2        2018-03-28 CRAN (R 3.4.4)                  
#>  datasets   * 3.4.4      2018-03-16 local                           
#>  devtools     1.13.6     2018-06-27 CRAN (R 3.4.4)                  
#>  digest       0.6.17     2018-09-12 cran (@0.6.17)                  
#>  evaluate     0.11       2018-07-17 CRAN (R 3.4.4)                  
#>  fansi        0.3.0      2018-08-13 CRAN (R 3.4.4)                  
#>  glue         1.3.0      2018-07-17 CRAN (R 3.4.4)                  
#>  graphics   * 3.4.4      2018-03-16 local                           
#>  grDevices  * 3.4.4      2018-03-16 local                           
#>  grid         3.4.4      2018-03-16 local                           
#>  hms          0.4.2.9000 2018-07-03 Github (tidyverse/hms@2e0a39a)  
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.4)                  
#>  httr         1.3.1      2017-08-20 CRAN (R 3.4.4)                  
#>  jsonlite     1.5        2017-06-01 CRAN (R 3.4.4)                  
#>  knitr        1.20       2018-02-20 CRAN (R 3.4.4)                  
#>  lattice      0.20-35    2017-03-25 CRAN (R 3.3.3)                  
#>  magrittr     1.5        2014-11-22 CRAN (R 3.4.4)                  
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.4.4)                  
#>  methods    * 3.4.4      2018-03-16 local                           
#>  pillar       1.3.0      2018-07-14 CRAN (R 3.4.4)                  
#>  pkgconfig    2.0.2      2018-08-16 CRAN (R 3.4.4)                  
#>  purrr        0.2.5      2018-05-29 CRAN (R 3.4.4)                  
#>  Quandl     * 2.9.1      2018-08-14 CRAN (R 3.4.4)                  
#>  R6           2.2.2      2017-06-17 CRAN (R 3.4.4)                  
#>  Rcpp         0.12.18    2018-07-23 CRAN (R 3.4.4)                  
#>  readr        1.2.0      2018-07-06 Github (tidyverse/readr@4b2e93a)
#>  rlang        0.2.2      2018-08-16 cran (@0.2.2)                   
#>  rmarkdown    1.10       2018-06-11 CRAN (R 3.4.4)                  
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.4.4)                  
#>  stats      * 3.4.4      2018-03-16 local                           
#>  stringi      1.2.4      2018-07-20 CRAN (R 3.4.4)                  
#>  stringr      1.3.1      2018-05-10 CRAN (R 3.4.4)                  
#>  tibble       1.4.2      2018-01-22 CRAN (R 3.4.4)                  
#>  tidyquandl * 0.1.2.0    2018-09-26 local                           
#>  tools        3.4.4      2018-03-16 local                           
#>  utf8         1.1.4      2018-05-24 CRAN (R 3.4.4)                  
#>  utils      * 3.4.4      2018-03-16 local                           
#>  withr        2.1.2      2018-03-15 CRAN (R 3.4.4)                  
#>  xts        * 0.11-0     2018-07-16 CRAN (R 3.4.4)                  
#>  yaml         2.2.0      2018-07-25 CRAN (R 3.4.4)                  
#>  zoo        * 1.8-3      2018-07-16 CRAN (R 3.4.4)

Guess I need to use Quandl's type info after all.
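
A rough sketch of what that could look like: translate the type names reported by the API into readr's compact col_types string, falling back to guessing for anything unrecognised. The type names in the lookup table are illustrative assumptions, not a confirmed list of what the metadata endpoint returns, and to_col_types() is a hypothetical name.

```r
# Assumed mapping from API-reported type names to readr compact type codes.
# The names on the left are illustrative only.
type_codes <- c(String = "c", Date = "D", Integer = "i", Double = "d", BigDecimal = "d")

to_col_types <- function(api_types) {
  # Strip any parenthesised precision, e.g. "BigDecimal(34,12)" -> "BigDecimal".
  base_types <- sub("\\(.*$", "", api_types)
  codes <- unname(type_codes[base_types])
  codes[is.na(codes)] <- "?"  # "?" asks readr to guess that column
  paste(codes, collapse = "")
}

to_col_types(c("String", "Date", "BigDecimal(34,12)", "Mystery"))
#> [1] "cDd?"
```

With explicit types, a column that is all NA in the first rows would no longer be guessed as logical.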

Add support for Timeseries API?

Self-explanatory. Not very high on the list, as the Tables API is generally much better, and fewer and fewer bundles are Timeseries-only.

batch_parameters fails when the longest param's length is the same as the batch_size

There's an indexing problem when the biggest parameter is exactly as long as the batch_size

library(tidyquandl)

tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters) - 1)
#> $`1`
#> $`1`$x
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
#> [18] "r" "s" "t" "u" "v" "w" "x" "y"
#> 
#> 
#> $`2`
#> $`2`$x
#> [1] "z"
tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters))
#> Error in long_params[[1]]: subscript out of bounds
tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters) + 1)
#> [[1]]
#> [[1]]$x
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
#> [18] "r" "s" "t" "u" "v" "w" "x" "y" "z"

Created on 2018-06-13 by the reprex package (v0.2.0).
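
For comparison, a standalone sketch that treats "long" as strictly greater than batch_size, so the boundary case falls through to the single-batch path. This is a hypothetical reimplementation for illustration, not the code in R/utils.R.

```r
# Hypothetical reimplementation; assumes at most one parameter exceeds batch_size.
batch_parameters_sketch <- function(params, batch_size) {
  is_long <- lengths(params) > batch_size
  if (sum(is_long) > 1) {
    stop("At most one parameter may exceed `batch_size`.", call. = FALSE)
  }
  if (!any(is_long)) {
    # Nothing to split, including when a length equals batch_size exactly.
    return(list(params))
  }
  long_name <- names(params)[is_long]
  long <- params[[long_name]]
  chunks <- split(long, ceiling(seq_along(long) / batch_size))
  lapply(unname(chunks), function(chunk) {
    params[[long_name]] <- chunk  # modifies a local copy of params
    params
  })
}

length(batch_parameters_sketch(list(x = letters), batch_size = 25))  # 2 batches
length(batch_parameters_sketch(list(x = letters), batch_size = 26))  # 1 batch
```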

Only retry for HTTP errors 500+

httr::RETRY retries for 400+, which means it retries for user-side errors. Would be better to only retry for 500+ (Quandl-side errors), though that seems to mean we can't use httr::RETRY.

https://docs.quandl.com/docs/error-codes
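
A hand-rolled loop is one way to get this behaviour. The sketch below assumes the request is a zero-argument function returning a list with a status_code element (as an httr response has); retry_on_server_error() and fake_request() are hypothetical names.

```r
# Retry only when the status code signals a server-side (500+) error.
retry_on_server_error <- function(request, max_attempts = 3, pause = 1) {
  response <- NULL
  for (attempt in seq_len(max_attempts)) {
    response <- request()
    if (response$status_code < 500) {
      return(response)  # success or client error: do not retry
    }
    if (attempt < max_attempts) Sys.sleep(pause)
  }
  response  # still failing after max_attempts; hand back the last response
}

# Fake request that replays a fixed sequence of status codes.
fake_request <- function(codes) {
  i <- 0
  function() {
    i <<- i + 1
    list(status_code = codes[[i]])
  }
}

retry_on_server_error(fake_request(c(503, 200)), pause = 0)$status_code  # 200
retry_on_server_error(fake_request(c(404, 200)), pause = 0)$status_code  # 404
```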

Add CI w/ Travis & covr

Can set this up with usethis. Need to do some account-level setup on both services.

Add more references to documentation.

There's a @references tag to Quandl API documentation in the roxygen for quandl_key_set(); should add similar to quandl_datatable as well.

batch size is too big

It's hard to reproduce, but I've had some issues where Quandl returns an "unexpected error", but making the batches smaller fixes it. This is hacky, but I think bringing the batch size down a bit, to e.g. 50, should be better.

Expand batching functionality to multiple parameters

To simplify implementation, batch_parameters() in R/utils.R only works if at most one parameter is longer than batch_size. It should be possible to have this work on multiple parameters; if the batch_size is 10, and you pass in 20 tickers and 20 dates, that can be split into 4 calls:

tickers[1:10] & dates[1:10]
tickers[1:10] & dates[11:20]
tickers[11:20] & dates[1:10]
tickers[11:20] & dates[11:20]

i.e. split each long-param, create batches via cross-product.

Seems like it should be here for completeness, but honestly it seems a bit unlikely for this to be needed in practice.
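
A base-R sketch of the cross-product idea, in case it is ever needed. The helper names chunk() and cross_batches() are hypothetical; the batches come out in expand.grid() order (first parameter varying fastest), which differs from the listing above but covers the same four combinations.

```r
# Split a vector into consecutive chunks of at most batch_size elements.
chunk <- function(x, batch_size) {
  unname(split(x, ceiling(seq_along(x) / batch_size)))
}

# Build one parameter list per combination of chunks.
cross_batches <- function(params, batch_size) {
  chunks <- lapply(params, chunk, batch_size = batch_size)
  grid <- expand.grid(lapply(chunks, seq_along))
  lapply(seq_len(nrow(grid)), function(i) {
    idx <- unlist(grid[i, ], use.names = FALSE)
    Map(function(ch, j) ch[[j]], chunks, idx)
  })
}

tickers <- sprintf("T%02d", 1:20)
dates <- as.character(as.Date("2018-01-01") + 0:19)
batches <- cross_batches(list(ticker = tickers, date = dates), batch_size = 10)
length(batches)  # 4
```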

Allow user control of column types on read

Expose the col_types argument of readr::read_csv() in the signature of quandl_datatable(), so users can specify how they want things to be read.

Overlaps with #22 in that if renaming with qopts.columns, then the col_types should respect those names, which I think means the names in qopts.columns should become the col_names arg to readr::read_csv().

Add a NEWS.md

Start it with usethis and add something, anything to it.

Update DESCRIPTION & README

Pretty bland right now; should at least flesh out the example(s), add some badges, and perhaps opine a bit about the goals of this package and compare to Quandl.

Make pkgdown site.

Should be straightforward: usethis + pkgdown.

Expand auth workflow

quandl_api_key is essentially a clone of Quandl::Quandl.api_key; I think it could be more.

It would be nice to support unsetting and validating of keys. To make this possible, I think we'd need multiple functions, rather than one; will need to be careful to not overcomplicate.

Would also be cool to support better key-import workflows, like reading from an environment variable ("QUANDL_API_KEY") or storing somewhere safer (e.g. w/ keyring).

API key function(s)

Options

- Basically the same as Quandl::Quandl.api_key, but don't use missing
- Separate getter/setter functions. Could have better validation in the setter, e.g. throw out a cheap query that requires a key, but not any particular subscription

Should probably use same option name for backwards-compatibility, at least as long as Quandl::Quandl.datatable is being used underneath.
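
A minimal sketch of the separate-functions option. It assumes the Quandl package stores the key in the Quandl.api_key option, and uses the QUANDL_API_KEY environment variable as a fallback; the *_sketch function names are hypothetical, and no key validation against the API is attempted here.

```r
# Setter: basic local validation only, then store under the assumed option name.
quandl_key_set_sketch <- function(key) {
  stopifnot(is.character(key), length(key) == 1, !is.na(key), nzchar(key))
  options(Quandl.api_key = key)
  invisible(key)
}

# Unsetter: remove the option again.
quandl_key_unset_sketch <- function() {
  options(Quandl.api_key = NULL)
  invisible(NULL)
}

# Getter: prefer the option, fall back to the environment variable.
quandl_key_get_sketch <- function() {
  key <- getOption("Quandl.api_key")
  if (is.null(key)) {
    env_key <- Sys.getenv("QUANDL_API_KEY")
    if (nzchar(env_key)) key <- env_key
  }
  key  # NULL when no key is available from either source
}
```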

Add vignette comparing Quandl and tidyquandl.

Demonstrate the differences side-by-side.

Could throw in some speed tests, too, though they seem roughly equal.

Allow for filter-style expressions in query

Wouldn't it be cool if instead of

quandl_datatable("ZACKS/P", ticker = c("AAPL", "GOOGL"), date.gte = "2018-06-01")

we could do

quandl_datatable("ZACKS/P", ticker %in% c("AAPL", "GOOGL"), date >= "2018-06-01")

I have no idea how to do this right now, or how difficult it is, or how useful it is...but I bet it's possible!
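
One way it might work, sketched in base R: capture the unevaluated expressions with substitute(), then map each operator to the parameter-name suffix the API expects. The suffixes other than .gte are assumed to follow the same pattern, and filters_to_params() and translate_filter() are hypothetical names.

```r
# Turn one captured expression such as `date >= "2018-06-01"` into a named list.
translate_filter <- function(expr, env) {
  op <- as.character(expr[[1]])
  name <- as.character(expr[[2]])
  value <- eval(expr[[3]], env)
  suffix <- switch(op,
    "%in%" = ,
    "==" = "",
    ">=" = ".gte",
    "<=" = ".lte",
    ">" = ".gt",
    "<" = ".lt",
    stop("Unsupported operator: ", op, call. = FALSE)
  )
  stats::setNames(list(value), paste0(name, suffix))
}

# Capture all filter expressions without evaluating the column names.
filters_to_params <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  env <- parent.frame()
  do.call(c, lapply(exprs, translate_filter, env = env))
}

filters_to_params(ticker %in% c("AAPL", "GOOGL"), date >= "2018-06-01")
```

The resulting named list has the same shape as the arguments in the first call above, so it could be spliced into the existing request-building code.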

Make retry-settings options

Rather than arg-level defaults, tidyquandl should set options like tidyquandl.max_attempts and tidyquandl.timeout, and use those for the argument defaults. Then it's easier for a user to make changes that apply to all their calls, while still overriding on a per-call basis if needed.

Not sure if these need to be set on package-load, or if they can be completely optional, using defaults when not present.
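
On the second point, getOption() accepts a fallback value, so the options can stay completely optional. A small sketch, where the function name and the default values 3 and 30 are assumptions for illustration:

```r
# Argument defaults read the option at call time and fall back when it is unset.
fetch_sketch <- function(max_attempts = getOption("tidyquandl.max_attempts", 3L),
                         timeout = getOption("tidyquandl.timeout", 30)) {
  list(max_attempts = max_attempts, timeout = timeout)
}

fetch_sketch()                         # package defaults
options(tidyquandl.max_attempts = 5L)
fetch_sketch()                         # session-wide override
fetch_sketch(max_attempts = 1L)        # per-call override still wins
```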

Update NEWS.md

Pretty far behind: http://claytonjy.com/tidyquandl/news/index.html

Add tests of Quandl-equivalence in `test-tables.R`

In the key tests, there's a test that ensures compatibility with Quandl::Quandl.api_key(); similarly, quandl_datatable() should be compared against Quandl::Quandl.datatable in a test or two.

May not be able to compare types, but dimension should be good enough.

Use only free datasets in tests & docs

For reproducibility, examples and tests should run for anyone, even those without any premium subscription.

Unfortunately, this is a short list when restricted to the Tables API, and search seems to only return Timeseries links, though some may also be available via Tables.

The only one I currently know works is "WIKI/PRICES": https://www.quandl.com/databases/WIKIP. Seems like a lot of recent data is missing, but nothing we can do about that.

Add function for accessing table metadata.

Documented here: https://docs.quandl.com/docs/in-depth-usage-1#section-get-table-metadata

Call it quandl_datatable_meta(), use the right path against quandl_api. CSV format looks weird; force it to JSON.

Raw API function(s)

Being a thin wrapper around Quandl::Quandl.datatable is limiting; in particular, we can't easily fix the bad error handling done there (e.g. json-looking error messages with curly braces in them). Need our own lower-level function, probably a bit like Quandl::Quandl.api.

Should we request JSON, and turn that into a list, then a tibble, or request CSVs and read them with readr? Should do some performance testing, and consider if one or the other does better type inference.

Allow renaming in qopts.columns

Related-to/blocked-by #4.

It would be pretty sweet if specifying qopts.columns = c("ticker", price = "close") (or whatever that arg becomes) would return the close column with price as its name, like most dplyr functions do.
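
A sketch of how the names could be resolved: a partially named character vector carries both the columns to request and the names to apply afterwards. resolve_columns() is a hypothetical helper.

```r
# Split a partially named vector into API column names and output names.
resolve_columns <- function(columns) {
  new_names <- names(columns)
  if (is.null(new_names)) {
    new_names <- rep("", length(columns))
  }
  unnamed <- new_names == ""
  new_names[unnamed] <- columns[unnamed]  # unnamed entries keep their own name
  list(
    request = unname(columns),                           # what to send to the API
    rename = stats::setNames(unname(columns), new_names) # new name -> API name
  )
}

resolve_columns(c("ticker", price = "close"))
```

The rename element follows the new = old convention that dplyr::rename() uses.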
