tidyquandl's Issues

Instruct users how to grab latest release

I think I'm going to continue to do "development" on master, e.g. master's latest commit will often have a version like x.y.z.9000, but I don't plan on submitting to CRAN. We need a way to tell users to "install latest release", hopefully without updating a command like devtools::install_github("claytonjy/tidyquandl@<tag>") each time we bump the release.

From this SO post, it sounds like we can suggest that users run

devtools::install_github("claytonjy/tidyquandl@*release")

which seems pretty dope. Assuming this works, let's update the README to match.

Replace checkmate-validation with rlang

I love the checkmate package, but I'm becoming less convinced it's a good tool for argument validation inside user-facing functions; the error messages just aren't good.

Here's an example as it appears now, with some alternatives:

x <- c("foo", "bar")

tidyquandl::quandl_datatable(x)
#> Error in withCallingHandlers({: Assertion on 'code' failed: Must have length 1.

stopifnot(rlang::is_string(x))
#> Error: rlang::is_string(x) is not TRUE

if (!rlang::is_string(x)) stop("`code` must be a single string")
#> Error in eval(expr, envir, enclos): `code` must be a single string

Not actually sure which of the last two I prefer; hard to write a super-clear error message, and the stopifnot version provides the user a path to dig deeper (e.g. "what does is_string check for? ... ?rlang::is_string")

Could keep using checkmate in tests, just move it to Suggests, so users don't install it as a dependency.
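
One way to get a clearer message without checkmate is a small hand-written check. The sketch below is only an illustration: assert_string() is a hypothetical helper, not part of tidyquandl, and it uses base R so that it runs without extra packages (rlang::is_string(x) could replace the condition).

```r
# Hypothetical helper, not part of tidyquandl: fail with a message that names
# the argument and describes what was actually supplied.
assert_string <- function(x, arg = deparse(substitute(x))) {
  ok <- is.character(x) && length(x) == 1L && !is.na(x)
  if (!ok) {
    stop(
      sprintf(
        "`%s` must be a single string, not %s of length %d.",
        arg, class(x)[1], length(x)
      ),
      call. = FALSE  # hide the internal call so the message stands alone
    )
  }
  invisible(x)
}

# Passes silently for a valid table code.
assert_string("WIKI/PRICES", arg = "code")
```

Passing arg explicitly lets the message use the user-facing argument name rather than whatever the variable happens to be called internally.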

Warning from readr when subsetting columns.

Need to subset col_types internally when user subsets columns via qopts.columns.

library(tidyquandl)

quandl_key_set()

quandl_datatable("WIKI/PRICES", ticker = "AAPL", date = "2018-01-02", qopts.columns = c("ticker", "date", "close"))
#> Warning: The following named parsers don't match the column names: open,
#> high, low, volume, ex-dividend, split_ratio, adj_open, adj_high, adj_low,
#> adj_close, adj_volume
#> # A tibble: 1 x 3
#>   ticker date       close
#>   <chr>  <date>     <dbl>
#> 1 AAPL   2018-01-02  172.

Created on 2019-01-13 by the reprex package (v0.2.1)
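
A minimal sketch of the subsetting idea, assuming the internal column specification can be held as a named list of readr-style type codes. Both full_types and subset_col_types() are hypothetical names used for illustration only.

```r
# Hypothetical full specification for a table, as a named list of readr-style
# type codes ("c" = character, "D" = date, "d" = double).
full_types <- list(ticker = "c", date = "D", open = "d", close = "d", volume = "d")

# Keep only the parsers for the columns the user asked for, in the order
# requested; with no subset, the full specification is returned unchanged.
subset_col_types <- function(col_types, columns = NULL) {
  if (is.null(columns)) {
    return(col_types)
  }
  col_types[intersect(columns, names(col_types))]
}

requested <- subset_col_types(full_types, c("ticker", "date", "close"))
names(requested)
```

Because only parsers for requested columns remain, there would be no leftover named parsers for readr to warn about.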

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  os       Linux Mint 18.3             
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/Detroit             
#>  date     2019-01-13                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                        
#>  assertthat    0.2.0      2017-04-11 [1] CRAN (R 3.4.4)                
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.4.4)                
#>  bindr         0.1.1      2018-03-13 [1] CRAN (R 3.4.4)                
#>  bindrcpp      0.2.2      2018-03-29 [1] CRAN (R 3.4.4)                
#>  callr         3.1.1      2018-12-21 [1] CRAN (R 3.4.4)                
#>  cli           1.0.1      2018-09-25 [1] CRAN (R 3.4.4)                
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.4.4)                
#>  curl          3.3        2019-01-10 [1] CRAN (R 3.4.4)                
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.4.4)                
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.4.4)                
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.4.4)                
#>  dplyr         0.7.8      2018-11-10 [1] CRAN (R 3.4.4)                
#>  evaluate      0.12       2018-10-09 [1] CRAN (R 3.4.4)                
#>  fansi         0.4.0      2018-10-05 [1] CRAN (R 3.4.4)                
#>  fs            1.2.6      2018-08-23 [1] CRAN (R 3.4.4)                
#>  glue          1.3.0      2018-07-17 [1] CRAN (R 3.4.4)                
#>  highr         0.7        2018-06-09 [1] CRAN (R 3.4.4)                
#>  hms           0.4.2.9000 2018-07-03 [1] Github (tidyverse/hms@2e0a39a)
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.4.4)                
#>  httr          1.4.0      2018-12-11 [1] CRAN (R 3.4.4)                
#>  jsonlite      1.6        2018-12-07 [1] CRAN (R 3.4.4)                
#>  knitr         1.21       2018-12-10 [1] CRAN (R 3.4.4)                
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.4.4)                
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.4.4)                
#>  pillar        1.3.1      2018-12-15 [1] CRAN (R 3.4.4)                
#>  pkgbuild      1.0.2      2018-10-16 [1] CRAN (R 3.4.4)                
#>  pkgconfig     2.0.2      2018-08-16 [1] CRAN (R 3.4.4)                
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.4.4)                
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.4.4)                
#>  processx      3.2.1      2018-12-05 [1] CRAN (R 3.4.4)                
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.4.4)                
#>  purrr         0.2.5      2018-05-29 [1] CRAN (R 3.4.4)                
#>  R6            2.3.0      2018-10-04 [1] CRAN (R 3.4.4)                
#>  Rcpp          1.0.0      2018-11-07 [1] CRAN (R 3.4.4)                
#>  readr         1.3.1      2018-12-21 [1] CRAN (R 3.4.4)                
#>  remotes       2.0.2      2018-10-30 [1] CRAN (R 3.4.4)                
#>  rlang         0.3.1      2019-01-08 [1] CRAN (R 3.4.4)                
#>  rmarkdown     1.11       2018-12-08 [1] CRAN (R 3.4.4)                
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.4.4)                
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.4.4)                
#>  stringi       1.2.4      2018-07-20 [1] CRAN (R 3.4.4)                
#>  stringr       1.3.1      2018-05-10 [1] CRAN (R 3.4.4)                
#>  testthat      2.0.1      2018-10-13 [1] CRAN (R 3.4.4)                
#>  tibble        2.0.1      2019-01-12 [1] CRAN (R 3.4.4)                
#>  tidyquandl  * 0.1.2.9000 2019-01-13 [1] local                         
#>  tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.4.4)                
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.4.4)                
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.4.4)                
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.4.4)                
#>  xfun          0.4        2018-10-23 [1] CRAN (R 3.4.4)                
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.4.4)                
#> 
#> [1] /home/claytonjy/R/x86_64-pc-linux-gnu-library/3.4
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

Create batches in quandl_datatable

The Quandl API can only handle a certain number of parameters in a request, which means you can't specify e.g. 1000 tickers as a filter in one call. This should be abstracted from the user so that any number of tickers (or whatever) can be given, and quandl_datatable will break that into multiple requests (batches) and return the results as if it were a single request.
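
A rough sketch of what that could look like, under stated assumptions: chunk(), fetch_one_batch() and fetch_all() are hypothetical names, fetch_one_batch() is a stand-in for a real API request, and the batch size of 100 is arbitrary.

```r
# Split a vector into consecutive chunks of at most `batch_size` elements.
chunk <- function(x, batch_size) {
  unname(split(x, ceiling(seq_along(x) / batch_size)))
}

# Hypothetical stand-in for one API request; returns one row per ticker.
fetch_one_batch <- function(tickers) {
  data.frame(ticker = tickers, stringsAsFactors = FALSE)
}

# Issue one request per chunk and stack the results into a single table.
fetch_all <- function(tickers, batch_size = 100) {
  batches <- chunk(tickers, batch_size)
  do.call(rbind, lapply(batches, fetch_one_batch))
}

tickers <- sprintf("T%04d", 1:1000)
result <- fetch_all(tickers, batch_size = 100)
nrow(result)
```

The caller passes all 1000 tickers at once and gets one data frame back; the ten underlying requests stay hidden.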

Transpose metadata output

Instead of taking one table-name and producing a long list with a data.frame in it, quandl_datatable_meta() should take a vector of table-names and return a tibble with one row per table and nested tibbles as appropriate.

Type inference can fail from guessing

library(tidyquandl)

quandl_key_set()

quandl_datatable("ZACKS/P", ticker = "AAPL")
#> Warning: 16666 parsing failures.
#>  row  col           expected actual         file
#> 2093 high 1/0/T/F/TRUE/FALSE 1.5759 literal data
#> 2093 low  1/0/T/F/TRUE/FALSE 1.5089 literal data
#> 2094 high 1/0/T/F/TRUE/FALSE 1.5848 literal data
#> 2094 low  1/0/T/F/TRUE/FALSE 1.5536 literal data
#> 2095 high 1/0/T/F/TRUE/FALSE 1.5804 literal data
#> .... .... .................. ...... ............
#> See problems(...) for more details.
#> # A tibble: 8,140 x 12
#>    m_ticker ticker comp_name comp_name_2 exchange currency_code date      
#>    <chr>    <chr>  <chr>     <lgl>       <chr>    <chr>         <date>    
#>  1 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-24
#>  2 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-25
#>  3 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-26
#>  4 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-27
#>  5 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-30
#>  6 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-03-31
#>  7 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-01
#>  8 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-02
#>  9 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-03
#> 10 AAPL     AAPL   APPLE INC NA          NSDQ     USD           1987-04-06
#> # ... with 8,130 more rows, and 5 more variables: open <lgl>, high <lgl>,
#> #   low <lgl>, close <dbl>, volume <dbl>

library(Quandl)
#> Loading required package: xts
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric

tibble::as_tibble(Quandl.datatable("ZACKS/P", ticker = "AAPL", paginate = TRUE))
#> # A tibble: 8,140 x 12
#>    m_ticker ticker comp_name comp_name_2 exchange currency_code date      
#>    <chr>    <chr>  <chr>     <chr>       <chr>    <chr>         <date>    
#>  1 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-24
#>  2 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-25
#>  3 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-26
#>  4 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-27
#>  5 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-30
#>  6 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-03-31
#>  7 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-01
#>  8 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-02
#>  9 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-03
#> 10 AAPL     AAPL   APPLE INC <NA>        NSDQ     USD           1987-04-06
#> # ... with 8,130 more rows, and 5 more variables: open <dbl>, high <dbl>,
#> #   low <dbl>, close <dbl>, volume <dbl>

Created on 2018-09-25 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Detroit             
#>  date     2018-09-25
#> Packages -----------------------------------------------------------------
#>  package    * version    date       source                          
#>  assertthat   0.2.0      2017-04-11 CRAN (R 3.4.4)                  
#>  backports    1.1.2      2017-12-13 CRAN (R 3.4.4)                  
#>  base       * 3.4.4      2018-03-16 local                           
#>  cli          1.0.0      2017-11-05 CRAN (R 3.4.4)                  
#>  compiler     3.4.4      2018-03-16 local                           
#>  crayon       1.3.4      2017-09-16 CRAN (R 3.4.4)                  
#>  curl         3.2        2018-03-28 CRAN (R 3.4.4)                  
#>  datasets   * 3.4.4      2018-03-16 local                           
#>  devtools     1.13.6     2018-06-27 CRAN (R 3.4.4)                  
#>  digest       0.6.17     2018-09-12 cran (@0.6.17)                  
#>  evaluate     0.11       2018-07-17 CRAN (R 3.4.4)                  
#>  fansi        0.3.0      2018-08-13 CRAN (R 3.4.4)                  
#>  glue         1.3.0      2018-07-17 CRAN (R 3.4.4)                  
#>  graphics   * 3.4.4      2018-03-16 local                           
#>  grDevices  * 3.4.4      2018-03-16 local                           
#>  grid         3.4.4      2018-03-16 local                           
#>  hms          0.4.2.9000 2018-07-03 Github (tidyverse/hms@2e0a39a)  
#>  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.4)                  
#>  httr         1.3.1      2017-08-20 CRAN (R 3.4.4)                  
#>  jsonlite     1.5        2017-06-01 CRAN (R 3.4.4)                  
#>  knitr        1.20       2018-02-20 CRAN (R 3.4.4)                  
#>  lattice      0.20-35    2017-03-25 CRAN (R 3.3.3)                  
#>  magrittr     1.5        2014-11-22 CRAN (R 3.4.4)                  
#>  memoise      1.1.0      2017-04-21 CRAN (R 3.4.4)                  
#>  methods    * 3.4.4      2018-03-16 local                           
#>  pillar       1.3.0      2018-07-14 CRAN (R 3.4.4)                  
#>  pkgconfig    2.0.2      2018-08-16 CRAN (R 3.4.4)                  
#>  purrr        0.2.5      2018-05-29 CRAN (R 3.4.4)                  
#>  Quandl     * 2.9.1      2018-08-14 CRAN (R 3.4.4)                  
#>  R6           2.2.2      2017-06-17 CRAN (R 3.4.4)                  
#>  Rcpp         0.12.18    2018-07-23 CRAN (R 3.4.4)                  
#>  readr        1.2.0      2018-07-06 Github (tidyverse/readr@4b2e93a)
#>  rlang        0.2.2      2018-08-16 cran (@0.2.2)                   
#>  rmarkdown    1.10       2018-06-11 CRAN (R 3.4.4)                  
#>  rprojroot    1.3-2      2018-01-03 CRAN (R 3.4.4)                  
#>  stats      * 3.4.4      2018-03-16 local                           
#>  stringi      1.2.4      2018-07-20 CRAN (R 3.4.4)                  
#>  stringr      1.3.1      2018-05-10 CRAN (R 3.4.4)                  
#>  tibble       1.4.2      2018-01-22 CRAN (R 3.4.4)                  
#>  tidyquandl * 0.1.2.0    2018-09-26 local                           
#>  tools        3.4.4      2018-03-16 local                           
#>  utf8         1.1.4      2018-05-24 CRAN (R 3.4.4)                  
#>  utils      * 3.4.4      2018-03-16 local                           
#>  withr        2.1.2      2018-03-15 CRAN (R 3.4.4)                  
#>  xts        * 0.11-0     2018-07-16 CRAN (R 3.4.4)                  
#>  yaml         2.2.0      2018-07-25 CRAN (R 3.4.4)                  
#>  zoo        * 1.8-3      2018-07-16 CRAN (R 3.4.4)

Guess I need to use Quandl's type info after all.
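
A sketch of how reported types could drive parsing instead of guessing. The type labels matched below ("String", "Date", "BigDecimal(...)", and so on) are assumptions about what the metadata contains, and type_to_code() and meta are hypothetical.

```r
# Hypothetical mapping from API-reported column types to readr's compact type
# codes. The labels matched here are assumptions, not a verified list.
type_to_code <- function(type) {
  type <- tolower(type)
  if (grepl("^datetime", type)) {
    "T"  # date-time
  } else if (grepl("^date", type)) {
    "D"  # date
  } else if (grepl("^(bigdecimal|float|double|integer)", type)) {
    "d"  # double
  } else {
    "c"  # fall back to character
  }
}

# Hypothetical metadata for a few columns of a table.
meta <- data.frame(
  name = c("ticker", "comp_name_2", "date", "high"),
  type = c("String", "String", "Date", "BigDecimal(34,12)"),
  stringsAsFactors = FALSE
)

# Build a compact col_types string, one code per column in table order.
col_types <- paste(vapply(meta$type, type_to_code, character(1)), collapse = "")
col_types
```

With explicit types, a column whose leading rows are all missing (like high or comp_name_2 above) would not be guessed as logical.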

Add support for Timeseries API?

Self-explanatory. Not very high on the list, as the Tables API is generally much better, and fewer and fewer bundles are Timeseries-only.

batch_parameters fails when the longest param's length is the same as the batch_size

There's an indexing problem when the biggest parameter is exactly as long as the batch_size:

library(tidyquandl)

tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters) - 1)
#> $`1`
#> $`1`$x
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
#> [18] "r" "s" "t" "u" "v" "w" "x" "y"
#> 
#> 
#> $`2`
#> $`2`$x
#> [1] "z"
tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters))
#> Error in long_params[[1]]: subscript out of bounds
tidyquandl:::batch_parameters(list(x = letters), batch_size = length(letters) + 1)
#> [[1]]
#> [[1]]$x
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
#> [18] "r" "s" "t" "u" "v" "w" "x" "y" "z"

Created on 2018-06-13 by the reprex package (v0.2.0).
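
The cause of the failure isn't shown here, so the following is not a patch. It is a hedged sketch of a batching function whose boundary is explicit: a parameter only counts as "long" when it is strictly longer than batch_size. batch_params_sketch() is a hypothetical stand-in and assumes a named list of parameters.

```r
# Sketch of batching a named parameter list when at most one parameter is
# longer than `batch_size`; a stand-in, not the package's batch_parameters().
batch_params_sketch <- function(params, batch_size) {
  # Strictly greater: a parameter of exactly `batch_size` fits in one batch.
  is_long <- lengths(params) > batch_size
  if (sum(is_long) > 1L) {
    stop("Only one parameter may be longer than `batch_size`.", call. = FALSE)
  }
  if (!any(is_long)) {
    return(list(params))
  }
  long_name <- names(params)[is_long]
  long_value <- params[[long_name]]
  groups <- ceiling(seq_along(long_value) / batch_size)
  # One copy of the parameter list per chunk of the long parameter.
  lapply(unname(split(long_value, groups)), function(piece) {
    params[[long_name]] <- piece
    params
  })
}

# The three cases from the reprex above: one below, at, and above the boundary.
sizes <- length(letters) + c(-1, 0, 1)
vapply(sizes, function(n) length(batch_params_sketch(list(x = letters), n)), integer(1))
```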

batch size is too big

It's hard to reproduce, but I've had some issues where Quandl returns an "unexpected error", but making the batches smaller fixes it. This is hacky, but I think bringing the batch size down a bit, to e.g. 50, should be better.

Expand batching functionality to multiple parameters

To simplify implementation, batch_parameters() in R/utils.R only works if at most one parameter is longer than batch_size. It should be possible to make this work with multiple parameters; if the batch_size is 10, and you pass in 20 tickers and 20 dates, that can be split into 4 calls:

  • tickers[1:10] & dates[1:10]
  • tickers[1:10] & dates[11:20]
  • tickers[11:20] & dates[1:10]
  • tickers[11:20] & dates[11:20]

i.e. split each long param, then create batches via cross-product.

Seems like it should be here for completeness, but honestly it seems a bit unlikely for this to be needed in practice.
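
For what it's worth, the cross-product idea can be sketched in a few lines of base R. chunk() and cross_batches() are hypothetical names, and the order of the resulting batches follows expand.grid() (first parameter varies fastest), which differs from the order of the bullets above.

```r
# Split a vector into consecutive chunks of at most `batch_size` elements.
chunk <- function(x, batch_size) {
  unname(split(x, ceiling(seq_along(x) / batch_size)))
}

# Chunk every parameter, then build one batch per combination of chunks.
cross_batches <- function(params, batch_size) {
  chunked <- lapply(params, chunk, batch_size = batch_size)
  # One row per combination of chunk indices across all parameters.
  index <- expand.grid(lapply(chunked, seq_along))
  lapply(seq_len(nrow(index)), function(i) {
    picks <- as.list(index[i, , drop = FALSE])
    Map(function(chunks, j) chunks[[j]], chunked, picks)
  })
}

tickers <- sprintf("T%02d", 1:20)
dates <- as.character(as.Date("2018-01-01") + 0:19)
batches <- cross_batches(list(ticker = tickers, date = dates), batch_size = 10)
length(batches)
```

Parameters that already fit in one batch produce a single chunk, so they pass through unchanged in every combination.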

Allow user control of column types on read

Expose the col_types argument of readr::read_csv() in the signature of quandl_datatable(), so users can specify how they want things to be read.

Overlaps with #22 in that if renaming with qopts.columns, then the col_types should respect those names, which I think means the names in qopts.columns should become the col_names arg to readr::read_csv().

Add a NEWS.md

Start it with usethis and add something, anything to it.

Update DESCRIPTION & README

Pretty bland right now; should at least flesh out the example(s), add some badges, and perhaps opine a bit about the goals of this package and compare to Quandl.

Expand auth workflow

quandl_api_key is essentially a clone of Quandl::Quandl.api_key; I think it could be more.

It would be nice to support unsetting and validating of keys. To make this possible, I think we'd need multiple functions, rather than one; will need to be careful to not overcomplicate.

Would also be cool to support better key-import workflows, like reading from an environment variable ("QUANDL_API_KEY") or storing somewhere safer (e.g. w/ keyring).
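
A sketch of the multi-function shape, with the environment variable as the default source. The function names key_set(), key_get() and key_unset() are hypothetical, and the option name "Quandl.api_key" is assumed to be the one the Quandl package reads. Validating the key against the API is left out.

```r
# Store the key in an option; by default, read it from QUANDL_API_KEY.
key_set <- function(key = Sys.getenv("QUANDL_API_KEY")) {
  if (!is.character(key) || length(key) != 1L || is.na(key) || !nzchar(key)) {
    stop("`key` must be a single non-empty string.", call. = FALSE)
  }
  options(Quandl.api_key = key)  # option name is an assumption
  invisible(key)
}

# Return the stored key, or fail with a pointer to the setter.
key_get <- function() {
  key <- getOption("Quandl.api_key")
  if (is.null(key)) {
    stop("No API key set; call key_set() first.", call. = FALSE)
  }
  key
}

# Remove the stored key.
key_unset <- function() {
  options(Quandl.api_key = NULL)
  invisible(NULL)
}
```

Keeping each function to one job avoids the missing()-style branching of a combined getter/setter.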

API key function(s)

Options

  1. Basically the same as Quandl::Quandl.api_key, but don't use missing
  2. separate getter/setter functions. Could have better validation in the setter, e.g. throw out a cheap query that requires a key, but not any particular subscription

Should probably use same option name for backwards-compatibility, at least as long as Quandl::Quandl.datatable is being used underneath.

Allow for filter-style expressions in query

Wouldn't it be cool if instead of

quandl_datatable("ZACKS/P", ticker = c("AAPL", "GOOGL"), date.gte = "2018-06-01")

we could do

quandl_datatable("ZACKS/P", ticker %in% c("AAPL", "GOOGL"), date >= "2018-06-01")

?

I have no idea how to do this right now, or how difficult it is, or how useful it is...but I bet it's possible!
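
A first rough sketch suggests it is doable with base R's substitute(): capture each expression unevaluated, read off the operator and column name, and evaluate only the right-hand side. filters_to_params() and op_suffix are hypothetical, and the suffixes other than .gte (which appears above) are assumptions about the API's parameter naming.

```r
# Map comparison operators onto parameter-name suffixes (assumed naming).
op_suffix <- c("==" = "", "%in%" = "", ">" = ".gt", ">=" = ".gte",
               "<" = ".lt", "<=" = ".lte")

# Turn expressions like `date >= "2018-06-01"` into a named parameter list.
filters_to_params <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated arguments
  env <- parent.frame()                        # where right-hand sides live
  params <- list()
  for (e in exprs) {
    supported <- is.call(e) && length(e) == 3L && is.symbol(e[[2]]) &&
      as.character(e[[1]]) %in% names(op_suffix)
    if (!supported) {
      stop("Unsupported filter: ", deparse(e), call. = FALSE)
    }
    name <- paste0(as.character(e[[2]]), op_suffix[[as.character(e[[1]])]])
    params[[name]] <- eval(e[[3]], env)  # only the value side is evaluated
  }
  params
}

cutoff <- "2018-06-01"
params <- filters_to_params(ticker %in% c("AAPL", "GOOGL"), date >= cutoff)
str(params)
```

The resulting list has the same shape as the named arguments in the first call above, so it could be passed through with do.call().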

Make retry-settings options

Rather than arg-level defaults, tidyquandl should set options like tidyquandl.max_attempts and tidyquandl.timeout, and use those for the argument defaults. Then it's easier for a user to make changes that apply to all their calls, while still overriding on a per-call basis if needed.

Not sure if these need to be set on package-load, or if they can be completely optional, using defaults when not present.
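
On the last point, getOption() takes a default value, so the options can stay entirely optional. The sketch below shows the pattern for max_attempts only; with_retry() is a hypothetical helper and the fallback of 3 attempts is an arbitrary assumption. A timeout argument could follow the same pattern.

```r
# Call `f` until it succeeds or `max_attempts` is reached. The default comes
# from an option when set, and from a hard-coded fallback otherwise.
with_retry <- function(f, max_attempts = getOption("tidyquandl.max_attempts", 3L)) {
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
  }
  stop(result)  # re-signal the last error
}

# A user can change the default for all calls ...
options(tidyquandl.max_attempts = 5L)
# ... and still override it for a single call.
with_retry(function() "ok", max_attempts = 1L)

# Reset so the fallback applies again.
options(tidyquandl.max_attempts = NULL)
```

Because the default argument is evaluated at call time, changing the option mid-session takes effect on the next call without reloading anything.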

Add tests of Quandl-equivalence in `test-tables.R`

In the key tests, there's a test that ensures compatibility with Quandl::Quandl.api_key(); similarly, quandl_datatable() should be compared against Quandl::Quandl.datatable in a test or two.

May not be able to compare types, but dimension should be good enough.

Use only free datasets in tests & docs

For reproducibility, examples and tests should run for anyone, even those without any premium subscription.

Unfortunately, this is a short list when restricted to the Tables API, and search seems to only return Timeseries links, though some may also be available via Tables.

The only one I currently know works is "WIKI/PRICES": https://www.quandl.com/databases/WIKIP. Seems like a lot of recent data is missing, but nothing we can do about that.

Raw API function(s)

Being a thin wrapper around Quandl::Quandl.datatable is limiting; in particular, we can't easily fix the bad error handling done there (e.g. json-looking error messages with curly braces in them). Need our own lower-level function, probably a bit like Quandl::Quandl.api.

Should we request JSON, and turn that into a list, then a tibble, or request CSVs and read them with readr? Should do some performance testing, and consider if one or the other does better type inference.

Allow renaming in qopts.columns

Related-to/blocked-by #4.

It would be pretty sweet if specifying qopts.columns = c("ticker", price = "close") (or whatever that arg becomes) would return the close column with price as its name, like how most dplyr functions do.
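
A small sketch of how a partially named vector could be split into "what to request" and "what to call it". resolve_columns() is a hypothetical helper, and the response data frame is a made-up placeholder rather than real API output.

```r
# Split a possibly-named column vector into the names to request from the API
# and the names to give the columns in the result.
resolve_columns <- function(columns) {
  new_names <- names(columns)
  if (is.null(new_names)) {
    new_names <- rep("", length(columns))
  }
  list(
    request = unname(columns),
    # Unnamed entries keep their original column name.
    rename = ifelse(new_names == "", unname(columns), new_names)
  )
}

spec <- resolve_columns(c("ticker", price = "close"))

# Placeholder response holding the requested columns in request order.
response <- data.frame(ticker = "AAPL", close = 100, stringsAsFactors = FALSE)
names(response) <- spec$rename
names(response)
```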
