
pipelearner's People

Contributors

drsimonj


pipelearner's Issues

Issue running example code

I have another issue. It looks to be triggered by the contains("rsquare") part.

results %>%
  add_rsquare() %>%
  select(cv_pairs.id, contains("rsquare")) %>%
  gather(source, rsquare, contains("rsquare")) %>%
  mutate(source = gsub("rsquare_", "", source)) %>%
  ggplot(aes(cv_pairs.id, rsquare, color = source)) +
  geom_point() +
  labs(x = "Fold", y = "R Squared")

Error in .p(.x[[i]], ...) : argument ".y" is missing, with no default

traceback()
20: .p(.x[[i]], ...)
19: isTRUE(.p(.x[[i]], ...))
18: some(.x, identical, .y)
17: contains("rsquare")
16: eval(expr, envir, enclos)
15: eval(x$expr, data, x$env)
14: FUN(X[[i]], ...)
13: lapply(x, lazy_eval, data = data)
12: lazyeval::lazy_eval(args, names_list)
11: select_vars_(names(.data), dots)
10: select_.data.frame(.data, .dots = lazyeval::lazy_dots(...))
9: select_(.data, .dots = lazyeval::lazy_dots(...))
8: select(., cv_pairs.id, contains("rsquare"))
7: function_list[[i]](value)
6: freduce(value, `_function_list`)
5: `_fseq`(`_lhs`)
4: eval(expr, envir, enclos)
3: eval(quote(`_fseq`(`_lhs`)), env, env)
2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1: results %>% add_rsquare() %>% select(cv_pairs.id, contains("rsquare")) %>%
gather(source, rsquare, contains("rsquare")) %>% mutate(source = gsub("rsquare_",
"", source)) %>% ggplot(aes(cv_pairs.id, rsquare, color = source))

What do you think is causing this?
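
Frames 17 and 18 of the traceback show contains("rsquare") ending up in some(.x, identical, .y), which matches purrr's contains() rather than the dplyr select helper. One plausible cause, not confirmed in this thread, is that purrr was attached after dplyr and masked the helper. A minimal sketch of a workaround under that assumption is to namespace-qualify the helper. The results data frame below is an invented stand-in, not real pipelearner output.

```r
library(dplyr)

# Hypothetical stand-in for the output of add_rsquare(); values are invented.
results <- data.frame(
  cv_pairs.id   = c("01", "02", "03"),
  rsquare_train = c(0.81, 0.79, 0.83),
  rsquare_test  = c(0.74, 0.70, 0.77),
  other         = c(1, 2, 3)
)

# Qualifying the helper with dplyr:: avoids picking up a same-named
# function from another attached package.
rsq <- results %>%
  dplyr::select(cv_pairs.id, dplyr::contains("rsquare"))

names(rsq)
```

If this runs cleanly while the unqualified call fails, masking is the likely culprit.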

predict shouldn't expand results

Currently, predict.pipelearner assumes that calling predict on each fit returns a single vector of values. However, this isn't always the case. For example, with default settings, predict.rpart on a classification tree returns a matrix of predicted class probabilities, with one column per class.
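
The difference can be seen with rpart alone. The sketch below uses mtcars with am converted to a factor purely as an illustration; it does not involve pipelearner.

```r
library(rpart)

d <- mtcars
d$am <- factor(d$am)  # a factor response makes rpart fit a classification tree

fit <- rpart(am ~ ., data = d)

# Default prediction type for a classification tree: one column per class.
p_default <- predict(fit, d)

# Asking for classes instead gives one value per row.
p_class <- predict(fit, d, type = "class")

dim(p_default)
length(p_class)
```

Any code that expects one predicted value per row would need to handle the two-dimensional default result, or request type = "class".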

Error

pl <- d %>% pipelearner(rpart, am ~ .,
                        minsplit = c(2, 20),
                        maxdepth = c(2, 5),
                        xval     = c(5, 10))
pl %>%
  learn() %>% 
  mutate(
    minsplit = map_dbl(params, "minsplit"),
    maxdepth = map_dbl(params, "maxdepth"),
    xval     = map_dbl(params, "xval"),
    accuracy_train = pmap_dbl(list(fit, train, target), accuracy),
    accuracy_test  = pmap_dbl(list(fit, test,  target), accuracy)
  ) %>% select(minsplit, maxdepth, xval, contains("accuracy"))
Error in select(., minsplit, maxdepth, xval, contains("accuracy")) : 
  unused arguments (minsplit, maxdepth, xval, contains("accuracy"))

R 3.3.2 / Windows 7

Fix deprecated purrr function

The current version of purrr (0.2.3) results in the following message being thrown:

`cross_d()` is deprecated; please use `cross_df()` instead. 
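
For reference, a minimal sketch of the replacement call. The parameter names in the list are illustrative only. Note that later purrr releases deprecate cross_df() as well and point to tidyr::expand_grid(), so this may emit a deprecation warning depending on the installed version.

```r
library(purrr)

# Named list of candidate hyperparameter values (illustrative only).
grid_spec <- list(minsplit = c(2, 20), maxdepth = c(2, 5))

# cross_df() returns one row per combination of the list elements.
grid <- cross_df(grid_spec)

nrow(grid)
names(grid)
```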

Version limits required for dependencies - tibble

While going through your example, I ran into this issue:

results <- d %>%
  pipelearner(lm, visib ~ .) %>%
  learn_cvpairs(k = 10) %>%
  learn()

Error: 'as_tibble' is not an exported object from 'namespace:tibble'

traceback()
12: stop(gettextf("'%s' is not an exported object from 'namespace:%s'",
name, getNamespaceName(ns)), call. = FALSE, domain = NA)
11: getExportedValue(pkg, name)
10: tibble::as_tibble
9: pipelearner.data.frame(., lm, visib ~ .)
8: pipelearner(., lm, visib ~ .)
7: function_list[[i]](value)
6: freduce(value, `_function_list`)
5: `_fseq`(`_lhs`)
4: eval(expr, envir, enclos)
3: eval(quote(`_fseq`(`_lhs`)), env, env)
2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1: d %>% pipelearner(lm, visib ~ .) %>% learn_cvpairs(k = 10) %>%
learn()

It turned out that I needed a new version of tibble. I thought I got that when I installed tidyverse but I guess I was incorrect.
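
A quick way to check for this situation before running the example is to look at the installed tibble version and whether it exports as_tibble(). This is only a diagnostic sketch using base R tools; the variable names are my own.

```r
# Report the installed tibble version and whether it exports as_tibble().
if (requireNamespace("tibble", quietly = TRUE)) {
  tibble_version <- as.character(utils::packageVersion("tibble"))
  has_as_tibble  <- "as_tibble" %in% getNamespaceExports("tibble")
} else {
  tibble_version <- NA_character_
  has_as_tibble  <- FALSE
}

message("tibble ", tibble_version, "; exports as_tibble: ", has_as_tibble)
```

On the package side, a minimum version can be declared for tibble in the Imports field of DESCRIPTION. The first tibble version that exported as_tibble() is not stated in this issue, so it is left unspecified here.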

predict needs to be adjustable

Currently, predict.pipelearner() applies the default predict() to each fit. However, sometimes this needs to be adjusted, for example to make predict return probabilities or classes.
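
One way this could look, sketched outside pipelearner: forward extra arguments through ... to predict(). The helper predict_each() is hypothetical and is not part of the package; glm on mtcars is used only because it ships with base R.

```r
# Hypothetical helper (not pipelearner's API): forwards extra arguments
# to predict() for every fitted model in a list.
predict_each <- function(fits, newdata, ...) {
  lapply(fits, function(fit) predict(fit, newdata = newdata, ...))
}

fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Default for glm: predictions on the scale of the linear predictor.
link_scale <- predict_each(list(fit), mtcars)[[1]]

# Adjusted: predicted probabilities.
prob_scale <- predict_each(list(fit), mtcars, type = "response")[[1]]

range(prob_scale)
```

The same pass-through would let a caller request type = "class" or type = "prob" for models whose predict methods support those values.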

Feature: wrapper functions to "predict" and "score"

In general, a user will want to predict values and score/evaluate their fits after learning all models via learn(). There are many possible functions for doing this. However, pipeable functions could be written that take the tibble coming from learn() along with a function that accepts the relevant columns (e.g., test, target, and fit) and outputs the predicted values. It would then be the responsibility of the user to create a function that accepts these arguments.

e.g.,...

pl %>%
  learn() %>%
  pl_predict("test_hat", FUN = function(test, target, fit) {
    # etc... to produce vectors of fitted values
  }) %>%
  pl_score("test_rsqr", FUN = function(test, target, test_hat) {
    # etc...
  })
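
A rough base R sketch of how such wrappers might behave, assuming each one matches the argument names of FUN to column names and calls FUN once per row. pl_predict(), pl_score() and map_rows() are hypothetical here, and the fits data frame is a hand-built stand-in for the output of learn().

```r
# Hypothetical helpers sketching the proposal; not part of pipelearner.

# Call FUN once per row, matching FUN's argument names to column names.
map_rows <- function(.data, FUN) {
  cols <- names(formals(FUN))
  stopifnot(all(cols %in% names(.data)))
  do.call(Map, c(list(f = FUN), as.list(.data[cols])))
}

pl_predict <- function(.data, name, FUN) {
  .data[[name]] <- map_rows(.data, FUN)          # list column of predictions
  .data
}

pl_score <- function(.data, name, FUN) {
  .data[[name]] <- unlist(map_rows(.data, FUN))  # one number per row
  .data
}

# Stand-in for the output of learn(): two train/test splits of mtcars.
fits <- data.frame(id = 1:2)
fits$train  <- list(mtcars[1:24, ], mtcars[9:32, ])
fits$test   <- list(mtcars[25:32, ], mtcars[1:8, ])
fits$target <- c("mpg", "mpg")
fits$fit    <- lapply(fits$train, function(d) lm(mpg ~ wt, data = d))

scored <- pl_predict(fits, "test_hat", FUN = function(test, target, fit) {
  predict(fit, newdata = test)
})
scored <- pl_score(scored, "test_rsqr", FUN = function(test, target, test_hat) {
  y <- test[[target]]
  1 - sum((y - test_hat)^2) / sum((y - mean(y))^2)
})

scored[c("id", "test_rsqr")]
```

Matching on argument names keeps the wrappers generic: pl_score() can use the test_hat column created by pl_predict() without either function knowing about the other.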

Add additional cross validation methods

The functions in resamplr also use modelr::resample objects and include all cross validation methods from scikit-learn.

It looks like learn_cvpairs can be written the same way as learn_models with arbitrary cross validation functions as long as the cross validation function returns a df with train, test, and .id columns (which is the case with resamplr).
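
To make the required shape concrete, here is a sketch of a custom splitter that returns train, test, and .id columns built from modelr::resample objects. crossv_blocks() is a made-up function for illustration; it is not taken from resamplr or pipelearner, and whether learn_cvpairs could accept it is exactly what this issue proposes.

```r
library(modelr)

# Hypothetical splitter: uses k contiguous blocks of rows as test sets.
crossv_blocks <- function(data, k = 5) {
  n     <- nrow(data)
  block <- cut(seq_len(n), breaks = k, labels = FALSE)  # block index per row
  ids   <- seq_len(k)

  out <- data.frame(.id = as.character(ids), stringsAsFactors = FALSE)
  out$train <- lapply(ids, function(i) resample(data, which(block != i)))
  out$test  <- lapply(ids, function(i) resample(data, which(block == i)))
  out
}

folds <- crossv_blocks(mtcars, k = 4)

names(folds)
nrow(as.data.frame(folds$test[[1]]))
```

Each resample object stores only row indices plus a reference to the data, so as.data.frame() is needed to materialise a split.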

Using pipelearner in functions where "data =" is required

I ran into an issue using gam from the mgcv package with pipelearner. To illustrate, lm can be used like this:

iris %>%
  lm(Sepal.Length ~ Sepal.Width, .)

and so can be piped into pipelearner:

iris %>%
  pipelearner(lm, Sepal.Length ~ Sepal.Width) %>%
  learn()

However, gam requires "data = " explicitly. The analogous expression to lm fails:

iris %>%
  gam(Sepal.Length ~ s(Sepal.Width), .)
Error in eval(expr, envir, enclos) : object 'Sepal.Length' not found

iris %>%
  pipelearner(gam, Sepal.Length ~ s(Sepal.Width)) %>%
  learn()
Error in eval(expr, envir, enclos) : object 'Sepal.Length' not found

This works:

iris %>%
  gam(Sepal.Length ~ s(Sepal.Width), data = .)

Is there a syntax which will allow pipelearner to run gam, or does pipelearner require modification?
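
A note on the likely cause: gam()'s argument order differs from lm()'s. In mgcv the second positional argument is family, with data coming after it, so a data frame passed positionally is not treated as data. One possible workaround, sketched below, is a thin wrapper that takes data second and forwards it by name. gam_data() is a hypothetical name, and whether pipelearner accepts it in place of gam is an untested assumption.

```r
library(mgcv)

# Hypothetical wrapper: takes data as its second argument, like lm(),
# and forwards it to gam() by name.
gam_data <- function(formula, data, ...) {
  mgcv::gam(formula, data = data, ...)
}

# Data can now be supplied positionally.
fit <- gam_data(Sepal.Length ~ s(Sepal.Width), iris)
class(fit)

# Untested assumption: the wrapper could then stand in for gam, e.g.
# iris %>% pipelearner(gam_data, Sepal.Length ~ s(Sepal.Width)) %>% learn()
```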
