rSAFE's Issues

safely_select_variables() gives an error for a multiclass task

Hi, for my multiclass task safely_select_variables() gives the following error:
Error in `[.data.frame`(data, , var_best) : undefined columns selected

Below is a dummy example (isomorphic to my original problem). Could you please check the last five lines? I think I have messed them up. Thanks.

library(tidyverse)
library(mlr3verse)
library(DALEX)
library(DALEXtra)
library(rSAFE)

df <- data.frame(
  v = c(3.4,5.6,1.3,9.8,7.3, 4.6,5.5,2.3,8.9,7.1, 4.9,6.5,2.3,4.1,3.37, 3.4,6.0,2.3,7.8,3.7),
  w = c(34,65,23,78,37, 34,65,23,78,37, 34,65,23,78,37, 34,65,23,78,37),
  x = c('a','b','a','c','c', 'a','b','a','c','c', 'a','b','a','c','c', 'a','b','a','c','c'),
  y = c(TRUE,FALSE,TRUE,TRUE,FALSE, TRUE,FALSE,TRUE,TRUE,FALSE, TRUE,FALSE,TRUE,TRUE,FALSE, TRUE,FALSE,TRUE,TRUE,FALSE),
  z = c('alpha','alpha','delta','delta','phi', 'alpha','alpha','delta','delta','phi', 'alpha','alpha','delta','delta','phi', 'alpha','alpha','delta','delta','phi')
)

df_task <- TaskClassif$new(id = "my_df", backend = df, target = "z")
lrn_rf <- GraphLearner$new(po('encode') %>>% lrn("classif.ranger", predict_type = "prob"))
lrn_rf$train(df_task)

lrn_rf_exp <- explain_mlr3(lrn_rf,
                           data = df,
                           y = df$z,
                           label = "rf_exp")
safe_extractor <- safe_extraction(lrn_rf_exp, penalty = 25, verbose = FALSE)
sf_trafo_data <- safely_transform_data(safe_extractor, df, verbose = FALSE)
vars <- safely_select_variables(safe_extractor, sf_trafo_data, which_y = "z", class_pred = 'alpha', verbose = FALSE)

data2 <- safely_transform_data(safe_extractor, df, verbose = FALSE)[,c("z", vars)]

model_lm2 <- lm(z ~ ., data = data2)
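
The error text itself comes from base R: `[.data.frame` raises "undefined columns selected" whenever one of the requested column names is absent from the data frame. The following is a minimal base-R sketch of a check that shows which requested names are missing. The toy data frame and the `wanted` vector are hypothetical stand-ins, not output from rSAFE, and the sketch does not claim to identify the cause inside safely_select_variables().

```r
# Toy data frame standing in for the transformed data (hypothetical columns).
toy <- data.frame(v = c(1.5, 2.5), w = c(10, 20), z = c("alpha", "phi"))

# Hypothetical vector of requested column names; "x_new" is not in `toy`.
wanted <- c("v", "x_new")

# Names that would trigger "undefined columns selected".
missing_cols <- setdiff(wanted, colnames(toy))

# Subsetting with a missing name reproduces the error message from the report.
res <- tryCatch(toy[, wanted], error = function(e) conditionMessage(e))

# Keeping only the names that actually exist avoids the error.
safe_subset <- toy[, intersect(wanted, colnames(toy)), drop = FALSE]
```

Running the same `setdiff()` check on the names a selection step is about to use against `colnames()` of the data passed in would show whether a name mismatch is involved.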

Variable roles in tidymodels recipe and workflow... are they respected by rSAFE?

Example (I am playing with the bicycle demand data from Kaggle):

bike_recipe <- recipe(count ~ . , data = bike_training) %>%
  step_date(datetime, features = c("doy", "dow", "month", "year"), abbr = TRUE) %>%
  update_role("datetime", new_role = "id_variable") %>%
  step_rm("atemp")

This creates time features from the datetime index, after which datetime itself no longer takes part in modelling.
I also removed the "atemp" variable altogether (temp and atemp were strongly correlated), so it does not take part in modelling either.

Next I run the explainer:

explainer <- explain_tidymodels(bike_final_fit, data = bike_all %>% select(-count), y = bike_all$count)
safe_extractor <- safe_extraction(explainer)

The safe extractor seems to ignore that datetime and atemp are absent from the modelling process, and proposes transformations for them anyway:

Variable 'datetime' - selected intervals:
	(-Inf, 2011-02-16 23:00:00]
	(2011-02-16 23:00:00, 2011-06-17 23:00:00]
	(2011-06-17 23:00:00, 2012-04-15 23:00:00]
	(2012-04-15 23:00:00, 2012-07-08 23:00:00]
	(2012-07-08 23:00:00, Inf)
Variable 'season' - selected intervals:
	(-Inf, 3]
	(3, Inf)
Variable 'holiday' - no transformation suggested.
Variable 'workingday' - no transformation suggested.
Variable 'weather' - selected intervals:
	(-Inf, 1]
	(1, Inf)
Variable 'temp' - selected intervals:
	(-Inf, 12.3]
	(12.3, 22.96]
	(22.96, Inf)
Variable 'atemp' - selected intervals:
	(-Inf, 24.24]
	(24.24, Inf)
Variable 'humidity' - selected intervals:
	(-Inf, 30]
	(30, 48]
	(48, 67]
	(67, 84]
	(84, Inf)
Variable 'windspeed' - selected intervals:
	(-Inf, 7.0015]
	(7.0015, Inf)

How can I tell rSAFE that these two variables (one is the time index, the other is removed when the recipe is baked) do not take part in modelling?
I am attaching my quick and dirty workflow:

timeseries_modelling_xgboost_short.zip
@agosiewska
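
One possible workaround, sketched here in base R only, is to drop the unwanted variables and any columns derived from them after the data has been transformed. This assumes that derived columns are named `<variable>_new`. The toy data frame and the `excluded` vector are hypothetical, and this is not a documented rSAFE option for excluding variables.

```r
# Toy stand-in for transformed data (hypothetical values; derived columns
# are assumed to follow the "<variable>_new" naming pattern).
toy <- data.frame(
  datetime = c(1, 2), datetime_new = c("a", "b"),
  temp = c(12.1, 23.4), temp_new = c("(-Inf, 12.3]", "(22.96, Inf)"),
  atemp = c(14.0, 25.0), atemp_new = c("(-Inf, 24.24]", "(24.24, Inf)")
)

# Variables that did not take part in modelling.
excluded <- c("datetime", "atemp")

# Drop both the original columns and their assumed derived counterparts.
drop_cols <- c(excluded, paste0(excluded, "_new"))
kept <- toy[, setdiff(colnames(toy), drop_cols), drop = FALSE]
```

Filtering after the transformation leaves the explainer's `data` untouched, so the fitted workflow still receives every column its recipe expects at prediction time.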

missing trans function

I get an error after running the following lines:

library(rSAFE)
library(randomForest)
library(DALEX)
set.seed(111)
model_rf1 <- randomForest(survived ~ ., data = titanic_imputed)
explainer_rf1 <- explain(model_rf1, data = titanic_imputed, y = titanic_imputed$survived == "yes", label = "rf1")
safe_extractor <- safe_extraction(explainer_rf1, penalty = 25, verbose = TRUE)
