Hi,
Another week, another issue from me. Hopefully you guys are okay with that?
On to the issue.
I tried running some xgboost tuning in which I tuned the lambda and alpha parameters for the gbtree booster:
library(mlr3)
library(mlr3learners)
library(mlr3tuning)
library(paradox)
lrn_xgboost <- lrn("classif.xgboost")
lrn_xgboost$predict_type <- "prob"
cv5 <- rsmp("cv", folds = 5)
tsk <- mlr_tasks$get("sonar")
xgb_ps <- ParamSet$new(list(
  ParamFct$new("booster", levels = c("gbtree")),
  ParamDbl$new("eta", lower = 0.003, upper = 0.3),
  ParamDbl$new("gamma", lower = 0, upper = 10),
  ParamInt$new("max_depth", lower = 3, upper = 20),
  ParamDbl$new("colsample_bytree", lower = 0.5, upper = 1),
  ParamDbl$new("colsample_bylevel", lower = 0.5, upper = 1),
  ParamDbl$new("lambda", lower = 0, upper = 10),
  ParamDbl$new("alpha", lower = 0, upper = 10),
  ParamDbl$new("subsample", lower = 0.5, upper = 1),
  ParamInt$new("nrounds", lower = 20, upper = 100)
))
instance <- TuningInstance$new(
  task = tsk,
  learner = lrn_xgboost,
  resampling = cv5,
  measures = msr("classif.auc"),
  param_set = xgb_ps,
  terminator = term("evals", n_evals = 20)
)
tuner <- TunerRandomSearch$new()
tuner$tune(instance)
This results in the following error:
INFO [11:56:14.498] Starting to tune 10 parameters with '<TunerRandomSearch>' and '<TerminatorEvals>'
INFO [11:56:14.499] Terminator settings: n_evals=20
INFO [11:56:14.534] Evaluating 1 configurations
INFO [11:56:14.536] booster eta gamma max_depth colsample_bytree colsample_bylevel lambda alpha subsample nrounds
INFO [11:56:14.536] gbtree 0.03936435 0.2437546 5 0.6446646 0.7196681 6.194857 9.013528 0.961457 87
Error in (function (xs) :
Assertion on 'xs' failed: Condition for 'lambda' not ok: booster equal gblinear; instead: booster=gbtree.
Clearly, the lambda and alpha parameters are reserved for the linear booster by mlr3. This is probably because the xgboost help page lists them as parameters of the linear booster:
2.2. Parameter for Linear Booster
• lambda L2 regularization term on weights. Default: 0
• lambda_bias L2 regularization term on bias. Default: 0
• alpha L1 regularization term on weights. (there is no L1 reg on bias because it is not important). Default: 0
However, this source: https://xgboost.readthedocs.io/en/latest/parameter.html#parameters-for-tree-booster lists them as parameters for the tree and dart boosters as well.
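Presumably the restriction comes from dependency declarations on the learner's ParamSet, something along these lines (a hypothetical reconstruction using paradox's add_dep(); I have not checked the actual mlr3learners source, which may look different):

# sketch of how the restriction is presumably declared in mlr3learners
# (hypothetical reconstruction, not the actual source)
ps <- ParamSet$new(list(
  ParamFct$new("booster", levels = c("gbtree", "gblinear", "dart")),
  ParamDbl$new("lambda", lower = 0),
  ParamDbl$new("alpha", lower = 0)
))
# add_dep() calls like these would make lambda/alpha usable only with gblinear
ps$add_dep("lambda", on = "booster", cond = CondEqual$new("gblinear"))
ps$add_dep("alpha", on = "booster", cond = CondEqual$new("gblinear"))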
Running a test in R to see whether these parameters have an effect on xgboost "gbtree" models:
library(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
dtest <- xgb.DMatrix(agaricus.test$data, label = agaricus.test$label)
watchlist <- list(train = dtrain, eval = dtest)
param <- list(booster = "gbtree",
              max_depth = 2,
              eta = 1,
              verbose = 0,
              objective = "binary:logistic",
              eval_metric = "auc")
set.seed(1)
bst <- xgb.train(param,
                 dtrain,
                 nrounds = 2,
                 watchlist)
[1] train-auc:0.958228 eval-auc:0.960373
[2] train-auc:0.981413 eval-auc:0.979930
param2 <- list(booster = "gbtree",
               max_depth = 2,
               eta = 1,
               verbose = 0,
               objective = "binary:logistic",
               eval_metric = "auc",
               alpha = 100)
set.seed(1)
bst2 <- xgb.train(param2,
                  dtrain,
                  nrounds = 2,
                  watchlist)
[1] train-auc:0.979337 eval-auc:0.980196
[2] train-auc:0.996274 eval-auc:0.995977
param3 <- list(booster = "gbtree",
               max_depth = 2,
               eta = 1,
               verbose = 0,
               objective = "binary:logistic",
               eval_metric = "auc",
               lambda = 1000)
set.seed(1)
bst3 <- xgb.train(param3,
                  dtrain,
                  nrounds = 2,
                  watchlist)
[1] train-auc:0.957067 eval-auc:0.958731
[2] train-auc:0.986000 eval-auc:0.986332
It can be observed that they do have an effect on the trained models.
Could you change the dependencies of the xgboost learner so that the lambda and alpha parameters can be tuned regardless of the booster?
For instance, autoxgboost has no such constraints.
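In the meantime, a possible workaround might be to drop the offending dependencies from the learner's ParamSet before tuning (an untested sketch; it assumes the deps table of the ParamSet can be overwritten in place):

# untested workaround sketch: drop the booster dependencies for lambda/alpha
# from the learner's ParamSet (assumes param_set$deps is writable)
deps <- lrn_xgboost$param_set$deps
lrn_xgboost$param_set$deps <- deps[!(deps$id %in% c("lambda", "alpha")), ]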
Kind regards,
Milan