
probably's Introduction


Introduction

probably contains tools to facilitate activities such as:

  • Conversion of probabilities to discrete class predictions (see the example after this list).

  • Investigating and estimating optimal probability thresholds.

  • Calibration assessments and remediation for classification and regression models.

  • Inclusion of equivocal zones where the probabilities are too uncertain to report a prediction.
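For example, a minimal sketch combining the first and last items above, using the segment_logistic data bundled with the package:

library(probably)

data("segment_logistic")

# Hard class predictions at a 0.75 threshold; probabilities within
# 0.05 of the threshold are marked equivocal ([EQ])
segment_logistic$.pred <- make_two_class_pred(
  estimate = segment_logistic$.pred_good,
  levels = levels(segment_logistic$Class),
  threshold = 0.75,
  buffer = 0.05
)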

Installation

You can install probably from CRAN with:

install.packages("probably")

You can install the development version of probably from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/probably")

Examples

Good places to look for examples of using probably are the vignettes.

  • vignette("equivocal-zones", "probably") discusses the new class_pred class that probably provides for working with equivocal zones.

  • vignette("where-to-use", "probably") discusses how probably fits in with the rest of the tidymodels ecosystem, and provides an example of optimizing class probability thresholds.

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

probably's People

Contributors

davisvaughan, edgararuiz, emilhvitfeldt, hfrick, juliasilge, topepo


probably's Issues

Upkeep for probably

2023

Necessary:

  • Update copyright holder in DESCRIPTION: person(given = "Posit Software, PBC", role = c("cph", "fnd"))
  • Double check license file uses '[package] authors' as copyright holder. Run use_mit_license()
  • Update logo (https://github.com/rstudio/hex-stickers); run use_tidy_logo()
  • usethis::use_tidy_coc()
  • usethis::use_tidy_github_actions()

Optional:

  • Review 2022 checklist to see if you completed the pkgdown updates
  • Prefer pak::pak("org/pkg") over devtools::install_github("org/pkg") in README
  • Consider running use_tidy_dependencies() and/or replace compat files with use_standalone()
  • use_standalone("r-lib/rlang", "types-check") instead of home grown argument checkers
  • Add alt-text to pictures, plots, etc; see https://posit.co/blog/knitr-fig-alt/ for examples

Optimize the prediction threshold so that it minimizes the costs associated with type I and II errors

There is an equation that associates type I and II error rates with the cost of committing them.
For example, in a customer churn problem, the cost of a type I error could be $10 and the cost of a type II error $200, because of the wasted promotion and the missed income, respectively. There is a short equation whose derivative, when projected onto a corner of the ROC curve, gives the optimal threshold for minimizing the aforementioned cost.

I read about this method in a paper a few years ago, but can't find it now. Is this equation familiar to anybody?
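Lacking the reference, a brute-force sketch of the same idea: sweep candidate thresholds and minimize the expected cost directly. This assumes a data frame preds with a factor truth (event "churn") and a probability column .pred_churn:

# Costs from the example: $10 per type I error (false positive),
# $200 per type II error (false negative)
cost_fp <- 10
cost_fn <- 200

thresholds <- seq(0.01, 0.99, by = 0.01)
costs <- vapply(thresholds, function(th) {
  pred_event <- preds$.pred_churn >= th
  fp <- sum(pred_event & preds$truth != "churn")
  fn <- sum(!pred_event & preds$truth == "churn")
  cost_fp * fp + cost_fn * fn
}, numeric(1))

thresholds[which.min(costs)]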

cal_validate_* needs a better message for tuning objects

We deliberately restrict tune objects from being used here (an exception), but we need a better error message:

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
library(bonsai)
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
set.seed(1345)
cls_train <- sim_classification(1000)
cls_test  <- sim_classification( 500)
cls_calib <- sim_classification( 500)

set.seed(7378)
cls_rs <- vfold_cv(cls_train)
lgb_spec <- boost_tree() %>% set_mode("classification") %>% set_engine("lightgbm")
cls_metrics <- metric_set(brier_class, roc_auc)

set.seed(6929)
lgb_tune_res <-
  boost_tree(min_n = tune()) %>%
  set_mode("classification") %>%
  set_engine("lightgbm") %>%
  tune_grid(
    class ~ .,
    resamples = cls_rs,
    control = control_resamples(save_pred = TRUE),
    metrics = cls_metrics,
    grid = tibble(min_n = c(2, 50))
  )
lgb_tune_res %>% cal_validate_logistic()
#> Error in UseMethod("cal_validate_logistic"): no applicable method for 'cal_validate_logistic' applied to an object of class "c('tune_results', 'tbl_df', 'tbl', 'data.frame')"

Created on 2023-03-21 by the reprex package (v2.0.1)

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

  • Help us firm up the list of targeted repositories
  • Make sure all maintainers are aware of what's coming
  • Give us an issue to close when the job is done
  • Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog

Estimation with resample_results objects

When we run cal_estimate_*() on an object produced by fit_resamples(), it records the single value of .config, since that object also inherits from "tune_results".

When cal_apply() is run on this object using a single set of predictions, there will be no .config column.

We should take a different approach for objects with class "resample_results" by first running collect_predictions() and then running cal_apply() on that.
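A sketch of that workaround with the current API, assuming res is a fit_resamples() result with saved predictions and cal_obj is an existing calibration object from cal_estimate_*():

library(tune)
library(probably)

# Pool the out-of-sample predictions, then calibrate them directly
preds <- collect_predictions(res, summarize = TRUE)
calibrated <- cal_apply(preds, cal_obj)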

Release probably 0.0.3

Prepare for release:

  • devtools::check()
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Polish NEWS
  • Polish pkgdown reference index

Submit to CRAN:

  • usethis::use_version('patch')
  • Update cran-comments.md
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Tweet

Release probably 0.1.0

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Check if any deprecation processes should be advanced, as described in Gradual deprecation
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::cloud_check()
  • Update cran-comments.md
  • git push
  • Draft blog post
  • Slack link to draft blog in #open-source-comms

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

calibration print methods for tuning results

Right now, the print output should reflect that there are multiple models being calibrated. Also, the sample size is not reported correctly:

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
set.seed(1)
tr_dat <- sim_classification(1000)
te_dat <- sim_classification(1000)

set.seed(2)
folds <- vfold_cv(tr_dat)
svm_spec <- svm_rbf(rbf_sigma = tune()) %>% set_mode("classification")

ctrl <- control_grid(save_pred = TRUE)
cls_metrics <- metric_set(brier_class, roc_auc, mcc)
grid <- tibble(rbf_sigma = 10^c(-5:-3))

set.seed(3)
svm_res <-
  svm_spec %>%
  tune_grid(
    class ~ .,
    resamples = folds,
    control = ctrl,
    metrics = cls_metrics,
    grid = grid
  )
best_two <-
  show_best(svm_res, metric = "brier_class", n = 2) %>%
  select(rbf_sigma, .config)
svm_beta_res <-
  svm_res %>%
  cal_estimate_beta(class, parameters = best_two)

svm_beta_res
#> 
#> ── Probability Calibration
#> Method: Beta
#> Type: Binary
#> Train set size: 2,000
#> Truth variable: `class`
#> Estimate variables:
#> `.pred_class_1` ==> class_1
#> `.pred_class_2` ==> class_2

Created on 2023-01-02 by the reprex package (v2.0.1)

Rename threshold_data()?

I don't think this describes what it does very well. Maybe threshold_performance() or threshold_perf()? Something that tells the user it is for calculating performance at different threshold values.

Might need ord_class_pred after all

Want to implement:

vec_proxy_equal.class_pred so you can do .class_pred == "[EQ]"

vec_proxy_compare.class_pred throws an error because you should not be able to do things like < or min() with a class_pred

but should there be:

vec_proxy_compare.ord_class_pred that does allow you to do those comparisons?
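A rough sketch of what that method might look like (ord_class_pred is hypothetical, and the integer encoding of the equivocal value is an assumption):

# Hypothetical vctrs method: compare on the underlying integer codes
# so that <, min(), etc. work for ordered predictions
vec_proxy_compare.ord_class_pred <- function(x, ...) {
  codes <- unclass(x)
  codes[codes == 0L] <- NA_integer_  # assumes 0L encodes [EQ]
  codes
}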

issue with make_two_class_pred() with glm model

I am using a glm model from the parsnip / workflows scheme.

To make it work properly, I did the following:

data <- read_csv("aha.csv", col_types = cols()) %>% 
  # (1) glm requires a failure to be the 1st level
  mutate(truth = factor(buyer_90d, levels = c(0, 1), labels = c("NB", "Buyer")))
library(yardstick)
# (2) reconfigure yardstick to take success from the 2nd level
options(yardstick.event_first = FALSE)

But when I tried to make hard predictions via make_two_class_pred(), it failed.
It seems like it used the 1st level as the success despite the statement in point (2).

To fix it, I wrote code like this:

data_pred <- data_pred %>%
  mutate(.pred = make_two_class_pred(estimate = .pred_Buyer, 
                                     levels = levels(truth), 
                                     threshold = max_j_index_thre) %>% probably::as.factor(),
         .pred2 = if_else(.pred_Buyer > max_j_index_thre, "Buyer", "NB") %>% fct_relevel("NB")) 

no trend line in regression plots with smooth = FALSE

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
solubility_test %>% cal_plot_regression(solubility, estimate = prediction, smooth = TRUE)

solubility_test %>% cal_plot_regression(solubility, estimate = prediction, smooth = FALSE)

Created on 2023-04-29 with reprex v2.0.2

A general function for equivocal zones

Right now the functions that make class_pred objects assume that we are using the predicted class probabilities. There are some cases where a different column could be used (e.g. the standard error of predictions).

Can we have a general function to take a column (one, I think) and apply some rule to indicate whether the prediction should be reported or not?
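A hypothetical sketch of such an interface (neither the function nor its arguments exist in probably):

# Hypothetical: mark predictions as equivocal when an arbitrary column
# (e.g. a standard error) violates a user-supplied rule
make_reportable_pred <- function(pred, value, rule) {
  class_pred(pred, which = which(rule(value)))
}

# usage sketch: don't report predictions with large standard errors
# preds$.pred <- make_reportable_pred(preds$.pred_class, preds$.std_err,
#                                     rule = function(se) se > 0.25)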

Thoughts on when probability calibration should occur

Really excited to see the post-prediction probability calibration coming to tidymodels soon. As a user, I'm curious when post-prediction calibration would/should occur.

I'm inclined to think that it would be available within the model resampling workflow, e.g. 'step_calibrate()'.

Or is this solely intended for use after a finalized model has been established?

Cheers, and thanks for the awesome work.

Add a model "calibration" step.

Hello Max,

Firstly, thanks to both of you for this excellent reference material about tidymodels.

I have just checked if there is any section about model calibration but I could not find any. Since you covered that in caret, are you considering including it as another possible step in a model workflow?

Also, it would be fantastic if you could include some kind of suggestion for calibrating multi-class models, in the same way as you already covered performance metrics for this kind of model.

Thanks again!
Carlos Ortega

Upkeep for probably

Pre-history

  • usethis::use_readme_rmd()
  • usethis::use_roxygen_md()
  • usethis::use_github_links()
  • usethis::use_pkgdown_github_pages()
  • usethis::use_tidy_github_labels()
  • usethis::use_tidy_style()
  • usethis::use_tidy_description()
  • urlchecker::url_check()

2020

  • usethis::use_package_doc()
    Consider letting usethis manage your @importFrom directives here.
    usethis::use_import_from() is handy for this.
  • usethis::use_testthat(3) and upgrade to 3e, testthat 3e vignette
  • Align the names of R/ files and test/ files for workflow happiness.
    The docs for usethis::use_r() include a helpful script.
    usethis::rename_files() may be useful.

2021

  • usethis::use_tidy_dependencies()
  • usethis::use_tidy_github_actions() and update artisanal actions to use setup-r-dependencies
  • Remove check environments section from cran-comments.md
  • Bump required R version in DESCRIPTION to 3.5
  • Use lifecycle instead of artisanal deprecation messages, as described in Communicate lifecycle changes in your functions
  • Make sure RStudio appears in Authors@R of DESCRIPTION like so, if appropriate:
    person("RStudio", role = c("cph", "fnd"))

2022

Release probably 0.0.6

Prepare for release:

  • devtools::build_readme()
  • Check current CRAN check results
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Polish NEWS
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Release probably 0.0.4

Prepare for release:

  • Check current CRAN check results
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Polish NEWS
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Release probably 0.0.5

Prepare for release:

  • devtools::build_readme()
  • Check current CRAN check results
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Polish NEWS
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Is there a way to take our `tune_results` object and coerce it into the right format to use here?

In the tune_results structure, the data object in the elements of splits is the training data, and these new validate functions require the predictors to be there.

We should talk about this in the Details section (that they are not the same) and maybe have a helper function to convert the original results to something that cal_validate_*() can consume.

I don't see a simple way to keep the same resampling structure as the original tune_results object.

Originally posted by @topepo in #63 (comment)

update calibration functions so that regression can also be used

Open the way for calibration of regression and censored regression models.

In some cases (e.g. cal_apply()), we will need to look at the outcome data to discern what type of calibration is being done and add switches internally to direct the code to the appropriate internal functions. In most other cases, the function name will indicate what it is used for (e.g. cal_*_logistic()).

The internal functions should have something in their name indicating the type of model being calibrated (say _cls, _reg, or _cens).

Is it possible to specify a metric_set within threshold_perf?

By default, it looks at sens, spec, j_index, and distance. Is it possible to look at other measures across a range of threshold probabilities?

For example, accuracy, positive predictive value (precision), negative predictive value would be great to be able to access.

One way to accomplish this would be to embed a metric_set argument within threshold_perf().

Thanks for your consideration!

Karandeep
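In the meantime, a workaround sketch for the request above, assuming a data frame preds with a factor truth and a probability column .pred_yes for the first level:

library(dplyr)
library(purrr)
library(yardstick)
library(probably)

custom_metrics <- metric_set(accuracy, ppv, npv)

# Compute the custom metrics at each candidate threshold
map_dfr(seq(0.1, 0.9, by = 0.1), function(th) {
  preds %>%
    mutate(.pred_class = as.factor(
      make_two_class_pred(.pred_yes, levels(truth), threshold = th)
    )) %>%
    custom_metrics(truth = truth, estimate = .pred_class) %>%
    mutate(.threshold = th)
})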

Make a .by argument for plots

Currently, the plots can get different panels via group_by(). It would be better to have a .by argument to get the same results.

(see #92 also)

group-by on numeric data with calibration

I've only tested this with cal_plot_breaks() but grouping by a numeric variable fails.

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
set.seed(1)
tr_dat <- sim_classification(1000)
te_dat <- sim_classification(1000)

set.seed(2)
folds <- vfold_cv(tr_dat)
svm_spec <- svm_rbf(rbf_sigma = tune()) %>% set_mode("classification")

ctrl <- control_grid(save_pred = TRUE)
cls_metrics <- metric_set(brier_class, roc_auc, mcc)
grid <- tibble(rbf_sigma = 10^c(-5:-3))

set.seed(3)
svm_res <-
  svm_spec %>%
  tune_grid(
    class ~ .,
    resamples = folds,
    control = ctrl,
    metrics = cls_metrics,
    grid = grid
  )
best_two <-
  show_best(svm_res, metric = "brier_class", n = 2) %>%
  select(rbf_sigma, .config)
svm_pred <- collect_predictions(svm_res, parameters = best_two)
svm_pred
#> # A tibble: 2,000 × 8
#>    id     .pred_class_1 .pred_class_2  .row rbf_sigma .pred_class class  .config
#>    <chr>          <dbl>         <dbl> <int>     <dbl> <fct>       <fct>  <chr>  
#>  1 Fold01         0.804       0.196      11   0.00001 class_2     class… Prepro…
#>  2 Fold01         0.114       0.886      15   0.00001 class_2     class… Prepro…
#>  3 Fold01         0.356       0.644      21   0.00001 class_2     class… Prepro…
#>  4 Fold01         0.676       0.324      25   0.00001 class_2     class… Prepro…
#>  5 Fold01         0.726       0.274      41   0.00001 class_2     class… Prepro…
#>  6 Fold01         0.978       0.0217     54   0.00001 class_2     class… Prepro…
#>  7 Fold01         0.992       0.00806    57   0.00001 class_2     class… Prepro…
#>  8 Fold01         0.156       0.844      66   0.00001 class_2     class… Prepro…
#>  9 Fold01         0.981       0.0194     80   0.00001 class_2     class… Prepro…
#> 10 Fold01         0.969       0.0309     89   0.00001 class_2     class… Prepro…
#> # … with 1,990 more rows

svm_pred %>%
  group_by(.config) %>%   # <- works
  cal_plot_breaks(class, .pred_class_1)

svm_pred %>%
  group_by(rbf_sigma) %>%   # <- doesn't
  cal_plot_breaks(class, .pred_class_1)
#> Error in `geom_ribbon()` at probably/R/cal-plot.R:425:4:
#> ! Problem while converting geom to grob.
#> ℹ Error occurred in the 4th layer.
#> Caused by error in `draw_group()`:
#> ! Aesthetics can not vary along a ribbon

Created on 2023-01-02 by the reprex package (v2.0.1)

Remove `warn_lossy_cast()` use

For the short term:

  • Add a remote on vctrs
  • Remove warn_lossy_cast() and replace with maybe_lossy_cast() or allow_lossy_cast() (not sure which to use yet)
  • Bump required Imports on vctrs to current dev version

When ready for CRAN, after the vctrs version has been submitted:

  • Remove remote on vctrs
  • Bump required Imports version on vctrs to the new CRAN version

better detection of groups/tuning parameters

When using data frames generated from the tune_*() functions, we silently produce a single plot/analysis if the user doesn't correctly specify what they want.

We should detect this (when there is more than one config) and produce a meaningful error.

Also, the plot functions have a group argument and the estimation functions require group_by(). That's confusing.

Example:

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
library(bonsai)
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
set.seed(1345)
cls_train <- sim_classification(1000)
cls_test  <- sim_classification( 500)
cls_calib <- sim_classification( 500)

set.seed(7378)
cls_rs <- vfold_cv(cls_train)
lgb_spec <- boost_tree() %>% set_mode("classification") %>% set_engine("lightgbm")
cls_metrics <- metric_set(brier_class, roc_auc)

set.seed(6929)
lgb_tune_res <-
  boost_tree(min_n = tune()) %>%
  set_mode("classification") %>%
  set_engine("lightgbm") %>%
  tune_grid(
    class ~ .,
    resamples = cls_rs,
    control = control_resamples(save_pred = TRUE),
    metrics = cls_metrics,
    grid = tibble(min_n = c(2, 50))
  )
df_pred_res  <- lgb_res %>% collect_predictions()
#> Error in collect_predictions(.): object 'lgb_res' not found
df_pred_tune_res  <- lgb_tune_res %>% collect_predictions()

df_new <- df_pred_res[1:5,]
#> Error in eval(expr, envir, enclos): object 'df_pred_res' not found
df_tune_new <- df_pred_tune_res %>% dplyr::slice(1:5, .by = .config)
# Plotting issues

# This produces 1 plot; should be two
df_pred_tune_res %>%
  cal_plot_windowed(truth = class, estimate = .pred_class_1,
                    window_size = 0.1, step_size = 0.025)

# Using `group` makes two plots
df_pred_tune_res %>%
  cal_plot_windowed(truth = class, estimate = .pred_class_1, group = .config,
                    window_size = 0.1, step_size = 0.025)

# Estimation issues

# Should have two groups
df_pred_tune_res %>%
  cal_estimate_logistic(truth = class)
#> 
#> ── Probability Calibration
#> Method: Logistic Spline
#> Type: Binary
#> Source class: Data Frame
#> Data points: 2,000
#> Truth variable: `class`
#> Estimate variables:
#> `.pred_class_1` ==> class_1
#> `.pred_class_2` ==> class_2

# Has two groups via a different "by" mechanism:
df_pred_tune_res %>%
  group_by(.config) %>% 
  cal_estimate_logistic(truth = class)
#> 
#> ── Probability Calibration
#> Method: Logistic Spline
#> Type: Binary
#> Source class: Data Frame
#> Data points: 2,000, split in 2 groups
#> Truth variable: `class`
#> Estimate variables:
#> `.pred_class_1` ==> class_1
#> `.pred_class_2` ==> class_2

Created on 2023-03-21 by the reprex package (v2.0.1)

Reliability Diagrams, Calibration Error, and Temperature Scaling

Hi there,

great work on the package so far! I was wondering if you are planning to extend the types of calibration diagnostics:

EDIT: Now that I think about it, isn't a reliability diagram exactly the same plot (using bars instead of a geom_line())?

The former should be easily implemented, right?

segment_logistic %>% 
  mutate(bin = cut(.pred_good, seq(0, 1, 0.1), labels = FALSE),
         .pred_class = if_else(.pred_good > 0.5, "good", "poor")) %>%
  summarise(accuracy = mean(if_else(Class == "good", 1L, 0L)), .by = bin) %>%
  ggplot(aes(x = bin, y = accuracy)) +
  geom_col() +
  geom_abline(intercept = 0, slope = 0.1, lty = "dashed") +
  scale_y_continuous(limits = c(0L, 1L)) +
  theme_minimal()

Moreover, do you plan to integrate actual postprocessing steps in the package as well? I am currently working with temperature scaling (in the context of neural networks). I guess it's conceptually related to what you were already outlining in the blog post.
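For reference, a minimal sketch of binary temperature scaling (not part of probably's API; logit_cal/y_cal are assumed calibration-set logits and 0/1 outcomes, and logit_new the logits to rescale):

# Find the temperature that minimizes the negative log-likelihood
nll <- function(temp, logit, y) {
  p <- plogis(logit / temp)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}
temp_hat <- optimize(nll, interval = c(0.1, 10),
                     logit = logit_cal, y = y_cal)$minimum

# Apply the fitted temperature to new logits
p_calibrated <- plogis(logit_new / temp_hat)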

[Attached plots: temp_1, temp_1.25, temp_1.5, temp_1.75, temp_2]

Keep on with the great work!

Best,
Simon

uber calibration groups issue

There are group arguments to the cal_plot_* and cal_estimate_* functions. There are overlapping issues regarding groups (#79, #92, #98, #100). Looking at these, we should take a more systematic approach rather than multiple refactors that might overlap.

I think that we should only enable group-by processing when the input is a data frame (and not an object generated by the tune package). We should default to using .config for tune objects when there are multiple configurations (and NULL otherwise).

Change 1

In many of the functions, the group argument is set both in the S3 generic and in an internal helper called tune_results_args().

  • For non-tune S3 generics, allow the user to set it as an argument and validate it in the generic with a helper function.
  • For tune generics, give no group argument and set group to .config if there are multiple configurations.

The logic for group should not be in multiple places so it should be taken out of tune_results_args().

Change 2

The user-facing argument should be called .by to be consistent with new dplyr syntax. It should only accept a single, categorical column in the data.

Add `event_level` argument to `make_two_class_pred()`

Considering that the yardstick.event_first global option will get hard-deprecated (see tidymodels/yardstick#173) in favor of the explicit event_level argument in yardstick, it would be consistent to add that argument to make_two_class_pred() for binary classification.

This will affect the following line:

x <- ifelse(estimate >= threshold, levels[1], levels[2])

where the levels can be reordered depending on the level to be considered as the "event".
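A sketch of the proposed behavior (event_level is not an existing make_two_class_pred() argument; the name and semantics mirror yardstick's):

# Hypothetical: reorder so that levels[1] is always the event
if (identical(event_level, "second")) {
  levels <- rev(levels)
}
x <- ifelse(estimate >= threshold, levels[1], levels[2])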

On a related note, for a multi-class problem, it may be useful to allow the min_prob argument in make_class_pred() to take a numeric vector of nlevels elements, where nlevels is the number of classes.

Release probably 0.0.2

Prepare for release:

  • devtools::check()
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Polish NEWS
  • Polish pkgdown reference index

Submit to CRAN:

  • usethis::use_version()
  • Update cran-comments.md
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Tweet

bad error message for the classification estimation

If users do not fully specify the estimate argument, it gives a misleading error message:

library(tidymodels)
library(probably)
#> 
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#> 
#>     as.factor, as.ordered
library(bonsai)
tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)
set.seed(1345)
cls_train <- sim_classification(1000)
cls_test  <- sim_classification( 500)
cls_calib <- sim_classification( 500)

set.seed(7378)
cls_rs <- vfold_cv(cls_train)
lgb_spec <- boost_tree() %>% set_mode("classification") %>% set_engine("lightgbm")
cls_metrics <- metric_set(brier_class, roc_auc)

set.seed(6929)
lgb_res <-
  lgb_spec %>%
  fit_resamples(
    class ~ .,
    resamples = cls_rs,
    control = control_resamples(save_pred = TRUE),
    metrics = cls_metrics
  )
df_pred_res  <- lgb_res %>% collect_predictions()
df_est <-   
  df_pred_res %>%
  cal_estimate_logistic(truth = class, estimate = .pred_class_1)
#> Error in `stop_multiclass()` at probably/R/cal-estimate-logistic.R:139:4:
#> ! Multiclass not supported...yet

Created on 2023-03-21 by the reprex package (v2.0.1)

use rsample in vignette

set.seed(123)
n <- nrow(lending_club)
prop_train <- .7
train_idx <- sample(seq_len(n), floor(.7 * n))

😢
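For reference, the rsample equivalent is short (a sketch; lending_club ships with the modeldata package):

library(rsample)
library(modeldata)

data("lending_club")

set.seed(123)
lc_split <- initial_split(lending_club, prop = 0.7)
lc_train <- training(lc_split)
lc_test  <- testing(lc_split)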

probably 0.0.1

Prepare for release:

  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Polish NEWS
  • If new failures, update email.yml then revdepcheck::revdep_email_maintainers()

Perform release:

  • Create RC branch (for bigger releases)
  • Bump version (in DESCRIPTION and NEWS)
  • devtools::check_win_devel() (again!)
  • devtools::submit_cran()
  • pkgdown::build_site() automatic
  • Approve email

Wait for CRAN...

  • Tag release
  • Merge RC back to master branch
  • Bump dev version
  • Write blog post
  • Tweet

Template from r-lib/usethis#338

replacement has length zero


Error: Problem with `mutate()` column `predicted`.
i `predicted = make_two_class_pred(...)`.
x replacement has length zero
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Problem with `mutate()` column `predicted`.
i `predicted = make_two_class_pred(...)`.
i 'x' is NULL so the result will be NULL 

Reproducible example

dm <- structure(list(pred_convert = c(0.1283158659935, 0.147706210613251, 
                                0.124467730522156, 0.137504696846008,
                                0.115710318088531, 0.0456064343452454),
            class = c("non_convert", "convert",
             "non_convert", "non_convert", 
             "non_convert", "non_convert"),
package = structure(c(2L, 3L, 2L,
                      2L, 2L, 2L),
                    .Label = c("111", "112", "121",
                               "131", "211", "221"),
                    class = "factor"),
acc = c(0L, 0L, 0L, 0L, 0L, 0L)),
row.names = c(1L, 2L, 3L, 4L, 5L, 6L),
class = "data.frame")


lvls <- levels(as.factor(dm$class))

dm %>% 
  dplyr::mutate(
    predicted = make_two_class_pred(
      estimate = pred_convert,
      levels = levels(lvls),
      threshold = .12
    )
  )
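For what it's worth, the zero-length replacement comes from levels = levels(lvls): lvls is already a character vector, so levels(lvls) returns NULL. Passing the vector directly avoids the error (though the confusing message is still worth fixing):

dm %>%
  dplyr::mutate(
    predicted = make_two_class_pred(
      estimate = pred_convert,
      levels = lvls,  # not levels(lvls), which is NULL
      threshold = .12
    )
  )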
