
xai2shiny's Introduction

xai2shiny

R build status Coverage Status

Overview

The xai2shiny R package creates a Shiny application for explainers (adapters for machine learning models created with the DALEX package). Turn your model into an interactive application containing the model's predictions, performance and many XAI methods with just one function. Furthermore, with xai2shiny you can easily export your application to the cloud and share it with others.

Installation

# Install the development version from GitHub:
devtools::install_github("ModelOriented/xai2shiny")

Example

The usage example below is based on the titanic dataset and includes GLM and random forest models. The final application is created using the script below. First, the explainers need to be created:

library("xai2shiny")
library("ranger")
library("DALEX")

# Creating ML models
model_rf <- ranger(survived ~ .,
                   data = titanic_imputed,
                   classification = TRUE, 
                   probability = TRUE)
model_glm <- glm(survived ~ .,
                 data = titanic_imputed,
                 family = "binomial")

# Creating DALEX explainers
explainer_rf <- explain(model_rf,
                        data = titanic_imputed[,-8],
                        y = titanic_imputed$survived)

explainer_glm <- explain(model_glm,
                         data = titanic_imputed[,-8],
                         y = titanic_imputed$survived)

Then all that is left to do is run:

xai2shiny::xai2shiny(explainer_glm, explainer_rf, 
                     directory = './',
                     selected_variables = c('gender', 'age'),
                     run = FALSE)

In the xai2shiny call above, apart from the explainers, the following arguments were provided:

  • directory - the location where the whole xai2shiny directory with the required files (the app and the explainers) is created,
  • selected_variables - a vector of variables selected at app start-up (used for observation modification and local explanations),
  • run - whether to run the app immediately after it is created (a sketch of launching the written app manually follows this list).
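
Since run = FALSE only writes the files to disk, the generated application can be started by hand later. A minimal sketch, assuming the default xai2shiny folder created inside the chosen directory (it contains app.R plus the saved explainers):

# Launch the previously generated application
shiny::runApp("./xai2shiny")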

Cloud deployment

The application can then be deployed to the cloud. Only three steps are needed to enjoy your new xai2shiny application in the cloud.

  1. If you don't have an account on DigitalOcean, create one here and get $100 free credit.
  2. Create an SSH key if you don't have one yet.
  3. Deploy the SSH key to DigitalOcean

And that's it: you are ready to get back to R and deploy your application. To create a new cloud instance (called a droplet by DigitalOcean) running Docker on Ubuntu with all prerequisites installed, just run:

xai2shiny::cloud_setup(size)
  • size - the desired RAM size for the droplet, defaults to 1GB. It can be modified later through DigitalOcean's website.

Now that your droplet is set up, just deploy the created xai2shiny application with one function (see the end-to-end sketch after the argument list):

deploy_shiny(droplet = <your_droplet_id>, path = './xai2shiny', packages = "ranger")
  • droplet - the droplet object or the droplet's ID, which can be read from running analogsea::droplets(),
  • path - the path to the xai2shiny application,
  • packages - the packages used to create or run the models; they will be installed on the droplet.
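
Putting the pieces together, a minimal end-to-end sketch of the deployment (assuming size is the RAM amount in GB, as the 1GB default suggests; the droplet ID is illustrative and comes from the analogsea::droplets() listing):

# 1. Create a Docker-ready droplet with all prerequisites installed
xai2shiny::cloud_setup(size = 1)

# 2. Look up the new droplet's ID
analogsea::droplets()

# 3. Deploy the generated app folder and install the packages the models need
xai2shiny::deploy_shiny(droplet = 12345678,        # illustrative ID from the listing above
                        path = './xai2shiny',
                        packages = "ranger")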

And that's it: the xai2shiny application is running and will automatically open in your default web browser. Now all that's left is to share it!

Functionality

The main function, xai2shiny, creates the Shiny app.R file and runs it, converting your models into an interactive application.

At the moment it supports the following functionalities for multiple models in one application (the sketch after this list shows the equivalent plain DALEX calls):

  1. Model prediction
  2. Model performance (with text descriptions of measures)
  3. Local explanations: (with text descriptions)
    • Break Down plot
    • SHAP values plot
    • Ceteris Paribus plot
  4. Global explanations:
    • Feature importance plots
    • Partial Dependence plots
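
These panels mirror the standard DALEX explanation functions, so presumably the app wraps the calls below. A minimal sketch of computing the same explanations directly with DALEX, reusing explainer_glm and the data from the example above:

library("DALEX")
obs <- titanic_imputed[1, -8]

predict(explainer_glm, obs)                              # model prediction
model_performance(explainer_glm)                         # performance measures
predict_parts(explainer_glm, obs, type = "break_down")   # Break Down
predict_parts(explainer_glm, obs, type = "shap")         # SHAP values
predict_profile(explainer_glm, obs)                      # Ceteris Paribus profiles
model_parts(explainer_glm)                               # feature importance
model_profile(explainer_glm, variables = "age")          # partial dependence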

Acknowledgments

Work on this package was financially supported by the Polish National Science Centre under Opus Grant number 2017/27/B/ST6/0130.

xai2shiny's People

Contributors

adamoso, mckraqs


Forkers

han-tun, abson-dev

xai2shiny's Issues

more verbose xai2shiny::xai2shiny

After the directory is created there is no message saying where it was created or whether the process was successful.
Be more verbose and let the user know that the process was OK and where the output is
(see the verbose parameter in DALEX::explain).
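
A hypothetical sketch (not the package's current behaviour) of the kind of confirmation message the issue asks for, modelled on the messages printed by DALEX::explain; directory stands for the argument documented in the README above:

directory <- "./"                                  # the directory passed to xai2shiny()
app_dir   <- file.path(directory, "xai2shiny")

message("  -> app.R and explainer files written to : ",
        normalizePath(app_dir, mustWork = FALSE))
message("  -> xai2shiny app created successfully")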

Error running xai2shiny

Hi @Adamoso it's me again! I've updated your package to the most recent version and now I'm getting the following error:

Error in withSpinner(uiOutput("textPred"), hide.ui = FALSE) : 
  unused argument (hide.ui = FALSE)

Reproducible example:

library(xai2shiny) # devtools::install_github("ModelOriented/xai2shiny")
library(lares) # devtools::install_github("laresbernardo/lares")
ignore <- c("PassengerId","Ticket","Cabin")
model <- h2o_automl(dft, Survived, ignore = ignore, quiet = FALSE)
explainer <- h2o_explainer(model$datasets$test, model = model$model, y = "Survived", ignore = ignore)
xai2shiny(explainer)
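
The hide.ui argument was added to shinycssloaders::withSpinner() in shinycssloaders 1.0.0, so this error usually points to an older installed version. A hedged check and possible fix, assuming that is the cause:

# withSpinner() gains hide.ui only from shinycssloaders 1.0.0 onwards;
# updating the package should let the generated app parse again.
if (packageVersion("shinycssloaders") < "1.0.0") {
  install.packages("shinycssloaders")
}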

AUC in regression models

I have tested the code with two regression models: xgboost and glm. The results include an AUC plot, which is meaningless for regression.
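
DALEX explainers record the detected task in explainer$model_info$type, so one possible fix direction is to branch on it and only show ROC/AUC for classification. An illustrative sketch, not the package's current code, where explainer is any DALEX explainer:

# Skip the ROC/AUC panel for regression explainers and fall back to the
# default regression measures (RMSE, R2, MAD) instead.
if (identical(explainer$model_info$type, "classification")) {
  plot(DALEX::model_performance(explainer), geom = "roc")
} else {
  DALEX::model_performance(explainer)
}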

Error rendering - missing comma

Error running xai2shiny function, something about a missing comma?

# Load libraries

# Data transformations
suppressMessages(library(tidyverse))
suppressMessages(library(data.table))

# Saving data to disk
suppressMessages(library(feather))
suppressMessages(library(arrow))
suppressMessages(library(here))

# Feature engineering
suppressMessages(library(recipes))
suppressMessages(library(yardstick))

# Machine learning
suppressMessages(library(tidymodels))
suppressMessages(library(themis))

# Explainer
suppressMessages(library(DALEX))

dir <- "/home/paulc/projects_Paul/31_user_journey"

# Load models
lasso_model <- readRDS(paste0(dir, "/R/models/02_glmnet/", "final_model_smote.Rds"))
ranger_model <- readRDS(paste0(dir, "/R/models/03_randomForest/", "final_model_smote.Rds"))

# Read data
path <- paste0(dir, "/R/data/processed/", "data_final.parquet")
df.data <- setDT(read_parquet(path))

# Delete col_to_del
col_to_del <- c("username", "user_id", "start_activity",
                "end_activity", "cohort", "min_to_purchase",
                "token_bonus_ratio", "first_purchase")
df.data[, (col_to_del) := NULL]

# Split the data into training and testing sets
set.seed(2020)
train_test_split <- df.data %>%
  initial_split(prop = 0.8, strata = label_fct)

# Set recipe
recipie_num <- training(train_test_split) %>%
  recipe(label_fct ~ .) %>%                          # Formula
  step_mutate(label_fct = as.factor(label_fct)) %>%
  step_normalize(all_predictors()) %>%
  step_smote(label_fct) %>%
  prep()

# Create the final data
df.train <- as.data.frame(juice(recipie_num))
df.test <- as.data.frame(bake(recipie_num, new_data = testing(train_test_split)))

# Binary variable for explainer
df.testing_original <- testing(train_test_split)
yTest <- as.integer(ifelse(df.testing_original$label_fct == "yes", 1, 0))

df.test <- df.test %>%
  select(-label_fct)

custom_predict <- function(object, newdata) {
  pred <- predict(object, newdata, type = "prob")
  response <- pred$.pred_yes
  return(response)
}

lasso_explainer <- DALEX::explain(model = lasso_model,
                                  data = df.test,
                                  y = yTest,
                                  predict_function = custom_predict,
                                  label = "Lasso",
                                  colorize = FALSE)
#> Preparation of a new explainer is initiated
#> -> model label : Lasso
#> -> data : 20514 rows 5 cols
#> -> target variable : 20514 values
#> -> predict function : custom_predict
#> -> predicted values : numerical, min = 0.1982451 , mean = 0.35857 , max = 0.9535754
#> -> model_info : package parsnip , ver. 0.1.3 , task classification ( default )
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.9535754 , mean = -0.3346352 , max = 0.8017549
#> A new explainer has been created!

ranger_explainer <- DALEX::explain(model = ranger_model,
                                   data = df.test,
                                   y = yTest,
                                   predict_function = custom_predict,
                                   label = "Random Forest",
                                   colorize = FALSE)
#> Preparation of a new explainer is initiated
#> -> model label : Random Forest
#> -> data : 20514 rows 5 cols
#> -> target variable : 20514 values
#> -> predict function : custom_predict
#> -> predicted values : numerical, min = 0.2152264 , mean = 0.3666207 , max = 0.9012534
#> -> model_info : package parsnip , ver. 0.1.3 , task classification ( default )
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.9012534 , mean = -0.3426858 , max = 0.7847736
#> A new explainer has been created!
library(xai2shiny)
xai2shiny(lasso_explainer, ranger_explainer)
#> Loading required package: shiny
#> Error in parse(file, keep.source = FALSE, srcfile = src, encoding = enc) :
#> /tmp/RtmpdRkELf/xai2shiny/app.R:11:1: unexpected ','
#> 10: library(parsnip)
#> 11: ,
#> ^
#> Possible missing comma at:
#> 30: if(!is.null(header)) tags$li(class="header",header),
#> ^
#> Possible extra comma at:
#> 127: column(width = 3, uiOutput("pdpvariable"),),
#> ^
#> Possible missing comma at:
#> 153: nulls <- sapply(obs, function(x) length(x) == 0)
#> ^
#> Error in sourceUTF8(fullpath, envir = new.env(parent = sharedEnv)): Error sourcing /tmp/RtmpdRkELf/xai2shiny/app.R
sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: CentOS Linux 7 (Core)
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices datasets utils methods base
#>
#> other attached packages:
#> [1] shiny_1.5.0 xai2shiny_0.1.0 DALEX_2.0
#> [4] themis_0.1.2 workflows_0.2.0 tune_0.1.1
#> [7] rsample_0.0.8 parsnip_0.1.3 modeldata_0.0.2
#> [10] infer_0.5.3 dials_0.0.9 scales_1.1.1
#> [13] broom_0.7.0 tidymodels_0.1.1.9000 yardstick_0.0.7
#> [16] recipes_0.1.13 here_0.1 arrow_1.0.0.20200728
#> [19] feather_0.3.5 data.table_1.12.8 forcats_0.5.0
#> [22] stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4
#> [25] readr_1.4.0 tidyr_1.1.2 tibble_3.0.4
#> [28] ggplot2_3.3.2 tidyverse_1.3.0
#>
#> loaded via a namespace (and not attached):
#> [1] readxl_1.3.1 mlr_2.17.1 backports_1.1.10
#> [4] fastmatch_1.1-0 plyr_1.8.6 shinydashboard_0.7.1
#> [7] splines_4.0.2 listenv_0.8.0 digest_0.6.26
#> [10] foreach_1.5.0 htmltools_0.5.0 fansi_0.4.1
#> [13] magrittr_1.5 checkmate_2.0.0 BBmisc_1.11
#> [16] unbalanced_2.0 doParallel_1.0.15 globals_0.13.0
#> [19] modelr_0.1.8 gower_0.2.2 colorspace_1.4-1
#> [22] blob_1.2.1 rvest_0.3.5 haven_2.3.1
#> [25] xfun_0.17 crayon_1.3.4 jsonlite_1.7.1
#> [28] survival_3.1-12 iterators_1.0.12 glue_1.4.2
#> [31] gtable_0.3.0 ipred_0.9-9 shape_1.4.4
#> [34] DBI_1.1.0 Rcpp_1.0.5 xtable_1.8-4
#> [37] GPfit_1.0-8 bit_1.1-15.2 lava_1.6.8
#> [40] prodlim_2019.11.13 glmnet_4.0-2 httr_1.4.1
#> [43] sourcetools_0.1.7 FNN_1.1.3 ellipsis_0.3.1
#> [46] pkgconfig_2.0.3 ParamHelpers_1.14 nnet_7.3-14
#> [49] dbplyr_1.4.4 tidyselect_1.1.0 rlang_0.4.8
#> [52] DiceDesign_1.8-1 later_1.1.0.1 munsell_0.5.0
#> [55] cellranger_1.1.0 tools_4.0.2 cli_2.1.0
#> [58] generics_0.0.2 ranger_0.12.1 evaluate_0.14
#> [61] fastmap_1.0.1 yaml_2.2.1 knitr_1.30
#> [64] bit64_0.9-7 fs_1.4.2 shinycssloaders_1.0.0
#> [67] RANN_2.6.1 future_1.19.1 whisker_0.4
#> [70] mime_0.9 xml2_1.3.2 compiler_4.0.2
#> [73] rstudioapi_0.11 reprex_0.3.0 lhs_1.0.2
#> [76] stringi_1.5.3 highr_0.8 lattice_0.20-41
#> [79] Matrix_1.2-18 shinyjs_2.0.0 vctrs_0.3.4
#> [82] pillar_1.4.6 lifecycle_0.2.0 furrr_0.1.0
#> [85] httpuv_1.5.4 R6_2.4.1 promises_1.1.1
#> [88] renv_0.12.0-12 codetools_0.2-16 MASS_7.3-51.6
#> [91] assertthat_0.2.1 rprojroot_1.3-2 shinyWidgets_0.5.4
#> [94] ROSE_0.0-3 withr_2.3.0 parallel_4.0.2
#> [97] hms_0.5.3 grid_4.0.2 rpart_4.1-15
#> [100] timeDate_3043.102 class_7.3-17 rmarkdown_2.3
#> [103] parallelMap_1.5.0 pROC_1.16.2 lubridate_1.7.9
Created on 2020-10-22 by the reprex package (v0.3.0)

finish TODOS

line 23

# TODO: create observation based on average data for each variable
chosen_observation <- data[1,-8]
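
A minimal sketch of what the TODO could look like, assuming numeric columns are summarised by their mean and factor or character columns by their most frequent level (data stands for the explainer's data, as in the snippet above):

# Hypothetical helper: build a single "average" observation from the data
# instead of simply taking its first row.
average_observation <- function(data) {
  as.data.frame(lapply(data, function(col) {
    if (is.numeric(col)) {
      mean(col, na.rm = TRUE)
    } else {
      names(sort(table(col), decreasing = TRUE))[1]   # most frequent level
    }
  }), stringsAsFactors = FALSE)
}

chosen_observation <- average_observation(data[, -8])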
