I've been reading through various sites that describe how to compute approximate Shapley values with fastshap for a binomial glm model, but so far I have not been able to make it work. Here's what I have been using:
library(fastshap)
x1 <- c(1,1,1,0,0,0,0,0,0,0)
x2 <- c(1,0,0,1,1,1,0,0,0,0)
x3 <- c(3,2,1,3,2,1,3,2,1,3)
x4 <- c(1,0,1,1,0,1,0,1,0,1)
y <- c(1,0,1,0,1,1,0,0,0,1)
df <- data.frame(x1, x2, x3, x4, y)
fit <- glm(y ~ ., data=df, family=binomial)
X <- model.matrix(y ~., df)[,-1]
pfun <- function(object, newdata) {
predict(object, type="response")
}
shap <- explain(fit , X = X, pred_wrapper = pfun, nsim = 10)
summary(shap)
x1 x2 x3 x4
Min. :0 Min. :0 Min. :0 Min. :0
1st Qu.:0 1st Qu.:0 1st Qu.:0 1st Qu.:0
Median :0 Median :0 Median :0 Median :0
Mean :0 Mean :0 Mean :0 Mean :0
3rd Qu.:0 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
Max. :0 Max. :0 Max. :0 Max. :0
All zeros is obviously not what I expected. With the exact method, I do get values that make sense:
shap <- explain(fit , X = X, exact=TRUE, nsim = 10)
summary(shap)
x1 x2 x3 x4
Min. :-0.3659 Min. :-0.8149 Min. :-0.62699 Min. :-1.0497
1st Qu.:-0.3659 1st Qu.:-0.8149 1st Qu.:-0.62699 1st Qu.:-1.0497
Median :-0.3659 Median :-0.8149 Median : 0.06967 Median : 0.6998
Mean : 0.0000 Mean : 0.0000 Mean : 0.00000 Mean : 0.0000
3rd Qu.: 0.5489 3rd Qu.: 1.2223 3rd Qu.: 0.59215 3rd Qu.: 0.6998
Max. : 0.8538 Max. : 1.2223 Max. : 0.76632 Max. : 0.6998
However, I cannot use the exact method on my actual data because the model features are not independent. Any help getting the approximate method to work would be appreciated!
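
One thing worth checking in the approximate call: the prediction wrapper never uses its newdata argument, so it returns the training-set fitted values no matter which rows fastshap passes in. The Monte Carlo estimate is built from differences between predictions on perturbed rows, so a wrapper that ignores those rows would give zero everywhere. Below is a minimal sketch of a wrapper that scores the rows it is given. It assumes that X is kept as a data frame of the feature columns (rather than the model.matrix output) so that predict() can match columns by name; the as.data.frame() call and the seed value are illustrative choices, not requirements of the package.

```r
library(fastshap)

# Same toy data as in the question
df <- data.frame(
  x1 = c(1,1,1,0,0,0,0,0,0,0),
  x2 = c(1,0,0,1,1,1,0,0,0,0),
  x3 = c(3,2,1,3,2,1,3,2,1,3),
  x4 = c(1,0,1,1,0,1,0,1,0,1),
  y  = c(1,0,1,0,1,1,0,0,0,1)
)
fit <- glm(y ~ ., data = df, family = binomial)

# Features only, kept as a data frame (assumption: predict.glm needs
# named columns, which a plain matrix does not provide)
X <- df[, c("x1", "x2", "x3", "x4")]

# Wrapper that scores whatever rows it is handed instead of
# silently falling back to the training data
pfun <- function(object, newdata) {
  predict(object, newdata = as.data.frame(newdata), type = "response")
}

# The approximation is random, so fix a seed for repeatable output
set.seed(101)
shap <- explain(fit, X = X, pred_wrapper = pfun, nsim = 10)
summary(shap)
```

A quick sanity check on any wrapper is to call it on a few rows, for example pfun(fit, X[1:3, ]), and confirm that it returns exactly one prediction per row. With only nsim = 10 the estimates will be noisy; a larger nsim should stabilize them at the cost of run time.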