Comments (8)
No. For this code line:
stats::dbinom(round(n * resp), round(n), predicted, log = TRUE) * w
the predicted values are p for one model and 1 - p for the other, so we need to correct the response value, which should be inverted between the two models. This should be fixed in PR #835.
d <- mtcars
d$zero <- factor(d$vs, levels = c(0, 1))
d$ones <- factor(d$vs, levels = c(1, 0))
ml_zero <- glm(zero ~ mpg, family = binomial, data = d)
ml_ones <- glm(ones ~ mpg, family = binomial, data = d)
logLik(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
logLik(ml_ones)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_ones)
#> 'log Lik.' -12.76667 (df=2)
performance::compare_performance(ml_zero, ml_ones)
#> When comparing models, please note that probably not all models were fit
#> from same data.
#> # Comparison of Model Performance Indices
#>
#> Name | Model | AIC (weights) | AICc (weights) | BIC (weights) | Tjur's R2 | RMSE | Sigma | Log_loss | Score_log | Score_spherical | PCP
#> ---------------------------------------------------------------------------------------------------------------------------------------------
#> ml_zero | glm | 29.5 (0.500) | 29.9 (0.500) | 32.5 (0.500) | 0.474 | 0.361 | 1.000 | 0.399 | -14.308 | 0.093 | 0.741
#> ml_ones | glm | 29.5 (0.500) | 29.9 (0.500) | 32.5 (0.500) | 0.474 | 0.361 | 1.000 | 2.104 | -18.429 | 0.079 | 0.259
Created on 2023-11-27 with reprex v2.0.2
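As a sanity check on the claim above, here is a minimal sketch (made-up `y` and `p` vectors, not insight's code) showing that the Bernoulli log-likelihood is invariant when the response and the probability are flipped together, which is why the totals for the two codings should agree:

```r
# If both the labels and the probabilities are flipped, each per-observation
# term is unchanged: dbinom(y, 1, p) == dbinom(1 - y, 1, 1 - p).
y <- c(0, 1, 1, 0)
p <- c(0.2, 0.8, 0.6, 0.3)
ll_zero <- sum(stats::dbinom(y, size = 1, prob = p, log = TRUE))
ll_ones <- sum(stats::dbinom(1 - y, size = 1, prob = 1 - p, log = TRUE))
all.equal(ll_zero, ll_ones)
#> [1] TRUE
```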
from insight.
Issue in insight.
df <- mtcars
df$zero <- factor(df$vs, levels = c(0, 1))
df$ones <- factor(df$vs, levels = c(1, 0))
ml_zero <- glm(zero ~ mpg, family = binomial, data = df)
ml_ones <- glm(ones ~ mpg, family = binomial, data = df)
logLik(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
logLik(ml_ones)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_ones)
#> 'log Lik.' -67.33844 (df=2)
Created on 2023-11-27 with reprex v2.0.2
Tagging @bwiernik @DominiqueMakowski
Doesn't get_loglikelihood() pull logLik() when it can? 🤔
I think the implementation (from you?) is:
- Pull per-observation log-likelihoods, if possible (Line 212 in 09551e9).
- Only fall back to logLik() if we don't have per-observation lls (Line 406 in 09551e9).
If I recall right, your intention was:
attr(out, "per_obs") <- lls # This is useful for some model comparison tests
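A rough sketch of that intended flow (assumed from the lines cited above; `per_obs_loglik()` is a hypothetical stand-in for insight's internals, not a real function):

```r
# Sketch only: prefer per-observation log-likelihoods, fall back to logLik().
get_ll <- function(model) {
  # Hypothetical helper; returns NULL when per-observation lls are unavailable.
  lls <- tryCatch(per_obs_loglik(model), error = function(e) NULL)
  if (is.null(lls)) {
    return(stats::logLik(model))  # fallback path
  }
  out <- sum(lls)
  attr(out, "per_obs") <- lls  # kept for model-comparison tests
  out
}
```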
library(insight)
df <- mtcars
df$zero <- factor(df$vs, levels = c(0, 1))
df$ones <- factor(df$vs, levels = c(1, 0))
ml_zero <- glm(zero ~ mpg, family = binomial, data = df)
ml_ones <- glm(ones ~ mpg, family = binomial, data = df)
get_response(ml_zero, as_proportion = TRUE)
#> [1] 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1
#> Levels: 0 1
get_response(ml_ones, as_proportion = TRUE)
#> [1] 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1
#> Levels: 1 0
Created on 2023-11-27 with reprex v2.0.2
The response is the same for both models.
Yeah, it's get_predicted(), which is different here:
df <- mtcars
df$zero <- factor(df$vs, levels = c(0, 1))
df$ones <- factor(df$vs, levels = c(1, 0))
ml_zero <- glm(zero ~ mpg, family = binomial, data = df)
ml_ones <- glm(ones ~ mpg, family = binomial, data = df)
logLik(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_zero)
#> 'log Lik.' -12.76667 (df=2)
stats::deviance(ml_zero)
#> [1] 25.53334
insight::get_predicted(ml_zero, ci = NULL, verbose = FALSE)
#> Predicted values:
#>
#> [1] 0.55122251 0.55122251 0.72717879 0.59333677 0.31338533 0.26065097
#> [7] 0.06427450 0.84144476 0.72717879 0.36143691 0.23654569 0.14500953
#> [13] 0.19990013 0.09188885 0.01265744 0.01265744 0.07543906 0.99401399
#> [19] 0.98595713 0.99685235 0.60367900 0.10324638 0.09188885 0.04275502
#> [25] 0.36143691 0.94869098 0.91354266 0.98595713 0.11582865 0.41243065
#> [31] 0.08495358 0.59333677
#>
#> NOTE: Confidence intervals, if available, are stored as attributes and can be accessed using `as.data.frame()` on this output.
logLik(ml_ones)
#> 'log Lik.' -12.76667 (df=2)
insight::get_loglikelihood(ml_ones)
#> 'log Lik.' -67.33844 (df=2)
stats::deviance(ml_ones)
#> [1] 25.53334
insight::get_predicted(ml_ones, ci = NULL, verbose = FALSE)
#> Predicted values:
#>
#> [1] 0.448777493 0.448777493 0.272821214 0.406663228 0.686614674 0.739349027
#> [7] 0.935725503 0.158555240 0.272821214 0.638563093 0.763454313 0.854990468
#> [13] 0.800099872 0.908111148 0.987342561 0.987342561 0.924560943 0.005986014
#> [19] 0.014042865 0.003147652 0.396321000 0.896753623 0.908111148 0.957244981
#> [25] 0.638563093 0.051309021 0.086457341 0.014042865 0.884171352 0.587569354
#> [31] 0.915046417 0.406663228
#>
#> NOTE: Confidence intervals, if available, are stored as attributes and can be accessed using `as.data.frame()` on this output.
Created on 2023-11-27 with reprex v2.0.2
And since later:
lls <- stats::dbinom(round(n * resp), round(n), predicted, log = TRUE) * w
the per-observation lls are different (because the predictions are p for ml_zero, and 1 - p for ml_ones), and again later we have:
out <- sum(lls)
That's where the difference comes from.
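To make the mechanics concrete, here is a small sketch (made-up `y` and `p`, not insight's code) of how pairing 1 - p with an unflipped response deflates the total:

```r
# When the response is kept as-is but the predictions come from the model with
# the reversed factor coding, each observation is scored with the wrong p.
y <- c(0, 1, 1, 0)          # same recovered response for both codings
p <- c(0.2, 0.8, 0.6, 0.3)  # P(y = 1) under the "zero" coding
sum(stats::dbinom(y, size = 1, prob = p, log = TRUE))      # correct total
sum(stats::dbinom(y, size = 1, prob = 1 - p, log = TRUE))  # mismatched: much lower
```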
If I am understanding this right, is it the standard R value that is incorrect?