Giter VIP home page Giter VIP logo

Comments (8)

larmarange avatar larmarange commented on August 22, 2024 1

Thank you very much @vincentarelbundock for your feedback.

Knowing that, I will try to sort the result of avg_predictions() then.

from broom.helpers.

larmarange avatar larmarange commented on August 22, 2024

Thanks for the report. I will explore it later this week

from broom.helpers.

larmarange avatar larmarange commented on August 22, 2024

The order of the factor levels are not respected by marginaleffects::avg_predictions().

library(tidyverse)
#> Warning: le package 'ggplot2' a été compilé avec la version R 4.3.3
#> Warning: le package 'tidyr' a été compilé avec la version R 4.3.3
library(labelled)
library(gtsummary)
library(marginaleffects)

data(hdv2003, package = "questionr")

d <-
  hdv2003 |> 
  mutate(
    sexe = sexe |> fct_relevel("Femme"),
    groupe_ages = age |>
      cut(
        c(18, 25, 45, 65, 99),
        right = FALSE,
        include.lowest = TRUE,
        labels = c("18-24 ans", "25-44 ans",
                   "45-64 ans", "65 ans et plus")
      ),
    etudes = nivetud |> 
      fct_recode(
        "Primaire" = "N'a jamais fait d'etudes",
        "Primaire" = "A arrete ses etudes, avant la derniere annee d'etudes primaires",
        "Primaire" = "Derniere annee d'etudes primaires",
        "Secondaire" = "1er cycle",
        "Secondaire" = "2eme cycle",
        "Technique / Professionnel" = "Enseignement technique ou professionnel court",
        "Technique / Professionnel" = "Enseignement technique ou professionnel long",
        "Supérieur" = "Enseignement superieur y compris technique superieur"
      ) |> 
      fct_na_value_to_level("Non documenté")  
  ) |> 
  set_variable_labels(
    sport = "Pratique un sport ?",
    sexe = "Sexe",
    groupe_ages = "Groupe d'âges",
    etudes = "Niveau d'études",
    heures.tv = "Heures de télévision / jour"
  )

mod <- glm(
  sport ~ sexe + groupe_ages + etudes + heures.tv,
  family = binomial,
  data = d
)

mod |> 
  tbl_regression(
    tidy_fun = broom.helpers::tidy_marginal_predictions,
    type = "response",
    estimate_fun = scales::label_percent(accuracy = 0.1)
  ) |> 
  as_kable()
Characteristic Average Marginal Predictions 95% CI p-value
Sexe
Femme 32.5% 29.9%, 35.0% <0.001
Homme 40.4% 37.5%, 43.2% <0.001
Groupe d’âges
25-44 ans 42.7% 39.3%, 46.2% <0.001
18-24 ans 51.2% 42.2%, 60.1% <0.001
45-64 ans 29.9% 26.6%, 33.2% <0.001
65 ans et plus 24.9% 19.7%, 30.0% <0.001
Niveau d’études
Supérieur 53.2% 48.4%, 57.9% <0.001
Non documenté 59.2% 47.0%, 71.5% <0.001
Primaire 16.1% 11.9%, 20.4% <0.001
Technique / Professionnel 34.0% 30.3%, 37.7% <0.001
Secondaire 31.8% 27.2%, 36.4% <0.001
Heures de télévision / jour
0 41.0% 37.6%, 44.3% <0.001
1 38.6% 36.2%, 41.0% <0.001
2 36.3% 34.3%, 38.2% <0.001
3 34.0% 31.7%, 36.2% <0.001
12 16.8% 8.6%, 25.1% <0.001
levels(mod$data$etudes)
#> [1] "Primaire"                  "Secondaire"               
#> [3] "Technique / Professionnel" "Supérieur"                
#> [5] "Non documenté"

avg_predictions(mod, variables = "etudes", by = "etudes")
#> 
#>                     etudes Estimate Pr(>|z|)    S 2.5 % 97.5 %
#>  Supérieur                    0.534    0.193  2.4 0.483  0.584
#>  Non documenté                0.599    0.144  2.8 0.466  0.718
#>  Primaire                     0.149   <0.001 90.5 0.113  0.193
#>  Technique / Professionnel    0.329   <0.001 46.8 0.291  0.370
#>  Secondaire                   0.307   <0.001 39.9 0.262  0.357
#> 
#> Columns: etudes, estimate, p.value, s.value, conf.low, conf.high 
#> Type:  invlink(link)

Created on 2024-03-19 with reprex v2.1.0

from broom.helpers.

larmarange avatar larmarange commented on August 22, 2024

It seems related to vincentarelbundock/marginaleffects#1039

from broom.helpers.

vincentarelbundock avatar vincentarelbundock commented on August 22, 2024

FYI, the order of categories in the output of avg_predictions() respects the order in which the levels appear in the dataset. The issue linked sounds related, but it is not actually related (it is about the datagrid() function which plays no role in marginalization).

The logic for respecting the order of appearance in avg_predictions is to be consistent with most aggregation functions in R. For example:

dat <- transform(mtcars, cyl = factor(cyl, c(4, 6, 8)))

data.table::data.table(dat)[, .(mean(mpg)), by = cyl]

          cyl       V1
       <fctr>    <num>
    1:      6 19.74286
    2:      4 26.66364
    3:      8 15.10000

dat |> dplyr::summarize(mpg_avg = mean(mpg), .by = cyl)

      cyl  mpg_avg
    1   6 19.74286
    2   4 26.66364
    3   8 15.10000

# but not this one
aggregate(mpg ~ cyl, FUN = mean, data = dat)

      cyl      mpg
    1   4 26.66364
    2   6 19.74286
    3   8 15.10000

One user-level solution is to sort the dataset beforehand.

The linked issue will not solve the present issue.

from broom.helpers.

larmarange avatar larmarange commented on August 22, 2024

Hi @nicolas-robette

could you have a look at #246 ?

from broom.helpers.

larmarange avatar larmarange commented on August 22, 2024
library(tidyverse)
library(labelled)
library(gtsummary)
library(marginaleffects)

data(hdv2003, package = "questionr")

d <-
  hdv2003 |> 
  mutate(
    sexe = sexe |> fct_relevel("Femme"),
    groupe_ages = age |>
      cut(
        c(18, 25, 45, 65, 99),
        right = FALSE,
        include.lowest = TRUE,
        labels = c("18-24 ans", "25-44 ans",
                   "45-64 ans", "65 ans et plus")
      ),
    etudes = nivetud |> 
      fct_recode(
        "Primaire" = "N'a jamais fait d'etudes",
        "Primaire" = "A arrete ses etudes, avant la derniere annee d'etudes primaires",
        "Primaire" = "Derniere annee d'etudes primaires",
        "Secondaire" = "1er cycle",
        "Secondaire" = "2eme cycle",
        "Technique / Professionnel" = "Enseignement technique ou professionnel court",
        "Technique / Professionnel" = "Enseignement technique ou professionnel long",
        "Supérieur" = "Enseignement superieur y compris technique superieur"
      ) |> 
      fct_na_value_to_level("Non documenté")  
  ) |> 
  set_variable_labels(
    sport = "Pratique un sport ?",
    sexe = "Sexe",
    groupe_ages = "Groupe d'âges",
    etudes = "Niveau d'études",
    heures.tv = "Heures de télévision / jour"
  ) 

mod <- glm(
  sport ~ sexe + groupe_ages + etudes + heures.tv,
  family = binomial,
  data = d
)

mod |> 
  tbl_regression(
    tidy_fun = broom.helpers::tidy_marginal_predictions,
    type = "response",
    estimate_fun = scales::label_percent(accuracy = 0.1)
  ) |> 
  as_kable()
Characteristic Average Marginal Predictions 95% CI p-value
Sexe
Femme 32.5% 29.9%, 35.0% <0.001
Homme 40.4% 37.5%, 43.2% <0.001
Groupe d’âges
18-24 ans 51.2% 42.2%, 60.1% <0.001
25-44 ans 42.7% 39.3%, 46.2% <0.001
45-64 ans 29.9% 26.6%, 33.2% <0.001
65 ans et plus 24.9% 19.7%, 30.0% <0.001
Niveau d’études
Primaire 16.1% 11.9%, 20.4% <0.001
Secondaire 31.8% 27.2%, 36.4% <0.001
Technique / Professionnel 34.0% 30.3%, 37.7% <0.001
Supérieur 53.2% 48.4%, 57.9% <0.001
Non documenté 59.2% 47.0%, 71.5% <0.001
Heures de télévision / jour
0 41.0% 37.6%, 44.3% <0.001
1 38.6% 36.2%, 41.0% <0.001
2 36.3% 34.3%, 38.2% <0.001
3 34.0% 31.7%, 36.2% <0.001
12 16.8% 8.6%, 25.1% <0.001

Created on 2024-04-05 with reprex v2.1.0

from broom.helpers.

nicolas-robette avatar nicolas-robette commented on August 22, 2024

This is great, thanks for the fix !
Bests,
Nicolas

from broom.helpers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.