kim0sun / glca

An R Package for Multiple-Group Latent Class Analysis

License: GNU General Public License v3.0

R 63.06% C++ 36.94%

latent-class-analysis multilevel-models r-package cran r

glca's Introduction

`glca`: An R Package for Multiple-Group Latent Class Analysis

Fits multiple-group latent class analysis (LCA) for exploring differences between populations in data with a multilevel structure. There are two approaches to reflecting group differences in glca: fixed-effect LCA (Bandeen-Roche et al., 1997 doi:10.1080/01621459.1997.10473658; Clogg and Goodman, 1985 doi:10.2307/270847) and nonparametric random-effect LCA (Vermunt, 2003 doi:10.1111/j.0081-1750.2003.t01-1-00131.x).

Introduction

Latent class analysis (LCA) is one of the most popular discrete mixture models for classifying individuals based on their responses to multiple manifest items. When the data contain existing subgroups that represent different populations, researchers are often interested in comparing certain aspects of the latent class structure across these groups. In multiple-group LCA models, individuals are dependent owing to the multilevel data structure, where observation units (i.e., individuals) are nested within a higher-level unit (i.e., group). This paper describes the implementation of multiple-group LCA in the R package glca for exploring differences in latent class structure between populations, taking the multilevel data structure into account. The package glca handles both fixed-effect LCA and random-effect LCA; the former can be applied when populations are segmented by the observed group variable itself, whereas the latter can be used when the group variable has too many levels to make meaningful group comparisons.

Installation

You can install the released version of glca from CRAN with:

install.packages("glca")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("kim0sun/glca")

glca's Issues

Ordinal level of variables in LCA?

Dear Mr. Kim,

I tried using your great package for multiple-group LCA. However, my manifest variables entering the model are not binary, but ordinal with five levels. When trying to compare the models with gofglca(), the output gives the following:

In gofglca(own_celkem_lca2, own_celkem_lca3, test = "boot", seed = 1) :
Since responses are different, deviance table does not printed.

Is it somehow possible to include non-binary variables in the models so the function can print a deviance table?
Or is there any other way to decide which model (without groups; with groups and measure.inv = T; or measure.inv = F) is the appropriate one?
Thank you.

Support for the three-step (bias adjusted) method for auxiliary variables

Hello,

Thanks for providing an excellent module.

However, I was wondering if you would be interested in adding support for the three-step method of handling covariates or distal outcome variables? (See this article for a good overview.) This method has the advantage that the covariates don't alter the latent class measurement model, which generally makes it easier to use and allows a greater number of covariates to be added.

The first step of the method involves fitting an LCA model without covariates.

The second step requires calculating probabilities for the "most likely class". These can be calculated fairly straightforwardly from the posterior class probabilities provided by glca.

The third step involves fitting an LCA model with the covariates, but based on the most likely class probabilities calculated in step two. From my understanding (which is undoubtedly not complete!) this requires fixing the class probabilities to specified values. I don't believe glca currently supports this.

Would you have any interest in adding this functionality? I'm happy to help where possible.

Thanks.

customize plot function

Hello,

I would like to know how to fully customize the plot function. I'm trying to change the axis labels, rotate the x-axis tick marks by 90 degrees, and translate the title and legend into another language. I have read the function's documentation and noticed that it mentions "further arguments passed to or from other methods," but unfortunately, I couldn't find more information or examples to help me. Is it possible to get more guidance on how to proceed? Thank you in advance.

beta weight

Hi, Mr. Kim

I have a question regarding the beta weights.
Please see the attached image.
Can beta be negative? Isn't it a standardized logistic regression coefficient?

Thank you!
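A short note on the question above, offered as a sketch rather than an authoritative answer: in latent class regression the covariate coefficients are usually multinomial-logit coefficients, that is, unstandardized log odds relative to a reference class. Under that assumption a negative beta is perfectly valid and simply corresponds to an odds ratio below 1. The two values below are taken from the coefficient table quoted in a later issue on this page, where the "Odds Ratio" column matches exp(Coefficient).

```r
# Coefficients (log odds) copied from the coefficient table quoted later on this page
b <- c("(Intercept)" = 1.0313, regiaoNORDESTE = -0.7147)

# Exponentiating a log-odds coefficient gives the odds ratio
odds_ratio <- exp(b)

# A negative coefficient maps to an odds ratio between 0 and 1
print(round(odds_ratio, 4))
```

The printed values should agree, up to rounding, with the 2.8047 and 0.4893 reported in that table.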

gofglca() can't be provided a list of models

It would be nice not to have to pass a separate argument for each model to be compared with the gofglca() function.
Given that it is standard practice to increase the number of classes incrementally as an enumeration procedure, it would be nice if one could just provide a list of models like this:

LCA <- map(2:6, function(nclass) {
  items %>% glca(item(names(.)) ~ 1, data = ., nclass = nclass, verbose = T)
})
gofglca(LCA)

gofglca() returns this error: Error in gofglca(LCA) : All objects should be glca outputs.

but if I split it like this

gofglca(LCA[[1]], LCA[[2]], LCA[[3]])
Then it works.
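A possible workaround, sketched under assumptions: base R's do.call() spreads a list into separate arguments, which is the calling convention the error message implies. The function `gof_standin` below is a hypothetical stand-in that only mimics that convention (it is not part of glca), and the "models" are fake objects that merely carry the class "glca", so the example runs without the package.

```r
# Hypothetical stand-in mimicking gofglca()'s calling convention:
# models arrive as separate arguments and each must inherit from "glca".
gof_standin <- function(object, ...) {
  models <- list(object, ...)
  if (!all(sapply(models, inherits, "glca")))
    stop("All objects should be glca outputs.")
  length(models)  # number of models that would be compared
}

# Fake "fitted models" with the right class, for illustration only
fits <- lapply(2:4, function(k) structure(list(nclass = k), class = "glca"))

# Passing the bare list fails, as reported in the issue
failed <- inherits(try(gof_standin(fits), silent = TRUE), "try-error")

# do.call() spreads the list into separate arguments, so the check passes
n_compared <- do.call(gof_standin, fits)

# With the real package the analogous calls would be (untested here):
# do.call(gofglca, LCA)
# do.call(gofglca, c(LCA, list(test = "boot", seed = 1)))
```

One caveat: functions that label models from the deparsed call may print unwieldy labels when invoked through do.call(), so the output may need tidying.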

How can I create bivariate residuals and classification error?

poLCA models have these measures (http://daob.nl/wp-content/uploads/2015/07/ESRA-course-slides.pdf)

Is it possible to calculate them in glca?
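For the classification-error part of this question, a rough sketch is possible from posterior class probabilities alone. The example uses a made-up posterior matrix; with a fitted model one would substitute the posterior probabilities stored in the glca output (how they are stored is an assumption to check). Bivariate residuals are not covered here.

```r
# Made-up posterior probabilities: one row per observation, one column per class
post <- rbind(c(0.9, 0.1),
              c(0.6, 0.4),
              c(0.2, 0.8),
              c(0.3, 0.7))
colnames(post) <- c("Class 1", "Class 2")

# Modal (most likely) class for each observation
modal <- max.col(post, ties.method = "first")

# Classification table: rows are modal assignments, columns are the
# expected counts of "true" class membership implied by the posteriors
class_table <- rowsum(post, group = modal)

# Expected proportion of classification errors under modal assignment
class_error <- 1 - mean(post[cbind(seq_len(nrow(post)), modal)])

print(class_table)
print(class_error)
```

The off-diagonal entries of `class_table` are the expected misclassifications, and `class_error` is their total share of the sample.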

Warning message: Since responses are different, deviance table does not printed.

Hi,

I am receiving the following warning: "Since responses are different, deviance table does not printed". Has the bug been fixed in the most recent release or is there a workaround for the released version? For reference, I have 22 items and am unable to use the development version on GitHub.

Thanks so much,

Barry

Plotting

Hi, Mr. Kim

I have a question regarding the plot function in LCA.
When I run the LCA, it seems that the "Item Response Probabilities by Class" plot is drawn using the Rho(Y = 1) values.
Am I right?

I would like to plot Rho(Y = 2) instead.
Do you have an option for that?
Also, when creating "Class prevalences by Group", the legend on the right does not seem to be correct (the colors for Class 1 and Class 3 are the same; see the attached picture).
In addition, the axis on the left does not extend to 1. It doesn't have to extend to 1, but since my Class 1 contains more than 50% of the people, I want to show roughly what percentage of people are classified as Class 1 (see the second attached picture).

Thank you!

Euijin

parameters multiple group lca

Hello,

I am not sure if this question is appropriate but I will try.
I am running a multiple-group LCA and was asked to replicate my results in Mplus. The results for an LCA without a grouping variable are identical. But now I noticed that when running a multiple-group LCA, Mplus always has one more parameter. I guess that is why the log-likelihood as well as BIC and AIC are different. However, the class prevalences in Mplus and glca are always identical.
I am running an LCA with 6 variables (3 levels each). With three classes, there are 38 parameters. When adding a grouping variable with two groups, I get 40 parameters in glca and 41 parameters in Mplus.
I am not that good at statistics. I am able to calculate the parameters for a model without a grouping variable, but I am confused when adding a grouping variable.

Does somebody have any ideas, or can somebody help me calculate the parameters?
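The counts reported above can be reproduced with simple arithmetic, sketched here under the usual LCA parameterization: each item contributes (levels - 1) free response probabilities per class, and each set of class prevalences contributes (classes - 1) free parameters. The variable names are illustrative only.

```r
n_items   <- 6
n_levels  <- 3
n_classes <- 3
n_groups  <- 2

# Item-response probabilities: (levels - 1) free values per item and class
rho_params <- n_classes * n_items * (n_levels - 1)      # 36

# Class prevalences: (classes - 1) free values
gamma_params <- n_classes - 1                           # 2

# Single-group model
single_group <- rho_params + gamma_params               # 38

# Two groups with measurement invariance: item-response probabilities are
# shared across groups, but each group has its own class prevalences
multi_group_inv <- rho_params + n_groups * gamma_params # 40

print(c(single_group, multi_group_inv))
```

This matches the 38 and 40 reported for glca. One possible explanation for the 41 in Mplus, not verified here, is that Mplus also counts the (groups - 1) = 1 parameter for the group proportions when the grouping variable is modelled as a known class; that would shift the log-likelihood and information criteria while leaving the class prevalences unchanged.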

Install error

Hi, Dr. Kim

I can't install this package. I am getting this error:

install.packages("glca")
Installing package into ‘C:/Users/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘glca’
These will not be installed

Can you please help me to fix it?

Thank you!

Warning message: "Since responses are different, deviance table does not print"

Professor Kim: I am testing the measurement invariance of a latent class analysis of 26 binary items. I am first running a series of LCA models to determine the optimal number of latent classes. For theoretical reasons, we believe the optimal number is either 6 or 7 classes.

I have successfully run the competing models.

lca6 <- glca(f, data = df, nclass = 6, seed = 1)
lca7 <- glca(f, data = df, nclass = 7, seed = 1)

I am now trying to use gofglca() with test = "boot":
goftest <- gofglca(lca6, lca7, test = "boot", seed = 1)

I get a Goodness of Fit Table. However, I get the following warning message: "Since responses are different, deviance table does not print".

I am confused because these are the exact same samples. Could you please help me understand this warning message?

Thank you

Chi-square p value of gofglca()

Hi Mr. Kim,

I have a problem with measurement invariance.
When I used the gofglca() function to test invariance across groups, I compared mglca1 and mglca2:
mglca1 <- glca(f, group = year, data = data, nclass = 4, seed = 1, verbose = FALSE)
mglca2 <- glca(f, group = year, data = data, nclass = 4, seed = 1, measure.inv = FALSE, verbose = FALSE)

The Pr(>Chi) in the Analysis of Deviance Table showed p < .001 for the comparison of mglca1 and mglca2, while the goodness-of-fit table showed that mglca1 fit better than mglca2 (comparing CAIC and BIC).

So, I wonder whether this result means that the measurement invariance assumption is supported (because mglca1 is better) or that measurement invariance across groups is rejected (because the p-value is smaller than 0.001)?

Thanks!

MLCA standard errors

Dear kim0sun,

When I add two clusters to my two-class model, the standard errors of my covariates become extremely small. Is this normal?

Thank you!

Anders

Extracting Class or Clusters as dataframe

Hi Prof Sun,
Thank you for your wonderful package. After running the model, is there a way of extracting the predicted class for each observation?
I tried using the maximum probability from the lca$posterior matrix to classify each observation, but the result doesn't seem right.
Any pointers will be appreciated.
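One thing worth checking, offered as an assumption rather than a confirmed description of the package: the posterior component may be a list holding one table per group (possibly a single element for a model without groups) rather than a single matrix, in which case taking row maxima over the list itself gives odd results. The sketch below uses a made-up list of that shape; `assign_class` is a hypothetical helper, not a glca function.

```r
# Made-up posterior object shaped like a list with one table per group
posterior <- list(
  G1 = data.frame("Class 1" = c(0.9, 0.2), "Class 2" = c(0.1, 0.8),
                  check.names = FALSE),
  G2 = data.frame("Class 1" = c(0.4),      "Class 2" = c(0.6),
                  check.names = FALSE)
)

# Hypothetical helper: modal class and its probability, stacked over groups
assign_class <- function(post_list) {
  out <- lapply(names(post_list), function(g) {
    p <- as.matrix(post_list[[g]])
    data.frame(group = g,
               class = max.col(p, ties.method = "first"),
               prob  = apply(p, 1, max))
  })
  do.call(rbind, out)
}

pred <- assign_class(posterior)
print(pred)
```

Because the result is stacked group by group, its row order may differ from the original data frame; if the posterior tables carry row names, those are the safer key for merging back.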

How do I apply the model generated to other dataframe?

poLCA has a posterior function, e.g. poLCA.posterior(trained_model, new.data).
Can glca apply a fitted model to another data frame, as poLCA does?
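Whether glca offers a direct counterpart is not confirmed here, but the underlying calculation is just Bayes' rule: the posterior probability of class c is proportional to the class prevalence times the product of the item-response probabilities for the observed answers. The sketch below uses made-up parameters; `lca_posterior`, `gamma`, and `rho` are hypothetical names, and pulling the estimates out of a fitted glca object is left as an assumption to verify against the package documentation.

```r
# Made-up parameters for a 2-class model with two binary items
gamma <- c(0.6, 0.4)                      # class prevalences
rho <- list(                              # rows: classes, columns: response categories
  item1 = rbind(c(0.9, 0.1), c(0.2, 0.8)),
  item2 = rbind(c(0.7, 0.3), c(0.4, 0.6))
)

# Hypothetical helper: posterior class probabilities for new responses.
# `y` holds integer category codes (1, 2, ...), one column per item,
# in the same order as the elements of `rho`.
lca_posterior <- function(gamma, rho, y) {
  y <- as.matrix(y)
  lik <- matrix(gamma, nrow(y), length(gamma), byrow = TRUE)
  for (m in seq_along(rho)) {
    # rho[[m]][, y[, m]] is classes x observations; transpose to match `lik`
    lik <- lik * t(rho[[m]][, y[, m], drop = FALSE])
  }
  lik / rowSums(lik)                      # normalize each row to sum to 1
}

new_data <- rbind(c(1, 1), c(2, 2))
post_new <- lca_posterior(gamma, rho, new_data)
print(round(post_new, 3))
```

This sketch ignores missing responses, covariates, and group-specific parameters, all of which would need extra handling.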

How to change reference level for covariates in latent class analysis?

I'm using the glca package to estimate a latent class model with covariates. In the table of covariate coefficients, the reference value for the regiao variable is set to CENTRO-OESTE, but I would like to set it to SUDESTE.

However, I can't seem to find any argument in the glca function to set the reference levels for each covariate. I checked the package documentation but there's no mention of how to do this.

Here is an example of the coefficient table:

                                    Odds Ratio Coefficient  Std. Error  t value  Pr(>|t|)
(Intercept)                             2.8047      1.0313      0.5850    1.763    0.0781 .
sexoM                                   1.1808      0.1662      0.1864    0.892    0.3726
regiaoNORDESTE                          0.4893     -0.7147      0.4752   -1.504    0.1328
regiaoNORTE                             0.2707     -1.3068      0.6937   -1.884    0.0598 .
regiaoSUDESTE                           0.5940     -0.5209      0.4273   -1.219    0.2231
regiaoSUL                               0.6329     -0.4575      0.4557   -1.004    0.3155
gde_areaCiências Biológicas             1.9771      0.6816      0.3534    1.929    0.0540 .
gde_areaCiências da Saúde               0.6264     -0.4677      0.3312   -1.412    0.1581
gde_areaCiências Exatas e da Terra      1.8469      0.6135      0.3687    1.664    0.0963 .
gde_areaCiências Humanas                0.5270     -0.6405      0.3110   -2.060    0.0396 *
gde_areaCiências Sociais Aplicadas      0.5824     -0.5406      0.3591   -1.505    0.1325
gde_areaEngenharias                     1.4810      0.3927      0.3889    1.010    0.3127
gde_areaLingüística, Letras e Artes     0.8935     -0.1126      0.4627   -0.243    0.8078
gde_areaOutra                           0.5580     -0.5834      0.5493   -1.062    0.2883
cat_nivel1B                             1.5110      0.4128      0.4167    0.990    0.3221
cat_nivel1C                             1.2448      0.2190      0.3806    0.575    0.5651
cat_nivel1D                             1.1694      0.1565      0.3399    0.460    0.6454
cat_nivel2                              1.8092      0.5929      0.2991    1.982    0.0476 *
cat_nivelSR                             1.2244      0.2025      0.7557    0.268    0.7888
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
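A common base-R approach, sketched under the assumption that glca builds covariate dummies from factor levels in the usual way (the coefficient names such as regiaoNORDESTE suggest this): the first level of a factor serves as the reference, so releveling the factor before fitting should change the reference category. The data frame below is a made-up stand-in for the poster's data.

```r
# Made-up stand-in for the poster's data
df <- data.frame(
  regiao = c("CENTRO-OESTE", "NORDESTE", "NORTE", "SUDESTE", "SUL", "SUDESTE"),
  stringsAsFactors = TRUE
)

# By default levels are sorted alphabetically, so CENTRO-OESTE is the reference
old_ref <- levels(df$regiao)[1]

# Move SUDESTE to the first position so it becomes the reference level
df$regiao <- relevel(df$regiao, ref = "SUDESTE")
new_ref <- levels(df$regiao)[1]

# The dummy columns a standard model matrix would now create
dummy_names <- colnames(model.matrix(~ regiao, data = df))
print(dummy_names)
```

After releveling, the model would be refitted with the same glca() call; the SUDESTE row should then disappear from the coefficient table and a CENTRO-OESTE row should appear in its place.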
