yrosseel / lavaan Goto Github PK
View Code? Open in Web Editor NEWan R package for structural equation modeling and more
Home Page: http://lavaan.org
an R package for structural equation modeling and more
Home Page: http://lavaan.org
The theta matrix from inspect has class "lavaan.vector", but class "lavaan.matrix.symmetric" would make more sense. Here is a test case:
´´´´
library(lavaan)
model <- '
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8
dem60 ~ ind60
dem65 ~ ind60 + dem60
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data=PoliticalDemocracy)
class(inspect(fit, "theta"))
´´´´
As near as I can tell, while the bootstrapping code supports parallelism, there is no way to pass the parallel
option through a call to sem
or lavaan
.
If you request R-Square values for a correlation without latent variables, you will get a rather cryptic error and no results at all. I guess a warning would be better:
model <- 'x1 ~~ x2'
fit <- cfa(model, data = HolzingerSwineford1939)
summary(fit, rsquare=TRUE)
Errror:
Error: !is.null(pt2$lhs) ist nicht TRUE
If works fine with summary(fit, rsquare=FALSE)
.
Version: lavaan_0.5-23.1097
Hi Yves,
I'm running a simulation study using a multiple group analysis of a single DV to test for model fit while allowing group variances to be freely estimated. The main interest is in whether different estimators are more effective than others (e.g., WLS, DWLS, ML, etc...you may recall responding to an Google thread submitted by Rob Cribbie a few weeks back explicitly asking for this feature).
However, while running the simulations a peculiar bug kept arising when group sizes were unequal and using any estimator with a weight matrix. It seems that I get perfect convergence when the sample sizes are equal, and often < 30% to even 2% when the sizes are unequal (even by as little as 1 group difference in sample size). Here is an example to reproduce the problem using the current lavaan dev version:
# instability example when using WLS, DWLS, and WLSMV estimators
library(lavaan)
dat <- HolzingerSwineford1939[c(1:10, 301:292), ]
model <- 'x1 ~~ x1'
fit <- sem(model, data=dat, group="school", group.equal = 'intercepts')
fitWLS <- sem(model, data=dat, group="school", group.equal = 'intercepts', estimator='WLS')
#fitWLS missing (unequal).
#14 not converged
mods1 <- vector('list', 20)
for(i in 1:20)
mods1[[i]] <- sem(model, data=dat[-i,], group="school", group.equal = 'intercepts', estimator='WLS')
#fitWLS missing (equal)
#all converged, despite smaller sample size
mods2 <- vector('list', 10)
for(i in 1:10)
mods2[[i]] <- sem(model, data=dat[-c(i, 21-i), ], group="school", group.equal = 'intercepts', estimator='WLS')
Other programs (such as EQS and Mplus) don't seem to have this problem when using these estimators, so I believe this may be a bug. Sorry for submitting the problem here and not in the Google group but to me this seemed more like a issue report rather than a general query (I prefer potential bug reports on github, but you may not!). Let me know if you need any more examples or clarification. Cheers!
Phil
I am estimating the following empirically under identified model
library(lavaan)
d <- diag(3)
colnames(d) <- rownames(d) <- c("y1","y2","y3")
cfa("F =~ y1 + y2 + y3", sample.cov = d, sample.nobs = 100)´´´
I know that this should not work, but the error message that I get is not particularly informative
Error in solve.default(S) :
Lapack routine dgesv: system is exactly singular: U[1,1] = 0
I'm trying to compare models to assess measurement invariance with ordinal indicator variables. In the first model, all parameters are free to vary between groups; the second model restrains the loadings to be equal. In the second model, all loadings are free to be estimated (via NA in the model syntax) and the lv metric is set to one in the reference group.
But, when I try to compare both models via the 'anova' function, I get the following warning:
Warning message:
In lav_partable_constraints_ceq(P0, txtOnly = TRUE) :
lavaan WARNING: non-free parameter(s) in equality constraint(s): .p1. .p7.
Both (internal) parameter refer to the loadings of the first indicator variables, but the syntax explicitly frees them (via 'NA') and adds the constraint (via parameter label). The LR table shows both DF and Chisq for both models, but offers no information on model comparison. If I simply free both parameters but don't constrain them, the model comparison displays as expected.
I tried to investigate and noticed that, when I estimate the first indicator parameter and constrain it, the 'lav_partable_constraints_ceq' function (or rather, the function that calls it, lav_test_diff_af_h1) returns this function:
function (.x., ...)
{
out <- numeric(0L)
.p1. <- NA
.p7. <- NA
.p2. <- .x.[1]
.p3. <- .x.[2]
.p4. <- .x.[3]
.p5. <- .x.[4]
.p6. <- .x.[5]
.p8. <- .x.[6]
.p9. <- .x.[7]
.p10. <- .x.[8]
.p11. <- .x.[9]
.p12. <- .x.[10]
out[1] <- .p1. - (.p7.)
out[2] <- .p2. - (.p8.)
out[3] <- .p3. - (.p9.)
out[4] <- .p4. - (.p10.)
out[5] <- .p5. - (.p11.)
out[6] <- .p6. - (.p12.)
out[7] = .x.[78] - 1
return(out)
}
<environment: 0x7609ce8>
But if I fit the model without the constraint, the function has no 'NA's:
function (.x., ...)
{
out <- numeric(0L)
.p2. <- .x.[1]
.p3. <- .x.[2]
.p4. <- .x.[3]
.p5. <- .x.[4]
.p6. <- .x.[5]
.p8. <- .x.[6]
.p9. <- .x.[7]
.p10. <- .x.[8]
.p11. <- .x.[9]
.p12. <- .x.[10]
out[1] <- .p2. - (.p8.)
out[2] <- .p3. - (.p9.)
out[3] <- .p4. - (.p10.)
out[4] <- .p5. - (.p11.)
out[5] <- .p6. - (.p12.)
out[6] = .x.[78] - 1
return(out)
}
<environment: 0x8b8d658>
And, indeed, if I call the lavTestLRT function with parameter A.method='delta', the Chisq Diff is computed.
Thanks,
Erikson
See __ for fill description
However, in looking through the code, in the portion of lav_partable_labels.R starting at 64 none of the options pertain to composite loadings. So, something like
# LOADINGS
if("composite_loadings" %in% group.equal)
g1.flag[ partable$op == "<~" & partable$group == 1L ] <- TRUE
would seem to be in order.
There might also need to be a change in lav_partable.R around line 312 and lav_options.R around 204.
Hi Yves,
A colleague of mine was playing around with adjusting the starting values in lavaan
and noticed a bug relating to starting values when defining residual correlations.
This occurs when covariance terms are defined as fixed parameters, while the starting values for the variance terms ignore this information and instead are defined too low (causing a non-PD Sigma matrix). They noticed that EQS doesn't seem to do this since it appears to adjust the variance starting values accordingly until the matrices are valid.
Here is some reproducible code demonstrating the problem.
C <- scan(nlines = 5)
0.52781600 0.25541680 0.37734312 0.3666932 0.05954574
0.25541680 0.48958089 0.29120484 0.3257644 0.07582144
0.37734312 0.29120484 0.78670514 0.5020104 0.08264549
0.36669320 0.32576439 0.50201044 0.7148545 0.10810947
0.05954574 0.07582144 0.08264549 0.1081095 0.24767669
C <- matrix(C,5,5)
colnames(C) <- rownames(C) <- c("Satis","Influenc", "Climate", "Respect", "sex1f2m")
model <- "
Satis~Influenc + Climate + Respect
Influenc~sex1f2m
Climate~sex1f2m
Respect~sex1f2m
#Covariances
Influenc~~ Climate
Influenc ~~ Respect
Climate~~ Respect
"
fit <- sem(model, sample.cov = C, sample.nobs = 200)
summary(fit)
#use new starting values
model2 <- "
Satis~Influenc + Climate + Respect
Influenc~sex1f2m
Climate~sex1f2m
Respect~sex1f2m
#Variances and Covariances
Influenc~~ .265*Climate
Influenc ~~ .291*Respect
Climate~~ .464*Respect
"
#error, variance starting value problem
fit2 <- sem(model2, sample.cov = C, sample.nobs = 200)
sv <- sem(model2, sample.cov = C, sample.nobs = 200, do.fit=FALSE)
cov2cor(sv@[email protected][[1]]) #illegal correlations
#fix the issue by changing the starting variance terms
model3 <- "
Satis~Influenc + Climate + Respect
Influenc~sex1f2m
Climate~sex1f2m
Respect~sex1f2m
#Variances and Covariances (suitable starting values)
Climate ~~ start(1)*Climate
Respect ~~ start(1)*Respect
Influenc~~ .265*Climate
Influenc ~~ .291*Respect
Climate~~ .464*Respect
"
fit3 <- sem(model3, sample.cov = C, sample.nobs = 200)
summary(fit3)
I think the fix should be easy enough by simply checking whether the matrices are PD, and adjusting the free parameters accordingly until the matrices are valid. Cheers.
Please update the documentation for the type argument for "ov.x", "ov.y", "eqs.x", and "eqs.y".
Hi Yves,
Related to the issue about unit variance identification in multiple groups, I noticed that std.lv=TRUE gives an odd behavior in multiple-groups analysis. Specifically, it sets the factor variances to 1 even when loadings are equated. A good default in this case might be to set variance to 1 in one group, but free in the others, per typical identification rules. Or maybe it's better just to warn the user about this behavior since it yields non-equivalent models in the multiple-groups case.
library(lavaan)
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
'
#these are not equivalent models because variances are not freed in the second group in the uvi model
mweak_uli <- sem(mmod_uvi, data = HolzingerSwineford1939, std.lv=FALSE, group = "school", group.equal="loadings", meanstructure=TRUE)
mweak_uvi <- sem(mmod_uvi, data = HolzingerSwineford1939, std.lv=TRUE, group = "school", group.equal="loadings", meanstructure=TRUE)
Thanks!
Michael
Feature request: Specify parameters to add in a particular group (block / cluster / level). Currently, a parameter (e.g., "f1=~x1") is freed for the first group in a multigroup model.
example(cfa)
mgfit <- update(fit, group = "school")
## this only adds parameter in group 1 (no EPCs in group 2)
lavTestScore(mgfit, add = 'visual =~ x5', epc = TRUE)
## error
lavTestScore(mgfit, add = 'visual =~ c(NA, NA)*x5')
## No error with parTable, but nothing freed
MIs <- modindices(mgfit)[c(2, 55), 1:3]
MIs$school <- 1:2
MIs
lavTestScore(mgfit, add = MIs)
In multi-group CFA, calls like inspect(fit, "cov.lv")
return a list with one entry per group. However, inspect(fit, 'est')
flattens the list, so that the 5 entries (lambda, theta, psi, nu, alpha) are repeated for each group. For example, with two groups, it will return a list with 10 entries, rather than 2.
The expected outcome seems like having one entry per subject, like 'cov.lv'.
example from lavaan docs:
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit <- cfa(HS.model,
data = HolzingerSwineford1939,
group = "school")
inspect(fit, 'est') # 10 entries in single list
This is the case with all the what options listed under "Model Entries" in the docs:
bad_whats <- c('est', 'dx.free', 'dx.all', 'std', 'std.lv', 'std.nox')
good_whats <- c('cov.lv', 'cor.lv', 'mean.lv', 'cov.ov', 'cor.ov', 'mean.ov', 'cov.all', 'cor.all', 'th')
# should return 2 for the HS data
test_what <- function(what) length(inspect(fit, what))
unlist(lapply(good_whats, test_what))
unlist(lapply(bad_whats, test_what))
I think it would be a quick fix. If returning one entry per subject seems like the correct behavior, just let me know and I'm happy to make the changes and submit a pull request.
Edit. Just as a note in case someone needs a workaround, you could use something like...
# flattened list with named entries for each component
est <- inspect(fit, 'est')
# vector of group labels
g_lab <- inspect(fit, 'group.label')
# number of unique components (e.g. lambda, theta, psi, nu)
# this is necessary because some components will be omitted
# if something like meanstructure=FALSE is used when fitting
n_comp <- length(unique(names(est)))
# assumes that order of elements in est corresponds to group vector
g_est <- split(est, rep(g_lab, each=n_comp))
# can verify on HS example above
cov1 <- inspect(fit, 'cov.ov')[['Grant-White']]
cov2 <- with(g_est[['Grant-White']], lambda %*% psi %*% t(lambda) + theta)
all(cov1 == cov2)
Hi Yves,
With anova() now being a wrapper to lavTestLRT(), it appears that some arguments are mistakenly repeated. For example, if we call anova(), the SB.classic argument can become both a named argument and a "dots" argument to lavTestLRT(). This results in an error:
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data=HolzingerSwineford1939)
## Error:
anova(fit, SB.classic=FALSE)
The above is using version 0.5-16; I apologize if it is already fixed in the devel version.
I believe the fix involves checking whether or not SB.classic, SB.H0, etc, are supplied as arguments to anova(), prior to calling lavTestLRT(). This way, we know what should be passed in as dots.
Best,
Ed
This happens in some models but not others, but I have a repeatable example for 0.5-20:
library(semTools)
library(lavaan)
borModel <- '
NDVI ~ nTot + T61 + Wet
nTot ~ T61
'
borFit <- sem(borModel, data=boreal, meanstructure=T)
> head(residuals(borFit, "casewise"))
NDVI nTot T61 Wet
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 0 0 0
[4,] 0 0 0 0
[5,] 0 0 0 0
[6,] 0 0 0 0
This is incorrect - there should be non-zero residuals for NDVI and nTot. This doesn't always happen, though, so I'm not sure what's going on here. FYI, this example worked in 0.5-17 just fine.
Hi Yves,
I would like to identify a multiple-groups measurement invariance model (e.g., weak) using the 'unit variance identification' approach (aka standardized factors). In the multiple-groups case, all loadings are free parameters that are equated between groups. The factor variances are fixed to 1 in one group, but free in the others. I tried to specify this using a parameter vector that contains a 1 for one group, but a label for the other. This led lavaan to apply the label to both groups as an equality constraint. Syntax follows:
#trying to mix constraints and labels to achieve ULI in two groups
mmod_uvi <-'
visual =~ NA*x1 + x2 + x3
textual =~ NA*x4 + x5 + x6
speed =~ NA*x7 + x8 + x9
#does not work -- applies an equality constraint
visual ~~ c(1, v2)*visual
textual ~~ c(1, t2)*textual
speed ~~ c(1, s3)*speed
'
mweak_uvi_bad <- sem(mmod_uvi, data = HolzingerSwineford1939, std.lv=FALSE, group = "school", group.equal="loadings", meanstructure=TRUE)
Thanks for continuing to develop this excellent package!
Michael
The function "simulateData" is currently not able to provide standardized output in the general case, as the following example demonstrates:
lav <- lavaanify( "
y ~ (-.5) * x + (.25) * z
x ~ .5*U
z ~ .5*U
")
diag(cov(simulateData( lav, standardized=TRUE, sample.nobs=100000 )))
The variance of y far exceeds 1.
When Lavaan converges to a non-positive definite residual covariance matrix, it issues a warning:
"lavaan WARNING: residual covariance matrix is not positive definite; use inspect(fit,"cov.ov") to investigate."
Inspecting fit for cov.ov works fine, but this option does not seem to be documented in lavaan-class documentation.
Moreover, the naming of cov.ov can be misleading. The documentation explains cov.all and cov.lv as follows:
"cov.all":
Model implied covariance matrix of both the observed and latent variables.
"cov.lv":
Model implied covariance matrix of the latent variables.
Just by reading the documentation, I would assume that cov.ov would be defined in the following way:
"cov.ov":
Model implied covariance matrix of the observed variables.
One way to avoid potential confusion would be to change the error message to be the following:
"lavaan WARNING: residual covariance matrix (theta) is not positive definite; use inspect(fit,"coef")$theta to investigate."
and then change inspect(..., "cov.ov") so that it returns the model implied matrix.
In the interest of reproducibility, could you add an argument iseed
to the bootstrapLavaan()
and bootstrapLRT()
functions, which could then be passed to parallel::clusterSetRNGStream(cl, iseed)
when you call it on line 351 of lav_bootstrap.R and on line 299 of lav_bootstrap_lrt.R? If so, your help page description of the argument iseed
might also warn the interested user to first set RNGkind()[1L] == "L'Ecuyer-CMRG"
.
Thanks!
Terry
I can get the "strictly positive" robust difference test of Satorra & Bentler (2010) to go negative (see attached). When I look at the result of line 227 or so of lav_test_diff.R, it looks as though m10 is not properly using the starting values from M0 (i.e., PE.M0.extended) even though it is specified as the "start" argument. I've seen this happen recently before in some of my other work. If both "model" and "start" arguments are specified and "model" happens to be a parameter table with certain columns (e.g., "est"), then the "start" argument appears to have no effect.
See attached example. Still can go negative when the model syntax has custom start values. I suppose this might mean that the test is also just not correct if the user has start values in their syntax.
SB2010NegativeAgain.pdf
WLS.VD is NULL when WLS.V is user-specified in the lavaan call. This generates a bug in lavaan.survey when DWLS is chosen as an estimator.
Minimal R example:
fitMeasures(fit, fit.measures = "bic2") does not deliver any information.
Workaround:
fitMeasures(fit)[22]
fitMeasures(fit, c("bic", "bic2"))["bic2"]
Lavaan version: 0.5.20
For a variety of situations it can be helpful to get maximum likelihood estimates of means and covariance matrices. lavaan can do this, but the functions are all internal and there is no convenient exported function to accomplish it. I wonder if the internal functions could be exported, or alternately the function below perhaps added to lavaan (uses roxygen2 style documentation):
#' Estimate the first and second moments
#'
#' This function relies on the \pkg{lavaan} package to use the
#' Expectation Maximization (EM) algorithm to estimate the first and
#' second moments (means and [co]variances) when there is missing data.
#'
#' @param data A data frame or an object coercable to a data frame.
#' The means and covariances of all variables are estimated.
#' @param \dots Additional arguments passed on to the \code{estimate.moments.EM}
#' function in \pkg{lavaan}. Note this is not an exported function.
#' @return A list containing the esimates from the EM algorithm.
#' \item{mu}{A named vector of the means.}
#' \item{sigma}{The covariance matrix.}
#' @seealso \code{\link{SEMSummary}}
#' @keywords multivariate
#' @importFrom lavaan lavaan
#' @export
#' @examples
#' # sample data
#' Xmiss <- as.matrix(iris[, -5])
#' # make 25% missing completely at random
#' set.seed(10)
#' Xmiss[sample(length(Xmiss), length(Xmiss) * .25)] <- NA
#' Xmiss <- as.data.frame(Xmiss)
#'
#' # true means and covariance
#' colMeans(iris[, -5])
#' # covariance with n - 1 divisor
#' cov(iris[, -5])
#'
#' # means and covariance matrix using list wise deletion
#' colMeans(na.omit(Xmiss))
#' cov(na.omit(Xmiss))
#'
#' # means and covariance matrix using EM
#' moments(Xmiss)
#' # clean up
#' rm(Xmiss)
moments <- function(data, ...) {
if (!is.data.frame(data)) {
data <- as.data.frame(data)
}
n <- colnames(data)
X <- lavaan:::lav_data_full(data, ov.names = n, missing = "fiml")
mpat <- lavaan:::getMissingPatternStats(X = X@X[[1L]], Mp = X@Mp[[1L]])
moments <- lavaan:::estimate.moments.EM(X = X@X[[1L]], M = mpat, verbose=FALSE, ...)
sigma <- moments$sigma
dimnames(sigma) <- list(n, n)
mu <- moments$mu
names(mu) <- n
return(list(mu = mu, sigma = sigma))
}
Hi Yves,
I'm copying a bug posted on the lavaan forum.
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data=HolzingerSwineford1939,
group = "school", group.label = c("Pasteur","Grant-White"))
On line 333 of lav_data.R, the user-supplied group.label
is compared to an all lowercase version of the group labels, so match()
returns NA
. I couldn't find tolower(group.label)
anywhere, but the labels are already lowercase in the lavoptions
argument of lavData()
.
I've been using lavaan fairly heavily for the past year, and have been interested in getting to know the code base better. One way that would help me get a sense for how everything is set up would be to work on unit tests.
Do you have ideas on how you might want unit tests implemented for lavaan? I'm familiar with testwhat, and have used it on travis, so they should be quick to set up. A small list of functions (or a group of functions) that would be simplest / most useful to start testing may be a good way to get things rolling.
I think model syntax (e.g. lavaanify, etc..) might be a good place to start. What do you think?
Hi Yves,
It seems that estfun.lavaan now adds a row of NAs to the bottom of the resulting matrix. This can be problematic when manipulating the result (say, multiplying it with other matrices).
This must be related to the new code on lines 29-30 of ctr_estfun.R, but I don't understand the need for the new code to be able to fix it. Example using the devel version:
library("lavaan")
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data=HolzingerSwineford1939)
scmat <- estfun.lavaan(fit)
## One extra row, compared to the data:
nrow(scmat); nrow(HolzingerSwineford1939)
## Last row is NA
scmat[nrow(scmat),]
"cor.all" option is missing from documentation for lavInspect function.
Here is a test case.
library(lavaan)
library(MASS)
set.seed(1)
Sigma <- matrix(.5,6,6)
diag(Sigma) <- 1
data <- mvrnorm(1000, rep(0,6), Sigma)
colnames(data) <- paste("x",1:6,sep="")
fit <- sem("
E1 <~ x1 + x2
E2 =~ x3 + x4
E3 =~ x5 + x6
E2 ~ E1
E3 ~ E1", data)
# This causes an error
inspect(fit,"cov.lv")
I checked the code in lab_object_inspect.R around line 615. The following line
colnames(OUT[[g]]) <- rownames(OUT[[g]]) <-
lavobject@pta$vnames$lv.regular[[g]]
should probably be changed to
colnames(OUT[[g]]) <- rownames(OUT[[g]]) <-
lavobject@pta$vnames$lv[[g]]
I have a CFA model and data that result in inadmissible estimates for indicator error variances. The model can be estimated normally, but applying the function standardizedSolution results in an error.
> standardizedSolution(fit1)
Error in class(VarCov) <- c("lavaan.matrix.symmetric", "matrix") :
attempt to set an attribute on NULL
In addition: Warning message:
In lav_model_vcov(lavmodel = object@Model, lavsamplestats = object@SampleStats, :
lavaan WARNING: could not compute standard errors!
It is possible to get standardized estimates by multiplying the estimated model matrices with the estimated LV variances and indicator variances. So instead of this error, I would expect one of the following:
or preferably
Here is a test case:
sample.cov <- matrix(c(0.919323977, 0.766740001, 0.066258725, -0.111425160, 0.007706098, -0.078521209, 0.012025736, 0.020127450, 0.071862143, 0.113682333, 0.134137419, 0.092911030,
0.101214337, 0.037114861, -0.105266578, -0.072374838, -0.001273495, -0.128151298, -0.012445515, -0.077614906, 0.766740001, 0.955523853, -0.063910111, -0.112417795,
-0.010720701, -0.060970584, -0.131011915, -0.065864023, -0.029530664, 0.087906957, 0.094355108, 0.017582240, 0.065034878, -0.009334287, -0.107692023, 0.002037855,
0.024592422, -0.103881277, 0.012206695, -0.037065708, 0.066258725, -0.063910111, 0.970552039, -0.025345686, 0.097430460, 0.242140260, 0.315607859, 0.238416500,
-0.043147193, 0.030302308, 0.100249935, 0.054952628, -0.163770721, 0.143041572, -0.140144810, -0.047422879, 0.128245794, 0.041148256, -0.021819288, 0.067675768,
-0.111425160, -0.112417795, -0.025345686, 1.011983891, 0.094446928, 0.156445551, 0.106725111, 0.223081246, 0.146865965, -0.031128505, 0.086008469, 0.079386433,
-0.132254476, -0.138498059, 0.115720314, 0.013446222, 0.150341646, -0.016805426, 0.161104601, 0.137577383, 0.007706098, -0.010720701, 0.097430460, 0.094446928,
0.731970201, 0.310613008, 0.239905460, 0.166641970, 0.090719750, -0.022220925, 0.023118535, -0.038252221, -0.102308806, -0.003940806, 0.034653455, -0.005238235,
-0.006395954, 0.059983330, 0.094904587, 0.050182402, -0.078521209, -0.060970584, 0.242140260, 0.156445551, 0.310613008, 1.005757893, 0.288441144, 0.325021274,
0.037701396, -0.022205750, 0.026278646, 0.015310866, -0.072382732, 0.046867411, -0.086929255, -0.023074425, 0.106481492, 0.033230197, 0.079465430, 0.114588374,
0.012025736, -0.131011915, 0.315607859, 0.106725111, 0.239905460, 0.288441144, 0.934088376, 0.351943251, -0.013773773, -0.016085897, -0.094281701, -0.018804459,
-0.263474126, -0.037801403, -0.018926066, -0.108902426, 0.074395209, 0.057898971, 0.040098640, 0.118445293, 0.020127450, -0.065864023, 0.238416500, 0.223081246,
0.166641970, 0.325021274, 0.351943251, 1.018343335, 0.007725244, 0.030409892, 0.022724372, 0.070403013, -0.144695491, 0.032331337, -0.015813722, -0.011483605,
0.114931752, 0.092925413, 0.134563805, 0.208856685, 0.071862143, -0.029530664, -0.043147193, 0.146865965, 0.090719750, 0.037701396, -0.013773773, 0.007725244,
0.836785836, 0.288802640, 0.503568392, 0.458172400, 0.052599380, -0.023244875, 0.015009319, -0.049752846, -0.070717923, 0.046889647, -0.002734165, -0.081784423,
0.113682333, 0.087906957, 0.030302308, -0.031128505, -0.022220925, -0.022205750, -0.016085897, 0.030409892, 0.288802640, 0.894978101, 0.565979912, 0.552308550,
-0.018698189, 0.028933718, 0.002242280, -0.277685495, -0.099664560, -0.089573867, -0.047466352, -0.080352177, 0.134137419, 0.094355108, 0.100249935, 0.086008469,
0.023118535, 0.026278646, -0.094281701, 0.022724372, 0.503568392, 0.565979912, 1.074109093, 0.759997208, 0.134103386, 0.009447939, 0.123764451, -0.143701074,
0.010553272, -0.002661356, 0.074887536, -0.066821944, 0.092911030, 0.017582240, 0.054952628, 0.079386433, -0.038252221, 0.015310866, -0.018804459, 0.070403013,
0.458172400, 0.552308550, 0.759997208, 1.003489078, 0.034688074, -0.003945250, 0.016964838, -0.092256520, -0.056024247, -0.105947514, -0.081623179, -0.109934600,
0.101214337, 0.065034878, -0.163770721, -0.132254476, -0.102308806, -0.072382732, -0.263474126, -0.144695491, 0.052599380, -0.018698189, 0.134103386, 0.034688074,
0.805637818, 0.315801061, -0.001715445, -0.030595632, -0.115177988, -0.125230032, -0.141002405, -0.220819320, 0.037114861, -0.009334287, 0.143041572, -0.138498059,
-0.003940806, 0.046867411, -0.037801403, 0.032331337, -0.023244875, 0.028933718, 0.009447939, -0.003945250, 0.315801061, 0.810955673, -0.068634844, -0.120377973,
-0.126023397, -0.028793192, -0.064456696, -0.136589147, -0.105266578, -0.107692023, -0.140144810, 0.115720314, 0.034653455, -0.086929255, -0.018926066, -0.015813722,
0.015009319, 0.002242280, 0.123764451, 0.016964838, -0.001715445, -0.068634844, 0.920171462, 0.360986283, 0.372790971, 0.500823316, 0.451052122, 0.443236525,
-0.072374838, 0.002037855, -0.047422879, 0.013446222, -0.005238235, -0.023074425, -0.108902426, -0.011483605, -0.049752846, -0.277685495, -0.143701074, -0.092256520,
-0.030595632, -0.120377973, 0.360986283, 0.919678344, 0.414889108, 0.427011483, 0.474684157, 0.448998532, -0.001273495, 0.024592422, 0.128245794, 0.150341646,
-0.006395954, 0.106481492, 0.074395209, 0.114931752, -0.070717923, -0.099664560, 0.010553272, -0.056024247, -0.115177988, -0.126023397, 0.372790971, 0.414889108,
0.872031833, 0.578301782, 0.583658195, 0.547883394, -0.128151298, -0.103881277, 0.041148256, -0.016805426, 0.059983330, 0.033230197, 0.057898971, 0.092925413,
0.046889647, -0.089573867, -0.002661356, -0.105947514, -0.125230032, -0.028793192, 0.500823316, 0.427011483, 0.578301782, 0.999107847, 0.684127263, 0.633731258,
-0.012445515, 0.012206695, -0.021819288, 0.161104601, 0.094904587, 0.079465430, 0.040098640, 0.134563805, -0.002734165, -0.047466352, 0.074887536, -0.081623179,
-0.141002405, -0.064456696, 0.451052122, 0.474684157, 0.583658195, 0.684127263, 0.842624341, 0.655509260, -0.077614906, -0.037065708, 0.067675768, 0.137577383,
0.050182402, 0.114588374, 0.118445293, 0.208856685, -0.081784423, -0.080352177, -0.066821944, -0.109934600, -0.220819320, -0.136589147, 0.443236525, 0.448998532,
0.547883394, 0.633731258, 0.655509260, 0.813095654), 20, 20)
rownames(sample.cov) <- colnames(sample.cov) <- c("i11", "i12", "i21", "i22", "i23", "i24", "i25", "i26", "i31", "i32", "i33", "i34", "i41", "i42", "i51", "i52", "i53", "i54", "i55", "i56")
model <- "f1 =~ i11 + i12
f2 =~ i21 + i22 + i23 + i24 + i25 + i26
f3 =~ i31 + i32 + i33 + i34
f4 =~ i41 + i42
f5 =~ i51 + i52 + i53 + i54 + i55 + i56"
# This works and produces unstandardized estimates
fit1 <- cfa(model, sample.cov = sample.cov, sample.nobs = 100)
summary(fit1)
# This works and produces standardized estimates
fit2 <- cfa(model, sample.cov = cov2cor(sample.cov), sample.nobs = 100, std.lv = TRUE)
summary(fit2)
# Getting standardized estimates from the unstandardized estimates does not work
standardizedSolution(fit1)
# But standardized estimates could nevertheless be calculated manually from the results
var.lv <- diag(inspect(fit1,"cov.lv"))
var.ov <- diag(sample.cov)
sd.lv <- sqrt(var.lv)
sd.ov <- sqrt(var.ov)
# Standardize the results matrices
psi <- inspect(fit1,"coef")$psi / (sd.lv %o% sd.lv)
lambda <- inspect(fit1,"coef")$lambda * ((1/sd.ov) %o% sd.lv)
theta <- inspect(fit1,"coef")$theta / (sd.ov %o% sd.ov)
# Compare the results from fit2 and manually standardized fit1 (these are close, but we should not expect them to be identical)
inspect(fit2,"coef")$psi
psi
inspect(fit2,"coef")$lambda
lambda
inspect(fit2,"coef")$theta
theta
Hi Yves,
I found that the chi-square values from a simple regression model was different. Here is the problematic model:
MODEL:
efficacy on years@0 cigs@0;
This should be a fixed.x = TRUE
model and Mplus provides df = 2. The number of parameter estimated is 2, which matches lavaan.
MODEL:
years cigs;
efficacy on years@0 cigs@0;
This should be a fixed.x = FALSE
model and Mplus provides df = 2 as well. The number of estimated parameter is 7, which matches lavaan as well.
However, Mplus provided different chi-square values but lavaan provides the same chi-square values in both models. The reported chi-squares from lavaan matches the fixed model in Mplus.
Beside that, the baseline.df values in lavaan do not match with Mplus. The df from fixed and free model from Mplus is both 2. However, the baseline.df values in lavaan are 0 and 3, respectively.
Here is the lavaan script:
regmodelfixed <- '
eff ~ 0*years + 0*cigs
'
out1 <- sem(regmodelfixed, data=dat, fixed.x = TRUE, meanstructure = TRUE)
out2 <- sem(regmodelfixed, data=dat, fixed.x = FALSE, meanstructure = TRUE)
inspect(out1, "fit")
inspect(out2, "fit")
Best,
Sunthud
Hi Yves,
it seems that constraining the residuals doesn't work correctly under the WLSMV-estimation.hen i have ordered ovs, residuals differ between groups but without ordered variables, those are the same (as i would expect when those are constrained). Would be glad if you could look into it!
set.seed(1)
library(mirt)
model = "x =~ Comfort + Work + Future + Benefit"
Science$group = 1:2
parameterestimates(cfa(model, data=Science, ordered = names(Science)[1:4], group="group", group.equal= c("loadings", "intercepts", "residuals")))
parameterestimates(cfa(model, data=Science, group="group", group.equal= c("loadings", "intercepts", "residuals")))
Greetings, Eves
We want you to put this warning message into an R warning, so we can catch it. This is the one we mentioned while you were visiting in Kansas
} else {
cat("\n[lavaan message:] could not compute standard errors!\n")
cat("\n You can still request a summary of the fit to inspect")
cat("\n the current estimates of the parameters.\n")
VarCov <- NULL
}
Paul Johnson
CRMDA, KU
This is not an "issue", - apologies for posting my question here.
How is the Minimum Function Test Statistic (Chi square) computed by lavaan, in terms of the fitted model objects?
Thanks
Amarnath
In the help file for the measurementInvariance function, I believe that the help file has the boolean action of the quiet argument reversed compared to actual behavior. Suggest changing 'if TRUE' to 'if FALSE' or rewriting the help file explanation to explain both states. The argument itself behaves as one would expect.
The help file reads as follows:
quiet
If TRUE, a summary is printed out containing an overview of the different models that are fitted, together with some model comparison tests.
quiet=TRUE would logically not produce output, which it does. quiet=FALSE (the default) would logically produce output, which it does. See below...
measurementInvariance(MODEL.invar, data=DATA_gender, group="Gender", quiet=TRUE)
measurementInvariance(MODEL.invar, data=DATA_gender, group="Gender", quiet=FALSE)Measurement invariance models:
Model 1 : fit.configural
Model 2 : fit.loadings
Model 3 : fit.intercepts
Model 4 : fit.means
....
I may have found a bug in lavTables(). When I provide data where one of the groups does not have observations in both categories, it returns 100% for both categories instead of 100% for one and 0% for the other. Maybe a situation for drop = FALSE when one dimension collapses down to length == 1? Here is an example with binary data
(temp <- data.frame(g = c(1,1,2,2), x = c(1,2,2,2)))
lavTables(temp, dimension = 1L, categorical = "x", group = "g")
When there are more categories, it spits out an error about differing number of rows.
(temp <- data.frame(g = c(1,1,1,2,2,2), x = c(1,2,2,1,2,3)))
lavTables(temp, dimension = 1L, categorical = "x", group = "g")
The documentation of standardizedSolution states
"Value
A data.frame containing both unstandardized and standardized model parameters."
but the function returns only the standardized model parameters.
I have a simple model with five observed variables (x1, x2, m, y1, y2). I generate data with lavaan, estimate the model, and then inspect cov.all from the result. The resulting matrix contains each observed indicator twice.
> inspect(sem1, "cov.all")
m y1 y2 x1 x2 m y1 y2 x1 x2
m 1.00
y1 0.70 1.00
y2 0.80 0.56 1.00
x1 0.70 0.49 0.56 1.00
x2 0.80 0.56 0.64 0.50 1.00
m 1.00 0.70 0.80 0.70 0.80 1.00
y1 0.70 1.00 0.56 0.49 0.56 0.70 1.00
y2 0.80 0.56 1.00 0.56 0.64 0.80 0.56 1.00
x1 0.70 0.49 0.56 1.00 0.50 0.70 0.49 0.56 1.00
x2 0.80 0.56 0.64 0.50 1.00 0.80 0.56 0.64 0.50 1.00
Here is the code that reproduces this. (The data are generated in a bit weird way, but that does not make a difference.)
library(lavaan)
b1 <- .4
b2 <- .6
b3 <- .7
b4 <- .8
r12 <- .5
# Derived paramete values
er1 <- 1-(b1^2 + b2^2 + 2*b1*b2*r12)
er2 <- 1-b3^2
er3 <- 1-b4^2
if(any(c(e1,e2,e3)<0)) stop("Parameterization results in negative variance")
MODEL <-"
x1 ~~ 1*x1
x2 ~~ 1*x2
x1 ~~ r12*x2
m ~ b1*x1 + b2*x2 + 1*e1
m~~0*m
e1 ~~ er1*e1
y1 ~ b3*m + 1*e2
y1 ~~ 0*y1
e2 ~~ er2*e2
y2 ~ b4*m + 1*e3
y2 ~~ 0*y2
e3 ~~ er3*e3
"
# Substitute the pameters into the model
for(param in c("b1","b2","b3","b4","er1","er2","er3","r12")){
MODEL <- sub(param,get(param),MODEL)
}
data <- simulateData(MODEL, empirical = TRUE, seed = 12345, sample.nobs = 10)
sem1 <- sem("
m ~ x1 + x2
y1 ~ m
y2 ~ m
", data[,1:5])
inspect(sem1, "cov.all")
I may be mistaken, but it would appear that line 507 in lav_predict.R has a minor error. Line 507 reads:
eta.i <- rep(as.numeric(NA), nfac2)
but I think nfac2
should actually be nfac
.
Reason: nfac2
stores the number of factors plus number of ov.y
dummy variables. nfac
stores just the former. The vector eta.i
should be the length of nfac
at line 511, because it then goes on to add an extra value for each ov.y
dummy in lines 511-514. If line 507 creates a vector of length nfac2
, then lines 511-514 will end up creating a vector eta.i
which is longer than the vector it replaces (FS[[g]][i,]
) in line 516.
Hi Yves,
running the example of simulateData() doesn't work for me (lavaan 0.5.17) and throws:
Fehler in lav_model_objective(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
unbenutztes Argument (link = lavoptions$link)
Can you replicate this? Best, Felix
population.model <- ' f1 =~ x1 + 0.8_x2 + 1.2_x3
f2 =~ x4 + 0.5_x5 + 1.5_x6
f3 =~ x7 + 0.1_x8 + 0.9_x9
f3 ~ 0.5_f1 + 0.6_f2'
myData <- simulateData(population.model, sample.nobs=100L)
This happens with both 0.5-20 and the latest development version:
example(cfa)
lavInspect(fit, "information.first.order")
Error in t(Delta[[g]]) %*% B1 : non-conformable arguments
The error occurs on line 272 of "lav_model_information.R"
I have a colleague who presented me with the following error message from Lavaan
Error in eigen(Sigma.hat[[g]], symmetric = TRUE, only.values = TRUE) :
infinite or missing values in 'x'
In addition: Warning messages:
1: In getDataFull(data = data, group = group, group.label = group.label, :
lavaan WARNING: some observed variances are (at least) a factor 100 times larger than others; please rescale
2: In lavSampleStatsFromData(Data = lavaanData, rescale = (lavaanOptions$estimator == :
lavaan WARNING: sample covariance can not be inverted
It turned out that the cause was that one of this observed variables had zero variance because he had chosen the data conditional on a particular value for that variable. Would it be possible to add a check for this (zero variance) and a more obvious error message?
Is there any way to get bias corrected bootstrap estimates for model parameters? At least for tests of indirect effects, I think these are supposed to be superior to naive bootstrap approaches. Are there plans to include different bootstrap methods (especially bias corrected and accelerated) in future releases?
It is currently possible to define a latent variable that is measured with itself. This is obviously an error by the modeler, but I would guess that checking that the LV names and IV names are distinct would not be difficult to add.
Here is an example:
lavaanify("#measurement model
IU =~IU1
USAGE =~USAGE
HABIT=USE2+USE3+USE4PRIEXP
PRIEXP=PRIEXPIU+HABIT
#regressions
USAGE
HABIT
#fix variance IU,USAGE,PRIEXP
IU0*IU0_USAGE
USAGE
PRIEXP~~0_PRIEXP")
Hi Yves,
I have been playing with the PML stuff in lav_objective.R and noticed a minor issue. Line 242 takes PSTAR[i,j], which is an entry above the diagonal. But all the nonzero entries of PSTAR are below the diagonal. So I think it should be
pstar.idx <- PSTAR[j,i]
or the loops could be rearranged, etc.
Hello,
How do I get specific values from a cfa fit. For example,
fit <- cfa.run(dataframe, cfa_structure)
summary(fit)
What I want to do, for example, is return values like the p-value.
Hi,
There's the following line in the varTable function
else if (class(object) == "data.frame")
would if be possible to change this to
else if (inherits(object,"data.frame"))
so that it works with data.table, which inherits from data.frame?
This is the first github issue I've raised so I'm not too familiar with it. I tried to search for the varTable function so I could raise this tagged to that line of code, but github didn't seem to find varTable in your project. So I hope this is the right place! Sorry if not.
Thanks, Matthew
Hi
When I run
load("~/Desktop/data.Rdata")
fit <- sem(modelSpecification, sample.cov=cov(data),
sample.nobs = nrow(data),
sample.cov.rescale = FALSE,
se = "none",
meanstructure = FALSE,
fixed.x = FALSE)
eigen(inspect(fit,"cov.lv"))
I get
Warning message:
In lavaan::lavaan(model = modelSpecification, meanstructure = FALSE, :
lavaan WARNING: covariance matrix of latent variables is not positive definite; use inspect(fit,"cov.lv") to investigate.
> eigen(inspect(fit,"cov.lv"))
$values
[1] 1.91571541 0.64419516 0.36182077 0.26453840 0.18083462 0.14478385 0.09256237 0.02027073
You can download the data and model specification here
https://www.dropbox.com/s/2teibj6v3bq8ofe/data.Rdata?dl=0
(The model does not make much sense, but that is unrelated to this report.)
Hi Yves,
The new way of handling equality constraints has broken the estfun/lavScores function in ctr_estfun.R:
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data=PoliticalDemocracy)
scores <- lavScores(fit)
# parameters with equality constraints don't sum to 0:
colSums(scores)
It appears to be working with the following changes to ctr_estfun.R, but I'm uncertain that this works in general:
143a144,148
>
> # handle equality constraints
> if([email protected]) {
> Score.mat <- Score.mat %*% [email protected]
> }
146c151
< colnames(Score.mat) <- names(coef(object))
---
> colnames(Score.mat) <- unique(names(coef(object)))
Finally, it is not clear (to me) how to remove redundant rows/columns of the vcov() matrix in the "correct" way. It doesn't appear as simple as using duplicated(), and I think this has some implications for the measurement invariance code in strucchange. Thanks & sorry for more work.
Hopefully a quick question:
Is it possible to constrain standardized loadings to be equal to one another in a CFA model?
My understanding is that this is not possible, but I wanted to double-check. Thanks in advance.
Jonathan
Hi Yves,
I was browsing the codes for WLS estimation and found something which might be unintended.
On line 183 of lav_samplestats.R (https://github.com/yrosseel/lavaan/blob/master/R/lav_samplestats.R#L183) a vector is created with thresholds and means intertwined, corresponding to sigma_1 in the Muthen paper. Next on line 239 (https://github.com/yrosseel/lavaan/blob/master/R/lav_samplestats.R#L239) the means are made negative. This is done by checking ov.types for numeric. However, this vector is not of the same dimensions of TH if there are variables with more than one threshold. In particular this will cause the wrong parameters to be made negative.
For example:
set.seed(3)
N <- 1000
Data <- data.frame(
o1 = ordered(sample(0:3,N,TRUE)),
o2 = ordered(sample(0:3,N,TRUE)),
c1 = rnorm(N),
c2 = rnorm(N)
)
# Lavaan model:
model <- '
f1 =~ o1 + o2 + c1 + c2
'
# Run Lavaan:
fit <- lavaan::cfa(model, data=Data)
# Inspect:
pars <- parameterEstimates(fit)
subset(pars, op == "|")
Results in:
lhs op rhs est se z pvalue ci.lower ci.upper
1 o1 | t1 -0.710 0.044 -16.310 0.000 -0.795 -0.624
2 o1 | t2 -0.050 0.040 -1.264 0.206 -0.128 0.028
3 o1 | t3 -0.625 0.043 -14.670 0.000 -0.708 -0.541
4 o2 | t1 0.700 0.043 16.129 0.000 0.615 0.785
5 o2 | t2 -0.023 0.040 -0.569 0.569 -0.100 0.055
6 o2 | t3 0.719 0.044 16.490 0.000 0.634 0.805
Here threshold 3 of variable o1 is lower than threshold 2 and threshold 1 of o2 is higher than threshold 2.
It might be that the same thing is happening in lav_muthem1984.R (https://github.com/yrosseel/lavaan/blob/master/R/lav_muthen1984.R#L427), maybe that's why it isn't a problem as the rows and columns corresponding to these thresholds are also made negative in the weights matrix.
Should be easy to fix. e.g., by using:
ov.types.2 <- unlist(mapply(l=pmax(1,ov.levels-1),t=ov.types,FUN=function(l,t)rep(t,l)))
TH[ov.types.2 == "numeric"] <- -1*TH[ov.types.2 == "numeric"]
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.