Giter VIP home page Giter VIP logo

lme4qtl's Introduction

lme4qtl

travis-ci build status

lme4qtl extends the lme4 R package for quantitative trait locus (qtl) mapping. It is all about the covariance structure of random effects. lme4qtl supports user-defined matrices for that, e.g. kinship or IBDs.

See slides bit.ly/1UiTZvQ introducing the lme4qtl R package or read our article / preprint.

Package Continuous response
stats lm(myTrait ~ myCovariate, myData)
lme4 lmer(myTrait ~ myCovariate + (1|myID), myData)
lme4qtl relmatLmer(myTrait ~ myCovariate + (1|myID), myData, relmat = list(myID = myMatrix))
Package Binary response
stats glm(myStatus ~ 1, myData, family = binomial)
lme4 glmer(myStatus ~ (1|myID), myData, family = binomial)
lme4qtl relmatGlmer(myStatus ~ (1|myID), myData, relmat = list(myID = myMatrix), family = binomial)

Note that rownames/colnames of myMatrix have to be values of myID variable, so matching between relationship matrix and grouping variable is possible. The order doesn't matter.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("variani/lme4qtl")

The official release on CRAN is pending.

Citation

To cite the lme4qtl package in publications use:

  Ziyatdinov et al., lme4qtl: linear mixed models with flexible
  covariance structure for genetic studies of related individuals, 
  BMC Bioinformatics (2018)

Contact

You are welcome to submit suggestions and bug-reports at https://github.com/variani/lme4qtl/issues.

Example

library(lme4)
library(lattice)
library(lme4qtl)
packageVersion("lme4qtl")
#> [1] '0.2.1'

Load simulated data, phenotypes dat40 and the kinship matrix kin2.

data(dat40, package = "lme4qtl")

dim(dat40)
#> [1] 234  10
dim(kin2)
#> [1] 234 234

head(dat40)
#>     ID  trait1  trait2 AGE FAMID  FA  MO SEX trait1bin trait2bin
#> 7  101 8.41954 9.67925  50    10   0   0   1         0         0
#> 14 102 5.47141 4.31886  44    10   0   0   2         0         0
#> 21 103 9.66547 7.00735  34    10 101 102   2         0         0
#> 28 104 6.27092 8.59257  41    10 101 102   1         0         0
#> 35 105 7.96814 7.60801  36    10 101 102   1         0         0
#> 42 106 8.29865 8.17634  37    10 101 102   2         0         0
kin2[1:5, 1:5] # nuclear family with 2 parents and 3 kids
#> 5 x 5 sparse Matrix of class "dsCMatrix"
#>     11  12  13  14  15
#> 11 1.0 .   0.5 0.5 0.5
#> 12 .   1.0 0.5 0.5 0.5
#> 13 0.5 0.5 1.0 0.5 0.5
#> 14 0.5 0.5 0.5 1.0 0.5
#> 15 0.5 0.5 0.5 0.5 1.0

Fit a model for continuous trait with two random effects, family-grouping (1|FAM) and additive genetic (1|ID).

m1 <- relmatLmer(trait1 ~ AGE + SEX + (1|FAMID) + (1|ID), dat40, relmat = list(ID = kin2))
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
m1
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: trait1 ~ AGE + SEX + (1 | FAMID) + (1 | ID)
#>    Data: dat40
#> REML criterion at convergence: 963.3853
#> Random effects:
#>  Groups   Name        Std.Dev.
#>  ID       (Intercept) 2.2988  
#>  FAMID    (Intercept) 0.0000  
#>  Residual             0.7856  
#> Number of obs: 224, groups:  ID, 224; FAMID, 39
#> Fixed Effects:
#> (Intercept)          AGE         SEX2  
#>    7.563248     0.008314    -0.364197  
#> convergence code 0; 1 optimizer warnings; 0 lme4 warnings

Get a point estimate of heritability (h2), the proportion of variance explained by (1|ID).

lme4::VarCorr(m1)
#>  Groups   Name        Std.Dev.
#>  ID       (Intercept) 2.29880 
#>  FAMID    (Intercept) 0.00000 
#>  Residual             0.78562
lme4qtl::VarProp(m1)
#>        grp        var1 var2      vcov     sdcor      prop
#> 1       ID (Intercept) <NA> 5.2845002 2.2988041 0.8954191
#> 2    FAMID (Intercept) <NA> 0.0000000 0.0000000 0.0000000
#> 3 Residual        <NA> <NA> 0.6172059 0.7856245 0.1045809

Profile the variance components (h2) to get the 95% confidence intervals. The method functions profile and confint are implemented in lme4. Note that a different model m2 is used, because profiling is prone to errors/warnings if model fit is poor.

m2 <- relmatLmer(trait2 ~ (1|ID), dat40, relmat = list(ID = kin2)) 
VarProp(m2)
#>        grp        var1 var2     vcov    sdcor      prop
#> 1       ID (Intercept) <NA> 5.573272 2.360778 0.7723589
#> 2 Residual        <NA> <NA> 1.642638 1.281654 0.2276411

prof <- profile(m2)
#> Warning in zetafun(np, ns): NAs detected in profiling
prof_prop <- lme4qtl::varpropProf(prof) # convert to proportions
confint(prof_prop)
#>                 2.5 %    97.5 %
#> .sigprop01  0.5292158 0.9157910
#> .sigmaprop  0.0655726 0.4652175
#> (Intercept) 7.2745157 8.4237085
densityplot(prof)

densityplot(prof_prop)

try(splom(prof)) 
#> Error in if (singfit) warning("splom is unreliable for singular fits") : 
#>   missing value where TRUE/FALSE needed

prof_clean <- na.omit(prof) # caution: NAs are indicators of poor fits
splom(prof_clean)

Fit a model with genetic and residual variances that differ by gender (sex-specificity model). The formula syntax with dummy (see ?lme4::dummy) is applied to the residual variance (1|RID) to cancel the residual correlation.

dat40 <- within(dat40, RID <- ID) # replicate ID 

m4 <- relmatLmer(trait2 ~ SEX + (0 + SEX|ID) + (0 + dummy(SEX)|RID), dat40, relmat = list(ID = kin2)) 
VarCorr(m4)
#>  Groups   Name       Std.Dev.   Corr 
#>  ID       SEX1       1.94400138      
#>           SEX2       2.64404940 0.826
#>  RID      dummy(SEX) 0.00050224      
#>  Residual            1.22780606

An example of parameter constraints that make the genetic variance between genders equal.

m4_vareq <- relmatLmer(trait2 ~ SEX + (0 + SEX|ID) + (0 + dummy(SEX)|RID), dat40, relmat = list(ID = kin2),
  vcControl = list(vareq = list(id = c(1, 2, 3)))) 
VarCorr(m4_vareq)
#>  Groups   Name       Std.Dev. Corr 
#>  ID       SEX1       2.47777       
#>           SEX2       2.47777  0.746
#>  RID      dummy(SEX) 0.95827       
#>  Residual            0.72728

Another example of parameter constraint that implies the genetic correlation between genders equal to 1.

m4_rhog1 <- relmatLmer(trait2 ~ SEX + (0 + SEX|ID) + (0 + dummy(SEX)|RID), dat40, relmat = list(ID = kin2),
  vcControl = list(rho1 = list(id = 3))) 
VarCorr(m4_rhog1)
#>  Groups   Name       Std.Dev.  Corr 
#>  ID       SEX1       1.7627785      
#>           SEX2       2.5782330 1.000
#>  RID      dummy(SEX) 0.0014823      
#>  Residual            1.4147793

Fit a model for binary trait.

m3 <- relmatGlmer(trait1bin ~ (1|ID), dat40, relmat = list(ID = kin2), family = binomial(probit))
m3
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( probit )
#> Formula: trait1bin ~ (1 | ID)
#>    Data: dat40
#>       AIC       BIC    logLik  deviance  df.resid 
#>  218.1325  225.0432 -107.0663  214.1325       232 
#> Random effects:
#>  Groups Name        Std.Dev.
#>  ID     (Intercept) 0.7669  
#> Number of obs: 234, groups:  ID, 234
#> Fixed Effects:
#> (Intercept)  
#>      -1.242

lme4qtl's People

Contributors

deruncie avatar variani avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

lme4qtl's Issues

Error about kinship matrix

Dear Dr. Ziyatdinov,
Thanks for your providing a very friendly-user interface to achieve LMM on the R platform.
Some errors happened when providing kinship matrix, like the following,

lmm0 <- relmatLmer(y~1+Pilot+Year+Breed+(1|ID),data=d,relmat=list(ID=mykin),REML=TRUE)
invalid assignment for reference class field ‘Ut’, should be from class “dgCMatrix” or a subclass (was class “dgeMatrix”)

Best,
Julong

Negative binomial models

Hello,

Is it possible to use negative binomial models with relmatGlmer?

glmer.nb will run negative binomial models, but not lmer. So i think it is not possible with relmatGlmer, but would be grateful for your much deeper insight.

Many thanks!

random slope

The lme4 package naturally supports models with a random slope (on a continuous variable). Fitting similar models in lme4qtl supposed to be straightforward, but has not been explored yet.

library(lme4)
library(lme4qtl)

data(dat40)
dat40 <- within(dat40, ID2 = ID)

m0 <- relmatLmer(trait1 ~ AGE + (1|ID), dat40, relmat = list(ID = kin2))

m1 <- lmer(trait1 ~ AGE + (1 + AGE||FAMID), dat40)
m2 <- relmatLmer(trait1 ~ AGE + (1 + AGE||ID), dat40, relmat = list(ID = kin2))

m3 <- relmatLmer(trait1 ~ AGE + (1|ID) + (0 + AGE|ID), dat40, relmat = list(ID = kin2))
m4 <- relmatLmer(trait1 ~ AGE + (1|ID) + (0 + AGE|ID2), dat40, relmat = list(ID = kin2, ID2 = kin2))

m5 <- relmatLmer(trait1 ~ AGE + (0 + AGE|ID), dat40, relmat = list(ID = kin2))

Fitting: Models like m0 are basic models in lme4qtl: one grouping factor and corresponding relationship matrix. Now we aim to fit a model like m2 (formulas in m2 and m3 express the same according to the lme4 design).

As the kin2 matrix has to be injected in two places, m4 model might be a safer option by separating two random effects and using different grouping variables ID and (its duplicate) ID2.

Another model m5 is good for testing, because it has the only random effect.

BLUP: Applying the method lme4::ranef to models fitted by lme4qtl is not the right thing. See the second formula in the lme4qtl article or pedigreemm:ranef method.

Given Cholesky decomposition kin2 = R' R, then one gets BLUP, e.g. from m5, by matrix multiplication in R R %*% lme4:ranef(m5). The matrix R is also referred to as relative factor (relfac variable used within lme4qtl).

Thing to do:

  • Update relmatLmer functions to return the relative factor(s).
  • Create a new function relmatRanef for BLUP estimates.

Ideally, we need create a new class inherited from lmerMod similarly as in pedigreemm.

Ideas to be update `lme4qtl` due to `rBind()` function

Hi everybody,

I am going to apply relmatLmer() to evaluate the effect of age (center and continuous variable) and sex, in order to reproduce our result of my PhD studies and apply the methodology to other fields. Only one trouble has been appeared in relmatLmer(), and I also supose in relmatGlmer().

 mod$PTES <- relmatLmer(
      PTES ~ AGEc*gr + SEX*gr + (1|ID), 
      data = phen3,
      relmat = list(ID = K),
      control = lmerControl,
      REML = FALSE)

Warning message:
'rBind' is deprecated.
 Since R version 3.2.0, base's rbind() should work fine with S4 objects

 mod$PTES
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: PTES ~ AGEc * gr + SEX * gr + (1 | ID)
   Data: phen3
      AIC       BIC    logLik  deviance  df.resid 
 8480.304  8545.958 -4226.152  8452.304       790 
Random effects:
 Groups   Name        Std.Dev.
 ID       (Intercept) 14.16   
 Residual             44.29   
Number of obs: 804, groups:  ID, 804
Fixed Effects:
(Intercept)         AGEc          gr1          gr2          gr4         SEXF     AGEc:gr1     AGEc:gr2     AGEc:gr4     gr1:SEXF     gr2:SEXF  
   217.6032      -0.2230      93.3613       0.9101     -15.7515      29.7029     -10.1526      -4.5380       0.2763     -33.8119      -1.2078  
   gr4:SEXF  
    -0.3870

The function works well, but the R terminal shows the warning message. Would be possible update the rBind() function?

Thanks in advance.

how to apply GRM matrix into longitudinal data?

Hi @variani @deruncie ,
I have a longitudinal data with four time point, to do the GWAS study. I have two questions:

  1. If lme4qtl cope with GRM matrix (n=a) and long-data (n=4a)?
  2. Whether assocLmer can deal the GE interactions, like I want to put the "timesnp" into the equation.
    Thanks a lot!

relmatGlmer slow

Hi,

I am trying to use the relmatGlmer function to model a binary response variable with two fixed effects and one random effect using ~16,000 individuals. The random effect is a genomic relationship matrix with the 16,000 individuals.

i.e. relmatGlmer(
myBinaryTrait ~ myCovariate1 + myCovariate2 + (1|myID),
myData,
relmat = list(myID = myMatrix),
family = binomial
)

It seems to be running very slow. Should this be a problem with this number of individuals? Or is the cause of it running slowly likely something else? Also, is there a recommended way to format the relationship matrix so that is runs more quickly?

Thanks.

way to adjust for kinship only?

Hello there,

Is there a way I can only adjust for kinship and not any random effect when using lme4qtl? My model would be as:

output <- relmatGlmer(trait ~ age + sex, data, relmat = list(ID = kin2), family = binomial(logit))

Repeated measurements using relmatGlmer

Hi,

I am trying to analyse count data with relmatGlmer accounting for relatedness and repeated measurements. When including repeated measurements I get the error “nrow(Ztlist[[i]])%%ncol(Ztlist[[i]]) == 0 is not TRUE”.

When I transform the example data “dat40” into count data (trait1, trait2) relmatGlmer works (but outputs warnings). When I then add repeated measurements to the example data, relmatGlmer doesn’t work and outputs the same error as mentioned above.

Here is the code I run:

##########
library(lme4) # needed for VarCorr function
library(lme4qtl)

data(dat40)

data.count <- read.table("data.count.txt", sep="\t")
data.count.repeat <- read.table("data.count.repeat.txt", sep="\t")

No repeated measurements (works):

relmatGlmer(cbind(trait1, trait2) ~ AGE + SEX + (1|FAMID) + (1|ID), data.count, relmat = list(ID = kin2), family = binomial)

relmatGlmer((trait1/(trait1+trait2)) ~ AGE + SEX + (1|FAMID) + (1|ID), data.count, weights=(trait1+trait2), relmat = list(ID = kin2), family = binomial)

Repeated measurements (does not work):

mod <- relmatGlmer(cbind(trait1, trait2) ~ AGE + SEX + (1|FAMID) + (1|ID), data.count.repeat, relmat = list(ID = kin2), family = binomial)

mod <- relmatGlmer((trait1/(trait1+trait2)) ~ AGE + SEX + (1|FAMID) + (1|ID), data.count.repeat, weights=(trait1+trait2), relmat = list(ID = kin2), family = binomial)

##########

Please find the input data attached. Any hint how to model data with relamtGlmer including repeated measurements would help me a lot!

Many thanks and best regards,
Melanie

data.count.repeat.txt
data.count.txt

"direct" test of SNP in relmatGlmer / relmatLmer

Hi @variani,

I was looking at your Demo code and wondering why you don't include the SNP as another term in the relmatLmer() model and test association to the SNP directly from the lme4qtl model fit? You could do so via anova() (against a null model without the SNP) or with lmerTest, unless I'm missing something.

Also, for relmatGlmer the $coefficients from the model fit include a p-value for the SNP, so there's no need to use add-on packages there.

I'm hoping you can enlighten by pointing out the flaws in my argument 😄

Thanks!
Dan.

lme4qtl compared to lmekin?

I am really happy that I recently found your package lme4qtl. This seems to be exactly the package I have hoped for! So far, I have used the function lmekin from package coxme to model data with phylogenetic covariance matrix to account for species independence, but that function is not well developed (e.g not able to plot, predict). My only concern is that lme4qtl gives a bit different model parameters than lmekin. Most notably, the estimates and standard errors of the variables directly connected to species identity, such as mean body mass of species, are 2-3 times higher than in lmekin.

For example, it seems a bit odd that the model output from lme4qtl indicates a significant interaction between mass and habitat but the confidence bands for mass are so wide that this makes the interaction hard to believe.
Do you know what could be the reason for the large difference between lme4qtl and lmekin? Is one of them wrong or which package should I prefer for using with my type of data?

I will send my data sample, phylogenetic tree, and code by e-mail.

Remove effect of the modeled VCV (e.g., pedigree) from the expected values of y

I want to remove the effect of the the modeled VCV matrix (e.g., pedigree) from the expected values of y.

Say that I run fit <- lme4qtl::relmatLmer(y ~ time + (1 | ID) + group, data = dat, relmat = list(ID = ID_VCV)) and y_exp <- predict(fit) to get the expected values. How can I then remove the effect of the modeled VCV from y_exp?

Does fit@resp$mu correspond to u in Equation 1 from the paper?

If so, in order to remove the effect of the modeled VCV from the expected values, could I simply subtract fit@resp$mu from y_exp?

Thanks,
Johannes

Checking for Multi-collinearity

Is there a way to check for multi-collinearity among fixed and random effects (including the relatedness covariance matrix).

How to do unstructured and how to do compound symmetry heterogeneous covariance models ?

Hi,

My message is in the spirit of clarification. I've done unstructured as well as compound symmetry heterogeneous covariance modeling in SPSS before. The process is fairly simple.
Select Mixed Models> Linear. Then, select Subjects and Repeated measure as well as the type of Repeated covariance matrix. Then select the appropriate IVs and DVs and Estimates. Then hit the okay button.

lme4qtl, seem to require calculated covariance matrix. It seems to me that functionality that could facilitate this calculation process are no included in lme4qtl package. If this is a correct assumption, would you let me know how, perhaps from other packages, I could automate the process of generating the appropriate covariance matrix?

Thank you in advance!

speed up relmatLmer?

Hi, @variani
Fitting a linear mixed model with kinship as random effect is a quite time-consuming task. I am trying to apply the relmatLmer on a dataset with about 500 samples and 30K "variants" (not the numeric variable but factorial), by fitting
Model1 = relmatLmer(PHE ~ PC1+PC2+PC3+PC4+PC5+(1|ID), data, relmat = list(ID = Kin),REML=F,control = lmerControl(optimizer = "nloptwrap",optCtrl=list(xtol_abs=1e-6, ftol_abs=1e-6),calc.derivs = F))
and turn into a loop
Model2 = update(Model1, .~.+(1|G))
which G is a single factorial variant. Then two models were compared by anova to get the significance of G. This takes about 20 hours, even i paralleled the loop by foreach in package doSNOW with 40 threads.

I have already followed some methods on the stack overflow to improve the efficiency, but the time used is still not ideal. Is there any way to speed up the regression?

Predicting new data/unobserved phenotypes

Dear Variani, thanks a lot for your useful package,

I would like to know if there is a way, perhaps through the ranef()function, to predict genotypes that are in the kinship matrix, but have not been phenotyped, usually the parental ones, or new combinations, for example:

pedigree_scheme

fitA<- relmatLmer(y~ rep + (x1 | ID), data=phe,
                relmat= list(ID=A), REML=T)
ranef(fitA)
$ID
   (Intercept)        x1
P1    ?
P2    ?
P3    ?
P4    ?
X1   -4.174645  3.869076
X2   -1.354763  1.255599
X3   -1.301316  1.206064
X4    1.264602 -1.172037
X5    ?

This naïve exemple files:
A.txt
pedigree.txt
pheno.txt

Thank you very much!

Use matlm for interaction between dependent variable and SNP

Dear Andrey,

I followed your tutorial (https://github.com/variani/lme4qtl/blob/master/demo/gwas.R) using matlm to rapidly run lme4qtl using precalculated variance components. It was, indeed, very fast.

Is it possible to use a similar strategy to quickly run a mixed model that evaluates the significant of an interaction between a SNP and a condition?

For example, matlm::matlm(trait1 ~ HEIGHT * SNP1, dat40, pred = gdat40, ids = ids, transform = W,
batch_size = 100, verbose = 2),

where HEIGHT = short or tall, and SNP1 = 0 or 1. I would like P vals for the effect of SNP1 (and SNP2, SNP3, etc) on trait1 in short people as well as tall people, and also the P val for the interaction.

Based on your comments in the code for matlm, you are working on this issue, but I thought there would be no harm in asking.

Thank you very much!

Multiple matrices as random effects

Dear Dr. Ziyatdinov,

Thank you for making lme4qtl. I am writing with a question. Is it possible with lme4qtl to fit random effects for two matrices. Say in my model, I am interested in fitting a random effect for not only the kinship matrix, but also a second matrix of pairwise distances.

In a simple example, extending that from your software page, my understanding is that I could fit a second matrix as follows.

library(lme4)
library(lattice)
library(lme4qtl)

data(dat40, package = "lme4qtl")
m1 <- relmatLmer(trait1 ~ AGE + SEX + (1|FAMID) + (1|ID), dat40, relmat = list(ID = kin2))

# make new matrix
K = kin2 * kin2
dat40$ID2 <- dat40$ID

m2 <- relmatLmer(trait1 ~ AGE + SEX + (1|FAMID) + (1|ID) + (1|ID2), dat40, relmat = list(ID = kin2, ID2=K))

Could you please confirm that this would work as I expect?
A second question if you could confirm the following - say the IDs in the rownames of the kinship matrix and in the ID variable of the dat40 dataframe have an overlapping set but are not exactly the same (one is a subset of the other). Do I understand correctly that that the overlapping set of IDs will be subsetted in the relmatLmer function automatically, and that is the set the mixed model will be run on?

Lastly, when modeling something like gender or FAMID, do you recommend keeping the variable type as "chr" instead of "factor" or does it make no difference?

Many thanks in advance,

Mashaal Sohail

error with reference class field ‘Ut’, should be from class “dgCMatrix”

Thank you for providing a method for analysis! I am using lme4qtl for a linear mixed model analysis, but I keep getting an error:

Error: invalid assignment for reference class field ‘Ut’, should be from class “dgCMatrix” or a subclass (was class “dgeMatrix”)

Do you have any suggestions where that might stem from?

This is how I got the kinship matrix:

library("Matrix")
library("lme4")
maatriks2=read.csv("table.txt", sep=" ", dec=".", header=TRUE,stringsAsFactors=FALSE)
class(maatriks2[,1])
class(maatriks2)
dim(maatriks2)
rownames(maatriks2) = maatriks2[,1]
maatriks2[,1]=NULL

maatriks2<-as.matrix(maatriks2)
maatriks2=as(maatriks2,"dgCMatrix”)

mudel2 <- relmatLmer(CTG_LDL_statin ~ Age+ Sugu + (1|Vcode1), FH2, relmat = list(Vcode1 = maatriks2))

PIRLS step-halvings failed to reduce deviance in pwrssUpdate with probit relmatGlmer

Hi, I am using the relmatGlmer function with the probit link function in lme4qtl. Here is my data and code

kinship.txt
genotype_phenotype.txt

test<-read.table("genotype_phenotype.txt")
kinship<-as.matrix(read.table("kinship.txt"))
colnames(kinship)<-rownames(kinship)
relmatGlmer(phenotype ~ genotype + (1|ID), test, relmat = list(ID = kinship), family = binomial(link="probit"))

And I got the following error message:

Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GQmat, compDev = compDev, :
(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

I found a relevant question in lme4 https://github.com/lme4/lme4/issues/579, and tried their solution

relmatGlmer(phenotype ~ genotype + (1|ID), test, relmat = list(ID = kinship), family = binomial(link="probit"),nAGQ=20)

It returns the error message:

Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GQmat, compDev = compDev, :
AGQ only defined for a single scalar random-effects term

I think my random effect term is a scalar random effect. Am I doing anything wrong here?

Any help would be appreciated!

Zoe

Perform EVD outside relmatLmer/relmatGlmer

In the current version 0.1.10, the relationship matrix is passed to relmatLmer and then it is decomposed by Cholesky/EVD. The decomposition operation is not visible to the user and there is little control over, e.g. errors. See also issue #1.

library(lme4qtl)
data(dat40)

mod <- relmatLmer(trait1 ~ 1 + (1|ID), dat40, relmat = list(ID = kin2))

It would be helpful to have something like this:

evd_kin2 <- eigen(kin2)

mod <- relmatLmer(trait1 ~ 1 + (1|ID), dat40, relmat = list(ID = evd_kin2))

cannot coerce type 'closure' to vector of type 'character'

Hello,

Thanks again for developing this very useful package. I am encountering the following error:

(1) model continiuous trait trait1
mod <- relmatLmer(trait1 ~ AGE + SEX + (1|FAMID) + (1|ID), dat40, relmat = list(ID = kin2))
Error in as.character(sys.call(sys.parent())[[1L]]) :
cannot coerce type 'closure' to vector of type 'character'

I have attached the conda environment I am working in when I get the error. I don't get the error when I try running the same analysis using R 3.6.0.

GenoFunc.yaml.txt

Have you encountered this error before or noticed any issue when using R 4.0.2?

Many thanks,

Ollie

Confidence Intervals Errors

Dear Andrey,

First, thanks for developing lme4qtl, I have been looking for this kind of package for a while.

Second, I attempted to compute confidence intervals with the example data as well as with my own data and receive lots of warnings and errors. I used the code from the paper supplement and applied it to the quick start example:

library(lme4) # needed for `VarCorr` function
library(lme4qtl)

# load synthetic data set `dat40` distributed within `lme4qtl`
# - table of phenotypes `dat40`
# - the double kinship matrix `kin2`
data(dat40)

# (1) model continiuous trait `trait1`
mod <- relmatLmer(trait1 ~ AGE + SEX + (1|FAMID) + (1|ID), dat40, relmat = list(ID = kin2))

# `?lme4::profile`
prof <- profile(mod, which = "theta_", prof.scale = "varcov")

# `?lme4qtl::varpropProf`
prof_prop <- varpropProf(prof)

ci <- confint(prof_prop, level = 0.95)
ci

I get following warnings from profile():

Warning messages:
1: In sqrt(m) : NaNs produced
2: In sqrt(m) : NaNs produced
3: In sqrt(m) : NaNs produced
4: In sqrt(m) : NaNs produced
5: In sqrt(m) : NaNs produced
6: In sqrt(m) : NaNs produced
7: In zetafun(np, ns) : NAs detected in profiling
8: In nextpar(mat, cc, i, delta, lowcut, upcut) :
  Last two rows have identical or NA .zeta values: using minstep

and then when using varpropProf():

Error in na.fail.default(data.frame(x = as.numeric(obj1), y = as.numeric(obj2))) : 
  missing values in object

Is there any error in the script or is this bug? Are there alternative ways to obtain confidence intervals? I would appreciate it, if you could help with obtaining confidence intervals.

Warning for RE variable mismatch in formula and relmat

This issue was first posted on SO.

As for now, lme4qtl 0.2.1 accepts anything in relmat and takes what it needs from relmat. The following models are fine:

library(lme4qtl)
data(dat40)

m1 <- relmatLmer(trait1 ~ (1|ID), dat40) # same as calling `lmer`
m2 <- relmatLmer(trait1 ~ (1|ID), dat40, relmat = list(ID = kin2))
m3 <- relmatLmer(trait1 ~ (1|ID), dat40, relmat = list(NotExistingID = kin2))

stopifnot(all.equal(getME(m1, "theta"), getME(m3, "theta")))

When fitting m3, there is no any warning/error that variable names in formula (ID) and relmat (NotExistingID) are different. That means that the user sought to fit a model like m2 but got a model like m1.

Warning or Error?

m4 <- relmatLmer(trait1 ~ (1|FAMID) + (1|ID), dat40, relmat = list(ID = kin2))
m5 <- update(m4, . ~ . - (1|ID))

It should be thrown a warning. Otherwise, fitting m5 would fail.

Details of the Eigenvalue decomposition operation

An additional file to your paper about lme4qtl mentions that lme4qtl is able to deal with a low-rank random effect covariance matrix by replacing Cholesky decomposition by the Eigenvalue decomposition operation. Unfortunately, I was not able to find details about how this method works anywhere.In case you are able to provide me with any details or relevant sources where I could learn more about this I would highly appreciate it.

assocLMer

Dear Dr. Ziyatdinov,

I am trying to use the assoCLmer function to analyse my data (it's a fish, I have 46K SNPs and about 6000 individuals), but I am having trouble figuring out from the function how the trait ans SNP data should be formatted for input. If I am reading the function correctly, this is a 2-step approach, It first fist a model without the SNPs (and I assume it used a relationship matrix G generated from the SNP data) and the fits one SNP at time. Is that correct?
Lastly, I was wondering if it is possible to extract r-squared for the individual SNP test,

Sorry for all the questions.

Cheers,

Tiago

help documentation

Hi there,

I was wondering if there would be more of an expansion of the help documentation in the future. In particular, I'm wondering how the covariance matrix should be coded/structured and am having trouble figuring that out based on the information available. I have played with some of the sample data sets, but still am uncertain as to how to proceed.

Thanks so much.

General doubts & ideas to improve 'lme4qtl'

I am going to apply relmatLmer() to evaluate the effect of age (center and continuous variable), sex and height in others. I have some doubts:

  • How is the best way to evaluate the covariates in relmatLmer(): model with additive effects or interaction of covariates.
relmatLmer(var ~ AGE_gr + AGEc + SEXc + height + (1 | ID), data = dat, relmat = list(id  = dat_id))
relmatLmer(var ~ AGE_gr * AGEc + AGE_gr * SEXc + AGE_gr * height + (1 | ID), data = dat, relmat = list(id  = dat_id))
  • In case that the additive model contains NAs in covariates (e.g. height), drop1 function does not evaluate the significance by any test.
Error in drop1.merMod(model) : 
  number of rows in use has changed: remove missing values?
  • There are any option to evaluate the significance of the covariates with NAs? The significance test could be included into model summary() as glm() allowed?

Any solution available?

Thanks in advance.

ICC or h2 for binary response.

I have been working with the lme4qtl package, I have a binary response database, in README indicates that you can work for a binary response (which is my problem) but I need the ICC (Intraclass Correlation coefficient) especially the h2 for binary data with family data structure. Using lme4qtl::VarProp I do not get the desired h2 since the prop value is 1, could you help me how I can find h2 for binary response data with family structure please?

Family data, genetic variant

Hi!

I’m not familiar with the analysis of quantitative traits with family data. There are data on individuals from two extended families including several generations. Of these individuals, about 50% are carriers of a specific mutation (genetic variant). It is already known that a particular disease develops in the carriers, but it is of interest to evaluate the effect of this genetic variant on some traits associated with this disease. The phenotype variable, let's call it pheno, is normally distributed, and among individuals (id) there are both carriers and non-carriers of the genetic variant (a binary variable gvar). In addition, there are several other explanatory variables, such as sex and age. It's essential to account for family relationships, and it seems that lme4qtl package offers a solution. However, it is not entirely clear to me whether it is possible to model the effect of a single genetic variant with the lme4qtl package and how to proceed with the analysis. Below are some specific questions:

  • How the data on family relationships (i.e. family tree, several generations) should be coded? Probably some generation identifier (genid, 0,1,2,3,4) and/or family or sibling identifier?

  • What kind of custom covariance matrices should be used with the data described above?

  • How to model (or account for) the gene-environment effects, such as the shared environmental effect for siblings (sibid)?

And more generally, what are essential things to take into account when using lme4qtl package?

available via CRAN?

As far as I can tell, lme4qtl isn't available from CRAN. Package installation and management with packrat or conda would be facilitated by having lme4qtl available from CRAN. Are you planning (or currently working on) submitting lme4qtl to CRAN?

Code hangs when using relmatGlmer

Hello,

We are using lme4qtl with a large kinship matrix (95722836 elements, 73.4 Mb). We are able to get relmatLmer to work with our linear model; however, relmatGlmer hangs when we use it with our logistic model. I tried using just the first 20 rows of our matrix and relmatGlmer did work with the smaller matrix. Is there a size limit for the matrix?

Additionally, I tried to use the summaryCoef command that is in your slide presentation to get the p-values for the logistic model, but I can't find the package that contains the summaryCoef command.

Thank you!

Error in is.nloptr(ret) : objective in x0 returns NA

I keep getting the following error when I run:

lme4qtl::relmatLmer(FG ~ SeasonDay + (1|Site) + (1|Sample), Data, relmat = list(Sample=A))

Error in is.nloptr(ret) : objective in x0 returns NA
In addition: Warning message:
In sqrt(out$values) : NaNs produced

head(Data)

Sample FG SeasonDay Site
1 AL-5-28-2 20.42 148 Almont1
2 AL-5_28-1 21.13 148 Almont1
3 AL-5_29-1 22.43 149 Almont1
4 AL-5_29-2 20.55 149 Almont1
5 AL-5_29-3 20.58 149 Almont1
6 AL-5_29-4 20.14 149 Almont1

head(A)

Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:260100] 0 1 2 3 4 5 6 7 8 9 ...
..@ p : int [1:511] 0 510 1020 1530 2040 2550 3060 3570 4080 4590 ...
..@ Dim : int [1:2] 510 510
..@ Dimnames:List of 2
.. ..$ : chr [1:510] "AL-5-28-2" "AL-5_28-1" "AL-5_29-1" "AL-5_29-2" ...
.. ..$ : chr [1:510] "AL-5-28-2" "AL-5_28-1" "AL-5_29-1" "AL-5_29-2" ...
..@ x : num [1:260100] 0 0.000013 0.00127 0.000013 0.002184 ...
..@ factors : list()

I generated this matrix using a file with all possible pairs and "rab" values.

File:

Sample1   Sample2      rab

1 AL-5-28-2 AL-5-28-2 0.000000
2 AL-5-28-2 AL-5_28-1 0.000013
3 AL-5_28-1 AL-5_28-1 0.000000
4 AL-5-28-2 AL-5_29-1 0.001270
5 AL-5_29-1 AL-5_29-1 0.000000
6 AL-5_28-1 AL-5_29-1 0.008056

G <- graph.data.frame(File,directed=FALSE)
A <- as_adjacency_matrix(G,names=TRUE,attr="rab",type='both')

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.