andjar / alasca Goto Github PK

Home Page: https://andjar.github.io/ALASCA/

R 92.22% TeX 7.05% JavaScript 0.73%

alasca's Issues

ID error in cross-sectional analysis

Hi!
I want to apply ALASCA to a design that does not include repeated measurements, similar to the first model in example 3. To test it, I first downloaded the files of example 3 (https://figshare.com/articles/software/ALASCA_An_R_package_for_longitudinal_and_cross-sectional_analysis_of_multivariate_data_by_ASCA-based_methods/21362979/1). Running the model (02.ex3.part1) generated an error with the IDs, as follows:

INFO [2023-01-15 11:17:47] Initializing ALASCA (v1.0.7, 2022-12-10)
WARN [2023-01-15 11:17:47] Guessing effects: disease
INFO [2023-01-15 11:17:47] Will use linear models!
INFO [2023-01-15 11:17:47] Will use Rfast!
WARN [2023-01-15 11:17:47] Converting IDs to integer values
WARN [2023-01-15 11:17:47] The disease column is used for stratification
fstcore package v0.9.14
(OpenMP detected, using 20 threads)
INFO [2023-01-15 11:17:48] Scaling data with sdref ...
WARN [2023-01-15 11:17:48] The scaling sdref has been replaced by sdt1 as there is only one effect term. This corresponds to the column disease
INFO [2023-01-15 11:17:50] Calculating LM coefficients
INFO [2023-01-15 11:17:50] Reducing the number of dimensions with PCA
INFO [2023-01-15 11:17:54] Keeping 111 components from initial PCA, explaining 95.11 % of variation. The limit can be changed with reduce_dimensions.limit
INFO [2023-01-15 11:17:54] -> Finished the reduction of dimensions!
Error in bmerge(i, x, leftcols, rightcols, roll, rollends, nomatch, mult, :
Incompatible join types: x.ID (integer) and i.V1 (character)
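
The last two log lines point at a data.table join between an integer `ID` column and a character column. The mismatch itself can be illustrated outside ALASCA. The sketch below uses made-up tables `x` and `i` (hypothetical, not ALASCA internals) and assumes data.table is installed. It only shows why such a join is rejected and how aligning the types resolves it, not where in ALASCA the conversion should happen.

```r
library(data.table)

# Hypothetical tables: `x` holds integer IDs, `i` holds the same IDs as text.
x <- data.table(ID = 1:3, value = c(0.5, 1.2, 0.9))
i <- data.table(V1 = c("1", "3"))

# Joining an integer column against a character column is rejected;
# capture the message instead of stopping the script.
join_error <- tryCatch(
  x[i, on = c(ID = "V1")],
  error = function(e) conditionMessage(e)
)
print(join_error)

# Casting the lookup column to the type of x$ID makes the join valid.
i[, V1 := as.integer(V1)]
joined <- x[i, on = c(ID = "V1")]
print(joined)
```

If the IDs in your own data are stored as text, converting them to integers (or to a single consistent type) before calling ALASCA may be worth trying as a workaround.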

The same error appears when trying to apply it to my data. However, when I run the script 02.ex3.part2 , example 1 and example 2, which are repeated measurements, they were processed without any problem.

How could I solve this problem?
Thank you so much for your time and help.

Sincerely,
Cynthia

sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Mexico.utf8 LC_CTYPE=Spanish_Mexico.utf8
[3] LC_MONETARY=Spanish_Mexico.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Mexico.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] fstcore_0.9.14 ggrepel_0.9.2 ggplot2_3.4.0 ALASCA_1.0.7
[5] data.table_1.14.6 here_1.0.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 log4r_0.4.3 pillar_1.8.1 compiler_4.2.2
[5] tools_4.2.2 bit_4.0.5 lifecycle_1.0.3 tibble_3.1.8
[9] gtable_0.3.1 pkgconfig_2.0.3 rlang_1.0.6 cli_3.4.1
[13] DBI_1.1.3 rstudioapi_0.14 parallel_4.2.2 duckdb_0.6.1
[17] withr_2.5.0 dplyr_1.0.10 generics_0.1.3 vctrs_0.5.1
[21] hms_1.1.2 RcppZiggurat_0.1.6 Rfast_2.0.6 rprojroot_2.0.3
[25] bit64_4.0.5 grid_4.2.2 tidyselect_1.2.0 glue_1.6.2
[29] R6_2.5.1 fansi_1.0.3 vroom_1.6.0 tzdb_0.3.0
[33] readr_2.1.3 magrittr_2.0.3 scales_1.2.1 ellipsis_0.3.2
[37] fst_0.9.8 assertthat_0.2.1 colorspace_2.0-3 utf8_1.2.2
[41] munsell_0.5.0 crayon_1.5.2

Plot function prints output as showing x number of variables, but not actually

Hello,

This was an issue I discovered by chance.

I have my data in the long format, and I have 13 dependent variables. All NAs are omitted from the dataset.

The model itself runs fine, with no error.

When I plot the effect plot, it prints "Showing 13 of 13 variables. Adjust the number with n_limit"

However, in the loadings plot, there is a variable missing. This is irrespective of what type of plot I'm using to look at the loadings. The variable does not show up on the histogram plot, nor the loadings.

And it is just one variable, per principal component, and not even the same every time.

And yes, I have the latest version of the package.

Happy for any lead!

plot(mod1, effect = c(1), component = c(1,2), type = 'effect')
INFO [2024-05-15 15:46:16] Effect plot. Selected effect (nr 1): Age. Component: 1 and 2.
WARN [2024-05-15 15:46:16] Showing 13 of 13 variables. Adjust the number with n_limit
WARN [2024-05-15 15:46:16] Showing 13 of 13 variables. Adjust the number with n_limit

Some errors applying ALASCA to longitudinal (repeat measures) microbial counts

Hi Anders!

This is an excellent package! And one of the few to accept repeat measures intelligently! (talking to you vegan)
I commend you on making it so user friendly. I have a special use case (or hopefully not so special) where I want to model microbial community change in a time series. These data seem to fit the general requirements:
values = microbial abundance (~500 spp.),
time = days (x5, including day 0),
group = individual or biological replicate (x4).
sub_group = technical replicates (3/replicate/day)

My experimental setup is 4 biological replicates (individuals) sampled across Days (0 baseline, 1, 3, 7, and 14). Each biological replicate has 3 technical replicates at each time point (pseudo-replication).

Question 1. I may need a slightly different model structure than your examples define, since I have technical replicates. (4 individuals) * (5 days) * (3 tech. replicates) = 60 observations. I'm only interested in the change over time (fixed effect), not in each individual's contribution (individuals are my blocks), and I'd like to explicitly model the technical replicates (i.e., sub_group) nested in each individual (i.e., group) instead of averaging them outside the model. Should I set the random effect to be (group|sub_group)?

model.formula2 <- value ~ time*group + (group|sub_group)
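
For reference, in lme4-style formula syntax `(group | sub_group)` requests a random intercept plus random `group` effects for each level of `sub_group`, which is not the same as nesting. Nesting of technical replicates within individuals is conventionally written with `/`. The sketch below only builds the formula objects in base R. Whether ALASCA (and its Rfast backend) accepts nested random effects is an assumption that would need checking.

```r
# Shorthand: sub_group nested within group (lme4-style syntax).
nested_short <- value ~ time + (1 | group / sub_group)

# Equivalent expanded form: one intercept per group and one per
# group-by-sub_group combination.
nested_long <- value ~ time + (1 | group) + (1 | group:sub_group)

# Both formulas refer to the same set of columns.
print(all.vars(nested_short))
print(all.vars(nested_long))
```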
Some results:

output from validate = F

Question 2. I ran my model and got a usable object (awesome!), but when I validate it I get interesting differences between the methods. With bootstrap validation it runs fine. With permutations it fails (see error below). With "loo", R crashes. In general, are these bootstraps (or optionally permutations) aware of the model formula? I know some analysis packages use the permute package, which requires a call to how() to set blocks, plots, etc., so that permutations are not "free" but constrained to the independent blocks of the study design. How is this handled in ALASCA?
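
To make the idea of a constrained permutation concrete, here is a minimal base R sketch that shuffles time labels only within each individual (block). The data frame `design` and its column names are hypothetical, and this illustrates the general technique only. It is not a description of how ALASCA resamples internally.

```r
set.seed(1)

# Hypothetical design: 4 individuals, each sampled on 5 days.
design <- data.frame(
  individual = rep(paste0("ind", 1:4), each = 5),
  day = rep(c(0, 1, 3, 7, 14), times = 4)
)

# Shuffle the day labels separately inside each individual, so labels
# never move between blocks.
design$day_permuted <- ave(design$day, design$individual, FUN = sample)

print(head(design, 10))
```

Each individual keeps exactly the same set of day labels after shuffling, which is what distinguishes a blocked permutation from a free one.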

Some issues I found with my unorthodox dataset:

summary(mod$regr.model[[1]]) #returns NULL in every case
Length Class Mode
0 NULL NULL

I also get this error when I use permutations instead of bootstrap:

PE.mod <- ALASCA(df_long, model.formula2, separateTimeAndGroup = T, useRfast = T, forceEqualBaseline = T, validate = T, validateRegression = F, validationMethod = "permutation", nValRuns = 100)

====== ALASCA ======

0.0.0.106 (2022-01-22)

Will use linear mixed models!
Using group for stratification.
Scaling data...
Calculating LMM coefficients...
Finished calculating regression coefficients!
Calculating predictions from regression models...
Finished calculating predictions from regression models!
Calculating effect matrix
Finished calculating effect matrix!
Running validation...

Run 1 of 1000
Error in prepareValidationRun(object) : object 'temp_object' not found

Thanks for your time Anders,

Sam

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ALASCA_0.0.0.105 data.table_1.14.2 ggvegan_0.1-0 pairwiseAdonis_0.4
[5] cluster_2.1.2 patchwork_1.1.1 MicrobiotaProcess_1.6.3 weathermetrics_1.2.2
[9] tidyquant_1.0.3 quantmod_0.4.18 TTR_0.24.3 PerformanceAnalytics_2.0.4
[13] xts_0.12.1 zoo_1.8-9 ggtext_0.1.1 lubridate_1.8.0
[17] wesanderson_0.3.6 viridis_0.6.2 viridisLite_0.4.0 Cairo_1.5-14
[21] cowplot_1.1.1 ggthemes_4.2.4 magrittr_2.0.1 reshape_0.8.8
[25] reshape2_1.4.4 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[29] purrr_0.3.4 readr_2.1.1 tidyr_1.1.4 tibble_3.1.6
[33] tidyverse_1.3.1 DivNet_0.4.0 breakaway_4.7.6 DESeq2_1.34.0
[37] SummarizedExperiment_1.24.0 Biobase_2.54.0 MatrixGenerics_1.6.0 matrixStats_0.61.0
[41] GenomicRanges_1.46.1 GenomeInfoDb_1.30.0 IRanges_2.28.0 S4Vectors_0.32.3
[45] BiocGenerics_0.40.0 metagMisc_0.0.4 microbiome_1.16.0 ggplot2_3.3.5
[49] phyloseq_1.38.0 vegan_2.5-7 lattice_0.20-45 permute_0.9-5
[53] ANCOMBC_1.4.0 corrplot_0.92 pvclust_2.2-0 dendextend_1.15.2

Error: Mat::init()

Hey there,
I tried to perform an ALASCA model using a longitudinal dataset of 1260 rows and 5 columns (colnames: ID, Timepoint, Group, Variable, value).

Call for the model:
mod <- ALASCA(df = longitudinal_data, formula = value ~ Timepoint*Group + (1|"Patient code"), scale_function = "sdt1", validate = TRUE, ignore_missing_covars = T)

Unfortunately this error comes out:

INFO  [2023-10-18 11:11:14] Initializing ALASCA (v1.0.11, 2023-06-19)
WARN  [2023-10-18 11:11:14] Guessing effects: `Timepoint+Timepoint:Group+Group`
INFO  [2023-10-18 11:11:14] Will use linear mixed models!
INFO  [2023-10-18 11:11:14] Will use Rfast!
WARN  [2023-10-18 11:11:14] Converting IDs to integer values
WARN  [2023-10-18 11:11:14] The `Timepoint` column is used for stratification
WARN  [2023-10-18 11:11:14] Converting `character` columns to factors
WARN  [2023-10-18 11:11:14] Predictor variables missing for some samples! Continue with caution!
INFO  [2023-10-18 11:11:14] Scaling data with sdt1 ...
INFO  [2023-10-18 11:11:14] Calculating LMM coefficients

Error: Mat::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD

Of course, I've tried digging through various Stack Overflow threads without any success.
Do you know how to deal with this error?
Best,

P.S. I'm working on macOS Sonoma 14.0 on Macbook Pro 16" M1 Max
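
One detail of the call above can be checked independently of ALASCA: in an R formula, a double-quoted string such as "Patient code" is a character constant, not a column reference. Column names that contain spaces need backticks. The base R sketch below shows the difference. Whether this is related to the Mat::init() error is not established here, and the listed column names include `ID` rather than `Patient code`, so the grouping column may need adjusting either way.

```r
# Formula as written in the call: the grouping term is a string constant.
f_quoted <- value ~ Timepoint * Group + (1 | "Patient code")

# Formula with a backtick-quoted column name.
f_backtick <- value ~ Timepoint * Group + (1 | `Patient code`)

# all.vars() lists the variables a formula refers to.
print(all.vars(f_quoted))
print(all.vars(f_backtick))
```

In the first formula no grouping variable is picked up at all, while the second one refers to a column literally named `Patient code`.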

Error in ALASCA with scale_function = "none"

Hi Anders!
I am continuing to test the fantastic ALASCA package. This time I am using a dataset that was already normalized and standardized, so I called the ALASCA function with scale_function = "none". The function starts running without a problem, but when it reaches the scaling step it generates an error. I tried with the dataset of Example 2, and the same thing happened:

if (!file.exists(here("output/ex2.part1A/validation_IDs.csv"))) {
  mod <- ALASCA(
    df,
    formula = value ~ time + time:group + (1|ID),
    wide = TRUE,
    separate_effects = TRUE,
    equal_baseline = TRUE,
    scale_function = "none",
    filepath = here("output","ex2.part1A"),
    filename = "model.ex2.part1",
    n_validation_runs = 1000,
    validate = TRUE,
    save = TRUE,
    validation_method = "bootstrap",
    save_validation_ids = TRUE
  )
  flip(mod, effect = 1)
  plot(mod, effect = c(1,2), component = 1)
  plot(mod, effect = c(1,2), component = 2)
  plot(mod, effect = 1, component = c(1,2), type = "2D")
  plot(mod, effect = 2, component = c(1,2), type = "2D")
  plot(mod, effect = 1, component = 1, type = "validation")
  plot(mod, effect = 1, component = 2, type = "validation")
  plot(mod, effect = 2, component = 1, type = "validation")
  plot(mod, effect = 2, component = 2, type = "validation")
  plot(mod, effect = 1, component = 1, type = "histogram")
  plot(mod, effect = 2, component = 1, type = "histogram")
}
INFO [2023-02-01 12:35:03] Initializing ALASCA (v1.0.8, 2023-01-15)
WARN [2023-02-01 12:35:03] Guessing effects: time and time:group
INFO [2023-02-01 12:35:03] Will use linear mixed models!
INFO [2023-02-01 12:35:03] Will use Rfast!
INFO [2023-02-01 12:35:03] Converting from wide to long!
INFO [2023-02-01 12:35:03] Found 16 variables
WARN [2023-02-01 12:35:03] The group column is used for stratification
WARN [2023-02-01 12:35:03] Not scaling data...
Error in identity() : argument "x" is missing, with no default
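
The final error message is what base R produces when identity() is called without an argument. The sketch below reproduces it and contrasts calling the function with passing it along uncalled. The names `no_scaling` and `values` are hypothetical, and the suggestion that the "none" option ends up calling identity() with no data is a guess from the message, not something confirmed from the ALASCA source.

```r
# identity() is base R's one-argument pass-through function.
stopifnot(identity(3) == 3)

# Calling it with no argument raises the error shown in the log.
missing_arg_msg <- tryCatch(identity(), error = function(e) conditionMessage(e))
print(missing_arg_msg)

# A scaling step is typically stored as a function and applied later,
# so a no-op scaler has to be passed uncalled.
no_scaling <- identity          # the function itself, not identity()
values <- c(1.5, 2.0, 3.5)
print(no_scaling(values))
```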

According to the package, scale_function accepts the following options: none, sdall, sdref, sdreft1, sdt1. I want to confirm whether ALASCA can indeed be run without scaling the data inside the function.
Again thank you very much for your time and excellent work.

Regards,
Cynthia

sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Mexico.utf8 LC_CTYPE=Spanish_Mexico.utf8
[3] LC_MONETARY=Spanish_Mexico.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Mexico.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplot2_3.4.0 ALASCA_1.0.7 data.table_1.14.6 here_1.0.1

loaded via a namespace (and not attached):
[1] log4r_0.4.3 pillar_1.8.1 compiler_4.2.2 ggpubr_0.5.0 tools_4.2.2
[6] lifecycle_1.0.3 tibble_3.1.8 gtable_0.3.1 pkgconfig_2.0.3 rlang_1.0.6
[11] DBI_1.1.3 cli_3.4.1 rstudioapi_0.14 withr_2.5.0 dplyr_1.0.10
[16] generics_0.1.3 vctrs_0.5.1 gtools_3.9.4 rprojroot_2.0.3 grid_4.2.2
[21] tidyselect_1.2.0 glue_1.6.2 R6_2.5.1 rstatix_0.7.1 fansi_1.0.3
[26] carData_3.0-5 purrr_1.0.0 tidyr_1.2.1 car_3.1-1 magrittr_2.0.3
[31] scales_1.2.1 backports_1.4.1 assertthat_0.2.1 abind_1.4-5 colorspace_2.0-3
[36] ggsignif_0.6.4 utf8_1.2.2 munsell_0.5.0 broom_1.0.2

Problems with adding random slopes to the model.

Hi Anders,

To familiarize myself with the ALASCA package, I tried to retrieve random slopes from simulated data. Running the script:

library(lme4)
library(data.table)
library(ggplot2)
library(ALASCA)

df <- fread("[...]/data_long.csv")

res <- ALASCA(
  df,
  value ~ time + time:group + (time | sub_id),
  use_Rfast = FALSE,
  equal_baseline = FALSE,
  validate = TRUE,
  n_validate = 1000,
  effects = c("time", "time:group", "time+time:group"),
  scale_function = "sdall"
)

returns the output:

INFO [2024-06-17 16:47:01] Initializing ALASCA (v1.0.15, 2024-02-07)
INFO [2024-06-17 16:47:01] Will use linear mixed models!
ERROR [2024-06-17 16:47:01] Cannot use Rfast in this case. Use lme4 with use_Rfast = FALSE instead!
Error in private$set_method() :

It works well when I try random intercepts only (1 | sub_id).

Thanks already for the great package! Any further help, working example, or reference is much appreciated!

Cheers
Martin

Error in eigen(w1) : infinite or missing values in 'x' error when validation method is chosen

Dear Anders,

I wanted to open another issue that may help you to answer better so here you go :)

I converted my continuous age variable to a six-level categorical variable (age_factor), and my ALASCA model looks like:

my_model <- ALASCA(input,
                   value ~ time*age_factor + (1|id),
                   separate_effects = T,
                   scale_function = "sdt1",
                   equal_baseline = T,
                   plot.loading_group_column = "type",
                   plot.loading_group_label = "Amino acid class",
                   max_PC = 22,
                   pca_function = "princomp",
                   validate = T, n_validation_runs = 90,
                   validation_method = "jack-knife")

This command only succeeds after a couple of attempts; otherwise, it gives this error:

Error in eigen(w1) : infinite or missing values in 'x'

When I replace the validation method with "bootstrap", it fails even more quickly. I tried other pca_function arguments too, but jack-knife seems to work with fewer issues.

According to the benchmark in the ALASCA publication, jack-knife seems to give smaller CIs, but the two methods do not differ significantly from each other. For example, here they preferred "bootstrapping" but here they used jack-knifing. What causes this error, and why does jack-knife seem to run more reliably than bootstrap?
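
For context, the eigen() message quoted above is raised whenever the matrix passed to eigen() contains NA, NaN or Inf. One generic way this can happen in a resampled data set is a variable that becomes constant within the resample, so that dividing by its standard deviation yields NaN. The base R sketch below shows that mechanism with a made-up matrix `m`. It is only a possible explanation and has not been verified against this model or against ALASCA's internals.

```r
# Hypothetical data: column b is constant, as could happen by chance
# in a small resample.
m <- cbind(a = c(1, 2, 3, 4), b = c(5, 5, 5, 5))

# Centering and dividing by the SD gives 0/0 = NaN for column b.
scaled <- scale(m)
print(scaled)

# The NaN values propagate into the covariance matrix.
cov_mat <- cov(scaled)
print(cov_mat)

# eigen() refuses matrices with non-finite entries.
eigen_msg <- tryCatch(eigen(cov_mat), error = function(e) conditionMessage(e))
print(eigen_msg)
```

Because resampling is random, such a failure would appear in some runs and not others, which would be consistent with the command succeeding only on some attempts.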

Thanks!
Best regards,
Nilay

Error in self$model$get_scores(effect_i = effect_i, component = component) : object 'PC' not found

Hi developer,

I fitted the ALASCA model successfully but accidentally deleted the log file. I then got the error shown in the title when I ran plot(ALASCA_model, component = c(1,2), type = "effect")

I appreciate any help you could provide.

Best,
Muyao

object of type 'closure' is not subsettable

Hi!
I am using untargeted metabolomics data with around 5000 features (variables), two time points, 44 subjects, and 4 treatment groups.
I am getting the following error:

mod <- ALASCA(
  df = final_result,
  formula = Value ~ Time + (1 | Subject),
  scale_function = "sdt1")
INFO [2024-03-11 13:22:11] Initializing ALASCA (v1.0.15, 2024-02-07)
WARN [2024-03-11 13:22:11] Guessing effects: Time
INFO [2024-03-11 13:22:11] Will use linear mixed models!
INFO [2024-03-11 13:22:11] Will use Rfast!
WARN [2024-03-11 13:22:11] The Time column is used for stratification
INFO [2024-03-11 13:22:11] Scaling data with sdt1 ...
Error in value[get(self$effect_terms[[1]]) == self$get_ref(self$effect_terms[[1]])] :
object of type 'closure' is not subsettable

I have some values with 0's but they are not missing values. Otherwise I do not find why I would see this error.
Is it related to having many variable names ?
Thank you!
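
The failing expression in the error refers to a column named `value` (lowercase), whereas the call uses `Value`. "Object of type 'closure'" means that a name resolved to a function instead of to data. A possible workaround, assuming the scaling step expects the lowercase column name, is to rename the column before fitting. The data frame below is a made-up stand-in for `final_result`, and the formula would then need to be `value ~ Time + (1 | Subject)`.

```r
# Hypothetical long-format data mirroring the columns in the call above.
final_result <- data.frame(
  Subject = rep(c("s1", "s2"), each = 2),
  Time = rep(c("t0", "t1"), times = 2),
  Value = c(0, 1.5, 2.1, 0)
)

# Subsetting a function (here: mean) triggers the same kind of error.
closure_msg <- tryCatch(mean[1], error = function(e) conditionMessage(e))
print(closure_msg)

# Rename the response column to the lowercase name used in the error.
names(final_result)[names(final_result) == "Value"] <- "value"
print(names(final_result))
```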

Plotting not working in example in "Getting Started".

I am trying to run the example code https://andjar.github.io/ALASCA/articles/ALASCA.html. While ALASCA runs, I am unable to use the 'plot' command:

plot(res, component = c(1,2), type = 'effect')
INFO [2024-08-07 17:27:27] Effect plot. Selected effect (nr 1): time. Component: 1 and 2.
Error in vapply(columns, FUN = function(column) { :
values must be length 1,
but FUN(X[[1]]) result is length 0
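
This message is vapply()'s generic complaint when the function it applies returns an empty result (for example NULL) for some element, while exactly one value was promised. The base R sketch below reproduces the message with a hypothetical lookup list. It does not show which lookup fails inside the plot method, only what the error means.

```r
# Hypothetical lookup in which one of the requested names is missing.
columns <- c("time", "group")
lookup <- list(group = "Group")

# lookup[["time"]] is NULL (length 0), so vapply() stops.
vapply_msg <- tryCatch(
  vapply(columns, FUN = function(column) lookup[[column]],
         FUN.VALUE = character(1)),
  error = function(e) conditionMessage(e)
)
cat(vapply_msg, "\n")

# Returning an explicit NA for missing entries keeps every result length 1.
safe <- vapply(columns, FUN = function(column) {
  found <- lookup[[column]]
  if (is.null(found)) NA_character_ else found
}, FUN.VALUE = character(1))
print(safe)
```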

Do you happen to know why this is happening?

type="covars" gives error while plotting the ALASCA object

Hi Anders,

Thanks so much for implementing such a cool package, it's quite useful and versatile!

I have two questions. The first: when I try to plot the ALASCA object with the type = "covars" argument of the plot function, I always get this error, with different models:

Error in factor(covars, levels = data_to_plot[order(loading), covars]) : 
  object 'covars' not found

I installed the package with devtools::install_github("andjar/ALASCA", ref = "main") command so it should get the latest updates, right?

Thanks in advance!
Best regards,
Nilay
