Comments (5)
Thanks for the report, that indeed requires fixing.
from ampliseq.
I tested:
nextflow run nf-core/ampliseq -r 2.7.1 -profile test_failed,singularity --outdir result_test_failed_2-7-1
This run loses a sample during early QC. The sample is missing from the ASV table, but phyloseq works just fine. I also tested:
nextflow run nf-core/ampliseq -r 2.7.1 -profile test_failed,singularity --outdir result_test_failed_2-7-1 --min_frequency 135 --exclude_taxa Moraxellaceae --diversity_rarefaction_depth 100 --skip_diversity_indices --skip_ancom
to make it also lose ASVs, so that an additional sample loses all counts during taxonomic filtering (QIIME2_FILTERTAXA), and consequently only 3 samples are left in the ASV table. The pipeline still runs fine and the phyloseq object is created.
@agrier-wcm I'll need more information here, I cannot reproduce the problem.
Thanks for promptly looking into this @d4straub. I have a relatively small dataset that works fine with 2.6.1, but I get this error with 2.7.1:
Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ (dada2)'
Caused by:
Process `NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ (dada2)` terminated with an error exit status (1)
Command executed:
#!/usr/bin/env Rscript
suppressPackageStartupMessages(library(phyloseq))
otu_df <- read.table("reformat_filtered-table.tsv", sep="\t", header=TRUE, row.names=1)
tax_df <- read.table("ASV_tax_species.silva.tsv", sep="\t", header=TRUE, row.names=1)
otu_mat <- as.matrix(otu_df)
tax_mat <- as.matrix(tax_df)
OTU <- otu_table(otu_mat, taxa_are_rows=TRUE)
TAX <- tax_table(tax_mat)
phy_obj <- phyloseq(OTU, TAX)
if (file.exists("Metadata.tsv")) {
sam_df <- read.table("Metadata.tsv", sep="\t", header=TRUE, row.names=1)
SAM <- sample_data(sam_df)
phy_obj <- merge_phyloseq(phy_obj, SAM)
}
if (file.exists("")) {
TREE <- read_tree("")
phy_obj <- merge_phyloseq(phy_obj, TREE)
}
saveRDS(phy_obj, file = paste0("dada2", "_phyloseq.rds"))
# Version information
writeLines(c("\"NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ\":",
paste0(" R: ", paste0(R.Version()[c("major","minor")], collapse = ".")),
paste0(" phyloseq: ", packageVersion("phyloseq"))),
"versions.yml"
)
Command exit status:
1
Command output:
(empty)
Command error:
Error in validObject(.Object) : invalid class “phyloseq” object:
Component sample names do not match.
Try sample_names()
Calls: merge_phyloseq ... do.call -> new -> initialize -> initialize -> validObject
Execution halted
It does have a sample that loses all reads to the tax filtering step, so I assumed that was the problem because I thought for phyloseq object creation the set of samples in the Metadata file would have to match the set in the OTU table. Maybe there's something else going on. I'm testing a couple other small datasets now with 2.7.1 to see if I can reproduce the error with another dataset. Here is the command I was using, for reference:
nextflow run ampliseq -profile singularity --input ./SampleSheet.tsv --FW_primer GTGYCAGCMGCCGCGGTAA --RV_primer CCGYCAATTYMTTTRAGTTT --metadata ./Metadata.tsv --outdir ./results --email my.email@me --dada_ref_taxonomy silva --ignore_empty_input_files --ignore_failed_trimming --min_frequency 10 --retain_untrimmed --trunclenf 240 --trunclenr 160 --metadata_category_barplot "TumorLocation,SampleTissueType" --tax_agglom_max 7 --max_memory 32.GB
Does your test also use silva as the taxonomic database? I will have more information shortly from the tests that I am running, but I'm not exactly sure what to interrogate if the problem is not the tax filter thing. I will update this evening.
Well, it ended up being something stupid. In the Metadata file, some samples had a space at the end of the sample names, which they did not have in the SampleSheet. Somewhat interestingly, this did not cause any problems in 2.6.1, even for the comparisons/group assignments. Sorry to waste your time @d4straub, but thank you for looking into it. Closing.
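For anyone hitting the same "Component sample names do not match" error, a quick check for stray whitespace in the sample IDs may save some time. The following is only a minimal sketch, not part of ampliseq: the helper names (read_ids, whitespace_padded, unmatched), the column headers, and the toy tables are all made up for illustration, and it assumes the sample ID sits in the first column of both tab-separated files.

```python
import csv
import io


def read_ids(handle, id_column=0):
    """Return the raw values of one column from a tab-separated table, skipping the header."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row
    return [row[id_column] for row in reader if row]


def whitespace_padded(ids):
    """Return IDs that change when leading/trailing whitespace is stripped."""
    return [i for i in ids if i != i.strip()]


def unmatched(metadata_ids, samplesheet_ids):
    """Return metadata IDs that have no exact match in the sample sheet."""
    known = set(samplesheet_ids)
    return [i for i in metadata_ids if i not in known]


# Toy stand-ins for Metadata.tsv and SampleSheet.tsv (hypothetical content).
# Note the trailing space after "S2" in the metadata table.
metadata = io.StringIO("ID\tTumorLocation\nS1\tleft\nS2 \tright\nS3\tleft\n")
samplesheet = io.StringIO(
    "sampleID\tforwardReads\nS1\ta.fq.gz\nS2\tb.fq.gz\nS3\tc.fq.gz\n"
)

meta_ids = read_ids(metadata)
sheet_ids = read_ids(samplesheet)

# repr() makes the otherwise invisible whitespace visible in the output.
print("padded IDs in metadata:", [repr(i) for i in whitespace_padded(meta_ids)])
print("metadata IDs not in sample sheet:", [repr(i) for i in unmatched(meta_ids, sheet_ids)])
```

To run it on real files, the io.StringIO objects could be swapped for open("Metadata.tsv", newline="") and open("SampleSheet.tsv", newline=""), assuming those files are tab-separated with a header row.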
I see, happens to the best of us ;)
Thanks for returning feedback.