Giter VIP home page Giter VIP logo

Comments (5)

d4straub avatar d4straub commented on September 20, 2024

Thanks for the report, that indeed requires fixing.

from ampliseq.

d4straub avatar d4straub commented on September 20, 2024

I tested:
nextflow run nf-core/ampliseq -r 2.7.1 -profile test_failed,singularity --outdir result_test_failed_2-7-1 that is loosing a sample during early QC. This sample is missing in the ASV table but phyloseq works just fine.
nextflow run nf-core/ampliseq -r 2.7.1 -profile test_failed,singularity --outdir result_test_failed_2-7-1 --min_frequency 135 --exclude_taxa Moraxellaceae --diversity_rarefaction_depth 100 --skip_diversity_indices --skip_ancom to make it also loose ASV so that an additional sample will loose all counts during taxonomic filtering (QIIME2_FILTERTAXA), and consequentely only 3 samples are left in the ASV table. But the pipeline is still fine and the phyloseq object is created.

@agrier-wcm I'll need more information here, I cannot reproduce the problem.

from ampliseq.

agrier-wcm avatar agrier-wcm commented on September 20, 2024

Thanks for promptly looking into this @d4straub. I have a relatively small dataset that works fine with 2.6.1, but I get this error with 2.7.1:

Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ (dada2)'

Caused by:
  Process `NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ (dada2)` terminated with an error exit status (1)

Command executed:

  #!/usr/bin/env Rscript
  
  suppressPackageStartupMessages(library(phyloseq))
  
  otu_df  <- read.table("reformat_filtered-table.tsv", sep="\t", header=TRUE, row.names=1)
  tax_df  <- read.table("ASV_tax_species.silva.tsv", sep="\t", header=TRUE, row.names=1)
  otu_mat <- as.matrix(otu_df)
  tax_mat <- as.matrix(tax_df)
  
  OTU     <- otu_table(otu_mat, taxa_are_rows=TRUE)
  TAX     <- tax_table(tax_mat)
  phy_obj <- phyloseq(OTU, TAX)
  
  if (file.exists("Metadata.tsv")) {
      sam_df  <- read.table("Metadata.tsv", sep="\t", header=TRUE, row.names=1)
      SAM     <- sample_data(sam_df)
      phy_obj <- merge_phyloseq(phy_obj, SAM)
  }
  
  if (file.exists("")) {
      TREE    <- read_tree("")
      phy_obj <- merge_phyloseq(phy_obj, TREE)
  }
  
  saveRDS(phy_obj, file = paste0("dada2", "_phyloseq.rds"))
  
  # Version information
  writeLines(c("\"NFCORE_AMPLISEQ:AMPLISEQ:PHYLOSEQ_WORKFLOW:PHYLOSEQ\":",
      paste0("    R: ", paste0(R.Version()[c("major","minor")], collapse = ".")),
      paste0("    phyloseq: ", packageVersion("phyloseq"))),
      "versions.yml"
  )

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in validObject(.Object) : invalid class “phyloseq” object: 
   Component sample names do not match.
   Try sample_names()
  Calls: merge_phyloseq ... do.call -> new -> initialize -> initialize -> validObject
  Execution halted

It does have a sample that loses all reads to the tax filtering step, so I assumed that was the problem because I thought for phyloseq object creation the set of samples in the Metadata file would have to match the set in the OTU table. Maybe there's something else going on. I'm testing a couple other small datasets now with 2.7.1 to see if I can reproduce the error with another dataset. Here is the command I was using, for reference:

nextflow run ampliseq -profile singularity --input ./SampleSheet.tsv --FW_primer GTGYCAGCMGCCGCGGTAA --RV_primer CCGYCAATTYMTTTRAGTTT --metadata ./Metadata.tsv --outdir ./results --email my.email@me --dada_ref_taxonomy silva --ignore_empty_input_files --ignore_failed_trimming --min_frequency 10 --retain_untrimmed --trunclenf 240 --trunclenr 160 --metadata_category_barplot "TumorLocation,SampleTissueType" --tax_agglom_max 7 --max_memory 32.GB

Does your test also use silva as the taxonomic database? I will have more information shortly from the tests that I am running, but I'm not exactly sure what to interrogate if the problem is not the tax filter thing. I will update this evening.

from ampliseq.

agrier-wcm avatar agrier-wcm commented on September 20, 2024

Well, it ended up being something stupid. In the Metadata file, some samples had a space at the end of the sample names, which they did not have in the SampleSheet. Somewhat interestingly, this did not cause any problems in 2.6.1, even for the comparisons/group assignments. Sorry to waste your time @d4straub , but thank you for looking into it. Closing.

from ampliseq.

d4straub avatar d4straub commented on September 20, 2024

I see, happens to the best of us ;)
Thanks for returning feedback.

from ampliseq.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.