Description of feature: I can add a module to Ampliseq that would p


Proposal: Phyloseq R object creation at end of pipeline about ampliseq HOT 7 CLOSED

a4000 commented on September 23, 2024

Proposal: Phyloseq R object creation at end of pipeline

from ampliseq.

Comments (7)

cpauvert commented on September 23, 2024

> I'm not 100% sure if phyloseq allows creating an object without the metadata (I can test that tomorrow), but if it doesn't, could maybe create a dummy metadata sheet with the sample names. I do know that it is possible to create the object without a tree, so that shouldn't be an issue.

Yes, it is possible to create a phyloseq object starting from any of the component classes (OTU, sample, tree, taxonomy). The example for the constructor function does not include metadata ;)


d4straub commented on September 23, 2024

Yes, that would be handy!
Edit: I forgot to ask, are 2 elements also possible (i.e. without metadata & tree) for a phyloseq object? The use case would be someone who skips downstream analysis in ampliseq by not supplying metadata, and who wants to do that analysis outside of ampliseq using the phyloseq object.


a4000 commented on September 23, 2024

I'm not 100% sure if phyloseq allows creating an object without the metadata (I can test that tomorrow), but if it doesn't, could maybe create a dummy metadata sheet with the sample names. I do know that it is possible to create the object without a tree, so that shouldn't be an issue.
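If a dummy metadata sheet did turn out to be needed, it could be derived from the count table header. This is only a rough sketch in Python: it assumes the count table is a TSV whose first column holds the ASV IDs and whose remaining columns are sample names, and the function name and the `ID` column name are made up for illustration.

```python
import csv
import io


def dummy_metadata_from_counts(count_tsv_text, id_column="ID"):
    """Build a one-column metadata sheet listing the sample names of a count table."""
    reader = csv.reader(io.StringIO(count_tsv_text), delimiter="\t")
    header = next(reader)
    # Assumption: the first header field labels the ASV column, the rest are samples.
    sample_names = header[1:]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([id_column])
    for name in sample_names:
        writer.writerow([name])
    return out.getvalue()


# Tiny made-up count table, for illustration only.
example_counts = "ASV_ID\tsampleA\tsampleB\nASV1\t10\t0\nASV2\t3\t7\n"
dummy_sheet = dummy_metadata_from_counts(example_counts)
print(dummy_sheet)
```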


a4000 commented on September 23, 2024

I've decided the first thing I'll add to Ampliseq is the phylogenetic tree because I figured it would be an easy place to start while I familiarise myself more with Ampliseq. I have some questions for the Ampliseq team to make sure I'm on the right track with implementing this feature.

For the table of ASV counts, I'm planning on using the table found in ch_dada2_asv, though I could also use the filtered table in QIIME2_FILTERTAXA.out.tsv if that table exists. Are there other count tables I should consider using?

From what I can tell, the pipeline can produce 1 or more taxonomy tables. It should be easy enough to produce multiple phyloseq objects depending on which taxa tables exist. I've found two taxa tables in the pipeline that are already in the correct format for what I need found in ch_dada2_tax and ch_sintax_tax. I've found two other taxa tables that are in slightly different formats found in ch_pplace_tax and QIIME2_TAXONOMY.out.tsv. I'm wondering if I should add modules to reformat these tables or if there are other tables I haven't found yet?

I've found a nwk phylogenetic tree produced by the pipeline in FASTA_NEWICK_EPANG_GAPPA.out.grafted_phylogeny. I believe there's also a tree produced by QIIME? The problem is that this tree has taxonomy names as the tip labels of the tree while phyloseq expects the tip labels to match the ASV names. I'm wondering if the pipeline produces a tree with those ASV names as the tip labels, or if adding the tree to the object will require a different plan (e.g., producing a different tree that has those ASV names)?
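One way to check the tip-label concern before building the object is to compare the tree's tip labels with the ASV IDs in the count table. The Python sketch below assumes a simple Newick string with unquoted labels; the helper names and the example tree are hypothetical, not taken from the pipeline.

```python
import re


def newick_tip_labels(newick):
    """Return the tip labels of a simple Newick string with unquoted labels."""
    # A tip label directly follows "(" or ","; internal node labels follow ")".
    return set(re.findall(r"[(,]\s*([^\s():,;]+)", newick))


def compare_tips_to_asvs(newick, asv_ids):
    """Report labels that appear in only one of the tree and the ASV table."""
    tips = newick_tip_labels(newick)
    asvs = set(asv_ids)
    return {
        "tips_not_in_table": sorted(tips - asvs),
        "asvs_not_in_tree": sorted(asvs - tips),
    }


# Made-up tree in which one tip carries a taxonomy name instead of an ASV ID.
example_tree = "((ASV1:0.1,ASV2:0.2):0.05,Bacteria_sp:0.3);"
report = compare_tips_to_asvs(example_tree, ["ASV1", "ASV2", "ASV3"])
print(report)
```

Two empty lists in the report would mean the tip labels and ASV IDs match exactly.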


a4000 commented on September 23, 2024

Actually, I tried running the pipeline with my phyloseq module and it successfully added the tree, so you can probably disregard that point.

To elaborate on my point about the tax tables: the dada2 and Sintax tax tables have the different tax levels as columns, while the other two tax tables just have a single taxonomy column. The phyloseq object will still be created, but it might be beneficial to have the tax tables in a more consistent format.
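To make the format difference concrete, here is a rough Python sketch of splitting a single taxonomy column into one column per level. It assumes the levels are separated by `;`, and the rank names and function name are placeholders for illustration; the actual tables may use different ranks or a different number of levels.

```python
# Assumed rank names, for illustration only.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def split_taxonomy(tax_string, ranks=RANKS, sep=";"):
    """Split one taxonomy string into a dict with one entry per rank."""
    levels = [level.strip() for level in tax_string.split(sep)]
    # Drop empty trailing levels left behind by a trailing separator.
    while levels and levels[-1] == "":
        levels.pop()
    # Truncate extra levels and pad missing ones with empty strings.
    levels = levels[: len(ranks)]
    levels += [""] * (len(ranks) - len(levels))
    return dict(zip(ranks, levels))


row = split_taxonomy("Bacteria;Firmicutes;Bacilli;")
print(row)
```

Unassigned lower levels come out as empty strings here; a real module would need to decide how to represent them.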


d4straub commented on September 23, 2024

> For the table of ASV counts, I'm planning on using the table found in ch_dada2_asv, though I could also use the filtered table in QIIME2_FILTERTAXA.out.tsv if that table exists. Are there other count tables I should consider using?

Essentially, the ASV count table is produced by DADA2 and then optionally filtered by some custom filter scripts, ending up in ch_dada2_asv in all cases. It is then optionally filtered by QIIME2 and exported as TSV as ch_tsv = QIIME2_FILTERTAXA.out.tsv. So yes, those two should do it.

> From what I can tell, the pipeline can produce 1 or more taxonomy tables. It should be easy enough to produce multiple phyloseq objects depending on which taxa tables exist. I've found two taxa tables in the pipeline that are already in the correct format for what I need found in ch_dada2_tax and ch_sintax_tax. I've found two other taxa tables that are in slightly different formats found in ch_pplace_tax and QIIME2_TAXONOMY.out.tsv. I'm wondering if I should add modules to reformat these tables or if there are other tables I haven't found yet?

Four tools are currently able to produce tax tables. One particular tax table is chosen for downstream analysis and is imported into QIIME2 (if it is run) as ch_tax. All tax tables are used by QIIME2 to export aggregated and merged (ASV count & tax) tables. I imagine something similar could be done for phyloseq objects (outside of QIIME2, obviously).
About reformatting: I think the taxonomy always uses ; as the separator, if it is separated at all. I believe QIIME2 only accepts tax tables with ;-separated tax levels, meaning QIIME2 needs that format. One module should be enough for the reformatting you need, I assume.


d4straub commented on September 23, 2024

Is that issue solved by #615? If yes, please close it.

