Comments (6)
I would add it into `modules/nf-core`, then either use `nf-core modules patch` to remove the meta map or, probably even easier, just add a fake meta map to the channel before entering the unmodified blastn module.
About the issues:
- That could be done in an additional step, initially a local copy would be fine I guess
- Optimally the output file should be modified in an additional process/module (local) that makes the output compatible with downstream analysis
- This problem I am not aware of. I would hope that with nextflow channels and file staging that shouldn't be necessary, but there were some oddities of containerized programs that I encountered as well.
from ampliseq.
I can't see why one would have to patch the module to remove the meta map. One can just create a meta map on the fly -- usually, just the `id` entry is needed -- before calling the module. (Edit: Sorry, hadn't seen that you already proposed this, @d4straub.)
Otherwise, I agree with @d4straub, including his comments for point 3: I think it's a matter of properly staging the directory in which the blast database is located or will be created. I see there is already a module for `makeblastdb`, so I suppose it's just a matter of calling that the correct way.
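Creating the meta map on the fly could look roughly like the sketch below. It is only an illustration: `params.asv_fasta` and the channel names are hypothetical, and the exact inputs the blastn module expects depend on the module version that is installed.

```nextflow
workflow {
    // Hypothetical input: a channel of plain FASTA files without a meta map.
    // `params.asv_fasta` is an assumed parameter name, not an existing ampliseq option.
    ch_fasta = Channel.fromPath(params.asv_fasta)

    // Wrap each file in a [ meta, file ] tuple; only the `id` entry is set here.
    ch_fasta_with_meta = ch_fasta.map { fasta -> [ [ id: fasta.baseName ], fasta ] }

    // Print the tuples to check their shape before wiring them into a module.
    ch_fasta_with_meta.view()
}
```

The resulting channel can then be passed as the query input of the unmodified module, so no patch is needed.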
from ampliseq.
Come to think of it: We're already using VSEARCH, which is basically doing the same thing as BLASTN but at least an order of magnitude faster -- can't that be used instead?
from ampliseq.
I haven't tried using VSEARCH for classification, but I'll give it a try tomorrow. If it does the same thing, but faster, then the only argument I still have for adding blastn is for cases when someone wants to use an existing blast database. For example, if someone is working on a supercomputer that stores and manages a blast nt database.
from ampliseq.
Hi, I think a strong case can be made for incorporating blast as an option rather than only keeping VSEARCH, as @a4000 has suggested.
Increasingly, there will be a need for re-analysis of data, especially as longitudinal bio-monitoring projects become more standard practice, e.g. in marine environments. Many existing data sets had their taxonomic annotation performed via blast. One would at least want to be able to recreate those results in a backward-compatible manner. As there is already an nf-core module available, incorporation should not require a huge amount of work?
from ampliseq.
Imho, re-analysis from scratch of relevant data might be more helpful than attempting to use old methods to make data comparable ("old" here doesn't mean blast in particular, but all steps that come with analysing raw data). And ampliseq supports large data batches.
> As there is already an nf-core module available, incorporation should not require a huge amount of work?
The implementation effort just to run blast on ASVs should indeed be relatively small. But the module covers only a small part of what's needed (and the easiest part, imho). But I would be happy to be proven wrong!
from ampliseq.