The files needed for the pipeline are:
- Human reference genome + bowtie2 indexes.
- Virus reference genome + bowtie2 indexes + blast db + gff/gtf file (currently the NC_045512.2 version from RefSeq is being used for SARS-CoV-2)
from viralrecon.
What is the best way to get the GFF/GTF file?
There is one linked here:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/858/895/GCF_009858895.2_ASM985889v3/GCF_009858895.2_ASM985889v3_genomic.gff.gz
Or would it be better to generate one from the full NC_045512.2 GenBank record?
from viralrecon.
Yep, it would be helpful if we can have links to where the reference files were downloaded (especially for the virus). If they have been generated manually from standard files (e.g. blast db) then worth listing the command used here too.
Also, just out of interest, why did you pick NC_045512.2 as a reference sequence and not one of the others that are being used? e.g.
https://www.ncbi.nlm.nih.gov/nuccore/MN908947
I am just trying to figure out whether we should make all or just one of these available in the format required by the pipeline.
from viralrecon.
NC_045512.2 and MN908947 are exactly the same sequence: one is the RefSeq identifier and the other is the GenBank one. The GenBank accession was used at the beginning because the RefSeq record didn't exist yet, but the sequence and coordinates are the same.
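If you want to confirm this for files you have downloaded yourself, a minimal sketch along these lines compares two FASTA files by sequence only, ignoring the header lines. The function names and the file names in the usage comment are hypothetical, and it assumes each file holds a single record.

```shell
#!/usr/bin/env bash
# Sketch: compare two single-record FASTA files by sequence only.
# Function names here are illustrative, not part of any pipeline.

# Print the sequence of a FASTA file as one uppercase line, skipping headers.
fasta_seq() {
    grep -v '^>' "$1" | tr -d '\r\n' | tr '[:lower:]' '[:upper:]'
}

# Return 0 if both files hold the same sequence, non-zero otherwise.
same_sequence() {
    [ "$(fasta_seq "$1")" = "$(fasta_seq "$2")" ]
}

# Example usage (placeholder file names for FASTA files you downloaded):
# same_sequence NC_045512.2.fasta MN908947.fasta && echo "identical" || echo "different"
```

Line wrapping and letter case are normalised before comparing, so differences in how the two records were formatted do not affect the result.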
We've downloaded everything from NCBI.
Virus FASTA file: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/858/895/GCF_009858895.2_ASM985889v3/GCF_009858895.2_ASM985889v3_genomic.fna.gz
Virus GFF file: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/858/895/GCF_009858895.2_ASM985889v3/GCF_009858895.2_ASM985889v3_genomic.gff.gz
blast db:
makeblastdb -in NC_045512.2.fasta -parse_seqids -dbtype nucl
bowtie2 index:
bowtie2-build NC_045512.2.fasta NC_045512.2.fasta
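Put together, the download and indexing steps could be scripted roughly as below. This is only a sketch: the URLs and the makeblastdb/bowtie2-build options are the ones quoted in this comment, while the function names, the `DRY_RUN` switch, the local file names, and passing the `.fasta` file name to both tools are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: fetch the RefSeq files quoted above and build both indexes.
# Set DRY_RUN=1 to print the commands instead of running them.

BASE_URL="https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/858/895/GCF_009858895.2_ASM985889v3"
PREFIX="GCF_009858895.2_ASM985889v3_genomic"

# Run a command, or only echo it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Download and unpack the virus FASTA and GFF.
# $1 is the local base name (assumed), e.g. NC_045512.2
fetch_virus_reference() {
    run curl -fsSL -o "$1.fasta.gz" "${BASE_URL}/${PREFIX}.fna.gz"
    run curl -fsSL -o "$1.gff.gz" "${BASE_URL}/${PREFIX}.gff.gz"
    run gunzip -f "$1.fasta.gz" "$1.gff.gz"
}

# Build the BLAST database and the Bowtie2 index.
# $1 is the uncompressed FASTA file, e.g. NC_045512.2.fasta
build_virus_indexes() {
    run makeblastdb -in "$1" -parse_seqids -dbtype nucl
    run bowtie2-build "$1" "$1"
}

# Example usage:
# fetch_virus_reference NC_045512.2 && build_virus_indexes NC_045512.2.fasta
```

Running it once with `DRY_RUN=1` prints the exact commands, which is a cheap way to check paths before anything is downloaded or built.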
from viralrecon.
Amazing! Thanks @saramonzon. This will come in handy if and when we decide to upload the data to AWS iGenomes or I guess we could just provide URLs to the pipeline 🤔
from viralrecon.
This should be sorted now with this PR to host the viral genomes on test-datasets: nf-core/test-datasets#148
from viralrecon.