Comments (7)
- You don't need to train your own model. We provide a PacBio model trained on multiple samples at multiple depths. You can start with that model: fullv3-pacbio-ngmlr-hg001+hg002+hg003+hg004-hg19.
- Correct.
- You can call variants first and, once you know where the repetitive regions are, remove the variants called in those regions.
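The post-filtering step described in the last point could be sketched roughly as follows. This is a minimal illustration, not part of Clairvoyante: it assumes the repeats are available as a BED file (0-based, half-open intervals) and that the calls are in an uncompressed VCF (1-based positions). The function names `load_bed`, `in_repeat`, and `filter_vcf` are hypothetical.

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into per-chromosome sorted, merged (start, end) intervals."""
    raw = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_repeat(regions, chrom, pos):
    """Return True if the 1-based VCF position lies inside a BED interval."""
    intervals = regions.get(chrom)
    if not intervals:
        return False
    pos0 = pos - 1  # convert to the 0-based coordinate used by BED
    # Index of the last interval whose start is <= pos0.
    i = bisect.bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def filter_vcf(vcf_lines, regions):
    """Yield header lines and the records that fall outside the repeat regions."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if not in_repeat(regions, chrom, int(pos)):
            yield line
```

In practice one would pass open file handles for the BED and VCF files to `load_bed` and `filter_vcf` and write the yielded lines to a new VCF. Established tools for interval filtering may be a better fit for large files; the sketch only shows the logic.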
So the modelling does not depend on the characteristics of the species? What does the model depend on, and when do you need to train a dedicated one? I thought I couldn't use a human-trained model on grape.
For repeats, if I don't need to train, I will just need to mark the variants as usual. But if I did train, how would I account for them?
Sorry to bump the question, but I need to understand whether and how I can run your tools.
The modeling does not depend on the characteristics of the species. The model depends only on the sequencing technology you are using. For better performance, an aligner and a genome version that match those used for the model are preferred.
Obviously the genome version cannot be the same, as it is another species. How much does this affect the quality of the results? And what about using ngmlr rather than minimap2 in this situation? If I had to redo the training, should I use human genome/data anyway?
A genome version difference makes a 0.3% F1-score difference on the human genome. I suggest you give the human model a try and see how it goes.
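For readers unfamiliar with the metric, the F1-score mentioned above is the harmonic mean of precision and recall. A minimal sketch, assuming variant calls have already been compared against a truth set and counted as true positives, false positives, and false negatives (the function name `f1_score` and the counts in the usage lines are hypothetical):

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * P * R / (P + R), with P = tp / (tp + fp) and R = tp / (tp + fn)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two runs, only to show how a difference is computed.
matched = f1_score(tp=9500, fp=300, fn=500)
mismatched = f1_score(tp=9450, fp=320, fn=550)
difference = matched - mismatched
```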
Thanks Ruibang,
I'll give the model you suggested a try first.