Comments (7)
In the vcf output, most problems come from counting AF for insertion.
After digging into the code, turns out there is a bug in counting insertion and deletion read depth.
Hope the problem is fixed now, welcome to try again and see if any weird things happen.
from clairvoyante.
Hello,
I reinstalled Clairvoyante. I don't get any more allele frequencies bigger than 1, so this seems to be fixed.
However, I am still getting af of exactly 0 .
bcftools view GC031600.160503.sorted.bam.vcf.gz| grep -v "#"|awk -F"\t|:" '{print $16}'|sort -n|head
0
0
0
This is not necessarily in INDELS regions, as filtering for SNPs only gives
bcftools view GC031600.160503.sorted.bam.vcf.gz|grep -v "#"| awk -F"\t|:" '{if (length($3) == 1 && length($4) == 1) print $16}'|sort -n|head
0
0
0
Here are sample complete lines
A_Seg100 871 . G C 101 . . GT:GQ:DP:AF 1/1:101:1:0
A_Seg100 878 . T G 55 . . GT:GQ:DP:AF 1/1:55:1:0
A_Seg100 881 . T G 65 . . GT:GQ:DP:AF 1/1:65:1:0
A_Seg100 882 . C A 70 . . GT:GQ:DP:AF 1/1:70:1:0
A_Seg100 883 . G A 89 . . GT:GQ:DP:AF 1/1:89:1:0
A_Seg100 902 . T G 213 . . GT:GQ:DP:AF 1/1:213:44:0
A_Seg100 916 . T C 89 . . GT:GQ:DP:AF 1/1:89:44:0
A_Seg100 922 . G T 83 . . GT:GQ:DP:AF 1/1:83:44:0
A_Seg100 1479 . C T 234 . . GT:GQ:DP:AF 1/1:234:116:0
A_Seg100 1480 . C G 411 . . GT:GQ:DP:AF 1/1:411:117:0
Is this expected behaviour in such cases?
Interestingly it seems to happen only for homozygous site for the ALT allele. So is this a computation like
REF/(ALT+REF)?
from clairvoyante.
Hello I looked into Tablet ...
So first, it doesn't seem to be linked with 1/1 GT only
A_Seg100 1536 . T C 999 . . GT:GQ:DP:AF 1/1:999:201:0
However, when looking into Tablet at this position ... there is actually no variant visible!
as you can see just next to the 2 yellow boxes is the position 1536 ... Clairvoyante predicted a C however all the reads have a T there. So indeed, the frequency of "C" is 0 ... But then, why was it called, and why with a score of 999? The following screenshot doesn't show all the reads at the position but they are all exactly the same, I show only a few for clarity.
Is this a side effect of having put the threshold to "0" in the command? By doing this, I wanted to see all variant irrespectively of their frequency, I did not literally wanted non existing alleles :D
from clairvoyante.
It is possible to get 0 AF. Clairvoyante calls variant not only based on the observation at the variant position but also the flanking 16 bases. But AF only calculates the variant at the position. For your case, I suggest you filter those variants with AF under your expectation.
from clairvoyante.
Ok, thanks for your answer. I will do that.
Just out of curiosity, what kind of information could one get from an af of 0? Isn't it systematically indicating the call is wrong?
from clairvoyante.
For models only considering the observation at a single genome position, it's true that AF 0 almost always leads to "the allele is not a variant". I use "almost" because it is still possible to have no observation of the true variant allele at low depth.
Clairvoyante also considers the observations of the flanking 16bp. We know that PacBio and ONT are often erroneous at the homopolymer regions, and it's not impossible that the true variant alleles are all left shifted due to a homopolymer region to the left of a true variant. Clairvoyante tackles these cases and will try to provide the right answer. But for these cases, AF will be 0 because all true variant alleles are left-shifted.
from clairvoyante.
Thanks for the explanation and the support in using Clairvoyante (really cool piece of software) :)
from clairvoyante.
Related Issues (20)
- program crashes internally but still writes VCF and successfully returns 0 HOT 1
- High runtime, high memory, low precision and recall using NA12878 data HOT 2
- GetTruth.py: error: unrecognized arguments: --noGT 1 HOT 13
- cannot connect to X server localhost HOT 2
- trainedModels directory HOT 1
- TrainingModels and new Bam file HOT 1
- Non-human data? HOT 1
- Failed to tabix the generated vcf files HOT 2
- callVarBam.py fails with taskset: failed to set pid 0's affinity: Invalid argument HOT 1
- SNV report is not consistent with the bam file HOT 1
- Is model of human genome applicable to variant calling on bacterial data ? HOT 1
- Applying full support for the IUPAC nucleotide code standard for better robustness? HOT 4
- Does the pacBio model you have trained can be used for PacBio CCS reads? HOT 1
- GPU conda install docs remove Clairvoyante HOT 2
- How is the QUAL metric calculated? HOT 2
- Check that BAM is indexed HOT 1
- PyPy HOT 1
- PacBio hifi HOT 1
- BAM:the genome region you specified has no read cover HOT 2
- No output in commands.sh HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from clairvoyante.