Giter VIP home page Giter VIP logo

Comments (4)

dancooke avatar dancooke commented on August 30, 2024

Suspicious that all of the reads are on the reverse strand. Is this WGS or WES? Octopus will usually ignore variants that are so highly strand biased as they are indicative of sequencing or mapping error. I could potentially tweak the logic to include these 'cleaner' strand-biased variants, but they'd almost certainly get filtered in any case.

from octopus.

brendanofallon avatar brendanofallon commented on August 30, 2024

The data are from a large targeted panel - but I'm not sure that this variant is strand biased. The reads all go in one direction (we're on the right edge of a targeted region), but the number of alt-supporting reads isn't strongly biased in one direction relative to ref-supporting reads. (From what I understand, strand bias isn't a test of how close the forward / reverse ratio is to 50/50, it's a Chi-squared or FIsher's exact-like test to measure if alt-supporting reads are distributed similarly to ref-supporting reads across read directions. In the case above where there aren't any forward reads, the test probably cannot be performed meaningfully)
I've come across a handful of other false negative SNVs in similar regions where all the reads are in one direction. Happy to provide more small .bams or something if it would be helpful.

from octopus.

dancooke avatar dancooke commented on August 30, 2024

You're right that "strand bias" usually describes the difference in proportion of read direction supporting two or more distinct haplotypes. Octopus does compute a statistic of this sort of strand bias for filtering purposes. But as you point out, this quantity is not meaningful in the absence of reads in one direction or for homozygous calls. For WGS, we expect a binomial distribution of read directions supporting all called haplotypes, so observing a haplotype on just one strand is suggestive that the either the call is incorrect (e.g. due to a missed structural variant). It's therefore informative (about the validity of a call) to consider just the proportion of supporting read directions, which I also called "strand bias" for want of a better term. For targeted sequencing, we don't expect a binomial distribution of read directions for off-target regions, but then we cannot reasonably assign high confidence to calls in these regions in any case. I'd argue that such regions should be included in the panel, if variation within them is important.

That said, I can appreciate that some might want such variants to be called, even if they are filtered. I've therefore modified the variant discovery criteria to allow variants only seen in one direction if there are only reads in that direction overlapping the region (available in v0.5.3-beta). This should cause the variants that you mention to be called.

from octopus.

yangyxt avatar yangyxt commented on August 30, 2024

Using 0.7.4 docker image of octopus. Running into this:

image

The ploidy is 6 instead of 2. The middle mismatches of 4-bp deletion and the SNV are both captured but failed to have a valid GT (in the output both are recorded as ./.) The overlapping reads MQ and BQ are great, no strand bias.

GGGGA G 5884.22 RF . GT:AD:DP:FT ./.:67,40:152:RF
G A 5379.3 RF . GT:AD:DP:FT ./.:67,49:151:RF

Here are the command line arguments I used:
/opt/octopus/bin/octopus \ -R ${ref_genome} \ -I ${bam_file} \ -o ${output_vcf} \ --trace ${output_vcf/.vcf*/.trace.log} \ --debug ${output_vcf/.vcf*/.debug.log} \ --bamout ${bam_out} \ --bamout-type FULL \ ${region_arg} \ --threads ${threads} \ --consider-unmapped-reads \ --min-mapping-quality 15 \ --duplicate-read-detection-policy RELAXED \ --disable-downsampling \ --variant-discovery-mode ILLUMINA \ --organism-ploidy ${ploidy} \ --read-linkage PAIRED \ --caller ${caller_mode} \ --min-forest-quality ${forest_vq_filter} \ --forest ${forest} \ --annotations AD

A bit confused why this happened

from octopus.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.