basicfiltering

Basic Filtering for:

  1. Variant Allele Frequency threshold: 1% (default)
  2. Variant Reads threshold: 5 (default)
  3. Tumor-Normal Variant Allele Frequency Ratio >= 5 (default)
  4. If a vcf of hotspot locations is given, positions that overlap a hotspot are kept even when they do not satisfy criterion 3 above

for Multiple Tools
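
The criteria above can be sketched roughly as follows. This is a minimal illustration, not the code used by the scripts: the function name `passes_basic_filter`, its parameters, and the use of inclusive (>=) comparisons for every threshold are assumptions. The defaults mirror the -dp, -ad, -vf and -tnr defaults listed in the usage messages.

```python
def passes_basic_filter(tumor_dp, tumor_ad, tumor_vf, normal_vf,
                        min_dp=0, min_ad=5, min_vf=0.01, min_tnr=5,
                        is_hotspot=False):
    """Return True if a call passes the basic filters (hypothetical helper).

    Assumption: all thresholds are inclusive lower bounds.
    """
    # Tumor total depth, tumor allele depth, tumor variant frequency.
    if tumor_dp < min_dp or tumor_ad < min_ad or tumor_vf < min_vf:
        return False
    # Hotspot positions are exempt from the tumor-normal ratio check.
    if is_hotspot:
        return True
    # tumor_vf / normal_vf >= min_tnr, written as a product so that a
    # normal frequency of 0 does not cause a division by zero.
    return tumor_vf >= min_tnr * normal_vf


if __name__ == "__main__":
    print(passes_basic_filter(100, 10, 0.10, 0.01))
    print(passes_basic_filter(100, 10, 0.10, 0.05))
    print(passes_basic_filter(100, 10, 0.10, 0.05, is_hotspot=True))
```

In this sketch the second call fails only on the tumor-normal ratio, and the third call shows the same position being kept once it is flagged as a hotspot.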

Requirements:

Auto CWL post-process requirements

  • Convert inputVcf to have both string and file as input type
  • Convert inputTxt to have both string and file as input type
  • Convert hotspotVcf to have both string and file as input type

Works with the output formats of the following tools:

SomaticIndelDetector (filter_sid.py)

usage: filter_sid.py [options]

Filter Indels from the output of SomaticIndelDetector

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         make lots of noise
  -ivcf SomeID.vcf, -inputVcf SomeID.vcf
                        Input SomaticIndelDetector vcf file which needs to be
                        filtered
  -itxt SomeID.txt, -inputTxt SomeID.txt
                        Input SomaticIndelDetector txt file which needs to be
                        filtered
  -tsn SomeName, --tsampleName SomeName
                        Name of the tumor Sample
  -dp 0, --totaldepth 0
                        Tumor total depth threshold
  -ad 5, --alleledepth 5
                        Tumor allele depth threshold
  -tnr 5, --tnRatio 5   Tumor-Normal variant frequency ratio threshold
  -vf 0.01, --variantfrequency 0.01
                        Tumor variant frequency threshold
  -hvcf hostpot.vcf, --hotspotVcf hostpot.vcf
                        Input bgzip / tabix indexed hotspot vcf file to used
                        for filtering
  -o /somepath/output, --outDir /somepath/output
                        Full Path to the output dir.

MuTect (filter_mutect.py)

  • MuTect version = 1.1.4
  • Takes both the txt and vcf files as input and filters based on the txt input.

usage: filter_mutect.py [options]

Filter SNPS from the output of muTect v1.14

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         make lots of noise
  -ivcf SomeID.vcf, -inputVcf SomeID.vcf
                        Input vcf muTect file which needs to be filtered
  -itxt SomeID.txt, -inputTxt SomeID.txt
                        Input txt muTect file which needs to be filtered
  -tsn SomeName, --tsampleName SomeName
                        Name of the tumor Sample
  -dp 0, --totaldepth 0
                        Tumor total depth threshold
  -ad 5, --alleledepth 5
                        Tumor allele depth threshold
  -tnr 5, --tnRatio 5   Tumor-Normal variant frequency ratio threshold
  -vf 0.01, --variantfrequency 0.01
                        Tumor variant frequency threshold
  -hvcf hostpot.vcf, --hotspotVcf hostpot.vcf
                        Input bgzip / tabix indexed hotspot vcf file to used
                        for filtering
  -o /somepath/output, --outDir /somepath/output
                        Full Path to the output dir.

VarDict (filter_vardict.py)

usage: filter_vardict.py [options]

Filter Indels from the output of vardict

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         make lots of noise
  -i SomeID.vcf, -inputVcf SomeID.vcf
                        Input vcf vardict file which needs to be filtered
  -tsn SomeName, --tsampleName SomeName
                        Name of the tumor Sample
  -dp 0, --totaldepth 0
                        Tumor total depth threshold
  -ad 5, --alleledepth 5
                        Tumor allele depth threshold
  -tnr 5, --tnRatio 5   Tumor-Normal variant frequency ratio threshold
  -vf 0.01, --variantfrequency 0.01
                        Tumor variant frequency threshold
  -hvcf hostpot.vcf, --hotspotVcf hostpot.vcf
                        Input bgzip / tabix indexed hotspot vcf file to used
                        for filtering
  -o /somepath/output, --outDir /somepath/output
                        Full Path to the output dir.

PINDEL (filter_pindel.py)

usage: filter_pindel.py [options]

Filter Indels from the output of pindel

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         make lots of noise
  -i SomeID.vcf, -inputVcf SomeID.vcf
                        Input vcf freebayes file which needs to be filtered
  -tsn SomeName, --tsampleName SomeName
                        Name of the tumor Sample
  -dp 0, --totaldepth 0
                        Tumor total depth threshold
  -ad 5, --alleledepth 5
                        Tumor allele depth threshold
  -tnr 5, --tnRatio 5   Tumor-Normal variant frequency ratio threshold
  -vf 0.01, --variantfrequency 0.01
                        Tumor variant frequency threshold
  -o /somepath/output, --outDir /somepath/output
                        Full Path to the output dir.
  -min 25, --min_var_len 25
                        Minimum length of the Indels
  -max 500, --max_var_len 500
                        Max length of the Indels
  -hvcf hostpot.vcf, --hotspotVcf hostpot.vcf
                        Input bgzip / tabix indexed hotspot vcf file to used
                        for filtering

basicfiltering's People

Contributors

hisplan, lordzappo, rhshah

basicfiltering's Issues

basic filtering

Dear Ronak H,
I used your scripts (https://github.com/rhshah/basicfiltering) to filter variant calls from Pindel (tumor-normal paired). The vcf output of Pindel (pindel2vcf) doesn't have a DP (depth of coverage) field, only AD (allele depth). But the basicfiltering script offers options to filter by DP, AD and VAF, where VAF = AD of each allele / DP. I would like to know whether you use the VAF information and how you extract it.

Thanks for your attention,
Leandro
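
To make the question concrete: when a record carries only AD, one possible way to obtain a depth and a VAF is to sum the per-allele counts. The sketch below assumes the AD value is a comma-separated "ref,alt" pair and that the total depth can be approximated by their sum. The helper name `vaf_from_ad` is hypothetical, and this is not necessarily how filter_pindel.py computes it.

```python
def vaf_from_ad(ad_field):
    """Estimate (depth, VAF) from an AD string such as "15,5".

    Assumptions: the first count is the reference allele, the second is
    the variant allele, and depth is approximated as the sum of counts.
    """
    counts = [int(x) for x in ad_field.split(",")]
    depth = sum(counts)
    if depth == 0:
        # No supporting reads at all: report a frequency of 0.
        return 0, 0.0
    return depth, counts[1] / depth


if __name__ == "__main__":
    depth, vaf = vaf_from_ad("15,5")
    print(depth, vaf)
```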
