variantqc's Introduction

VariantQC

Variant quality-checking scripts for discovering and filtering complex indel variants from Pindel-C output. Referenced in "Systematic discovery of complex insertions and deletions in human cancers" (doi:10.1038/nm.4002).

How to run QC

The QC is launched with bsub_qc.sh, which starts the main script, qc_pipeline.sh. The inputs to bsub_qc.sh are described in that file.

Steps

  1. Extract complex insertions and deletions from the Pindel output.
  2. Identify somatic, germline, and loss of heterozygosity (LOH) events.
  3. Filter out low-coverage sites (minimum of 20 reads).
  4. Make an unfiltered VCF for germline, somatic, and LOH events.
  5. Run the readcount tool on the tumor sample. The readcount analysis determines whether somatic and LOH events are appropriately classified (note: not run for germline events).
  6. Run the readcount tool on the normal sample. The readcount analysis determines whether somatic and LOH events are appropriately classified (note: not run for germline events).
  7. Reclassify germline, somatic, and LOH events based on the read count data of somatic events.
  8. Make VCFs from the filtered Pindel output for VEP input, and annotate the final filtered VCF using VEP.
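
The coverage filter in step 3 can be illustrated with a minimal sketch. The record layout, the field names tumor_depth and normal_depth, and the choice to require the minimum in both samples are assumptions made for illustration; the actual logic lives in qc_pipeline.sh and may differ.

```python
from typing import Dict, Iterable, List

# Threshold taken from step 3 ("20 read min").
MIN_READS = 20


def filter_low_coverage(sites: Iterable[Dict], min_reads: int = MIN_READS) -> List[Dict]:
    """Keep only sites whose read depth reaches min_reads.

    Assumption: each site is a dict with hypothetical integer fields
    'tumor_depth' and 'normal_depth', and both must meet the minimum.
    """
    return [
        site
        for site in sites
        if site["tumor_depth"] >= min_reads and site["normal_depth"] >= min_reads
    ]
```

A site with 19 tumor reads would be dropped under the default threshold, while one with exactly 20 reads in both samples would be kept.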

Reyka Jayasinghe ([email protected]) and Steven Foltz ([email protected]).

variantqc's Issues

No HTML output

Hi,

I was trying to run VariantQC on a VEP-annotated VCF with the hg38 GATK FASTA. The tool runs fine with one exception: I can't get the HTML output file.

java -jar DISCVRSeq-1.3.42.jar VariantQC \
  -R Homo_sapiens_assembly38.fasta \
  -V vep_annoated.final.vcf \
  -O VCF_output.html

12:19:49.275 INFO  VariantQC - ------------------------------------------------------------
12:19:49.278 INFO  VariantQC - DISCVR-seq Toolkit v1.3.42
12:19:49.278 INFO  VariantQC - For support and documentation go to https://software.broadinstitute.org/gatk/
12:19:49.278 INFO  VariantQC - Executing as banerjeep3@cn4310 on Linux v4.18.0-425.19.2.el8_7.x86_64 amd64
12:19:49.279 INFO  VariantQC - Java runtime: Java HotSpot(TM) 64-Bit Server VM v17.0.3.1+2-LTS-6
12:19:49.279 INFO  VariantQC - Start Date/Time: July 12, 2023 at 12:19:49 PM EDT
12:19:49.279 INFO  VariantQC - ------------------------------------------------------------
12:19:49.279 INFO  VariantQC - ------------------------------------------------------------
12:19:49.280 INFO  VariantQC - HTSJDK Version: 3.0.5
12:19:49.280 INFO  VariantQC - Picard Version: unknown
12:19:49.280 INFO  VariantQC - Built for Spark Version: unknown
12:19:49.281 INFO  VariantQC - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:19:49.281 INFO  VariantQC - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:19:49.281 INFO  VariantQC - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:19:49.281 INFO  VariantQC - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:19:49.281 INFO  VariantQC - Deflater: IntelDeflater
12:19:49.282 INFO  VariantQC - Inflater: IntelInflater
12:19:49.282 INFO  VariantQC - GCS max retries/reopens: 20
12:19:49.282 INFO  VariantQC - Requester pays: disabled
12:19:49.283 INFO  VariantQC - Initializing engine
12:19:49.753 INFO  FeatureManager - Using codec VCFCodec to read file file:///final.g.vcf
12:19:49.832 INFO  VariantQC - Done initializing engine
12:19:49.842 INFO  VariantQC - Total VariantEval instances: 8
12:19:50.020 INFO  Reflections - Reflections took 135 ms to scan 1 urls, producing 9 keys and 31 values 
12:19:50.106 INFO  Reflections - Reflections took 78 ms to scan 1 urls, producing 7 keys and 42 values 
12:19:50.119 INFO  VariantEvalEngine - Creating 3 combinatorial stratification states
12:19:50.127 INFO  VariantQC - Shutting down engine
[July 12, 2023 at 12:19:50 PM EDT] com.github.discvrseq.walkers.variantqc.VariantQC done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=44810240
org.broadinstitute.hellbender.exceptions.GATKException: The intervals used for this job contain a contig not present in the sequence dictionary: chr1_KI270762v1_alt
	at com.github.discvrseq.walkers.variantqc.Contig.lambda$getContigNames$0(Contig.java:47)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at com.github.discvrseq.walkers.variantqc.Contig.getContigNames(Contig.java:45)
	at com.github.discvrseq.walkers.variantqc.Contig.<init>(Contig.java:26)
	at com.github.discvrseq.walkers.variantqc.ExtendedVariantEvalEngine.createVariantStratifier(ExtendedVariantEvalEngine.java:58)
	at org.broadinstitute.hellbender.tools.walkers.varianteval.VariantEvalEngine.initializeStratificationObjects(VariantEvalEngine.java:812)
	at org.broadinstitute.hellbender.tools.walkers.varianteval.VariantEvalEngine.validateAndInitialize(VariantEvalEngine.java:215)
	at com.github.discvrseq.walkers.variantqc.ExtendedVariantEvalEngine.doValidateAndInitialize(ExtendedVariantEvalEngine.java:37)
	at com.github.discvrseq.walkers.variantqc.ExtendedVariantEvalEngine.<init>(ExtendedVariantEvalEngine.java:28)
	at com.github.discvrseq.walkers.variantqc.VariantQC$VariantEvalWrapper.configureEngine(VariantQC.java:570)
	at com.github.discvrseq.walkers.variantqc.VariantQC.onTraversalStart(VariantQC.java:380)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1096)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:149)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:198)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:217)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
	at com.github.discvrseq.Main.main(Main.java:32)

Thanks
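
One way to investigate an error like the one above is to compare the contig names declared in the VCF header with those in the reference sequence dictionary. The sketch below is only a diagnostic aid and does not establish the cause of this particular failure. The function names are hypothetical, and it assumes the VCF header carries ##contig lines and that the dictionary is a SAM-style .dict file with @SQ records.

```python
import re
from typing import Iterable, List, Set


def vcf_header_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect contig IDs from '##contig=<ID=...>' lines in a VCF header."""
    contigs = set()
    for line in lines:
        if not line.startswith("#"):
            break  # header is over once the first record line appears
        match = re.match(r"##contig=<.*?\bID=([^,>]+)", line)
        if match:
            contigs.add(match.group(1))
    return contigs


def dict_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names from the SN: field of '@SQ' records."""
    contigs = set()
    for line in lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t"):
                if field.startswith("SN:"):
                    contigs.add(field[3:])
    return contigs


def missing_contigs(vcf_lines: Iterable[str], dict_lines: Iterable[str]) -> List[str]:
    """Return contigs declared in the VCF header but absent from the dictionary."""
    return sorted(vcf_header_contigs(vcf_lines) - dict_contigs(dict_lines))
```

In practice the two arguments could be open file handles for an uncompressed VCF and the .dict file that accompanies the reference FASTA; any names returned would be candidates for the contig named in the exception.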

Pindel-C

Where can we download Pindel-C?
Thanks,
Ashiq
