dellytools / sansa
Structural variant VCF annotation, duplicate removal and comparison
License: BSD 3-Clause "New" or "Revised" License
Thank you for providing the convenient tool "dellytools." Regarding the interpretation of the annotation results, I have some questions and would like to confirm them with you further.
Sincerely yours,
Clarence
Hi,
First, thank you for providing this beautiful program.
When I run sansa annotate on my VCF.gz (produced with delly call and delly filter) with hg38.gtf for gene_id annotation,
the following error is thrown:
[2021-Mar-31 14:30:16] sansa annotate -i gene_id -g /users/hslee/ref/hg38/hg38.refGene.gtf.gz JSH_LN_Lt.delly.SV.vcf.gz
[2021-Mar-31 14:30:17] Parse SV annotation database
[2021-Mar-31 14:30:17] Parsed 2 out of 2 VCF/BCF records.
[2021-Mar-31 14:30:17] BED feature parsing
terminate called after throwing an instance of 'boost::wrapexcept<boost::bad_lexical_cast>'
what(): bad lexical cast: source type value could not be interpreted as target
stopped (core dumped)
My VCF file seems intact, so I can't figure it out by myself.
What can I do?
Thanks in advance.
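One possible cause, not confirmed anywhere in this thread, is a non-numeric value in a coordinate column of the annotation file, since the exception is raised while features are being parsed. The sketch below is a hypothetical diagnostic helper (`find_bad_coordinates` is not part of sansa) that lists lines whose start or end field is not an integer. It assumes a tab-delimited GTF layout, where start and end are the fourth and fifth columns.

```python
import gzip


def find_bad_coordinates(path, start_col=3, end_col=4, limit=10):
    """Return (line_number, line) pairs whose start/end fields are not integers.

    Hypothetical helper for inspecting an annotation file; the column
    indices default to the GTF layout (0-based columns 3 and 4) and are
    an assumption about the input, not something sansa reports.
    """
    opener = gzip.open if path.endswith(".gz") else open
    bad = []
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle, start=1):
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            try:
                int(fields[start_col])
                int(fields[end_col])
            except (IndexError, ValueError):
                bad.append((line_no, line.rstrip("\n")))
                if len(bad) >= limit:
                    break
    return bad
```

Calling `find_bad_coordinates` on the `.gtf.gz` file returns at most `limit` offending lines. An empty list would suggest the coordinates are fine and the problem lies elsewhere.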
Hi!
I was annotating the SV calls from Parliament2 when I realized that the Insertion events have 0 length in the tabulated output. Is there a way that we can add the Average length from the SV VCF?
Here is a minimal example
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT T8B
chr1 150880 DEL0011SUR N <DEL> 6 PASS SUPP=1;SUPP_VEC=00001;AVGLEN=66;SVTYPE=DEL;SVMETHOD=SURVIVORv2;CHR2=chr1;END=150946;CIPOS=0,0;CIEND=0,0;STRANDS=+-;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 295428 TRA00406SUR N N[chr19:295428[> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=100000;SVTYPE=BND;SVMETHOD=SURVIVORv2;CHR2=chr19;END=295428;CIPOS=0,0;CIEND=0,0;STRANDS=++;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 341697 INS0013SUR N <INS> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=59;SVTYPE=INS;SVMETHOD=SURVIVORv2;CHR2=chr1;END=341697;CIPOS=0,0;CIEND=0,0;STRANDS=+-;CALLERS=MANTA GT:SP 1/1:MANTA
chr1 341785 TRA0014SUR N N[chr7:72261928[> . PASS SUPP=1;SUPP_VEC=00001;AVGLEN=100000;SVTYPE=BND;SVMETHOD=SURVIVORv2;CHR2=chr7;END=72261928;CIPOS=0,0;CIEND=0,0;STRANDS=++;CALLERS=MANTA GT:SP 0/1:MANTA
chr1 380129 DEL0015SUR N <DEL> 6 PASS SUPP=4;SUPP_VEC=10111;AVGLEN=6001;SVTYPE=DEL;SVMETHOD=SURVIVORv2;CHR2=chr1;END=386165;CIPOS=-11,116;CIEND=-12,0;STRANDS=+-;CALLERS=BREAKDANCER,DELLY,LUMPY,MANTA GT:SP 1/1:BREAKDANCER,DELLY,LUMPY,MANTA
[1]ANNOID query.chr query.start query.chr2 query.end query.id query.qual query.svtype query.ct query.svlen query.startfeature query.endfeature query.containedfeature Gene Fusion
id000000004 chr1 150880 chr1 150946 DEL0011SUR 6 DEL 3to5 66 NA NA NA False
id000000005 chr1 295428 chr19 295428 TRA00406SUR 0 BND 3to3 0 NA NA NA False
id000000006 chr1 341697 chr1 341697 INS0013SUR 0 INS NtoN 0 NA NA NA False
id000000007 chr1 341785 chr7 72261928 TRA0014SUR 0 BND 3to3 0 NA NA NA False
id000000008 chr1 380129 chr1 386165 DEL0015SUR 6 DEL 3to5 6036 LOC100685782(0;-) LOC100685782(0;-) NA False
The last column is just my label to see if genes are fusing or not, so that can be ignored.
Would it be possible to add this average length of insertion to the SANSA annotation?
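Until such a column exists, the length could be recovered from the VCF itself. The following is a minimal sketch, not a sansa feature: `parse_info` and `insertion_length` are hypothetical helpers that read the `AVGLEN` key from the INFO field of an `INS` record. It assumes whitespace-separated columns as in the example above (real VCF files are tab-delimited, which `split()` also handles) and tolerates a decimal or negative `AVGLEN`, which is an assumption about how other SURVIVOR versions may write the value.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def insertion_length(vcf_line):
    """Return AVGLEN as a non-negative int for an INS record, else None.

    Hypothetical helper: header lines, non-INS records, and records
    without an AVGLEN key all yield None.
    """
    if vcf_line.startswith("#"):
        return None
    fields = vcf_line.split()
    if len(fields) < 8:
        return None
    info = parse_info(fields[7])  # INFO is the eighth VCF column
    if info.get("SVTYPE") != "INS" or "AVGLEN" not in info:
        return None
    return abs(int(float(info["AVGLEN"])))
```

The returned value could then be joined to the sansa table on `query.id`, which carries the VCF ID (for example `INS0013SUR`) in the output shown above.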
I used delly and sansa to build a structural variation analysis pipeline and attempted to evaluate its accuracy with NCCL validation data.
Delly detected the expected mutation, but I'm a bit surprised to see a 1 bp difference in the representation of the breakpoints.
answer:
the sansa result:
[1]ANNOID | query.chr | query.start | query.chr2 | query.end | query.id | query.qual | query.svtype | query.ct | query.svlen | query.startfeature | query.endfeature |
---|---|---|---|---|---|---|---|---|---|---|---|
None | 7 | 55266405 | 7 | 92462404 | INV00002766 | 7260 | INV | 3to3 | 37195999 | EGFR(0;+) | CDK6(0;-) |
None | 22 | 23632600 | 9 | 133729450 | BND00012515 | 10000 | BND | 5to3 | 0 | BCR(0;+) | ABL1(0;+) |
in delly.bcf
22 23632600 BND00012515 A ]9:133729450]A 10000 PASS PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv1.1.6;END=23632601;CHR2=9;POS2=133729450;PE=431;MAPQ=60;CT=5to3;CIPOS=-3,3;CIEND=-3,3;SRMAPQ=60;INSLEN=0;HOMLEN=3;SR=56;SRQ=1;CONSENSUS=AGAATAAAACTAATTTTTTCTCCCAATTTTCTCTTCCTTTTTCTTTTTTCTGTTCCCCCCTTTCTCTTCCAGAGTAAGTACTGGTTTGGGGAGGAGGGTTGCAGCGGCCGAGCCAGGGTCTCCACCCAGGAAGGACTCATCGGGCAGGGTGTGGGGAA;CE=1.97658;RDRATIO=1;SOMATIC GT:GL:GQ:FT:RCL:RC:RCR:RDCN:DR:DV:RR:RV 0/1:-253.075,0,-455.964:10000:PASS:222980:329282:106302:2:748:431:155:95 0/0:0,-75.2007,-857.843:10000:PASS:221127:323350:102223:2:1302:0:250:0
7 55266405 INV00002766 C <INV> 7260 PASS PRECISE;SVTYPE=INV;SVMETHOD=EMBL.DELLYv1.1.6;END=92462404;PE=101;MAPQ=60;CT=3to3;CIPOS=-2,2;CIEND=-2,2;SRMAPQ=60;INSLEN=0;HOMLEN=1;SR=20;SRQ=1;CONSENSUS=TAATGATGACTAAAGCAAGGGATTGTGATTGTTCATTCATGATCCCACTGCCTTCTTTTCTTGCTTCATCCTCGTGAGCCAGGGAGCTGCGCCCTCGCCATCTGGGGCCTCGCGCGCG;CE=1.97084;RDRATIO=0.997411;SOMATIC GT:GL:GQ:FT:RCL:RC:RCR:RDCN:DR:DV:RR:RV 0/1:-193.272,0,-528.756:10000:PASS:197626:345940:220737:2:577:101:173:77 0/0:0,-75.2167,-873.159:10000:PASS:197576:346736:220664:2:768:0:250:0
Hi,
Is it possible to specify the output location and format in sansa?
I am getting the output BCF file in the current directory; ideally, I would like a tab-delimited table for examining fusion candidates, like the one in the README.md.
The command I used is below:
Thanks,
Jeff
Dear Sansa team:
Good morning,
Some deletions also showed a very strange result: the deleted sequence spans almost 50% of the chromosome.
Is there any connection between the warning message and the result?
Thank you very much.
Best
Clarence