
Comments (11)

tobiasrausch avatar tobiasrausch commented on September 27, 2024 1

Ensembl and UCSC use different feature values. By default, sansa uses "gene" as the feature value (fine for Ensembl) but for UCSC you need to specify "transcript".

sansa annotate -f transcript -g hg38.knownGene.gtf.gz ADB_LL.delly.SV.vcf.gz
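
If it is unclear which value to pass to -f, it may help to tally the third column (the feature type) of the annotation file first. The following is a minimal Python sketch, not part of sansa: the function names `count_features` and `count_features_in_file` and the two inline sample rows are made up for illustration, and it assumes a tab-separated GTF/GFF3 file that may be gzipped.

```python
import gzip
import io
from collections import Counter


def count_features(handle):
    """Tally column 3 (feature type) of a GTF/GFF3 stream, skipping comment lines."""
    counts = Counter()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


def count_features_in_file(path):
    """Open a plain or gzipped annotation file and tally its feature types."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_features(handle)


# Hypothetical sample rows (not taken from a real file): a GTF that has
# "transcript" and "exon" rows but no "gene" rows.
sample = io.StringIO(
    'chr1\tknownGene\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "A"; transcript_id "A";\n'
    'chr1\tknownGene\texon\t11869\t12227\t.\t+\t.\tgene_id "A"; transcript_id "A";\n'
)
sample_counts = count_features(sample)
print(sample_counts)
```

On a real file, `count_features_in_file("hg38.knownGene.gtf.gz")` would return the same kind of tally. If `gene` is absent but `transcript` is present, `-f transcript` is the value to try.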

from sansa.

leehs96 avatar leehs96 commented on September 27, 2024
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	JSH_LN_Lt	ADB_Blood	LMS_Blood	SJH_Blood	JSH_Blood	10847652_NT	12140782_Blood
chr1	63400385	DEL00000000	A	<DEL>	0	LowQual	IMPRECISE;SVTYPE=DEL;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr1;END=209990649;PE=2;MAPQ=37;CT=3to5;CIPOS=-519,519;CIEND=-519,519;RDRATIO=1.0334;SOMATIC	GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV	0/1:-4.58867,0,-44.725:46:PASS:0:18476967:0:-1:8:3:0:0	0/0:0,-8.1278,-162:81:PASS:0:18508703:0:-1:27:0:0:0	0/0:0,-6.02059,-120:60:PASS:0:17239644:0:-1:20:0:0:0	0/0:0,-2.40824,-48:24:PASS:0:16914471:0:-1:8:0:0:0	0/0:0,-4.81643,-93.9999:48:PASS:0:17879783:0:-1:16:0:0:0	0/0:0,-3.91338,-78:39:PASS:0:18005816:0:-1:13:0:0:0	0/0:0,-8.42883,-167.7:84:PASS:0:17091360:0:-1:28:0:0:0
chr1	246232881	DUP00000001	C	<DUP>	.	PASS	IMPRECISE;SVTYPE=DUP;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr1;END=246233246;PE=3;MAPQ=36;CT=5to3;CIPOS=-393,393;CIEND=-393,393;RDRATIO=1.4739;SOMATIC	GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV	0/1:-4.88483,0,-59.5916:49:PASS:3645:12321:3421:3:12:3:0:0	0/0:0,-4.50384,-71.5884:45:PASS:4352:6450:3251:2:15:0:0:0	0/0:0,-1.16884,-9.36473:12:LowQual:1356:1997:332:2:4:0:0:0	0/0:0,-2.40646,-41.0982:24:PASS:3547:10813:2313:4:8:0:0:0	0/0:0,-2.35972,-37.8515:24:PASS:3319:10219:2997:3:8:0:0:0	0/0:0,-5.29267,-88.5741:53:PASS:5360:7338:6727:1:18:0:0:0	0/0:0,-3.57627,-56.4639:36:PASS:3338:4977:3309:1:12:0:0:0

This is what my VCF file looks like, excluding the header region.


tobiasrausch avatar tobiasrausch commented on September 27, 2024

Hi,

Are you using the release version or the latest github code? I think I fixed this already in the latest code.

Thanks, Tobias


leehs96 avatar leehs96 commented on September 27, 2024

Thanks for the fast reply.
I installed sansa with conda, and my sansa version is 0.0.7, which is the release version.

Should I delete it and reinstall from the latest GitHub code?

**********************************************************************
Program: Sansa
This is free software, and you are welcome to redistribute it under
certain conditions (BSD License); for license details use '-l'.
This program comes with ABSOLUTELY NO WARRANTY; for details use '-w'.

Sansa (Version: 0.0.7)
Contact: Tobias Rausch ([email protected])
**********************************************************************

Usage: sansa <command> <arguments>

Commands:

    annotate     annotate VCF file


tobiasrausch avatar tobiasrausch commented on September 27, 2024

Yes, that fix happened after the release of v0.0.7. I have now updated the bioconda version with a new release, v0.0.8.

https://anaconda.org/bioconda/sansa


leehs96 avatar leehs96 commented on September 27, 2024

Hi @tobiasrausch,
thanks for your kindness.

Yes, I re-downloaded the newly updated conda version (0.0.8),
and the error message I mentioned is gone.
I'm really grateful for that.
But I have another issue when I run sansa.

My command:
sansa annotate -g hg38.knownGene.gtf.gz ADB_LL.delly.SV.vcf.gz

And the error message:

[2021-Apr-06 11:40:35] sansa annotate -g hg38.knownGene.gtf.gz ADB_LL.delly.SV.vcf.gz 
[2021-Apr-06 11:40:35] Parse SV annotation database
[2021-Apr-06 11:40:35] Parsed 1 out of 1 VCF/BCF records.
[2021-Apr-06 11:40:35] GTF feature parsing
Error parsing GTF/GFF3/BED file!

My GTF file was downloaded from here:
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/

And my VCF looks like this:

.....header region
##contig=<ID=HLA-DRB1*15:01:01:02,length=11571>
##contig=<ID=HLA-DRB1*15:01:01:03,length=11056>
##contig=<ID=HLA-DRB1*15:01:01:04,length=11056>
##contig=<ID=HLA-DRB1*15:02:01,length=10313>
##contig=<ID=HLA-DRB1*15:03:01:01,length=11567>
##contig=<ID=HLA-DRB1*15:03:01:02,length=11569>
##contig=<ID=HLA-DRB1*16:02:01,length=11005>
##INFO=<ID=RDRATIO,Number=1,Type=Float,Description="Read-depth ratio of tumor vs. normal.">
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic structural variant.">
##bcftools_viewVersion=1.8+htslib-1.8
##bcftools_viewCommand=view /users/hslee/cancer/WGS_hg38/SV/delly/ADB_LL.delly.SV.bcf; Date=Wed Mar 31 00:20:53 2021
#CHROM[1]  POS[2]  ID[3]        REF[4]  ALT[5]  QUAL[6]  FILTER[7]  INFO[8]                                             
chr5       648742  DUP00000003  G       <DUP>   .        LowQual    IMPRECISE;SVTYPE=DUP;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=

thanks in advance again


leehs96 avatar leehs96 commented on September 27, 2024

Oh, I solved this problem by using the Ensembl GFF3.gz format.
I can't figure out why the general GTF format doesn't work, but it works with the GFF3.gz format anyway.

thanks @tobiasrausch
have a nice day!

i'll close this issue


leehs96 avatar leehs96 commented on September 27, 2024

oh, you're right
it runs perfectly
thanks for your advice
and thanks for this beautiful work as well


Sherry520 avatar Sherry520 commented on September 27, 2024

Ensembl and UCSC use different feature values. By default, sansa uses "gene" as the feature value (fine for Ensembl) but for UCSC you need to specify "transcript".

sansa annotate -f transcript -g hg38.knownGene.gtf.gz ADB_LL.delly.SV.vcf.gz
I downloaded the gff3.gz from here: http://ftp.ensemblgenomes.org/pub/plants/release-51/gff3/zea_mays/
and ran the command as follows:
sansa_v0.0.8 annotate -g Zea_mays.Zm-B73-REFERENCE-NAM-5.0.51.gtf.gz ../03-SV/P1_vs_P2.somatic.vcf
I also tried '-f transcript' and a GTF file, but they all give the same error, as follows:

[2021-Dec-17 18:21:06] Parse SV annotation database
[2021-Dec-17 18:21:07] Parsed 13276 out of 13276 VCF/BCF records.
[2021-Dec-17 18:21:07] GTF feature parsing
Error parsing GTF/GFF3/BED file!


tobiasrausch avatar tobiasrausch commented on September 27, 2024

I think the problem is the attribute (-i). It's not called gene_name in your case but gene_id.

sansa annotate -i gene_id -f gene ...
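
To check which attribute key is available for -i, one could list the attribute names used on rows of the chosen feature type. This is a rough Python sketch and not sansa's own parser: `attribute_keys` and the sample row are hypothetical, and the splitting is simplified (it handles GTF-style `key "value";` and GFF3-style `key=value;` attributes, but not semicolons inside quoted values).

```python
import io
import re


def attribute_keys(handle, feature="gene"):
    """Collect attribute key names (column 9) from rows whose column 3 equals `feature`."""
    keys = set()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue
        for item in fields[8].split(";"):
            item = item.strip()
            if item:
                # GTF separates key and value with a space, GFF3 with "=".
                keys.add(re.split(r"[ =]", item, maxsplit=1)[0])
    return keys


# Hypothetical GTF gene row (made-up ID) that carries gene_id but no gene_name.
sample = io.StringIO(
    '1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id "geneA"; gene_biotype "protein_coding";\n'
)
keys = attribute_keys(sample, feature="gene")
print(sorted(keys))
```

If `gene_name` does not show up among the keys for the chosen feature, a key that is present (here `gene_id`) is the one to pass to -i.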


Sherry520 avatar Sherry520 commented on September 27, 2024
