Comments (7)

chapmanb avatar chapmanb commented on September 28, 2024

I think that the phase is correct, but I'm happy to adjust it if the GenomeTools folks think otherwise. The GFF spec specifies the phase as 0, 1 or 2, while codon_start from the GenBank file is 1, 2 or 3, so I've shifted the value down by one (codon_start 1 becomes phase 0) in the GFF output when converting. Let me know if your interaction with the GenomeTools developers indicates I've missed something in the conversion.
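
As a rough illustration of the conversion described above (a minimal sketch, not the actual bcbb code), the shift from codon_start to phase could look like this. The function name `codon_start_to_phase` is hypothetical, and the input is assumed to be an integer or a string such as "1", as GenBank qualifier values are often exposed as strings.

```python
def codon_start_to_phase(codon_start):
    """Convert a GenBank codon_start (1, 2 or 3) to a GFF3 phase (0, 1 or 2).

    codon_start counts from 1 and names the first base of the first
    complete codon; phase counts how many bases to skip to reach it.
    """
    value = int(codon_start)  # accepts 1 as well as "1"
    if value not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3, got %r" % (codon_start,))
    return value - 1


if __name__ == "__main__":
    for start in ("1", "2", "3"):
        print(start, "->", codon_start_to_phase(start))
```

Rejecting values outside 1 to 3 is a design choice for this sketch; a converter could equally fall back to "." (no phase) when the qualifier is missing.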

from bcbb.

mercutio22 avatar mercutio22 commented on September 28, 2024

Thanks Brad. I will contact them and will let you know asap.

from bcbb.

mercutio22 avatar mercutio22 commented on September 28, 2024

Hi Brad, perhaps this might be useful for testing your program:

I tried it, and one thing the tool pointed out, for instance, is that the produced GFF3 file has a "source" field. IIRC Peter Cock says in one of his blog posts that GenBank has those but GFF3 does not.

Here is a sample report:

GFF3 File Validation Report

generated: 12-Mar-12 15:27:10

First 10 lines of the analyzed GFF3 file follows:

[line 1]> ##gff-version 3
[line 2]> ##sequence-region NG_017013.1 1 26144
[line 3]> NG_017013.1 annotation remark 1 26144 .
[line 3]> . . comment=REVIEWED%20REFSEQ%3A%20This%20record%20has%20been%20curated%20by%20NCBI%20staff%20in%0Acollaboration%20with%20Graham%20Taylor.%20The%20reference%20sequence%20was%0Aderived%20from%20AC087388.9%20and%20AC007421.13.%0AThis%20sequence%20is%20a%20reference%20standard%20in%20the%20RefSeqGene%20project.%0APublication%20Note%3A%20%20This%20RefSeq%20record%20includes%20a%20subset%20of%20the%0Apublications%20that%20are%20available%20for%20this%20gene.%20Please%20see%20the%20Gene%0Arecord%20to%20access%20additional%20publications.%0ASummary%3A%20This%20gene%20encodes%20tumor%20protein%20p53%2C%20which%20responds%20to%0Adiverse%20cellular%20stresses%20to%20regulate%20target%20genes%20that%20induce%20cell%0Acycle%20arrest%2C%20apoptosis%2C%20senescence%2C%20DNA%20repair%2C%20or%20changes%20in%0Ametabolism.%20p53%20protein%20is%20expressed%20at%20low%20level%20in%20normal%20cells%0Aand%20at%20a%20high%20level%20in%20a%20variety%20of%20transformed%20cell%20lines%2C%20where%0Ait%27s%20believed%20to%20contribute%20to%20transformation%20and%20malignancy.%20p53%0Ais%20a%20DNA-binding%20protein%20containing%20transcription%20activation%2C%0ADNA-binding%2C%20and%20oligomerization%20domains.%20It%20is%20postulated%20to%20bind%0Ato%20a%20p53-binding%20site%20and%20activate%20expression%20of%20downstream%20genes%0Athat%20inhibit%20growth%20and/or%20invasion%2C%20and%20thus%20function%20as%20a%20tumor%0Asuppressor.%20Mutants%20of%20p53%20that%20frequently%20occur%20in%20a%20number%20of%0Adifferent%20human%20cancers%20fail%20to%20bind%20the%20consensus%20DNA%20binding%0Asite%2C%20and%20hence%20cause%20the%20loss%20of%20tumor%20suppressor%20activity.%0AAlterations%20of%20this%20gene%20occur%20not%20only%20as%20somatic%20mutations%20in%0Ahuman%20malignancies%2C%20but%20also%20as%20germline%20mutations%20in%20some%0Acancer-prone%20families%20with%20Li-Fraumeni%20syndrome.%20Multiple%20p53%0Avariants%20due%20to%20alternative%20promoters%20and%20multiple%20alternative%0Aspli
cing%20have%20been%20found.%20These%20variants%20encode%20distinct%20isoforms%2C%0Awhich%20can%20regulate%20p53%20transcriptional%20activity.%20%5Bprovided%20by%0ARefSeq%2C%20Jul%202008%5D.;
[line 3]> sequence_version=1;source=Homo%20sapiens%20%28human%29;
[line 3]> taxonomy=Eukaryota,Metazoa,Chordata,
[line 3]> Craniata,Vertebrata,Euteleostomi,
[line 3]> Mammalia,Eutheria,Euarchontoglires,
[line 3]> Primates,Haplorrhini,Catarrhini,
[line 3]> Hominidae,Homo;keywords=RefSeqGene;
[line 3]> references=location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Marcel%2CV.%2C%20Tran%2CP.L.%2C%20Sagne%2CC.%2C%20Martel-Planche%2CG.%2C%20Vaslin%2CL.%2C%20Teulade-Fichou%2CM.P.%2C%20Hall%2CJ.%2C%20Mergny%2CJ.L.%2C%20Hainaut%2CP.%20and%20Van%20Dyck%2CE.%0Atitle%3A%20G-quadruplex%20structures%20in%20TP53%20intron%203%3A%20role%20in%20alternative%20splicing%20and%20in%20production%20of%20p53%20mRNA%20isoforms%0Ajournal%3A%20Carcinogenesis%2032%20%283%29%2C%20271-278%20%282011%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%2021112961%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Naidu%2CS.R.%2C%20Love%2CI.M.%2C%20Imbalzano%2CA.N.%2C%20Grossman%2CS.R.%20and%20Androphy%2CE.J.%0Atitle%3A%20The%20SWI/SNF%20chromatin%20remodeling%20subunit%20BRG1%20is%20a%20critical%20regulator%20of%20p53%20necessary%20for%20proliferation%20of%20malignant%20cells%0Ajournal%3A%20Oncogene%2028%20%2827%29%2C%202492-2501%20%282009%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%2019448667%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Anczukow%2CO.%2C%20Ware%2CM.D.%2C%20Buisson%2CM.%2C%20Zetoune%2CA.B.%2C%20Stoppa-Lyonnet%2CD.%2C%20Sinilnikova%2CO.M.%20and%20Mazoyer%2CS.%0Atitle%3A%20Does%20the%20nonsense-mediated%20mRNA%20decay%20mechanism%20prevent%20the%20synthesis%20of%20truncated%20BRCA1%2C%20CHK2%2C%20and%20p53%20proteins%3F%0Ajournal%3A%20Hum.%20Mutat.%2029%20%281%29%2C%2065-73%20%282008%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%2017694537%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Bourdon%2CJ.C.%0Atitle%3A%20p53%20Family%20isoforms%0Ajournal%3A%20Curr%20Pharm%20Biotechnol%208%20%286%29%2C%20332-336%20%282007%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%2018289041%0Acomment%3A%20Review%20article,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Murray-Zmijewski%2CF.%2C%20Lane%2CD.P.%20and%20Bourdon%2CJ.C.%0Atitle%3A%20p53/p63/p73%20isoforms%3A%20an%20orchestra%20of%20isoforms%20to%20harmonise%20cell%20differentiation%20and%20response%20to%20stress%0Ajournal%3A%20Cell%20Death%20Differ.%2013%20%286%29%2C%20962-972%20%282006%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%2016601753%0Acomment%3A%20Review%20article,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Flaman%2CJ.M.%2C%20Waridel%2CF.%2C%20Estreicher%2CA.%2C%20Vannier%2CA.%2C%20Limacher%2CJ.M.%2C%20Gilbert%2CD.%2C%20Iggo%2CR.%20and%20Frebourg%2CT.%0Atitle%3A%20The%20human%20tumour%20suppressor%20gene%20p53%20is%20alternatively%20spliced%20in%20normal%20cells%0Ajournal%3A%20Oncogene%2012%20%284%29%2C%20813-818%20%281996%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%208632903%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Lamb%2CP.%20and%20Crawford%2CL.%0Atitle%3A%20Characterization%20of%20the%20human%20p53%20gene%0Ajournal%3A%20Mol.%20Cell.%20Biol.%206%20%285%29%2C%201379-1385%20%281986%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%202946935%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Harlow%2CE.%2C%20Williamson%2CN.M.%2C%20Ralston%2CR.%2C%20Helfman%2CD.M.%20and%20Adams%2CT.E.%0Atitle%3A%20Molecular%20cloning%20and%20in%20vitro%20expression%20of%20a%20cDNA%20clone%20for%20human%20cellular%20tumor%20antigen%20p53%0Ajournal%3A%20Mol.%20Cell.%20Biol.%205%20%287%29%2C%201601-1610%20%281985%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%203894933%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Zakut-Houri%2CR.%2C%20Bienz-Tadmor%2CB.%2C%20Givol%2CD.%20and%20Oren%2CM.%0Atitle%3A%20Human%20p53%20cellular%20tumor%20antigen%3A%20cDNA%20sequence%20and%20expression%20in%20COS%20cells%0Ajournal%3A%20EMBO%20J.%204%20%285%29%2C%201251-1255%20%281985%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%204006916%0Acomment%3A,
[line 3]> location%3A%20%5B0%3A26144%5D%0Aauthors%3A%20Matlashewski%2CG.%2C%20Lamb%2CP.%2C%20Pim%2CD.%2C%20Peacock%2CJ.%2C%20Crawford%2CL.%20and%20Benchimol%2CS.%0Atitle%3A%20Isolation%20and%20characterization%20of%20a%20human%20p53%20cDNA%20clone%3A%20expression%20of%20the%20human%20p53%20gene%0Ajournal%3A%20EMBO%20J.%203%20%2813%29%2C%203257-3262%20%281984%29%0Amedline%20id%3A%20%0Apubmed%20id%3A%206396087%0Acomment%3A;
[line 3]> accessions=NG_017013;data_file_division=PRI;
[line 3]> date=19-FEB-2012;organism=Homo%20sapiens;
[line 3]> gi=293651587
[line 4]> NG_017013.1 feature source 1 26144 . + .
[line 4]> db_xref=taxon%3A9606;mol_type=genomic%20DNA;
[line 4]> organism=Homo%20sapiens;chromosome=17;
[line 4]> map=17p13.1
[line 5]> NG_017013.1 feature gene 1 6475 . - .
[line 5]> note=WD%20repeat%20containing%2C%20antisense%20to%20TP53;
[line 5]> db_xref=GeneID%3A55135,HGNC%3A25522,
[line 5]> MIM%3A612661;gene=WRAP53;gene_synonym=DKCB3%3B%20TCAB1%3B%20WDR79
[line 6]> NG_017013.1 feature mRNA 2845 6475 . - .
[line 6]> db_xref=GI%3A221136857,GeneID%3A55135,
[line 6]> HGNC%3A25522,MIM%3A612661;product=WD%20repeat%20containing%2C%20antisense%20to%20TP53%2C%20transcript%20variant%202;
[line 6]> transcript_id=NM_001143990.1;inference=similar%20to%20RNA%20sequence%2C%20mRNA%20%28same%20species%29%3ARefSeq%3ANM_001143990.1;
[line 6]> exception=annotated%20by%20transcript%20or%20proteomic%20data;
[line 6]> gene=WRAP53;gene_synonym=DKCB3%3B%20TCAB1%3B%20WDR79;
[line 6]> ID=NM_001143990.1
[line 7]> NG_017013.1 feature mRNA 2845 2956 . - .
[line 7]> Parent=NM_001143990.1
[line 8]> NG_017013.1 feature mRNA 3224 3322 . - .
[line 8]> Parent=NM_001143990.1
[line 9]> NG_017013.1 feature mRNA 3467 3898 . - .
[line 9]> Parent=NM_001143990.1
[line 10]> NG_017013.1 feature mRNA 6322 6475 . - .
[line 10]> Parent=NM_001143990.1


Line Number   Error/Warning

4     [ERROR] invalid type (type: source)
7     [ERROR] invalid type pair - check all parents (at line 6; mRNA to mRNA)
12    [ERROR] invalid type pair - check all parents (at line 11; mRNA to mRNA)
17    [ERROR] invalid type pair - check all parents (at line 16; mRNA to mRNA)
22    [ERROR] invalid type pair - check all parents (at line 21; mRNA to mRNA)
26    [ERROR] invalid type pair - check all parents (at line 25; CDS to CDS)
30    [ERROR] invalid type pair - check all parents (at line 29; CDS to CDS)
34    [ERROR] invalid type pair - check all parents (at line 33; CDS to CDS)
38    [ERROR] invalid type pair - check all parents (at line 37; CDS to CDS)
44    [ERROR] invalid type pair - check all parents (at line 43; mRNA to mRNA)
56    [ERROR] invalid type pair - check all parents (at line 55; mRNA to mRNA)
69    [ERROR] invalid type pair - check all parents (at line 68; mRNA to mRNA)
82    [ERROR] invalid type pair - check all parents (at line 81; mRNA to mRNA)
94    [ERROR] invalid type pair - check all parents (at line 93; mRNA to mRNA)
113   [ERROR] invalid type pair - check all parents (at line 112; CDS to CDS)
124   [ERROR] invalid type pair - check all parents (at line 123; CDS to CDS)
135   [ERROR] invalid type pair - check all parents (at line 134; CDS to CDS)
145   [ERROR] invalid type pair - check all parents (at line 144; CDS to CDS)
162   [ERROR] invalid type pair - check all parents (at line 161; CDS to CDS)
171   [ERROR] invalid type pair - check all parents (at line 170; mRNA to mRNA)
180   [ERROR] invalid type pair - check all parents (at line 179; mRNA to mRNA)
189   [ERROR] invalid type pair - check all parents (at line 188; mRNA to mRNA)
206   [ERROR] invalid type pair - check all parents (at line 205; CDS to CDS)
214   [ERROR] invalid type pair - check all parents (at line 213; CDS to CDS)
221   [ERROR] invalid type pair - check all parents (at line 220; CDS to CDS)
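
The "invalid type pair" errors in the report all involve a child feature whose Parent has the same type (mRNA to mRNA, CDS to CDS). A minimal sketch of that kind of check follows. It is not the validator's actual logic: the `ALLOWED_PARENTS` table is a small hypothetical stand-in for the Sequence Ontology relationships, the function names are made up for illustration, and the code assumes tab-separated GFF3 lines in which a parent appears before its children.

```python
# Hypothetical, partial table: child type -> parent types accepted here.
ALLOWED_PARENTS = {
    "mRNA": {"gene"},
    "exon": {"mRNA"},
    "CDS": {"mRNA"},
}


def parse_attributes(column9):
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def find_invalid_type_pairs(gff_lines):
    """Return (line_number, child_type, parent_type) for rejected pairs."""
    id_to_type = {}
    problems = []
    for number, line in enumerate(gff_lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        ftype = cols[2]
        attrs = parse_attributes(cols[8])
        if "ID" in attrs:
            id_to_type[attrs["ID"]] = ftype
        for parent_id in attrs.get("Parent", "").split(","):
            if not parent_id:
                continue
            parent_type = id_to_type.get(parent_id)
            allowed = ALLOWED_PARENTS.get(ftype, set())
            if parent_type is not None and parent_type not in allowed:
                problems.append((number, ftype, parent_type))
    return problems


if __name__ == "__main__":
    sample = [
        "##gff-version 3",
        "NG_017013.1\tfeature\tmRNA\t2845\t6475\t.\t-\t.\tID=NM_001143990.1",
        "NG_017013.1\tfeature\tmRNA\t2845\t2956\t.\t-\t.\tParent=NM_001143990.1",
    ]
    print(find_invalid_type_pairs(sample))
```

The sample reuses two feature lines from the report above (with the long attribute lists trimmed), so the mRNA-to-mRNA pair at line 3 is flagged.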

from bcbb.

chapmanb avatar chapmanb commented on September 28, 2024

Thanks for this. The validator is complaining about 'source' not being present in the Sequence Ontology. Mapping GenBank to SO is a fairly large problem. I tried to tackle this a few years back but it ended up being too much work. Here's the progress I made:

Practically, most tools will not enforce this requirement, so, being unable to map the entire thing, I took the approach of keeping the output GFF similar to the input GenBank. If you wanted to take on a mapping of GenBank to Sequence Ontology, I'd be happy to incorporate it.
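
To make the idea concrete, a partial mapping with a fallback might be sketched as below. This is an illustration only, not the progress referred to above: the `GENBANK_TO_SO` table and the `to_so_type` helper are hypothetical, and the choice of "region" for GenBank's "source" key is an assumption rather than an agreed mapping.

```python
# Hypothetical, deliberately tiny mapping from GenBank feature keys to
# Sequence Ontology term names. "source" -> "region" is an assumption.
GENBANK_TO_SO = {
    "source": "region",
    "gene": "gene",
    "mRNA": "mRNA",
    "CDS": "CDS",
}


def to_so_type(genbank_key, mapping=GENBANK_TO_SO):
    """Return the mapped SO term, or the GenBank key unchanged if unmapped."""
    return mapping.get(genbank_key, genbank_key)


if __name__ == "__main__":
    for key in ("source", "CDS", "misc_feature"):
        print(key, "->", to_so_type(key))
```

Falling back to the original key mirrors the approach of keeping the output GFF close to the input GenBank whenever no mapping is known.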

Is GenomeTools requiring the ontology matches, or just that online validator?

from bcbb.

mercutio22 avatar mercutio22 commented on September 28, 2024

Hi Brad,

> Is GenomeTools requiring the ontology matches, or just that online validator?

Hmm, it seems to be only the validator. GenomeTools seems to complain only about the "phase" field.

I have already posted your considerations on their issue tracker. I will let you know what they say when I get a reply. In any case, thanks for the time you spent looking at my problem.

from bcbb.

chapmanb avatar chapmanb commented on September 28, 2024

Thanks Hugo -- let me know if there ends up being anything I can change on my end to improve the phase information. Hopefully that'll do it and get things working smoothly with GenomeTools. Thanks for your patience with this.

from bcbb.

chapmanb avatar chapmanb commented on September 28, 2024

I'm going to close this to clean up the issues. Hopefully everything was solved on the GenomeTools side. Thanks

from bcbb.
