starfish's People

Contributors

egluckthaler

starfish's Issues

Trying to run on my own samples

Hello Emile and group! I am excited to try your program on my Histoplasma datasets, but I have hit a couple of snags. I have solved some of them, but I am currently stuck on this one.

I have been following your tutorial for the test dataset, and the part I am stuck on is the eggNOG annotation sorting.
My genomes were annotated using Funannotate, but I also attempted an eggnog-mapper run, and my output does not look like yours.

The step that cuts the emapper output into a text file for later steps is giving me trouble:

cut -f1,10 ann/*emapper.annotations | grep -v '#' | perl -pe 's/^([^\s]+?)\t([^\|]+).+$/\1\t\2/' > ann/macph6.gene2og.txt

The cut -f1,10 step expects column 10 to be the "bestOG|evalue|score" field, and the perl substitution then keeps only its first portion ("bestOG") when writing the text file. My emapper output has no such column. Is there maybe a flag I am missing to get this specific emapper output?

Funannotate:
GeneID TranscriptID Feature Contig Start Stop Strand Name Product Alias/Synonyms EC_number BUSCO PFAM InterPro EggNog COG GO Terms Secreted Membrane Protease CAZyme Notes gDNA mRNA CDS-transcript Translation
19VMG-15_000001 19VMG-15_000001-T1 mRNA scaffold_1 63793 65897 + hypothetical protein 2.1.1.310 EOG091P0ENB PF01189;PF17125 IPR001678 SAM-dependent methyltransferase RsmB-F/NOP2-type domain;IPR011023 Nop2p;IPR018314 Bacterial Fmu (Sun)/eukaryotic nucleolar NOL1/Nop2p, conserved site;IPR023267 RNA (C5-cytosine) methyltransferase;IPR023273 RNA (C5-cytosine) methyltransferase, NOP2;IPR029063 S-adenosyl-L-methionine-dependent methyltransferase superfamily;IPR031341 Ribosomal RNA small subunit methyltransferase F, N-terminal ENOG503NUZ7 L:(L) Replication, recombination and repair GO_component: GO:0005730 - nucleolus [Evidence IEA];GO_function: GO:0008757 - S-adenosylmethionine-dependent methyltransferase activity [Evidence IEA];GO_function: GO:0008168 - methyltransferase activity [Evidence IEA];GO_function: GO:0003723 - RNA binding [Evidence IEA];GO_function: GO:0009383 - rRNA (cytosine-C5-)-methyltransferase activity [Evidence IEA];GO_process: GO:0006396 - RNA processing [Evidence IEA];GO_process: GO:0001510 - RNA methylation [Evidence IEA];GO_process: GO:0070475 - rRNA base methylation [Evidence IEA];GO_process: GO:0000470 - maturation of LSU-rRNA [Evidence IEA]

My eggnog-mapper:
#query seed_ortholog evalue score eggNOG_OGs max_annot_lvl COG_category Description Preferred_name GOs EC KEGG_ko KEGG_Pathway KEGG_Module KEGG_Reaction KEGG_rclass BRITE KEGG_TC CAZy BiGG_Reaction PFAMs
01_16_1 5037.XP_001538943.1 0 3312 COG1020@1|root,KOG1178@2759|Eukaryota,38V1T@33154|Opisthokonta,3Q08R@4751|Fungi,3R2AF@4890|Ascomycota,20DTP@147545|Eurotiomycetes,3AZS6@33183|Onygenales 4751|Fungi Q Condensation domain - - - ko:K22152 - - - ko00000 - - - AMP-binding,Condensation,PP-binding

Your eggnog-mapper:
#query_name seed_eggNOG_ortholog seed_ortholog_evalue seed_ortholog_score predicted_gene_name GO_terms KEGG_pathways Annotation_tax_scope OGs bestOG|evalue|score COG cat eggNOG annot
mp040_13792 441959.XP_002479979.1 5.00E-71 232.2 FG02084.1 GO:0005575,GO:0005623,GO:0005886,GO:0006810,GO:0008150,GO:0015886,GO:0016020,GO:0016021,GO:0031224,GO:0044425,GO:0044464,GO:0044699,GO:0044765,GO:0051179,GO:0051181,GO:0051234,GO:0071702,GO:0071705,GO:0071944,GO:1901678 fuNOG[21] 03RFZ@ascNOG,0IWNB@euNOG,0M5FN@euroNOG,0MQ5B@eurotNOG,0PN5J@fuNOG,11Q0B@NOG,13QP8@opiNOG 0PN5J|7.70053432588e-86|288.860290527 S RTA1 domain protein

Any guidance would be appreciated!

-Tania
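
For readers hitting the same mismatch: the newer eggnog-mapper output pasted above has no "bestOG|evalue|score" column, but it does list all orthologous groups in column 5 (eggNOG_OGs) and the annotation level in column 6 (max_annot_lvl). The following is only a sketch of one way to build a two-column gene-to-OG file from that layout. It is not taken from the starfish documentation. The function name emapper_to_gene2og is hypothetical, and the choice of which OG to keep (the one whose level matches max_annot_lvl, falling back to the last listed OG) is an assumption that may not match what starfish expects.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of starfish or eggnog-mapper).
# Assumes a tab-separated eggnog-mapper file where
#   column 1 = query, column 5 = eggNOG_OGs, column 6 = max_annot_lvl.
emapper_to_gene2og() {
    # $1: path to an *.emapper.annotations file
    awk -F'\t' '
        /^#/ { next }                      # skip header and comment lines
        $5 == "" || $5 == "-" { next }     # skip queries without an OG assignment
        {
            n = split($5, ogs, ",")        # e.g. "COG1020@1|root,...,3Q08R@4751|Fungi,..."
            pick = ogs[n]                  # assumption: fall back to the last listed OG
            for (i = 1; i <= n; i++) {
                at = index(ogs[i], "@")
                # prefer the OG whose taxonomic level equals max_annot_lvl
                if (substr(ogs[i], at + 1) == $6) { pick = ogs[i]; break }
            }
            split(pick, parts, "@")        # keep only the OG identifier before "@"
            print $1 "\t" parts[1]
        }
    ' "$1"
}
```

It could be called as emapper_to_gene2og ann/sample.emapper.annotations > ann/sample.gene2og.txt, where both paths are placeholders. Whether the resulting OG identifiers suit the later tutorial steps should be confirmed with the authors.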

Typo in step-by-step tutorial

I believe that the annotate command in the step-by-step tutorial has a typo:

starfish annotate -T 2 -x macpha6_tyr -a ome2assembly.txt -g macpha6.gff3 -p ../database/YRsuperfams.p1-512.hmm -P ../database/YRsuperfamRefs.faa -i tyr -o geneFinder/

The -g parameter should be ome2gff.txt instead of macpha6.gff3.

Thanks for the fantastic programme!

typo in install instructions

From: https://github.com/egluckthaler/starfish/wiki/Installation

When replacing the cnef.cc file, your instructions say:
cp ../scripts/cneff.cc .

but it should say:
cp ../scripts/cnef.cc .

I guess this is a bit of a dangerous error, and it might pop up due to other install errors. Since the original file and the new file have the same name, it is hard to distinguish between them.

I guess one option would be to rename your cnef.cc to cnef_starfish.cc and then add a sed command for the Makefile. Something like this?
sed -i 's/cnef\.cc/cnef_starfish.cc/' Makefile
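
The rename suggested above could be sketched as a small shell function. This is only an illustration of the idea, not the project's install procedure. The function name use_starfish_cnef is hypothetical, and it assumes that the Makefile refers to the source file literally as cnef.cc, which has not been checked against the real Makefile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: install the starfish copy of cnef.cc under a distinct
# name and point the Makefile at it, so the two files cannot be confused.
use_starfish_cnef() {
    # $1: path to the starfish version of cnef.cc
    # $2: build directory that contains the Makefile
    cp "$1" "$2/cnef_starfish.cc" || return 1
    # -i.bak writes a Makefile.bak backup and works with GNU and BSD sed;
    # the escaped dot matches a literal "." only.
    sed -i.bak 's/cnef\.cc/cnef_starfish.cc/g' "$2/Makefile"
}
```

A call such as use_starfish_cnef ../scripts/cnef.cc . would mirror the cp step from the install instructions. A mistyped source path then fails at the cp step instead of silently leaving the original file in place.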
