Comments (3)

enormandeau commented on June 7, 2024
  1. The GO enrichment compares a set of genes of interest to all the genes present in the transcriptome. These genes of interest are what the significant_ids.txt file refers to. They can be, for example, genes whose expression level differs between conditions and that you want to test for enrichment in particular GO terms.

  2. The wanted_transcripts.ids file contains one transcript name per line. The annotation.tsv file is the result of annotating the transcripts with the GO database. However, these names are not what I am expecting. If you posted the first 20 lines of each of the files it would help answer your questions.

  3. Same thing. Please post the first 20 lines of this file.
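
To make point 1 concrete, here is a minimal sketch of what "comparing a set of genes of interest to all the genes" can mean statistically. It uses a one-sided hypergeometric test, which is a common choice for enrichment but is an assumption here, not necessarily what go_enrichment runs internally. The function names (`hypergeom_pvalue`, `go_enrichment_sketch`) and the `gene_to_go` mapping are hypothetical and only serve the illustration.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes when
    drawing n genes from N, of which K carry the GO term."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def go_enrichment_sketch(significant_ids, all_ids, gene_to_go):
    """Return {go_term: (k, K, p_value)} for terms seen in the genes of interest.

    significant_ids: genes of interest (as in significant_ids.txt)
    all_ids:         every gene in the transcriptome (the background)
    gene_to_go:      hypothetical dict mapping gene name -> set of GO IDs
    """
    background = set(all_ids)
    interest = set(significant_ids) & background
    terms = set().union(*(gene_to_go.get(g, set()) for g in interest))

    results = {}
    for term in terms:
        # K: background genes with the term, k: genes of interest with the term
        K = sum(1 for g in background if term in gene_to_go.get(g, set()))
        k = sum(1 for g in interest if term in gene_to_go.get(g, set()))
        results[term] = (k, K, hypergeom_pvalue(k, len(interest), K, len(background)))
    return results


# Toy data, invented for the example
toy_annotations = {
    "g1": {"GO:0005634"},
    "g2": {"GO:0005634"},
    "g3": {"GO:0008270"},
}
toy_result = go_enrichment_sketch(["g1", "g2"], ["g1", "g2", "g3", "g4"], toy_annotations)
```

A real analysis would also correct the p-values for multiple testing, which this sketch leaves out.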

shiyi-pan commented on June 7, 2024

Thank you for your reply.
Here is my step 3 code:
$PATHON $SCRIPTS/03_annotate_genes.py $SEQUENCE_FILE $ANNOTATION_FOLDER sequence_annotation.txt
and I get a sequence_annotation.txt like this:

Name Accession Fullname Altnames Pfam GO CellularComponent Molecular Function Biological Process
NN01g00001.1 locus=Chr01:150058:150620:- Q15KI9 Q0WV21 Q15KJ0 Q9CAB1 Q9CAB2 Protein PHYLLO, chloroplastic PF00561;PF13378;PF02775;PF16582;PF02776; GO:0031969;GO:0016021;GO:0070204;GO:0070205;GO:0046872;GO:0043748;GO:0030976;GO:0009063;GO:0009234;GO:0042550;GO:0042372; C:chloroplast membrane; C:integral component of membrane; F:2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylic-acid synthase activity; F:2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase activity; F:metal ion binding; F:O-succinylbenzoate synthase activity; F:thiamine pyrophosphate binding; P:cellular amino acid catabolic process; P:menaquinone biosynthetic process; P:photosystem I stabilization; P:phylloquinone biosynthetic process;
NN01g00002.1 locus=Chr01:238607:249703:+ Q9SZL8 Protein FAR1-RELATED SEQUENCE 5 PF03101;PF10551;PF04434; GO:0005634;GO:0008270;GO:0006355; C:nucleus; F:zinc ion binding; P:regulation of transcription, DNA-templated;
NN01g00003.1 locus=Chr01:258602:264467:- Q8GX93 O65486 Q93XN4 Q9SVX1 Chloride channel protein CLC-e CBS domain-containing protein CBSCLC3; PF00571;PF00654; GO:0034707;GO:0009535;GO:0005247;GO:0034765; C:chloride channel complex; C:chloroplast thylakoid membrane; F:voltage-gated chloride channel activity; P:regulation of ion transmembrane transport;

But in the following steps, as you described in go_enrichment/01_scripts, the input files are significant_ids.txt, all_ids.txt, all_go_annotations.csv and go_enrichment.csv. As you said above, significant_ids.txt is the set of genes I am interested in and all_ids.txt is all the genes. What are all_go_annotations.csv and go_enrichment.csv? What is the relation between them and the step 3 result file sequence_annotation.txt?
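
For reference, the GO identifiers in rows like the ones pasted above can be pulled out per transcript with a few lines of Python. This is only a sketch: it assumes that the first whitespace-separated token on each row is the transcript name and that GO IDs look like `GO:` followed by seven digits, and it makes no claim about the format that all_go_annotations.csv is expected to have. The function name `go_terms_by_transcript` is hypothetical.

```python
import re

# GO identifiers as they appear in the pasted rows, e.g. GO:0005634
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_by_transcript(lines):
    """Map each transcript name to the list of GO IDs found on its row."""
    result = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the header row, which starts with "Name"
        if not line or line.startswith("Name"):
            continue
        name = line.split()[0]  # assumption: the name is the first token
        result[name] = GO_ID.findall(line)
    return result


sample_rows = [
    "Name Accession Fullname Altnames Pfam GO CellularComponent Molecular Function Biological Process",
    "NN01g00002.1 locus=Chr01:238607:249703:+ Q9SZL8 Protein FAR1-RELATED SEQUENCE 5 "
    "PF03101;PF10551;PF04434; GO:0005634;GO:0008270;GO:0006355; C:nucleus; "
    "F:zinc ion binding; P:regulation of transcription, DNA-templated;",
]
sample_terms = go_terms_by_transcript(sample_rows)
```

To run it on the real file, pass an open file handle instead of `sample_rows`, for example `go_terms_by_transcript(open("sequence_annotation.txt"))`.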

enormandeau commented on June 7, 2024

Here is how I run this step:

./01_scripts/03_annotate_genes.py 03_sequences/analyzed_genes.fasta \
    05_annotations/ all_annotations_transcripts.tsv

The all_annotations_transcripts.tsv file is the output file name.
