Comments (3)
- The GO enrichment compares a set of genes of interest to all the genes present in the transcriptome. These genes of interest are what the significant_ids.txt file refers to. They can be genes whose expression level differs between conditions and for which you want to know whether they are enriched for some GO terms.
- The wanted_transcripts.ids file contains one transcript name per line. The annotation.tsv file is the result of annotating the transcripts with the GO database. However, these names are not what I am expecting. If you posted the first 20 lines of each of the files, it would help answer your questions.
- Same thing. Please post the first 20 lines of this file.
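To make the comparison of "genes of interest" against "all genes" concrete, here is a minimal sketch of a per-term over-representation test using a hypergeometric tail probability. It is only an illustration under stated assumptions: the function names and the toy gene-to-term mapping are hypothetical, and this is not necessarily the statistic the go_enrichment scripts compute.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_all, n_all_with_term, n_sig, n_sig_with_term):
    """Probability of seeing at least n_sig_with_term annotated genes when
    drawing n_sig genes without replacement from n_all genes, of which
    n_all_with_term carry the GO term."""
    max_k = min(n_sig, n_all_with_term)
    total = comb(n_all, n_sig)
    tail = sum(
        comb(n_all_with_term, k) * comb(n_all - n_all_with_term, n_sig - k)
        for k in range(n_sig_with_term, max_k + 1)
    )
    return tail / total


def term_enrichment(all_ids, significant_ids, gene_to_terms):
    """Return {GO term: p-value} for every term seen in the genes of interest.

    gene_to_terms is a hypothetical dict mapping a gene name to a set of GO IDs.
    """
    background = set(all_ids)
    # Genes of interest that are not in the background are ignored.
    interest = set(significant_ids) & background
    terms = {t for g in interest for t in gene_to_terms.get(g, ())}
    results = {}
    for term in terms:
        with_term = {g for g in background if term in gene_to_terms.get(g, ())}
        results[term] = hypergeom_enrichment_pvalue(
            len(background), len(with_term), len(interest), len(with_term & interest)
        )
    return results


# Toy data (made up for illustration): 10 genes, 3 of which carry GO:0005634.
all_ids = [f"g{i}" for i in range(10)]
gene_to_terms = {
    "g0": {"GO:0005634"},
    "g1": {"GO:0005634"},
    "g2": {"GO:0005634"},
    "g3": {"GO:0008270"},
}
significant_ids = ["g0", "g1", "g2"]

pvalues = term_enrichment(all_ids, significant_ids, gene_to_terms)
# All 3 annotated genes are in the set of interest, so p = 1/120.
print(pvalues)
```

In a real analysis the p-values would also need a multiple-testing correction, since one test is run per GO term.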
from go_enrichment.
Thank you for your reply.
Here is my step 3 code:
$PATHON $SCRIPTS/03_annotate_genes.py $SEQUENCE_FILE $ANNOTATION_FOLDER sequence_annotation.txt
and I get a sequence_annotation.txt like this:
Name Accession Fullname Altnames Pfam GO CellularComponent Molecular Function Biological Process
NN01g00001.1 locus=Chr01:150058:150620:- Q15KI9 Q0WV21 Q15KJ0 Q9CAB1 Q9CAB2 Protein PHYLLO, chloroplastic PF00561;PF13378;PF02775;PF16582;PF02776; GO:0031969;GO:0016021;GO:0070204;GO:0070205;GO:0046872;GO:0043748;GO:0030976;GO:0009063;GO:0009234;GO:0042550;GO:0042372; C:chloroplast membrane; C:integral component of membrane; F:2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylic-acid synthase activity; F:2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase activity; F:metal ion binding; F:O-succinylbenzoate synthase activity; F:thiamine pyrophosphate binding; P:cellular amino acid catabolic process; P:menaquinone biosynthetic process; P:photosystem I stabilization; P:phylloquinone biosynthetic process;
NN01g00002.1 locus=Chr01:238607:249703:+ Q9SZL8 Protein FAR1-RELATED SEQUENCE 5 PF03101;PF10551;PF04434; GO:0005634;GO:0008270;GO:0006355; C:nucleus; F:zinc ion binding; P:regulation of transcription, DNA-templated;
NN01g00003.1 locus=Chr01:258602:264467:- Q8GX93 O65486 Q93XN4 Q9SVX1 Chloride channel protein CLC-e CBS domain-containing protein CBSCLC3; PF00571;PF00654; GO:0034707;GO:0009535;GO:0005247;GO:0034765; C:chloride channel complex; C:chloroplast thylakoid membrane; F:voltage-gated chloride channel activity; P:regulation of ion transmembrane transport;
But in the following steps, as you described in go_enrichment/01_scripts, the input files are significant_ids.txt, all_ids.txt, all_go_annotations.csv and go_enrichment.csv. As you said above, significant_ids.txt is the set of genes I am interested in and all_ids.txt is all the genes. What are all_go_annotations.csv and go_enrichment.csv? What is the relation between them and the step 3 result file sequence_annotation.txt?
Here is how I run this step:
./01_scripts/03_annotate_genes.py 03_sequences/analyzed_genes.fasta \
05_annotations/ all_annotations_transcripts.tsv
The all_annotations_transcripts.tsv file is the output file name.
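For a quick check of which GO terms ended up attached to each sequence in an annotation file like the one posted above, something along these lines might help. It is only a sketch and not the project's own parser: the function name is hypothetical, and it assumes that the first whitespace-separated token on each line is the sequence name, that the header line starts with "Name", and that GO identifiers have the usual form of "GO:" followed by seven digits.

```python
import re

# GO identifiers look like GO:0005634 (prefix plus seven digits).
GO_ID = re.compile(r"GO:\d{7}")


def go_ids_per_sequence(lines):
    """Map each sequence name to the list of GO IDs found on its line.

    Assumes the first whitespace-separated token is the sequence name and
    that a line starting with "Name" is the header (both are assumptions).
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("Name"):
            continue
        name = line.split()[0]
        mapping[name] = GO_ID.findall(line)
    return mapping


# One line copied from the sequence_annotation.txt excerpt in this thread.
example = [
    "Name Accession Fullname Altnames Pfam GO CellularComponent Molecular Function Biological Process",
    "NN01g00002.1 locus=Chr01:238607:249703:+ Q9SZL8 Protein FAR1-RELATED SEQUENCE 5 "
    "PF03101;PF10551;PF04434; GO:0005634;GO:0008270;GO:0006355; C:nucleus; "
    "F:zinc ion binding; P:regulation of transcription, DNA-templated;",
]

for name, go_ids in go_ids_per_sequence(example).items():
    print(name, ";".join(go_ids))
```

Searching the whole line with a regular expression sidesteps the question of exactly where the column boundaries fall, which is hard to tell from a pasted excerpt.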