Comments (4)
Hi Callum,
Thank you for your question. Have you checked the transcript novelty column? Novel genes are rather rare in the human data so I would not be too surprised not to have a novel gene in 10e4 reads.
from talon.
Sorry, my mistake: after scrolling all the way through the file I can see novel TALON annotations. Does the script do some sort of sorting of the results, based on annotation hits rather than the order of the reads in the SAM file?
Perhaps I see too many. I did run the reads through TranscriptClean but skipped the splice site correction, so it is probably better to do that.
The abundance file does tend to list the known annotations first.
I would recommend looking at how many transcripts you get in each novelty category, and going from there. Without filtering the novel transcripts for reproducibility across multiple datasets, you might reasonably expect to get a higher rate of novelty in the output. If these are mainly incomplete splice match (ISM) and novel in catalog (NIC) transcripts, then TranscriptClean splice site correction won't have a big impact, since these transcripts contain known splice donors and acceptors. The correction is mainly targeted at fixing artifactual novel not in catalog (NNC) transcripts.
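As a quick way to get those per-category counts without plotting, here is a minimal Python sketch. It assumes the abundance file is tab-separated with one row per transcript model and a column named `transcript_novelty`; check the header of your own file, since the column name is an assumption here. The function name and the tiny in-memory table are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_transcripts_by_novelty(handle, column="transcript_novelty"):
    """Count rows (one per transcript model) in each novelty category.

    `column` is assumed to be the name of the novelty column in the
    abundance file header; adjust it if your file differs.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[column] for row in reader)


# Tiny made-up table standing in for a real abundance file.
example = io.StringIO(
    "transcript_ID\ttranscript_novelty\n"
    "1\tKnown\n"
    "2\tKnown\n"
    "3\tISM\n"
    "4\tNNC\n"
)

counts = count_transcripts_by_novelty(example)
for category, n in counts.most_common():
    print(f"{category}\t{n}")
```

To run it on a real file, pass an open file handle instead of the in-memory example, e.g. `open("my_talon_abundance.tsv", newline="")`, where the file name is a placeholder.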
Note: This script will plot the number of distinct transcript models detected for each novelty category, and the number of reads that were annotated to each novelty category:
https://github.com/dewyman/TALON-paper-2019/blob/master/analysis_scripts/plot_novelty_categories_from_abd_table.R
Here are instructions to run it:
Rscript plot_novelty_categories_from_abd_table.R --h
Options:
    --f=F
        TALON abundance file
    --datasets=DATASETS
        Optional: Comma-separated list of datasets to include. Default is to use all datasets in the file
    -o OUTDIR, --outdir=OUTDIR
        Output directory for plots and outfiles
    -h, --help
        Show this help message and exit
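If you only want the read totals per novelty category as numbers rather than a plot, a rough Python equivalent is sketched below. This is not the R script's implementation. It assumes the abundance file has a `transcript_novelty` column plus one integer read-count column per dataset, and that you pass the dataset column names yourself (similar in spirit to `--datasets`). The function name, dataset names, and sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def reads_by_novelty(handle, datasets, column="transcript_novelty"):
    """Sum read counts over the given dataset columns per novelty category."""
    totals = defaultdict(int)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Each dataset column is assumed to hold an integer read count.
        totals[row[column]] += sum(int(row[name]) for name in datasets)
    return dict(totals)


# Made-up rows with two hypothetical dataset columns.
example = io.StringIO(
    "transcript_ID\ttranscript_novelty\tdatasetA\tdatasetB\n"
    "1\tKnown\t10\t5\n"
    "2\tISM\t2\t0\n"
    "3\tKnown\t1\t1\n"
)

totals = reads_by_novelty(example, ["datasetA", "datasetB"])
for category, n in sorted(totals.items()):
    print(f"{category}\t{n}")
```

Comparing these read totals with the number of distinct transcript models per category shows whether a novelty category is backed by many reads or by a long tail of single-read transcripts.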