Comments (4)

dewyman commented on July 18, 2024

Hi Callum,
Thank you for your question. Have you checked the transcript novelty column? Novel genes are rather rare in human data, so I would not be too surprised to see no novel genes in 10e4 reads.

from talon.

callumparr commented on July 18, 2024

Sorry, my mistake: after scrolling all the way through the file I can see novel TALON annotations. Does the script do some sort of sorting of the results, based on annotation hits rather than the order of the reads in the SAM file?

Perhaps I see too many. I did run the reads through TranscriptClean, but I skipped the splice site correction, so it is probably better to run it with that correction enabled.


dewyman commented on July 18, 2024

The abundance file does tend to list the known annotations first.

I would recommend looking at how many transcripts you get in each novelty category, and going from there. Without filtering the novel transcripts for reproducibility across multiple datasets, you might reasonably expect to get a higher rate of novelty in the output. If these are mainly incomplete splice match (ISM) and novel in catalog (NIC) transcripts, then TranscriptClean splice site correction won't have a big impact, since these transcripts contain known splice donors and acceptors. It is mainly targeted at fixing artifactual novel not in catalog (NNC) transcripts.
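For a quick look at the counts per novelty category without plotting, something like the following Python sketch may help. It is not part of TALON. It assumes the abundance file is tab-separated, has a column named `transcript_novelty`, and has one read-count column per dataset. The function name `novelty_summary`, the dataset names, and the tiny table are made up for illustration, so check the column names against your own file.

```python
import csv
import io
from collections import Counter


def novelty_summary(handle, dataset_columns, novelty_column="transcript_novelty"):
    """Count transcript models and reads per novelty category.

    Assumes a tab-separated table with a header row, a novelty column,
    and one integer read-count column per dataset (hypothetical names).
    """
    models = Counter()
    reads = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        n_reads = sum(int(row[col]) for col in dataset_columns)
        if n_reads == 0:
            continue  # transcript not observed in the selected datasets
        category = row[novelty_column]
        models[category] += 1        # one distinct transcript model
        reads[category] += n_reads   # reads assigned to that model
    return models, reads


# Made-up example table standing in for a real abundance file.
example = (
    "transcript_ID\ttranscript_novelty\tdataset_1\tdataset_2\n"
    "1\tKnown\t10\t4\n"
    "2\tKnown\t0\t3\n"
    "3\tISM\t2\t0\n"
    "4\tNIC\t1\t1\n"
    "5\tNNC\t0\t0\n"
)

models, reads = novelty_summary(io.StringIO(example), ["dataset_1", "dataset_2"])
for category in sorted(models):
    print(category, models[category], reads[category])
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` object, together with the dataset column names from that file's header.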


dewyman commented on July 18, 2024

Note: This script will plot the number of distinct transcript models detected for each novelty category, and the number of reads that were annotated to each novelty category:
https://github.com/dewyman/TALON-paper-2019/blob/master/analysis_scripts/plot_novelty_categories_from_abd_table.R

Here are instructions to run it:

Rscript plot_novelty_categories_from_abd_table.R --h
Options:
	--f=F
		TALON abundance file

	--datasets=DATASETS
		Optional: Comma-separated list of datasets to include. Default is to use all datasets in the file

	-o OUTDIR, --outdir=OUTDIR
		Output directory for plots and outfiles

	-h, --help
		Show this help message and exit
