sbotond / rlsim
A package for simulating RNA-seq library preparation with parameter estimation
License: GNU General Public License v3.0
There are two implementation details of FragAfterPrim that aren't necessarily bugs, but that I don't quite understand. I'm hoping you can shed some light.
First is this:
nrBreaks := rand.Poisson(rate) - 1
if nrBreaks <= 0 {
nrBreaks = 1
}
Why isn't nrBreaks := rand.Poisson(rate) sufficient? You're drawing from a Poisson as an approximation to a binomial, under the assumption that breakpoints occur at any position with some constant probability. But under this scheme the expected number of breakpoints is non-linear in the transcript length, which is difficult to account for.
Second, possibly related:
FRAG:
for i := 0; i < len(breaks)-2; i++ {
This skips the last fragment, resulting in a depletion of fragments at the 3' end of the transcript. Was that intentional? Is there reason to believe that fragments can't be primed after the last breakpoint?
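A minimal sketch of the loop-bound question, assuming (this is not confirmed by the rlsim code quoted here) that `breaks` is a sorted slice that includes both transcript ends, so that consecutive entries delimit fragments. The helper `fragments` and its `upper` parameter are hypothetical, introduced only to compare the two bounds.

```go
package main

import "fmt"

// fragments pairs up consecutive breakpoints for i in [0, upper).
func fragments(breaks []int, upper int) [][2]int {
	var out [][2]int
	for i := 0; i < upper; i++ {
		out = append(out, [2]int{breaks[i], breaks[i+1]})
	}
	return out
}

func main() {
	// Hypothetical breakpoints: 0 and 400 stand for the transcript ends.
	breaks := []int{0, 100, 250, 400}
	fmt.Println(fragments(breaks, len(breaks)-2)) // bound as in the quoted loop
	fmt.Println(fragments(breaks, len(breaks)-1)) // bound that visits every interval
}
```

With four breakpoints there are three intervals; the `len(breaks)-2` bound visits only the first two, which is the 3' depletion described above if the slice layout matches this assumption.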
Sorry to bombard you with super-specific questions. Rlsim is by far the best RNA-seq simulator I've used, and I want to make sure its behavior in these cases is plausible before I either try to account for it in my quantification model or make tweaks to rlsim.
I tried to install from the latest release using
wget https://github.com/sbotond/rlsim/tree/master/releases/rlsim-latest_amd64.tar.gz
followed by tar -xzf rlsim-latest_amd64.tar.gz, but this gave an error message that the file was not in gzip format.
I was able to 'Build from source' without trouble after installing golang, so this is not a major problem, but I thought I'd let you know.
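One plausible explanation, offered as a guess rather than a confirmed diagnosis: GitHub URLs containing /tree/ or /blob/ generally serve an HTML page rather than the raw file, so wget may have saved HTML under a .tar.gz name. A minimal sketch for checking a downloaded file follows; the helper name `looksLikeGzip` is hypothetical. It relies only on the fact that gzip streams begin with the magic bytes 0x1f 0x8b.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// looksLikeGzip reports whether data begins with the gzip magic bytes 0x1f 0x8b.
func looksLikeGzip(data []byte) bool {
	return bytes.HasPrefix(data, []byte{0x1f, 0x8b})
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkgz FILE")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("gzip:", looksLikeGzip(data))
}
```

If the check reports false and the file starts with HTML markup, the download URL rather than the archive itself is the likely culprit.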
Is there any way to recover the original molecule that produces the final PCR fragments? I would like to simulate an RNA-seq dataset and know which fragments come from the same molecule (and thus are PCR duplicates).
If I open the fas file produced by rlsim, I see that all fragments belonging to the same gene are numbered from Fg_1 all the way to Fg_n, but there does not seem to be a way to figure out which are PCR duplicates.
Thanks!
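In the absence of a molecule identifier in the output, one rough workaround is to group records by identical sequence and treat each group as candidate duplicates. The sketch below assumes a plain FASTA layout (header lines starting with `>`, sequence on the following lines); the function name `groupBySequence` and the example records are hypothetical, and real rlsim headers may look different. Identical sequence is only a proxy: independent molecules of the same transcript that fragment at the same coordinates would be grouped together as well.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// groupBySequence maps each distinct sequence to the FASTA headers that carry it.
func groupBySequence(fasta string) map[string][]string {
	groups := make(map[string][]string)
	var header string
	var seq strings.Builder
	// flush stores the record collected so far, if any, and resets the buffer.
	flush := func() {
		if header != "" {
			groups[seq.String()] = append(groups[seq.String()], header)
		}
		seq.Reset()
	}
	sc := bufio.NewScanner(strings.NewReader(fasta))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, ">") {
			flush()
			header = line[1:]
		} else {
			seq.WriteString(line)
		}
	}
	flush()
	return groups
}

func main() {
	// Hypothetical records for illustration only.
	fasta := ">Fg_1\nACGT\n>Fg_2\nACGT\n>Fg_3\nGGCA\n"
	for seq, ids := range groupBySequence(fasta) {
		if len(ids) > 1 {
			fmt.Println(seq, "candidate duplicates:", ids)
		}
	}
}
```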