ataudt / aneufinder
Find CNVs in single cell sequencing data.
I am using aneufinder to analyze single cell data with default parameters.
For most cells from healthy donors (assumed to carry only a small number of copy number variations), only a few of the called CNVs are not "2-somy", which is as expected.
However, for around 5% of cells, most of the called CNVs are not "2-somy", as shown in the attached file.
Is there a problem with the result?
First of all: Thank you for the great package!
It would be nice to have a parameter in the Aneufinder procedure to disable/skip the plotting, because it seems to be the most time-consuming part. Looking at the code, it seems that this would roughly involve making the code between L573-L758 optional.
To make things slightly harder, it would also be great to be able to do the plotting without regenerating the models.
If this is a feature you would accept a patch for, I can create a pull request.
heatmapGenomewide gets confused if any model in the list of hmms has an empty segments variable and plots the wrong identifiers for samples.
It would be better to either drop these samples or throw an error.
I am trying to run:
Aneufinder(inputfolder=infolder, outputfolder=outfolder, format="bam", numCPU=cpu, method=c("HMM","dnacopy"), pairedEndReads=TRUE)
A config file is created and then I get the message:
Setting up parallel execution with 4 CPUs ...ssh: connect to host 4 port 22: Invalid argument
and then nothing happens.
What is wrong here??
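One possible cause, offered as an assumption rather than a diagnosis: the "connect to host 4" part of the message suggests the value 4 was read as a host name. parallel::makePSOCKcluster() documents that a character vector is treated as host names, while a positive integer starts that many workers on localhost. If cpu reaches Aneufinder as the string "4" (for example from commandArgs() or a config file), converting it first may help. The variable name and its origin below are hypothetical.

```r
# Hypothetical: `cpu` arrives as text, e.g. from commandArgs() or a config file.
cpu <- "4"

# A character value would be read as a host name by a PSOCK cluster,
# so convert it to an integer worker count before passing it on.
if (is.character(cpu)) {
  cpu <- as.integer(cpu)
}

# Fail early if the conversion did not give a usable worker count.
stopifnot(is.integer(cpu), !is.na(cpu), cpu >= 1L)
```

The converted value can then be passed as numCPU=cpu in the call above.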
Hi,
We are dealing with a control setting in which the genome is already pretty messed up.
We would like to determine the most frequently observed copy number state per bin in our reference genome and set that as "normal". Next we would like to plot a heatmap genomewide which shows not the exact copy number state but whether the state is deviating from the most frequently observed state in our control genome. Is this possible in aneufinder or could this be made a feature?
cheers,
Yannick
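The calculation described above can be sketched in base R, independent of the package. This is only an illustration on a made-up matrix of copy number states (bins in rows, control cells in columns); how to extract such a matrix from aneufinder models is not shown, and the helper name modal_state is hypothetical.

```r
# Toy copy-number matrix: rows are bins, columns are control cells.
# The values are made up for illustration.
cn <- matrix(
  c(2, 2, 3,
    3, 3, 3,
    1, 2, 1,
    2, 2, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("bin", 1:4), paste0("cell", 1:3))
)

# Most frequent state in one bin; ties resolve to the smallest state.
modal_state <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Per-bin "normal" state taken from the control cells.
baseline <- apply(cn, 1, modal_state)

# Signed deviation of every cell from the per-bin baseline
# (0 = same as baseline, >0 = gain, <0 = loss relative to it).
deviation <- sweep(cn, 1, baseline, FUN = "-")
```

A heatmap of deviation rather than cn would then show only where a cell departs from the most frequent state.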
I've found the following potential issues when looking at the code of the binReads function and how it is called from Aneufinder.

binReads calculates binning for each file it processes in parallel

Calculating the bins for multiple bam files should yield the same result as long as the assembly is the same. This has to be the case if use.bamsignals=FALSE, but also should be if use.bamsignals=TRUE.
Why then does Aneufinder generate the same bins again, and this in parallel for each bam file it processes? If I've got 50 bam files, this will calculate bins 50 times over:
https://github.com/ataudt/aneufinder/blob/master/R/binReads.R#L216-L237
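The pattern being asked for, computing bins once and reusing them for every file, could look roughly like this. It is a base R sketch only: make_bins and count_file are hypothetical stand-ins, not functions from the package, and the chromosome lengths and file names are made up.

```r
# Hypothetical stand-in for an expensive bin calculation; it depends only on
# chromosome lengths and bin size, not on any individual bam file.
make_bins <- function(chrom.lengths, binsize) {
  lapply(chrom.lengths, function(len) seq(0, len - 1, by = binsize))
}

chrom.lengths <- c(chr1 = 1000, chr2 = 500)   # made-up lengths
files <- c("a.bam", "b.bam", "c.bam")         # placeholder file names

# Compute once, outside the per-file loop ...
bins <- make_bins(chrom.lengths, binsize = 100)

# ... and hand the same object to every per-file step.
count_file <- function(file, bins) {
  # Placeholder: a real implementation would count reads per bin here.
  # This version just returns the number of bins per chromosome.
  vapply(bins, length, integer(1))
}
results <- lapply(files, count_file, bins = bins)
names(results) <- files
```

With 50 files, make_bins would still run once; only count_file repeats per file.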

binReads run only on pre-calculated bins

There are 3 variables used to pass bins:
binsizes - bins to calculate with a fixed size
reads.per.bin - bins to calculate with a fixed read count
bins - already calculated bins either from a file or GRanges object

The documentation for bins states:
A named list with GRanges containing precalculated bins produced by fixedWidthBins or variableWidthBins.
This, however, is not how it is used in the Aneufinder function; here, bin sizes are passed that are not calculated yet:
https://github.com/ataudt/aneufinder/blob/master/R/Aneufinder.R#L221
In addition, the parallel.helper function that is supposed to bin reads doesn't do anything with existing bins:
https://github.com/ataudt/aneufinder/blob/master/R/Aneufinder.R#L217-L224
Hi,
When I click the 'vignette' hyperlink, I get "404 page not found".
I think the PDF no longer exists. What happened?
Thank you for your attention.
(https://github.com/ataudt/aneufinder/blob/master/vignettes/AneuFinder.pdf)
Hello, we are trying to compare single cell data calls made by Ginkgo, Aneufinder and QDNAseq. We can do this if we get either BED or VCF files from Aneufinder. We can't work out whether this is possible; it would be great if you could let me know. Thanks
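As a generic fallback, any table of called segments can be written out as BED with base R. The sketch below uses made-up segments and hypothetical column names; it is not the package's own export mechanism, and it assumes the segment coordinates are 1-based and inclusive.

```r
# Made-up segments standing in for CNV calls (1-based, inclusive coordinates).
# Integer columns keep write.table() from using scientific notation.
segments <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(1L, 1000001L, 1L),
  end   = c(1000000L, 2000000L, 500000L),
  state = c("2-somy", "3-somy", "1-somy"),
  stringsAsFactors = FALSE
)

# BED uses 0-based starts and end-exclusive ends, so only start shifts by one.
bed <- data.frame(
  chrom      = segments$chrom,
  chromStart = segments$start - 1L,
  chromEnd   = segments$end,
  name       = segments$state,
  stringsAsFactors = FALSE
)

# Tab-separated, no header, no quoting: the minimal four-column BED layout.
bedfile <- tempfile(fileext = ".bed")
write.table(bed, bedfile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Files written this way from each tool can then be compared with the same downstream BED tooling.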
I'm about to release cowplot 1.0 and your package currently doesn't pass a package check with the cowplot release candidate. The problem is in this example:
Line 22 in adb7443
The cowplot package doesn't automatically attach ggplot2 anymore, and therefore the example breaks without an explicit library(ggplot2). If you depend on an attached ggplot2 in multiple places, you can also move ggplot2 from Imports to Depends in your DESCRIPTION file, though generally it is now discouraged to have numerous packages in Depends.
There are also a few other important changes to cowplot. I suggest you read through the release notes and make sure your package works with the cowplot release candidate.
https://github.com/wilkelab/cowplot/blob/master/NEWS
I expect to do the CRAN release in the 2nd week of July.
Dear Aneufinder team,
Thank you for providing this tool. I was wondering whether I could use it for CNV calling in bulk WGS data. I tried using hg38 aligned bam files of bulk WGS data and got the error:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : scan() expected 'an integer', got 'Signal'
Do you have any clues on what I need to do to fit the WGS data for this tool? Thank you so much!
Best,
Pingping
I'm trying to generate bin data from a GRanges object that I made with:
raw_reads=bam2GRanges(bamfile,remove.duplicate.reads = TRUE,min.mapq = 10,blacklist = blacklist)
And this turned out to work. However when I try:
bins_reads=binReads(raw_reads,
assembly=genome,
chromosomes=chromosomes,
binsizes=c(40000,80000,100000,200000,500000))
It gives me the error message:
Subsetting specified chromosomes ...
Error in match(x, table, nomatch = 0L) : 'match' requires vector arguments
The error can be traced back to:
traceback()
3: seqnames(data) %in% chroms2use
2: data[seqnames(data) %in% chroms2use]
1: binReads(raw_reads, assembly = genome, chromosomes = chromosomes,
binsizes = c(40000, 80000, 100000, 200000, 500000))
Please help me understand what is going on here. Thank you so much!