imgag / clincnv
Detection of copy number changes in Germline/Trio/Somatic contexts in NGS data
License: MIT License
Hello, I analyzed 3 family samples with ClinCNV and an error occurred, even though I set the parameter minimumNumOfElemsInCluster to 1. How can I solve it? The error output is as follows:
[1] "We run script located in folder /fuer2/03.Soft/01.Soft_project/ClinCNV-1.17.2 . All the paths will be calculated realtive to this one. If everything crashes, please, check the correctness of this path first."
[1] "START cluster allocation."
[1] "Cluster allocated."
[1] "END cluster allocation."
[1] "We are started with reading the coverage files and bed files 2022-03-14 17:20:48"
[1] "Started basic quality filtering. 2022-03-14 17:20:49"
[1] "Amount of regions after filtering of 0-covered regions 99.528"
[1] "Normalization with GC and length starts. 2022-03-14 17:21:00"
[1] "Percentage of regions remained after GC correction: 0.997975235573553"
[1] "Amount of regions after GC-extreme filtering 99.327"
[1] "Amount of regions after Systematically Low Covered regions filtering 99.327"
[1] "We start to cluster your data (you will find a plot if clustering is possible in your output directory) ./result 2022-03-14 17:21:16"
Error: umap: number of neighbors must be smaller than number of items
Execution halted
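The message says the neighbour count passed to umap must be strictly smaller than the number of samples, so with 3 samples any value of 3 or more fails. Below is a minimal sketch of that constraint in Python. It is not ClinCNV's code: the helper name and the illustrative default of 15 are assumptions.

```python
from typing import Optional


def usable_n_neighbors(n_samples: int, requested: int = 15) -> Optional[int]:
    """Return a neighbour count strictly smaller than n_samples, or None.

    Hypothetical helper: `requested=15` is only an illustrative default.
    """
    if n_samples < 2:
        # A single sample has no neighbours, so there is nothing to embed.
        return None
    return min(requested, n_samples - 1)


print(usable_n_neighbors(3))   # 2
print(usable_n_neighbors(40))  # 15
```

For a cohort this small, skipping the clustering step altogether may be the more sensible outcome than shrinking the neighbour count.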
Hello, I searched for a while for a description of the columns in the "*_cnvs.tsv" output file. These are the columns:
#chr start end CN_change loglikelihood no_of_regions length_KB potential_AF genes qvalue
Some have obvious meaning, some don't (to me), is there an explanatory document somewhere?
- CN_change: is this code for something, or is it the actual number of copies?
- loglikelihood: of what?
- no_of_regions: is this the number of exons, or the number of intervals in the input BED file?
- length_KB: length of what?
- potential_AF: this always seems lower than 1; is it an allele frequency?
- genes: I suppose this is empty unless the BED file was annotated?
- qvalue: q-value of what?
Thank you for your patience, sorry if I missed something obvious
PS. I only ask about this output file as I thought it was the best one to look through. Did I get this wrong, as well? Which of the three files makes most sense to look through?
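While the meaning of the columns is being clarified, the file can at least be loaded by column name. This is a sketch only. It assumes the file is tab-separated, that the header is the line starting with `#chr` (as quoted above), and that any other line starting with `#` is metadata. The example row is made up and implies nothing about how the values should be interpreted.

```python
import io
from typing import Dict, List


def read_cnvs_tsv(handle) -> List[Dict[str, str]]:
    """Parse a *_cnvs.tsv stream into a list of {column: value} dicts.

    Assumption: the column header is the line starting with '#chr';
    every other '#' line is skipped as metadata.
    """
    header = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#chr"):
            header = line.lstrip("#").split("\t")
            continue
        if line.startswith("#"):
            continue
        if header is None:
            raise ValueError("data row found before the '#chr' header line")
        rows.append(dict(zip(header, line.split("\t"))))
    return rows


# Made-up content, for illustration only.
example = io.StringIO(
    "##made-up metadata line\n"
    "#chr\tstart\tend\tCN_change\tloglikelihood\tno_of_regions\t"
    "length_KB\tpotential_AF\tgenes\tqvalue\n"
    "chr1\t1000\t5000\t1\t25\t3\t4.000\t0.010\tGENE1\t0.001\n"
)
rows = read_cnvs_tsv(example)
print(rows[0]["CN_change"], rows[0]["length_KB"])  # 1 4.000
```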
Hi,
I don't have ngs-bits. Is there another tool I can use to prepare this BED file?
https://www.twistbioscience.com/sites/default/files/resources/2019-06/Twist_Exome_Target_hg38.bed
Thank you
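As a rough illustration of what a GC annotation step does, the sketch below appends the GC fraction of each interval as a fourth column. It rests on assumptions: that ClinCNV wants the GC value in column 4 (check the ClinCNV documentation), and that the reference sequences have already been loaded into a dictionary. `annotate_gc` and the toy genome are hypothetical and do not replicate ngs-bits.

```python
from typing import Dict, Iterable, List


def annotate_gc(bed_lines: Iterable[str], genome: Dict[str, str]) -> List[str]:
    """Append the GC fraction of each BED interval as a fourth column.

    `genome` maps chromosome name -> sequence. BED coordinates are 0-based
    and half-open. Only the first three BED columns are kept.
    """
    out = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        seq = genome[chrom][int(start):int(end)].upper()
        # Count only unambiguous bases so runs of N do not dilute the value.
        acgt = sum(seq.count(base) for base in "ACGT")
        gc = (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0
        out.append(f"{chrom}\t{start}\t{end}\t{gc:.4f}")
    return out


toy_genome = {"chr1": "ACGTGGCCNNAATT"}  # made-up sequence
annotated = annotate_gc(["chr1\t0\t4\ttarget_1", "chr1\t4\t10"], toy_genome)
print(annotated)
```

In practice the sequences would come from the hg38 reference FASTA rather than from an in-memory string.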
- [ ] TSV header: each value in one line, or group logically
- [ ] TSV header: number of iterations should not include super-recall
- [ ] SEG file: make log-likelihood positive
- [ ] SEG file: remove CN column
- [ ] SEG file: add coefficient of variation or similar value
Hello, I analyzed 5 trios with ClinCNV and an error occurred. Is the problem in my sample IDs?
Below is my sample ID file:
1_cyw,1_cywm,1_cywf
11_ywx,11_ywxm,11_ywxf
10_xjx,10_xjxm,10_xjxf
12_zxy,12_zxym,12_zxyf
13_zmh,13_zmhm,13_zmhf
This is the error message:
[1] Error in strsplit(genesThatHasToBeSeparated[i], split = ",") :
[2]non-character argument
[3]Calls: source ... eval -> eval -> plotFoundCNVs -> unlist -> strsplit
[4]Execution halted
Thanks
ClinCnv throws no error if a BAF file is not readable.
Please throw an error in case any file that is given via the command line cannot be opened.
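The requested behaviour amounts to validating every input path before any work starts. Here is a minimal sketch of that fail-fast check, written in Python for illustration. The function name is hypothetical and this is not how ClinCNV handles its arguments.

```python
import os
import tempfile
from typing import Iterable


def require_readable(paths: Iterable[str]) -> None:
    """Raise immediately if any input file is missing or unreadable."""
    for path in paths:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"input file does not exist: {path}")
        if not os.access(path, os.R_OK):
            raise PermissionError(f"input file is not readable: {path}")


with tempfile.NamedTemporaryFile(suffix=".tsv") as baf:
    require_readable([baf.name])  # passes silently
    try:
        require_readable([baf.name + ".missing"])
    except FileNotFoundError as err:
        print("caught:", err)
```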
Please help, thanks.
Run command: Rscript ./clinCNV.R --bed ./samples/bed_file.bed --normal ./samples/coverages_normal.cov --out result
The error is as follows:
......
Loading required package: sandwich
[1] "We start to estimate covariances between neighboring regions in germline data - may take some time 2022-03-10 17:22:41"
[1] "Tree of covariances (using 2 predictors - sum of regions' lengths and log2 of distance between regions) plotted in result 2022-03-10 17:23:28"
[1] "Calling started 2022-03-10 17:23:28"
[1] "Working with germline sample 0 2022-03-10 17:23:28"
[1] "Working with germline sample 1 2022-03-10 17:23:28"
Error in writeLines(c("#type=GENE_EXPRESSION", paste0("#track graphtype=points name="", :
cannot open the connection
Calls: source ... outputSegmentsAndDotsFromListOfCNVs -> makeTrackAnnotation -> writeLines
Execution halted
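"cannot open the connection" from `writeLines` generally means R could not open the target file for writing. A missing or non-writable output directory is one common cause, although nothing in this report confirms it. The sketch below shows a pre-flight check that could be run before the analysis. The helper name is hypothetical.

```python
import os
import tempfile


def prepare_output_dir(path: str) -> str:
    """Create the output directory if needed and confirm it is writable."""
    os.makedirs(path, exist_ok=True)
    probe = os.path.join(path, ".write_test")
    with open(probe, "w") as handle:  # raises OSError if not writable
        handle.write("ok")
    os.remove(probe)
    return os.path.abspath(path)


base = tempfile.mkdtemp()
out_dir = prepare_output_dir(os.path.join(base, "result"))
print(os.path.isdir(out_dir))  # True
```

Passing the returned absolute path to `--out` also sidesteps any ambiguity about which working directory a relative path is resolved against.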
Hi,
I am running a germline analysis of 40 samples on the hg38 reference genome. The normal run (clinCNV.R --normal normal.cov --out outputFolder --bed annotatedBedFile.bed) without off-target regions works fine, but when I add the off-target parameters it fails with an error.
Run command:
clinCNV.R --normal normal.cov --bed annotatedBedFile.bed --out outputFolder--normalOfftarget offtarget.cov --bedOfftarget annotatedBedFile_offtarget.bed --numberOfThreads 4 --hg38
Note: The BED file was annotated with ngs-bits BedAnnotateGC and BedAnnotateGenes. It is not a path issue, as the cov files and BED files can be read.
Does anyone know what went wrong?
Thanks :)
Dear developers,
I am using your amazing tool for germline WES analysis. It works pretty well I think.
I was wondering if there is a way to use clinCNV for mitochondrial analysis. At the moment, I remove any chrM samples I have because the cytobandsHG38.txt file does not contain any chrM information. Is there a way to add chrM information to the cytobandsHG38.txt file?
Many Thanks,
Krutik
Hi,
I suggest we order the output CNVs exactly as they are ordered in the input Bed file or cov files respectively.
(germline folder) firstStep.R, perhaps clincnv.R
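One way to get this ordering is to rank chromosomes by their first appearance in the BED file and sort by that rank, then by start position. The sketch below is written in Python for illustration only. The tuple layout and function name are assumptions and are not taken from firstStep.R.

```python
from typing import Dict, Iterable, List, Tuple

Cnv = Tuple[str, int, int]  # (chromosome, start, end)


def sort_like_bed(cnvs: Iterable[Cnv], bed_chroms: Iterable[str]) -> List[Cnv]:
    """Sort CNVs by first appearance of their chromosome in the BED, then start."""
    rank: Dict[str, int] = {}
    for chrom in bed_chroms:
        rank.setdefault(chrom, len(rank))
    # Chromosomes absent from the BED go last, in alphabetical order.
    return sorted(cnvs, key=lambda c: (rank.get(c[0], len(rank)), c[0], c[1], c[2]))


bed_order = ["chr1", "chr1", "chr2", "chr10", "chrX"]
calls = [("chr10", 5, 9), ("chr2", 100, 200), ("chr1", 50, 60), ("chr2", 10, 20)]
ordered = sort_like_bed(calls, bed_order)
print(ordered)
```

This keeps chr2 before chr10 whenever the BED lists them in that order, which a plain alphabetical sort would not.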
Hi again,
I already calculated ontarget coverages. I also want to calculate the offtarget coverages to increase the overall accuracy.
Reading the docs, the steps are a bit confusing to me. I have some questions:
A complete guide to produce offtarget coverage files on targeted samples would be really appreciated.
Thanks a lot in advance.
Rscript clinCNV.R --bed hg38_nuc.bed --normal exome_germlines.cov --colNum 4 --reanalyseCohort TRUE --polymorphicCalling YES --superRecall SUPERRECALL --mosaicism --fdrGermline 10 --lengthG 1 --maxNumGermCNVs 100 --maxNumIter 3 --numberOfThreads 24 --out result
[1] "We run script located in folder /work/sassou/ClinCNV . All the paths will be calculated realtive to this one. If everything crashes, please, check the correctness of this path first."
[1] "You've choosen to detect polymorphic regions with the help of our tool - great choice!"
[1] "You suspect your samples to be mosaic - hmmm, we will check this out...(but the mosaic CN change should not be > 1 copy different from default"
[1] "START cluster allocation."
[1] "Cluster allocated."
[1] "END cluster allocation."
[1] "We are started with reading the coverage files and bed files 2021-06-21 13:36:13"
[1] "Started basic quality filtering. 2021-06-21 13:36:15"
[1] "Amount of regions after filtering of 0-covered regions 98.575"
[1] "Normalization with GC and length starts. 2021-06-21 13:36:44"
[1] "Percentage of regions remained after GC correction: 0.998089547500539"
[1] "Amount of regions after GC-extreme filtering 98.387"
[1] "Amount of regions after Systematically Low Covered regions filtering 98.387"
[1] "We start to cluster your data (you will find a plot if clustering is possible in your output directory) result 2021-06-21 13:37:29"
[1] "You ask to clusterise intro clusters of size 10000 but size of the cohort is 5 which is not enough. We continue without clustering."
[1] "Gender estimation started 2021-06-21 13:37:43"
Error in plot.new() : could not open file 'result/genders.png'
Calls: Determine.gender -> plot -> plot -> plot.default -> plot.new
Execution halted
Centromere regions should be added to the arms so that we don't miss centromere CNVs, as happened in the array benchmark.
Hi,
Would you be able to add your tool to bioconda? This would increase visibility and ease of use tremendously.
Conda packages also come with a free Docker image in biocontainers, which is good for reproducibility.
Thanks
M
In this case an error should be thrown instead of just ignoring the regions/bins outside the defined cytobands!
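The requested behaviour could look roughly like the sketch below: collect every region that lies on an unknown chromosome or extends past the last cytoband, and stop with an error that names them. This is a Python illustration only. The data layout, function name and coordinates are made up.

```python
from typing import Dict, Iterable, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end)


def check_within_cytobands(regions: Iterable[Region], chrom_ends: Dict[str, int]) -> None:
    """Raise ValueError listing regions not covered by the cytobands."""
    bad = [
        r for r in regions
        if r[0] not in chrom_ends or r[2] > chrom_ends[r[0]]
    ]
    if bad:
        shown = ", ".join(f"{c}:{s}-{e}" for c, s, e in bad[:5])
        raise ValueError(f"{len(bad)} region(s) outside the cytobands, e.g. {shown}")


ends = {"chr1": 1000}  # made-up chromosome end taken from a cytoband table
check_within_cytobands([("chr1", 0, 100)], ends)  # passes silently
try:
    check_within_cytobands([("chr1", 900, 1100), ("chrM", 0, 10)], ends)
except ValueError as err:
    print(err)
```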
Hello!
Just getting started with ClinCNV. I am running the test samples unsuccessfully:
Rscript clinCNV.R --bed /home/joel/Programs/ClinCNV/samples/bed_file.bed --normal /home/joel/Programs/ClinCNV/samples/coverages_normal.cov --out test_results/ --folderWithScript $PWD
[1] "We are started with reading the coverage files and bed files 2022-05-24 14:06:07"
Error in if (substring(x, 1, nchar(prefix)) == prefix) { :
the condition has length > 1
Calls: startsWith
Execution halted
What's happening?
Thanks in advance!
Please set "length_KB" and "potential_AF" fixed to 3 decimals.
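For reference, fixing a value to three decimals is a pure formatting step at write time. A tiny Python illustration follows. The helper name is hypothetical and ClinCNV itself is written in R.

```python
def fmt3(value: float) -> str:
    """Format a number with exactly three decimal places."""
    return f"{value:.3f}"


print(fmt3(12.5), fmt3(0.0123456))  # 12.500 0.012
```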
We have to fix these issues:
- [ ] If --noPlot is given, no plots should be generated.
- [ ] ClinCNV should run without write permissions in the installation folder (otherwise you cannot run it from a container)
Hi ClinCNV developers,
running ClinCNV with the provided test data produced an error which we couldn't resolve. All packages are installed as indicated (install_deps_clincnv.R).
Thanks a lot!
Best
PS: running on HPC w/ ubuntu 18.04.5.
ClinCNV-1.18.3.patch
Patch file submitted; it was reported by an end user of ours as a correction from the developer.
Hi,
Is ClinCNV able to analyse germline WES? If so, how should the BED file be formatted?
The BED file is used to build the library with the target regions specific to the exome.
Thank you
Hi German,
could you add information which input regions were skipped because of low quality.
Right now those regions cannot be easily recognized in IGV.
In CnvHunter I added them to the end of the SEG file:
https://github.com/imgag/megSAP/blob/master/test/data/vc_cnvhunter_out1.seg
For example, you could add failed regions to the SEG file and add a "QC" column that contains "qc failed" and some info why. The only drawback would be that you have to assign some CN value, e.g. CN=2 (or 1 for male gonosomes).
I opened a IGV issue to see if we can color the failed regions differently:
igvteam/igv#741
Best,
Marc
Hello!
I've managed to run some hg19 exomes through ClinCNV, but now that I'm attempting some b37 exomes, it crashes like so:
[1] "We run script located in folder /home/joel/Programs/ClinCNV . All the paths will be calculated realtive to this one. If everything crashes, please, check the correctness of this path first."
[1] "START cluster allocation."
[1] "Cluster allocated."
[1] "END cluster allocation."
[1] "We are started with reading the coverage files and bed files 2022-08-19 14:32:58"
[1] "Started basic quality filtering. 2022-08-19 14:33:00"
[1] "Amount of regions after filtering of 0-covered regions 94.373"
Error in if (ends_of_chroms[[chrom]] < max(bedFile[bedFile[, 1] == chrom, :
argument is of length zero
Execution halted
My input files are here:
https://file.io/rr1fFFpq8PzH
Thanks in advance!
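One thing worth ruling out, though the report does not confirm it as the cause, is a chromosome naming mismatch. b37 names chromosomes `1`, `2`, ... while hg19 uses `chr1`, `chr2`, ..., and a lookup of a chromosome missing from the cytoband table would come back empty. The sketch below lists BED chromosomes unknown to the cytoband table and says whether toggling the `chr` prefix would fix them. The helper name and inputs are hypothetical.

```python
from typing import Dict, Iterable, Set


def missing_chroms(bed_chroms: Iterable[str], cytoband_chroms: Iterable[str]) -> Dict[str, str]:
    """Map each BED chromosome absent from the cytobands to a short hint."""
    known: Set[str] = set(cytoband_chroms)
    report = {}
    for chrom in sorted(set(bed_chroms) - known):
        # Try the same name with the 'chr' prefix removed or added.
        alt = chrom[3:] if chrom.startswith("chr") else "chr" + chrom
        report[chrom] = f"try '{alt}'" if alt in known else "no match"
    return report


report = missing_chroms(["1", "2", "GL000207.1"], ["chr1", "chr2", "chrX"])
print(report)
```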
Hi,
ClinCNV creates a file "Rplots.pdf" in 1.14-stable in the script directory. That should not be there. Please create a commit or bugfix for 1.14 which fixes that problem.
Best,
Axel
Hi @GermanDemidov @marc-sturm @bondarevts @jakobmatthes @Fohlen
I ran into a problem when running clinCNV.R.
Do you have any suggestions for a solution?
`$Rscript $ClinCNV/clinCNV.R \
--bed $prepare/gcAnnotated.preparedBedHg38.bin50000.bed \
--normal $prepare/merge_result/merge_S17.cov \
--folderWithScript $ClinCNV \
--scoreG 50 \
--numberOfThreads 10 \
--out $result/clincnv_prepare_result
There is error message:
[1] "We run script located in folder /hwfssz1/CS_CELL/cs_cell/marui1/software/ClinCNV . Please, specify ABSOLUTE paths, relative paths do not work for every machine. If everything crashes, please, check the correctness of this path first."
[1] "START cluster allocation."
[1] "Cluster allocated."
[1] "END cluster allocation."
[1] "We are started with reading the coverage files and bed files 2023-03-23 11:33:33"
[1] "Started basic quality filtering. 2023-03-23 11:33:35"
[1] "Amount of regions after filtering of 0-covered regions 98.565"
[1] "Coordinates in BED file are outside of the cytobands! Please check if your cytobands file matches your reference genome version!"
`
Thanks
Hi! Would it be possible to make this tool available via bioconda for easier installation and automatic containerization via biocontainers?
Cheers, Rike
Hi @marc-sturm @bondarevts @jakobmatthes @Fohlen @GermanDemidov
I don't really understand how to create the my.bed file for WGS;
my current samples are only in BAM format and the original FASTA format.
The reference genome I use is hg38.
Do I need to download the file hg38.chrom.sizes? (https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes)
How can I determine from the result table whether a CNV is a deletion or a duplication?
thanks!
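For WGS, one common approach is to tile each chromosome into fixed-size windows based on a chrom.sizes file. The sketch below does only that tiling. The bin size is an arbitrary choice, the chromosome in the example is made up, and the resulting intervals would still need whatever annotation (for example GC content) ClinCNV expects.

```python
from typing import Iterable, Iterator, Tuple


def make_bins(
    chrom_sizes: Iterable[Tuple[str, int]], bin_size: int
) -> Iterator[Tuple[str, int, int]]:
    """Yield 0-based, half-open BED intervals tiling each chromosome."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for chrom, size in chrom_sizes:
        for start in range(0, size, bin_size):
            # The last bin of a chromosome is truncated at the chromosome end.
            yield chrom, start, min(start + bin_size, size)


bins = list(make_bins([("chrA", 120_000)], 50_000))  # made-up chromosome
for chrom, start, end in bins:
    print(f"{chrom}\t{start}\t{end}")
```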
ClinCNV shows this warning with R 4.1, which I guess should not be there:
Stdout of '/mnt/storage1/share/opt/R-4.1.0/bin/Rscript --vanilla /mnt/storage2/GRCh38/share/opt/ClinCNV-1.17.1/clinCNV.R':
[1] "Your R version is too old. We can not guarantee stable work."
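The report does not say how the version check is implemented. As a general illustration, version checks are most robust when each component is compared as an integer rather than as text. The Python sketch below shows the difference. The minimum versions used are made up.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.1.0' into (4, 1, 0) so components compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_too_old(current: str, minimum: str) -> bool:
    """True if `current` is strictly older than `minimum`."""
    return parse_version(current) < parse_version(minimum)


print(is_too_old("4.1.0", "3.2.0"))  # False
# Text comparison and numeric comparison can disagree:
print("4.1.0" < "10.0.0", is_too_old("4.1.0", "10.0.0"))  # False True
```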
Hi @marc-sturm
When I run a single sample, I get this error.
Is this tool not suitable for a single sample?
Hi,
Congrats on this useful tool. I have a couple of questions:
Thanks!
Hi German,
Alex Seitz had a male patient with a large duplication on chrX, so the sample was determined to be female.
Is there a way to overwrite the gender for a sample?
Best,
Marc
The data of the case can be found here: /mnt/users/ahsturm1/Sandbox/ClinCNV/bug_gender_clustering/
If there is no gene overlapping with a CNV, currently 'NA' is written.
Can you just leave the field blank then?