jacusa's People

Contributors

cdieterich, piechottam

jacusa's Issues

change VCF version to 4.2

IGV now complains about the GT tag in VCF files. Manually changing the version to 4.2 solves this.
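
The manual fix described above amounts to rewriting the ##fileformat header line. A minimal Python sketch, assuming the VCF is available as a list of text lines; the function name set_vcf_version is hypothetical, and the starting version VCFv4.0 in the example is a placeholder, since the issue does not say which version JACUSA writes:

```python
import re


def set_vcf_version(lines, version="4.2"):
    """Return the VCF lines with the ##fileformat header set to `version`."""
    out = []
    for line in lines:
        if line.startswith("##fileformat="):
            # Only the version token is replaced; the rest of the line is kept.
            line = re.sub(
                r"^##fileformat=VCFv[\d.]+",
                f"##fileformat=VCFv{version}",
                line,
            )
        out.append(line)
    return out


# Placeholder header; the real version string in JACUSA output may differ.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
print(set_vcf_version(example)[0])
```

All other lines pass through unchanged, so the sketch can be applied to a whole file read with readlines().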

JACUSA running wrong

Hi

I am running JACUSA, but something is wrong in the logfile:
tail logfile
[ INFO ] 02:24:00 : Started screening contig scaffold232081:277-4771
[ INFO ] 02:24:00 : Started screening contig scaffold226941:87-524
[ INFO ] 02:24:00 : Started screening contig scaffold222121:216-831
[ INFO ] 02:24:00 : Started screening contig scaffold234201:3573-10807
[ INFO ] 02:24:00 : Started screening contig scaffold203801:370-869
[ INFO ] 02:24:00 : Started screening contig scaffold236361:111-991
[ INFO ] 02:24:00 : Started screening contig scaffold237141:54-346
[ INFO ] 02:24:00 : Started screening contig scaffold220141:22-1328
[ INFO ] 02:24:00 : Started screening contig scaffold239001:87-462
[ INFO ] 02:24:00 : Started screening contig scaffold233041:28-3157
Exception in thread "Thread-9" java.lang.ArrayIndexOutOfBoundsException: 1
at jacusa.estimate.MinkaEstimateDirMultParameters.getLogLikelihood(MinkaEstimateDirMultParameters.java:187)
at jacusa.estimate.MinkaEstimateDirMultParameters.maximizeLogLikelihood(MinkaEstimateDirMultParameters.java:111)
at jacusa.method.call.statistic.AbstractDirichletStatistic.estimate(AbstractDirichletStatistic.java:229)
at jacusa.method.call.statistic.AbstractDirichletStatistic.getStatistic(AbstractDirichletStatistic.java:255)
at jacusa.method.call.statistic.dirmult.DirichletMultinomialRobustCompoundError.getStatistic(DirichletMultinomialRobustCompoundError.java:76)
at jacusa.method.call.statistic.AbstractDirichletStatistic.addStatistic(AbstractDirichletStatistic.java:145)
at jacusa.pileup.worker.AbstractCallWorker.processParallelPileup(AbstractCallWorker.java:41)
at jacusa.pileup.worker.AbstractWorker.processParallelPileupIterator(AbstractWorker.java:186)
at jacusa.pileup.worker.AbstractWorker.run(AbstractWorker.java:67)

And the result output is just 9 tmp files:
*txt_9_tmp.gz

I have no idea how to deal with this. I would appreciate any advice.

calmd error?

Hi,

When running an older sample of ours, we get this error. I suggest that the software do a complete halt when it sees this.

00:00:02 Thread 1: Working on contig chr1:150384085-150384085
ERROR 00:00:02 Problem with read: E00515:95:HJ7K5ALXX:5:1218:11424:65687 in sorted/mysample.Aligned.out.srt.rg-added.dedup.calmd.bam
java.lang.IllegalArgumentException: Byte 99 unknown
at lib.util.Base.valueOf(Base.java:72)
at lib.recordextended.MDRecordReferenceProvider.getReferenceBase(MDRecordReferenceProvider.java:50)
at lib.data.storage.container.SimpleMDReferenceProvider.addRecordExtended(SimpleMDReferenceProvider.java:66)
at lib.data.storage.container.ComplexSharedStorage.addRecordExtended(ComplexSharedStorage.java:47)
at lib.data.storage.container.UnstrandedCacheContainter.process(UnstrandedCacheContainter.java:47)
at lib.data.storage.container.AbstractStrandedCacheContainer.process(AbstractStrandedCacheContainer.java:80)
at lib.data.assembler.SiteDataAssembler.buildCache(SiteDataAssembler.java:57)
at lib.util.ReplicateContainer.createIterators(ReplicateContainer.java:49)
at lib.util.ConditionContainer.lambda$0(ConditionContainer.java:29)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at lib.util.ConditionContainer.updateWindowCoordinates(ConditionContainer.java:29)
at lib.worker.AbstractWorker.updateReservedWindowCoordinate(AbstractWorker.java:166)
at lib.worker.AbstractWorker.processInit(AbstractWorker.java:182)
at lib.worker.AbstractWorker.run(AbstractWorker.java:214)

The offending read:

samtools view sorted/mysample.Aligned.out.srt.rg-added.dedup.calmd.bam | grep HJ7K5ALXX | grep E00515 | grep "E00515:95:HJ7K5ALXX:5:2215:25246:23636"
E00515:95:HJ7K5ALXX:5:2215:25246:23636 163 chr1 127333446 255 144M = 127333874 578 ATCACTACTAGATAGTACATCCTTATGGATCTGCAGAAATCTGCTCCAAAGGGGTGGGCTATACTTAGTGATTGTTATATATGTTTAACAGTAACAGGAAATGCATATTAACAGCAGGAATCTTTCCTGAAAGAATCCATTACA AAFFFJFJJFFJFJFFFJJJJJJJJJA<A--<FFJAFJJJJJ7JF<-FFJ<F<-AF--AFJJFFJJJ<7JJ<AFF<FJFJJA7<FJFJJFJFFF<F-AAFJF--<AAAJFAAFJAA7)AAAFJAJ---7<<F-<A-<--7<<-7 PG:Z:MarkDuplicates RG:Z:id NH:i:1 HI:i:1 nM:i:1 AS:i:290 NM:i:1 MD:Z:138c5
E00515:95:HJ7K5ALXX:5:2215:25246:23636 83 chr1 127333874 255 150M = 127333446 -578 GAAAGTGCCTTTTATTTGATATTGGAATGGCTATTCAAGCTTGTTTCTTGGGACCATCTGCATGGAAAATTGTTTTCCAGCTCTTTACTCTGAGGTGGGGTTGGTCTTTGTCACTGTGGTAGATTTCCTGTATGCAGTAAAATGCTGGGT JJJJJJJJFJJJJJJJJJJJFJJJJJJJJJJJJJJJJJFJJJJJJJJJJJFJJFJJJFJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFFAA PG:Z:MarkDuplicates RG:Z:id NH:i:1 HI:i:1 nM:i:1 AS:i:290 NM:i:0 MD:Z:150
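
One observation that may help with triage: byte 99 is the ASCII code for a lowercase 'c', and the MD tag of the first record shown above is MD:Z:138c5. A minimal Python sketch for spotting such records in SAM text (for example the output of samtools view), assuming tab-separated fields. The helper names md_tag and has_lowercase_md are hypothetical and not part of JACUSA, and whether the lowercase base is really what triggers the exception is an assumption:

```python
def md_tag(sam_line):
    """Return the value of the MD:Z: tag of a SAM record, or None if absent."""
    # Optional tags start after the 11 mandatory SAM columns.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("MD:Z:"):
            return field[len("MD:Z:"):]
    return None


def has_lowercase_md(sam_line):
    """True if the record has an MD tag that contains a lowercase letter."""
    md = md_tag(sam_line)
    return md is not None and any(c.islower() for c in md)


# Shortened, made-up record that reuses only the MD tag from the report.
record = "\t".join([
    "read1", "163", "chr1", "100", "255", "144M", "=", "528", "578",
    "ACGT", "FFFF", "NM:i:1", "MD:Z:138c5",
])
print(chr(99), md_tag(record), has_lowercase_md(record))
```

Piping samtools view output through has_lowercase_md line by line would list the affected reads before JACUSA is started.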

JacusaHelper for RRD

What functionalities are available in JacusaHelper for looking at RRD data? The documentation says that AddBaseChangeInfo, AddEditingFrequencyInfo, etc. are only applicable for RDD data. Why is this the case? Is there something I can run on RRD results to get a similar well-summarized output of number and location of putative A to G editing sites, for example?

Thanks.

unclear -u options

Hi,

Again, many thanks for providing nice software.

Can I clarify the jacusa2 input parameters for -u?

java -Xmx2g -jar /mnts/bioinfo/src/JACUSA_v2.0.0-RC5.jar call-2 -u DirMult
= OK

java -Xmx2g -jar /mnts/bioinfo/src/JACUSA_v2.0.0-RC5.jar call-2 -u calcPvalue
java.lang.IllegalArgumentException: Unknown statistic or wrong option: calcPvalue

Also, the help could be improved by clarifying how the arguments are correctly combined:

i.e.
Default mode
-u DirMult
Calculate a pvalue based on a chi^2 approximation of the likelihood
-u calcPvalue
How do we, for instance, change any params in this part?
-u calcPvalue, showAlpha
-u DirMult, epsilon=0.001

-u Choose between different modes (Default: DirMult):
DirMult Compound Error (estimated error {0.01} + phred score)
| Adjusts variant condition
| :epsilon Fit achieved if |L1 - L2| < epsilon, where L1 and L2 correspond to old
| and new likelihood respectively.
| Default: 0.001
| :maxIterations Maximum number of iterations for Newton's method.
| Default: 100
| :calcPvalue Calculate a pvalue based on a chi^2 approximation of the likelihood
| ratio
| :showAlpha Show detailed info of Newton's method in output (not in VCF output).

Thanks

Compilation from GitHub

Hi,

I would like to compile the latest JACUSA GitHub commit but I wasn't able to find any instructions on how to do so. I have tried to manually compile random files but of course, it didn't work. Do you think it would be possible to add a few lines to the manual on how to compile the tool from git clone download?

Thanks!

java.lang.IllegalArgumentException: Byte 77 unknown

I'm working on the human genome (from ncbi) and I get this error message.


  INFO          00:07:20  Thread 1: Working on contig NC_000001.11:248300969-248400968
  INFO          00:07:20  Thread 1: Working on contig NC_000001.11:248676662-248776661
  java.lang.IllegalArgumentException: Byte 77 unknown
        at lib.util.Base.valueOf(Base.java:74)
        at lib.data.storage.container.FileReferenceProvider.getReferenceBase(FileReferenceProvider.java:90)
        at lib.data.storage.container.FileReferenceProvider.getReferenceBase(FileReferenceProvider.java:68)
        at lib.data.assembler.DataAssembler.createDefaultDataContainer(DataAssembler.java:50)
        at lib.data.assembler.DataAssembler.assembleData(DataAssembler.java:39)
        at lib.util.ReplicateContainer.getNullDataContainer(ReplicateContainer.java:61)
        at lib.util.ConditionContainer.getNullDataContainer(ConditionContainer.java:42)
        at jacusa.worker.CallWorker.createParallelData(CallWorker.java:41)
        at lib.worker.AbstractWorker.hasNext(AbstractWorker.java:111)
        at lib.worker.AbstractWorker.processReady(AbstractWorker.java:196)
        at lib.worker.AbstractWorker.run(AbstractWorker.java:213)

The log shows that the error seems to occur at the end of the first analyzed sequence (NC_000001.11).
Any help would be greatly appreciated
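
For reference, byte 77 is the ASCII code for 'M', which in IUPAC nucleotide notation is the ambiguity code for A or C. Since the trace passes through FileReferenceProvider, the reference FASTA may contain such a character in that window. A minimal Python sketch that counts characters outside an allowed alphabet; treating A, C, G, T and N as the allowed set is an assumption about what JACUSA accepts, and unexpected_bases is a hypothetical helper:

```python
from collections import Counter


def unexpected_bases(fasta_lines, allowed="ACGTN"):
    """Count sequence characters not in `allowed` (case-insensitive)."""
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            continue  # FASTA header line, not sequence
        for base in line.strip().upper():
            if base not in allowed:
                counts[base] += 1
    return counts


# Made-up FASTA fragment with one 'M' and one 'R' ambiguity code.
fasta = [">NC_000001.11", "ACGTMACGT", "NNNNRACGT"]
print(chr(77), dict(unexpected_bases(fasta)))
```

Running this over the reference would show whether ambiguity codes are present at all and how many positions are affected.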

java.lang.IndexOutOfBoundsException & Exception in thread "Thread-25" java.lang.StackOverflowError

Hi,
I am using JACUSA for my project.
I have gDNA (from bwa aligner) and cDNA (from STAR aligner) bam files sorted and indexed.
I used the following command with java 1.7.
(hs.bed contains chromosome 1 to Y)

java -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1,D -b hs37d5.bed -p 26 -r rddsnov27aoption.out -s gDNA.bam cDNA.bam &> jacnov27aoption.log

Then I got the following errors. The process got stuck and produced no output, only tmp.gz files.

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at java.util.Collections$UnmodifiableList.get(Collections.java:1211)
at jacusa.filter.storage.DistanceFilterStorage.processRecord(DistanceFilterStorage.java:39)
at jacusa.pileup.builder.AbstractPileupBuilder.processRecord(AbstractPileupBuilder.java:358)
at jacusa.pileup.builder.AbstractPileupBuilder.adjustWindowStart(AbstractPileupBuilder.java:178)
at jacusa.pileup.iterator.AbstractWindowIterator.adjustWindowStart(AbstractWindowIterator.java:155)
at jacusa.pileup.iterator.AbstractWindowIterator.adjustCurrentGenomicPosition(AbstractWindowIterator.java:148)
at jacusa.pileup.iterator.TwoSampleIterator.hasNext(TwoSampleIterator.java:38)
at jacusa.pileup.worker.AbstractWorker.processParallelPileupIterator(AbstractWorker.java:183)
at jacusa.pileup.worker.AbstractWorker.run(AbstractWorker.java:67)

Exception in thread "Thread-25" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

Complete error log attached as pdf below.

jacusaerrorforgitissue.pdf

Please suggest how these errors could be resolved.
Thank you!
Priya

Jacusa command log in output file header

Hi,

For keeping track of things it would be super to have a copy of the chosen command line options in the first line of any JACUSA output file, prefixed with ##. This helps keep track of versioning, databases used, etc.

Cheers
/Alistair

bed format error

Processing a BED file with a header produces an error (the workaround of using a BED file without a header works).

This breaks the processing:

(the example from http://genome.ucsc.edu/FAQ/FAQformat#format1)
track name=pairedReads description="Clone Paired Reads" useScore=1
chr1 3073253 3079322

INFO 00:00:00 Computing overlap between sequence records.
java.lang.ArrayIndexOutOfBoundsException: 1
at lib.util.coordinate.provider.BedCoordinateProvider.read(BedCoordinateProvider.java:53)
at lib.util.coordinate.provider.BedCoordinateProvider.<init>(BedCoordinateProvider.java:33)
at lib.util.AbstractMethod.initCoordinateProvider(AbstractMethod.java:194)
at lib.cli.CLI.processArgs(CLI.java:181)
at lib.util.AbstractTool.run(AbstractTool.java:53)
at jacusa.JACUSA.main(JACUSA.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:61)

JACUSA2 Version: 2.0.0-RC5 call-1 -b mm10_ALL_GENES_plus_minus_5000bp_pos_v2.bed -r tmp ../sorted/myfile.Aligned.out.srt.rg-added.dedup.bam

java.lang.NullPointerException
at lib.worker.WorkerDispatcher.hasNext(WorkerDispatcher.java:73)
at lib.worker.WorkerDispatcher.run(WorkerDispatcher.java:80)
at lib.util.AbstractTool.run(AbstractTool.java:65)
at jacusa.JACUSA.main(JACUSA.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:61)
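
Until headers are handled, the workaround mentioned above (a BED file without a header) can be scripted. A minimal Python sketch, assuming header lines are those whose first token is track or browser, as in the UCSC format description. Skipping '#' lines and blank lines is an extra convenience assumed here, and strip_bed_header is a hypothetical helper:

```python
def strip_bed_header(lines):
    """Return only the BED lines that look like interval records."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment (assumed safe to drop)
        if stripped.split()[0] in ("track", "browser"):
            continue  # UCSC-style header line
        kept.append(line)
    return kept


# The two-line example from the report.
bed = [
    'track name=pairedReads description="Clone Paired Reads" useScore=1',
    "chr1 3073253 3079322",
]
print(strip_bed_header(bed))
```

The first token is compared exactly, so a sequence whose name merely begins with "track" is still kept.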

java.lang.Exception: Sequence Dictionaries of BAM files do not match

Hi,
I have been trying to use JACUSA using a gDNA bam and cDNA bam file with the command below:

java -jar JACUSA_v1.3.0.jar call-2 -a H:1,B,Y -c 10 -F 1024 -f B -P RF-FIRSTSTRAND,RF-FIRSTSTRAND -p 2 -r rdds_out gDNA.bam cDNA.bam

The RNA is stranded and the DNA is whole exome, both files are sorted, duplicates have been marked and they are indexed.

I have tested it with the example data you provided and it worked but now when using my own data I get the following error:

OutputWriter: rdds_out
[ INFO ] 00:00:00 : Computing overlap between sequence records.
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:61)
Caused by: java.lang.Exception: Sequence Dictionaries of BAM files do not match
at jacusa.method.AbstractMethodFactory.getSAMSequenceRecords(AbstractMethodFactory.java:168)
at jacusa.method.call.TwoSampleCallFactory.initCoordinateProvider(TwoSampleCallFactory.java:273)
at jacusa.JACUSA.main(JACUSA.java:238)
... 5 more

Do you have any suggestions on how this could be resolved?

Best wishes,
Kerstin
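
To narrow this down, it can help to compare the @SQ lines of the two BAM headers (for example as printed by samtools view -H). A minimal Python sketch, assuming the header text is already available as a string. The helper names are hypothetical, the example headers are made up, and which properties JACUSA actually compares (names, lengths, order) is not stated in the error message:

```python
def parse_sq(header_text):
    """Extract (name, length) pairs from the @SQ lines of a SAM header."""
    seqs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field is TAG:VALUE; split once so values may contain ':'.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
        seqs.append((tags["SN"], int(tags["LN"])))
    return seqs


def dictionary_diff(header_a, header_b):
    """Return (only in A, only in B, identical including order)."""
    a, b = parse_sq(header_a), parse_sq(header_b)
    only_a = [s for s in a if s not in b]
    only_b = [s for s in b if s not in a]
    return only_a, only_b, a == b


# Made-up headers: same length, but different naming conventions.
gdna = "@HD\tVN:1.6\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
cdna = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
print(dictionary_diff(gdna, cdna))
```

Differences in naming (1 versus chr1), in contig length, or in contig order would all show up in the returned tuple.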

Stranded data:

Hi,

Just writing for a definition: In the manual, you guys write:
FR-FIRSTSTRAND STRANDED library - first strand sequenced,
FR-SECONDSTRAND STRANDED library - second strand sequenced, and
UNSTRANDED UNSTRANDED library.

So we have generated data that are sequenced with the KAPA RNA Hyperprep Kit and works like this:

  1. 1st strand cDNA synthesis using random priming;
  2. combined 2nd strand synthesis and A-tailing, which
    converts the cDNA:RNA hybrid to double-stranded
    cDNA (dscDNA), incorporates dUTP into the second
    cDNA strand for stranded RNA sequencing, and adds
    dAMP to the 3' ends of the resulting dscDNA;
  3. adapter ligation, where dsDNA adapters with 3' dTMP
    overhangs are ligated to library insert fragments; and
  4. library amplification, to amplify library fragments
    carrying appropriate adapter sequences at both ends
    using high-fidelity, low-bias PCR. The strand marked
    with dUTP is not amplified, allowing strand-specific
    sequencing.

Does that mean that FR-SECONDSTRAND (STRANDED library - second strand sequenced) is the right option for us? In other words, does this option mean that the second strand will be identical to the RNA sequence?

Thanks a bunch,

How is Phasing handled?

For example if I have the following reference sequence with two variants/mutations:

image

How would this look in the VCF output? Does it matter if mutations come from the same read? I assume that the output would look like:

1 12345 C T ...
1 12348 T A ...

How do we count low quality bases and how do they influence read end?

With base call quality filtering set to 20
and given the following Read:

Position: 1 2 3 4
Base calls: A C G T
Base Quality: 40 40 40 10

How are low quality base calls treated?
And how do they influence the read end? If at all.

Could it be that low quality bases carry RT arrest information?!
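
To make the question concrete, here is the example read in Python with two possible definitions of the read end. This only illustrates the two interpretations and is not a statement of what JACUSA does; treating a quality equal to the threshold as passing is also an assumption, and filter_base_calls is a hypothetical helper:

```python
def filter_base_calls(bases, quals, min_bq=20):
    """Keep (position, base, quality) triples with quality >= min_bq."""
    return [
        (pos, base, qual)
        for pos, (base, qual) in enumerate(zip(bases, quals), start=1)
        if qual >= min_bq
    ]


bases = "ACGT"
quals = [40, 40, 40, 10]
kept = filter_base_calls(bases, quals)

# Interpretation 1: the read end is defined by the alignment itself.
read_end_alignment = len(bases)
# Interpretation 2: the read end is the last base that passes the filter.
read_end_filtered = max(pos for pos, _, _ in kept)

print(kept)
print(read_end_alignment, read_end_filtered)
```

Under the first interpretation the low-quality T at position 4 still marks the read end; under the second, the read end moves to position 3.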

Homopolymer filtering

For the Y pileup filter, the documentation states "Filter wrong variant calls in the vicinity of homopolymers. Default 7 (Y:length)". What do you mean by "in the vicinity"? Does that mean that the variant is embedded within a homopolymer of length at least 7, or can the variant lie outside the homopolymer ("in its vicinity" implies outside)? I assume the length is the minimum length of the homopolymer.

java.lang.ArrayIndexOutOfBoundsException

Is it normal to have these kinds of exceptions in the standard out or standard error? I have them in most of my output files.

java.lang.ArrayIndexOutOfBoundsException: 42
        at jacusa.pileup.builder.WindowCache.addHighQualityBaseCall(WindowCache.java:72)
        at jacusa.pileup.builder.UnstrandedPileupBuilder.addHighQualityBaseCall(UnstrandedPileupBuilder.java:65)
        at jacusa.pileup.builder.AbstractPileupBuilder.processAlignmentMatch(AbstractPileupBuilder.java:537)
        at jacusa.pileup.builder.AbstractPileupBuilder.processRecord(AbstractPileupBuilder.java:401)
        at jacusa.pileup.builder.AbstractPileupBuilder.adjustWindowStart(AbstractPileupBuilder.java:178)
        at jacusa.pileup.iterator.AbstractWindowIterator.initLocation(AbstractWindowIterator.java:66)
        at jacusa.pileup.iterator.AbstractTwoSampleIterator.<init>(AbstractTwoSampleIterator.java:41)
        at jacusa.pileup.iterator.TwoSampleIterator.<init>(TwoSampleIterator.java:26)
        at jacusa.pileup.worker.TwoSampleCallWorker.buildIterator(TwoSampleCallWorker.java:44)
        at jacusa.pileup.worker.AbstractWorker.run(AbstractWorker.java:66)

command:

java -jar jacusa call-2 -P UNSTRANDED,UNSTRANDED -a B,S,D,H:1,I,Y,L,M -F 1024 -f V -r jacusa.vcf -p 6 wgs_align.bam rnaseq_nodups.bam &> jacusa.log

Asterisks in FILTER and INFO column of VCF output

As the title describes, I see asterisks in the FILTER and INFO fields in the vcf output, for example:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  wgs_tumor.bam    nodups.bam
1       14574   .       A       G       .       *       *       DP:BC   5:5,0,0,0       18:4,0,14,0
1       14590   .       G       A       .       *       *       DP:BC   5:0,0,5,0       19:12,0,7,0
1       14599   .       T       A       .       B;D     *       DP:BC   7:0,0,0,7       11:4,0,0,7
1       14673   .       G       C       .       H       *       DP:BC   8:0,2,6,0       28:0,1,27,0

I've read the VCF file format specification and nowhere can I find what '*' indicates in the output. I see that if FILTER is not 'PASS' then a semi-colon separated list of codes for filters that failed will be shown. Does this mean that lines with '*' in their FILTER column failed all filters? What about the INFO column? I see asterisks all the way down the file in the INFO column.

Report data for sites not called as edited?

Hi,

It would be really useful to have depth and general information for a site regardless of if it is edited or not. Most importantly for call-1, but also for other modes.
Why? Because knowing the depth of reads in a position, from a sample with theoretically no editing (a good genetic control), allows "no editing, good read depth" vs "no editing - was it lack of read depth OR lack of editing?"

Cheers
/Alistair

File format for rcoverage

  • support for replicates
  • support in JacusaHelper
  • store read arrest and read through counts
  • store Beta-Statistic and p-value

java.lang.OutOfMemoryError: GC overhead limit exceeded error

Hi,
Thank you for developing this software which looks promising.

I tried to launch an analysis with Jacusa on an RNAseq dataset:

  • 14 libraries of ~30M paired reads of species A (4 different tissues)
  • 14 libraries of ~30M paired reads of species B (4 different tissues)

Alignments were produced with GEM (http://algorithms.cnag.cat/wiki/The_GEM_library)
I'm trying to identify SNPs segregating the 2 species sequenced (considering all tissues, all replicates), so I try this command:

java -jar JACUSA_v1.2.0.jar call-2 -p 30 -r test.res $(ls ../data/bams/with_RG_no_dup/23*bam | paste -sd ',') $(ls ../data/bams/with_RG_no_dup/25*bam | paste -sd ',') &> std_out_err.txt &

Everything apparently ran smoothly for 7 hours and then I got this error for several threads:
Exception in thread "Thread-8" java.lang.OutOfMemoryError: GC overhead limit exceeded

And then a "java.lang.ArrayIndexOutOfBoundsException" for other threads.

I attached the log of the failed run.

I would try to fix this by setting: java -Xms1024M -Xmx2048M (considering this, I'll be running 20 threads on a 64Gb RAM machine).

In case you have another proposition, please let me know.
Thanks !

Etienne
std_out_err.txt

unexpected command line complaint with -r

java -Xmx6g -jar JACUSA_v2.0.0-RC5.jar call-1 -b Sppl2a.bed -f V -r Sppl2a.jacusa2.vcf myfile.bam
INFO 00:00:00 Computing overlap between sequence records.

JACUSA2 Version: 2.0.0-RC5 call-1 -b Sppl2a.bed -f V -r Sppl2a.jacusa2.vcf myfile.bam

java.lang.IllegalArgumentException: Cannot set a file type if the output is not to a file.
at htsjdk.variant.variantcontext.writer.VariantContextWriterBuilder.setOutputFileType(VariantContextWriterBuilder.java:185)
at jacusa.io.format.call.VCFcallWriter.<init>(VCFcallWriter.java:59)
at jacusa.io.format.call.VCFcallFormat.createWriter(VCFcallFormat.java:41)
at lib.worker.WorkerDispatcher.<init>(WorkerDispatcher.java:53)
at lib.util.AbstractMethod.getWorkerDispatcherInstance(AbstractMethod.java:60)
at lib.util.AbstractTool.run(AbstractTool.java:64)
at jacusa.JACUSA.main(JACUSA.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:61)

How do we define pseudocounts in rt-arrest?

In cases where no read through or read end can be observed at a specific position we need to define pseudocount(s).

Currently, JACUSA 2.x assumes a constant pseudocount of 1 for
read through and arrest. (1 each)

Can we do better?
Considering read ends in the neighborhood of the current position?
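
As a starting point for discussion, here is a minimal Python sketch of the current constant pseudocount next to one hypothetical neighborhood-based alternative. The arrest fraction used here is only an illustrative quantity, not necessarily the statistic JACUSA computes, and both function names as well as the flank parameter are assumptions:

```python
def arrest_fraction(arrest, through, pseudocount=1):
    """Fraction of arrest reads after adding the same pseudocount to both."""
    a = arrest + pseudocount
    t = through + pseudocount
    return a / (a + t)


def neighbourhood_mean(read_ends, index, flank=2):
    """Hypothetical alternative: mean read-end count of the `flank`
    positions on each side of `index`, excluding `index` itself."""
    left = read_ends[max(0, index - flank):index]
    right = read_ends[index + 1:index + 1 + flank]
    window = left + right
    return sum(window) / len(window) if window else 0.0


# Constant pseudocount of 1 each: no observations give a fraction of 0.5.
print(arrest_fraction(0, 0))

# Made-up read-end counts along five positions; the position of interest
# is index 2, where no read end was observed.
read_ends = [0, 2, 0, 4, 0]
print(neighbourhood_mean(read_ends, 2))
```

A value from neighbourhood_mean could replace the constant 1 for the arrest count, so that positions in regions with many read ends receive a larger pseudocount.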

Example of valid pileup call for JACUSA

Hi Michael,
would it be possible to provide a valid example of a command line call of JACUSA's pileup mode?
I couldn't find any in the manual or elsewhere.

No special parameters, just the BAM file as input.
I tried:
java -jar /home/ralf/JACUSA_v1.2.0.jar pileup /home/ralf/mybam.sorted.bam
This just displayed the help page.

What would the command line call look like if all reference positions (including non-mismatches) are desired as output?

Many thanks
Ralf

JacusaHelper update

  • Add Tutorial to the manual
  • Move "obsolete" pileup filters in JACUSA to JacusaHelper: RareVariant, MinDifference

Explanation about JACUSA output

Hi,

I am doing RDDs with JACUSA (working great !)

  • My test statistic scores range from 0.001 to 300.
    Is this score meaningful when working without replicates?
    What would be a decent/acceptable minimum (10, 100, 200)?

  • In the manual you mention 'base IJ columns indicate inverted base count if on negative strand'.
    In this case, is the vector (A,C,G,T) inverted for the RNA sample (FR-FIRSTSTRAND) on the minus strand as (T,G,C,A)?
    Is the following example correctly interpreted for the minus strand ('115' corresponds to C or G)?

stat	strand	bases11	        bases21		        DNA - A	DNA - C	DNA - G	DNA - T	RNA - A	RNA - C	RNA - G	RNA - T
175.09	+	0,615,0,0	0,0,0,109	=>	0	615	0	0	0	0	0	109
287.89	-	399,0,0,0	0,0,115,0	=>	399	0	0	0	0	115	0	0
  • About the VCF output: is the reported ALT base the one with the highest number of reads in sample 2 (after the REF base)?
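
The interpretation asked about in the second point can be written down in a few lines of Python. This is a sketch of the reading proposed in the question (the vector is reported as (T,G,C,A) on the minus strand), not a confirmation of how JACUSA fills the column; invert_counts and label are hypothetical helpers:

```python
BASES = "ACGT"


def invert_counts(counts):
    """Reverse a 4-element count vector, which maps A<->T and C<->G."""
    return list(reversed(counts))


def label(counts):
    """Attach base names to a count vector given in (A, C, G, T) order."""
    return dict(zip(BASES, counts))


# RNA column of the minus-strand row in the table above.
reported = [0, 0, 115, 0]
print(label(reported))
print(label(invert_counts(reported)))
```

Read as (A,C,G,T) the 115 reads are G; read as the inverted vector they are C, which is the reading used in the table above.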

Thanks !

Exception in thread "Thread-0" java.lang.NullPointerException

Hello,
I have a sorted and indexed BAM file from an experimental sample which I am attempting to probe for editing activity (specifically ADAR1 mediated A to G editing). My BAM was aligned to the mm10 genome, and allowed multi-mapping.
I am looking for mismatches in specific BED files corresponding to ERVs, using a command like the following:
java -jar JACUSA_v1.3.0.jar call-1 -b my_bed_file -r JACUSA.out my_bam_file

I end up with the following error message:
Exception in thread "Thread-0" java.lang.NullPointerException
at jacusa.pileup.builder.AbstractPileupBuilder.<init>(AbstractPileupBuilder.java:57)
at jacusa.pileup.builder.UnstrandedPileupBuilder.<init>(UnstrandedPileupBuilder.java:26)
at jacusa.pileup.builder.UnstrandedPileupBuilderFactory.newInstance(UnstrandedPileupBuilderFactory.java:20)
at jacusa.pileup.iterator.AbstractWindowIterator.createPileupBuilders(AbstractWindowIterator.java:94)
at jacusa.pileup.iterator.AbstractOneSampleIterator.<init>(AbstractOneSampleIterator.java:28)
at jacusa.pileup.iterator.OneSampleIterator.<init>(OneSampleIterator.java:24)
at jacusa.pileup.worker.OneSampleCallWorker.buildIterator(OneSampleCallWorker.java:37)
at jacusa.pileup.worker.OneSampleCallWorker.buildIterator(OneSampleCallWorker.java:1)
at jacusa.pileup.worker.AbstractWorker.run(AbstractWorker.java:66)

I also receive a number of exceptions like the following:
java.lang.IllegalStateException: No MD field present for SAMRecord: A00873:245:HL2FWDSXY:1:1345:19416:14826_GGTTT 1/2 95b aligned read.

Is there something obvious I'm missing and needs to be fixed?
