hall-lab / speedseq

A flexible framework for rapid genome analysis and interpretation

License: MIT License

Makefile 1.53% Shell 3.07% Python 0.88% Perl 7.99% C++ 0.96% C 75.63% Java 0.28% Lua 0.87% M4 0.35% Roff 8.42%

speedseq's Issues

vcfstream sort instead of bedtools sort

This will greatly reduce memory requirements in merging parallel runs of freebayes

read-depth for BND variants

annotate the read depth of intrachromosomal BND variants

Variant calling fails with [ti_index_core] the chromosome blocks not continuous

speedseq somatic -v fails with the following output:

Calling somatic variants...

    create temporary directory

    /opt/speedseq//bin/freebayes -f /opt/vcp_files/human_g1k_v37.fasta \
        --pooled-discrete \
        --min-repeat-entropy 1 \
        --genotype-qualities \
        --min-alternate-fraction 0.05 \
        --min-alternate-count 2 \
        --region $chrom:$start..$end \
        /bamdir/126_normal.bam /bamdir/126_1.bam \
        | somatic_filter 10 18 0 \
        > /workdir/126_1.$chrom:$start..$end.vcf

    cat /workdir/var_command.txt | /opt/speedseq//bin/parallel -j 32

    grep "^##" /workdir/126_1.MT:12136..12498.vcf \
    | cat - <(echo '##INFO=<ID=SSC,Number=1,Type=Float,Description="Somatic score">') <(grep "^#CHROM" /workdir/126_1.MT:12136..12498.vcf) > /workdir/header.txt

    cat /workdir/126_1."$chrom:$start..$end".vcf | grep -v "^#" \
        | sort -k1,1 -k2,2n | cat /workdir/header.txt - \
        | /opt/speedseq//bin/bgzip -c > /workdir/126_1.vcf.gz

    /opt/speedseq//bin/tabix -f -p vcf /workdir/126_1.vcf.gz 
[ti_index_core] the chromosome blocks not continuous at line 1567, is the file sorted? [pos12247310]

The vcf file is indeed not correctly sorted. The problem seems to be related to locale settings:

$ env | grep LANG
LANG=en_US.UTF-8
$ sort --help
<snip>
 *** WARNING ***
The locale specified by the environment affects sort order.
Set LC_ALL=C to get the traditional sort order that uses
native byte values.
<snip>

I suggest setting LC_ALL=C in SpeedSeq to avoid issues with different locale settings.
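
The suggestion above can be sketched in Python for anyone wrapping the sort step themselves. This is a minimal sketch, not SpeedSeq's code: the helper names c_locale_env and sort_vcf_body are hypothetical, and it assumes a POSIX sort is on the PATH. It runs the same sort -k1,1 -k2,2n seen in the log, but with LC_ALL=C forced in the child environment so the result does not depend on the caller's LANG.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Return a copy of the environment with LC_ALL forced to C."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"  # byte-value collation, independent of LANG
    return env


def sort_vcf_body(lines):
    """Sort VCF record lines by chromosome, then numerically by position.

    `lines` is a list of newline-terminated strings without header lines.
    The external sort runs under the C locale (hypothetical helper).
    """
    proc = subprocess.run(
        ["sort", "-k1,1", "-k2,2n"],
        input="".join(lines),
        env=c_locale_env(),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return proc.stdout.splitlines(True)
```

Setting the variable only for the child process keeps the rest of the pipeline's locale untouched.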

installation error

Hi,
There are some problems with the installation of speedseq.
The error log:

$ make

make align
make[1]: Entering directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq'
make -C src/bwa
make[2]: Entering directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq/src/bwa'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq/src/bwa'
cp src/bwa/bwa bin
cp src/sambamba bin
make -C src/samblaster
make[2]: Entering directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq/src/samblaster'
make[2]: *** No targets specified and no makefile found. Stop.
make[2]: Leaving directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq/src/samblaster'
make[1]: *** [samblaster] Error 2
make[1]: Leaving directory `/faststorage/home/siyang/USER/yeweijian/PipelineTest/Speedseq/speedseq'
make: *** [all] Error 2

Could you help?

segmentation fault for speedseq sv

Hi Colby,

I want to let you know that speedseq runs fine for all my samples except this one, which
gives me a segmentation fault.

I am not sure why this particular one gives me trouble. I have 50 others finished without error.
Also let you know that the new svtyper now gives me correct genotypes.

/risapps/rhel6/speedseq/0.1.0//bin/lumpyexpress: line 411: 31825 Segmentation fault      (core dumped) $LUMPY $PROB_CURVE -t ${TEMP_DIR}/${OUTBASE} -msw $MIN_SAMPLE_WEIGHT -tt $TRIM_THRES $EXCLUDE_BED_FMT $LUMPY_DISC_STRING $LUMPY_SPL_STRING > $OUTPUT

Thanks,
Ming

speedseq should check for commands in $PATH

This is something of a nit but ...
If I already have freebayes, bwa, etc. in my $PATH, it seems redundant to have to update speedseq.config. I see some of the vars have the format:

BEDTOOLS=`which bedtools || true`

Is there a reason that others aren't like that?

Setting $SPEEDSEQ_HOME=/usr/local/
works if the binaries are in the same place, but that's not always the case...
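
The lookup order being asked for can be sketched as follows. This is a hedged illustration in Python rather than the shell syntax of speedseq.config: resolve_tool is a hypothetical helper, and the fallback directory stands in for something like $SPEEDSEQ_HOME/bin. It prefers whatever is already on the PATH and only then falls back to a bundled copy.

```python
import os
import shutil


def resolve_tool(name, fallback_dir=None):
    """Return the path to `name`, preferring PATH over a bundled copy.

    Returns None when the tool is found in neither place, so the caller
    can report one clear error listing everything that is missing.
    """
    found = shutil.which(name)
    if found:
        return found
    if fallback_dir:
        candidate = os.path.join(fallback_dir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

For example, resolve_tool("bedtools", "/usr/local/bin") would mirror the which bedtools || true pattern while still supporting a fixed install location.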

CNVnator ROOT error

@s-boardman forked from #40

Hi Colby,

I've been able to run this today and no longer have the chromosome naming error.

However, what we now have is what looks like a root error (output below):

/opt/gridware/pkg/apps/speedseq/0.0.3a/gcc-4.4.6+root-5.34.30+samtools-0.1.19+python-2.7.3/bin/cnvnator-multi: error while loading shared libraries: libCore.so.5.34: cannot open shared object file: No such file or directory
Should I open a new ticket, or do you think this is related?

speedseq waits indefinitely?

I am trying to run speedseq aln, and although it appears I have all the prerequisites installed, the process just sits there at 0% CPU indefinitely.

Here are my inputs: (http://clavius.bc.edu/~erik/speedseq/)
http://clavius.bc.edu/~erik/speedseq/chr20_bit.fa
http://clavius.bc.edu/~erik/speedseq/sample005.fa_1.fastq
http://clavius.bc.edu/~erik/speedseq/sample005.fa_2.fastq

Please let me know what obvious thing I'm doing wrong :)

Bam input for alignment

Thanks for creating a wonderful tool.
Is there a way to use bam files as input for the alignment step?
It's very time consuming to generate fastqs (sorting bams and bam2fastx steps) from bams.

speedseq sv CNVnator not finding bam file

I'm testing speedseq on our cluster and am running into an issue where calling speedseq sv with CNVnator doesn't find the bam file passed to it.

The command I'm using is:

qsub -V -e e_r -o o_r -b Y -cwd -N speedseq_sv_rd \
speedseq sv -R /mnt/lustre/references/hg19/hg19_validated.fa \
-o NA12877 -g -k -d -v \
-B /mnt/archive/analysis/projects/HiSeq/CNV_Genome_Project/External_Data/Illumina_Platinum_Genomes/NA12877/bam_fromfastq_frombam/NA12877.bam \
-S /mnt/archive/analysis/projects/HiSeq/CNV_Genome_Project/External_Data/Illumina_Platinum_Genomes/NA12877/bam_fromfastq_frombam/NA12877.splitters.bam \
-D /mnt/archive/analysis/projects/HiSeq/CNV_Genome_Project/External_Data/Illumina_Platinum_Genomes/NA12877/bam_fromfastq_frombam/NA12877.discordants.bam

And the error I receive is:

Traceback (most recent call last):
  File "/opt/gridware/pkg/apps/speedseq/0.0.3a/gcc-4.4.6+root-5.34.30+samtools-0.1.19+python-2.7.3/bin/cnvnator_wrapper.py", line 350, in <module>
    chroms_list = get_chroms_list(args.bam)
  File "/opt/gridware/pkg/apps/speedseq/0.0.3a/gcc-4.4.6+root-5.34.30+samtools-0.1.19+python-2.7.3/bin/cnvnator_wrapper.py", line 153, in get_chroms_list
    proc = subprocess.Popen(['samtools', 'view', '-H', bam_fn], stdout = subprocess.PIPE)
  File "/opt/gridware/pkg/apps/python/2.7.3/gcc-4.4.6/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/opt/gridware/pkg/apps/python/2.7.3/gcc-4.4.6/lib/python2.7/subprocess.py", line 1249, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

And from the verbose output I can see that this is the echoed CNVnator command:

# run cnvnator-multi
    /opt/gridware/pkg/apps/python/2.7.3/gcc-4.4.6/bin/python2.7 /opt/gridware/pkg/apps/speedseq/0.0.3a/gcc-4.4.6+root-5.34.30+samtools-0.1.19+python-2.7.3/bin/cnvnator_wrapper.py \
--cnvnator /opt/gridware/pkg/apps/speedseq/0.0.3a/gcc-4.4.6+root-5.34.30+samtools-0.1.19+python-2.7.3/bin/cnvnator-multi \
-T NA12877.V0rHAeDGFjgn/cnvnator-temp -t 1 -w 100 \
-b /mnt/archive/analysis/projects/HiSeq/CNV_Genome_Project/External_Data/Illumina_Platinum_Genomes/NA12877/bam_fromfastq_frombam/NA12877.bam \
NA12877.V0rHAeDGFjgn/NA12877.bam.readdepth \
-c /mnt/lustre/references/speedseq/annotations/cnvnator_chroms -g GRCh37

Therefore /mnt/archive/analysis/projects/HiSeq/CNV_Genome_Project/External_Data/Illumina_Platinum_Genomes/NA12877/bam_fromfastq_frombam/NA12877.bam is args.bam and gives an OSError. However, this file definitely exists and was generated by speedseq align without issue. If I run the speedseq sv command without CNVnator (-d flag), the process completes without errors and I get a genotyped vcf.

Any thoughts/ideas would be greatly appreciated!
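
One thing worth checking here: when subprocess.Popen itself raises OSError [Errno 2], the file it could not find is usually the executable (here samtools), not the BAM passed as an argument. A missing BAM would make samtools exit with an error rather than raise in Python. The sketch below is not the wrapper's actual code; read_bam_header is a hypothetical helper that makes the two failure modes distinguishable.

```python
import shutil
import subprocess


def read_bam_header(bam_fn, samtools="samtools"):
    """Return the BAM header text, with a clear error if samtools is absent."""
    exe = shutil.which(samtools)
    if exe is None:
        # This is the situation that surfaces as OSError [Errno 2] in Popen.
        raise RuntimeError("%r not found on PATH" % samtools)
    proc = subprocess.Popen(
        [exe, "view", "-H", bam_fn],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    header, _ = proc.communicate()
    if proc.returncode != 0:
        # samtools ran but could not read the file (e.g. wrong path).
        raise RuntimeError("samtools failed on %r" % bam_fn)
    return header
```

On a cluster it may help to run which samtools inside the submitted job, since the PATH on the compute node can differ from the login shell.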

Realign: option to change read groups such as ID and SM

Feature request: an option to change read group fields such as ID and SM while using realign.

This could perhaps be implemented with bamaddrg (https://github.com/ekg/bamaddrg) or any tool of your choice.

gawk is an unspecified dependency

It looks like gawk is utilized explicitly in speedseq, but not currently listed as a prerequisite. I'm seeing errors in a Docker image where I did not explicitly install gawk.

CNVnator failed -- it's missing a header file: TFrame.h

Hi,
The HPC staff is installing speedseq for me and she told me that CNVnator can not be installed:
CNVnator failed -- it's missing a header file: TFrame.h

Any ideas?
Thanks,
Ming

Generating discordant and splitter bam file

Hi Colby,

I want to run the speedseq sv functionality on existing bwa-mem produced Tumor/Normal bam file.

I tried using samblaster to generate these, but the splitter bam file always comes out empty. Do you have any suggestions, or other ways to quickly get these files?

Thank you in advance for your help.

Best,
Ronak

allow joint var calling for "var" and "lumpy" modules

experimental-gls

When running Var module : Getting error messages
"/speedseq/bin/freebayes: unrecognized option '--experimental-gls'
did you mean --use-best-n-alleles ? "

It seems the "experimental-gls" option is deprecated.

I hope it does not affect the outcome?

add -C comment to SAM header in speedseq aln

use automated installation instructions for GEMINI

Much easier and more stable:

http://gemini.readthedocs.org/en/latest/content/installation.html#automated-installation

CNVnator multi sample

make CNVnator run on multi sample LUMPY alignments.

bwa mem proper pairs

Note that the current version of bwa mem (0.7.7) occasionally marks extremely distant reads as concordant (~250 Mb insert). This will seriously bias the LUMPY insert distribution, as calculated by pairend_distro.pl.
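
A toy calculation shows how strongly one such pair can distort the estimate. The insert sizes below are made up for illustration, and this is not how pairend_distro.pl computes its distribution; it only demonstrates the effect of a single 250 Mb "concordant" pair on an untrimmed mean and standard deviation.

```python
import statistics


def summarize(insert_sizes):
    """Return (mean, sample standard deviation) of the insert sizes."""
    return statistics.mean(insert_sizes), statistics.stdev(insert_sizes)


# 1000 hypothetical concordant pairs centred on 500 bp.
typical = [480, 490, 500, 510, 520] * 200

# The same library plus one pair wrongly flagged as proper at ~250 Mb.
with_outlier = typical + [250000000]

mean_clean, sd_clean = summarize(typical)
mean_biased, sd_biased = summarize(with_outlier)
```

With these numbers, the one outlier among 1001 pairs moves the mean from 500 bp to roughly 250 kb, which is why outliers need to be excluded before the distribution is built.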

Error running speedseq var

Hello,
I am trying to run 'speedseq var' on the 'speedseq aln' output bam file.
I am getting an error when speedseq var tries to run freebayes. It seems speedseq requires an older version of freebayes.
I have the most recent version of freebayes installed (v9.9.13), and the error I am getting is: unrecognized option --region
Can you please suggest which version of freebayes is compatible with the speedseq var command?
Also, is speedseq being upgraded to use the most recent version of freebayes?

Thanks
Priti

What is the difference between speedseq and GATK

As we know, GATK HaplotypeCaller is used for SNP calling.
As speedseq is used for SNV calling, what is the difference between these two tools?
Is it possible to use speedseq for SNP calling?

var/somatic handling empty windows param

I think it will stall with the current config. It should query the entire genome.

update svtyper

don't forget to change the '-d' flag

wrong mapping

Hi,
I found some weird behaviour in BWA, where it maps reads to the wrong reference sequence.
Here are the reads that cause the problem (pb.fastq):
@M01342:47:000000000-A9VMJ:1:2110:28593:14009 1:N:0:1
ATCGGACCAGGCTTCATTCCC
+
CBCCCCCCCFCCGGGGGGGGG
@M01342:47:000000000-A9VMJ:1:2112:10072:14449 1:N:0:1
ATCGGACCAGGCTTCATTCCC
+
AAABABBBBFAAFGGGGGGGG

Here is my reference genome, brassica_pb.fa (these are actually microRNAs, but it should work for them too, right?):

>bna-miR167c_MIMAT0005628_Brassica_napus_miR167c
TGAAGCTGCCAGCATGATCTA
>bna-miR166a_MIMAT0005629_Brassica_napus_miR166a
TCGGACCAGGCTTCATTCCCC

here are my commands:
/home/ctuser/Documents/programmes/bwa-0.7.10/bwa index brassica_pb.fa
/home/ctuser/Documents/programmes/bwa-0.7.10/bwa aln -t 12 -n 0 -k 0 brassica_pb.fa pb.fastq > ./res_BWA_pb_newVersion/pb.sai
/home/ctuser/Documents/programmes/bwa-0.7.10/bwa samse brassica_pb.fa ./res_BWA_pb_newVersion/pb.sai $f > ./res_BWA_pb_newVersion/pb.sam

here is my sam file:
@SQ SN:bna-miR167c_MIMAT0005628_Brassica_napus_miR167c LN:21
@SQ SN:bna-miR166a_MIMAT0005629_Brassica_napus_miR166a LN:21
@PG ID:bwa PN:bwa VN:0.7.10-r789 CL:/home/ctuser/Documents/programmes/bwa-0.7.10/bwa samse brassica_pb.fa ./res_BWA_pb_newVersion/pb.sai pb.fastq
M01342:47:000000000-A9VMJ:1:2110:28593:14009 4 bna-miR167c_MIMAT0005628_Brassica_napus_miR167c 21 25 21M * 0 0 ATCGGACCAGGCTTCATTCCC CBCCCCCCCFCCGGGGGGGGG XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:21
M01342:47:000000000-A9VMJ:1:2112:10072:14449 4 bna-miR167c_MIMAT0005628_Brassica_napus_miR167c 21 25 21M * 0 0 ATCGGACCAGGCTTCATTCCC AAABABBBBFAAFGGGGGGGG XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:21

We can see that both reads have been mapped to the first mir, but they should not have mapped at all (or perhaps to the second one, but I set the parameters to allow no mismatches)!
I tried three different versions of BWA (0.7.5, 0.6.1 and 0.7.10) and got the same behaviour.
Did I do something wrong?
I must say it usually works fine and finds the right mir, but in two cases, including this one, it found the mir just before the right one. Is this a problem with the sequence length?
Thanks for any help

reference is not a tree: 8a570f867dde07fbe9a025f2eec706d44b82966c

git clone --recursive git://github.com/cc2qe/speedseq
…
…
Initialized empty Git repository in /speedseq/src/parallel/.git/
fatal: reference is not a tree: 8a570f867dde07fbe9a025f2eec706d44b82966c
Unable to checkout '8a570f867dde07fbe9a025f2eec706d44b82966c' in submodule path 'src/parallel'

Appears the reference is to a commit that’s not been pushed to the main repo

Yum for installation assumes one flavor of OS.

speedseq sv fails when a BAM file has fewer than 10 million reads

TypeError: %d format: a number is required, not numpy.float64

bamtofastq error

Hi there,

I am sorry to bother you again. speedseq realign was working for me, but after the HPC staff re-installed it, I ran into a problem with the bamtofastq python script. Can you tell what's wrong here? Thank you! Note that it is the same bam file (which was successfully remapped before) that now gives me the error. My bam file should contain the cigar information for the reads.

Ming

Traceback (most recent call last):
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 157, in <module>
    sys.exit(main())
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 153, in main
    args.header)
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 64, in bamtofastq
    if 5 in [x[0] for x in al.cigar]:
TypeError: 'NoneType' object is not iterable
samblaster: Version 0.1.21
samblaster: Inputting from stdin
samblaster: Outputting to stdout
samblaster: Opening temp/disc_pipe for write.
samblaster: Opening temp/spl_pipe for write.
[gzclose] buffer error
samblaster: Loaded 84 header sequence entries.
Traceback (most recent call last):
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 157, in <module>
    sys.exit(main())
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 153, in main
    args.header)
  File "/risapps/src6/speedseq//bin/bamtofastq.py", line 64, in bamtofastq
    if 5 in [x[0] for x in al.cigar]:
TypeError: 'NoneType' object is not iterable
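
The traceback shows al.cigar coming back as None, which can happen for records without a CIGAR (for example unmapped reads). A defensive version of the check on line 64 could look like the sketch below. This is an assumption about a possible fix, not the project's actual patch; has_hard_clip is a hypothetical helper, and the only fact taken from the BAM format is that CIGAR operation code 5 means hard clip.

```python
BAM_CHARD_CLIP = 5  # CIGAR operation code for a hard clip (H)


def has_hard_clip(cigar):
    """Return True if a list of (operation, length) tuples has a hard clip.

    Accepts None or an empty list, which is what a read without a CIGAR
    string yields, and treats both as "no hard clip".
    """
    if not cigar:
        return False
    return any(op == BAM_CHARD_CLIP for op, _length in cigar)
```

The failing line would then read if has_hard_clip(al.cigar): instead of iterating over al.cigar directly.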

Variant calling: "Please specify a BAM file or files" error when bed file contains headers

Variant calling fails after several hours with "Please specify a BAM file or files" error message when the bed file contains headers:

$ head -n 5 ~/SureSelect_AllExon_V5_hg19_target_coordinates.bed
browser position chr1:65510-65625
track name="Covered" description="Agilent SureSelect DNA - SureSelectXT Human All Exon V5 - Genomic regions covered by probes" color=0,0,128
chr1    65509   65625   -
chr1    65831   65973   -
chr1    69481   69600   ens|ENST00000335137,ccds|CCDS30547.1,ref|NM_001005484,ref|OR4F5
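
Until header lines are handled by speedseq itself, a workaround is to strip them from the BED file first. The sketch below is hypothetical (strip_bed_headers is not part of speedseq) and assumes that header lines are the ones starting with browser, track or #, as in the file shown above.

```python
def strip_bed_headers(lines):
    """Yield only the interval lines of a BED file.

    Skips "browser" and "track" header lines, comment lines, and blank
    lines; everything else is passed through unchanged.
    """
    for line in lines:
        if not line.strip():
            continue
        if line.startswith(("browser", "track", "#")):
            continue
        yield line
```

For example, writing the output of strip_bed_headers(open("targets.bed")) to a new file gives a BED that contains only chromosome, start and end records.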

installing root for CNVnator

Nice pipeline, especially the CNV part! CNVnator gives many valid CNV calls complementary to lumpy.

When installing root locally for CNVnator, it has to be compiled without a prefix, otherwise it won't run through the entire speedseq pipeline (at least for me). I know it is counterintuitive. Just type
./configure
make

No "make install" is required. Everything gets compiled locally, and the libs get linked when you run source /pathto/root/bin/thisroot.sh.

change pairend_distro -x

use wider stdev, either 4 or 5 in constructing the insert distribution for lumpy

make merging splitters and discordants default in speedseq realign

Issue with CNVnator installation

I am having an issue identical to the closed issue #28. I have sourced the root installation and tried the suggestion from the seqanswers link you found (http://seqanswers.com/forums/showthread.php?t=16665).

When we run make cnvnator-multi, we get the error:
g++ -m64 -O3 -DCNVNATOR_VERSION="v0.3" -I/net/gs/vol3/software/modules-sw/ROOT/5.34.14/Linux/RHEL6/x86_64/include -Isrc/samtools -c src/cnvnator.cpp -o src/obj/cnvnator.o
In file included from src/cnvnator.cpp:8:
src/HisMaker.hh:11:20: error: TFrame.h: No such file or directory

I'm wondering if it may be related to the version of Root we are using (5.34.14) or if that matters.

Otherwise, the speedseq installation is working and we can run the speedseq sv without the "-d" flag.

Also, other than reading about the impact on sensitivity and FDR in the Lumpy paper, I don't have a full understanding of how the read depth is utilized by Lumpy and its impact on the output. I'm sure there is a line in the README or something I have missed. Could you point me to the best spot for that?

Thanks very much for your support and work on this software.

Kind regards,
Seamus Ragan

Allow user to specify CNVnator window size

Particularly useful for low read sequencing analysis.

pysam not installed issue

Hi there,

I was running speedseq realign, and it complains that pysam is not installed. But I do have pysam installed, and the error message looks truncated... I can open python and import pysam without a problem.
Do you have an idea what's the problem? Thanks.

Sourcing executables from /risapps/rhel6/speedseq/0.0.3/bin/speedseq.config ...

Checking for required python modules ()...

Program: speedseq
Version: 0.0.3a
Author: Colby Chiang ([email protected])

usage: speedseq [options]

command: align    align FASTQ files with BWA-MEM
         var      call SNV and indel variants with FreeBayes
         somatic  call somatic SNV and indel variants in a tumor/normal pair with FreeBayes
         sv       call SVs with LUMPY
         realign  re-align from a coordinate sorted BAM file

options: -h show this message

Error: pysam is not installed for

speedseq sv on BAM without splitters/discordant

Do auto name-sorting and split/duplicate read extraction

LICENSE?

Hi!

In the paper you state that the code is open source

It would be awesome to see an explicit license for the repo 😃

running speedseq_setup.py twice does not re-download everything

Would it be possible for speedseq_setup to check pre-existing versions against the version that speedseq is trying to install? Additionally, could you pass make -j <num_cores> into the make commands, or run certain steps in parallel, since installation takes a while?

Output splitters and discordants without query or quality strings

SAM spec allows * in the query and quality strings. Since these fields are not used by LUMPY, we can use a * and greatly reduce file size.

However, sambamba 0.4.7 has a bug that causes it to barf with those BAMs. Need to upgrade to sambamba 0.5.1, which has patched it. However, there are issues with creation and destruction of sort temp directories with sambamba 0.5.1 which need to be resolved before upgrading.
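
The size reduction described above amounts to blanking two columns of each alignment record. As a minimal sketch on text SAM lines (strip_seq_qual is a hypothetical helper, not the code used in speedseq), SEQ and QUAL are the 10th and 11th tab-separated fields, and header lines start with @:

```python
def strip_seq_qual(sam_line):
    """Replace SEQ and QUAL with '*' in one SAM alignment line.

    Header lines (starting with '@') are returned unchanged. Optional
    tags after column 11 are preserved.
    """
    if sam_line.startswith("@"):
        return sam_line
    fields = sam_line.rstrip("\n").split("\t")
    fields[9] = "*"   # SEQ
    fields[10] = "*"  # QUAL
    return "\t".join(fields) + "\n"
```

Any downstream tool that reads these files has to tolerate the placeholders, which is exactly the sambamba compatibility concern raised above.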

Error: pysam is not installed for

Hi,
I used the "speedseq sv" command to call SVs on the test data. It reported that pysam is not installed, although I do have pysam installed (version 0.8.3) and can import it locally without any errors.
The version of speedseq I used is 0.1.0. Does anyone have the same problem in running speedseq?
Many thanks.
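
One possibility, which is an assumption and not something confirmed in this thread, is that the Python interpreter speedseq invokes differs from the one used interactively. A quick way to test that is to run a small check with each interpreter and compare the results. check_module below is a hypothetical helper for that purpose.

```python
import importlib
import sys


def check_module(name):
    """Return (interpreter path, module file or None) for `name`.

    The second element is None when this interpreter cannot import the
    module, which is the case the "not installed" message refers to.
    """
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return (sys.executable, None)
    return (sys.executable, getattr(mod, "__file__", None))
```

Printing check_module("pysam") under the interpreter configured in speedseq.config and under the interactive one would show whether both resolve to the same installation.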

allow non-interleaved fastq for speedseq aln

Query

Just a quick q...what kind of run times are you getting for freebayes on 100bp 30x PE exome-seq, for example?

Cheers

Steve

error writing all requested bytes to file

Thanks for developing speedseq, I've been using it to call SVs in quads and it usually works except occasionally I get the following error (pasted below). Could you please help resolve this issue? N.b. sometimes I resubmit failed jobs and they work.

Error in TFile::WriteBuffer: error writing all requested bytes to file /oasis/tscc/scratch/wb/lumpy/temp/speedseq_sv_74-0115/cnvnator-temp/03C14334-sorted-rmdups-realigned-bqsr.bam.root, wrote 1230 of 4272
Error in TTree::Fill: Failed filling branch:7.rd_parity, nbytes=-1, entry=4405788
This error is symptomatic of a Tree created as a memory-resident Tree
Instead of doing:
   TTree *T = new TTree(...)
   TFile *f = new TFile(...)
you should do:
   TFile *f = new TFile(...)
   TTree *T = new TTree(...)
R__unzip: error -5 in inflate (zlib)
Error in TBasket::ReadBasketBuffers: fNbytes = 4272, fKeylen = 73, fObjlen = 31926, noutot = 0, nout=0, nin=4199, nbuf=31926
Error in TBranch::GetBasket: File: /oasis/tscc/scratch/wb/lumpy/temp/speedseq_sv_74-0115/cnvnator-temp/03C14334-sorted-rmdups-realigned-bqsr.bam.root at byte:2721053490, branch:rd_parity, entry:4389825, badread=1, nerrors=1, basketnumber=275
R__unzip: error -5 in inflate (zlib)
Error in TBasket::ReadBasketBuffers: fNbytes = 4272, fKeylen = 73, fObjlen = 31926, noutot = 0, nout=0, nin=4199, nbuf=31926
.
.
.
.
(and so on...)

ploidy level

Speedseq should have a command line parameter to pass the ploidy level to freebayes.

Awk for loop issue

On Ubuntu Linux, the awk for-loops in the execution script cause problems, because the syntax used is only supported by a GNU awk extension, as described in this Stack Overflow question: http://stackoverflow.com/questions/16921493/awk-illegal-reference-to-array-a

I changed the loops to the following notation: for(i=1;i in fmt;i++) and it seems to work. Could you adopt that in the source code in order to support all awk flavours?

Thanks

Cnvnator Error

We have installed Root and tested that it works. The SpeedSeq sv also works without the -d option. We have added
source /mnt/pan/Data4/speedseq/root-v5-34/bin/thisroot.sh to the end of the speedseq.config.
When we run the speedseq example (run_speedseq.sh) with the -d option to get CNV, we receive the following message:
--Example script:
../bin/speedseq sv
-o example
-B example.bam
-S example.splitters.bam
-D example.discordants.bam
-R data/human_g1k_v37_20_42220611-42542245.fasta
-d
Below is the message on the terminal:
Calculating read depth
Traceback (most recent call last):
  File "/mnt/pan/Data4/local/vxv89/sxs1528/speedseq/speedseq/bin/cnvnator_wrapper.py", line 350, in <module>
    chroms_list = get_chroms_list(args.bam)
  File "/mnt/pan/Data4/local/vxv89/sxs1528/speedseq/speedseq/bin/cnvnator_wrapper.py", line 153, in get_chroms_list
    proc = subprocess.Popen(['samtools', 'view', '-H', bam_fn], stdout = subprocess.PIPE)
  File "/home/sxs1528/anaconda/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/sxs1528/anaconda/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Please give us suggestions as what to do next.

Thank you

check for BWA index

check for BWA index before alignment rather than hanging forever while throwing a completely uninformative error
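
Such a pre-flight check could be sketched as below. This is a hypothetical helper in Python, not the shell code speedseq would actually use, and it assumes the standard index files written by bwa index in the 0.7.x series (.amb, .ann, .bwt, .pac, .sa next to the reference FASTA).

```python
import os

# File suffixes that `bwa index` (0.7.x) writes next to the reference.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(reference):
    """Return the list of expected index files that do not exist."""
    return [
        reference + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not os.path.isfile(reference + suffix)
    ]
```

An empty list means alignment can proceed; otherwise the caller can exit immediately with a message naming the missing files and suggesting bwa index.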

terriblecoding

is glia going to be part of speedseq var?

Thanks.