bowtie's People

Contributors

aidanreilly8, alagu, benlangmead, bmwiedemann, ch4rr0, christopherwilks, daissi, ianml, knkarthik, mlafave, val-antonescu, wookietreiber

bowtie's Issues

Incorrect results when using large index

I found that when mapping a pair of reads using bowtie 1 (tested with 1.1.1 and 1.1.2), the results when using a large index (a truly large index, not a large-index created from a small fasta) are incorrect.

The command used here was:

bowtie-1.1.1/bowtie --tryhard -a --quiet -X 2000 -v 3 --12 /var/tmp/job_o45b8iO6CN_bowtieIn.tmp Ensembl/Homo_sapiens/release-84/cdna_unspliced/lib > bowtie_o45b8iO6CN_complete_largeidx_output.txt

To prove that the large index was the problem, I split the input fasta I used to generate the large index in two parts, and generated small indexes from each part. Then I ran the same bowtie analysis on each small-index part and concatenated the results.

The two-part small index analysis returned correct results (see bowtie_o45b8iO6CN_split_merged_output.txt), returning perfect matches on ENST00000551148, ENST00000549155, ENST00000546991, ENST00000392979 and ENST00000392977.
The large-index analysis, however, returned:

  • a perfect match for ENST00000551148 and ENST00000549155
  • a match with mismatches for ENST00000392977
  • no match for ENST00000546991 and ENST00000392979

This page proves that the read-pair from the input file should match perfectly to the five above mentioned transcripts.

Can anyone please look into this?

PS. I would attach the fasta file I used to generate this large index, but at 5.8Gb it is too large to upload to GitHub. If you want it, let me know and we can work out how to share it.
bowtie_o45b8iO6CN_complete_largeidx_output.txt
bowtie_o45b8iO6CN_split_merged_output.txt
job_o45b8iO6CN_bowtieIn.txt

bowtie-1.2.1.1 fails w/ SIGPROF and Python 2.7.12

I've installed bowtie-1.2.1.1 with our Python 2.7.12 installation on RHEL7. I see odd behavior where bowtie --version runs fine several times in a row, but then suddenly crashes with a SIGPROF (Profiling timer expired) from the kernel. This never occurs with the Python 2.5 that comes with the OS. Any explanation? bowtie2 appears to run fine under Python 2.7.12.

Bowtie 1.2 tutorial "No such file or directory"

Attempting to run the Bowtie 1.2 tutorial, I run into an early issue.

fisherlabs-imac:bowtie-1.2 fisherlab$ bowtie e_coli reads/e_coli_1000.fq
Traceback (most recent call last):
  File "/Users/fisherlab/Desktop/bowtie-1.2/bowtie", line 91, in <module>
    main()
  File "/Users/fisherlab/Desktop/bowtie-1.2/bowtie", line 88, in main
    os.execv(bin_spec, bowtie_args)
OSError: [Errno 2] No such file or directory

I believe my path is correct. I experience this issue whether or not I use ./ before the bowtie command. I may just be inexperienced, but it would be great to have this issue clarified. If there is a bug, I hope it can be corrected!

small bug in bowtie script affecting identification of large indexes

The current bowtie script is setting its last argument as the index file basename, which, for a command like this

bowtie e_coli reads/e_coli_1000.fq

will return the name of the reads file as the index basename, rather than the basename provided on the command line.

This doesn't matter for small indexes, because the small index binary is called by default, and this binary is able to find the appropriate indexes. However, this issue causes bowtie to miss large indexes, unless the --large-index flag is used.

For the command above, this could be fixed by changing:

idx_basename = arguments[-1]

to

idx_basename = arguments[-2]

but this would still fail for large indexes with commands like:

bowtie -t e_coli reads/e_coli_1000.fq e_coli.map

This issue surfaced for me while attempting some large Trinity assemblies that call bowtie. With the introduction of large indexes, it looks like a different method should be used for identifying idx_basename in the bowtie command.

    if '--large-index' in options:
        bin_spec = os.path.join(ex_path, bin_l)
    elif len(arguments) >= 1:
        idx_basename = arguments[-1]
        large_idx_exists = os.path.exists(idx_basename + idx_ext_l)
        small_idx_exists = os.path.exists(idx_basename + idx_ext_s)
        if large_idx_exists and not small_idx_exists:
            bin_spec = os.path.join(ex_path, bin_l)
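
A more robust approach might be to probe every positional argument instead of relying on a fixed position. The sketch below is only an illustration and not the fix adopted in the bowtie script: `find_index_basename` is a hypothetical helper, and the suffixes `.1.ebwt` and `.1.ebwtl` are assumed to correspond to the script's `idx_ext_s` and `idx_ext_l`.

```python
import os

# Assumed suffixes of the first small/large index file; in the wrapper
# script these would correspond to idx_ext_s and idx_ext_l.
IDX_EXT_S = ".1.ebwt"
IDX_EXT_L = ".1.ebwtl"


def find_index_basename(arguments, exists=os.path.exists):
    """Return (basename, use_large) for the first argument naming an index.

    `arguments` is the list of positional command-line arguments.
    Returns (None, False) if no argument matches an index on disk.
    """
    for arg in arguments:
        small = exists(arg + IDX_EXT_S)
        large = exists(arg + IDX_EXT_L)
        if small or large:
            # Use the large binary only when no small index is present,
            # mirroring the condition in the excerpt above.
            return arg, large and not small
    return None, False
```

With this approach, `bowtie -t e_coli reads/e_coli_1000.fq e_coli.map` would resolve to `e_coli` regardless of how many arguments follow it, as long as the index files exist.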

Bowtie cannot be killed on MacOS

I am running bowtie2.2.9 on MacOS 10.12.2. I wanted to abort the program, which was running in the background, but there was no way to do so. I tried "Quit" and "Force Quit" through the task manager, but the program kept running. Then I tried kill -9 and kill -15 in the terminal window, but the program still kept running at 100% CPU usage. The only way to stop the program was to reboot my machine. I have never seen this with any other program. Does anyone know what the problem is?

problem with fastq input

I have a FASTQ file from a CRISPR screen that looks like this:
@ERR376999.1 1/1
CCTGACAGTCTCCCGCGCT
+
EEEEEFFEFEEFEEEEEEF
@ERR376999.2 2/1
CATTTATTTTTCGGAGTGC
+
EEEEEEFEEFEEEEEFEFF
@ERR376999.3 3/1
AGTTCAAAATCCAGTCTAC

The file can be aligned with bowtie2 without any problem, but when I try to align it with bowtie, it reports this error:
Saw ASCII character 10 but expected 33-based Phred qual.

I also tried the untrimmed FASTQ file, which reports a different error:
Error: reads file does not look like a FASTQ file

The untrimmed file looks like this:
@ERR376999.1 1/1
CTTGTGGAAAGGACGAAACACCGCCTGACAGTCTCCCGCGCTGTTTTAGA
+
B@CDDDDEEEEEEEEEEEEEEFEEEEEEFFEFEEFEEEEEEFEEEEEDED
@ERR376999.2 2/1
CTTGTGGAAAGGACGAAACACCGCATTTATTTTTCGGAGTGCGTTTTAGA
+
B@CDDDDEDEEEEEEEEEEEFFFEEEEEEFEEFEEEEEFEFFEEEFFEEE
@ERR376999.3 3/1
CTTGTGGAAAGGACGAAACACCGAGTTCAAAATCCAGTCTACGTTTTAGA

What could be the problem? Thanks.
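
ASCII character 10 is the newline, so the first error suggests the parser hit the end of a line where it still expected quality values. A quick sanity check can rule out malformed records before digging further. The sketch below is not part of bowtie; `check_fastq` is a hypothetical helper that assumes strict four-line records.

```python
def check_fastq(lines):
    """Return a list of (record_number, problem) for four-line FASTQ records."""
    lines = [line.rstrip("\r\n") for line in lines]
    problems = []
    complete = len(lines) - len(lines) % 4
    for i in range(0, complete, 4):
        header, seq, plus, qual = lines[i:i + 4]
        rec = i // 4 + 1
        if not header.startswith("@"):
            problems.append((rec, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((rec, "separator does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((rec, "sequence and quality lengths differ"))
    if complete != len(lines):
        # Leftover lines that do not form a full record.
        problems.append((complete // 4 + 1, "truncated record"))
    return problems
```

Running it over the whole file, for example `check_fastq(open("reads.fq"))` with a placeholder filename, would show whether any record deviates from the excerpt posted here.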

No reads processed from stdin in tab5 format when using --12 -

It appears reads in tab5 format are processed when using --12 filename but not when read from stdin using --12 -.

$ bowtie --version
bowtie-align version 1.2
64-bit
Built on ElCapitan.local
Sun Feb 19 14:05:23 GMT 2017
Compiler: InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Options: -O3 -m64  -stdlib=libstdc++ -DWITH_TBB -DPOPCNT_CAPABILITY -DNO_SPINLOCK -DWITH_QUEUELOCK=1  
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

$ cat fragments.tab5 
H06HDADXX130110:2:2116:3345:91806	GTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAGGGGTAGGGTTAGGGTTAGGGGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTAGGGCTAGGGTTAAGGGTAGGGTTAGCGAAAGGGCTGGGGTTAGGGGTGCGGGTACGCGTAGCATTAGGGCTAGAAGTAGGATCTGCAGTGCCTGACCGCGTCTGCGCGGCGACTGCCCAAAGCCTGGGGCCGACTCCAGGCTGAAGCTCAT	>=<=???>?>???=??>>8<?>=2=<===1194<?;:?>>?#3==>###########################################################################################################################################################################################################	TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTACCCCTAACCCTAACCCTAACCCTAACCCGTACCCTAAACCCAACCCTAACCACAAAGCAAATCCCAACCTTAACCGGAACCCGAAATCTCGCAGCAAATCTGCAGTAGAGACGCAGACTCAACCATGCGTCTATTAGTACGCATTATCATTGCCTCATGCTTCTTAAGTACAGAGAGATGAC	==;<?>@@@<>>@??<>>???<=>>?>:><@?4=:>7=5=>:=@;'@A?########################################################################################################################################################################################################
...

$ bowtie -S GCA_000001405.25_GRCh38.p10_genomic --12 fragments.tab5 
@HD	VN:1.0	SO:unsorted
@SQ	SN:CM000663.2	LN:248956422
...
@PG	ID:Bowtie	VN:1.2	CL:"bowtie-align --wrapper basic-0 -S GCA_000001405.25_GRCh38.p10_genomic --12 fragments.tab5"
H06HDADXX130110:2:2116:3345:91806	77	*	0	0	*	*	0	0	GTTAGGGTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAGGGGTAGGGTTAGGGTTAGGGGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGGTAGGGCTAGGGTTAAGGGTAGGGTTAGCGAAAGGGCTGGGGTTAGGGGTGCGGGTACGCGTAGCATTAGGGCTAGAAGTAGGATCTGCAGTGCCTGACCGCGTCTGCGCGGCGACTGCCCAAAGCCTGGGGCCGACTCCAGGCTGAAGCTCAT	>=<=???>?>???=??>>8<?><=2=<===1194<?;:?>>?#3==>###########################################################################################################################################################################################################	XM:i:0
H06HDADXX130110:2:2116:3345:91806	141	*	0	0	*	*	0	0	TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTACCCCTAACCCTAACCCTAACCCTAACCCGTACCCTAAACCCAACCCTAACCACAAAGCAAATCCCAACCTTAACCGGAACCCGAAATCTCGCAGCAAATCTGCAGTAGAGACGCAGACTCAACCATGCGTCTATTAGTACGCATTATCATTGCCTCATGCTTCTTAAGTACAGAGAGATGAC	==;<?>@@@<>>@??<>>???<=>>?>:><@?4=:>7=5=>:<=@;'@A?########################################################################################################################################################################################################	XM:i:0
...
# reads processed: 4
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 4 (100.00%)
No alignments

$ cat fragments.tab5 | bowtie -S GCA_000001405.25_GRCh38.p10_genomic --12 -
@HD	VN:1.0	SO:unsorted
@SQ	SN:CM000663.2	LN:248956422
...
@PG	ID:Bowtie	VN:1.2	CL:"bowtie-align --wrapper basic-0 -S GCA_000001405.25_GRCh38.p10_genomic --12 -"
# reads processed: 0
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 0 (0.00%)
No alignments

bowtie_inspect.cpp:197: bad delete ?

bowtie_inspect.cpp:197:2: warning: 'delete' applied to a pointer that was allocated with 'new[]'; did you mean 'delete[]'? [-Wmismatched-new-delete]

From

delete buf;

to

delete [] buf;

Add Support for gzipped FASTQ Files

It would be convenient if the input files could be gzip-compressed, particularly when Bowtie is run from within RSEM and I have no control over the Bowtie command line being executed.

Could not locate a Bowtie index corresponding to basename "genome"

Hi,
Recently I was using bowtie 1.2.1.1 to map FASTQ files to the mouse genome. These are my command lines:
bowtie-build --threads 10 Mus_musculus.GRCm38.dna_rm.toplevel.fa genome

nohup bowtie genome /data/disk4/chip/SRR3051599.fastq -S nibpl.sam -p 20 -m 1 &

However, I got the error message "Could not locate a Bowtie index corresponding to basename "genome"".

I do not know how to solve this problem.

I would appreciate your help.
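
A useful first check is whether bowtie-build actually finished and left a complete set of index files next to the basename in the directory where bowtie is run. The sketch below lists which files are missing. The six suffixes are an assumption about what bowtie-build writes (`.ebwt` for a small index, `.ebwtl` for a large one), and `missing_index_files` is a hypothetical helper.

```python
import os

# File suffixes that bowtie-build is assumed to write for one index.
SUFFIXES = [".1", ".2", ".3", ".4", ".rev.1", ".rev.2"]


def missing_index_files(basename, large=False, exists=os.path.exists):
    """Return the index files for `basename` that are not present on disk."""
    ext = ".ebwtl" if large else ".ebwt"
    expected = [basename + suffix + ext for suffix in SUFFIXES]
    return [path for path in expected if not exists(path)]
```

If `missing_index_files("genome")` returns every file, the index is probably in a different directory; if it returns only some of them, the build may have been interrupted.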

bowtie-1.2.1.1 randomly reports wrong lines of output when processing reads from stdin

I've used bowtie 1.2.1.1 with input from stdin and with the following parameters:
zcat DATAFILE.fastq.gz | bowtie -q --sam --phred33-quals --tryhard --best --strata --chunkmbs 256 -p 8 -v 1 -M 1 PATHTOINDEX/GRCm38.p4 -
The call reported in the resulting sam file is the following:
bowtie-align --wrapper basic-0 -q --sam --phred33-quals --tryhard --best --strata --chunkmbs 256 -p 8 -v 1 -M 1 PATHTOINDEX/GRCm38.p4 -
Unfortunately, this command intermittently produces erroneous output lines. The problem is hard to reproduce because it does not occur on every run. An example of a corrupted output line:
NB501946:83:HLJCMBGX3:1:11101:24657:11782 16 NB501946: 3:HLJCMBGX3: :11101:23739:11977 0 EEEEEA XA:i: TAAGTGCGTGCATGTATATGA AEEEEEEEEEEEEEEEEEEEE XA:i:1 MD:Z:20T0 NM:i:1 XM:i:2
After re-running the same command, the line is output correctly as:
NB501946:83:HLJCMBGX3:1:11101:24657:11782 16 chr1 49321593 0 18M * 0 0 TGGGGGCTCGTCCGGGAT EAEEEEEEEEEEEEEEEA XA:i:0 MD:Z:18 NM:i:0 XM:i:2
All of this was run in a HPC environment running slurm and integrated in a pipeline based on bpipe.
Do you have any idea what could be the problem? Running the same commands with bowtie 1.1.2, everything ran smoothly and without any errors.
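
Because the corruption is intermittent, it may help to scan each SAM file automatically rather than spot bad lines by eye. The sketch below flags alignment lines that violate a few basic rules of the SAM format (11 mandatory tab-separated fields, integer FLAG and POS, SEQ and QUAL of equal length). `malformed_sam_lines` is a hypothetical helper, and the checks are deliberately minimal.

```python
def malformed_sam_lines(lines):
    """Yield (line_number, reason) for alignment lines that look corrupted."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield number, "fewer than 11 mandatory fields"
            continue
        flag, pos, seq, qual = fields[1], fields[3], fields[9], fields[10]
        if not (flag.isdigit() and pos.isdigit()):
            yield number, "FLAG or POS is not an integer"
        elif qual != "*" and len(seq) != len(qual):
            yield number, "SEQ and QUAL lengths differ"
```

A pipeline step could fail the job whenever this generator yields anything, which would at least make the problem visible as soon as it happens.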

pending difficulty with finding long index

Following an earlier issue report, the line
idx_basename = arguments[-1]
in the "bowtie" script was changed to
idx_basename = arguments[-2]
This, however, does not solve the problem, because the command line has the form
bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]
Hence the index basename can sit at any position from -1 to -5, counted from the end of the argument list.

Trying to run bowtie source files

I would like to know how to run the source files of bowtie. I downloaded the source code, but so far I have not managed to run it. There are many C++ and Python files, such as

bowtie_build_main.cpp
bowtie.py 

and many others.

When I try to run

g++ bowtie_build_main.cpp -o bowtie_build_main

I get the following error:

/tmp/ccpcVAHD.o: In function `main':
bowtie_build_main.cpp:(.text+0x355): undefined reference to `bowtie_build'
bowtie_build_main.cpp:(.text+0x420): undefined reference to `bowtie_build'
collect2: error: ld returned 1 exit status

or, when I try to run bowtie.py, I get

python: can't open file 'bowtie.py': [Errno 2] No such file or directory

How can I run the source code files? Thanks.

Do fast bowtie with option -c

Hello,

I tried to call this command as a subroutine from my Python script:
seq1 = "CATGTAGCTAG"
subprocess.call(["bowtie", "-c", "index.bwt", seq1])

But each search takes around 3 seconds, and I have a lot of sequences, so I need it to be fast. Is there a more efficient way to do this from my code?

PS: I know that if I use a file containing all my sequences, bowtie is fast and efficient.

Regards.
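
Most of the per-call cost is probably process start-up and index loading, so batching should help. With `-c`, bowtie accepts a comma-separated list of sequences, so many queries can share one process. The sketch below is one way to do that; `build_bowtie_command` and `align_all` are hypothetical helpers, and `index.bwt` is simply the basename used in the question.

```python
import subprocess


def build_bowtie_command(index_basename, sequences):
    """Build one bowtie invocation that aligns all `sequences` at once."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    # -c tells bowtie that reads are given on the command line,
    # as a comma-separated list, rather than in a file.
    return ["bowtie", "-c", index_basename, ",".join(sequences)]


def align_all(index_basename, sequences):
    """Run bowtie once for the whole batch and return its stdout as text."""
    command = build_bowtie_command(index_basename, sequences)
    return subprocess.check_output(command, universal_newlines=True)
```

For very large batches the command line can exceed operating-system limits, in which case writing the sequences to a file, as mentioned in the question, is the safer route.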

How to use Windows binaries

I'm trying to use the CasFinder software (http://arep.med.harvard.edu/CasFinder/) on Windows. This tool requires bowtie for genome alignments, and you have to configure CasFinder to point to the bowtie executable. Under OS X, the following works:

#### EXECUTABLES
bowtie_executable   /Users/rintzezelle/Documents/Mascoma/bowtie-1.1.0/bowtie

The bowtie documentation says that there are Windows binaries, so I downloaded and extracted bowtie-1.1.0-mingw-x86_64.zip and pointed CasFinder to "...\bowtie-1.1.0-mingw-x86_64\bowtie", but got the error: "...\bowtie-1.1.0-mingw-x86_64\bowtie" is not recognized as an internal or external command, operable program or batch file.

I looked at the "bowtie" file, and it's just a piece of Python code. How is one supposed to use the Windows binaries? Should I change CasFinder to directly call one of the .exe files?

Version 1.2 Causes Improperly Paired Alignments Error in RSEM

Alignments generated by version 1.2, but not version 1.0.1, cause rsem-parse-alignments to emit an error, which prevents transcript estimation from going ahead.

rsem-parse-alignments /home2/starfish/RSEM/starfish S1.temp/S1 S1.stat/S1 S1.temp/S1.bam 3 -tag XM
Warning: Detected a read pair whose two mates have different names--3NH4HQ1:246:C59MKACXX:5:1101:2705:2182 and 3NH4HQ1:246:C59MKACXX:5:1101:1263:2096!
Read 3NH4HQ1:246:C59MKACXX:5:1101:2705:2182: The two mates do not align to a same transcript! RSEM does not support discordant alignments.

Another user of RSEM had the exact same problem and he had to abandon Bowtie in favour of kallisto because of this blocker. He also found that the problem could be avoided by using Bowtie 2. Can this regression be fixed in Bowtie 1 by version 1.3?

Reads Can Map Entirely Beyond Reference Sequence End

I have a set of thousands of reference sequences with high sequence similarity (i.e. alleles of HLA genes). I notice that bowtie sometimes maps reads beyond the ends of a small number of reference sequences. If I make a minimal example with only one reference sequence and one pair of reads that were mapped beyond the boundary of that particular reference sequence, then bowtie doesn't align the read pair to the reference sequence it mistakenly did before (the read is reported as unaligned). I used the command bowtie -v 0 -a -S indexes/IMGT-HLA/hla -1 R1.fq -2 R2.fq  test.sam. I used version 1.2 downloaded from the website which is pre-compiled for Linux.

CasFinder specificity filter is not working

I ran CasFinder on the installation test file, and my results do not match the provided test results. The program executes normally (no errors in the log file), but it reports all target sequences without filtering out off-targets, and target_threshold_rejection_score is always Inf 1 (I am running it on Windows). According to the test file, I should get 146 possible Cas targets, of which 39 pass the specificity conditions. In my case, all 146 pass. How can I solve this issue?

Crash when built with clang

Just a heads-up: when building bowtie 1.1.2 with clang 3.4, we're seeing intermittent crashes. Below is a backtrace from lldb. Maybe it will reveal a bug in the code that others are encountering as well.

For now, we're working around this by compiling with gcc 4.8.

Cheers,

Jason

FreeBSD unixdev.local bacon ~/trinity-test 408: lldb37 -f /usr/local/bin/bowtie-align-s -c trinity_out_dunn/bowtie-align-s.core
(lldb) target create "/usr/local/bin/bowtie-align-s" --core "trinity_out_dunn/bowtie-align-s.core"
Core file '/home/bacon/trinity-test/trinity_out_dunn/bowtie-align-s.core' (x86_64) was loaded.
(lldb) bt

* thread #1: tid = 0, 0x000000080147e6ca libc.so.7`__sys_thr_kill + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT
  * frame #0: 0x000000080147e6ca libc.so.7`__sys_thr_kill + 10
    frame #1: 0x0000000801553219 libc.so.7`abort + 73
    frame #2: 0x000000000048e82f bowtie-align-s`tthread::thread::wrapper_function(aArg=<unavailable>) + 79 at tinythread.cpp:175
    frame #3: 0x00000008008e34f5 libthr.so.3`??? + 277
(lldb) bt all
* thread #1: tid = 0, 0x000000080147e6ca libc.so.7`__sys_thr_kill + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT
  * frame #0: 0x000000080147e6ca libc.so.7`__sys_thr_kill + 10
    frame #1: 0x0000000801553219 libc.so.7`abort + 73
    frame #2: 0x000000000048e82f bowtie-align-s`tthread::thread::wrapper_function(aArg=<unavailable>) + 79 at tinythread.cpp:175
    frame #3: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #2: tid = 1, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x0000000884416080, p=0x0000000884438000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x0000000884428240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007ffffedf6e08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b2a0) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #3: tid = 2, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x0000000843816080, p=0x0000000843838000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x0000000843828240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007ffffeff7e08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b260) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #4: tid = 3, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x00000008e4c16080, p=0x00000008e4c38000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x00000008e4c28240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007fffff1f8e08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b220) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #5: tid = 4, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x00000008a4816080, p=0x00000008a4838000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x00000008a4828240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007fffff3f9e08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b1e0) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #6: tid = 5, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x0000000803416080, p=0x0000000803438000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x0000000803428240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007fffff5fae08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b1a0) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #7: tid = 6, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x0000000884016080, p=0x0000000884038000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x0000000884028240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007fffff9fce08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b120) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #8: tid = 7, 0x000000080147ef2a libc.so.7_sched_yield + 10, name = 'bowtie-align-s', stop reason = signal SIGABRT frame #0: 0x000000080147ef2a libc.so.7_sched_yield + 10
    frame #1: 0x0000000000493fc5 bowtie-align-sSAMHitSink::reportUnOrMax(this=0x0000000801cfc1c0, p=<unavailable>, hs=<unavailable>, un=<unavailable>) + 1701 at fast_mutex.h:186 frame #2: 0x000000000042b4ce bowtie-align-sHitSinkPerThread::finishRead(this=0x0000000803016080, p=0x0000000803038000, report=, dump=) + 334 at hit.h:1011
    frame #3: 0x0000000000444396 bowtie-align-sUnpairedAlignerV2<EbwtRangeSource>::advance(this=0x0000000803028240) + 438 at aligner.h:515 frame #4: 0x0000000000431089 bowtie-align-sMixedMultiAligner::run(this=0x00007fffffbfde08, verbose=) + 409 at aligner.h:244
    frame #5: 0x0000000000416577 bowtie-align-sseededQualSearchWorkerFullStateful(vp=<unavailable>) + 2071 at ebwt_search.cpp:2233 frame #6: 0x000000000048e7f3 bowtie-align-stthread::thread::wrapper_function(aArg=0x0000000801c1b0e0) + 19 at tinythread.cpp:169
    frame #7: 0x00000008008e34f5 libthr.so.3`??? + 277

    thread #9: tid = 8, 0x00000008008ee8cc libthr.so.3, name = 'bowtie-align-s', stop reason = signal SIGABRT
    frame #0: 0x00000008008ee8cc libthr.so.3
    frame #1: 0x000000000041056b bowtie-align-svoid driver<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) [inlined] void seededQualCutoffSearchFull<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >(seedLen=int at scalar, seedMms=int at scalar) + 1012 at ebwt_search.cpp:2335 frame #2: 0x0000000000410177 bowtie-align-svoid driver<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc > >(type=, ebwtFileBase=, query=, queries=, qualities=, outfile=) + 10391 at ebwt_search.cpp:2749
    frame #3: 0x0000000000406f7e bowtie-align-s::bowtie(argc=17, argv=0x00007fffffffe678) + 5726 at ebwt_search.cpp:2933 frame #4: 0x00000000004979bd bowtie-align-smain(argc=17, argv=0x00007fffffffe678) + 77 at bowtie_main.cpp:55
    frame #5: 0x0000000000404edf bowtie-align-s`_start + 367

Eliminate `--refout`

--refout is not frequently used and overcomplicates the output code in Bowtie, especially as it relates to locking and batch output. I suggest we publicly deprecate it and remove it from the manual for the next release, then remove it entirely in the following release.

misbehavior of --un parameter when more than 1 core is used

According to the manual "As always, the --un, --max and --al parameters print reads exactly as they appeared in the input file."

This turns out not to be exactly true for --un. If more than one processor is used, bowtie apparently sorts (or shuffles?) reads by name before writing them to the file. Moreover, it does so in chunks whose size is limited by the number of active processors. If the original FASTQ file was unsorted or only partially sorted, the output FASTQ will have a different order of reads. This is not an issue with a single core, because then each chunk consists of a single read and no sorting is done. It looks like unintended behavior.
Example with bowtie 1.1.2
Read names extracted from the original FASTQ. As you can tell, they are partially sorted:
@HISEQ:942:HFKY5BCXY:1:1101:1247:2108
@HISEQ:942:HFKY5BCXY:1:1101:1089:2111
@HISEQ:942:HFKY5BCXY:1:1101:1142:2118
@HISEQ:942:HFKY5BCXY:1:1101:1237:2121
@HISEQ:942:HFKY5BCXY:1:1101:1162:2124
@HISEQ:942:HFKY5BCXY:1:1101:1118:2124

bowtie -p 1 --un gives exactly the same order (provided all of them are unmapped, of course)
@HISEQ:942:HFKY5BCXY:1:1101:1247:2108
@HISEQ:942:HFKY5BCXY:1:1101:1089:2111
@HISEQ:942:HFKY5BCXY:1:1101:1142:2118
@HISEQ:942:HFKY5BCXY:1:1101:1237:2121
@HISEQ:942:HFKY5BCXY:1:1101:1162:2124
@HISEQ:942:HFKY5BCXY:1:1101:1118:2124

bowtie -p 50 --un gives scrambled order, different from the original
@HISEQ:942:HFKY5BCXY:1:1101:1089:2111
@HISEQ:942:HFKY5BCXY:1:1101:1103:2154
@HISEQ:942:HFKY5BCXY:1:1101:1083:2150
@HISEQ:942:HFKY5BCXY:1:1101:1118:2124
@HISEQ:942:HFKY5BCXY:1:1101:1120:2216
@HISEQ:942:HFKY5BCXY:1:1101:1162:2124

PS. I just noticed that a newer version of bowtie is available (1.2.0); I haven't tested it yet.
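
To check other versions or thread counts for the same behavior, one can test whether the --un output preserves the input order, that is, whether the output read names form an in-order subsequence of the input read names. A minimal sketch, with `preserves_order` as a hypothetical helper:

```python
def preserves_order(input_names, output_names):
    """Return True if `output_names` is an in-order subsequence of `input_names`."""
    remaining = iter(input_names)
    # `name in remaining` advances the iterator until it finds `name`,
    # so each match must come after the previous one.
    return all(name in remaining for name in output_names)
```

The read names can be pulled from every fourth line of the two FASTQ files, starting with the first, and passed in as lists.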

Option To Exclude Unmapped Reads From SAM

Currently, unmapped reads are included in the SAM file. I have a scenario where 99% of the reads won't map to the reference sequences used (i.e. mapping only to a gene family). This creates unnecessarily large files, which then need to be filtered to reduce their size (e.g. with sed). It'd be nice to have a bowtie option that prevents those reads from ever being written to the SAM file, which would speed up the pipeline by removing that postprocessing (filtering) step.
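
Until such an option exists, the filtering can at least be done on the stream so that the large file never reaches the disk. In SAM, bit 0x4 of the FLAG field marks an unmapped read, so a filter only needs to inspect the second column. The sketch below is a stand-in for that postprocessing step, not a bowtie feature; `drop_unmapped` is a hypothetical name.

```python
UNMAPPED = 0x4  # SAM FLAG bit: segment unmapped


def drop_unmapped(lines):
    """Yield header lines and alignment lines whose read is mapped."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep the header untouched
            continue
        flag = int(line.split("\t", 2)[1])
        if not flag & UNMAPPED:
            yield line
```

Wrapped in a small script that reads `sys.stdin` and writes `sys.stdout`, this could sit directly after bowtie in a pipe.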

Bowtie 1.2 error when using symlink

If I call bowtie using a symlink in Linux (Ubuntu 14.04 LTS) it does not work. I used the binary version. Also the "legacy" package shows the same error.

$ ln -s bowtie-1.2-legacy/bowtie bowtie
$ ./bowtie
Traceback (most recent call last):
  File "./bowtie", line 91, in <module>
    main()
  File "./bowtie", line 88, in main
    os.execv(bin_spec, bowtie_args)
OSError: [Errno 2] No such file or directory

With v1.1.2 it is working.
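
One plausible explanation, which is a guess and not a confirmed diagnosis, is that the wrapper derives the directory of the bowtie-align binaries from the path it was invoked with, so a symlink in another directory sends it to the wrong place. Resolving the link first would avoid that. In the sketch below, `wrapper_directory` is a hypothetical helper:

```python
import os


def wrapper_directory(script_path):
    """Return the directory holding the real script, following symlinks."""
    return os.path.dirname(os.path.realpath(script_path))
```

Inside the wrapper this would be called with the script's own path, so that the binaries are looked up next to the real file rather than next to the link.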

Enrico

Error while writing string output

I'm receiving an error when running with the -p option on an Ubuntu 14.04 machine (tested on a couple of servers)

Here's an example (filenames and paths redacted to spare you those :) ):

command:
./bowtie --verbose -p $(nproc) -v2 -m1 [index file] [fastq file] [output file]

output:
INFO: Command: /bowtie/bowtie-align-s --wrapper basic-0 -p 6 -v2 -m1 [index file] [fastq file] [output file]
Error while writing string output; 1695188 characters in string, 8192 written
terminate called after throwing an instance of 'int'

A quick search shows this error being written to stderr on line 472 of filebuf.h

Thanks for your help!

invalid fastq files produced using --un

When I map a sample of reads, I see a small number of fastq entries in the unmapped file that do not have a leading @ symbol...

The only odd thing I can think of about this dataset is that it has been adapter trimmed and has some 0 length reads. I removed those 0 length reads and do not see the missing prefix reads.

Reads and the corrupt unmapped reads file can be found at:

https://nebiolabs-my.sharepoint.com/personal/langhorst_neb_com/_layouts/15/guestaccess.aspx?guestaccesstoken=WrkBbCgRVJAXVnxTAVmzpyv9DTdHraI8KfQp7KeSimY%3d&docid=0a458680972664713b9951644f44dab26

Sorry for the terrible link - it seems that github won't take a tgz file attachment (3M)

I tried to map these to rhodo CGA009 - http://www.ncbi.nlm.nih.gov/nuccore/39933080?report=fasta

I think this is a bug.
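
A quick way to locate the affected entries in the --un output is a record-level check. The sketch assumes four lines per FASTQ record (no wrapped sequences); `check_fastq` is an illustrative name. It flags missing @ prefixes as well as the zero-length reads mentioned above.

```python
def check_fastq(lines):
    """Return a list of (record_number, problem) for a 4-line-per-record FASTQ."""
    problems = []
    record = []
    number = 0
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) < 4:
            continue
        number += 1
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append((number, "header lacks leading @"))
        if not plus.startswith("+"):
            problems.append((number, "separator lacks leading +"))
        if len(seq) != len(qual):
            problems.append((number, "sequence/quality length mismatch"))
        if len(seq) == 0:
            problems.append((number, "zero-length read"))
        record = []
    if record:
        problems.append((number + 1, "truncated record"))
    return problems
```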

Bowtie-1.2 will not compile with Intel

Are there any instructions for building bowtie-1.2 with the Intel C++ compiler? I am moving forward slowly by modifying header files, which does not seem to be the optimal solution. Any help in this area would be appreciated.

Manual formatting issue

Starting from around the description of the XA:i:<N> extra flag, the formatting goes weird, with some monospace sections and over-indenting.

Add support for named pipes / process substitution

bowtie-1.0.0 currently does not support named pipes or process substitution. These are useful when one wants to uncompress/trim/filter reads on-the-fly, but can't pipe all input data to bowtie's stdin (e.g., when aligning mates).

Here's an example of the error that occurs when the shell (ksh93 in this case) implements process substitution using named pipes:

$ bowtie -S TRANS -1 <(gzip -dc left.fq.gz | my-filter-script) -2 <(gzip -dc right.fq.gz | my-filter-script)
...
@PG     ID:Bowtie       VN:1.0.0        CL:"bowtie -S TRANS -1 /tmp/ksh.f0fiaun -2 /tmp/ksh.f0lliut"
Warning: Could not open read file "/tmp/ksh.f0fiaun" for reading; skipping...
Command: bowtie -S -1 /tmp/ksh.f0fiaun -2 /tmp/ksh.f0lliut TRANS

Tracing system calls with the "truss" command revealed the following sequence of open and close commands:

36612: open("/tmp/ksh.f0fiaun",O_RDONLY,0666)    = 3 (0x3)
36612: open("/tmp/ksh.f0eo6sb",O_RDONLY,0666)    = 4 (0x4)
...
36612: close(3)                                  = 0 (0x0)
36612: open("/tmp/ksh.f0lliut",O_RDONLY,0666)    ERR#2 'No such file or directory'

The named pipe /tmp/ksh.f0fiaun was opened, closed (after which ksh removes the named pipe), and opened again (error, since the named pipe no longer exists). If, instead of using process substitution, named pipes are directly created by the user with the mkfifo command, then there is no error upon the second open(), but bowtie blocks while trying to read from the named pipe, as there would be no writer process (the writer would have received a SIGPIPE when bowtie close()es the named pipe and would have probably terminated).

Updating bowtie to call open() / fopen() only once for a given input reads file would resolve the issue.
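
One way to avoid the second open() is to sniff the input format from a buffered stream without consuming anything, then keep reading from that same stream. The sketch is in Python rather than bowtie's C++, so it only illustrates the idea; `sniff_format` and `open_reads_once` are hypothetical names.

```python
def sniff_format(stream):
    """Guess FASTA/FASTQ from the first byte without consuming any input."""
    first = stream.peek(1)[:1]  # peek leaves the read position unchanged
    if first == b"@":
        return "fastq"
    if first == b">":
        return "fasta"
    return "unknown"


def open_reads_once(path):
    """Open a reads file (regular file or named pipe) exactly once."""
    stream = open(path, "rb")  # buffered reader; never closed and reopened
    return stream, sniff_format(stream)
```

Because the stream is never closed between sniffing and parsing, the writer on the other end of a named pipe does not receive SIGPIPE, and a pipe that the shell removes after the first open is not needed again.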

Failure to truncate R2 read names with spaces

According to the documentation, bowtie truncates read names if they contain spaces. However, it works only for R1 and not R2. For example, the following mates taken from a standard NCBI data set are aligned against GRCm38.p5 transcripts with bowtie v1.2.1.1

bowtie -q --phred33-quals -n 2 -e 200 -l 25 -I 1 -X 1000 --sam-nosq -a -m 200 \
   -S transcripts -1 R1.fq -2 R2.fq > output.sam

R1.fq

@SRR388248.10 110426_HWI-ST642_29.FF:8:1101:1965:1986 length=100
GCGGGAATAGTGGGTACTGCACTAAGTATTTTAATTCGAGCAGAATTTGG
+
CCCCCCCCCC?CCCC>066:.70.8'(.14A?8?=?7=6?;?;<?0??3:

R2.fq

@SRR388248.10 110426_HWI-ST642_29.FF:8:1101:1965:1986 length=100
CAACCAGGTGCACTTTTAGGAGATGACCAAATTCTGCTCGAATTAAAATA
+
HHHHHHHHFHHHHHHHHGHHHHHHHHHHHHHHHHGHHHHHHFHHHHHHEF

output.sam (lines truncated for clarity)

@HD     VN:1.0  SO:unsorted
@PG     ID:Bowtie       VN:1.2.1.1      CL:"bowtie-align --wrapper basic-0 -q --phred33-quals -n 2 ..."
SRR388248.10    77      *       0       0       *       *       0       0       ...
SRR388248.10 110426_HWI-ST642_29.FF:8:1101:1965:1986 length=100 141     *       0       0       *       *       0       0       ...
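
Until the R2 behaviour is fixed, read names can be cut at the first whitespace before alignment so that both mates carry the same short name. A workaround sketch, assuming four-line FASTQ records; `truncate_names` is an illustrative helper.

```python
def truncate_names(lines):
    """Yield FASTQ lines with each header cut at the first whitespace."""
    for i, line in enumerate(lines):
        if i % 4 == 0:  # header line of a 4-line record
            parts = line.split(None, 1)
            yield (parts[0] if parts else "") + "\n"
        else:
            yield line  # sequence, separator and qualities are untouched
```

Applied to R2.fq above, the header becomes `@SRR388248.10`, matching what bowtie already reports for R1.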

Won't read FASTQ file from STDIN in v1.2.0

Hi,
Since the update to v1.2.0 I cannot complete alignment runs that worked with bowtie v1.1.2 and that read their input from STDIN.

e.g.
Works with Bowtie 1.1.2:
cat reads.left.fastq | bowtie --phred33-quals -t -p 24 -n 3 -l 20 -e 300 --best --sam --chunkmbs 128 --minins 101 --maxins 400 default_db -1 - -2 reads.right.fastq > test

But does not work with Bowtie 1.2.0:
Time loading reference: 00:00:00
Time loading forward index: 00:00:04
Time loading mirror index: 00:00:02
Error: reads file does not look like a FASTQ file
Error: reads file does not look like a FASTQ file
Error: reads file does not look like a FASTQ file
...

But of course this works in both versions:
bowtie --phred33-quals -t -p 24 -n 3 -l 20 -e 300 --best --sam --chunkmbs 128 --minins 101 --maxins 400 default_db -1 reads.left.fastq -2 reads.right.fastq > test

bowtie 1.2.1.1 --strata reports all valid alignments instead of best stratum

Sorry for the brief example, but the way --strata operates seems to have changed between version 1.2 and version 1.2.1.1:

First, expected behaviour, of version 1.2:

~/src/bowtie-1.2-legacy/bowtie -f -v 1 -a --suppress 2,5,6 --best nHp.2.0 seq1.fa 
seq1	nHp.2.0.scaf00033	485587	0	
seq1	nHp.2.0.scaf04212	22562	0	8:G>T
seq1	nHp.2.0.scaf00720	38269	0	14:T>A
~/src/bowtie-1.2-legacy/bowtie -f -v 1 -a --suppress 2,5,6 --best --strata nHp.2.0 seq1.fa 
seq1	nHp.2.0.scaf00033	485587	0	

(i.e. --strata only reports the first hit, with 0 mismatches)

Now, unexpected behaviour, of version 1.2.1.1:

~/src/bowtie-1.2.1.1/bowtie -f -v 1 -a --suppress 2,5,6 --best nHp.2.0 seq1.fa 
seq1	nHp.2.0.scaf00033	485587	0	
seq1	nHp.2.0.scaf04212	22562	0	8:G>T
seq1	nHp.2.0.scaf00720	38269	0	14:T>A
~/src/bowtie-1.2.1.1/bowtie -f -v 1 -a --suppress 2,5,6 --best --strata nHp.2.0 seq1.fa 
seq1	nHp.2.0.scaf00033	485587	2	
seq1	nHp.2.0.scaf04212	22562	2	8:G>T
seq1	nHp.2.0.scaf00720	38269	2	14:T>A

(i.e. even when --strata is used, all valid alignments are still reported, and they are being treated as the same stratum even though one has a perfect hit and the other two have one mismatch)

Many thanks,

Cei
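
As a stopgap on 1.2.1.1, the best stratum can be recovered from the `-a --best` output by keeping, for each read, only the alignments with the fewest mismatches (in -v mode the stratum is the mismatch count). The sketch assumes the five tab-separated columns shown above, with the read name first and the mismatch descriptors in the fifth column (empty for a perfect hit). `mismatch_count` and `best_stratum` are hypothetical helpers.

```python
from collections import defaultdict


def mismatch_count(line, mm_col=4):
    """Count comma-separated mismatch descriptors in column mm_col (0-based)."""
    fields = line.rstrip("\n").split("\t")
    descriptors = fields[mm_col] if len(fields) > mm_col else ""
    return len(descriptors.split(",")) if descriptors else 0


def best_stratum(lines, mm_col=4):
    """Keep, per read name, only alignments with the minimum mismatch count."""
    by_read = defaultdict(list)
    for line in lines:
        by_read[line.split("\t", 1)[0]].append(line)
    kept = []
    for alignments in by_read.values():
        best = min(mismatch_count(a, mm_col) for a in alignments)
        kept.extend(a for a in alignments if mismatch_count(a, mm_col) == best)
    return kept
```

If a different `--suppress` list is used, `mm_col` has to be adjusted to wherever the mismatch column ends up.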

simple_tests.pl fails on Mac OS Yosemite

❯❯❯ perl /usr/local/Cellar/bowtie/1.1.1/libexec/scripts/test/simple_tests.pl 2>&1 |tee simple_tests.log
/usr/local/Cellar/bowtie/1.1.1/bin/bowtie-build --large-index  --quiet  .simple_tests.pl.fa .simple_tests.tmp
/usr/local/Cellar/bowtie/1.1.1/bin/bowtie --large-index  -v 0 -c --quiet -a .simple_tests.tmp -1 AACGAAAG -2 CCATCTA
Died at /usr/local/Cellar/bowtie/1.1.1/libexec/scripts/test/simple_tests.pl line 255, <BT> line 1.

If I compile bowtie on Mavericks and copy the resulting executables to the Yosemite machine, it works.
See https://github.com/Homebrew/homebrew-science/pull/1627#issuecomment-68784032

Alignment results vary depending on read names?

I am finding that varying the read name of a .fasta entry causes differences in the output number of multi-mapping sites reported by bowtie.

Here's an example input:

>p
TCTTGGATACGATATATAAGAAAAATG
>p1
TCTTGGATACGATATATAAGAAAAATG
>p2
TCTTGGATACGATATATAAGAAAAATG
>p12
TCTTGGATACGATATATAAGAAAAATG

I aligned to an index built with bowtie-build version 1.1.0 and the alignment was tested with bowtie 1.1.0 and 1.2.1.1.

The sequence for the genome we used is here.

Here is the output of the number of reported reads for each name:

$ bowtie --verbose -k 10 -f -S -p 10 $INDEXPATH/Ghirsutum_458_v1.0 tmp.fa |grep -v "@" |cut -f 1 |sort |uniq -c
INFO: Command: /usr/bin/bowtie/bowtie-align-s --wrapper basic-0 -k 10 -f -S -p 10 $INDEXPATH/Ghirsutum_458_v1.0 tmp.fa
# reads processed: 4
# reads with at least one reported alignment: 4 (100.00%)
# reads that failed to align: 0 (0.00%)
Reported 21 alignments to 1 output stream(s)
      3 p
      9 p1
      6 p12
      3 p2

Is this the expected behavior?

Create Per-Sample Read Mapping Summary Files

It'd be convenient if the number of reads processed and mapped was automatically recorded on disk, like STAR does by creating a sampleLog.final.out file for each sample. Calculating such statistics with samtools commands is time-consuming if the -a reporting mode is used for mapping to a transcriptome assembly, for example.
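
In the meantime, the summary bowtie prints at the end of a run can be captured and stored per sample. The sketch parses the `# reads ...` lines in the form shown in other reports on this page. It assumes the summary has been redirected to a text file (it normally goes to stderr), and `parse_summary` is an illustrative name.

```python
import re

SUMMARY_LINE = re.compile(r"^# (reads [^:]+): (\d+)")


def parse_summary(text):
    """Map each '# reads ...: N' summary line to its integer count."""
    stats = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    match = re.search(r"^Reported (\d+) alignments", text, re.MULTILINE)
    if match:
        stats["alignments reported"] = int(match.group(1))
    return stats
```

The resulting dictionary can be written next to each sample's output, for example with `json.dump`, which avoids a second pass over the alignments with samtools.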

wrong report of reads processed when "-k" is used in Bowtie 1.2.1.1

It looks like Bowtie 1.2.1.1 (the pre-built binaries for x64 Linux) wrongly reports the number of reads processed when the command-line parameter -k is used.

Here is how it can be reproduced:

wget ftp://ftp.ensembl.org/pub/release-81/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz

wget https://sourceforge.net/projects/fusioncatcher/files/test/reads_1.fq.gz

gzip -d *.gz

mkdir transcriptome

/tools/bowtie-1.2/bowtie-build \
-f \
--quiet \
--offrate 1 \
--ftabchars 7 \
Homo_sapiens.GRCh38.cdna.all.fa \
transcriptome/

When running bowtie 1.2 as follows:

/tools/bowtie-1.2/bowtie \
-k 10 \
transcriptome/ \
reads_1.fq  \
b12.map

one has:

# reads processed: 12546
# reads with at least one reported alignment: 11906 (94.90%)
# reads that failed to align: 640 (5.10%)
Reported 77282 alignments to 1 output stream(s)

When running bowtie 1.2.1.1 as follows:

/tools/bowtie-1.2.1.1/bowtie \
-k 10 \
transcriptome/ \
reads_1.fq  \
b1211.map

one has:

# reads processed: 77922
# reads with at least one reported alignment: 77282 (99.18%)
# reads that failed to align: 640 (0.82%)
Reported 77282 alignments to 1 output stream(s)

The input FASTQ file reads_1.fq has 12546 reads and not 77922 as Bowtie 1.2.1.1 reports!

When bowtie 1.2.1.1 is run without -k, it reports the number of reads processed correctly:

/tools/bowtie-1.2.1.1/bowtie \
transcriptome/ \
reads_1.fq  \
b1211_.map

one has:

# reads processed: 12546
# reads with at least one reported alignment: 11906 (94.90%)
# reads that failed to align: 640 (5.10%)
Reported 11906 alignments to 1 output stream(s)

This bug might be related to #54.
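
An independent count makes the discrepancy easy to confirm. The sketch counts FASTQ records, assuming unwrapped four-line records, and distinct read names in the default MAP output, whose first column is the read name. Both helper names are made up for illustration.

```python
def count_fastq_reads(lines):
    """Count records in a FASTQ stream, assuming four lines per record."""
    return sum(1 for _ in lines) // 4


def count_aligned_reads(map_lines):
    """Count distinct read names (first column) in bowtie's default output."""
    return len({line.split("\t", 1)[0] for line in map_lines if line.strip()})
```

Comparing `count_fastq_reads(open("reads_1.fq"))` with the "reads processed" figure, and `count_aligned_reads(open("b1211.map"))` with the "at least one reported alignment" figure, shows which of the reported numbers are counting alignments rather than reads.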

Error for colorspace data

Hi,

After version 1.2.0, an error occurs when aligning colorspace data generated by SOLiD.
When using version 1.1.2, there is no problem.
Could you solve this problem?

$ bowtie-1.2.1.1/bowtie -C hg19-cs sample.fastq -n2 -k1 > /dev/null
Reads file contained a pattern with more than 1024 quality values.
Please truncate reads and quality values and and re-run Bowtie
terminate called after throwing an instance of 'int'

bowtie will not stop when multithreading is enabled

Bowtie version

$ bowtie/bowtie-1.2/bowtie --version
bowtie-align version 1.2
64-bit
Built on mint
Tue Dec 27 17:03:06 UTC 2016
Compiler: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
Options: -O3 -m64  -Wl,--hash-style=both -DWITH_TBB -DPOPCNT_CAPABILITY -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
ldd bowtie/bowtie-1.2/bowtie-align-l
        linux-vdso.so.1 =>  (0x00007ffe1a54b000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f726f110000)
        libtbb.so.2 => /opt/intel/composer_xe_2015.3.187/tbb/lib/intel64/gcc4.4/libtbb.so.2 (0x00007f726eebe000)
        libtbbmalloc_proxy.so.2 => /opt/intel/composer_xe_2015.3.187/tbb/lib/intel64/gcc4.4/libtbbmalloc_proxy.so.2 (0x00007f726ecbb000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f726e9b2000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f726e6af000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f726e499000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f726e0d8000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f726f340000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f726ded3000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f726dccb000)
        libtbbmalloc.so.2 => /opt/intel/composer_xe_2015.3.187/tbb/lib/intel64/gcc4.4/libtbbmalloc.so.2 (0x00007f726da77000)

Input files

$cat csfasta/test.csfasta
>1_31_4883_F3
T101233113312.2000.0210.0233.3320.1332.2002.2022.21
>1_35_1288_F3
T1302121212330131211331.0002.0330.3020.3310.2022.22
>1_35_3461_F3
T1200022003313030133213.2120.0020.0000.1123.0033.32
>1_41_288_F3
T31130302013113322033110033112302133301022321121322
>1_43_3903_F3
T03132220100301000300331322100111213231200302130313

$ cat index/random.fna
>random
ttgagggagagtagcactgtcgcaagtatgactggggcctcagggaaaatttagagagcg
tataccgagtcacgaggccacctccgttctttgccgcttcgtctcgaaccaggtttgttc
cgagacctatcgatagtagcgcctacactcgggaaatgtaatcgatatggactaatagcc
cacgttaacactaagttgtcctattccgcccttgggtcctccccttgagtccgatgtcca
actcaagccacctgataaacgaaacgggcatttccgtcacaatgacaagtcatcgggctc
cctttgatattcacaaccctttctgatgactggcttattggaaggacatcgaaactcaac
acaggcacgattaagaaaaattctacgtactgagactattacagtcccctaccggtagtc
ctggccagcgctcagagtgaagaccgttttcacctagatcaatttaggttactccgttca
tgtgtatgcttgaagtcaacactctgcctccacgcgaaggggtatcgagattgttacagc
atacaggaacaactagcactaacgcacatgtcgtcgacgagccagtagttagtatacccg
acgctaaattcgcgtgtgaacttgggtcctagttgctaaaaggcgcgggagcgcgcatgg
cgaatgggtaggtactcacacagtgtgtaggccgaacattgattgctgaatacttaacac
cgatctactcagtcaagccatttctccacatggtgctcggtgagcgttcttgatgccagg
gcccttactagcaagtggtctgctccgagaaggcttggcactttcaccgtttctgggcga
ccagcacgggcaccgtcgagctttaataataaaccagttagcctaaatgagggtcgacga
gacagcgtcaagtacgtggttcaagccttaccagtccacccgtcgacgggtgtctattat
ttttcaggtgatacatagtcgatattttcgtttcgaccgg

Commands run

$bowtie/bowtie-1.2/bowtie-build -C index/random.fna  index/random.fna
...
Total time for backward call to driver() for mirror index: 00:00:00
$ echo $?
0

Without multithreading (-p 1)

$ bowtie/bowtie-1.2/bowtie --verbose  -f -C -S -v 3 -p 1 -3 15  index/random.fna csfasta/test.csfasta
INFO: Command: bowtie/bowtie-1.2/bowtie-align-s --wrapper basic-0 -f -C -S -v 3 -p 1 -3 15 index/random.fna csfasta/test.csfasta
@HD     VN:1.0  SO:unsorted
@SQ     SN:random       LN:1000
@PG     ID:Bowtie       VN:1.2  CL:"bowtie-align --wrapper basic-0 -f -C -S -v 3 -p 1 -3 15 index/random.fna csfasta/test.csfasta"
1_31_4883_F3    4       *       0       0       *       *       0       0       ACGTTCCTTCGNGAAANAGCANAGTTNTTGANCT      IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII      XM:i:0
1_35_1288_F3    4       *       0       0       *       *       0       0       TAGCGCGCGTTACTCGCCTTCNAAAGNATTANTA      IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII      XM:i:0
1_35_3461_F3    4       *       0       0       *       *       0       0       GAAAGGAATTCTATACTTGCTNGCGANAAGANAA      IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII      XM:i:0
1_41_288_F3     4       *       0       0       *       *       0       0       CCTATAGACTCCTTGGATTCCAATTCCGTAGCTT      IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII      XM:i:0
1_43_3903_F3    4       *       0       0       *       *       0       0       TCTGGGACAATACAAATAATTCTGGCAACCCGCT      IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII      XM:i:0
# reads processed: 5
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 5 (100.00%)
No alignments

With multithreading (-p 2)

$ bowtie/bowtie-1.2/bowtie --verbose  -f -C -S -v 3 -p 2 -3 15  index/random.fna csfasta/test.csfasta
INFO: Command: bowtie/bowtie-1.2/bowtie-align-s --wrapper basic-0 -f -C -S -v 3 -p 2 -3 15 index/random.fna csfasta/test.csfasta

bowtie never stops and uses 100% of 2 CPUs

stack trace with multithreading

Thread 8 (Thread 0x7fec6fd2c700 (LWP 5008)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298e2ac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298e2ac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7fec6f92b700 (LWP 5015)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298ebac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298ebac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7fec6f52a700 (LWP 5016)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298ddac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298ddac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7fec6f129700 (LWP 5019)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298deac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298deac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7fec6e927700 (LWP 5017)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298dfac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298dfac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7fec6ed28700 (LWP 5018)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298e1ac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298e1ac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7fec6e526700 (LWP 5020)):
#0  0x00007fec832aebf9 in syscall () from /lib64/libc.so.6
#1  0x00007fec83fbeb04 in futex_wait (futex=<optimized out>, comparand=<optimized out>, $O9=<optimized out>, $P0=<optimized out>) at ../../include/tbb/machine/linux_common.h:60
#2  P (this=<optimized out>, $N3=<optimized out>) at ../../src/tbb/semaphore.h:206
#3  commit_wait (this=<optimized out>, c=..., $N1=<optimized out>, $N2=...) at ../../src/rml/include/../server/thread_monitor.h:259
#4  tbb::internal::rml::private_worker::run (this=0x7fec8298e0ac) at ../../src/tbb/private_server.cpp:282
#5  0x00007fec83fbe8d6 in tbb::internal::rml::private_worker::thread_routine (arg=0x7fec8298e0ac) at ../../src/tbb/private_server.cpp:228
#6  0x00007fec841fadc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fec832b473d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fec84613740 (LWP 4973)):
#0  0x00007fec832992e7 in sched_yield () from /lib64/libc.so.6
#1  0x00007fec83fca6d2 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::receive_or_steal_task (this=0x7fec82992c80, completion_ref_count=@0x7fec82992f00: 140653781188224) at ../../src/tbb/custom_scheduler.h:284
#2  0x00007fec83fc9f16 in local_wait_for_all (this=0x0, parent=..., child=<optimized out>, $▒1=<optimized out>, $▒2=..., $▒3=<optimized out>) at ../../src/tbb/custom_scheduler.h:592
#3  tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::wait_for_all (this=0x7fec82992c80, parent=..., child=0x0) at ../../src/tbb/custom_scheduler.h:81
#4  0x0000000000415081 in void twoOrThreeMismatchSearchFull<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >(PatternComposer&, HitSink&, Ebwt<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >&, Ebwt<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >&, std::vector<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna5>, seqan::Alloc<void> >, std::allocator<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna5>, seqan::Alloc<void> > > >&, bool) ()
#5  0x000000000041e978 in void driver<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >(char const*, std::string const&, std::string const&, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::string, std::allocator<std::string> > const&, std::string const&) [clone .isra.1133] [clone .constprop.1195] ()
#6  0x0000000000421ec7 in bowtie ()
#7  0x00000000004065da in main ()

Have you ever encountered such an issue?
Thanks for your help.

Examples 6 and 9

Please compare bowtie 1.2.1.1 results with the results in the manual:

https://github.com/BenLangmead/bowtie/blob/master/MANUAL

Example 6:
bowtie-1.2.1.1$ ./bowtie -a --best --strata -v 2 --suppress 1,5,6,7 e_coli -c ATGCATCATGCGCCAT
-	gi|110640213|ref|NC_008253.1|	2852852	8:T>A
-	gi|110640213|ref|NC_008253.1|	148810	10:A>G,13:C>G
+	gi|110640213|ref|NC_008253.1|	1093035	2:T>G,15:A>T
-	gi|110640213|ref|NC_008253.1|	905664	6:A>G,7:G>T
-	gi|110640213|ref|NC_008253.1|	4930433	4:G>T,6:C>G
# reads processed: 5
# reads with at least one reported alignment: 5 (100.00%)
# reads that failed to align: 0 (0.00%)
Reported 5 alignments to 1 output stream(s)
Example 9:
bowtie-1.2.1.1$ ./bowtie -a -m 3 --best --strata -v 2 e_coli --suppress 1,5,6,7 -c ATGCATCATGCGCCAT
# reads processed: 1
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 0 (0.00%)
# reads with alignments suppressed due to -m: 1 (100.00%)
No alignments

Eliminate -f for bowtie-build

By default, it expects FASTA files, so it's unclear why there is a -f flag in the Options section. Also, for bowtie, it expects FASTQ files by default, but it has a -q flag in its Options section. These two options seem redundant.

bowtie map files in binary format

Hello, when I use Bowtie installed by homebrew on OSX El Capitan, the map file is all zeros; however, if I use Bowtie installed from source, the map file is written correctly. Attached is a sample output map file.
testmap.txt

corrupted MAP output is sometimes generated by bowtie 1.2.1

Occasionally, and seemingly at random, Bowtie 1.2.1 generates a corrupted MAP file (the default Bowtie output) in which, for example, the read id is in the second column instead of the first.

If Bowtie 1.2.1 is re-run with the same command-line parameters and inputs, it runs fine and produces a good MAP file. Neither run shows an error or warning.

This behavior has not been noticed with Bowtie 1.X or 1.2.
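
Since the corruption is silent, a sanity check on the MAP file after each run can at least detect it. The sketch assumes the default column order (read name, strand, reference name, 0-based offset, sequence, qualities, ...), no `--suppress`, and non-colorspace reads; `malformed_map_lines` is an illustrative name.

```python
def malformed_map_lines(lines):
    """Return 1-based numbers of lines that do not look like default bowtie output."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        ok = (
            len(fields) >= 7
            and fields[1] in ("+", "-")           # column 2: strand
            and fields[3].isdigit()               # column 4: 0-based offset
            and len(fields[4]) == len(fields[5])  # sequence vs. qualities
        )
        if not ok:
            bad.append(number)
    return bad
```

A line whose read id has slipped into the second column fails the strand check, so a non-empty result is a signal to re-run the sample.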

Interleaved fastq support

BWA has a -p parameter that accepts a single interleaved FASTQ file (both mates of each pair in one file).

e.g. bwa mem -M -t 16 -p ref.fa read.fq > aln.sam

How can we use the interleaved fastq with bowtie?
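
I can't say whether the bowtie version at hand has an equivalent option. One workaround that does not depend on it is to split the interleaved file into two mate files and pass them to -1 and -2. A sketch, assuming four-line records that strictly alternate mate 1, mate 2; `deinterleave` is a hypothetical helper.

```python
def deinterleave(lines, out1, out2):
    """Write alternating 4-line FASTQ records to out1 and out2; return pair count."""
    count = 0
    for i, line in enumerate(lines):
        # records alternate: lines 0-3 go to mate 1, lines 4-7 to mate 2, ...
        (out1 if (i // 4) % 2 == 0 else out2).write(line)
        count = i + 1
    if count % 8:
        raise ValueError("input does not contain a whole number of read pairs")
    return count // 8
```

The function streams line by line, so it can be fed from an open file and write to two open output files without holding the reads in memory.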
