jellyfish's Introduction

Jellyfish

Overview

Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish can count k-mers using an order of magnitude less memory and an order of magnitude faster than other k-mer counting packages by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.
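To make the definition concrete, here is a minimal Python sketch of k-mer counting. It only illustrates what is being counted: it uses an ordinary dictionary, not Jellyfish's compact lock-free hash table, and the function name `count_kmers` is made up for this illustration.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of `sequence` (illustrative, not part of Jellyfish)."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    # A sequence of length n contains n - k + 1 overlapping k-mers.
    for start in range(len(sequence) - k + 1):
        counts[sequence[start:start + k]] += 1
    return counts


if __name__ == "__main__":
    for kmer, count in sorted(count_kmers("ACGTACGT", 4).items()):
        print(kmer, count)
```

A dictionary like this stores every k-mer as a full string, which is the kind of memory overhead the encoding described above is meant to avoid.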

Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It writes its k-mer counts in a binary format, which can be translated into a human-readable text format with the "jellyfish dump" command, or queried for specific k-mers with "jellyfish query". See the documentation for details.

If you use Jellyfish in your research, please cite:

Guillaume Marcais and Carl Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 (first published online January 7, 2011) doi:10.1093/bioinformatics/btr011

Installation

Linux Binaries

On Debian and Ubuntu with apt:

sudo apt update
sudo apt install jellyfish

On Arch, it is available from AUR.

FreeBSD

Jellyfish can be installed on FreeBSD via the FreeBSD ports system.

To install via the binary package, simply run:

pkg install Jellyfish

To install from source:

cd /usr/ports/biology/jellyfish
make install

Windows

With Cygwin, Jellyfish can be compiled from source as explained below. A simpler way on Windows 10 is to first install WSL and then, from the Windows Store, install a Linux distribution that packages Jellyfish (e.g., Ubuntu). Finally, install Jellyfish with:

sudo apt update
sudo apt install jellyfish

From source

To get a packaged tarball of the source code that is easier to compile, download a release from the GitHub releases page. You need make and g++ version 4.4 or higher. To install in your home directory, run:

./configure --prefix=$HOME
make -j 4
make install

To compile from the git tree, you will also need autoconf, automake, libtool, gettext, pkg-config and yaggo. Then compile and install (in /usr/local in this example) with:

autoreconf -i
./configure
make -j 4
sudo make install

If the software is installed in system directories (hint: you needed sudo to install), as in the example above, then the system library cache must be updated:

sudo ldconfig

Usage

Instructions for use are available in the doc directory.

Extra / Examples

The examples directory contains potentially useful extra programs to query or manipulate Jellyfish output files, using the Jellyfish shared library from C++ or from scripting languages. The examples are not compiled by default. Each subdirectory of examples is independent and is compiled with a simple invocation of 'make'.

Binding to script languages

Bindings to Ruby, Python and Perl are provided. These bindings allow reading the output file of Jellyfish directly from a scripting language. Compilation of the bindings is easier from the release tarball. The development files of the target scripting language are required.

Compilation of the bindings from the git tree requires SWIG version 3 and adding the switch --enable-swig to the configure command lines shown below.

To compile all three bindings, configure and compile with:

./configure --enable-ruby-binding --enable-python-binding --enable-perl-binding
make -j 4
sudo make install

By default, Jellyfish is installed in /usr/local and the bindings are installed in the proper system location. When the --prefix switch is passed, the bindings are installed in the given directory. For example:

./configure --prefix=$HOME --enable-python-binding
make -j 4
make install

This will install the python binding in $HOME/lib/python2.7/site-packages (adjust based on your Python version).

Then, for Python, Ruby or Perl to find the binding, an environment variable may need to be adjusted (PYTHONPATH, RUBYLIB and PERL5LIB respectively). For example:

export PYTHONPATH=$HOME/lib/python2.7/site-packages
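Since the directory name depends on the Python version, the path can also be computed by the interpreter that will load the binding. The following is a small sketch, assuming the lib/pythonX.Y/site-packages layout shown above (some systems use lib64 or dist-packages instead); `binding_site_packages` is a hypothetical helper name.

```python
import os
import sys


def binding_site_packages(prefix):
    """Return the site-packages directory under `prefix` for the running interpreter.

    Assumes the lib/pythonX.Y/site-packages layout; adjust if your system differs.
    """
    version = "python{}.{}".format(sys.version_info.major, sys.version_info.minor)
    return os.path.join(prefix, "lib", version, "site-packages")


if __name__ == "__main__":
    # Print a line suitable for pasting into a shell startup file.
    print("export PYTHONPATH=" + binding_site_packages(os.path.expanduser("~")))
```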

See the swig directory for examples on how to use the bindings.

jellyfish's People

Contributors

cerebis, emollier, gmarcais, heuermh, lamby, martin-steghoefer, pkubaj, rafaeldelucena, sebastien-lemieux, tillea, wwood

jellyfish's Issues

MerDNA has nondeterministic behavior (Python binding)

Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import jellyfish
>>> mer, count = jellyfish.ReadMerFile('1.jf').next() #deal me a mer, one off the top
>>> print count
1
>>> mer, count
(<jellyfish.jellyfish.MerDNA; proxy of <Swig Object of type 'MerDNA *' at 0x7fed0dded090> >, 1L)
>>> print mer
AAAAAAAAACTTTTGTCAATCGGTTCCCTTAGA
>>> print mer
TAAAAAAAAAAAAAAAAAAACAAAGCATGTTAA
>>> mer == mer
True
>>> str(mer)
'AAAAAAAAAAAAAAAAAAAACCGTTCTGCTCAA'
>>> print mer
TAAAAAAAAAAAAAAAAAAACCGTTCGAAGGAA

What's going on here?

[2.2.3] error during make check: ‘struct std::basic_stringbuf<_CharT, _Traits, _Alloc>::__xfer_bufptrs’ redeclared with different access

$ make check
make  check-recursive
make[1]: Entering directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
Making check in .
make[2]: Entering directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
make  libgtest.la libgtest_main.la bin/generate_sequence bin/test_all
make[3]: Entering directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
  CXXLD  libgtest.la
  CXXLD  libgtest_main.la
  CXXLD  bin/generate_sequence
  CXX    unit_tests/bin_test_all-test_main.o
In file included from ./unit_tests/gtest/gtest.h:308:0,
                 from unit_tests/test_main.cc:20:
/usr/include/c++/5.1.0/sstream:335:7: error: ‘struct std::basic_stringbuf<_CharT, _Traits, _Alloc>::__xfer_bufptrs’ redeclared with different access
       struct __xfer_bufptrs
       ^
Makefile:1264: recipe for target 'unit_tests/bin_test_all-test_main.o' failed
make[3]: *** [unit_tests/bin_test_all-test_main.o] Error 1
make[3]: Leaving directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
Makefile:2234: recipe for target 'check-am' failed
make[2]: *** [check-am] Error 2
make[2]: Leaving directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
Makefile:1774: recipe for target 'check-recursive' failed
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory '/tmp/makepkg/jellyfish/src/jellyfish-2.2.3'
Makefile:2237: recipe for target 'check' failed
make: *** [check] Error 2

Prebuilt Windows Version

I would really like to have a Windows version of this software. My development environment is Windows (no gcc or cygwin) and the Linux machines I use I do not have sudo privileges on. Any chance we could get the occasional Windows .exe compilation to make things easier for those of us who do do work on Windows?

dependencies too deep

Dear Guillaume,

I'm looking forward to testing Jellyfish 2.0, but I'm finding it really complicated to get it running on my university's server because of the high number of dependencies, mostly Ruby, that your project now has. In most university environments, we researchers do not have administrative privileges.

Is it possible to reduce the number of dependencies so that compilation is easier and the program as a whole is more stable?

Sorry if this request seems unreasonable.

Regards

query command functionality

Dear Guillaume,

Would you consider the following functionality in the query command? I would like to get the k-mers present in each sequence of a multi-line FASTA or FASTQ file.

Would it be possible to modify the current jellyfish output so that each sequence is separated by the read_id, or to add a first column with the read id?

OR

an even simpler display:
read_id count_of_kmers_present_in_jellyfish_db

Thanks for humoring me. I have a Python wrapper that does this by querying each sequence individually, but the speed is not as good as I would like.
Keith
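The simpler display requested above can be sketched in pure Python as follows. This is only an illustration of the desired output: the set `known_kmers` stands in for a Jellyfish database, and the function name is hypothetical, not part of the jellyfish query command or the bindings.

```python
def kmers_present_per_read(reads, known_kmers, k):
    """Yield (read_id, n), where n is how many k-mers of the read occur in `known_kmers`.

    `reads` is an iterable of (read_id, sequence) pairs; `known_kmers` is any
    container supporting `in` (here a set standing in for a Jellyfish database).
    """
    for read_id, sequence in reads:
        present = sum(
            1
            for start in range(len(sequence) - k + 1)
            if sequence[start:start + k] in known_kmers
        )
        yield read_id, present


if __name__ == "__main__":
    database = {"ACG", "CGT"}  # stand-in for the k-mers stored in a .jf file
    reads = [("read1", "ACGT"), ("read2", "TTTT")]
    for read_id, present in kmers_present_per_read(reads, database, 3):
        print(read_id, present)
```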

Feature/Option Ideas

I've come across a few use cases for Jellyfish that were not practical with the current executable, so I wrote the functionality as a separate program, using Jellyfish as a library. I'd be happy to integrate them into master if you feel they belong. They are as follows:

  1. Count k-mers found only in a .jf file. Currently this is an option, but if you want to do this for many different samples, you waste overhead of re-counting the input file. Iterating the binary dump is much quicker to prime the hash.
  2. Use the hash matrix from another .jf file. This is advantageous when you'd like the output to be in the same sorted order.
  3. Limit the count stored in the hash. Specifically, my use case has been to see which k-mers appear in a sample without caring about their counts. This would allow a single value bit to be used and prevent very large counts from using many entries. The "generalized" case is to disallow overflow of the counter, but the (easier) way I'm doing it now (specific to a single bit) is to first check if the k-mer value is 1, and add 1 if it is not.
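The "generalized" case in item 3, a counter that stops at a maximum instead of overflowing, could look like the sketch below. It uses a plain dictionary rather than the Jellyfish hash, and `add_capped` and `max_count` are names invented for this illustration.

```python
def add_capped(counts, kmer, max_count=1):
    """Increment counts[kmer] unless it has already reached `max_count`.

    With max_count=1 this records presence only, matching the single-bit case.
    """
    current = counts.get(kmer, 0)
    if current < max_count:
        counts[kmer] = current + 1
    return counts.get(kmer, 0)


if __name__ == "__main__":
    seen = {}
    for kmer in ["ACGT", "ACGT", "CGTA", "ACGT"]:
        add_capped(seen, kmer)
    print(seen)
```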

error in old rhel5

Hi, I tried to install the latest jellyfish by downloading the release from git.
However, I could not find any executable: configure that I can use to invoke the ./configure command.

I tried autoreconf after commenting the binding part and several others.

Then ./configure produced the following:
...
checking for _NSGetExecutablePath... no
checking execinfo.h ext/stdio_filebuf.h usability... no
checking execinfo.h ext/stdio_filebuf.h presence... no
checking for execinfo.h ext/stdio_filebuf.h... no
configure: error: cannot find required header execinfo.h ext/stdio_filebuf.h

What is this? Where can I find execinfo.h ext/stdio_filebuf.h?

Trouble installing 2.1.2 on Red Hat 6.5 (Santiago)

I would appreciate your help with the following install problem.

This error occurs when trying to make jellyfish 2.1.2. gcc version 4.4.7.
Jellyfish 2.0 installed no problem.

thanks

make all-am
make[1]: Entering directory `/home/jellyfish-2.1.2'
CXX sub_commands/count_main.o
In file included from ./include/jellyfish/jellyfish.hpp:24,
from sub_commands/count_main.cc:40:
./include/jellyfish/binary_dumper.hpp: In member function ‘bool jellyfish::binary_query_base<Key, Val>::val_id(const Key&, Val*, uint64_t*) const’:
./include/jellyfish/binary_dumper.hpp:170: error: there are no arguments to ‘lrint’ that depend on a template parameter, so a declaration of ‘lrint’ must be available
./include/jellyfish/binary_dumper.hpp:170: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
sub_commands/count_main.cc: At global scope:
sub_commands/count_main.cc:352: fatal error: opening dependency file sub_commands/.deps/count_main.Tpo: Permission denied
compilation terminated.
make[1]: *** [sub_commands/count_main.o] Error 1

[request] Example of string to map of kmer and counts

Before I dive into the source, I wanted to see how hard it would be to use Jellyfish as a library to go from sequence to kmer and count.

My function interface would look something like:

#include <string>
#include <map>
std::map <std::string, unsigned int> count_kmers( std::string &sequence );

And then in the definition, perhaps something like:

#include ....
#include "ext/jellyfish/whole_sequence_parser.hpp"
#include "ext/jellyfish/jellyfish.hpp"
#include ...

using std::map;
using std::string;

map <string, unsigned int> count_kmers( string &sequence ) {
    map <string, unsigned int> counted_kmers;
    // open file
    // jellyfish code?
    return counted_kmers;
}

Right now I use system() but I would love to include it as a native library instead.

Btw, great work! I looked at a few other k-mer counters and came back to Jellyfish shortly after 👍

Edit: removed some unnecessary code.

error in building

I ran into this error while building jellyfish on my linux system:

build> make
YAGGO sub_commands/count_main_cmdline.hpp
YAGGO sub_commands/info_main_cmdline.hpp
YAGGO sub_commands/dump_main_cmdline.hpp
YAGGO sub_commands/histo_main_cmdline.hpp
YAGGO sub_commands/stats_main_cmdline.hpp
YAGGO sub_commands/merge_main_cmdline.hpp
YAGGO sub_commands/bc_main_cmdline.hpp
YAGGO sub_commands/query_main_cmdline.hpp
YAGGO sub_commands/cite_main_cmdline.hpp
YAGGO sub_commands/mem_main_cmdline.hpp
YAGGO jellyfish/generate_sequence_cmdline.hpp
YAGGO unit_tests/test_main_cmdline.hpp
make all-am
make[1]: Entering directory `/local0/sw/Jellyfish/build'
CXX lib/rectangular_binary_matrix.lo
CXX lib/mer_dna.lo
CXX lib/storage.lo
CXX lib/allocators_mmap.lo
CXX lib/misc.lo
../lib/misc.cc: In function ‘std::string jellyfish::quote_arg(const std::string&)’:
../lib/misc.cc:86: error: ‘all_of’ is not a member of ‘std’
make[1]: *** [lib/misc.lo] Error 1
make[1]: Leaving directory `/local0/sw/Jellyfish/build'
make: *** [all] Error 2

Please advise how I can resolve it.

Thank you.

Error in config due to swig

I'm trying to use Jellyfish on a system where I can't mess with the system, so I'm trying to do a local installation. Unfortunately, the following configure call fails:

$ ./configure --prefix=/path/local --enable-python-binding=/path/local/lib/python3.4
...
configure: creating ./config.status
config.status: creating Makefile
config.status: creating tests/compat.sh
config.status: creating jellyfish-2.0.pc
config.status: error: cannot find input file: `swig/Makefile.in'

Just for the fun of it, I tried --enable-swig:

checking for swig... /usr/bin/swig
checking SWIG version... 1.3.29
configure: WARNING: SWIG version >= 3.0.0 is required.  You have 1.3.29.
configure: error: SWIG version 3 is required

So, I modified configure to prevent it (faster and easier than trying to figure out which swig conditional was failing):

# Enable compilation of SWIG and bindings
# Added empty assignment to prevent making swig [~line16560]
maybe_swig=
MAYBE_SWIG=$maybe_swig

if test -n "$maybe_swig"; then :
  ac_config_files="$ac_config_files swig/Makefile"
fi

which works fine:

...
config.status: creating Makefile
config.status: creating tests/compat.sh
config.status: creating jellyfish-2.0.pc
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands

Thanks!

count in file command

Dear Guillaume,

I wish I had a better handle on the bug, but I noticed something wonky with the count_in_file command. I found it is related to the initial hash size used when calling jf count. I illustrate it below.

If I start with a small fasta file.
>test
AGTGAAGCCAATTGATTTTTTAGACCCC


I build 4 equivalent jf databases with the following commands. Notice -s is the only option that changes.
$ jellyfish count -C -s 10M -m 21 -o jf_test_10M.jf jf_test.fa
$ jellyfish count -C -s 20M -m 21 -o jf_test_20M.jf jf_test.fa
$ jellyfish count -C -s 30M -m 21 -o jf_test_30M.jf jf_test.fa
$ jellyfish count -C -s 300M -m 21 -o jf_test_300M.jf jf_test.fa

run count_in_file. I should get all kmers represented once [notice that some of the kmers are repeated].
./count_in_file/count_in_file jf_test_10M.jf jf_test_20M.jf jf_test_30M.jf jf_test_300M.jf
GCCAATTGATTTTTTAGACCC 1 1 1 0
CTAAAAAATCAATTGGCTTCA 1 0 0 0
GAAGCCAATTGATTTTTTAGA 1 0 0 0
AAAAAATCAATTGGCTTCACT 1 1 1 0
AGCCAATTGATTTTTTAGACC 1 1 1 0
CTAAAAAATCAATTGGCTTCA 0 1 1 0
AAGCCAATTGATTTTTTAGAC 1 0 0 0
GTGAAGCCAATTGATTTTTTA 1 0 0 0
CCAATTGATTTTTTAGACCCC 1 0 0 0
GAAGCCAATTGATTTTTTAGA 0 1 1 0
GTGAAGCCAATTGATTTTTTA 0 1 1 0
AAGCCAATTGATTTTTTAGAC 0 1 1 0
CCAATTGATTTTTTAGACCCC 0 1 1 1
GAAGCCAATTGATTTTTTAGA 0 0 0 1
GCCAATTGATTTTTTAGACCC 0 0 0 1
CTAAAAAATCAATTGGCTTCA 0 0 0 1
AGCCAATTGATTTTTTAGACC 0 0 0 1
AAGCCAATTGATTTTTTAGAC 0 0 0 1
AAAAAATCAATTGGCTTCACT 0 0 0 1
GTGAAGCCAATTGATTTTTTA 0 0 0 1

Thanks Keith
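For reference, the set of canonical 21-mers that this input should produce can be worked out independently of Jellyfish. The sketch below assumes that -C keeps, for each k-mer, the lexicographically smaller of the k-mer and its reverse complement; the helper names are illustrative only.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer_counts(seq, k):
    """Count k-mers, merging each k-mer with its reverse complement."""
    counts = Counter()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        # Canonical form: the smaller of the k-mer and its reverse complement.
        counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


if __name__ == "__main__":
    expected = canonical_kmer_counts("AGTGAAGCCAATTGATTTTTTAGACCCC", 21)
    for kmer in sorted(expected):
        print(kmer, expected[kmer])
```

Under that assumption the 28-base test sequence yields 8 distinct canonical 21-mers, each with count 1, so every database would be expected to report a 1 for each of them.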

int histo_main(int, char**): ‘out’ was not declared in this scope

sub_commands/histo_main.cc: In function ‘int histo_main(int, char**)’:
sub_commands/histo_main.cc:59:7: error: ‘out’ was not declared in this scope
sub_commands/histo_main.cc:82:7: error: ‘out’ was not declared in this scope
sub_commands/histo_main.cc:85:3: error: ‘out’ was not declared in this scope
make[1]: *** [sub_commands/histo_main.o] Error 1

Isn't this a matter of missing C++11 support?

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.7.3-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --with-system-zlib --enable-objc-gc --with-cloog --enable-cloog-backend=ppl --disable-cloog-version-check --disable-ppl-version-check --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1)

2.1.2 is not buildable

Hi,

I noticed that 2.1.2 release was pushed two hours ago, so I tried to build it. However, there is an autotools problem. Configuration scripts are missing. After an autoreconf run followed by configure there is still an issue:

$ make
make: *** No rule to make target `false', needed by `sub_commands/count_main_cmdline.hpp'. Stop.

Parser interprets empty read as end-of-file

Hi Guillaume,

@roryk noticed this issue in RapMap (COMBINE-lab/RapMap#19), and has a nice sample dataset to reproduce it. Basically, when a read is empty, but not the actual end of the file, the parser thinks it's done. Is there an easy way to change the parser logic to account for this situation?

--Rob
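The behaviour being asked for can be illustrated with a small FASTA reader in Python. This is not Jellyfish's parser (which is written in C++), only a sketch of the logic: end of input is detected when the line iterator is exhausted, so a record with an empty sequence is still reported. The name `parse_fasta` is hypothetical.

```python
def parse_fasta(lines):
    """Yield (header, sequence) for every record, including records with no sequence."""
    header = None
    chunks = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, even if it was empty.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    # End of input is signalled by the iterator running out, not by an empty read.
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    text = ">a\nACGT\n>b\n\n>c\nGG\n"
    for name, seq in parse_fasta(text.splitlines(True)):
        print(name, repr(seq))
```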

count: unrecognized option '--both-strands'

I built jellyfish 2.1.1. However, it looks like the 'count --both-strands' option is not available even though it's listed in the documentation and many external programs that use jellyfish depend on it.

jellyfish count --both-strands
count: unrecognized option '--both-strands'

int histo_main(int, char**): invalid use of ‘class histo_main_cmdline::error’

make all-am
make[1]: Entering directory `/home/aflit001/nobackup/progs/jelly/Jellyfish'
CXXLD libjellyfish-2.0.la
CXX sub_commands/histo_main.o
sub_commands/histo_main.cc: In function ‘int histo_main(int, char**)’:
sub_commands/histo_main.cc:57:10: error: invalid use of ‘class histo_main_cmdline::error’
make[1]: *** [sub_commands/histo_main.o] Error 1
make[1]: Leaving directory `/home/aflit001/nobackup/progs/jelly/Jellyfish'
make: *** [all] Error 2

jellyfish::cooperative::hash_counter destructor hangs in local scope

Hi Guillaume,

I'm encountering the following very strange issue. I'm using a hash_counter inside a function to maintain counts for some k-mers, and everything works great until the point where the function should return. For some reason, the function never returns, and I've deduced that it is waiting on the hash_counter destructor to finish (which it never does). The process goes into S state, and strace reveals that it is sitting in wait4(-1,. Do you have any idea what might be going on here? Interestingly, if I make a pointer to the hash_counter and simply don't delete it (so that the memory escapes the scope of the function and "dangles"), then the program runs to completion (and, presumably, the normal process cleanup reclaims the memory). Any thoughts on what might be happening here are greatly appreciated!

Example compile error

Hi, I'm trying to compile the examples, and I keep getting this error from the compiler:

In file included from query_per_sequence.cc:22:
include/jellyfish-2.2.0/jellyfish/stream_manager.hpp:98:12: error: calling a private constructor of class 'std::__1::unique_ptrstd::__1::basic_istream<char,
std::__1::default_deletestd::__1::basic_istream >'
return res;
^
Does anybody have any idea what I'm missing?
Thanks
Gonza.-

config error

jellyfish]$ autoreconf -i
gtest.mk:5: error: Libtool library used but 'LIBTOOL' is undefined
gtest.mk:5: The usual way to define 'LIBTOOL' is to add 'LT_INIT'
gtest.mk:5: to 'configure.ac' and run 'aclocal' and 'autoconf' again.
gtest.mk:5: If 'LT_INIT' is in 'configure.ac', make sure
gtest.mk:5: its definition is in aclocal's search path.
Makefile.am:198: 'gtest.mk' included from here
autoreconf: automake failed with exit status: 1

Unclear error message if yaggo is missing

I was compiling Jellyfish with autoreconf, ./configure and make, and yaggo was not installed. None of these commands indicated a missing dependency. make gave me this ambiguous error:

make: *** No rule to make target `false', needed by `sub_commands/count_main_cmdline.hpp'. Stop.

I went through configure's output and saw that yaggo was missing. Installing it solved the problem, but maybe configure could instead report a missing dependency?
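Until configure reports this itself, a pre-flight check for the tools needed to build from the git tree can be sketched in a few lines of Python. The executable names below are assumptions for a typical Linux system (libtool is probed through libtoolize, which autoreconf runs), and `missing_tools` is a hypothetical helper, not something shipped with Jellyfish.

```python
import shutil

# Assumed executable names for the tools required to build from the git tree.
REQUIRED_TOOLS = ["autoconf", "automake", "libtoolize", "gettext", "pkg-config", "yaggo"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing build tools: " + ", ".join(missing))
    else:
        print("all build tools found")
```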

configure issue from 2.2.3 tarball when building with bindings

Not my day for builds ... :-) I'm having trouble getting a fresh 2.2.3 tarball to configure when enabling Perl and Python bindings:

config.status: error: cannot find input file: `swig/Makefile.in'
milou2: /sw/apps/bioinfo/jellyfish/src/jellyfish-2.2.3 $ ls swig/
perl5  python  ruby

The long version:

milou2: /sw/apps/bioinfo/jellyfish/src $ wget https://github.com/gmarcais/Jellyfish/releases/download/v2.2.3/jellyfish-2.2.3.tar.gz
milou2: /sw/apps/bioinfo/jellyfish/src $ tar xzf jellyfish-2.2.3.tar.gz
milou2: /sw/apps/bioinfo/jellyfish/src $ cd jellyfish-2.2.3/
milou2: /sw/apps/bioinfo/jellyfish/src/jellyfish-2.2.3 $ ./configure --enable-perl-binding --enable-python-binding
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 3458764513820540925
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... no
checking if : is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for md5sum... md5sum
checking for yaggo... /sw/apps/bioinfo/jellyfish/src/yaggo/yaggo
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for __int128... yes
checking for std::numeric_limits<__int128>... no
checking for _NSGetExecutablePath... no
checking for execinfo.h... yes
checking for ext/stdio_filebuf.h... yes
checking for siginfo_t.si_int... yes
checking for python... /usr/bin/python
checking for a version of Python >= '2.1.0'... yes
checking for the distutils Python package... yes
checking for Python include path... -I/usr/include/python2.6
checking for Python library path... -L/usr/lib64 -lpython2.6
checking for Python site-packages path... /usr/lib/python2.6/site-packages
checking python extra libraries... -lpthread -ldl  -lutil -lm
checking python extra linking flags... -Xlinker -export-dynamic
checking consistency of all components of python development environment... yes
checking for perl... /usr/bin/perl
checking for Perl prefix... /usr
checking for Perl extension include path... /usr/lib64/perl5/CORE
checking for Perl extension target directory... /usr/local/lib64/perl5
checking for Perl extensions C preprocessor flags... -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
checking for Perl extensions linker flags... -shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
configure: creating ./config.status
config.status: creating Makefile
config.status: creating tests/compat.sh
config.status: creating jellyfish-2.0.pc
config.status: error: cannot find input file: `swig/Makefile.in'
milou2: /sw/apps/bioinfo/jellyfish/src/jellyfish-2.2.3 $ ls swig/
perl5  python  ruby

Why does the Jellyfish brew bottle not symlink?

Why does an installed brew bottle not symlink?

It seems the Jellyfish manual says I need GCC 4.8. As far as I know, I have GCC 4.2.1:

    dhcp17-207:minikraken_20141208 bernardo$ gcc -v
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/c++/4.2.1
    Apple LLVM version 7.0.2 (clang-700.1.81)
    Target: x86_64-apple-darwin15.4.0
    Thread model: posix

From Jellyfish manual:

To install on Mac OS X: Jellyfish 2.0 does not compile with Apple's
Xcode GCC 4.2. Instead, the easiest thing to do is to install GCC 4.8
using MacPorts (http://www.macports.org) using the following commands:

And the brew install of Jellyfish

    dhcp17-207:minikraken_20141208 bernardo$ brew install jellyfish-1.1
    ==> Installing jellyfish-1.1 from homebrew/science
    ==> Downloading http://www.cbcb.umd.edu/software/jellyfish/jellyfish-1.1.11.tar.gz
    Already downloaded: /Library/Caches/Homebrew/jellyfish-1.1-1.1.11.tar.gz
    ==> ./configure --prefix=/usr/local/Cellar/jellyfish-1.1/1.1.11
    ==> make
    ==> make install
    ==> Caveats
    This formula is keg-only, which means it was not symlinked into /usr/local.

    It conflicts with jellyfish.

    Generally there are no consequences of this for you. If you build your
    own software and it requires this formula, you'll need to add to your
    build variables:

        LDFLAGS:  -L/usr/local/opt/jellyfish-1.1/lib
        CPPFLAGS: -I/usr/local/opt/jellyfish-1.1/include

    ==> Summary
    🍺  /usr/local/Cellar/jellyfish-1.1/1.1.11: 62 files, 2.3M, built in 22 seconds
    dhcp17-207:minikraken_20141208 bernardo$ jellyfish
    -bash: /usr/local/bin/jellyfish: No such file or directory

An alternative is to install from source, but I would need to install GCC 4.8 and I don't know the possible consequences for my system.

trouble building the git tree

Hi, having trouble getting the git tree built and it doesn't seem to be something with the compiler (gcc 4.4.7, 4.8.3 both fail), perhaps something with yaggo? Or perhaps I am confused.

Short version:

milou2: /sw/apps/bioinfo/jellyfish/src $ git clone https://github.com/gmarcais/Jellyfish.git
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish $ mkdir build
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish $ cd build
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ yaggo -v
yaggo 1.5.6
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ ../configure
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ make -j 4
  YAGGO    sub_commands/count_main_cmdline.hpp
...
  CXX    sub_commands/info_main.o
In file included from ../sub_commands/count_main.cc:44:
./sub_commands/count_main_cmdline.hpp: In constructor ‘count_main_cmdline::count_main_cmdline()’:
./sub_commands/count_main_cmdline.hpp:353: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:354: error: expected primary-expression before ‘)’ token
...

Gcc 4.8.3 gives a more verbose error:

In file included from sub_commands/count_main.cc:44:0:
./sub_commands/count_main_cmdline.hpp: In constructor ‘count_main_cmdline::count_main_cmdline()’:
./sub_commands/count_main_cmdline.hpp:353:26: error: expected primary-expression before ‘)’ token
     mer_len_arg((uint32_t)), mer_len_given(false),
                          ^
./sub_commands/count_main_cmdline.hpp:354:23: error: expected primary-expression before ‘)’ token
     size_arg((uint64_t)), size_given(false),
                       ^
...

Same happens when I clone develop using git clone -b develop https://github.com/gmarcais/Jellyfish.git.

The long version for master, with gcc 4.4.7:

milou2: /sw/apps/bioinfo/jellyfish/src $ git clone https://github.com/gmarcais/Jellyfish.git
Initialized empty Git repository in /pica/sw/apps/bioinfo/jellyfish/src/Jellyfish/.git/
remote: Counting objects: 4662, done.
remote: Total 4662 (delta 0), reused 0 (delta 0), pack-reused 4662
Receiving objects: 100% (4662/4662), 3.26 MiB | 1.60 MiB/s, done.
Resolving deltas: 100% (3365/3365), done.
milou2: /sw/apps/bioinfo/jellyfish/src $ cd Jellyfish/
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish $ autoreconf -i
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:2: installing `./config.guess'
configure.ac:2: installing `./config.sub'
configure.ac:4: installing `./install-sh'
configure.ac:4: installing `./missing'
swig/Makefile.am: installing `./depcomp'
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish $ mkdir build
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish $ cd build
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ yaggo -v
yaggo 1.5.6
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ ../configure
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for style of include used by make... GNU
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 3458764513820540925
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for ar... ar
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether g++ accepts -g... (cached) yes
checking dependency style of g++... (cached) gcc3
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for md5sum... md5sum
checking for yaggo... /sw/apps/bioinfo/jellyfish/src/yaggo/yaggo
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for __int128... no
checking for std::numeric_limits<__int128>... no
checking for _NSGetExecutablePath... no
checking execinfo.h usability... yes
checking execinfo.h presence... yes
checking for execinfo.h... yes
checking ext/stdio_filebuf.h usability... yes
checking ext/stdio_filebuf.h presence... yes
checking for ext/stdio_filebuf.h... yes
checking for siginfo_t.si_int... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating tests/compat.sh
config.status: creating jellyfish-2.0.pc
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
milou2: /sw/apps/bioinfo/jellyfish/src/Jellyfish/build $ make -j 4
  YAGGO    sub_commands/count_main_cmdline.hpp
  YAGGO    sub_commands/info_main_cmdline.hpp
  YAGGO    sub_commands/dump_main_cmdline.hpp
  YAGGO    sub_commands/histo_main_cmdline.hpp
  YAGGO    sub_commands/stats_main_cmdline.hpp
  YAGGO    sub_commands/merge_main_cmdline.hpp
  YAGGO    sub_commands/bc_main_cmdline.hpp
  YAGGO    sub_commands/query_main_cmdline.hpp
  YAGGO    sub_commands/cite_main_cmdline.hpp
  YAGGO    jellyfish/generate_sequence_cmdline.hpp
  YAGGO    sub_commands/mem_main_cmdline.hpp
  YAGGO    unit_tests/test_main_cmdline.hpp
make  all-recursive
make[1]: Entering directory `/pica/sw/apps/bioinfo/jellyfish/src/Jellyfish/build'
Making all in .
make[2]: Entering directory `/pica/sw/apps/bioinfo/jellyfish/src/Jellyfish/build'
  CXX    lib/rectangular_binary_matrix.lo
  CXX    lib/mer_dna.lo
  CXX    lib/storage.lo
  CXX    lib/allocators_mmap.lo
  CXX    lib/misc.lo
  CXX    lib/int128.lo
  CXX    lib/thread_exec.lo
  CXX    lib/jsoncpp.lo
  CXX    lib/time.lo
  CXX    lib/generator_manager.lo
  CXX    sub_commands/jellyfish.o
  CXX    sub_commands/count_main.o
  CXX    sub_commands/info_main.o
In file included from ../sub_commands/count_main.cc:44:
./sub_commands/count_main_cmdline.hpp: In constructor ‘count_main_cmdline::count_main_cmdline()’:
./sub_commands/count_main_cmdline.hpp:353: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:354: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:365: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:374: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:375: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp: In constructor ‘count_main_cmdline::count_main_cmdline(int, char**)’:
./sub_commands/count_main_cmdline.hpp:382: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:383: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:394: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:403: error: expected primary-expression before ‘)’ token
./sub_commands/count_main_cmdline.hpp:404: error: expected primary-expression before ‘)’ token
make[2]: *** [sub_commands/count_main.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory `/pica/sw/apps/bioinfo/jellyfish/src/Jellyfish/build'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/pica/sw/apps/bioinfo/jellyfish/src/Jellyfish/build'
make: *** [all] Error 2

Trouble installing jellyfish from source on Mac OS

I downloaded source from: http://www.genome.umd.edu/jellyfish.html

After going into the directory, I tried:
$ ./configure #seems to have worked
$ make #this generates the error message below.
make all-am
CXX lib/rectangular_binary_matrix.lo
In file included from lib/rectangular_binary_matrix.cc:22:
./include/jellyfish/rectangular_binary_matrix.hpp:264:5: warning: 'register' storage class specifier is deprecated [-Wdeprecated-register]
register xmm_t acc = acc ^ acc; // Set acc to 0
^~~~~~~~~
./include/jellyfish/rectangular_binary_matrix.hpp:265:5: warning: 'register' storage class specifier is deprecated [-Wdeprecated-register]
register xmm_t load = load ^ load;
^~~~~~~~~
2 warnings generated.
CXX lib/mer_dna.lo
CXX lib/storage.lo
CXX lib/allocators_mmap.lo
CXX lib/misc.lo
lib/misc.cc:86:11: error: no member named 'all_of' in namespace 'std'
if(std::all_of(arg.begin(), arg.end(), isblunt))
~~~~~^
1 error generated.
make[1]: *** [lib/misc.lo] Error 1
make: *** [all] Error 2

I have also tried installing via Homebrew, but it also throws an error. I know it's possible to install with MacPorts, but I was hoping I could install this program without downloading, installing, and using another package manager (I already have Homebrew, which I try to use sparingly). Any help would be appreciated.

jellyfish count returns empty file when using --min-qual-char on fasta file

It appears that jellyfish (version 2.1.1) fails when using the flag --min-qual-char on a fasta file. For example,
jellyfish count test.fasta -m 20 -s 100M -t 20 -Q C -o test.jf
results in file test.jf with no kmers in it.

Perhaps it would be better for jellyfish to either throw an error when combining -Q with a fasta file, or else simply ignore the -Q flag (I think the latter might be better).
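Until the tool handles this case itself, a wrapper script could check the input format before adding -Q. The following is a minimal Python sketch of that idea, not Jellyfish code: the helper names `sniff_format` and `count_args` are hypothetical, and the check assumes that FASTA records start with '>' and FASTQ records with '@'.

```python
def sniff_format(text):
    """Guess 'fasta' or 'fastq' from the first non-blank line of the input."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        break
    return "unknown"


def count_args(path, head, min_qual_char=None):
    """Build a `jellyfish count` argument list, dropping -Q for non-FASTQ input.

    `head` holds the first few lines of the file, read by the caller.
    The -m, -s and -o values mirror the command in the report above.
    """
    args = ["jellyfish", "count", path, "-m", "20", "-s", "100M", "-o", "test.jf"]
    if min_qual_char is not None and sniff_format(head) == "fastq":
        args += ["-Q", min_qual_char]
    return args
```

With this wrapper, a FASTA input is counted without -Q, which corresponds to the "simply ignore the flag" behaviour suggested above.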

[2.2.3] Installation error (No rule to make target `python/swig_wrap.cpp')

Hi, thank you very much for developing this awesome software.
I tried this program on my laptop and am satisfied with the Python binding.
I am now trying to install it with the Python binding on a server.

I encountered an installation error using the latest release tarball. I executed the commands as you instructed.

wget -O - "https://github.com/gmarcais/Jellyfish/archive/v2.2.3.tar.gz" | tar xzvf -
cd Jellyfish-2.2.3/
$HOME/bin/autoreconf -i
./configure --prefix=$HOME --enable-python-binding=$HOME/bin/python
make

Then, I encountered the following error message.

Making all in swig
make[2]: Entering directory `/home/yt/Jellyfish-2.2.3/swig'
make[2]: *** No rule to make target `python/swig_wrap.cpp', needed by `all'.  Stop.
make[2]: Leaving directory `/home/yt/Jellyfish-2.2.3/swig'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/yt/Jellyfish-2.2.3'
make: *** [all] Error 2

I think I have a proper version of g++/make/yaggo/autoconf/autoreconf as follows.

g++ --version
g++ (GCC) 4.9.3
make --version
GNU Make 3.81
yaggo --version
yaggo 1.5.8
$HOME/bin/autoconf --version
autoconf (GNU Autoconf) 2.69

I would be very happy if you could help me to install your awesome software.
Thank you in advance.

Removing dependency on yaggo

Hello Guillaume,
first, thanks for contributing Jellyfish, it is a great piece of software that opens up great possibilities!

The introduction of the dependency on yaggo makes the install process a bit more complex. Would it be possible to include a working version of yaggo with the Jellyfish distribution?

This is more a suggestion for enhancement than an actual issue!

jellyfish 2 run error

hi, I have installed jellyfish along with MaSuRCA, but am getting the following error from jellyfish-2.0:

scott@biolinux3:~/software/masurca/bin$ ./jellyfish-2.0
./jellyfish-2.0: error while loading shared libraries: libjellyfish-2.0.so.2: cannot open shared object file: No such file or directory

The MaSuRCA install suggests that paths should be in the sr_example_configuration but there are no paths in that file... any suggestions?

  • echo creating sr_config_example.txt with correct PATHS
    creating sr_config_example.txt with correct PATHS
  • /home/scott/software/masurca/bin/masurca -g sr_config_example.txt

Segmentation fault

After building the Perl binding for Jellyfish, I get a segmentation fault in this part of the code:

foreach my $m (@ARGV) {
  my $mer = Jellyfish::MerDNA->new($m);
  $mer->canonicalize;
  print($mer, " ", $qf->get($mer), "\n");
}

The possible cause is in:
$mer->canonicalize;

There is also a typo in this line:
my $mer = Jellyfish::MerDNA->new($m);
i.e. "Jellyfish" should be "jellyfish".
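For readers unfamiliar with the call in question: canonicalizing a k-mer conventionally means replacing it with whichever of the k-mer and its reverse complement sorts first. The sketch below illustrates that convention in plain Python. It is independent of the Perl binding, the function names are hypothetical, and it does not reproduce or explain the segmentation fault.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(mer):
    """Return the reverse complement of a DNA string."""
    return mer.upper().translate(COMPLEMENT)[::-1]


def canonical(mer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    mer = mer.upper()
    return min(mer, reverse_complement(mer))
```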

Thanks

yaggo error: info_main_cmdline.yaggo:6: syntax error

I get the following yaggo syntax error when building Jellyfish head. I tried both Yaggo 1.5.4 and Yaggo head.

❯❯❯ yaggo --version
yaggo 1.5.4
❯❯❯ ruby --version
ruby 2.0.0p481 (2014-05-08 revision 45883) [universal.x86_64-darwin14]
❯❯❯ yaggo info_main_cmdline.yaggo
info_main_cmdline.yaggo:6: syntax error, unexpected tIDENTIFIER, expecting keyword_do or '{' or '('
file was created. Without any argument, it displays the command line
                                                   ^
info_main_cmdline.yaggo:7: syntax error, unexpected keyword_when, expecting '='
used, when and where it was run.
          ^

Cannot be built on i686, mips, and armhf.

Jellyfish cannot be built on any other architecture than x86_64. On at least three other platforms building fails because of embedded assembly instructions in ./include/jellyfish/rectangular_binary_matrix.hpp. Is there any way to make this work on ARM, i686, or mips64el? Or is x86_64 the only supported architecture?

Here are complete build logs for the three failing architectures:

ARM:
http://hydra.gnu.org/build/908724/log/raw

i686:
http://hydra.gnu.org/build/908727/log/raw

mips64el:
http://hydra.gnu.org/build/908726/log/raw

merge error

Hello,

When running jellyfish with the following command, piping fastq out of a bam file, I get the following error:

time /home/aflit001/dev/phylogenomics2/jellyfish count -t 10 -F 900 -m 11 -s 128M --disk --counter-len=1 --out-counter-len=1 --canonical -o sa.fa.bam.filtered.bam.11.jf.tmp --timing=sa.fa.bam.filtered.bam.11.jf.t <( bamToFastq -i ./denovo/sa.fa.bam.filtered.bam -fq /dev/stdout )

terminate called after throwing an instance of 'MergeError'
what(): Failed to open input file 'sa.fa.bam.filtered.bam.11.jf.tmp1020'
Aborted (core dumped)

Can you help with this?

Using jellyfish count --threads on a distributed network

I am running jellyfish on a HTCondor high-throughput distributed computing pool.

For different sequencing projects, I will want to run it with different values for the --threads parameter. For example, for some projects, I might split the work into a small number of multithreaded jobs, but for a project with many small FastQ inputs, I might get higher throughput with --threads 1 because the pool has many more single-thread nodes available.

How should I build jellyfish to support this?

  1. Do I need to build it on a node with multiple cores? with the maximum number of cores that I will use? Or can I build it on a node with a single core?
  2. What flags do I use for configure and make? Do I need to use
    make -j N, where N is the maximum number of cores that I will use?
  3. Is there a way to check whether a given binary supports multithreading?

Thanks,
Steve Goldstein
University of Wisconsin-Madison

high memory when k is small

Jellyfish is excellent software with respect to memory consumption. Version 1.1.11 used very little memory, and the latest version also works well when k is large (e.g. k=25), but strangely it consumes a lot of memory when k is small (e.g. k=9). Can you fix this? Thanks.
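One piece of arithmetic may help frame this report: there are at most 4^k distinct k-mers over A, C, G, T, so for small k a table cannot usefully hold more entries than that bound. The sketch below only computes the bound (the function name is hypothetical) and makes no claim about how Jellyfish sizes its hash internally.

```python
def max_distinct_kmers(k, canonical=False):
    """Upper bound on the number of distinct k-mers of length k.

    With canonical=True, a k-mer and its reverse complement count as one.
    K-mers equal to their own reverse complement exist only for even k,
    and there are 4**(k // 2) of them.
    """
    total = 4 ** k
    if not canonical:
        return total
    palindromes = 4 ** (k // 2) if k % 2 == 0 else 0
    return (total + palindromes) // 2
```

For k=9 the bound is 262144 k-mers, whereas for k=25 it is about 1.1e15, far more than any real data set contains.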

stats

Is there an easy way to get the info that I used to, using jellyfish stats? I guess I could sum the histogram, but maybe there is another way.
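Summing the histogram can be sketched in a few lines of Python. This assumes the text produced by jellyfish histo has two whitespace-separated columns per line (an occurrence count, then the number of distinct k-mers with that count); the function name is hypothetical. If the last histogram bin lumps together all higher counts, the total and maximum computed here are only lower bounds.

```python
def stats_from_histo(histo_text):
    """Derive unique/distinct/total/max_count figures from histogram text."""
    unique = distinct = total = max_count = 0
    for line in histo_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        count, n_kmers = int(fields[0]), int(fields[1])
        if count == 1:
            unique += n_kmers          # k-mers seen exactly once
        distinct += n_kmers            # every distinct k-mer, any count
        total += count * n_kmers       # all k-mer occurrences
        if n_kmers > 0:
            max_count = max(max_count, count)
    return {"unique": unique, "distinct": distinct,
            "total": total, "max_count": max_count}
```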

libjellyfish-2.0.so.2: cannot open shared object file error

I have been trying to use jellyfish to count k-mers of size 4, using the following command:
jellyfish count -s 400000 -t 32 -C -m 4 -o 4mers.txt
But I keep getting the error "libjellyfish-2.0.so.2: cannot open shared object file". Can you help me understand this error and how to get past it?

Can't merge hash with different size

Hi!

I am trying to merge two jellyfish databases, but it exits with this message:
Can't merge hash with different size (4294967296, 2147483648)
Is there any way to work it out?
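A note on the numbers in the message: both are exact powers of two, 2^32 and 2^31, so the two databases appear to have hash tables that differ in size by a factor of two. The Python sketch below (helper names hypothetical) only extracts and checks the numbers; it does not say how to reconcile the two databases.

```python
import re


def merge_sizes(message):
    """Extract the two hash sizes from a 'different size (a, b)' message."""
    match = re.search(r"\((\d+),\s*(\d+)\)", message)
    if match is None:
        raise ValueError("no size pair found in message")
    return int(match.group(1)), int(match.group(2))


def log2_if_power_of_two(n):
    """Return log2(n) when n is a power of two, else None."""
    if n > 0 and n & (n - 1) == 0:
        return n.bit_length() - 1
    return None
```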

I am using v2.2.4.

Thank you in advance.

Using gzip streams with the Jellyfish parser

Hi Guillaume,

We have a number of Sailfish and Salmon users asking for us to directly support gzipped fasta/q files. Right now, the only way to achieve this online (i.e. have sailfish or salmon read directly from the compressed file) is to use process substitution or named pipes. I like this solution, as I find it rather elegant (and it moves the decompression burden --- and the burden of supporting many different file formats --- off of the tool itself). However, this seems to be a very popular request, and the process substitution syntax does sometimes wreak havoc when it needs to be escaped (in qsub scripts or gnu-parallel commands etc.). Since we already rely on Boost, I was wondering if it is possible to use the Boost IOStream filters with the existing Jellyfish parser? For example, would it be possible to check the format of a file, if it is gzip, wrap it in the gzip decompression filter, and then pass it off to the Jellyfish parser somehow?
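The "check the format, then wrap in a decompression filter" idea can be illustrated outside of C++ and Boost. The following Python sketch is an analogue only, not the Jellyfish parser or the Boost IOStreams API: it peeks at the first two bytes of a binary stream, and if they match the gzip magic number (0x1f 0x8b) it wraps the stream in a gzip decompressor before handing text to whatever parser comes next. The function name `open_maybe_gzip` is hypothetical.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(raw):
    """Return a text stream over `raw`, decompressing it if it is gzip.

    `raw` is a binary file-like object. Peeking does not consume data,
    so this also works on non-seekable inputs such as pipes.
    """
    if isinstance(raw, io.BufferedReader):
        buffered = raw
    else:
        buffered = io.BufferedReader(raw)
    if buffered.peek(2)[:2] == GZIP_MAGIC:
        return io.TextIOWrapper(gzip.GzipFile(fileobj=buffered), encoding="ascii")
    return io.TextIOWrapper(buffered, encoding="ascii")
```

The downstream parser sees the same text either way, which keeps format detection separate from parsing.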

Thanks,
Rob

Error at test stage (tests/big.sh)

Dear Guillaume Marcais,

I am Ram, I have recently started to work on metagenomics analysis. I would like to thank you and your research team for coming up with a wonderful application for microbial community.

I have installed the latest version of JELLYFISH, 1.1.11, from this URL (http://www.cbcb.umd.edu/software/jellyfish/). I followed the instructions in the readme file. After the installation steps, I performed 2 different tests.

  1. Running the built-in tests. This was successful: all the tests passed, except that tests/big.sh was skipped.
  2. Running the tests on a large dataset. Here one test failed (tests/big.sh). The remaining 19 tests passed.

I have pasted the error for your reference.

1 of 20 tests failed
See ./test-suite.log

Please report to [email protected]

make[2]: *** [test-suite.log] Error 1
make[2]: Leaving directory `/home/wenchenaafc/jellyfish-1.1.11'
make[1]: *** [check-TESTS] Error 2
make[1]: Leaving directory `/home/wenchenaafc/jellyfish-1.1.11'
make: *** [check-am] Error 2

Can you please help me in resolving this error? Thanks in advance.

Cheers,
Ram

Make error

Hi, I got a make error when compiling the git version today. Please help.
cmd:

git clone https://github.com/gmarcais/Jellyfish.git
autoreconf -i
./configure --prefix=/path/to/jellyfish/v2.1.4/x86_64
make

CONFIGURE

checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @file support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for md5sum... md5sum
checking for yaggo... /usr/users/celldev/luf/local/yaggo/v1.5.4/x86_64/bin/yaggo
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for __int128... yes
checking for std::numeric_limits<__int128>... no
checking for _NSGetExecutablePath... no
checking for execinfo.h... yes
checking for ext/stdio_filebuf.h... yes
checking for siginfo_t.si_int... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating tests/compat.sh
config.status: creating jellyfish-2.0.pc
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands

Make error:

YAGGO sub_commands/count_main_cmdline.hpp
YAGGO sub_commands/info_main_cmdline.hpp
YAGGO sub_commands/dump_main_cmdline.hpp
YAGGO sub_commands/histo_main_cmdline.hpp
YAGGO sub_commands/stats_main_cmdline.hpp
YAGGO sub_commands/merge_main_cmdline.hpp
YAGGO sub_commands/bc_main_cmdline.hpp
YAGGO sub_commands/query_main_cmdline.hpp
YAGGO sub_commands/cite_main_cmdline.hpp
YAGGO sub_commands/mem_main_cmdline.hpp
YAGGO jellyfish/generate_sequence_cmdline.hpp
YAGGO unit_tests/test_main_cmdline.hpp
make all-am
make[1]: Entering directory `/usr/users/celldev/luf/local/jellyfish/v2.1.4/Jellyfish'
CXX lib/rectangular_binary_matrix.lo
CXX lib/mer_dna.lo
CXX lib/storage.lo
CXX lib/allocators_mmap.lo
CXX lib/misc.lo
CXX lib/int128.lo
CXX lib/thread_exec.lo
CXX lib/jsoncpp.lo
CXX lib/time.lo
CXX lib/generator_manager.lo
CXX sub_commands/jellyfish.o
CXX sub_commands/count_main.o
sub_commands/jellyfish.cc: In function ‘int version(int, char**)’:
sub_commands/jellyfish.cc:130:16: error: ‘PACKAGE_STRING’ was not declared in this scope
std::cout << PACKAGE_STRING << std::endl;
^
make[1]: *** [sub_commands/jellyfish.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/usr/users/celldev/luf/local/jellyfish/v2.1.4/Jellyfish'
make: *** [all] Error 2

Error while loading shared libraries

Hi,

I found why we have this error

jellyfish: error while loading shared libraries: libjellyfish-2.0.so.2: cannot open shared object file: No such file or directory

After compiling Jellyfish, when you type "(sudo) make install", it is the dynamically linked binary that is installed.

oxis@soyah:[~/etc/src/Jellyfish]$ ldd ./bin/.libs/jellyfish
    linux-vdso.so.1 =>  (0x00007ffeb51c1000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1186a77000)
    libjellyfish-2.0.so.2 => not found
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1186773000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f118646d000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1186257000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1185e92000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1186c95000)

oxis@soyah:[~/etc/src/Jellyfish]$ ldd ./bin/jellyfish
        not a dynamic executable
oxis@soyah:[~/etc/src/Jellyfish]$ md5sum ./bin/.libs/jellyfish
    8c89ef850aba61e4e0f7dd63c7098244  ./bin/.libs/jellyfish

oxis@soyah:[~/etc/src/Jellyfish]$ md5sum /usr/local/bin/jellyfish 
    8c89ef850aba61e4e0f7dd63c7098244  /usr/local/bin/jellyfish

Note the "libjellyfish-2.0.so.2 => not found".
Sure, we can put export LD_LIBRARY_PATH="/usr/local/lib" in our .bashrc instead, or fix the "not found".
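Spotting the unresolved library in ldd output can be automated. The following Python sketch (the function name is hypothetical) scans text in the format shown above and returns every library reported as "not found". It does not run ldd itself; the caller supplies the output.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines look like: "libfoo.so.1 => /path/libfoo.so.1 (0x...)"
        name, sep, target = line.strip().partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing
```

Applied to the first ldd listing above, this returns only libjellyfish-2.0.so.2.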

I tried to find where you say which binary you want to install with no success. I'm not used to automake (more a cmake guy 😄)

Cheers!

Ben

RE: 2.1.2 is not buildable #11

I am in awe of the providers of software.
Having said that, as a 'potential' biologist user "what a nightmare!"
I would love to try Jellyfish
I am using a mid-2009 MacBook Pro with El Capitan, Xcode CLI MacPorts ...
I've just updated everything (except the gcc, which is gcc5 experimental :-( I don't seem to get it in CLI or MacPorts - g++ is in fink, I seem to have clang )
and followed stuff at issue #11
to try and solve the "No rule to make target `false', needed by `sub_commands/count_main_cmdline.hpp'" error
I downloaded the tar from the Jellyfish 2.0 web site (but it takes you to Git)
and it turns out that is where yaggo is used in the makefile

yaggo install
Quick and easy
To install the yaggo script into your home directory, do:
make DEST=$HOME/bin

I am now lost at gmarcais/yaggo I can't even see what to download and I have no idea whether it will work on the mac, and when I do, do I now have to install ruby ... it is never ending (and of course, for the next programme, completely different) - sorry for the moan :-(
Any help gratefully received
Thanks
Alan Ward

SWIG wrappers fail to link on CentOS 6

I can't seem to get the Python swig wrappers to link on CentOS 6.

SWIG 3.0.7, gcc 4.4.7, Python 2.7.6.

I've googled for a while but nothing is turning up about how to fix this issue.

make[1]: Entering directory `Jellyfish-2.2.3/swig'
  CXX    python/python__jellyfish_la-swig_wrap.lo
  CXXLD  python/_jellyfish.la
/usr/bin/ld: lib/libpython2.7.a(abstract.o): relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a shared object; recompile with -fPIC
lib/libpython2.7.a: could not read symbols: Bad value
