dummybinner

A simple ("dummy") binner for contigs and reads that uses k-mer frequencies, GC content, and coverage. It has few dependencies and produces basic results.
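To make the composition features concrete, here is a minimal sketch of how GC content and k-mer frequencies could be computed for one sequence. It is an illustration only, not the code in dummy_binner2.py: the function names `gc_content` and `kmer_frequencies` are hypothetical, ambiguous bases are simply skipped, and reverse-complement k-mers are not merged.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / acgt if acgt else 0.0


def kmer_frequencies(seq: str, k: int = 4) -> dict:
    """Relative frequency of every k-mer over the ACGT alphabet.

    Windows containing ambiguous bases (for example N) are ignored,
    so the frequencies sum to 1 whenever at least one valid window exists.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}
```

With k = 4 this yields a 256-value tetranucleotide profile per sequence, which can be combined with GC content and coverage into one feature vector.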

Citation

Dummy binner is released under a GNU public license. Enjoy it.

If you cite it, please use:

Santos-Junior, C.D. (2023) Dummy binner - Using tetranucleotide frequencies, GC and
coverage to cluster contigs and reads. Software in GitHub Repository available
in <https://github.com/celiosantosjr/dummybinner>. Accessed in Month/Year.

Installing

Dummy binner requires Python >3.9 and a set of packages that are easily installed with:

$ pip install biopython numpy pandas scipy scikit-learn tqdm pysam

The versions required follow:

Requirement  Version
biopython    1.79
numpy        1.22.4
pandas       1.4.4
scipy        1.9.1
sklearn      0.0
tqdm         4.64.0
pysam        0.16.0.1
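
To compare an environment against the versions listed above, a small check along these lines may help. It is a sketch that relies only on the standard library; the helper name `installed_versions` is hypothetical, and the names passed to it are the pip distribution names from the install command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    required = ["biopython", "numpy", "pandas", "scipy",
                "scikit-learn", "tqdm", "pysam"]
    for name, ver in installed_versions(required).items():
        print(f"{name}\t{ver or 'not installed'}")
```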

Usage

usage: dummy_binner2.py [-h] [--r R] --mode MODE [--otag OTAG]
                        [--minlen MINLEN] [--k K] [--threshold THRESHOLD]
                        [--mincontigs MINCONTIGS] [--cov COV]
                        (--infile INFILE | --f F)

In reads mode, Dummy binner takes the R1 and R2 files (--f and --r, respectively) and, optionally, a minimum sequence length (--minlen) and a tag for the output filename (--otag). The basic usage is:

python3 dummy_binner2.py --f <R1.fq.gz> --r <R2.fq.gz> --mode reads

Binning contigs takes a few more parameters. You must provide a contigs file (--infile), and you can optionally set the minimum contig length to accept (--minlen), the minimum number of contigs for a bin to be kept (--mincontigs), the k-mer size (--k), and the maximum distance between two contigs (--threshold). The basic usage is:

python3 dummy_binner2.py --infile <contigs.fa> --mode contigs
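
How a distance threshold and a minimum bin size can interact is sketched below. This README does not state which clustering method or distance metric dummy_binner2.py uses, so the example assumes Euclidean distance and single-linkage grouping purely for illustration; `bin_by_threshold` and its arguments are hypothetical names that only mirror --threshold and --mincontigs.

```python
import math
from collections import defaultdict


def bin_by_threshold(features, threshold, mincontigs=1):
    """Group contigs linked by pairwise distances <= threshold.

    features: dict of contig name -> sequence of floats (equal lengths).
    Returns sorted bins (lists of names) with at least `mincontigs` members.
    """
    names = list(features)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of contigs that lies within the threshold (assumed Euclidean).
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(features[a], features[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)
    # Discard bins that are smaller than the requested minimum size.
    return sorted(sorted(g) for g in groups.values() if len(g) >= mincontigs)
```

In this sketch a larger threshold merges more contigs into each bin, while a larger minimum size discards small, poorly supported bins.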

To generate the coverage profiles used by the binner, run the script make_coverages.py. It works on a set of sorted and indexed BAM files. The basic usage is:

python3 make_coverages.py [-h] -c CONTIGS -o OUTPUT bam_files [bam_files ...]
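
For readers unfamiliar with coverage profiles, the sketch below shows one common definition: mean depth per contig, computed as aligned bases divided by contig length. It is not the code in make_coverages.py, whose output format is not described here. The function name `mean_coverage` and the (contig, start, end) input tuples are assumptions; with pysam, such tuples could for instance be built from the `reference_name`, `reference_start`, and `reference_end` attributes of reads returned by `AlignmentFile.fetch()`.

```python
from collections import defaultdict


def mean_coverage(contig_lengths, alignments):
    """Mean depth per contig = aligned bases / contig length.

    contig_lengths: dict of contig name -> length in bp.
    alignments: iterable of (contig, start, end) tuples with 0-based,
        half-open coordinates, one tuple per aligned read.
    """
    aligned_bases = defaultdict(int)
    for contig, start, end in alignments:
        length = contig_lengths.get(contig)
        if length is None:
            continue  # alignment to a contig that is not being tracked
        # Clip to the contig bounds so malformed records cannot inflate depth.
        start, end = max(start, 0), min(end, length)
        if end > start:
            aligned_bases[contig] += end - start
    return {
        name: (aligned_bases[name] / length if length else 0.0)
        for name, length in contig_lengths.items()
    }
```

Running this once per BAM file gives one coverage value per contig and sample, which is the usual shape of a coverage profile.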

Contact

If you run into any issues, please contact me.
