hts-nim-tools's Introduction

hts-nim-tools

This repository contains a number of tools built with hts-nim. They are intended both as examples of how to use hts-nim and as useful tools in their own right.

These tools are:

hts-nim utility programs.
version: $version

	• bam-filter    : filter BAM/CRAM/SAM files with a simple expression language
	• count-reads   : count BAM/CRAM reads in regions given in a BED file
	• vcf-check     : check regions of a VCF against a background for missing chunks

Each of these is described in more detail below.

bam-filter

Use simple expressions to filter a BAM/CRAM file:

bam-filter

  Usage: bam-filter [options] <expression> <BAM-or-CRAM>

  -t --threads <threads>       number of BAM decompression threads [default: 0]
  -f --fasta <fasta>           fasta file for use with CRAM files [default: $env_fasta].

Valid expressions may access the BAM attributes:

  • mapq / start / pos / end / flag / insert_size (where pos is the 1-based start)
  • is_aligned is_read1 is_read2 is_supplementary is_secondary is_dup is_qcfail
  • is_reverse is_mate_reverse is_pair is_proper_pair is_mate_unmapped is_unmapped

To use aux tags, prefix them with 'tag_', e.g.:

tag_NM < 2. Any tag present in the BAM can be used in this manner.

example:

bam-filter "tag_NM == 2 && tag_RG == 'SRR741410' && is_proper_pair" tests/HG02002.bam
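
To make the semantics of the example expression concrete, here is a minimal Python sketch of the same predicate applied to plain dictionaries. It is an illustration only and not how bam-filter is implemented; the record layout and the `matches` helper are hypothetical. The one external fact used is from the SAM specification: flag bit 0x2 marks a properly paired read.

```python
# Illustration only: a hand-written equivalent of the expression
#   tag_NM == 2 && tag_RG == 'SRR741410' && is_proper_pair
# The record layout and the helper name are hypothetical.

PROPER_PAIR = 0x2  # SAM flag bit: read mapped in a proper pair


def matches(record):
    """Return True if the record passes the example expression."""
    tags = record["tags"]
    return (
        tags.get("NM") == 2
        and tags.get("RG") == "SRR741410"
        and bool(record["flag"] & PROPER_PAIR)
    )


reads = [
    {"name": "r1", "flag": 99, "tags": {"NM": 2, "RG": "SRR741410"}},  # passes
    {"name": "r2", "flag": 99, "tags": {"NM": 0, "RG": "SRR741410"}},  # NM != 2
    {"name": "r3", "flag": 65, "tags": {"NM": 2, "RG": "SRR741410"}},  # not a proper pair
]
kept = [r["name"] for r in reads if matches(r)]
print(kept)
```

Only the first record satisfies all three conditions, so only `r1` is kept.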

count-reads

Count reads reports the number of reads overlapping each interval in a BED file.

count-reads

  Usage: count-reads [options] <BED> <BAM-or-CRAM>

Arguments:

  <BED>          the bed file containing regions in which to count reads.
  <BAM-or-CRAM>  the alignment file for which to calculate depth.

Options:

  -t --threads <threads>      number of BAM decompression threads [default: 0]
  -f --fasta <fasta>          fasta file for use with CRAM files [default: ].
  -F --flag <FLAG>            exclude reads with any of the bits in FLAG set [default: 1796]
  -Q --mapq <mapq>            mapping quality threshold [default: 0]
  -h --help                   show help

This outputs a line with a count of reads for each line in <BED>.
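
The filtering and counting described above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the tool's implementation: the tuple layouts and function names are hypothetical, the comparison `mapq >= min_mapq` is an assumption about how the threshold is applied, and a real tool would use the BAM index rather than scanning every read. The default mask 1796 is the sum of the SAM flag bits 4 (unmapped), 256 (secondary), 512 (QC fail) and 1024 (duplicate).

```python
# Sketch only: count reads overlapping BED intervals, applying a flag
# mask (-F) and a mapping-quality threshold (-Q).
# Tuple layouts and function names are hypothetical.

EXCLUDE_MASK = 1796  # 4 (unmapped) + 256 (secondary) + 512 (qcfail) + 1024 (dup)


def passes(read, exclude_mask=EXCLUDE_MASK, min_mapq=0):
    """Keep a read if no masked flag bit is set and mapq is high enough.

    read is (chrom, start, end, flag, mapq).
    Assumption: the threshold is inclusive (mapq >= min_mapq).
    """
    _, _, _, flag, mapq = read
    return (flag & exclude_mask) == 0 and mapq >= min_mapq


def count_reads(intervals, reads, exclude_mask=EXCLUDE_MASK, min_mapq=0):
    """Return one (chrom, start, end, count) row per BED interval.

    Coordinates are treated as 0-based and half-open, as in BED.
    This scans all reads per interval, which is fine for a demo only.
    """
    rows = []
    for chrom, start, end in intervals:
        n = sum(
            1
            for r in reads
            if r[0] == chrom and r[1] < end and r[2] > start
            and passes(r, exclude_mask, min_mapq)
        )
        rows.append((chrom, start, end, n))
    return rows


intervals = [("chr5", 100, 200), ("chr12", 0, 50)]
reads = [
    ("chr5", 90, 110, 99, 60),     # overlaps the first interval
    ("chr5", 150, 250, 1123, 60),  # duplicate bit (1024) set: excluded
    ("chr5", 200, 300, 99, 60),    # starts at the interval end: no overlap
    ("chr12", 10, 40, 163, 0),     # overlaps the second interval, mapq 0
]
for row in count_reads(intervals, reads):
    print("\t".join(map(str, row)))
```

With the defaults each interval gets a count of 1; raising `min_mapq` to 1 drops the `chr12` read.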

vcf-check

vcf-check is useful as a quality control for large projects in which variants are called region by region in parallel. With many regions and large projects, some regions can fail without the analyst noticing.

This tool takes a background VCF, such as gnomAD, that has whole-genome coverage (though in some cases users will instead want whole-exome coverage) and uses it as an expectation of variants. If the background has many variants across a long stretch of genome where the query VCF has no variation, that region was likely missed in the query VCF.
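
The idea can be sketched in Python as binning both variant sets into fixed-size chunks and comparing the counts. This is a minimal sketch, not vcf-check's code: the `(chrom, pos, af)` tuples and function names are hypothetical, and it assumes that background sites with allele frequency below the cutoff are skipped and that chunks are aligned to multiples of the chunk size.

```python
from collections import Counter

# Sketch only: compare per-chunk variant counts between a background
# and a query. Variants are hypothetical (chrom, pos, af) tuples.


def chunk_counts(variants, chunk=100000, maf=None):
    """Count variants per (chrom, chunk_start) bin.

    If maf is given, variants with allele frequency below it are skipped
    (assumption about how the cutoff is applied).
    """
    counts = Counter()
    for chrom, pos, af in variants:
        if maf is not None and af < maf:
            continue
        counts[(chrom, pos // chunk * chunk)] += 1
    return counts


def compare(background, query, chunk=100000, maf=0.1):
    """Return (chrom, chunk_start, background_count, query_count) rows."""
    bg = chunk_counts(background, chunk, maf)
    q = chunk_counts(query, chunk)  # no frequency filter on the query
    keys = sorted(set(bg) | set(q))
    return [(c, s, bg[(c, s)], q[(c, s)]) for c, s in keys]


background = [
    ("chr1", 150, 0.30),
    ("chr1", 250, 0.05),     # below the cutoff, ignored
    ("chr1", 100500, 0.20),
    ("chr1", 100900, 0.40),
]
query = [("chr1", 180, None)]
rows = compare(background, query)
for row in rows:
    print("\t".join(map(str, row)))
```

In this toy input the second chunk has two background variants and none in the query, which is the pattern that suggests a missed region.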

Check a VCF against a background to make sure that there are no large missing chunks.

  vcf-check

  Usage: vcf-check [options] <BACKGROUND_VCF> <VCF>

Arguments:
  <BACKGROUND_VCF>        population VCF/BCF with expected sites
  <VCF>                   query VCF/BCF to check

Options:

  -c --chunk <INT>        chunk size for genome [default: 100000]
  -m --maf <FLOAT>        allele frequency cutoff [default: 0.1]

This will output a tab-delimited file of chrom\tposition\tbackground-count\tquery-count.

The user can find regions that might be problematic by plotting or with some simple awk commands.
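
As an alternative to awk, the same kind of filter can be sketched in Python. The sketch assumes the four tab-delimited columns described above with no header line; the `min_background` threshold of 10 and the function name are hypothetical choices for illustration.

```python
import csv

# Sketch only: flag chunks where the background has many variants but
# the query has none. Assumes four tab-delimited columns, no header.


def suspicious_chunks(lines, min_background=10):
    """Return (chrom, position) for chunks with a query count of 0 and
    a background count of at least min_background (hypothetical threshold)."""
    flagged = []
    for chrom, pos, bg, q in csv.reader(lines, delimiter="\t"):
        if int(bg) >= min_background and int(q) == 0:
            flagged.append((chrom, int(pos)))
    return flagged


example = [
    "chr1\t0\t120\t98",
    "chr1\t100000\t87\t0",   # many background variants, none in the query
    "chr1\t200000\t3\t0",    # too few background variants to be informative
]
flagged = suspicious_chunks(example)
print(flagged)
```

To run it on real output, pass an open file handle instead of the `example` list.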

hts-nim-tools's Issues

Opening BAMs doesn't appear to return bool

var bam: Bam
assert open(bam, "myBam.bam", index = true)

gives

Error: type mismatch: got (void)
but expected one of: 
template assert(cond: bool; msg = "")

Am I missing something obvious? Thanks!

example tools

It would be nice to have a suite of example tools to further show off the syntax and utility of hts-nim.
Please leave a comment here with a description.

The main strength of hts-nim is speed, so things that are too slow in Python would be good candidates. It also has very good syntax for VCF and BAM.

Question: Convert a seq[CigarElement] to Cigar

Hi Brent,

First of all, thanks for hts-nim!

I'm writing a realigner using hts-nim. For the realignment I have a seq[CigarElement] but am struggling to convert that to Cigar. newCigar() takes a ptr uint32, so I don't know how to go about this. It would be great to see examples of how to create Cigars from scratch. Ultimately I would like to replace a Cigar in a given read with this new Cigar. Any chance you would have some example code for that scenario as well?

Many thanks,
Andreas
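
For background on the ptr uint32 mentioned above: in the BAM format each CIGAR operation is stored as one 32-bit unsigned integer, with the length in the upper 28 bits and the operation code (an index into "MIDNSHP=X") in the lower 4 bits. The Python sketch below shows only that packing. It does not show the hts-nim calls needed to build a Cigar or to replace one in a read, and the function names are hypothetical.

```python
# Sketch only: pack and unpack CIGAR operations using the BAM encoding
# (length << 4 | op_code). Function names are hypothetical.

CIGAR_OPS = "MIDNSHP=X"  # op codes 0..8, as defined in the SAM/BAM specification


def pack_cigar(elements):
    """Convert [(op_char, length), ...] to a list of uint32 values."""
    return [(length << 4) | CIGAR_OPS.index(op) for op, length in elements]


def unpack_cigar(packed):
    """Inverse of pack_cigar."""
    return [(CIGAR_OPS[value & 0xF], value >> 4) for value in packed]


elements = [("M", 50), ("I", 2), ("M", 48)]
packed = pack_cigar(elements)
print(packed)
```

A contiguous array of such values is what a function taking a pointer to uint32 would be expected to read.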

`hts_nim_tools count-reads <bed> <bam>` only returns 0 counts

Do we need a specific version of hts-nim-tools to get this working? I used conda install -c conda-forge -c bioconda hts-nim-tools to install hts-nim-tools.

My BED files are in the following format:

chr5    38702540        38722603
chr12   98136449        98156541
chr8    91644923        91665040
...

Installation error

Hi, I tried to install hts-nim-tools. My operating system is CentOS 7.2.

My procedure:

curl https://nim-lang.org/choosenim/init.sh -sSf | sh
export PATH=$PATH:$HOME/.nimble/bin

nimble install -y https://github.com/brentp/hts-nim-tools --nimbleDir:$HOME/.nimble

Downloading https://github.com/brentp/hts-nim-tools using git
Verifying dependencies for [email protected]
Info: Dependency on c2nim@>= 0.9.10 already satisfied
Verifying dependencies for [email protected]
Info: Dependency on docopt@any version already satisfied
Verifying dependencies for [email protected]
Info: Dependency on regex@>= 0.7.4 already satisfied
Verifying dependencies for [email protected]
Info: Dependency on unicodedb@>= 0.7.2 already satisfied
Verifying dependencies for [email protected]
Info: Dependency on unicodeplus@>= 0.5.0 already satisfied
Verifying dependencies for [email protected]
Info: Dependency on unicodedb@>= 0.7 already satisfied
Verifying dependencies for [email protected]
Info: Dependency on lapper@any version already satisfied
Verifying dependencies for [email protected]
Info: Dependency on hts@any version already satisfied
Verifying dependencies for [email protected]
Info: Dependency on kexpr@any version already satisfied
Verifying dependencies for [email protected]
Installing [email protected]
Building hts_nim_tools/hts_nim_tools using c backend
Error: Build failed for package: hts_nim_tools
... Details:
... Execution failed with exit code 1
... Command: "/home/yupeng/.nimble/bin/nim" c --noBabelPath -d:release --path:"/home/yupeng/.nimble/pkgs/c2nim-0.9.14" --path:"/home/yupeng/.nimble/pkgs/docopt-0.6.8" --path:"/home/yupeng/.nimble/pkgs/regex-0.12.0" --path:"/home/yupeng/.nimble/pkgs/unicodedb-0.7.2" --path:"/home/yupeng/.nimble/pkgs/unicodeplus-0.5.1" --path:"/home/yupeng/.nimble/pkgs/unicodedb-0.7.2" --path:"/home/yupeng/.nimble/pkgs/lapper-0.1.4" --path:"/home/yupeng/.nimble/pkgs/hts-0.2.21" --path:"/home/yupeng/.nimble/pkgs/kexpr-0.0.2" -o:"/tmp/nimble_39277/githubcom_brentphtsnimtools/hts_nim_tools" "/tmp/nimble_39277/githubcom_brentphtsnimtools/src/hts_nim_tools.nim"
... Output: Hint: used config file '/home/yupeng/.choosenim/toolchains/nim-0.20.2/config/nim.cfg' [Conf]

... Hint: util [Processing]
... Hint: kexpr [Processing]
... Hint: version [Processing]
... Hint: lapper [Processing]
... /tmp/nimble_23162/githubcom_brentphtsnimtools/src/count_reads.nim(73, 19) Error: attempting to call undeclared routine: 'querys'

new tools: split vcf

from @davemcg's post:

split a vcf into n pieces:

hts-nim-tools vcf-split -n 10 --prefix split $vcf

will create split.1.vcf .. split.10.vcf.
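
One way such a split could work is sketched below in Python. The issue only proposes the tool, so this sketch makes its own assumptions: it assumes contiguous pieces of near-equal record count, each carrying the full header, and all function names are hypothetical.

```python
# Sketch only: split VCF lines into n contiguous pieces, repeating the
# header in each piece. Function names are hypothetical.


def split_vcf_lines(lines, n):
    """Return n lists of lines; header lines (starting with '#') are copied
    into every piece and records are divided as evenly as possible."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if not line.startswith("#")]
    size, extra = divmod(len(records), n)
    pieces, start = [], 0
    for i in range(n):
        stop = start + size + (1 if i < extra else 0)
        pieces.append(header + records[start:stop])
        start = stop
    return pieces


def piece_names(prefix, n):
    """File names in the split.1.vcf .. split.N.vcf style."""
    return [f"{prefix}.{i}.vcf" for i in range(1, n + 1)]


vcf = ["##fileformat=VCFv4.2", "#CHROM\tPOS"] + [f"chr1\t{p}" for p in range(1, 6)]
pieces = split_vcf_lines(vcf, 2)
print([len(p) for p in pieces], piece_names("split", 2))
```

Writing each piece to its matching file name would then reproduce the proposed output layout.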

Could not import: bam_hdr_destroy

Hi, when I tried to run the latest hts_nim_tools, it reports: "could not import: bam_hdr_destroy"

I searched it on the web and found the following: "Rename bam_hdr_init/_destroy/_dup() to sam_hdr_init/_destroy/_dup()" in htslib.

samtools/htslib@58b64d8

Is that the cause? Thanks.

could not import: sam_hdr_destroy

Hi there,
first of all, thank you very much for this repository! You made me curious about Nim some time ago; I gave it a try yesterday for the first time and I already love it.

I tried building the tools in this repository. I don't get compilation errors, but at runtime I get:

could not import: sam_hdr_destroy

This happened under both macOS Catalina and Ubuntu 16.04. In both cases the automatic installation of c2nim (via the nimble task named_build) failed, but nimble install c2nim didn't.
Any hints?

Thanks again
