
pybam's People

Contributors

johnlonginotto, muffato


pybam's Issues

Conversion to Python3

I converted pybam to Python 3. It is not yet fully functional: only dynamic parsing works.
It is currently available at https://github.com/luidale/pybam.

Would it be reasonable to incorporate it into the original pybam repository, and to update it until it is fully functional?

pybam parse error?

I've implemented your pybam module as a simple check in my package to see whether BAM headers match FASTA headers. A user got the error below. Any idea what it means?

Traceback (most recent call last):
  File "/mnt/apps/funannotate/bin/funannotate-predict.py", line 223, in <module>
    if not lib.BamHeaderTest(args.input, args.rna_bam):
  File "/mnt/apps/funannotate/lib/library.py", line 457, in BamHeaderTest
    bam_file = pybam.bgunzip(bamin)
  File "/mnt/apps/funannotate/lib/pybam.py", line 88, in __init__
    self.header_text = struct.unpack(str(length_of_header)+'s',first_chunk[8:8+length_of_header])[0]
struct.error: unpack requires a string argument of length 4906118

They thought it was caused by the length of the header, so the user truncated the header, but still got the same error. My guess is that the BAM file is malformed, but I don't know for sure.
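For context, struct.unpack raises this error whenever the buffer it is given is shorter than the format string demands, so the slice of first_chunk in the traceback was evidently shorter than length_of_header. One possible explanation (an assumption, not confirmed in this thread) is that the header is larger than the first decompressed chunk. A minimal sketch of a length check that reports this case explicitly; read_header_text is a hypothetical helper, not part of pybam:

```python
import struct


def read_header_text(first_chunk):
    """Return the BAM header text from the start of an uncompressed BAM stream.

    Hypothetical helper for illustration. Raises ValueError with a readable
    message when the buffered chunk is too short to contain the whole header.
    """
    # An uncompressed BAM stream starts with the magic bytes 'BAM\x01'
    # followed by a little-endian int32 holding the header text length.
    magic, length_of_header = struct.unpack('<4si', first_chunk[:8])
    if magic != b'BAM\x01':
        raise ValueError('not a BAM stream')
    available = len(first_chunk) - 8
    if available < length_of_header:
        raise ValueError('header needs %d bytes but only %d are buffered'
                         % (length_of_header, available))
    return struct.unpack('%ds' % length_of_header,
                         first_chunk[8:8 + length_of_header])[0]
```

With a check like this, a header that spans several chunks would produce a message naming both sizes instead of a bare struct.error.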

Letting you know

File "pybam.py", line 265
print 'PYBAM ERROR: Odd bzgip block detected! The author of pybam didnt think this would ever happen... please could you let me know?'

how to write the alignment back to a bam?

Hi John,
Is there a write function to write the record to a bam file?

The following does not produce a valid BAM file:

for read in pybam.read('my.bam'):
    fout.write(read.bam)

Thanks!
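A note on why raw writes fall short: a BAM file is a BGZF-compressed stream (a series of small gzip members with an extra "BC" field) that begins with the BAM header, so writing record bytes straight to a file yields neither the compression layer nor the header. The sketch below is not a pybam feature. It assumes you already hold the complete uncompressed BAM stream as bytes (header followed by records), and the names bgzf_block and write_bgzf are hypothetical:

```python
import struct
import zlib


def bgzf_block(data):
    """Wrap one chunk of bytes (at most about 64 KiB) in a single BGZF block."""
    # BGZF stores raw deflate data, hence the negative window size.
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
    cdata = compressor.compress(data) + compressor.flush()
    # 18 header bytes + compressed data + 8 trailer bytes, minus 1.
    bsize = len(cdata) + 25
    header = struct.pack('<BBBBIBBHBBHH',
                         31, 139, 8, 4,   # gzip magic, deflate, FEXTRA flag
                         0, 0, 255,       # mtime, extra flags, unknown OS
                         6,               # length of the extra field
                         66, 67, 2,       # 'B', 'C', subfield length
                         bsize)
    trailer = struct.pack('<II', zlib.crc32(data) & 0xffffffff, len(data))
    return header + cdata + trailer


def write_bgzf(fileobj, payload, chunk_size=0xff00):
    """Write payload as BGZF blocks, ending with an empty block as EOF marker."""
    for start in range(0, len(payload), chunk_size):
        fileobj.write(bgzf_block(payload[start:start + chunk_size]))
    fileobj.write(bgzf_block(b''))
```

Whether read.bam includes everything a record needs (such as its length prefix), and how to obtain the original header bytes from pybam, are open questions here. For real work, an established BAM writer is likely the safer choice.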

gzip: error writing to output: Broken pipe

Hi John,
Thanks for such a simple option for parsing a BAM file. I'm trying to use pybam for something really simple: I just want to parse the chromosome headers and check them against FASTA headers. What I have so far works, but I keep getting a gzip error at the end of my script.
Here is my little function that compares the FASTA headers to the chromosome headers in the BAM file to make sure they are the same:

def BamHeaderTest(genome, mapping):
    import pybam
    from Bio import SeqIO
    #get list of fasta headers from genome
    genome_headers = []
    with open(genome, 'rU') as input:
        for rec in SeqIO.parse(input, 'fasta'):
            if rec.id not in genome_headers:
                genome_headers.append(rec.id)
    #get list of fasta headers from BAM
    bam_file = pybam.bgunzip(mapping)
    bam_headers = bam_file.chromosomes_from_header
    #now compare lists, basically if BAM headers not in genome headers, then output bad names to logfile and return FALSE
    genome_headers = set(genome_headers)
    diffs = [x for x in bam_headers if x not in genome_headers]
    if len(diffs) > 0:
        log.debug("ERROR: These BAM headers not found in genome FASTA headers\n%s" % ','.join(diffs))
        return False
    else:
        return True

So the above works as it is supposed to. However, after the script finishes I always get the two gzip lines at the end of this terminal output:

Using gzip!
[10:58:16 PM]: Fasta headers in BAM file do not match genome, exiting.
gzip: error writing to output: Broken pipe
gzip: genome4.badname.bam: uncompress failed

I can see the call to sys.stderr that prints the 'Using gzip!' message, but I can't figure out where the gzip errors are coming from. Even when the script runs all the way to completion, the same messages show up at the end of my log file:

[10:16:48 PM]: Re-naming gene models
[10:16:52 PM]: Converting to final Genbank format
[10:16:57 PM]: Collecting final annotation files
[10:16:57 PM]: Funannotate predict is finished, output files are in the badname/predict_results folder
[10:16:57 PM]: Note, you should fix any tbl2asn errors now before running functional annotation.
gzip: error writing to output: Broken pipe
gzip: genome4.cleaned.bam: uncompress failed
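A likely explanation (an inference from the 'Using gzip!' message, not confirmed in this thread) is that pybam decompresses through a gzip subprocess. If the script reads only the header and the pipe is then closed while gzip is still writing, gzip reports a broken pipe. The sketch below shows one way to read a prefix from a child process and stop it before closing the pipe. It uses only the standard subprocess module, and read_prefix is a hypothetical helper:

```python
import subprocess


def read_prefix(command, n_bytes):
    """Read the first n_bytes of a command's stdout, then stop the command.

    Hypothetical helper: the child is terminated before our end of the
    pipe is closed, so it never attempts a write to a closed pipe.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    try:
        return proc.stdout.read(n_bytes)
    finally:
        if proc.poll() is None:   # child is still running
            proc.terminate()
        proc.stdout.close()
        proc.wait()               # reap the child
```

For example, read_prefix(['gzip', '-dc', 'my.bam'], 65536) would fetch the start of the decompressed stream, assuming gzip is on the PATH and my.bam exists.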

unqualified exec is not allowed in function 'compile_parser'

exec(code,exec_dict) in globals(), locals()

Unless I add the "in globals(), locals()" part to the above line (line 585 of pybam.py), I get the following error:

SyntaxError: unqualified exec is not allowed in function 'compile_parser' it contains a nested function with free variables

I found this workaround in the following discussion:
http://stackoverflow.com/questions/4484872/in-python-why-doesnt-exec-work-in-a-function-with-a-subfunction
I was using Python 2.7.5

--mahmut
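An alternative to qualifying the exec statement is to run exec against an explicit namespace dictionary inside a small helper that has no nested functions, and to pull the compiled function out of that dictionary. The sketch below illustrates the idea with made-up names (compile_function, make_adder). It is not how pybam's compile_parser is written, and it is shown in function-call syntax as used by Python 3:

```python
def compile_function(source, name):
    """Execute source in a fresh namespace and return the object called name."""
    namespace = {}
    # The explicit dictionary keeps exec away from the caller's local scope.
    exec(source, namespace)
    return namespace[name]


def make_adder(offset):
    """Build a function from generated source, then use it in a closure."""
    source = 'def add(x, offset=%d):\n    return x + offset\n' % offset
    add = compile_function(source, 'add')

    def apply(values):
        # Nested function with a free variable (add); exec lives elsewhere.
        return [add(value) for value in values]

    return apply
```

Here make_adder(2) returns a function that adds 2 to each value of a list. Older Python 2.7 releases may still reject the parenthesized form inside a function that has nested functions, in which case "exec source in namespace" is the equivalent statement.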

'No such process' error

if self._subprocess.returncode is None:
    self._subprocess.kill()

Unless I add the above if-statement at line 728 of pybam.py, I get the following error:

Exception OSError: (3, 'No such process') in <bound method read.__del__ of <lib.pybam.pybam.read instance at 0x3b1fcb0>> ignored

I found the above workaround on the following documentation page:
https://docs.python.org/2.7/library/subprocess.html#subprocess.Popen.poll
which says

A None value indicates that the process hasn’t terminated yet

I was using Python 2.7.5 with the Mixed Parser Example pattern given in the readme file.

--mahmut
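A small variation on this guard: the returncode attribute is only refreshed when poll(), wait() or communicate() is called, so calling poll() first gives an up-to-date answer before deciding whether to kill. A minimal sketch using only the standard subprocess module; stop_process is a hypothetical helper, not part of pybam:

```python
import subprocess


def stop_process(proc):
    """Kill proc only if it is still running, then return its exit code."""
    # poll() updates proc.returncode; None means the child has not exited.
    if proc.poll() is None:
        proc.kill()
    # wait() reaps the child and returns immediately if it already exited.
    return proc.wait()
```

Calling stop_process on a process that has already finished simply returns its exit code without sending a signal.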
