
Comments (17)

zhpn1024 commented on July 19, 2024

Are you running ribotish quality or predict? You can use the '-p' option to speed it up. The '-v' option shows more information about the progress.
The "Wrong exon strand" message means that the GTF annotation places exons of the same gene on different strands. The "stop codon error" message means that the stop codon annotation is not consistent with the CDS annotation. These are warnings raised while reading the GTF file, and they do not affect the analysis much.
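To see which genes might trigger the "Wrong exon strand" warning, a rough sketch like the one below lists genes whose records appear on more than one strand. This is not ribotish's own check: `mixed_strand_genes` is a hypothetical helper, and it assumes a tab-separated GTF with `gene_id "..."` attributes in the ninth column.

```python
import re
from collections import defaultdict

# Assumption: attributes follow the usual GTF form gene_id "NAME";
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def mixed_strand_genes(gtf_lines):
    """Return {gene_id: set of strands} for genes seen on more than one strand."""
    strands = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID.search(fields[8])
        if match:
            strands[match.group(1)].add(fields[6])  # column 7 is the strand
    return {gene: s for gene, s in strands.items() if len(s) > 1}


# Made-up records for illustration only.
example = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t300\t400\t.\t-\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t500\t600\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]
print(mixed_strand_genes(example))
```

For a real annotation, pass an open file handle instead of the example list.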

Peng

from ribotish.

sjannielefevre commented on July 19, 2024

Dear Peng

Thank you very much for clarifying. I am working with a non-model organism, so that is perhaps why these errors occur.

In any case:
I am using ribotish quality, and I managed to get the job to finish, though not successfully. I am running the following script on an HPC cluster. Below the script I have inserted the resulting errors, followed by the head of the .gtf file. I hope you can help me figure out what is wrong.

Cheers
Sjannie

#!/bin/bash
#SBATCH --account=nn9244k
#SBATCH --time=06:00:00
#SBATCH --cpus-per-task=5
#SBATCH --mem-per-cpu=4G
#SBATCH --job-name=ribotish
#SBATCH --array=0-23

set -o errexit # Exit the script on any error
set -o nounset # Treat any unset variables as an error

module --quiet purge # Reset the modules to the system default

module load Python/3.7.4-GCCcore-8.3.0

NAMES=($(cat /cluster/work/users/sjannies/rfp/rfp_sample_names.list))

echo running ribotish on sample ${NAMES[${SLURM_ARRAY_TASK_ID}]}

ribotish quality --geneformat gtf -p 5 -b /cluster/work/users/sjannies/rfp/star/bams/${NAMES[${SLURM_ARRAY_TASK_ID}]}_rfp_goldfish_Aligned.sortedByCoord.out.bam -g /cluster/home/sjannies/blast_databases/GCF_003368295.1_ASM336829v1_genomic.gtf

echo finished ribotish

The following errors came up:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/cluster/software/Python/3.7.4-GCCcore-8.3.0/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/cluster/software/Python/3.7.4-GCCcore-8.3.0/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/ribo.py", line 942, in _lendis_trans
for r in bam.transReadsIter(bamfile, t, compatible=False, maxNH=maxNH, minMapQ=minMapQ, secondary=secondary, paired=paired, flank=flank):
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py", line 496, in transReadsIter
for read in rds: #yield read
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py", line 45, in fetch_reads
rds = self.fetch(reference=chr, start=start, end=stop) #, multiple_iterators=multiple_iterators)
File "pysam/libcalignmentfile.pyx", line 1081, in pysam.libcalignmentfile.AlignmentFile.fetch
File "pysam/libchtslib.pyx", line 692, in pysam.libchtslib.HTSFile.parse_region
ValueError: start out of range (-34)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/cluster/home/sjannies/.local/bin/ribotish", line 56, in <module>
main()
File "/cluster/home/sjannies/.local/bin/ribotish", line 34, in main
commands[cmd].run(args)
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/run/quality.py", line 85, in run
cdsBins = args.bins, numProc = args.numProc, verbose = args.verbose, geneformat = args.geneformat)
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/ribo.py", line 980, in lendis
for result in len_iter:
File "/cluster/software/Python/3.7.4-GCCcore-8.3.0/lib/python3.7/multiprocessing/pool.py", line 354, in <genexpr>
return (item for chunk in result for item in chunk)
File "/cluster/software/Python/3.7.4-GCCcore-8.3.0/lib/python3.7/multiprocessing/pool.py", line 748, in next
raise value
ValueError: start out of range (-34)

Head of .gtf file:
#gtf-version 2.2
#!genome-build ASM336829v1
#!genome-build-accession NCBI_Assembly:GCF_003368295.1
#!annotation-source NCBI Carassius auratus Annotation Release 100
NC_039243.1 Gnomon gene 5705 16574 . - . gene_id "LOC113109012"; db_xref "GeneID:113109012"; gbkey "Gene"; gene "LOC113109012"; gene_biotype "protein_coding";
NC_039243.1 Gnomon exon 16363 16574 . - . gene_id "LOC113109012"; transcript_id "XM_026272525.1"; db_xref "GeneID:113109012"; gbkey "mRNA"; gene "LOC113109012"; model_evidence "Supporting evidence includes similarity to: 8 Proteins, and 100% coverage of the annotated genomic feature by RNAseq alignments, including 74 samples with support for all annotated introns"; product "OX-2 membrane glycoprotein-like, transcript variant X1"; exon_number "1";


zhpn1024 commented on July 19, 2024

The error occurs because the tool tries to fetch the 5' UTR (upstream) region of a transcript whose CDS start is too close to the chromosome start, so it ends up requesting a negative position.
I'll fix it soon. Alternatively, you can identify the gene and remove it for the quality step.
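One possible way to find candidate transcripts is sketched below: it reports transcripts whose smallest start coordinate lies within a chosen flank of the start of the sequence. The helper name `transcripts_near_contig_start` and the `flank` value are assumptions for illustration. The flank ribotish actually uses is not stated here, so pick a generous value and treat the result as a list of suspects only.

```python
import re

# Assumption: attributes follow the usual GTF form transcript_id "NAME";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcripts_near_contig_start(gtf_lines, flank):
    """Return {transcript_id: (seqname, min_start)} where min_start - 1 - flank < 0.

    GTF starts are 1-based, so min_start - 1 is the 0-based position that a
    fetch extended upstream by `flank` bases would begin from.
    """
    first_start = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if not match:
            continue  # e.g. gene records with an empty transcript_id
        tid, start = match.group(1), int(fields[3])
        if tid not in first_start or start < first_start[tid][1]:
            first_start[tid] = (fields[0], start)
    return {tid: v for tid, v in first_start.items() if v[1] - 1 - flank < 0}


# Made-up records for illustration only.
records = [
    'NC_1\tGnomon\texon\t20\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'NC_1\tGnomon\tCDS\t40\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'NC_1\tGnomon\texon\t5000\t5300\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
print(transcripts_near_contig_start(records, flank=50))
```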


sjannielefevre commented on July 19, 2024

Ok! Thank you very much for explaining. I'll run it again with -v to get more info, and do as you suggest and/or explore other tools meanwhile.
Thanks again for answering so quickly!

Cheers
Sjannie


zhpn1024 commented on July 19, 2024

I have committed an update in zbio/bam.py. Replace the file and try again.


sjannielefevre commented on July 19, 2024

Hi Peng

Unfortunately, I still get the same error. I tried uninstalling and installing again after downloading the package. I do not get any information about the offending transcript, even with the -v option, so I cannot remove it easily.

Cheers
Sjannie


zhpn1024 commented on July 19, 2024

I did not mean that you should install it again. Just download the latest src/zbio/bam.py file and replace the old file on your computer ("/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py", as shown in your error message).
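If it is unclear which copy of the file Python actually loads, something like the following sketch prints the path the interpreter would use. `installed_module_path` is a hypothetical helper built on the standard `importlib.util.find_spec`. Run it with the same Python module loaded as in the job script.

```python
import importlib.util


def installed_module_path(module_name):
    """Return the file Python would load for module_name, or None if not importable."""
    try:
        # For dotted names, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        return None  # a parent package of the dotted name is missing
    return spec.origin if spec is not None else None


# Prints None when ribotish is not installed for this interpreter.
print(installed_module_path("ribotish.zbio.bam"))
```

The printed path is the file that has to be replaced.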


sjannielefevre commented on July 19, 2024

For me, it was easier to just download the repository; this should have included the file, right? There is no option to download only the 'bam.py' file, from what I can see... That is why I first downloaded the repository again, but it still did not work. So I uninstalled and reinstalled, and I still get the same error...


zhpn1024 commented on July 19, 2024

Where did you download the repository from? Try git clone:
git clone https://github.com/zhpn1024/ribotish


sjannielefevre commented on July 19, 2024

I downloaded it from this site, from the same tab where the repository is cloned; I just pressed "Download ZIP" instead. I am not sure how to use git on a remote computer cluster. In any case, after removing everything and downloading it again, the file must have been replaced.

Still the errors:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/cluster/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/cluster/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/ribo.py", line 942, in _lendis_trans
for r in bam.transReadsIter(bamfile, t, compatible=False, maxNH=maxNH, minMapQ=minMapQ, secondary=secondary, paired=paired, flank=flank):
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py", line 496, in transReadsIter
for read in rds: #yield read
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py", line 45, in fetch_reads
rds = self.fetch(reference=chr, start=start, end=stop) #, multiple_iterators=multiple_iterators)
File "pysam/libcalignmentfile.pyx", line 1081, in pysam.libcalignmentfile.AlignmentFile.fetch
File "pysam/libchtslib.pyx", line 692, in pysam.libchtslib.HTSFile.parse_region
ValueError: start out of range (-34)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/cluster/home/sjannies/.local/bin/ribotish", line 56, in <module>
main()
File "/cluster/home/sjannies/.local/bin/ribotish", line 34, in main
commands[cmd].run(args)
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/run/quality.py", line 85, in run
cdsBins = args.bins, numProc = args.numProc, verbose = args.verbose, geneformat = args.geneformat)
File "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/ribo.py", line 980, in lendis
for result in len_iter:
File "/cluster/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/multiprocessing/pool.py", line 354, in <genexpr>
return (item for chunk in result for item in chunk)
File "/cluster/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/multiprocessing/pool.py", line 748, in next
raise value
ValueError: start out of range (-34)
NW_020523287.1
NW_020523288.1
/var/spool/slurmd/job353985/slurm_script: line 22: 33860 Segmentation fault (core dumped) ribotish quality -v --geneformat gtf -p 5 -b /cluster/work/users/sjannies/rfp/star/bams/${NAMES[${SLURM_ARRAY_TASK_ID}]}_rfp_goldfish_Aligned.sortedByCoord.out.bam -g /cluster/home/sjannies/blast_databases/GCF_003368295.1_ASM336829v1_genomic.gtf

I am now trying to run the analysis after having removed the two entries listed above completely from the gtf file, as I presume they are written there because they are causing the error.
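For reference, removing every record on given sequences can be sketched as below. `drop_seqnames` is a hypothetical helper, not part of ribotish, and it assumes a tab-separated GTF whose first column is the sequence name.

```python
def drop_seqnames(gtf_lines, unwanted):
    """Yield GTF lines, skipping records whose first column is in `unwanted`."""
    unwanted = set(unwanted)
    for line in gtf_lines:
        # Header comments are always kept; records are kept unless excluded.
        if line.startswith("#") or line.split("\t", 1)[0] not in unwanted:
            yield line


# Made-up records for illustration only.
lines = [
    "#gtf-version 2.2\n",
    'NW_020523287.1\tGnomon\texon\t1\t10\t.\t+\t.\tgene_id "a";\n',
    'NC_039243.1\tGnomon\texon\t5\t50\t.\t-\t.\tgene_id "b";\n',
]
kept = list(drop_seqnames(lines, {"NW_020523287.1", "NW_020523288.1"}))
print(len(kept))
```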


sjannielefevre commented on July 19, 2024

Hmm, that did not work either, so the listing of those two entries probably had nothing to do with it.
I give up.


zhpn1024 commented on July 19, 2024

You may have downloaded the correct file, but you did not replace the installed file. In your latest error report, the bam.py file is still the original version.
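A quick way to confirm whether two copies of bam.py are identical is to compare their checksums. The sketch below uses only the standard library; `sha256_of` and `same_content` are hypothetical helper names, and the downloaded path in the commented usage is a placeholder for wherever the zip was unpacked.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_content(path_a, path_b):
    """Return True when both files have byte-identical content."""
    return sha256_of(path_a) == sha256_of(path_b)


# Example usage (paths are assumptions; adjust to your own locations):
# same_content(
#     "unzippedfolder/src/zbio/bam.py",
#     "/cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py",
# )
```

If this returns False, the installed file has not been replaced by the downloaded one.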


sjannielefevre commented on July 19, 2024

... I removed EVERYTHING from my .local python package folder. I then downloaded everything from this site. I then ran the pip install to install the package. How can it still be the old version of the file that is being used? I also previously tried downloading the new bam.py (or the code, as there is no option to download a single file) and that did not work either... I'll let the cluster managers know what you are saying. I cannot see what else I can do to replace this file, so maybe something has been installed somewhere else that I am not getting at when deleting and uninstalling...


zhpn1024 commented on July 19, 2024

That's the problem. You actually installed from pip, and the pip version has not been updated yet. The new bam.py file is only in your downloaded zip.


sjannielefevre commented on July 19, 2024

Of course. Thanks for clearing that up :)


zhpn1024 commented on July 19, 2024

So just replace the bam.py file:
cp unzippedfolder/src/zbio/bam.py /cluster/home/sjannies/.local/lib/python3.7/site-packages/ribotish/zbio/bam.py


zhpn1024 commented on July 19, 2024

Update to v0.2.5 and try.

