rickgelhausen / hribo

We present HRIBO (High-throughput annotation by Ribo-seq), a workflow to enable reproducible and high-throughput analysis of bacterial Ribo-seq data. The workflow performs all required pre-processing steps and quality control. Importantly, HRIBO outputs annotation-independent ORF predictions based on two complementary prokaryotic-focused tools, and integrates them with additional computed features. This facilitates both the rapid discovery of ORFs and their prioritization for functional characterization.

License: GNU General Public License v3.0

Python 94.24% R 4.75% Shell 1.01%
bioinformatics prokaryotes ribosome-profiling snakemake

hribo's People

Contributors

eggzilla, rickgelhausen

hribo's Issues

TTS support

  • allow TTS and RNATTS tags in addition to TIS and RNATIS.

mapping

  • 3' end mapping
  • centroid mapping
  • 5' end mapping
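The three mapping modes requested above differ only in which position of an aligned read is counted. The following is a minimal sketch, not HRIBO code: the function name `mapped_position`, the mode labels, and the 1-based inclusive coordinate convention are all assumptions made for illustration.

```python
def mapped_position(start: int, end: int, strand: str, mode: str) -> int:
    """Return the single genomic position a read is assigned to.

    Assumes 1-based inclusive coordinates with start <= end
    (a convention chosen for this sketch, not taken from HRIBO).
    """
    if strand not in ("+", "-"):
        raise ValueError(f"unknown strand: {strand!r}")
    if start > end:
        raise ValueError("start must not be greater than end")

    if mode == "centroid":
        # Midpoint of the read, rounded down; strand does not matter here.
        return (start + end) // 2

    # On the minus strand the 5' end is the rightmost coordinate.
    five_prime = start if strand == "+" else end
    three_prime = end if strand == "+" else start

    if mode == "5prime":
        return five_prime
    if mode == "3prime":
        return three_prime
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, a 30 nt read covering positions 100 to 129 on the minus strand has its 5' end at 129 and its 3' end at 100 under these assumptions.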

config.yaml

  • the genomeindexpath variable in the config file seems to be useless

additional requests

"Distribution of aligned read lengths, to be sure they are mostly
between 26 and 34 nt, centred at 29-30 nt, and relatively even." - Sarah Svensson
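One way to check the request above is to summarize the read lengths once they have been extracted from the alignments. This is only a sketch: the function name `read_length_summary` is hypothetical, the input is assumed to be a plain iterable of integer lengths (reading them from a BAM file is left out), and the 26 to 34 nt window is taken from the quote.

```python
from collections import Counter


def read_length_summary(lengths, low=26, high=34):
    """Summarize a collection of aligned read lengths.

    Returns the per-length counts, the fraction of reads whose length
    lies within [low, high], and the most frequent length.
    """
    counts = Counter(lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read lengths given")

    in_range = sum(n for length, n in counts.items() if low <= length <= high)
    # Most frequent length; ties are resolved in favour of the shorter length.
    mode_length = max(counts, key=lambda k: (counts[k], -k))

    return {
        "counts": dict(sorted(counts.items())),
        "fraction_in_range": in_range / total,
        "mode": mode_length,
    }
```

A library that looks as described in the quote would show a fraction in range close to 1 and a most frequent length of 29 or 30.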

Replace gffread

  • currently gffread is used for the generation of a stable gtf/gff file for further processing
  • gffread requires some post-processing
  • it might be easier to combine everything into one script

update config file

  • add config parameter for Archaea sites
  • add config parameter for featurecounts ID

Check annotation coordinates

Check annotation files for negative coordinates, which are used for features spanning the origin, e.g.:
pSYSG ena gene -1055 1914 . - . gene_id "BAD02057"; gene_name "sll8049"; gene_source "ena"; gene_biotype "protein_coding";

The associated stop codon had the coordinates -1055 -1052
ensembl #epicfail
ribotish crashes due to this
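A check like the one described above could scan the annotation before it is passed to downstream tools. This is a minimal sketch, assuming a tab-separated GTF/GFF file with start and end in columns 4 and 5; the function name `find_negative_coordinates` is hypothetical and not part of HRIBO.

```python
def find_negative_coordinates(lines):
    """Yield (line_number, seqid, start, end) for features with a coordinate below 1.

    Expects tab-separated GTF/GFF lines; comment lines, blank lines and
    lines that cannot be parsed are skipped.
    """
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        try:
            start, end = int(fields[3]), int(fields[4])
        except ValueError:
            continue
        # GTF/GFF coordinates are 1-based, so anything below 1 is suspicious.
        if start < 1 or end < 1:
            yield number, fields[0], start, end
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, and the reported line numbers can be used to fix or drop the offending features.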

Dependencies issues - environment.yaml

Hi,

Thanks for your work on what seems to be an amazing tool! I can't wait to try it out once I figure out some small dependencies issues for which I could use some help.

I'm walking through the documentation that you provided ( https://hribo.readthedocs.io/en/latest/source/overview.html#tools ), and I am having some issues related to python dependencies.
It seemed that HRIBO relies on DeepRibo (both for DataParser.py and for DeepRibo.py), which I've cloned via git (from https://github.com/Biobix/DeepRibo ).

The issue that I'm having is that it seems DeepRibo relies on some deprecated type aliases (in particular np.int) from numpy. I've searched and these aliases seem to have been deprecated from numpy 1.20.

I briefly thought about updating these (for instance replacing np.int by np), but I was scared of going down the rabbit hole of having to fix more and more things on DeepRibo.

So I thought I'd use numpy version 1.19.5, but then my Python (which is set up from the provided environment.yaml) wasn't compatible with it. I tried to force Python to an older version (such as 3.7) in the environment.yaml, but I got even more dependency issues when running the command conda create --file HRIBO/environment.yaml

By the way, it seems the webpage https://hribo.readthedocs.io/en/latest/source/overview.html#tools recommends using python 3.7, but the environment.yaml provided explicitly requests python=3.11.0=he550d4f_1_cpython
Can I please ask which one you are using personally?

If you still have a runnable version of HRIBO, could you please export and publish an updated version of your working environment? That would be a great help in avoiding the dependency issues.

Just a final question/remark about "conda create" versus "conda env create": the same webpage also recommends to use:
conda create --file HRIBO/environment.yaml
But that fails with the error: conda create: error: one of the arguments -n/--name -p/--prefix is required
So of course I've tried:
conda create --file HRIBO/environment.yaml -n hribo_env
But that fails with the error: CondaValueError: could not parse 'name: hribo_env' in: HRIBO/environment.yaml
I've searched a bit, and it seems (according to conda/conda#6827 but which is a fairly old thread) that "environment.yml doesn't work with conda install --file. You need to use conda env commands with environment.yml."
Therefore in everything I have attempted I have used:
conda env create --file HRIBO/environment.yaml
Is that what you would recommend to do as well?

Any help would be very much appreciated!
Kind regards.
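On the np.int question raised in this issue: the aliases deprecated in NumPy 1.20 (such as np.int and np.float) were plain aliases of the Python builtins, so they can usually be replaced by int, float and so on. The sketch below shows one way to do that mechanically on source text. It is an illustration only: the helper name `replace_deprecated_aliases` is hypothetical, it has not been run against DeepRibo, and because it is a plain text substitution it would also rewrite matches inside strings and comments.

```python
import re

# Deprecated NumPy aliases and the builtins they stood for.
_ALIASES = {
    "int": "int",
    "float": "float",
    "bool": "bool",
    "object": "object",
    "str": "str",
    "complex": "complex",
}

# The trailing \b keeps sized types such as np.int32 or np.float64,
# and underscore variants such as np.int_, untouched.
_PATTERN = re.compile(r"\b(?:np|numpy)\.(" + "|".join(_ALIASES) + r")\b")


def replace_deprecated_aliases(source: str) -> str:
    """Return source with deprecated NumPy type aliases replaced by builtins."""
    return _PATTERN.sub(lambda match: _ALIASES[match.group(1)], source)
```

Reviewing the resulting diff by hand is still advisable, since a text-level rewrite cannot tell code apart from documentation.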

rework tracks folder

  • re/move temporary files from tracks folder
  • make clear diagram of merging strategy (helps understanding of file names)
  • update documentation (report) to better explain each file
  • add potential names to tracks (check if predicted ORF already exists and add name)
  • add colors for different tracks

clean Snakefiles

  • delete all snakefiles that are obsolete
  • we should have:
  1. Snakefile
  2. Snakefile_nixtail (rename)
  3. Snakefile_paired_end
  4. Snakefile_paired_end_nixtail ???
  5. Snakefile_deepribo ??? (maybe combine this with Snakefile)

Use of temp

  • maybe we should use temp() on files like genomeSegemehl index to remove them after they become obsolete

create excel files

File 1: ORF information

  • unique id
  • start
  • stop
  • orf length
  • potential RBS

templates

  • add a template for paired end data

Readme/Template

Hi again,

In your config.yaml template under DE for contrasts the help is # comma-seperated list of condition contrasts: treated1_control1,treated2_control2...
but contrasts should be - separated and not _ (which is clear reading the docs however)

The --configfile HRIBO/config.yaml param for running the workflow is obsolete as the path is hardcoded in the Snakefile
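The contrast format mentioned in this issue (a comma-separated list in which each contrast joins two conditions with a hyphen) can be validated with a few lines of Python. This is a sketch under assumptions: the function name `parse_contrasts` is hypothetical, and condition names are assumed not to contain a hyphen themselves.

```python
def parse_contrasts(value: str):
    """Parse 'condA-condB,condC-condD' into a list of (condA, condB) tuples."""
    contrasts = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split("-")
        # Exactly two non-empty condition names are expected per contrast.
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"expected 'conditionA-conditionB', got {item!r}")
        contrasts.append((parts[0], parts[1]))
    return contrasts
```

With this check, an underscore-separated entry such as treated1_control1 is rejected with a clear error instead of being silently treated as a single condition name.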

rework ribotish

  • generate appropriate annotation (exon, cds, gtf format)
  • update ribotish/all/tis to use the correct parameters

gene synteny

  • Würzburg is interested in gene synteny analysis
