galaxy-tools

Tools, tool-wrappers and tool dependency packages for Galaxy developed within the Bioinformatics Core Facility at the University of Manchester.

tools

The tools subdirectory contains the following tools which wrap 3rd-party software:

There are also tools wrapping in-house scripts and programs:

See the individual README files for information on how to install the tools into a local Galaxy; alternatively, where indicated, a subset of the tools is available from the main toolshed: https://toolshed.g2.bx.psu.edu/

conda-recipes

The conda-recipes subdirectory contains recipes for building conda dependencies.

packages

The packages subdirectory contains tool dependency packages:

  • package_numpy_1_8
  • package_pandaseq_2_8_1
  • package_python2_7

legacy

The legacy subdirectory contains tools and packages which are no longer supported, or which are backwardly-incompatible, or where development is now happening elsewhere.

local_dependency_installers

The local_dependency_installers subdirectory contains shell scripts with installer functions for many of the tool dependencies.

For example:

local_dependency_installers/trimmomatic.sh

contains a function install_trimmomatic_0_36, which will install Trimmomatic v0.36 in a Galaxy-style directory structure (i.e. .../trimmomatic/0.36/), including an env.sh file which can be sourced to make the installed dependency available.

These functions are used primarily for setting up the test environments for the Planemo tests, but could be reused, e.g. for local tool installations into Galaxy using an appropriately configured galaxy_packages dependency resolver (see e.g. https://docs.galaxyproject.org/en/master/admin/dependency_resolvers.html#).

Use e.g. grep ^function local_dependency_installers/*.sh to list the available installer functions.
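The same listing can be sketched in Python, for cases where the function names are needed programmatically. This is only an illustration: it assumes each installer is declared at the start of a line with the function keyword (the pattern the grep command matches), and the script text and the install_trimmomatic_0_38 name below are made up rather than copied from the repository.

```python
import re

# Assumption: installers are declared at the start of a line as
# "function <name>", i.e. the same lines that "grep ^function" matches.
FUNCTION_RE = re.compile(r"^function\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def list_installer_functions(script_text):
    """Return the names of shell functions declared in script_text."""
    return FUNCTION_RE.findall(script_text)


# Made-up script content, for illustration only
example_script = """#!/bin/bash
function install_trimmomatic_0_36() {
    echo "installing"
}
# function-like text inside a comment is not matched
function install_trimmomatic_0_38() {
    echo "installing"
}
"""

print(list_installer_functions(example_script))
```

To use it on a real installer script, read the file contents and pass them in as script_text.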

galaxy-tools's People

Contributors

bernt-matthias, cgirardot, gallardoalba, mvdbeek, pjbriggs, pvanheus, scholtalbers

galaxy-tools's Issues

Trimmomatic error: unable to identify Phred quality encoding

A user reported this error.

Screenshot from 2021-04-10 01-40-22

The error seems to be caused by Trimmomatic's inability to identify the Phred encoding. To solve it, the tool would need an option for selecting a specific encoding (-phred33 or -phred64).

Trimmomatic: Separate report output for use in MultiQC?

The log is currently (at least as of 0.36.5) printing the report into stdout which in Galaxy is shown using the "info" icon. However, it is then not readable by the MultiQC tool, which has support for parsing Trimmomatic output.

Could you move the stdout report into a new history element? Could also be added as an option (e.g., "print report to a new history element?").

What do you think?

Migrate tools and Travis-CI tests to conda

Most of the tools in this repo are using tool_dependencies.xml to install any external dependencies, and the Travis-CI tests have installer scripts which attempt to mimic these in order to create the appropriate test environments.

Some of the tools (e.g. trimmomatic) can also use the conda dependency resolution, as an alternative to the tool_dependencies.xml mechanism; others (e.g. pal_finder, macs21) cannot (see issues #30 and #31).

For now conda is disabled in the Travis-CI tests (see PR #35) but longer term the tools should be updated to work with conda as well (or instead).

${input.name} instead of ${on_string}

Hi,
You do a great job. I was too lazy to wrap Trimmomatic although my colleagues regularly heat me :)

I'm running a RNASeq project on Galaxy to try Galaxy. I have managed two instances for 3 years but I didn't really use them. cli is so cool :)

So, I have just a suggestion on your wrapper to ease reading of outputs:

That way you keep a link with the input and the user doesn't need to rename the dataset.

It's just a suggestion :)

Cheers

Gildas

Trimmomatic: input dataset names are incorrectly reported in output dataset titles?

Seen with Galaxy 20.05: the output datasets from the Trimmomatic tool have titles where the names of the input datasets are incorrectly reported, for example:

Trimmomatic on Galaxy3-__ob__Illumina_SG_R1.fastq__cb__.fastqsanger (R1 paired)

instead of

Trimmomatic on Illumina_SG_R1.fastq (R1 paired)

(Possibly the __ob__ and __cb__ parts are replacements for [ and ], i.e. "open bracket" and "close bracket"?)
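If that guess is right, the original name could be recovered by reversing the substitution. The sketch below is hypothetical: restore_brackets is not part of the wrapper or of Galaxy, and the mapping is just the one suggested in this issue.

```python
# Assumption (from the issue text): "__ob__" stands for "[" and
# "__cb__" stands for "]". This mapping is a guess, not confirmed behaviour.
REPLACEMENTS = {"__ob__": "[", "__cb__": "]"}


def restore_brackets(name):
    """Replace the bracket placeholders in a dataset name with real brackets."""
    for token, char in REPLACEMENTS.items():
        name = name.replace(token, char)
    return name


print(restore_brackets("Galaxy3-__ob__Illumina_SG_R1.fastq__cb__.fastqsanger"))
```

This only shows what the decoded name would look like; it does not explain why the Galaxy3- prefix and the datatype suffix end up in the title.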

Trimmomatic wrapper does not catch java exception

Hello,

While running trimmomatic on galaxy using the wrapper you developed we are having issues with some jobs that fail at the java level (java.io.IOException: Input/output error) without being detected by galaxy. The consequence is that the tool keeps running in galaxy (and so do galaxy_X.sh and tool_script.sh) even though the trimmomatic jar already failed.

@AjitPS and I have been looking at the wrapper and we think the reason for this is the && in the following line:

tee trimmomatic.log && if [ -z "\$(tail -1 trimmomatic.log | grep "Completed successfully")" ]; then echo "Trimmomatic did not finish successfully" >&2

As we understand it, the if after the && will only be processed if the pipe (which includes the jar) finishes successfully, which will not happen if it fails. We think that this could be fixed by replacing the && with a semicolon, as follows:

tee trimmomatic.log; if [ -z "\$(tail -1 trimmomatic.log | grep "Completed successfully")" ]; then echo "Trimmomatic did not finish successfully" >&2

In this case, the if will be processed once the pipe finishes.
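Whichever separator is used, the check itself only depends on the contents of the log. A minimal Python sketch of that check is shown here, separate from the shell control flow. The finished_successfully helper and the sample log lines are illustrative assumptions, not the wrapper's code, and unlike tail -1 the sketch skips trailing blank lines.

```python
def finished_successfully(log_text, marker="Completed successfully"):
    """Return True if the last non-empty line of the log contains the marker."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    return bool(lines) and marker in lines[-1]


# Made-up log contents, for illustration only
ok_log = "Input Reads: 100 Surviving: 95\nTrimmomaticSE: Completed successfully\n"
failed_log = 'Exception in thread "main" java.io.IOException: Input/output error\n'

print(finished_successfully(ok_log))
print(finished_successfully(failed_log))
```

A check of this kind reports a failure when the Java process dies before writing the final line, regardless of the exit status of the pipe.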

What do you think? We are happy to fork the repository, make the changes and open a PR if you want.

Travis-CI builds fail for macs21 tool (Galaxy 17.01)

The Travis-CI builds are now failing for the macs21 tool, see:

https://travis-ci.org/fls-bioinformatics-core/galaxy-tools/jobs/197609339

This was first observed with the Galaxy 17.01 release, but the problems might predate it. All the tests fail.

The test failures appear to be related to small differences in floating point values in the tool output e.g.:

--- local_file
+++ history_data
@@ -5,70 +5,70 @@
 chr26	4104449	4105233	1944.00000
 chr26	4105233	4105326	2430.00000
 chr26	4105326	4105398	2916.00000
-chr26	4105398	4105644	3402.00024
+chr26	4105398	4105644	3402.00000
...
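One way to see how small these differences are is to compare the rows field by field with a numeric tolerance instead of as exact text. The sketch below is an illustration only: rows_match and the rel_tol value are assumptions, and it is not how the Galaxy test framework compares outputs.

```python
import math


def rows_match(expected, actual, rel_tol=1e-6):
    """Compare two tab-separated rows, allowing small numeric differences."""
    exp_fields = expected.split("\t")
    act_fields = actual.split("\t")
    if len(exp_fields) != len(act_fields):
        return False
    for exp, act in zip(exp_fields, act_fields):
        try:
            # Numeric fields are compared with a relative tolerance
            if not math.isclose(float(exp), float(act), rel_tol=rel_tol):
                return False
        except ValueError:
            # Non-numeric fields (e.g. chromosome names) must match exactly
            if exp != act:
                return False
    return True


local_row = "chr26\t4105398\t4105644\t3402.00024"
history_row = "chr26\t4105398\t4105644\t3402.00000"
print(rows_match(local_row, history_row))
```

With the rows from the diff above, the relative difference is about 7e-8, so they match at rel_tol=1e-6 but not at the much stricter 1e-9.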

Pal_finder fails if number of requested N-mers is zero but (N+1)-mers is non-zero

There is a bug in the pal_finder Perl script when the minimum number of 2-mer repeat units to detect is zero but the number of 3-mer units is non-zero.

In these cases the script issues a large number of error messages of the form:

Use of uninitialized value $currStart in addition (+) at /XXXX/pal_finder/0.02.04/pjbriggs/pal_finder/67ab365c29a7/bin/pal_finder_v0.02.04.pl line 1896, <PE2> line 5.
Use of uninitialized value $currType in concatenation (.) or string at /XXXX/pal_finder/0.02.04/pjbriggs/pal_finder/67ab365c29a7/bin/pal_finder_v0.02.04.pl line 1901, <PE2> line 5.
...
Argument "" isn't numeric in subtraction (-) at /XXXX/pal_finder/0.02.04/pjbriggs/pal_finder/67ab365c29a7/bin/pal_finder_v0.02.04.pl line 1987, <PE2> line 5.
...

and finally terminates with the error message:

Illegal division by zero at /XXXX/pal_finder/0.02.04/pjbriggs/pal_finder/67ab365c29a7/bin/pal_finder_v0.02.04.pl line 1518, <PR3IN> line 56.

This also seems to occur when non-zero numbers of 2-mers and 4-mers are requested but the number of 3-mers is zero.
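Both reported cases share a pattern: a zero count for some N-mer followed by a non-zero count for a longer one. A wrapper could test for this before running the script. The sketch below assumes that this generalisation of the two cases holds, which the issue does not confirm, and has_zero_before_nonzero is a hypothetical helper.

```python
def has_zero_before_nonzero(min_units):
    """Return True if a zero count is followed by a non-zero count.

    min_units lists the minimum repeat units requested for 2-mers,
    3-mers, 4-mers and so on, in that order.
    """
    seen_zero = False
    for count in min_units:
        if count == 0:
            seen_zero = True
        elif seen_zero:
            return True
    return False


print(has_zero_before_nonzero([0, 6]))     # zero 2-mers, non-zero 3-mers
print(has_zero_before_nonzero([6, 0, 6]))  # zero 3-mers between non-zero counts
print(has_zero_before_nonzero([6, 6, 0]))  # trailing zero only
```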

ceas: missing python module six when running data_manager_ceas_fetch_annotations.py

Hi,

I have the following error message when running the data manager "Fetch ceas annotations" (installed from https://toolshed.g2.bx.psu.edu/view/pjbriggs/ceas/f411ce97a351):

Traceback (most recent call last):
  File "/w/galaxy/galaxydev/shed_tools/toolshed.g2.bx.psu.edu/repos/pjbriggs/ceas/f411ce97a351/ceas/data_manager/data_manager_ceas_fetch_annotations.py", line 13, in <module>
    from galaxy.util.json import from_json_string, to_json_string
  File "/w/galaxy/galaxydev/galaxy/lib/galaxy/util/__init__.py", line 30, in <module>
    from six import binary_type, iteritems, PY3, string_types, text_type
ImportError: No module named six

If I set the environment properly with the dependencies, run python and try to load the six module, I have the same error:

[lgueguen@galaxy2 694]$ PACKAGE_BASE=/w/galaxy/galaxydev/shed_tools_tool_dependency_dir/python_mysqldb/1.2.5/pjbriggs/ceas/f411ce97a351; export PACKAGE_BASE; . /w/galaxy/galaxydev/shed_tools_tool_dependency_dir/python_mysqldb/1.2.5/pjbriggs/ceas/f411ce97a351/env.sh; PACKAGE_BASE=/w/galaxy/galaxydev/shed_tools_tool_dependency_dir/bx_python/0.7.1/pjbriggs/ceas/f411ce97a351; export PACKAGE_BASE; . /w/galaxy/galaxydev/shed_tools_tool_dependency_dir/bx_python/0.7.1/pjbriggs/ceas/f411ce97a351/env.sh; PACKAGE_BASE=/w/galaxy/galaxydev/shed_tools_tool_dependency_dir/cistrome_ceas/1.0.2.d8c0751/pjbriggs/ceas/f411ce97a351; export PACKAGE_BASE; . /w/galaxy/galaxydev/shed_tools_tool_dependency_dir/cistrome_ceas/1.0.2.d8c0751/pjbriggs/ceas/f411ce97a351/env.sh
[lgueguen@galaxy2 694]$ which python
/w/galaxy/galaxydev/shed_tools_tool_dependency_dir/bx_python/0.7.1/pjbriggs/ceas/f411ce97a351/venv/bin/python
[lgueguen@galaxy2 694]$ python
Python 2.7.2 (default, Oct 25 2013, 17:46:02)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import six
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named six

A new addition in tool_dependencies.xml (in tag for ) could perhaps solve this issue:

six==1.10.0

or add a dependency to package https://toolshed.g2.bx.psu.edu/view/iuc/package_python_2_7_six_1_10_0/6944ba405057

Regards,

Loraine Guéguen

motif_tools: Travis-CI tests frequently time out fetching __perl-bioperl@1.6.924

I'm currently seeing failures for the motif_tools tests on Travis-CI, which seem to be related to the test framework timing out when trying to install the tool dependencies - for example:

...
2018-09-27 12:29:33,242 INFO  [galaxy.tools.actions] Flushed transaction for job Job[id=2,tool_id=fasta_scan_iupac_each] (27.356 ms)
2018-09-27 12:29:34,363 INFO  [galaxy.jobs.handler] (2) Job dispatched
2018-09-27 12:29:34,586 DEBUG [galaxy.tools.deps.conda_util] Executing command: /home/travis/miniconda3/bin/conda create -y --override-channels --channel iuc --channel bioconda --channel conda-forge --channel defaults --name __perl-bioperl@1.6.924 perl-bioperl=1.6.924
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Running the test locally confirms that creating the __perl-bioperl@1.6.924 conda environment seems very slow; however, it's unclear why this takes so long now.

Pal_finder: improve error message when no microsatellites found

In cases where no microsatellites are found, the error message is somewhat cryptic:

FATAL ERROR pal_finder failed to complete successfully

It would be better to check for the final line from pal_finder and then pass this out as the error message.

Example output in this case looks like:

### Output from pal_finder ###
Configuration File Read.
Creating Microsat Hash...
Finding primers for Illumina paired end reads...
Scanning reads for microsatellites...
No microsatellites found in any reads. Ending script.
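Picking out the final line could look roughly like the sketch below. The final_message helper and its fallback text are assumptions for illustration; the sample output is the one quoted above.

```python
def final_message(output_text, default="pal_finder failed to complete successfully"):
    """Return the last non-empty line of the output, or a default message."""
    lines = [line.strip() for line in output_text.splitlines() if line.strip()]
    return lines[-1] if lines else default


example_output = """### Output from pal_finder ###
Configuration File Read.
Creating Microsat Hash...
Finding primers for Illumina paired end reads...
Scanning reads for microsatellites...
No microsatellites found in any reads. Ending script.
"""

print(final_message(example_output))
```

For the example output this gives the "No microsatellites found in any reads. Ending script." line, which is more informative than the generic failure message.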

Unable to access jarfile /trimmomatic-0.32.jar

Hi,
We just installed the latest revision of the Trimmomatic tool available in the Tool Shed (revision 4:14d05f2d511d) and are getting the following error when trying to run the tool:

discarding /opt/galaxy/tool_dependencies/_conda/bin from PATH
prepending /opt/galaxy/galaxy-app/database/jobs/000/32/conda-env/bin to PATH
Arguments:
* -mx8G
* -jar
* /trimmomatic-0.32.jar
* PE
* -threads
* 24
* -phred33
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_1.dat
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_2.dat
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_36.dat
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_38.dat
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_37.dat
* /opt/galaxy/galaxy-app/database/datasets/000/dataset_39.dat
* SLIDINGWINDOW:4:20
Error: Unable to access jarfile /trimmomatic-0.32.jar
Exit status: 0

Any suggestions on how to resolve this?

Pal_finder: enable tool to operate on a subset of read pairs

For large (many gigabyte-sized) Fastq files, the pal_finder tool can take a long time to complete and in some cases (e.g. on cluster systems with time limits imposed) may be terminated before generating any output.

One workaround for this is to allow users to operate on just a subset of read pairs from the input Fastqs, rather than the whole dataset.
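A minimal sketch of the idea is to take the first N read pairs from each file. It assumes standard four-line FASTQ records with R1 and R2 in the same order. The take_read_pairs helper and the tiny in-memory reads are made up for illustration, and how the tool would actually choose the subset is not specified here.

```python
import io
from itertools import islice


def take_read_pairs(r1_lines, r2_lines, n_pairs):
    """Return the first n_pairs records from two FASTQ line iterators.

    Assumes four lines per record and the same read order in both inputs.
    """
    n_lines = 4 * n_pairs
    return list(islice(r1_lines, n_lines)), list(islice(r2_lines, n_lines))


# Made-up reads standing in for the R1 and R2 Fastq files
r1 = io.StringIO("@read1/1\nACGT\n+\nIIII\n@read2/1\nTTTT\n+\nIIII\n")
r2 = io.StringIO("@read1/2\nTGCA\n+\nIIII\n@read2/2\nAAAA\n+\nIIII\n")

subset_r1, subset_r2 = take_read_pairs(r1, r2, 1)
print("".join(subset_r1), end="")
print("".join(subset_r2), end="")
```

Because islice reads lazily, only the requested records are consumed, which matters for many-gigabyte inputs.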

Trimmomatic: use Galaxy datatype to set quality encoding

If the user has explicitly set the flavour of FASTQ (fastqsanger, fastqillumina, etc.), we can use this to set the quality encoding automatically, rather than having them set a parameter or relying on Trimmomatic's autodetection.

Ran into this when creating a tutorial. I set the quality encoding in Galaxy, but the tool failed because it failed to autodetect the encoding scheme.
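The mapping being proposed can be sketched as a simple lookup. It assumes that fastqsanger implies Phred+33 and fastqillumina implies Phred+64. The phred_flag helper is hypothetical, and any other datatype is left to autodetection.

```python
# Assumed mapping from Galaxy FASTQ datatypes to Trimmomatic quality flags
PHRED_FLAGS = {
    "fastqsanger": "-phred33",
    "fastqillumina": "-phred64",
}


def phred_flag(datatype):
    """Return the quality flag for a datatype, or None to keep autodetection."""
    return PHRED_FLAGS.get(datatype)


print(phred_flag("fastqsanger"))
print(phred_flag("fastqillumina"))
print(phred_flag("fastq"))
```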

Weeder2: bioconda updates break tool

Updates to the bioconda weeder package break the weeder2 tool:

  • Egregious bug in the bioconda weeder2 Python wrapper caused an infinite recursion which ultimately consumed all memory on the host running the tool (NB fixed by bioconda/bioconda-recipes#11039)
  • Wrapper script attempts to enable running weeder2 from an arbitrary location by moving to the installation directory, but this breaks the tool if user-defined frequency files are specified.

Is trimmomatic synced with toolshed?

The tool outputs two collections with identical names: one with paired and another with unpaired reads. Because they have identical names, the output is almost unusable in a large history. We have just pulled the recent (36.2) version from the toolshed, yet it also outputs two identically named collections. I see the following in the toolshed:

<output_collection name="fastq_out_paired" type="paired">
  <element name="forward" file="trimmomatic_pe_r1_paired_out1.fastq" />
  <element name="reverse" file="trimmomatic_pe_r2_paired_out1.fastq" />
</output_collection>
<output_collection name="fastq_out_unpaired" type="paired">
  <element name="forward" file="trimmomatic_pe_r1_unpaired_out1.fastq" />
  <element name="reverse" file="trimmomatic_pe_r2_unpaired_out1.fastq" />
</output_collection>

but in the GitHub version we see this:

<collection name="fastq_out_paired" type="paired" label="${tool.name} on ${readtype.fastq_pair.name}: paired">
  <filter>readtype['single_or_paired'] == "collection"</filter>
  <data name="forward" label="${tool.name} on ${readtype.fastq_pair.forward.name} (R1 paired)" format_source="fastq_pair['forward']"/>
  <data name="reverse" label="${tool.name} on ${readtype.fastq_pair.reverse.name} (R2 paired)" format_source="fastq_pair['reverse']"/>
</collection>
<collection name="fastq_out_unpaired" type="paired" label="${tool.name} on ${readtype.fastq_pair.name}: unpaired">
  <filter>readtype['single_or_paired'] == "collection"</filter>
  <data name="forward" label="${tool.name} on ${readtype.fastq_pair.forward.name} (R1 unpaired)" format_source="fastq_pair['forward']"/>
  <data name="reverse" label="${tool.name} on ${readtype.fastq_pair.reverse.name} (R2 unpaired)" format_source="fastq_pair['reverse']"/>
</collection>

Both seem to be 0.36.2. Could you please update the toolshed version so we can use the proper collection naming?

Trimmomatic Galaxy tool: zlib issue when using container

Version: 0.38.1

We have a Galaxy server that runs all tools using Singularity. Trimmomatic failed with this error:
java: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory

I fixed it by adding zlib as a requirement:

--- trimmomatic.xml.orig	2020-12-09 10:08:10.905331722 +0100
+++ trimmomatic.xml	2020-12-09 09:49:58.914738876 +0100
@@ -11,6 +11,7 @@
 	https://github.com/galaxyproject/tools-iuc/commit/b5e2080a7afdea9fa476895693b6115824c6fbb9
     -->
     <requirement type="package" version="8.25">coreutils</requirement>
+    <requirement type="package" version="1.2.11">zlib</requirement>

   </requirements>
   <command detect_errors="aggressive"><![CDATA[

I'm not sure where zlib should really be added (perhaps to the trimmomatic conda package?). But let me know if you want a pull-request for this here.

trimmomatic input types (and phred64)

Currently only fastqsanger is supported as input. Could fastqillumina also be added to the allowed input types?

Also, the -phred64 option is currently not supported by the wrapper. Supporting it would make it possible to add more input types.

Travis-CI builds fail for pal_finder tool (Galaxy 17.01)

The Travis-CI builds are now failing for the pal_finder tool, see:

https://travis-ci.org/fls-bioinformatics-core/galaxy-tools/jobs/197609334

This was first observed with the Galaxy 17.01 release, but the problems might predate it. All the tests fail.

The logs contain multiple occurrences of

|  ### Running filtering & assembly script ###
|  File "/home/travis/build/fls-bioinformatics-core/galaxy-tools/tools/pal_finder/pal_filter.py", line 129
|  print "\n~~~~~~~~~~"
|  ^ 
| SyntaxError: Missing parentheses in call to 'print'

which looks like a Python 3 error trying to run the pal_filter.py utility.
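The difference between the two forms can be checked without running pal_filter.py by compiling each as source text. In the sketch below, compiles is a hypothetical helper and the two snippets mirror the failing line from the log. Under Python 3 the statement form is rejected and the function-call form is accepted.

```python
def compiles(source):
    """Return True if the source text compiles under the running interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Python 2 statement form, as on line 129 of pal_filter.py
print(compiles('print "\\n~~~~~~~~~~"'))
# Function-call form, accepted by Python 3
print(compiles('print("\\n~~~~~~~~~~")'))
```

With a single string argument the parenthesised form also runs under Python 2, so changing lines like this one would be one way to make the script work with either interpreter.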

There might also be issues with the Perl version but it's not clear if these are causing errors or are just warnings:

Use of uninitialized value $ampedLoci in concatenation (.) or string at /home/travis/build/fls-bioinformatics-core/galaxy-tools/test.tool_dependencies.pal_finder/pal_finder/0.02.04/bin/pal_finder_v0.02.04.pl line 718.
Use of uninitialized value $ampedBases in concatenation (.) or string at /home/travis/build/fls-bioinformatics-core/galaxy-tools/test.tool_dependencies.pal_finder/pal_finder/0.02.04/bin/pal_finder_v0.02.04.pl line 718.

Trimmomatic wrapper bug

Can you quote the two instances of $@ in trimmomatic.sh? This would allow for paths that include spaces. As is, we can't link fastq files into Galaxy and use trimmomatic if they have a path like "/there are spaces/sample.fastq.gz". Using "$@" instead of $@ fixes this.
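The effect of the quoting can be demonstrated by counting how many arguments the shell sees in each case. The sketch below assumes a POSIX sh is available on the PATH and uses a small throwaway snippet rather than the real trimmomatic.sh. The count_args helper is made up for illustration.

```python
import subprocess


def count_args(snippet, args):
    """Run a POSIX sh snippet with args as its positional parameters
    and return the integer that the snippet prints."""
    result = subprocess.run(
        ["sh", "-c", snippet, "sh"] + list(args),
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


path = "/there are spaces/sample.fastq.gz"

# Unquoted $@ is split on whitespace, so one path becomes several words
unquoted = count_args('set -- $@; echo $#', [path])
# Quoted "$@" keeps each original argument intact
quoted = count_args('set -- "$@"; echo $#', [path])

print(unquoted, quoted)
```

With the example path, the unquoted form yields three arguments and the quoted form yields one, which is why the jar is handed a broken file name in the unquoted case.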
