
ebi-metagenomics / workflow-is-cwl


This repository contains CWL descriptions of the various tools which will allow you to build workflows for the annotation of transcripts

Home Page: https://www.elixir-europe.org/

License: Apache License 2.0

Common Workflow Language 4.36% Shell 0.26% Pep8 0.60% Roff 0.75% Dockerfile 0.05% HTML 93.98%
cwl-workflow common-workflow-language cwl workflow cwl-descriptions transdecoder diamond phmmer cwl-description marine

workflow-is-cwl's Introduction


ebi-metagenomics

Metagenomics at the EMBL-EBI

working name "MGportal"

Introduction

What's metagenomics?

The study of all genomes present in any given environment without the need for prior individual identification or amplification is termed metagenomics. For example, in its simplest form a metagenomic study might be the direct sequence results of DNA extracted from a bucket of sea water.

What is the EBI doing for metagenomic researchers?

The EBI resources of the European Nucleotide Archive (in particular the Sequence Read Archive and EMBL-Bank), UniProt, InterPro, Ensembl Genomes and IntAct are all used for analysis by metagenomic researchers, but not in an integrated manner. We intend to provide a user-friendly interface to these services, promoting their utility in the field of metagenomics. It will enable protein prediction, function analysis, comparison to complete reference genomes and metabolic pathway analysis.

This Project

The code here will include everything that has been generated to fulfill the above criteria, including the design of the web portal to submit and browse data.

MGportal is being developed as an open source project at EMBL European Bioinformatics Institute (EBI).

The web portal

The web portal is under development. Please visit http://www.ebi.ac.uk/metagenomics/ to see the latest version.

workflow-is-cwl's People

Contributors

arnaudmeng, hmenager, mscheremetjew, stain


workflow-is-cwl's Issues

cwlexec: InterProScan5 result file is not mapped to the output binding

One possible reason is that the InterProScan5 (I5) step runs in scattered mode.
Have a look at the JSON output produced by cwlexec for the TranscriptsAnnotation workflow, in particular the i5Annotations entry, which has a size of 0 and no location or path:

{
  "coding_regions" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.cds",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.cds",
    "basename" : "test_transcripts.cleaned.fasta.transdecoder.cds",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta.transdecoder",
    "nameext" : ".cds",
    "checksum" : "sha1$a5c91d3096d3f691bb2fe751f589b3a5cceff905",
    "size" : 4883,
    "class" : "File"
  },
  "peptide_sequences" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.pep",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.pep",
    "basename" : "test_transcripts.cleaned.fasta.transdecoder.pep",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta.transdecoder",
    "nameext" : ".pep",
    "checksum" : "sha1$34b74df01848af46572d8a90749b5dc0a36519de",
    "size" : 1713,
    "class" : "File"
  },
  "reformatted_sequences" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.pep.reformatted_seqs",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.pep.reformatted_seqs",
    "basename" : "test_transcripts.cleaned.fasta.transdecoder.pep.reformatted_seqs",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta.transdecoder.pep",
    "nameext" : ".reformatted_seqs",
    "checksum" : "sha1$63510367a96fb7ac855bd87c5bd7ada924828ea9",
    "size" : 1713,
    "class" : "File"
  },
  "i5Annotations" : {
    "size" : 0,
    "secondaryFiles" : [ ],
    "class" : "File"
  },
  "cleaned_transcripts_file" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta",
    "basename" : "test_transcripts.cleaned.fasta",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned",
    "nameext" : ".fasta",
    "checksum" : "sha1$79fd0ed1d5b160b45f02f4ce627ac0bf2343411e",
    "size" : 15453,
    "format" : "http://edamontology.org/format_1929",
    "class" : "File"
  },
  "gff3_output" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.gff3",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.gff3",
    "basename" : "test_transcripts.cleaned.fasta.transdecoder.gff3",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta.transdecoder",
    "nameext" : ".gff3",
    "checksum" : "sha1$3b157e697608f1170664bc7acbf97466ee604b14",
    "size" : 741,
    "class" : "File"
  },
  "bed_output" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.bed",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.transdecoder.bed",
    "basename" : "test_transcripts.cleaned.fasta.transdecoder.bed",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta.transdecoder",
    "nameext" : ".bed",
    "checksum" : "sha1$733e6fcbb99cc4063e528a00bac66de9a2a9901c",
    "size" : 212,
    "class" : "File"
  },
  "diamond_matches" : {
    "location" : "file:///home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.diamond_matches",
    "path" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0/test_transcripts.cleaned.fasta.diamond_matches",
    "basename" : "test_transcripts.cleaned.fasta.diamond_matches",
    "dirname" : "/home/user/maxim/cwlexec-test/TranscriptsAnnotation-i5only-wf-426dab18-eb67-4274-8525-be8873ceeed0",
    "nameroot" : "test_transcripts.cleaned.fasta",
    "nameext" : ".diamond_matches",
    "checksum" : "sha1$0ba85f5d0453f9f74d68b94d8215ceb3223c1df1",
    "size" : 286,
    "class" : "File"
  }
}
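The symptom above (a File entry with no location or path) can be detected automatically. The following is a minimal Python sketch, not part of the repository: the function name unmapped_outputs and the shortened example paths are assumptions made for illustration. It only inspects an output object shaped like the one shown above and reports File entries that were not bound to a file.

```python
import json


def unmapped_outputs(outputs):
    """Return the names of File outputs that carry neither a location nor a path."""
    missing = []
    for name, value in outputs.items():
        # An output may be a single File object or a list of them (e.g. from a scattered step).
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            if isinstance(entry, dict) and entry.get("class") == "File":
                if not entry.get("location") and not entry.get("path"):
                    missing.append(name)
                    break
    return missing


# Shortened, hypothetical excerpt modelled on the cwlexec output shown above.
example = json.loads("""
{
  "i5Annotations": {"size": 0, "secondaryFiles": [], "class": "File"},
  "bed_output": {
    "location": "file:///tmp/example.bed",
    "path": "/tmp/example.bed",
    "size": 212,
    "class": "File"
  }
}
""")

print(unmapped_outputs(example))
```

Running such a check on the saved output JSON gives a quick list of outputs to investigate; it does not explain why the binding failed.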

Validating the complete assembly workflow

I'm working on validating the complete workflow, and I'm having difficulty figuring out how to fix the issue below. Any ideas?

Tool definition failed validation: ../workflow-is-cwl-assembly/workflows/TranscriptomeAssembly-wf.cwl:155:5: Type property "['null', 'seq_type']" not a valid Avro schema: Union item must be a valid Avro schema: Could not make an Avro Schema object from seq_type.

Command run:

cwltool --validate ../workflow-is-cwl-assembly/workflows/TranscriptomeAssembly-wf.cwl ../workflow-is-cwl-assembly/workflows/TranscriptomeAssembly-wf.test.job.paired-end.yaml
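One possible cause of this message, not confirmed in this thread, is a user-defined type name (here seq_type) that the workflow uses without a resolvable reference. In CWL, custom types are normally declared through a SchemaDefRequirement and referenced by a name containing '#', such as some_file.yaml#seq_type. The Python sketch below works under that assumption: it scans an already-parsed inputs section and lists type names that are neither CWL built-ins nor '#'-qualified references. The function name, the input ids and the example dictionary are hypothetical.

```python
# Primitive and special types defined by the CWL specification.
CWL_BUILTIN_TYPES = {
    "null", "boolean", "int", "long", "float", "double",
    "string", "File", "Directory", "Any",
}


def unresolved_type_names(inputs):
    """Return (input_id, type_name) pairs whose type name looks unresolved."""
    flagged = []
    for input_id, spec in inputs.items():
        declared = spec.get("type") if isinstance(spec, dict) else spec
        items = declared if isinstance(declared, list) else [declared]
        for item in items:
            if not isinstance(item, str):
                continue  # inline record, enum or array schemas are skipped
            base = item.rstrip("?")  # "string?" is shorthand for an optional type
            if base.endswith("[]"):  # "string[]" is shorthand for an array type
                base = base[:-2]
            if base not in CWL_BUILTIN_TYPES and "#" not in base:
                flagged.append((input_id, item))
    return flagged


# Hypothetical inputs section, as it would look after loading the workflow YAML.
example_inputs = {
    "trinity_seq_type": {"type": ["null", "seq_type"]},
    "trinity_max_mem": {"type": "string"},
    "read_files": {"type": "File[]"},
}

print(unresolved_type_names(example_inputs))
```

Any name this reports is only a candidate: the actual fix depends on where the type is defined and how the workflow imports it.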

BUSCO: Slash character ('/') is not supported in FASTA header - Add FASTA header cleaning step

While running the annotation pipeline on a full transcriptome assembly, the assembly assessment step (BUSCO) failed, complaining about unsupported characters in the FASTA header:

INFO	Configuration loaded from /software/applications/busco/3.0.2/scripts/../config/config.ini
INFO	Init tools...
INFO	Check dependencies...
INFO	Check input file...
ERROR	The character '/' is present in the fasta header >MMETSP0795-doi:10.5281/zenodo.249982-Transcript_0, which will crash Reader. Please clean the header of your input file.
ERROR	BUSCO analysis failed !
ERROR	Check the logs, read the user guide, if you still need technical support, then please contact mailto:[email protected]
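The header cleaning step requested in the issue title could look like the following Python sketch. It is an illustration only, not the step implemented in the repository: the function name clean_fasta_headers and the choice of '_' as the replacement character are assumptions. It rewrites only header lines (those starting with '>') and leaves sequence lines untouched.

```python
def clean_fasta_headers(lines, replacement="_"):
    """Yield FASTA lines with every '/' in header lines replaced.

    Sequence lines are passed through unchanged, including their line endings.
    """
    for line in lines:
        if line.startswith(">"):
            yield ">" + line[1:].replace("/", replacement)
        else:
            yield line


# Header taken from the BUSCO error message above; the sequence is made up.
records = [
    ">MMETSP0795-doi:10.5281/zenodo.249982-Transcript_0\n",
    "ATGGCCAAGT\n",
]

for cleaned in clean_fasta_headers(records):
    print(cleaned, end="")
```

Because the generator accepts any iterable of lines, it can be applied directly to an open file handle. Note that replacing characters could in principle make two distinct headers identical, so uniqueness should be checked separately if that matters downstream.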
