jdiez / rubra
This project forked from bjpop/rubra
Infrastructure code to support DNA pipeline
License: MIT License
Rubra: a bioinformatics pipeline.
---------------------------------

https://github.com/bjpop/rubra

License:
--------

Rubra is licensed under the MIT license. See LICENSE.txt.

Description:
------------

Rubra is a pipeline system for bioinformatics workflows. It is built on
top of the Ruffus (http://www.ruffus.org.uk/) Python library, and adds
support for running pipeline stages on a distributed compute cluster.

Authors:
--------

Bernie Pope, Clare Sloggett, Gayle Philip, Matthew Wakefield

Usage:
------

    usage: rubra [-h] PIPELINE_FILE --config CONFIG_FILE [CONFIG_FILE ...]
                 [--verbose {0,1,2}]
                 [--style {print,run,touchfiles,flowchart}]
                 [--force TASKNAME] [--end TASKNAME]
                 [--rebuild {fromstart,fromend}]

    A bioinformatics pipeline system.

    optional arguments:
      -h, --help            show this help message and exit
      PIPELINE_FILE         Your Ruffus pipeline stages (a Python module)
      --config CONFIG_FILE [CONFIG_FILE ...]
                            One or more configuration files (Python modules)
      --verbose {0,1,2}     Output verbosity level: 0 = quiet; 1 = normal;
                            2 = chatty (default is 1)
      --style {print,run,touchfiles,flowchart}
                            Pipeline behaviour: print; run; touchfiles;
                            flowchart (default is print)
      --force TASKNAME      tasks which are forced to be out of date
                            regardless of timestamps
      --end TASKNAME        end points (tasks) for the pipeline
      --rebuild {fromstart,fromend}
                            rebuild outputs by working back from end tasks
                            or forwards from start tasks (default is
                            fromstart)

Example:
--------

Below is a little example pipeline which you can find in the Rubra
source tree. It counts the number of lines in two files (test/data1.txt
and test/data2.txt), and then sums the results together.

    rubra example_pipeline.py --config example_config.py --style run

There are 2 lines in the first file and 1 line in the second file. So
the result is 3, which is written to the output file test/total.txt.

The PIPELINE_FILE argument is a Python script which contains the actual
code for each pipeline stage (using Ruffus notation).
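
To make the example's arithmetic concrete, here is a minimal sketch of
what a summing step could look like. It is a hypothetical stand-in, not
the test/total.py shipped in the Rubra source tree, which may differ.
The function name sum_counts is made up for illustration, and each input
file is assumed to hold the output of "wc -l" (a count, optionally
followed by a file name).

```python
#!/usr/bin/env python
# Hypothetical stand-in for a summing script such as test/total.py;
# the real script in the Rubra source tree may differ.
import sys


def sum_counts(paths):
    """Add up the leading integer on the first line of each file.

    Each file is assumed to hold `wc -l` output, for example
    "2 test/data1.txt" or just "2" (leading spaces are tolerated).
    """
    total = 0
    for path in paths:
        with open(path) as handle:
            fields = handle.readline().split()
        total += int(fields[0])
    return total


if __name__ == "__main__":
    # Usage (assumed): python total_sketch.py COUNT_FILE [COUNT_FILE ...]
    print(sum_counts(sys.argv[1:]))
```

With count files containing 2 and 1, as in the example above, the sketch
prints 3.
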
The --config argument is a Python script which contains configuration
options for the whole pipeline, plus options for each stage (including
the shell command to run in the stage).

The --style argument says what to do with the pipeline: "run" means
"perform the out-of-date steps in the pipeline". The default style is
"print" which just displays what the pipeline would do if it were run.
You can get a diagram of the pipeline using the "flowchart" style. You
can touch all files in order using the "touchfiles" style, which is
mostly useful for forcing Ruffus to acknowledge that a set of steps is
up to date.

Configuration:
--------------

Configuration options are written into one or more Python scripts,
which are passed to Rubra via the --config command line argument.
Some options are required, and some are, well, optional.

Options for the whole pipeline:
-------------------------------

    pipeline = {
        "logDir": "log",
        "logFile": "pipeline.log",
        "procs": 2,
        "end": ["total"],
    }

Options for each stage of the pipeline:
---------------------------------------

    stageDefaults = {
        "distributed": False,
        "walltime": "00:10:00",
        "memInGB": 1,
        "queue": "batch",
        "modules": ["python-gcc"]
    }

    stages = {
        "countLines": {
            "command": "wc -l %file > %out",
        },
        "total": {
            "command": "./test/total.py %files > %out",
        },
    }
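
The name stageDefaults suggests that a stage which does not set an
option falls back to the default value. The sketch below illustrates
that reading. It is an assumption, not Rubra's own code: the helper
stage_option is hypothetical, and the "walltime" entry on the "total"
stage is added here only to show an override.

```python
# Values copied from the configuration above, except where noted.
stageDefaults = {
    "distributed": False,
    "walltime": "00:10:00",
    "memInGB": 1,
    "queue": "batch",
    "modules": ["python-gcc"],
}

stages = {
    "countLines": {
        "command": "wc -l %file > %out",
    },
    "total": {
        "command": "./test/total.py %files > %out",
        # Illustrative override, not part of the original example config.
        "walltime": "00:20:00",
    },
}


def stage_option(stage_name, option):
    """Hypothetical lookup: prefer the stage's own value, else the default."""
    stage = stages[stage_name]
    if option in stage:
        return stage[option]
    return stageDefaults[option]


print(stage_option("countLines", "walltime"))
print(stage_option("total", "walltime"))
```

Under this assumption, "countLines" inherits the default walltime while
"total" uses its own value.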