Cluster Tools

Workflows for distributed bio-image analysis and segmentation. Supports Slurm, LSF and local execution, and is easy to extend to more scheduling systems.

Workflows

Installation

You can install the package via conda:

conda install -c conda-forge cluster_tools

To set up a development environment with all necessary dependencies, you can use the environment.yml file:

conda env create -f environment.yml

and then install the package in development mode via

pip install -e . --no-deps

Citation

If you use this software in a publication, please cite

Pape, Constantin, et al. "Solving large multicut problems for connectomics via domain decomposition." Proceedings of the IEEE International Conference on Computer Vision. 2017.

For the lifted multicut workflows, please cite

Pape, Constantin, et al. "Leveraging Domain Knowledge to improve EM image segmentation with Lifted Multicuts." arXiv preprint. 2019.

You can find code for the experiments in publications/lifted_domain_knowledge.

If you use another algorithm that is not part of these two publications, please also cite the appropriate publication (see the links here).

Getting Started

This repository uses luigi for workflow management. The following schedulers are supported so far:

  • slurm
  • lsf
  • local (local execution based on ProcessPool)

The scheduler is selected with the keyword argument target. Jobs communicate through files stored in a temporary folder, and most workflows use n5 storage. You can use z5 to convert files to n5 from Python.
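To make the file-based communication more concrete, here is a minimal sketch of the general pattern. It is not the repository's actual implementation: a driver splits block ids across at most `max_jobs` jobs, writes one JSON file per job into the temporary folder, and each job reads its own file back. The function names, the `job_<id>.json` file naming and the `block_list` key are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def write_job_configs(tmp_folder, block_ids, max_jobs):
    """Split block ids round-robin over at most max_jobs jobs and
    write one JSON file per job into tmp_folder."""
    os.makedirs(tmp_folder, exist_ok=True)
    # never start more jobs than there are blocks
    n_jobs = min(max_jobs, len(block_ids))
    paths = []
    for job_id in range(n_jobs):
        # job i gets blocks i, i + n_jobs, i + 2 * n_jobs, ...
        job_blocks = block_ids[job_id::n_jobs]
        path = os.path.join(tmp_folder, f'job_{job_id}.json')
        with open(path, 'w') as f:
            json.dump({'job_id': job_id, 'block_list': job_blocks}, f)
        paths.append(path)
    return paths


def read_job_config(path):
    """Load the configuration a single job would read on startup."""
    with open(path) as f:
        return json.load(f)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp_folder:
        paths = write_job_configs(tmp_folder, list(range(10)), max_jobs=4)
        for path in paths:
            print(read_job_config(path))
```

Because the jobs only share files, the same pattern works whether they run as local processes or as jobs submitted to a cluster scheduler, as long as all of them can access the temporary folder.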

Simplified, running a workflow from this repository looks like this:

import json
import os
import luigi
from cluster_tools import SimpleWorkflow  # this is just a mock class, not actually part of this repository

# folder for temporary scripts and files
tmp_folder = 'tmp_wf'

# directory for configurations for workflow sub-tasks stored as json
config_dir = 'configs'
os.makedirs(config_dir, exist_ok=True)

# get the default configurations for all sub-tasks
default_configs = SimpleWorkflow.get_config()

# global configuration: shebang pointing to the python interpreter with all
# dependencies and the group name; all other entries keep their defaults
global_config = default_configs['global']
shebang = '#! /path/to/bin/python'
global_config.update({'shebang': shebang, 'groupname': 'mygroup'})
with open(os.path.join(config_dir, 'global.config'), 'w') as f:
    json.dump(global_config, f)

# run the example workflow with at most `max_jobs` jobs
max_jobs = 100
task = SimpleWorkflow(tmp_folder=tmp_folder, config_dir=config_dir,
                      target='slurm', max_jobs=max_jobs,
                      input_path='/path/to/input.n5', input_key='data',
                      output_path='/path/to/output.n5', output_key='data')
# local_scheduler=True runs luigi without a central scheduler (luigid)
luigi.build([task], local_scheduler=True)
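The configuration step in the example can be wrapped in a small helper. The following is a sketch under assumptions: `write_config` and `read_config` are hypothetical helpers, not part of cluster_tools, and they assume that each configuration is stored in `config_dir` as `<name>.config`, following the `global.config` file used in the example.

```python
import json
import os
import tempfile


def write_config(config_dir, name, config):
    """Write a configuration dict to <config_dir>/<name>.config as JSON."""
    # create the config directory if it does not exist yet
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, f'{name}.config')
    with open(path, 'w') as f:
        json.dump(config, f, indent=2)
    return path


def read_config(config_dir, name):
    """Read a configuration back, e.g. to check it before starting a workflow."""
    with open(os.path.join(config_dir, f'{name}.config')) as f:
        return json.load(f)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as config_dir:
        # the keys mirror the global configuration from the example
        global_config = {'shebang': '#! /path/to/bin/python',
                         'groupname': 'mygroup'}
        write_config(config_dir, 'global', global_config)
        print(read_config(config_dir, 'global'))
```

Creating the directory inside the helper avoids a `FileNotFoundError` when `config_dir` does not exist yet.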

For a list of the available segmentation workflows, have a look at this. Unfortunately, there is no proper documentation yet. For more details, have a look at the examples, in particular this example. You can download the example data (also used for the tests) here.

Contributors

constantinpape, martinschorb
