MyCluster

Library and command line interface to support interacting with multiple HPC clusters

Provides the ability to interact with the most popular HPC job scheduling systems using a single interface and enables the creation of job submission scripts.

Tested with SGE, LSF and Slurm (PBS/TORQUE support is under development).

Getting started

MyCluster can be installed from PyPI:

pip install mycluster

Configuration

Storing your email address

MyCluster writes your email address into every submission file so that you can receive updates from the scheduler. You can supply the address on the command line or store it in a configuration file. To store your email in the configuration file, run:

mycluster configure

Setting a custom scheduler

By default MyCluster tries to detect the underlying scheduler, but this can be overridden by setting the MYCLUSTER_SCHED environment variable. Set it to the name (as a string) of a Python class that implements the mycluster.schedulers.base.Scheduler class.
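
As an illustration of how a class name stored in an environment variable can be turned into a class object, here is a minimal sketch. It assumes the value is a fully qualified dotted path such as package.module.ClassName, which the text above does not confirm, and resolve_class is a hypothetical helper, not part of MyCluster. A standard-library class stands in for a real scheduler class so the example runs anywhere.

```python
import importlib


def resolve_class(dotted_name):
    """Import and return the class named by a dotted path.

    Hypothetical helper: MyCluster's own lookup may work differently.
    """
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'package.module.Class', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in value; a real setting would name a class that implements
# mycluster.schedulers.base.Scheduler.
scheduler_class = resolve_class("collections.OrderedDict")
print(scheduler_class.__name__)
```

A bare name without a module part raises ValueError here, which surfaces a misconfigured variable early instead of failing later during job submission.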

Override the submission template

In some cases you may want to override the submission templates, for example if you want to include additional parameters or scheduler commands. To do this set the MYCLUSTER_TEMPLATE environment variable to the jinja template you wish to use. See mycluster/schedulers/templates for the base templates.

Command Line

MyCluster installs the "mycluster" CLI command for interacting with the local scheduler from the command line.

Print command help

mycluster <command> --help

List all queues

mycluster queues

List jobs

mycluster list

Create a new submission file (see --help for more submission options)

mycluster create JOBFILE QUEUE RUNSCRIPT

Submit a job file

mycluster submit JOBFILE

Cancel a job

mycluster cancel JOBID

The RUNSCRIPT executed by the job file can make use of the following predefined environment variables:

export NUM_TASKS=
export TASKS_PER_NODE=
export THREADS_PER_TASK=
export NUM_NODES=

# OpenMP configuration
export OMP_NUM_THREADS=$THREADS_PER_TASK

# Default mpiexec commands for each flavour of MPI
export OMPI_CMD="mpiexec -n $NUM_TASKS -npernode $TASKS_PER_NODE -bysocket -bind-to-socket"
export MVAPICH_CMD="mpiexec -n $NUM_TASKS -ppn $TASKS_PER_NODE -bind-to-socket"
export IMPI_CMD="mpiexec -n $NUM_TASKS -ppn $TASKS_PER_NODE"
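
If the run script is itself written in Python, it can read the same variables from its environment. The sketch below assumes the variables listed above are exported and already expanded by the shell before the script starts; build_mpi_command and the program name ./a.out are hypothetical and used only for illustration.

```python
import os
import shlex


def build_mpi_command(program, flavour="IMPI", env=None):
    """Return the launch command as an argument list.

    flavour selects one of the predefined variables listed above:
    OMPI_CMD, MVAPICH_CMD or IMPI_CMD.
    """
    env = os.environ if env is None else env
    prefix = env[f"{flavour}_CMD"]
    # shlex.split keeps quoted arguments together, unlike str.split.
    return shlex.split(prefix) + [program]


# Sample values standing in for what the job file would export.
sample_env = {
    "NUM_TASKS": "48",
    "TASKS_PER_NODE": "24",
    "IMPI_CMD": "mpiexec -n 48 -ppn 24",
}
command = build_mpi_command("./a.out", "IMPI", env=sample_env)
print(command)
```

The resulting list can be passed directly to subprocess.run without shell=True, which avoids a second round of shell expansion.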

API

MyCluster can be used programmatically through the mycluster module. All schedulers implement the base mycluster.schedulers.base.Scheduler class.

import mycluster

# Detect the local scheduler
scheduler = mycluster.detect_scheduling_sys()

print(f"Scheduler loaded: {scheduler.scheduler_type()}")

# Create a batch script to submit a 48 task run of script.sh to the skylake queue
script = scheduler.create("skylake", 48, "my_job", "script.sh", "01:00:00", tasks_per_node=24)

# Write to a file
with open("mysub.job", "w") as f:
    f.write(script)

# Submit the batch script
job_id = scheduler.submit("mysub.job")

# Check the status of the job
print(scheduler.get_job_details(job_id))

# Cancel the job
scheduler.delete(job_id)
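
The create, write and submit steps above can be wrapped in one function that accepts any scheduler object. This is a sketch rather than part of MyCluster: run_job and FakeScheduler are hypothetical names, and the stub only mimics the create and submit calls shown above so the workflow can be exercised on a machine without a scheduler.

```python
import os
import tempfile


def run_job(scheduler, queue, num_tasks, job_name, run_script, wall_clock,
            job_file, **create_options):
    """Create a batch script, write it to job_file and submit it."""
    script = scheduler.create(queue, num_tasks, job_name, run_script,
                              wall_clock, **create_options)
    with open(job_file, "w") as f:
        f.write(script)
    return scheduler.submit(job_file)


class FakeScheduler:
    """Hypothetical stand-in for a real scheduler; records submissions."""

    def __init__(self):
        self.submitted = []

    def create(self, queue, num_tasks, job_name, run_script, wall_clock,
               **options):
        return f"# job {job_name}: {num_tasks} tasks on {queue}\n./{run_script}\n"

    def submit(self, job_file):
        self.submitted.append(job_file)
        return len(self.submitted)  # pretend job id


with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "mysub.job")
    demo_id = run_job(FakeScheduler(), "skylake", 48, "my_job", "script.sh",
                      "01:00:00", demo_file, tasks_per_node=24)
    print(f"Submitted job {demo_id}")
```

On a real cluster the object returned by mycluster.detect_scheduling_sys() would be passed in place of FakeScheduler().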

mycluster's Issues

mycluster -q fails with GE 6.2u5

vagrant@sgemaster:/usr/share/gridengine$ mycluster -q

Local database in: /home/vagrant/.mycluster/
User: Fred Bloggs
Email: [email protected]
Local job scheduler: sge
Site name: unknown

Queue Name | Node Max Task | Node Max Thread | Node Max Memory | Max Task | Available Task
error: no queues remaining after -pe queue selection
Traceback (most recent call last):
  File "/usr/local/bin/mycluster", line 94, in <module>
    main()
  File "/usr/local/bin/mycluster", line 42, in main
    mycluster.print_queue_info()
  File "/usr/local/lib/python2.7/dist-packages/mycluster/mycluster.py", line 221, in print_queue_info
    nc = scheduler.node_config(q)
  File "/usr/local/lib/python2.7/dist-packages/mycluster/sge.py", line 127, in node_config
    config['max task'] = int(new_line.split(' ')[4])
ValueError: invalid literal for int() with base 10: '490.0M'
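
The failure comes from passing a non-integer field ('490.0M') straight to int(). One defensive pattern, shown here as a sketch and not as the project's actual fix, is to return None when the field is not a plain integer so the caller can skip or report the queue. parse_task_count is a hypothetical helper name.

```python
def parse_task_count(field):
    """Return the field as an int, or None if it is not a plain integer."""
    try:
        return int(field)
    except ValueError:
        return None


for value in ("16", "490.0M"):
    print(repr(value), "->", parse_task_count(value))
```

This only guards the conversion; it does not address why the scheduler output contained an unexpected column in the first place.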
