jaspy

Conda environments for JASMIN (and beyond)

Quickstart

If you need a quick Python environment, try these...

Quickstart: Python2.7

$ git clone https://github.com/cedadev/jaspy
$ cd jaspy/src/deployment/
$ export JASPY_BASE_DIR=/usr/local/jaspy
$ ./install-miniconda.sh py2.7
$ ./install-jaspy-env.sh jaspy2.7-m3-4.5.11-r20181219
$ python -c 'import sys; print(sys.version)'

Quickstart: Python3.7

$ git clone https://github.com/cedadev/jaspy
$ cd jaspy/src/deployment/
$ export JASPY_BASE_DIR=/usr/local/jaspy
$ ./install-miniconda.sh py3.7
$ ./install-jaspy-env.sh jaspy3.7-m3-4.5.11-r20181219
$ python -c 'import sys; print(sys.version)'

Overview

This package provides the instructions and full specifications for a collection of software environments built on top of the conda package manager. The environments are primarily built to run on the JASMIN platform, but they may be useful elsewhere.

Why conda?

Conda is a package management system that has emerged from the Python community. It is open source and runs on Windows, macOS and Linux. Conda can be used to install, run and update packages and their dependencies. It includes features to create, save, load and switch between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.

Conda, Anaconda and Miniconda

The ecosystem of conda tools includes three main players. It is useful to understand the distinction between them:

  • Conda: The package management system itself.
  • Anaconda: A pre-selected set of (over 250) scientific software packages that can be installed in a single (conda) environment.
  • Miniconda: A basic installer that contains an entire Python installation and the conda package manager.

In this plan we are only concerned with conda and Miniconda. Miniconda is the starting point because it provides a baseline Python and conda installation. Once that is in place, conda can be used to create and manage multiple environments.

Workflows

This diagram attempts to explain the workflow of jaspy:

[Figure: jaspy workflow diagram]

There are four workflows related to jaspy:

  1. User workflow:
     • Activate a jaspy environment
     • Use software in that environment
  2. Platform Administrator workflow:
     • Install jaspy
     • Install environment(s)
     • Document usage of environment(s) for users
  3. Environment Developer workflow:
     • Install jaspy
     • Develop new environment(s)
     • Save the new environment(s)
     • Test the installation of the new environment(s)
     • Commit the new environment(s) to the repository
  4. Jaspy Core Developer workflow:
     • Install jaspy
     • Develop and improve the core framework
     • Test and update the code

Workflow 1: User

Assuming that a Platform Administrator has installed jaspy, you can use it as explained here. Use the settings recommended by your administrator (who might be you):

1. Clone Jaspy

git clone https://github.com/cedadev/jaspy
cd jaspy/src/deployment/

2. Set your base directory to install jaspy (and the conda packages)

export JASPY_BASE_DIR=/usr/local/jaspy

3. Install miniconda at the required version

./install-miniconda.sh py3.7

4. Install the conda environment required

./install-jaspy-env.sh isc-env-r20181009

5. Activate and use the environment

source ./activate-jaspy-env.sh isc-env-r20181009
python -c 'import sys; print(sys.version)'

Other features

List the available jaspy conda environments:

$JASPY_BASE_DIR/bin/list-conda-envs.sh

Versioning

There are different levels of versioning:

  1. Miniconda version, e.g.:
  2. Python version:
     • python 2.7.13
     • python 2.7.15
     • python 3.6.2
  3. Versions of the jaspy environments themselves, e.g.:
     • jaspy-py27-0.1.0
     • jaspy-py36-0.2.1

To ensure reproducibility, the jaspy approach will involve creating a separate environment for each python and miniconda version as follows:

  • ${JASPY_BASE_DIR}/jas${PY_VERSION}/${MINICONDA_VERSION}/envs/${JASPY_ENV_NAME}

E.g.:

  • /apps/contrib/jaspy/miniconda_envs/jaspy3.7/m3-4.5.11/envs/jaspy3.7-m3-4.5.11-r20181218
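The layout above can be sketched as a small shell helper. This is only an illustration: `jaspy_env_path` is a hypothetical function, not part of jaspy, and it assumes that the Python version tag takes the `py3.7` form passed to `install-miniconda.sh`, so that `jas` plus `py3.7` gives `jaspy3.7`. The base directory is the Quickstart value.

```shell
#!/bin/sh
# Sketch only: jaspy_env_path is a hypothetical helper, not part of jaspy.
# Arguments: base directory, Python version tag, Miniconda version, environment name.
jaspy_env_path() {
    printf '%s/jas%s/%s/envs/%s\n' "$1" "$2" "$3" "$4"
}

# Assumption: the Python version tag looks like "py3.7".
env_path="$(jaspy_env_path /usr/local/jaspy py3.7 m3-4.5.11 jaspy3.7-m3-4.5.11-r20181218)"
echo "$env_path"
```

Keeping the Python version, the Miniconda version and the environment name as separate path components means two releases never share a directory.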

Note on reproducibility

Whilst this is a verbose and complex approach, it is the most transparent way to ensure that environments can be reproduced. However, we are aware that any two systems might have subtle differences (for example in compilers and installed libraries) that lead to differences in the installation. Unfortunately, we do not have the resources to guarantee exact reproducibility.

Note that conda allows you to "pin" environments:

https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#building-identical-conda-environments

We use this approach but take it one step further by caching everything on our own server to ensure that packages are not revoked from remote repositories.
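As a rough illustration of the pinning step described in the conda documentation linked above, the sketch below uses conda's `conda list --explicit` and `conda create --file` commands. The function names and the `spec-file.txt` file name are placeholders chosen for this example; jaspy's own scripts may handle this differently, and the caching of packages on a separate server is not shown.

```shell
#!/bin/sh
# Sketch only: function names and the spec file name are placeholders.
SPEC_FILE="spec-file.txt"

# Write the exact package URLs of the active environment to a spec file.
export_pinned_env() {
    conda list --explicit > "$1"
}

# Recreate an environment (name in $1) from a spec file (path in $2).
create_pinned_env() {
    conda create --yes --name "$1" --file "$2"
}

# Only export if conda is actually available on this machine.
if command -v conda >/dev/null 2>&1; then
    export_pinned_env "$SPEC_FILE"
    echo "Wrote $SPEC_FILE"
else
    echo "conda not found on PATH; nothing exported"
fi
```

On another machine with the same platform, `create_pinned_env my-copy spec-file.txt` would then rebuild the environment from the same package files, as long as those files are still available from the URLs listed in the spec file.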

Sign-posting "easy" versions for users

It would be undesirable to expect users to keep track of exact versions (although some may choose to for their own reasons). We will therefore provide sign-posts that specify:

  • "current"
  • "_next"
  • "_previous"

Set-up using module files

Users can activate a given jaspy environment using one of two methods:

  1. Source-activate: source ${JASPY_BASE_DIR}/bin/activate <jaspy_env_id>

  2. Module files: module load contrib/jaspy<py_version>[/<jaspy_sub_version>]

This requires the following set-up:

  • "current", "_next" and "previous" versions are symlinks in the envs/ directory of a given miniconda directory tree.
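A minimal sketch of the symlink set-up, run against a throwaway directory rather than a real installation. The two release names are taken from examples elsewhere in this document; which release a sign-post should point to is an assumption made purely for illustration.

```shell
#!/bin/sh
# Sketch only: builds a temporary envs/ directory instead of touching a real installation.
set -e

envs_dir="$(mktemp -d)/envs"
mkdir -p "$envs_dir/jaspy3.7-m3-4.5.11-r20181218" "$envs_dir/jaspy3.7-m3-4.5.11-r20181219"

# Point the "current" sign-post at one release (chosen arbitrarily here).
# -s: symbolic link, -f: replace an existing link, -n: do not descend into an existing link target.
ln -sfn "jaspy3.7-m3-4.5.11-r20181219" "$envs_dir/current"

# Show which environment the sign-post resolves to.
readlink "$envs_dir/current"
```

Because the link target is relative, the envs/ directory can be moved or mounted elsewhere without breaking the sign-post, and moving users to a new release only requires re-running the `ln -sfn` line with a different target.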

FAQs

  1. Why does jaspy insist on serving the binaries from its own channel?
  • Conda and its related ecosystem (the conda-forge channel, etc.) provide a superb foundation for building and managing software environments. However, given the collaborative nature of the ecosystem, there are dependencies that are out of our control, such as the update or removal of a package version, which can lead to problems.
  • In particular, a simple YAML environment description might resolve fine on one day but might not be repeatable the next day. A solution to this problem is to capture the exact environment in a set of binary files and cache them on our own server.
  • The jaspy channels are a place where we know that we can describe an exact environment independently of any perturbations that might be happening in other conda recipes and channels.
