APRICOT: Advanced Platform for Reproducible Infrastructures in the Cloud via Open Tools

Introduction

APRICOT is an open-source extension that supports the deployment and use of customised virtual infrastructures from Jupyter notebooks. It provisions multi-cloud infrastructures through a wizard-like GUI that guides the user step by step through the deployment process. It also implements IPython magic functions to use and manage the deployed infrastructures from within Jupyter notebooks.

Experiment replication methodology

A generic computational experiment involves the elements shown in the following figure. To reproduce an experiment, a researcher needs:

  • Required input data or a method to produce it.

  • Specific hardware, such as memory requirements, GPGPUs, FPGAs or a multi-node cluster.

  • A list of all the required software and its configuration, such as MPI on a cluster, specific language compilers, specific software or libraries, source code, etc.

  • Clear instructions on how to reproduce the entire experiment.

[Figure: elements involved in a generic computational experiment]

APRICOT can be used from Jupyter notebooks to make experiments reproducible when they require complex, customised computing infrastructures. The key points for developing reproducible experiments with the APRICOT extensions are:

  • Required data must be provided through an external storage system or through a notebook with instructions to create it. APRICOT has been configured to use OneData as the external storage provider.

  • APRICOT provides a set of predefined, configurable infrastructures to fit the experiments. Any researcher can easily deploy the same computing infrastructure as the one used in a previous experiment carried out with APRICOT.

  • APRICOT allows remote execution of commands on the deployed infrastructures to ease interaction. Any extra software can therefore be documented and installed on the infrastructure from the same Jupyter notebook that documents the experiment, so that it can be executed step by step.

  • Since the APRICOT extension uses Jupyter notebooks as its base environment, the whole experiment can be documented using text, live code and images.

Components

[Figure: APRICOT components]

APRICOT has been constructed using the following components:

  • Jupyter, an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
  • CLUES, an energy management system for High Performance Computing (HPC) Clusters and Cloud infrastructures.
  • EC3, an open-source tool to deploy compute clusters that can horizontally scale in terms of number of nodes with multiple plugins.
  • IM, an open-source virtual infrastructure provisioning tool for multi-clouds.
  • ONEDATA, global data storage backed by computing centers and storage providers worldwide.

Installation

Requisites

APRICOT requires the IM client to deploy the infrastructure and get the access credentials. Installation details can be found in the IM documentation.

APRICOT also requires a Jupyter installation, since it uses the Jupyter environment to run. It is compatible with Jupyter 4.x and 5.x versions.
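
The compatibility statement above can be turned into a small check. This is only a sketch: the function name is hypothetical, and which version string to pass is an assumption, since the text does not say which Jupyter package the 4.x and 5.x figures refer to.

```python
# Major versions taken from the compatibility statement above.
SUPPORTED_MAJOR_VERSIONS = (4, 5)


def is_supported_jupyter(version: str) -> bool:
    """Return True if the major component of `version` is 4 or 5.

    Hypothetical helper, not part of APRICOT.
    """
    major = version.strip().split(".", 1)[0]
    return major.isdigit() and int(major) in SUPPORTED_MAJOR_VERSIONS
```

For example, `is_supported_jupyter("5.7.8")` is true and `is_supported_jupyter("6.0.3")` is false.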

Installing

To install the APRICOT package, run the provided install script:

bash install.sh

Deployment

Infrastructure deployment with APRICOT is done through a notebook extension. The extension adds the button shown in the next image, which opens a GUI that guides the user step by step through the deployment process.

[Figure: APRICOT deployment button in the notebook]

The deployment process includes the following steps:

  • Select the infrastructure topology from a set of predefined infrastructures. At the moment, the predefined topologies are Batch-Cluster, MPI-Cluster and Advanced, which allows a custom configuration.
  • Select the cloud provider on which to deploy the infrastructure. The currently supported providers are OpenNebula and AWS.
  • Enter the access credentials required by the selected provider.
  • Specify the frontend and worker specifications, such as memory, CPUs, OS image, etc.
  • Set infrastructure features such as the maximum number of workers or the cluster identifier name.
  • Deploy the infrastructure.
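
The choices collected by these steps can be pictured as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical and are not APRICOT's internal representation. Only the topology and provider names come from the list above, and provider credentials are deliberately left out.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the step list above; everything else is hypothetical.
TOPOLOGIES = {"Batch-Cluster", "MPI-Cluster", "Advanced"}
PROVIDERS = {"OpenNebula", "AWS"}


@dataclass
class NodeSpec:
    """Resources requested for the frontend or for each worker."""
    memory_mb: int
    cpus: int
    image: str


@dataclass
class DeploymentRequest:
    """One set of answers to the wizard steps (credentials omitted)."""
    topology: str
    provider: str
    frontend: NodeSpec
    worker: NodeSpec
    max_workers: int
    cluster_name: str

    def problems(self) -> List[str]:
        """Return validation problems; an empty list means none were found."""
        found = []
        if self.topology not in TOPOLOGIES:
            found.append(f"unknown topology: {self.topology}")
        if self.provider not in PROVIDERS:
            found.append(f"unsupported provider: {self.provider}")
        if self.max_workers < 1:
            found.append("max_workers must be at least 1")
        if not self.cluster_name:
            found.append("cluster_name must not be empty")
        return found
```

A request whose `problems()` list is empty corresponds to a wizard run that is ready for the final "Deploy the infrastructure" step.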


If an error occurs during deployment, it is shown in the web console. The infrastructure configuration logs can also be retrieved with the implemented magic functions, or directly with the IM client from a terminal or from bash magics in the notebook.

Infrastructure management

To manage and use previously deployed infrastructures within the Jupyter notebook environment, a set of IPython magic functions has been implemented. These functions are listed below:

  • Magic lines:
    • apricot_log:
      • Arguments: Cluster name identifier
      • Returns: The configuration logs of specified cluster
    • apricot_ls: Takes no arguments and returns a list of all the clusters deployed with EC3. Internally, it runs ec3 list.
    • apricot_nodels:
      • Arguments: Cluster name identifier.
      • Returns: A list of the working nodes of the specified cluster and their status.
    • apricot_upload: Uploads the specified local files to the destination path on the specified cluster.
      • Arguments: Cluster name identifier, paths of the files to upload, destination path.
    • apricot_download: Downloads files from the specified cluster to local storage.
      • Arguments: Cluster name identifier, paths of the files to download, local destination path.
  • Magic line and cell:
    • apricot: Performs different tasks depending on the input command.
      • exec: Takes a cluster name identifier and a command to execute on the specified cluster. This call is synchronous.
      • execAsync: Same as exec, but the call is asynchronous.
      • list: Same as apricot_ls.
      • destroy: Takes a cluster name identifier as argument and destroys the infrastructure.

Like any Jupyter magics, these must be loaded in the notebook using %reload_ext apricot_magic. Alternatively, Jupyter can be configured to load them in all notebooks.
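
As a rough illustration of how the magics listed above might be driven from Python code in a notebook cell, the sketch below builds the argument lines and hands them to IPython. The helper names are hypothetical. The argument order follows the descriptions above, but the exact separator and quoting rules the magics expect are an assumption. Typing `%apricot exec mycluster hostname` directly in a cell would be the equivalent of `run_magic("apricot", exec_line("mycluster", "hostname"))`.

```python
from typing import List


def exec_line(cluster: str, command: str) -> str:
    """Argument line for the `apricot` magic with the exec command."""
    return f"exec {cluster} {command}"


def upload_line(cluster: str, files: List[str], destination: str) -> str:
    """Argument line for `apricot_upload`: cluster, files, destination.

    Assumes plain space-separated arguments.
    """
    return " ".join([cluster, *files, destination])


def run_magic(name: str, line: str):
    """Run a line magic by name; works only inside IPython or Jupyter."""
    from IPython import get_ipython  # imported lazily on purpose

    shell = get_ipython()
    if shell is None:
        raise RuntimeError("not running inside IPython/Jupyter")
    return shell.run_line_magic(name, line)
```

The magics still have to be loaded first with %reload_ext apricot_magic; the cluster name `mycluster` used above is a placeholder.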

Docker

A Dockerfile is provided to build a Docker image with Jupyter and APRICOT configured. Use

docker pull grycap/apricot

to pull the image. Then, use

docker run --publish 8888:8888 -it grycap/apricot

to create and run a container. The container automatically starts a Jupyter server with APRICOT preconfigured. Then, use the URL provided by Jupyter to access the server.

Licensing

APRICOT is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Contributors

vigial, antoniosanch3z, gmolto, micafer
