docker-why

Quick notes about why an R user would use Docker, jotted down as I read and thought about responses to this tweet:

Has anyone in the #rstats world written really well about the why of their use of Docker, as opposed to the how?

This does NOT yet go exhaustively through all the replies, make proper links, etc. Mea culpa.

Note: I say "Docker" but perhaps I mean "Linux Containers". I think Docker/DockerHub is to Linux Containers as Git/GitHub is to distributed version control. It's not the only game in town, but it's the most prevalent one today. To push the metaphor even further, Singularity seems to be the GitLab of Linux Containers.

Main categories

  • Educator: Expert, opinionated R user wants to provide a very specific R environment to other people. Often coupled with using the cloud.

    • Carl B: "I can point them to a URL with RStudio+tidyverse etc ready-to-go on running on an Amazon machine, a XSEDE allocation, or campus cluster". Via DM.
    • Mine and Colin's Duke teaching setup. Link to PeerJ preprint.
    • Steph Locke's teaching setup. blog post.
  • Research/Analysis team: Similar to the Educator approach, a research team wants to share a common environment with the same software (and versions thereof), data sets, etc., and deploy/scale this environment as needed.

  • R Package Developer

    • You have OS A but need to test/debug on OS B (could be a different version of A or an entirely different OS). For certain combinations of A and B, Docker is a way to drop into a running version of B on A. Contrast this with the autopsy reports you get from R CMD check on CRAN, R-hub, Travis CI, AppVeyor. BrodieG example in reply to tweet. Sean Kross blog post. Jim H draft blog post.
      • Note that many of the services mentioned, such as Travis CI and R-hub, are themselves using containers in order to build and test your project on a wide variety of systems.
    • Your normal development setup is A but you need to test/debug with setup B. Actually toggling back and forth is a PITA; it's nicer to use Docker for the setup you use less frequently. Jim H's clang with sanitizers example.
    • You are working in a group of developers, each specialized in some part of a complicated stack necessary to deliver Thing. The different sub-stacks can be containerized to allow everyone to innovate in their domain, to spare everyone from understanding and maintaining everything, and to test Thing when you combine different versions of the sub-components. Rich FitzJohn example via Slack.
  • User of the cloud: Certain things you do require more compute than you have, so you'll pay for compute as needed. Docker is ?the? main way to record and set up the environment you need to be in on this rented "hardware". So you work locally in Docker to interactively develop Thing and then just transfer it to the cloud when it's time to scale up. Mark Edmondson blog post.

  • Nomad: Your computer(s) is not a physical thing, but rather one or more Docker files. You can instantiate any of your usual computing environments from an internet cafe in the middle of nowhere.

  • Researcher who values computational reproducibility: You want to document and share an analysis, with a guarantee that it can be repeated. This is an emerging best practice for scientific publication and some industries, e.g. pharma, may do this to meet regulatory requirements. Therefore, you decide to capture the entire computational environment. Contrast this with the use of packrat or checkpoint to specifically manage the R packages used. This is the use case that has gotten the most attention from a "why?" point of view. Lots of links in the tweet thread.

  • Other: wrathematics example of HPC and large scale I/O, tweet.
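
As a rough illustration of what "capture the entire computational environment" can look like in practice, here is a minimal Dockerfile sketch. It assumes the rocker/r-ver image from the Rocker project as a base. The R version tag, the package list, and the script name analysis.R are placeholders chosen for illustration, not anything prescribed by the sources above.

```dockerfile
# Assumption: a versioned base image from the Rocker project, pinning the R version.
FROM rocker/r-ver:4.3.1

# Assumption: the analysis needs these two packages; swap in your own list.
RUN R -e "install.packages(c('dplyr', 'ggplot2'))"

# Hypothetical script name; copy the analysis into the image.
WORKDIR /home/analysis
COPY analysis.R analysis.R

# Rerun the analysis whenever a container is started from this image.
CMD ["Rscript", "analysis.R"]
```

The same file can serve several of the categories above: an educator or research team shares it so everyone gets the same setup, and a cloud user builds it locally before running it on rented compute. Note that install.packages() here does not pin package versions by itself, which is where tools such as packrat or checkpoint could complement the image.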

Connection to other workflows

something about Docker as version control for your computational setup

Further reading

Lots of excellent resources exist for using Docker. Obviously I feel they cover how much better than why, hence my question. Good leads for getting started with Docker:

  • PRs for links very welcome here! The tweet thread has many!

Contributors

cpsievert, eddelbuettel, jennybc, nfultz, noamross
