How To Make Your Data Analysis Notebooks More Reproducible

Slide deck | Slide deck as PDF

🎥 Video of talk at rstudio::conf(2019)

Resources

I have included a handful of links to papers, software packages and tutorials/manuals about some tools I mention in my talk. Pull requests or issues on additional ones to include are welcome.

Research Compendia

Examples of Research Compendia on GitHub

Below are a few links to real-world examples of research compendia in R. To have a minimal compendium, all you really need is a valid DESCRIPTION file containing a handful of fields such as the package name, type, version, and dependencies. See Marwick et al. 2017 for a detailed description of the different types of compendia.
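As a rough illustration of the minimal compendium described above, a DESCRIPTION file might look like the sketch below. The package name, title, description, license, and listed dependencies are all hypothetical placeholders, not taken from any of the linked compendia. In R's DESCRIPTION format the name goes in the Package field and dependencies usually go in Imports.

```
Package: mycompendium
Type: Package
Title: Research Compendium for an Example Analysis
Version: 0.1.0
Description: Data, code, and notebooks supporting an example analysis.
License: MIT + file LICENSE
Imports:
    dplyr,
    ggplot2
```

Listing dependencies here, rather than only in scattered library() calls, lets tools such as devtools install everything the analysis needs in one step.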

Small

Medium

Large

Software packages related to research compendia

  • 📦 rrtools by Ben Marwick (also the author of the packaging data analysis paper mentioned above) extends functions in devtools and provides instructions, templates, and functions to make a basic compendium suitable for doing reproducible research with R.
    • Also see 📦 workflowr by John Blischak and the task view on R-based data analysis projects maintained by John Blischak, Anna Krystalli, Ben Marwick, and Daniel Nüst.
  • 📦 usethis: Many of the major functions in rrtools are imported from usethis. A savvy user can get by setting up and maintaining a compendium purely with usethis functions.
  • 📦 goodpractice: Designed to help you build more robust packages, this package does a deep dive on your package contents, provides advice on syntax pitfalls to avoid, offers code formatting suggestions, and helps you improve overall package structure.
  • The 📦 rticles package by JJ has numerous journal templates and can be used together with RStudio addins like wordcountaddin and citr + knitcitations.

📈 Data management

  • 📦 piggyback [docs]: This clever R package allows you to attach arbitrary data (or other) files (up to 2 GB each) to a GitHub release. Given GitHub's fast CDN, this is an easy way to attach large files to a compendium and read them back in a local, collaborator, or remote environment. As always, be sure to archive a long-term copy on Zenodo.
  • 📦 arkdb [docs]: This package allows you to archive and unarchive databases as flat text files.
  • 🎥 For more on setting up data packages, see this excellent talk by Noam Ross at New York R.

Computational environments: Binder and friends

Other hosted Binder hubs

Setting up Binder for your analysis

I have captured all the various ways to set up mybinder with an R project in a separate document.

Are you interested in setting up or hosting a binderhub for the R community? Get in touch via the issues.

Also see

Software packages related to setting up computational environments

  • 📦 containerit (detailed blog post): This sweet package will generate a Dockerfile for you by examining the code inside a folder or just from your session info. This is analogous to repo2docker but is very R-centric.
  • stevedore: Although there are a few Docker clients (docker, harbor), this is my recommendation for managing Docker containers from inside R.

🔨 Workflows: drake and friends

  • 📦 drake: An R-focused pipeline toolkit for reproducibility and high-performance computing. Install the package from here or CRAN.
  • The prequel to the drake R package: A blog post by the creator of drake describing his motivation for the package.
  • drake manual: A detailed bookdown guide on how to set up and use drake for projects of varying levels of complexity.
  • Presentation on drake: Slides from a talk by Will Landau (who is here at the conference, so go pick his brain if you want to learn more!).
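To give a feel for what a drake pipeline looks like, here is a minimal sketch, assuming drake is installed from CRAN. The target names (raw, fit, slope) are hypothetical, and the built-in mtcars dataset stands in for real project data. It is an illustration only, not a workflow taken from the talk.

```r
library(drake)

# A plan is a data frame of targets and the commands that build them.
# Target names here are hypothetical; mtcars stands in for real data.
plan <- drake_plan(
  raw = mtcars,
  fit = lm(mpg ~ wt, data = raw),
  slope = coef(fit)[["wt"]]
)

# Build every target in dependency order.
# Results are stored in a .drake cache in the working directory.
make(plan)

# Read a finished target back from the cache.
slope <- readd(slope)
```

Running make(plan) a second time without changing any commands should skip targets that are already up to date, which is the main reason to express an analysis as a plan rather than a single script.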

Real world drake examples

Miscellaneous


Acknowledgments

Many thanks to Chris Holdgraf, Carl Boettiger, Will Landau, and Ben Marwick for various discussions on these topics. Also thanks to Ciera Martinez, Kara Woo, and Nick Tierney for comments on the presentation.
