
sandpaper's Introduction

{sandpaper}: User Interface to The Carpentries Workbench


The {sandpaper} package was created by The Carpentries to re-imagine our method of creating lesson websites for our workshops. This package will take a series of Markdown or RMarkdown files and generate a static website with the features and styling of The Carpentries lessons including customized layouts and callout blocks. Much of the functionality in this package is inspired by Jenny Bryan’s work with the {usethis} package.

Documentation

Want to know how this works in a lesson format? Head over to https://carpentries.github.io/sandpaper-docs/.

If, instead, you already know how a lesson is built and are interested in understanding how the functions in {sandpaper} work, you can visit this package documentation site at https://carpentries.github.io/sandpaper/.

Installation

{sandpaper} is not currently on CRAN, but it can be installed from our Carpentries Universe (updated every hour) with the following commands:

options(repos = c(
  carpentries = "https://carpentries.r-universe.dev/", 
  CRAN = "https://cran.rstudio.com/"
))
install.packages("sandpaper", dep = TRUE)

Note that this will also install development versions of the following packages:

package      What it does
{varnish}    html, css, and javascript templates for The Carpentries (in progress)
{tinkr}      manipulation of knitr markdown documents built on the commonmark xml library
{pegboard}   programmatic interface to lesson components for validation (in progress)

Design

This package is designed to make the life of the lesson contributors and maintainers easier by separating the tools needed to build the site from the user-defined content of the site itself. It will no longer rely on Jekyll or any of the other >450 static site generators, but instead rely on R, RStudio, and {pkgdown} to generate a site with the following features:

Rendering locally

diagram of three folders. The first folder, "episodes/", labelled as RMarkdown, has an arrow (labelled as hash episodes) pointing to "site/built/", labelled as Markdown. The Markdown folder has an arrow (labelled as "apply template") pointing to "site/docs/", labelled as "HTML". The first folder is labelled in pale yellow, indicating that it is the only one tracked by git.

The local two-step model of deployment into local folders

In a repository generated via {sandpaper}, only the source is committed to avoid issues surrounding out-of-date artefacts and directory structure confusion.

The website is generated in two steps:

  1. Markdown files are rendered from the source files. Each rendered file records a hash of its source file, so it only needs to be re-rendered when the source changes.
  2. HTML files are generated for the preview from the rendered markdown files and the CSS and JS sources in the {varnish} package.

To ensure there are no clashes between minor differences in the user setup, no artifacts are committed to the main branch of the repository. Because of the caching mechanism between the website and the rendered markdown files, long-running lessons can be updated and previewed quickly.
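The caching idea behind step 1 can be sketched in base R. This is only an illustration under stated assumptions, not how {sandpaper} implements it: the helper name needs_rebuild() and the idea of keeping the recorded hashes in a named character vector are invented for this example. Only tools::md5sum() is an existing base R function.

```r
# Hypothetical helper (not part of {sandpaper}): report which source files
# must be re-rendered by comparing their current MD5 hashes to recorded ones.
needs_rebuild <- function(sources, recorded) {
  current <- tools::md5sum(sources)   # named vector: path -> hash
  old <- recorded[names(current)]     # NA where no hash was recorded
  changed <- is.na(old) | old != current
  names(current)[changed]
}

# Two temporary files stand in for episodes
dir <- tempfile("episodes")
dir.create(dir)
a <- file.path(dir, "01-intro.Rmd")
b <- file.path(dir, "02-data.Rmd")
writeLines("# Intro", a)
writeLines("# Data", b)

recorded <- tools::md5sum(c(a, b))    # hashes stored after the first build
writeLines("# Data (edited)", b)      # only the second file changes
stale <- needs_rebuild(c(a, b), recorded)
print(basename(stale))
```

Only the edited file is reported, which is why a lesson with long-running episodes can be previewed quickly after a small change.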

Rendering on continuous integration

Diagrammatic representation of the GitHub deployment cycle showing four branches: gh-pages, md-outputs, main, and my-edit. The my-edit branch is a direct descendant of the main branch, while the gh-pages and md-outputs branches are orphans. Each commit of the main branch has a process, represented by a dashed arrow, that builds a commit of the subsequent orphan branches.

Two-step deployment model on continuous integration

Continuous integration will act as the single source-of-truth for how the outputs of the lessons are rendered. For this, we want the resulting website to be:

  • CI agnostic (but currently set up with GitHub)
  • easy to set up
  • auditable (e.g. I can see changes between the content of two commits)
  • versionable (e.g. I can instruct learners to go to <WEBSITE>/1.1. This is inspired from the python documentation style)

To achieve this, two branches will be created, md-outputs and gh-pages, which will inherit like so: main -> md-outputs -> gh-pages. Because the build from main to md-outputs can be time intensive, it will default to updating only the files that were changed.

  • md-outputs: this branch will contain the files and artifacts generated from rmarkdown in the vignettes directory of a thin package skeleton.
  • gh-pages: this branch is generated via md-outputs and bundles the html, css, and js for the website. This will contain a single index.html file with several subfolders with different versions of the site. The index.html file will redirect to the current/ directory, which contains the up-to-date site.

Scheduled builds

  • gh-pages website: Because we are designing the lessons to have content separated from the styling, we will set up the CI to generate the webpage from the pre-built sources on a weekly basis, which will check if there has been an update to the styles (which I have in the {varnish} package) and then rebuild the site without rebuilding the content.
  • md-outputs branch: This will be rerun every month from scratch with the most recent version of R and R packages. If there is a change, a pull request can be generated to update the renv.lock file with a link to the changed markdown files in this branch.

Function syntax

The functions in {sandpaper} have the following prefixes:

  • create_ will create/amend files or folders in your workspace
  • update_ will update build resources in the lesson
  • build_ will build files from your source
  • check_ validates either the elements of the lesson and/or episodes
  • fetch_ will download files or resources from the internet
  • reset_ removes files or information
  • get_ will retrieve information from your source files as an R object
  • set_ will update information in files.
  • ci_ interacts with continuous integration to build the website

Here is a working list of user-facing functions:

Lesson and Episode Creation

  • create_lesson() creates a lesson from scratch
  • create_episode() creates a new episode with the correct number prefix
  • create_dataset() creates a csv or text data set from an R object
  • set_episodes() arranges the episodes in a user-specified order

Accessors

  • get_config() reads the contents of config.yaml as a list
  • get_drafts() reports files that are not listed in config.yaml
  • get_episodes() returns the episode filenames as a vector
  • get_syllabus() returns the syllabus with timings, titles, and questions

Website Creation and Validation

  • check_lesson() checks and validates the source files and lesson structure
  • build_episode_md() renders an individual file to markdown (internal use)
  • build_episode_html() renders a built markdown file to html (internal use)
  • build_lesson() builds the lesson into a static website
  • build_portable_lesson() builds the lesson into a portable static website
  • fetch_lesson() fetches the static website from the lesson repository

Continuous Integration Utilities

  • ci_deploy() builds and deploys the lesson on CI from the source files
  • ci_build_markdown() builds the markdown files on CI from the source and deploys them to the markdown branch.
  • ci_build_site() deploys the lesson on CI from pre-rendered markdown files
  • ci_release() builds and deploys the lesson on CI from the source files and adds a release tag
  • update_github_workflows() updates GitHub workflows

Cleanup

  • reset_episodes() removes the schedule from the config.yaml file
  • reset_site() clears the website and cache

Usage

There are five use-cases for {sandpaper}:

  1. Creating lessons
  2. Contributing to lessons
  3. Maintaining lessons
  4. Rendering a portable site
  5. Rendering a site with GitHub actions.

Creating a lesson

To create a lesson with {sandpaper}, use the create_lesson() function:

sandpaper::create_lesson("~/Desktop/r-intermediate-penguins")

This will create a folder on your desktop called r-intermediate-penguins with the following structure:

|-- .gitignore               # - Ignore everything in the site/ folder
|-- .github/                 # - Scripts used for continuous integration
|   `-- workflows/           #
|       |-- deploy-site.yaml # -   Build the source files on github pages
|       |-- build-md.yaml    # -   Build the markdown files on github pages
|       `-- cron.yaml        # -   reset package cache and test
|-- episodes/                # - PUT YOUR MARKDOWN FILES IN THIS FOLDER
|   |-- data/                # -   Data for your lesson goes here
|   |-- figures/             # -   All static figures and diagrams are here
|   |-- files/               # -   Additional files (e.g. handouts) 
|   `-- 00-introduction.Rmd  # -   Lessons start with a two-digit number
|-- instructors/             # - Information for Instructors
|-- learners/                # - Information for Learners
|   `-- setup.md             # -   setup instructions (REQUIRED)
|-- profiles/                # - Learner and/or Instructor Profiles
|-- site/                    # - This folder is where the rendered markdown files and static site will live
|   `-- README.md            # -   placeholder
|-- config.yaml              # - Use this to configure commonly used variables
|-- CONTRIBUTING.md          # - Carpentries Rules for Contributions (REQUIRED)
|-- CODE_OF_CONDUCT.md       # - Carpentries Code of Conduct (REQUIRED)
|-- LICENSE.md               # - Carpentries Licenses (REQUIRED)
`-- README.md                # - Introduces folks how to use this lesson and where they can find more information.

Once you have your site set up, you can add your RMarkdown files in the episodes folder. By default, they will be built in alphabetical order, but you can use the set_episodes() command to build the schedule in your config.yaml file:

s <- sandpaper::get_episodes()
sandpaper::set_episodes(order = s, write = TRUE)

When you want to preview your site, use the following:

sandpaper::build_lesson()

Working in RStudio?

If you are using RStudio, you can preview the lesson site using the keyboard shortcut ctrl + shift + B (which corresponds to the “Build Website” button in the “Build” tab). To preview individual files, you can use ctrl + shift + K (which corresponds to the “Knit” button in the editor pane).

This will create the website structure inside of the site/ folder, render the RMarkdown files to markdown (for inspection and quick rendering), render the markdown files to HTML, and then enable a preview within your browser window.

Contributing to a Lesson

To contribute to a lesson, you can either fork the lesson to your own repository and clone it to your computer manually from GitHub, or you can use the {usethis} package to automate it. For example, this is how you can create a copy of the R for Reproducible Scientific Analysis lesson on your computer’s Desktop:

usethis::create_from_github(
  repo = "swcarpentry/r-novice-gapminder",
  destdir = "~/Desktop", # parent folder; the clone lands in a sub-folder named after the repo
  fork = TRUE
)

This will copy all of the source files to your computer and move you to the directory.

Note that the rendered website will not be immediately available. To download the site as it currently appears on the web, use:

sandpaper::fetch_lesson(markdown = TRUE, site = TRUE)

This will download the site and the rendered markdown files into the site/ folder. To save bandwidth, you can choose to download only the markdown files and artifacts by setting site = FALSE. Now, you can edit the RMarkdown files in episodes/ and quickly render the site.

To upload changes to the lesson repository, you can use the follow

Maintaining a Lesson

When you are maintaining a lesson, there is a high likelihood that you will already have a copy on your machine. If not, follow the instructions in the contributing to a lesson section above.

The typical workflow will look like this:

  1. open the sandpaper project in RStudio and make edits to files in the episodes/ folder
  2. in the R console run the following
sandpaper::check_lesson() # validates the structure of the input files
sandpaper::build_lesson() # builds and validates lesson

Rendering a portable site

To render a portable site, you can follow the instructions for contributing to a lesson or maintaining a lesson to set up. Once you have the lesson set up, you can use the following command:

sandpaper::build_portable_lesson(version = "current")

This will render a fully portable lesson site as a zip file in the site/ folder. You can distribute this lesson to learners who do not have reliable internet access for use offline without sacrificing any of the styling.

Rendering with GitHub actions

Ultimately, there should be a minimal number of functions that handle this situation because writing CI configuration files is maddening. The most straightforward function is:

sandpaper::ci_deploy(md_branch = "md-outputs", site_branch = "gh-pages")

This function will create git worktrees for the orphan md-outputs branch in the site/built folder and the orphan gh-pages branch in the site/docs folder. After that, we generate the site as normal.

Because css and js libraries may need updating before any lesson material does, a step can be created just for rebuilding the site that uses:

sandpaper::ci_build_site(branch = "gh-pages")

When a lesson is given a release, the current site folder needs to be duplicated to a versioned folder and a tag needs to be added to the md-outputs branch:

sandpaper::ci_release(tag = "0.1", md_branch = "md-outputs", site_branch = "gh-pages")

sandpaper's People

Contributors

astrodimitrios, bencomp, bisaloo, brownsarahm, cpauvert, erinbecker, fmichonneau, froggleston, jcolomb, joelnitta, klbarnes20, maelle, milanmlft, olexandr-konovalov, tobyhodges, yabellini, zkamvar


sandpaper's Issues

Pandoc extension audit

autolink_bare_uris and emoji need to be added to the list:

construct_pandoc_args <- function(path_in, output, to = "html", ...) {
  # pandoc extensions currently enabled, joined as "ext1+ext2+..."
  exts <- paste(
    "smart",
    "auto_identifiers",
    "tex_math_dollars",
    "tex_math_single_backslash",
    "markdown_in_html_blocks",
    "yaml_metadata_block",
    "header_attributes",
    "native_divs",
    sep = "+"
  )
  # (the remainder of the function is not shown in this issue)
}
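As a rough sketch of what the audit asks for, the extension string could be built from a character vector so that additions are easy to review. The variable names and the "markdown" input format below are assumptions made for illustration, and this is not the package's actual code. The extension names themselves are the ones listed in this issue.

```r
# Extensions listed in the issue, plus the two proposed additions.
pandoc_exts <- c(
  "smart", "auto_identifiers", "tex_math_dollars",
  "tex_math_single_backslash", "markdown_in_html_blocks",
  "yaml_metadata_block", "header_attributes", "native_divs",
  "autolink_bare_uris", "emoji" # proposed additions
)

# Collapse into the "ext1+ext2" form that pandoc accepts after a format name
exts <- paste(pandoc_exts, collapse = "+")

# Assumption: the input format is plain "markdown"
from <- paste0("markdown+", exts)
print(from)
```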

GitHub Actions: consider centralizing workflows

One of the main things that's currently painful in the lesson template is the way it's updated: via pull request.

I've been able to split the components for updating it into two packages, {sandpaper} for the engine and {varnish} for the styling, but there is still a third component, deployment, which is handled by the GitHub Actions workflows.

In our current mode of thinking, we are duplicating the actions across the repositories, but that's problematic because it means that they can be modified, and we want to avoid that because we know that they will need to be updated in the future. This comment outlines the issue quite well: https://github.community/t/being-dry-centralized-workflows/16548

One solution is creating a stand-alone Composite action: https://docs.github.com/en/free-pro-team@latest/actions/creating-actions/creating-a-composite-run-steps-action

The requirement for this is to make sure that these actions do not require an extra secret to be added to the repos (or maybe that's not so bad with the tradeoff of having smaller actions).

Feature: incorporate {learnr} lessons as part of the style

The {learnr} package provides interactive lessons with answer-checking mechanisms that run in a shiny environment. They are written as RMarkdown documents that render to a learnr::tutorial document. We might want to consider a method for compiling the episodes into a package serving learnr tutorials from the episodes directory.

Things to consider (based on initial run):

{learnr} is agnostic of {dovetail}, so both will work well together 💯

knitr::knit(tutorial) does not work for learnr documents

knitr::knit("lrrrrn/lrrrrn.Rmd")


processing file: lrrrrn/lrrrrn.Rmd
  |......                                                                                      |   7%
  ordinary text without R code

  |............                                                                                |  13%
label: setup (with options) 
List of 1
 $ include: logi FALSE

Quitting from lines 8-11 (lrrrrn/lrrrrn.Rmd) 
Error: package or namespace load failed for 'learnr':
 .onAttach failed in attachNamespace() for 'learnr', details:
  call: NULL
  error: The shiny_prerendered_chunk function can only be called from within runtime: shiny_prerendered

rmarkdown::render(tutorial) does not work out of the box.

It produces a markdown file that has a yaml header and nothing else. It also inserts a dput_to_string character vector into my global namespace. I am going to assume that there is a mechanism for building this and I just haven't read it (I didn't read the manual very much at all)

Breaking Change: Rethink Lesson Folder Organization

At the moment, the folders in the lesson are all organized underneath the episodes folder. This is.... not optimal because it breaks the relationship between the folder structure you see in the repository with the folder structure that you see on the website. Ideally, you would want to have the folder structure reflect the structure of the website menu.

All of this can change, but it would be a good idea to make sure that we can switch between folder structures (heavily related to #19).

At the moment, it would be good to have a similar structure to what we have for the current template with the flexibility and notion that it could change.

I think the idea behind this is that the folders could look like this:

./
|-- CONDUCT.md
|-- SETUP.md
|-- episodes/
|-- extras/
|-- something/ 
|   `-- hidden/
`-- LICENSE.md

The issue with this is how you tell {sandpaper} that this is the configuration you want without getting too in the weeds with yaml files. Why, you ask? Because I want to avoid behemoth config files that look like this: https://github.com/grunwaldlab/Population_Genetics_in_R/blob/master/_site.yml

It might be possible to do something like this, but it might be too complex 😬

- episodes:
    path: episodes/
    order:
    - introduction.md
    - getting-to-know-vectors.Rmd
- extras:
    path: extras/
    order:
    - factors.Rmd
    - classes.Rmd
- hidden:
    path: something/hidden/
    order:
    - hidden-extra.md

CRON GitHub Action to update R packages used

For R-based lessons, the packages will be kept in check via {renv}. To avoid simply pinning the repository to a 2021 version of R for the rest of time, we will have a monthly action that will update the packages used by {renv} and create a pull request with the changes to the {renv} lockfile.

This will trigger the PR mechanism that will give a preview of the markdown diff due to the changes in the packages. If they are acceptable, the maintainer can merge and everything will be updated.

If the updates cause adverse changes, the maintainer can change the content of the RMarkdown sources in that branch so that they are acceptable.

Figure out a better way to translate to markdown

At the moment, we still have a weird case where we have to trick {pkgdown} into building the site by writing markdown files and Rmd files. This is largely fine, except when there are examples of how to write knitr documents, then it tries to render the examples and things go goofy 🙁

I've tried the trick of putting the lessons into the top of the site/ directory, but the problem is that the div tags are screwed up from pandoc somehow 🙁 and truncated early.

Code Handout

Overview

Extract content from the challenge blocks and translate them to code blocks for each episode.

A concept has been applied to Data Carpentry R Ecology lesson: https://github.com/datacarpentry/R-ecology-lesson/blob/main/make_code_handout.R, but it extracts EVERYTHING </Gary Oldman Voice>.

each exercise should be in its own file

Learners have a document that they can fill out as they go along (see the R ecology DC lesson). This also potentially allows new usage as “magic cells” and code slides.

Magic cells

This is a concept in Jupyter notebooks that allows people to load code into a cell using a single command.

Dr. Sarah Brown presented a poster on magic cells during CarpentryCon 2018; she also provided a repository containing examples.

Time of extraction

These should be created automagically on lesson rendering via pandoc lua filters.

Make sure stale md files are removed.

At the moment, {sandpaper} may not be clearing out old markdown files unless reset_site() is called. AFAICT, it should be doing this by default

Note: I've removed a test file that does show up on the website, but is not included in the diff for the markdown outputs. I think this is an error in {sandpaper} where invalid outputs are not cleared away.

Originally posted by @ravmakz in zkamvar/testme#6 (comment)

Document Local Workflow

The local workflow for a lesson maintainer or contributor should include:

  1. pulling the repository
  2. adding content
  3. previewing content
  4. creating a pull request

Fix schedule manipulation

At the moment, {sandpaper} automatically creates a schedule in the YAML for the template lesson materials, which creates frustration if you just want to drop in your file-ordered episodes and go.

Things to do:

  • use get_source_files() when the schedule is NULL or blank
  • don't auto-write the schedule on lesson creation
  • add option for create_episode() to add to the schedule and set it to FALSE
  • make sure options in the yaml are not over-written by rescheduling (this is an artifact of the way yaml::read_yaml() parses the files)
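The first item in the list above could look roughly like the following. This is a sketch under assumptions: resolve_schedule() is a hypothetical name, and list.files() stands in for get_source_files(), whose actual behaviour is not described in this issue.

```r
# Hypothetical helper: use the configured schedule when there is one,
# otherwise fall back to the source files in alphabetical order.
resolve_schedule <- function(schedule, episode_dir) {
  empty <- is.null(schedule) || length(schedule) == 0 || all(!nzchar(schedule))
  if (empty) {
    # list.files() returns file names sorted alphabetically
    return(list.files(episode_dir, pattern = "\\.R?md$", ignore.case = TRUE))
  }
  schedule
}

# Two empty files stand in for episodes
dir <- tempfile("episodes")
dir.create(dir)
invisible(file.create(file.path(dir, c("02-data.Rmd", "01-intro.md"))))

from_files <- resolve_schedule(NULL, dir)
from_config <- resolve_schedule(c("02-data.Rmd", "01-intro.md"), dir)
print(from_files)
print(from_config)
```

With no schedule, the episodes come back in file order. With a schedule, the configured order is returned untouched and nothing is written to the YAML.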

rename functions to align with concepts

I'm wondering if build_lesson() should be renamed build_site() so it matches with reset_site()? But I guess it would clash with {pkgdown} and {blogdown} function names? If we call reset_site() reset_lesson(), users might be confused about what that would do (reset content of the lesson files?). It seems that having build_* and reset_* use the same suffix would be useful though...

Feature: build pages with different levels of hierarchy

At the moment, it's not quite possible to build pages that are not in the top level of the site because linking to them gets muddled in the template.

Case in point: zkamvar/glowing-chainsaw@bafd045 attempts to build new pages inside of an episode, but runs into problems because:

  1. It cannot build to the site because that has not been initialized at the time the page is generated
  2. building to a subfolder of episodes does not link the back and forward links properly because it assumes that they are on the same level.

Standardize setup page

From our discussion in maintainers meeting.

We thought it would be good to have sections on the setup page for both the data download part and the software install part of setup.

Document PR workflow

The PR workflow has some extra features that require some documentation for the maintainers and contributors to understand what to expect:

  1. Source of truth is in the main branch of the repository
  2. PRs will have an extra comment that shows the output of the code.
  3. Describe checks that will be performed on the HTML and the markdown

Feature: include controls for dependency management

At the moment, we are thinking of using {renv} for dependency management, but there is the question of exactly how much we want to be responsible for in terms of supporting languages.

For example, if we are supporting an R lesson, we have the correct solution with {renv}, but if we wanted to support the environment for a python lesson, then it becomes trickier because of the relationship between {renv} and {reticulate} (see rstudio/renv#543 and rstudio/renv#537).

We know that BASH works and we are fairly confident that SQL works, but the question is: how do we ensure that these lessons can have auto-generated output should the lesson maintainer choose to do so?

BUG: set_*() functions modify existing yaml lists

I introduced set_episodes() and set_instructors() and set_learners() to allow people to configure the order of their episodes in the lesson.... well, apparently, using set_episodes() and then set_learners() forces the yaml list to collapse into a comma separated list 🤦

# Navigation ------------------------------------------------
# The menu bar will be in the following order:
# 
# - Code of Conduct      (CODE_OF_CONDUCT.md)
# - Setup                (Setup.md)
# - Episodes             (episodes/)
# - Learner Resources    (learners/)
# - Instructor Resources (instructors/)
# - Learner Profiles     (profiles/)
# - License              (LICENSE.md)
# 
# Use the following menu items to specify the order of
# individual pages in each dropdown section

# Order of episodes in your lesson 
episodes: 
- 01-starting-with-data.Rmd
... SNIP ...
- 15-supp-loops-in-depth.Rmd


# Information for Learners
learners: setup.md,reference.md

Lesson validation

Valid episodes should have the following attributes

  • YAML header:
    • title: string
    • teaching: number
    • exercises: number
  • Blocks:
    • Objectives
    • Exercises
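The attributes above could be checked along these lines. This is a sketch under assumptions, not the validation that {sandpaper} or {pegboard} performs: validate_episode() is a hypothetical name, the header is assumed to be already parsed into a list, and the block names are assumed to be available as a character vector.

```r
# Hypothetical validator: `header` is the parsed YAML header as a list and
# `blocks` holds the names of the blocks found in the episode.
validate_episode <- function(header, blocks) {
  is_string <- function(x) is.character(x) && length(x) == 1 && nzchar(x)
  is_number <- function(x) is.numeric(x) && length(x) == 1 && !is.na(x)

  problems <- character(0)
  if (!is_string(header$title)) {
    problems <- c(problems, "title must be a string")
  }
  if (!is_number(header$teaching)) {
    problems <- c(problems, "teaching must be a number")
  }
  if (!is_number(header$exercises)) {
    problems <- c(problems, "exercises must be a number")
  }

  # Required blocks, compared case-insensitively
  missing_blocks <- setdiff(c("objectives", "exercises"), tolower(blocks))
  c(problems, sprintf("missing block: %s", missing_blocks))
}

good <- validate_episode(
  list(title = "Intro", teaching = 10, exercises = 5),
  c("Objectives", "Exercises")
)
bad <- validate_episode(
  list(title = "Intro", teaching = "ten"),
  "objectives"
)
print(good)
print(bad)
```

Returning a vector of problems, rather than stopping at the first one, lets a maintainer see everything that needs fixing in one pass.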

Lua filter for Objectives block does not handle section titles or immediate following paragraph

The following patterns give odd results for the current lua filters:

:::::::::::::::: objectives ::::::::::::::::::::

- Explain how to use markdown with the new lesson template
- Demonstrate how to include pieces of code, figures, and nested challenge blocks

::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::: questions :::::::::::::::::::::

- HOW do you write a lesson using RMarkdown and `{sandpaper}`?

::::::::::::::::::::::::::::::::::::::::::::::::

Here's a fun paragraph that should not be included with the previous blocks.

screenshot showing the objectives block with the last paragraph included within the previous block

:::::::::::::::: objectives ::::::::::::::::::::

## Things to consider

- Explain how to use markdown with the new lesson template
- Demonstrate how to include pieces of code, figures, and nested challenge blocks

::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::: questions :::::::::::::::::::::

## Questions to ponder

- HOW do you write a lesson using RMarkdown and `{sandpaper}`?

::::::::::::::::::::::::::::::::::::::::::::::::

screenshot showing a malformed objectives block that contains two headers for questions and objectives

GitHub Actions: Translate warnings and errors to workflow commands

https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#setting-a-warning-message

I've found it helpful in pull requests when warnings and errors label the lines where an error is found. This is achieved in GitHub Actions via workflow commands that are strings output to stdout that start with ::.

For example, this warning was generated from this workflow command:

::warning file=episodes/01-introduction.Rmd,line=10,col=5::This is a warning
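A small formatter is enough to produce such strings from R. The helper below is a hypothetical sketch (gha_annotation() is not an existing {sandpaper} function) and it does not escape special characters such as newlines or percent signs in the message, which a real implementation would need to handle.

```r
# Hypothetical helper: format a GitHub Actions workflow command that
# annotates a file position with a warning or an error.
gha_annotation <- function(message, file, line, col,
                           level = c("warning", "error")) {
  level <- match.arg(level)
  sprintf(
    "::%s file=%s,line=%d,col=%d::%s",
    level, file, as.integer(line), as.integer(col), message
  )
}

cmd <- gha_annotation(
  "This is a warning",
  file = "episodes/01-introduction.Rmd",
  line = 10,
  col = 5
)
# Workflow commands are picked up when written to stdout
writeLines(cmd)
```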

Confirm this works on windows without RStudio.

Greg Wilson pointed out that we should make sure we don't force users to have RStudio if they don't want it. This means that the setup should work on Windows with a text editor like VSCode or Emacs.

Ensure versions of sandpaper and varnish can be pinned

One thing that was brought up by @tobyhodges is the concept that people like to be able to tweak the template. At the moment, this is not explicitly possible, but it is possible without too much effort by piggybacking on {pkgdown}'s approach to allowing custom templates a la tidytemplate: https://tidytemplate.tidyverse.org/ (which is exactly what we did with {varnish}).

Another aspect is the fact that some instructors like to tweak the lessons by forking the official lessons. We want to make sure that they don't have to go through any changes to the site if they don't want to.

Both of these aspects can be achieved by introducing a mechanism by which users can lock the versions of sandpaper and varnish needed to build the site.

Make sure lessons with empty sites generate the sites first

Problem with the example when the site dir hasn't been populated:

usethis::create_from_github("zkamvar/miniature-octo-waffle", tempdir())
#> ✔ Creating '/tmp/RtmpkspjdR/miniature-octo-waffle/'
#> ✔ Cloning repo from 'git@github.com:zkamvar/miniature-octo-waffle.git' into '/tmp/RtmpkspjdR/miniature-octo-waffle'
#> ✔ Setting active project to '/tmp/RtmpkspjdR/miniature-octo-waffle'
#> ✔ Setting active project to '<no active project>'
sandpaper::build_lesson(file.path(tempdir(), 'miniature-octo-waffle'))
#> Error: [ENOENT] Failed to search directory '/tmp/RtmpkspjdR/miniature-octo-waffle/site/built': no such file or directory

Created on 2020-08-28 by the reprex package (v0.3.0)

Create GitHub Action for building and review on PR

This GitHub action will be the entry point into many of the lessons. The idea is that the workflow would look like this (EDIT to reflect #34 (comment)):

  1. a pull request is created
  2. the site is built using sandpaper::build_lesson()
  3. the site is deployed from the artifact on netlify or aws
  4. the diff of the built/ directory is stored as an artifact for the PR (Note: we will need to have a way for maintainer to re-trigger these builds in case the artifacts get too old and die)
  5. the link to the site and the diff for the generated markdown documents are added as a comment.

Once the PR is accepted and merged, the staging branch is deleted and the main build (on push) runs.

Ensure code does not expose environment variables

Because of the way the setup-r action is run, any script running in that action can access the environment variable that holds the token. This would be a problem if someone included a script in the repository that is run with the lessons and harvests the token during any of the processing steps to gain access. It's not hard to imagine this happening, because we effectively use that strategy in {sandpaper} to create branches and push the site. All it would take is for one maintainer to merge code without checking whether malicious scripts are being run. There may be mitigations against this, but I'm not aware of any.

Thinking about caching

One of the bottlenecks is converting markdown to HTML and it might be a good idea to do two things:

  1. cache the varnish environment: This allows us to only translate the markdown files that needed to be rebuilt because we know the style has not changed.
  2. cache the schedule: Currently, we read in every episode anew when we want to build the schedule. This requires a couple of rounds of IO and costs some time (perhaps more so on disk drives). It would be good to store the schedule in either tabular format or HTML format for later passing to the converter function.

Document rolling updates

This will describe the CRON jobs that are deployed as GitHub Actions which update {varnish} and {renv}, respectively.

Create instructor aside class for displaying instructor notes

The <aside> tag is a neat feature of HTML5 that allows you to write side-bars in HTML without the use of JavaScript.

We can have contributors create instructor notes that can be hidden by using fenced div tags:

This is some markdown text

::::::::::::::::::::: instructor

This will be hidden, but appear on the side if a button is pressed.

:::::::::::::::::::::::::::::::

It's possible to achieve this via Lua filters, which have been available since pandoc 2.0. The R Markdown Cookbook describes these in detail: https://bookdown.org/yihui/rmarkdown-cookbook/lua-filters.html. One particular inspiration is the {govdown} package, which has a bit of Lua that shows how we can replace a div tag with pretty much anything we want.

If I manipulate this for our purposes, it would look like this:

-- Deal with fenced divs
Div = function(el)

  -- look for the instructor class; find() returns the item and its index
  local _, i = el.classes:find("instructor")
  -- if the class is absent, leave the div untouched
  if i == nil then
    return el
  end

  -- build a list of blocks: an opening <aside> tag as raw HTML,
  -- the content of the div block, and the closing tag
  local res = pandoc.List:new{}
  res:insert(pandoc.RawBlock('html', '<aside class="instructor">'))
  res:extend(el.content)
  res:insert(pandoc.RawBlock('html', '</aside>'))

  -- returning a list of blocks replaces the div itself
  return res
end

Build A Link

Reference: https://zkamvar.github.io/stunning-barnacle#links

Contributors should be able to link to different parts of The Carpentries' ecosystem easily. We also want to enforce canonical URLs, to avoid situations where different sites link to the same page as site.url/page/, site.url/page, and site.url/page.html.

The use cases for linking are:

  1. within episodes (e.g. link to challenge)
  2. within lessons (e.g. link to references)
  3. across lessons (e.g. link to bash novice)
  4. across carpentries (e.g. link to glosario)

Note that canonical URLs can also be enforced via JavaScript: https://www.jpap.org/blog/2017/04/canonical-url-redirects-for-static-sites/ (but won't work if JavaScript is not enabled)
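To make the canonical URL idea concrete, here is a small Python sketch that maps the three variants onto one form at build time. Picking `page.html` as the canonical form is an assumption for illustration (the notes above do not settle on one), and `canonical_url()` is a hypothetical helper, not a {sandpaper} function.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Map page/, page, and page.html onto a single form (here: page.html)."""
    parts = urlsplit(url)
    path = parts.path
    if path in ("", "/"):
        # the site root maps to its index page
        path = "/index.html"
    elif path.endswith("/"):
        # site.url/page/ -> site.url/page.html
        path = path.rstrip("/") + ".html"
    elif "." not in path.rsplit("/", 1)[-1]:
        # site.url/page -> site.url/page.html (no file extension present)
        path = path + ".html"
    # query strings and fragments such as #links are preserved
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

Unlike the JavaScript redirect, a build-time rewrite like this works for readers who have JavaScript disabled, but it only covers links that pass through the build.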

Document how to link external images

It seems that relying on knitr::include_graphics() has some advantages, but what do we do for non-R based lessons? Do we ask them to use Rmd just for this feature until we can evaluate other languages?

related to #58

Feature: engine to convert ipynb to markdown (via jupytext)

Summary

This comes from lessons learned from https://github.com/datacarpentry/astronomy-python/, where Allen Downey has prepared a script for converting from iPython notebooks to markdown.

Since we are using markdown as an intermediate step on the way to HTML, it should be a matter of dropping in an engine that renders a single Jupyter notebook and converts it to markdown. The script can be called from R via the {processx} package.
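To show the shape of such an engine, here is a deliberately tiny stand-in written with only the Python standard library. It is not Allen Downey's script and not jupytext; it only relies on the fact that an ipynb file is JSON with a list of cells, each having a `cell_type` and a `source`. The function name is hypothetical, and outputs and metadata are ignored.

```python
import json

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block


def notebook_to_markdown(nb_json, language="python"):
    """Convert the JSON text of a notebook to markdown, ignoring outputs and metadata."""
    notebook = json.loads(nb_json)
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):
            # the notebook format allows source as a list of lines
            source = "".join(source)
        source = source.rstrip("\n")
        if cell.get("cell_type") == "code":
            chunks.append(f"{FENCE}{language}\n{source}\n{FENCE}")
        elif cell.get("cell_type") == "markdown":
            chunks.append(source)
    return "\n\n".join(chunks) + "\n"
```

A real engine would also need to execute the code and capture the output, which is the part the conversion script is meant to handle.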

Tasks

  • confirm a python script can be run consistently from R via processx
  • create a python engine for conversion
  • create tests for python conversion
  • add python engine to code, acting on *.myst notebook files
  • add optional config for users to insert their own engine

Edit: I'm changing this from ipynb-based to myst notebook/jupytext-based operations because the ipynb format is not git-friendly, even if GitHub will render it.

Rework: Think about using RMarkdown site instead of pkgdown site

Yes... Yes... I know.

I know what you're going to say and you don't have to say it.

I should have gone with RMarkdown sites all along in the first place 😞

At the moment {pkgdown} works, but there are changes coming down the pike that may make it less feasible. Luckily, because we have the staging area, switching these should not be that difficult... right?

Add function to update {varnish} package

Because we are porting the styling into the {varnish} package, the users should be able to install any version of it at will:

sandpaper::update_varnish() # latest version
sandpaper::update_varnish(version = "1.2.3") # specific version 

The varnish version should also be recorded in the config file.

Originally posted by @zkamvar in #1 (comment)

Think about i18n and l10n

There are different levels for translation for a website:

  1. Translation of the menus and the messages (e.g. 404)
  2. Translation of the prose.

The former is relatively easy, the latter is quite complicated and involves tradeoffs.

David Pérez-Suárez gave a really good CarpentryCon talk about this: https://youtu.be/IzRCuk7XX18

His solution was to have a centralized hub to hold translations and work with git submodules to control when those translations would be updated. It's not a trivial issue because there's a balance of effort on the maintainers and the contributors. David had mentioned that the updates would only flag translation files, but the result would remain unchanged until a translator comes along and translates the changes.

To address the translation of the prose within the document, they modified https://github.com/carpentries-i18n/po4gitbook to work with the kramdown tags. Looking elsewhere, it doesn't seem that there's really any other good solution to translating markdown to po and back again.

I believe some of this process can be improved by implementing translation to XML on the backend since that provides a clear structure for lists, paragraphs, etc. The challenge is to match that to a clear grammar for how the po files are to be structured.
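As a rough illustration of the target, here is a Python sketch that turns prose into PO entries, one `msgid` per paragraph. Splitting on blank lines is a crude stand-in for the XML structure mentioned above, the function names are hypothetical, and a real grammar would also have to handle lists, code blocks, and fenced divs.

```python
import re


def po_escape(text):
    """Escape backslashes, double quotes, and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def markdown_to_po(markdown):
    """Emit one PO entry (msgid plus empty msgstr) per unique paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", markdown) if p.strip()]
    unique = []
    for paragraph in paragraphs:
        # a msgid may appear only once in a PO file, so skip repeated paragraphs
        if paragraph not in unique:
            unique.append(paragraph)
    entries = [f'msgid "{po_escape(p)}"\nmsgstr ""' for p in unique]
    return "\n\n".join(entries) + "\n"
```

The empty `msgstr` is what a translator fills in; until then the original prose would be shown, which matches the behaviour David described.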

There's a good breakdown of the tasks necessary for i18n: https://wiki.mageia.org/en/What_is_i18n,_what_is_l10n#I18N

I looked into how the Hugo Learn theme does i18n (https://learn.netlify.app/en/cont/i18n/). It appears to have a basic structure for translating the menus and messages, but the prose is expected to live in separate files in separate repositories.
