Brei

Minimal workflow system and alternative to Make.

  • Read from TOML or JSON (also pyproject.toml in [tool.brei] section)
  • Only Python ≥ 3.11 required
  • Runs tasks lazily and in parallel
  • Supports variables, templates, includes and custom runners

Read more: documentation

Why (yet another workflow tool)

This tool was developed as part of the Entangled project, but can be used on its own. Brei is meant to perform small-scale automation for literate programming in Entangled, such as generating figures and running computations locally. It requires no setup and its workflows are easy for novice users to understand. If you have more serious needs than that, we recommend using a more tried and proven system, of which there are too many to count.

When to use

You're running a project and there are lots of odds and ends that need automating. You'd use a Makefile, but your friend is on Windows and doesn't have GNU Make installed. Or you're trying to ship a product that needs this kind of automation, but you don't want to confront first-time users with a tonne of stuff they've never heard of.

Install

To install with pip:

pip install brei

Or, if you use a tool for virtual environments (we recommend Poetry), after creating a new project with poetry init:

poetry add brei

Development

To run the unit tests and the type checker:

poetry install
poetry shell
brei test

To build the documentation, run the brei weave workflow:

# poetry shell
brei weave

Some parts of Brei are literate. Run the entangled watch daemon while editing code,

entangled watch

or else, as a batch job, stitch changes before committing:

entangled stitch

License

Copyright Netherlands eScience Center, Apache License, see LICENSE.

brei's People

Contributors

jhidding

brei's Issues

input validation

There are scripts that load successfully but are still malformed. Input validation is required.

default runner

A default runner should use the asyncio.subprocess module for running shell commands.
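As a rough illustration of what such a default runner could look like, here is a minimal sketch built on `asyncio.create_subprocess_shell` from the standard library. The function name `run_shell` and its return shape are assumptions made for this example, not part of Brei's API.

```python
import asyncio


async def run_shell(script: str) -> tuple[int, str, str]:
    """Run `script` through the system shell and capture its output.

    Hypothetical helper: the name and return type are illustrative only.
    """
    proc = await asyncio.create_subprocess_shell(
        script,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() waits for the process to exit and collects both streams.
    out, err = await proc.communicate()
    return proc.returncode, out.decode(), err.decode()


async def run_all(scripts: list[str]) -> list[tuple[int, str, str]]:
    """Run several independent scripts concurrently."""
    return await asyncio.gather(*(run_shell(s) for s in scripts))
```

Calling `asyncio.run(run_all(["echo one", "echo two"]))` would run both commands concurrently and return one `(returncode, stdout, stderr)` tuple per script.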

Add implicit dependency for each task on its own contents

Currently, when a Brei file has been edited, we need to run with -B to rebuild everything. This is bad for reproducibility. Instead, keep a database of tasks together with a hash of a normalized representation of each task (json.dumps(..., sort_keys=True)), and invalidate target paths that depend on a task whose hash has changed.
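The hashing idea can be sketched as follows. This assumes a task is available as a plain JSON-serializable dict with a `creates` list, and that the database is a simple mapping from target path to the hash of the task that last produced it. Both `task_hash` and `stale_targets` are hypothetical names for illustration.

```python
import hashlib
import json


def task_hash(task: dict) -> str:
    """Hash a normalized representation of a task.

    sort_keys=True makes the result independent of key order.
    """
    normalized = json.dumps(task, sort_keys=True)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def stale_targets(tasks: list[dict], db: dict[str, str]) -> set[str]:
    """Return targets whose creating task differs from the recorded hash.

    `db` maps a target path to the hash stored on the previous run
    (an assumed layout, not Brei's actual storage format).
    """
    stale = set()
    for task in tasks:
        digest = task_hash(task)
        for target in task.get("creates", []):
            if db.get(target) != digest:
                stale.add(target)
    return stale
```

Targets missing from the database are also reported as stale, so a first run rebuilds everything without needing -B.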

Glob patterns and/or semantics for list type variables

Glob patterns

In many cases it is annoying that we need to list all our targets explicitly. Say we generated some figures in /docs/fig, then need to copy those to /docs/site/fig. It would be nice to be able to say

[template.copy]
description = "copy `${basename}`"
requires = ["${srcdir}/${basename}"]
creates = ["${tgtdir}/${basename}"]
script = """
mkdir -p '${tgtdir}'
cp '${srcdir}/${basename}' '${tgtdir}'
"""

[[call]]
template = "copy"
collect = "copy-figures"
[call.args]
srcdir = "docs/fig"
tgtdir = "docs/site/fig"
basename = ["*.svg", "*.png"]

The expansion of the script would actually work, but the requires and creates fields would be garbage. As a result we can't trace the dependencies properly, also because the expansion of the wildcards is not known in advance.

A solution would be to say

basename = ["glob(docs/fig/*.svg)", "glob(docs/fig/*.png)"]

where we understand that the glob syntax scans known targets. That opens another can of worms, since we don't know when all targets are known: some may be produced by another glob pattern rule. We'd need to establish all glob patterns in a program first, and then check each target as it is registered against every existing pattern. Then someone has the bright idea of putting variables into a glob pattern, like glob(${fig_path}/*.svg), and I don't know how to evaluate such an expression.
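The "collect all patterns first, then match each target as it is registered" scheme might look like the sketch below. The `GlobRegistry` class is hypothetical and uses `fnmatch.fnmatchcase` from the standard library for matching. It also shows the ordering problem: a target registered before its pattern is added is never matched.

```python
from fnmatch import fnmatchcase


class GlobRegistry:
    """Hypothetical registry that matches targets against known patterns."""

    def __init__(self) -> None:
        # Maps each glob pattern to the targets matched so far.
        self.patterns: dict[str, list[str]] = {}

    def add_pattern(self, pattern: str) -> None:
        self.patterns.setdefault(pattern, [])

    def register_target(self, path: str) -> None:
        # Note: fnmatch's `*` also matches `/`, so `docs/fig/*.svg`
        # would match files in subdirectories as well.
        for pattern, matches in self.patterns.items():
            if fnmatchcase(path, pattern):
                matches.append(path)


registry = GlobRegistry()
registry.add_pattern("docs/fig/*.svg")
registry.register_target("docs/fig/plot1.svg")
registry.register_target("docs/fig/plot2.png")
```

After these calls, `registry.patterns["docs/fig/*.svg"]` holds only `docs/fig/plot1.svg`.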

List variables

Another solution would be that we can have the Entangled Brei hook create a variable listing figures:

[environment]
figures = ["docs/fig/plot1.svg", "docs/fig/plot2.svg"]

[[call]]
template = "copy-static"
collect = "copy-statics"
[call.args]
srcfile = ["splice(figures)", "splice(css)", "some_other_file"]

However, so far we don't have semantics for list variables, other than in template calls. Also, reworking these figure filenames into tasks that copy them to the correct location still requires some scripting.

Nevertheless, asking some agent for a list of items and then performing a task for all those items is a pattern that could pop up more often. The question is then, where does this line of thinking lead, and how do we prevent this exploding in our faces?

The thing is, to actually load a list we need the inverse of splice:

[[task]]
stdout = "listvar(css)"
script = "ls docs/template/*.css"

where it is understood that the list is split on newlines.
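To make the proposed semantics concrete, here is a small sketch of both directions: loading a list variable from a task's standard output, and splicing list variables into an argument list. The `splice(...)` and `listvar(...)` syntax is only a proposal in this issue, and the helper names below are assumptions for illustration.

```python
import re

# Matches an argument of the proposed form `splice(name)`.
SPLICE = re.compile(r"^splice\((\w+)\)$")


def parse_listvar(stdout: str) -> list[str]:
    """Split captured output on newlines, dropping empty lines."""
    return [line for line in stdout.splitlines() if line.strip()]


def expand_splices(items: list[str], env: dict[str, list[str]]) -> list[str]:
    """Replace each `splice(name)` item with the list stored in env[name]."""
    result: list[str] = []
    for item in items:
        match = SPLICE.match(item)
        if match:
            result.extend(env[match.group(1)])
        else:
            result.append(item)
    return result


env = {
    "figures": ["docs/fig/plot1.svg", "docs/fig/plot2.svg"],
    "css": parse_listvar("a.css\nb.css\n"),
}
srcfile = expand_splices(["splice(figures)", "splice(css)", "some_other_file"], env)
```

Here `srcfile` ends up as a flat list of five file names, with the literal `some_other_file` passed through unchanged.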

change api

  • rename pattern to template
  • rename dependencies to requires
  • rename targets to creates
  • rename language to runner

cycle detection

Currently, a cyclic dependency results in a deadlock. Cycles should be detected at run time.

testing

  • set up coverage
  • test examples
  • test exceptions
