entangled / brei

Minimal workflow system and alternative to Make

Home Page: https://entangled.github.io/brei/

License: Apache License 2.0

Python 100.00%

brei's Introduction

Brei

Minimal workflow system and alternative to Make.

Reads workflows from TOML or JSON (also from pyproject.toml, in the [tool.brei] section)
Requires only Python ≥ 3.11
Runs tasks lazily and in parallel
Supports variables, templates, includes and custom runners

Why (yet another workflow tool)

This tool was developed as part of the Entangled project, but can be used on its own. Brei is meant to perform small-scale automation for literate programming in Entangled, like generating figures and performing computations locally. It requires no setup, and its workflows are easy for novice users to understand. If you have more serious needs than that, we'd recommend using a more tried and proven system, of which there are too many to count.

When to use

You're running a project and there are lots of odds and ends that need automating. You'd use a Makefile, but your friend is on Windows and doesn't have GNU Make installed. You're trying to ship a product that needs this, but you don't want to confront people trying it for the first time with a tonne of stuff they've never heard of.

Install

To install, you may:

pip install brei

Or, if you use a tool for managing virtual environments (we recommend Poetry), after creating a new project with poetry init:

poetry add brei

Development

To run the unit tests and the type checker:

poetry install
poetry shell
brei test

To build the documentation, run the brei weave workflow:

# poetry shell
brei weave

Some parts of Brei are literate. Run the entangled watch daemon while editing code,

entangled watch

or else, as a batch job, stitch changes before committing:

entangled stitch

License

Brei is licensed under the Apache License 2.0.

brei's Issues

input validation

There are scripts that load successfully but are still malformed. Validation is required.

error handling

The code still has many gaps in its error handling.

documentation

default runner

A default runner should use asyncio's subprocess support (asyncio.create_subprocess_exec or asyncio.create_subprocess_shell) for running shell commands.
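
A minimal sketch of what such a runner could look like, using only the standard library. The function name run_shell and the returned tuple are assumptions made for illustration; they are not part of Brei's actual API.

```python
import asyncio


async def run_shell(script: str) -> tuple[int, str, str]:
    """Run `script` through the system shell and capture its output.

    Hypothetical helper: returns (exit code, stdout, stderr).
    """
    proc = await asyncio.create_subprocess_shell(
        script,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = await proc.communicate()
    # After communicate() the process has exited, so returncode is set.
    return proc.returncode, out.decode(), err.decode()


async def main() -> list[tuple[int, str, str]]:
    # Independent commands can be awaited concurrently.
    return await asyncio.gather(run_shell("echo one"), run_shell("echo two"))


results = asyncio.run(main())
```

Because each command is a coroutine, running independent tasks in parallel falls out of asyncio.gather without any extra threading.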

Add implicit dependency for each task on its own contents

Currently, when a Brei file is edited, we need to run with -B to rebuild everything. This is bad for reproducibility. Proposal: keep a database of tasks, each with a hash of a normalized representation (json.dumps(..., sort_keys=True)), and invalidate any target path that depended on a task whose hash has since changed.
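
The hashing step could be sketched as follows. This assumes a task is available as a plain dict and that the database is a mapping from target path to the hash of the task that last created it; both the layout and the names task_hash and stale_targets are hypothetical.

```python
import hashlib
import json


def task_hash(task: dict) -> str:
    """Hash a task via a key-order-independent JSON representation."""
    normalized = json.dumps(task, sort_keys=True)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def stale_targets(tasks: list[dict], stored: dict[str, str]) -> set[str]:
    """Return targets whose creating task differs from the stored hash.

    `stored` maps target path -> hash recorded on the previous run
    (an assumed database layout, for illustration only).
    """
    stale: set[str] = set()
    for task in tasks:
        digest = task_hash(task)
        for target in task.get("creates", []):
            if stored.get(target) != digest:
                stale.add(target)
    return stale
```

Sorting the keys means that reordering fields in the Brei file does not trigger a rebuild, while any change to a script or dependency list does.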

Glob patterns and/or semantics for list type variables

Glob patterns

In many cases it is annoying that we need to list all our targets explicitly. Say we generated some figures in docs/fig and then need to copy them to docs/site/fig. It would be nice to be able to say:

[template.copy]
description = "copy `${basename}`"
requires = ["${srcdir}/${basename}"]
creates = ["${tgtdir}/${basename}"]
script = """
mkdir -p '${tgtdir}'
cp '${srcdir}/${basename}' '${tgtdir}'
"""

[[call]]
template = "copy"
collect = "copy-figures"
[call.args]
srcdir = "docs/fig"
tgtdir = "docs/site/fig"
basename = ["*.svg", "*.png"]

The expansion of the script would actually work, but the requires and creates fields would be garbage. As a result we can't trace the dependencies properly, not least because the expansion of the wildcards is not known in advance.

A solution would be to say

basename = ["glob(docs/fig/*.svg)", "glob(docs/fig/*.png)"]

where we understand that the glob syntax scans known targets. That opens another can of worms, since we don't know when all targets are known: some may be produced by another glob pattern rule. We'd need to collect all glob patterns in a program first, and then check each target against the existing patterns as it is registered. Then someone has the bright idea of putting variables into a glob pattern, like glob(${fig_path}/*.svg), and I don't know how to evaluate such an expression.
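
The "collect patterns first, then match targets as they are registered" idea could be sketched like this. The GlobRegistry class is hypothetical, and the sketch deliberately ignores both open problems named above: targets created by other glob rules, and variables inside patterns.

```python
from fnmatch import fnmatchcase


class GlobRegistry:
    """Collects glob patterns up front and matches targets as they arrive."""

    def __init__(self, patterns: list[str]) -> None:
        # One list of matched targets per known pattern.
        self.matches: dict[str, list[str]] = {p: [] for p in patterns}

    def register(self, target: str) -> None:
        """Record `target` under every pattern it matches."""
        for pattern, found in self.matches.items():
            # Note: fnmatch-style `*` also matches `/`, unlike shell globs.
            if fnmatchcase(target, pattern):
                found.append(target)


registry = GlobRegistry(["docs/fig/*.svg", "docs/fig/*.png"])
for target in ["docs/fig/plot1.svg", "docs/fig/photo.png", "docs/index.html"]:
    registry.register(target)
```

Matching against registered targets rather than the file system is what lets the dependency graph be built before any file exists.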

List variables

Another solution would be to have the Entangled Brei hook create a variable listing the figures:

[environment]
figures = ["docs/fig/plot1.svg", "docs/fig/plot2.svg"]

[[call]]
template = "copy-static"
collect = "copy-statics"
[call.args]
srcfile = ["splice(figures)", "splice(css)", "some_other_file"]

However, so far we have no semantics for list variables other than in template calls. Also, turning these figure filenames into tasks that copy them to the correct location still requires some scripting.
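
One possible reading of the proposed splice(...) syntax, as a sketch: an argument that is exactly splice(name) is replaced by the elements of the list variable name, and anything else passes through unchanged. The function expand_splices and these semantics are assumptions, since the proposal does not pin them down.

```python
import re

# Matches an argument that is exactly `splice(<variable name>)`.
SPLICE = re.compile(r"splice\((\w+)\)")


def expand_splices(items: list[str], env: dict[str, list[str]]) -> list[str]:
    """Flatten `splice(name)` entries into the values of list variable `name`."""
    result: list[str] = []
    for item in items:
        match = SPLICE.fullmatch(item)
        if match:
            result.extend(env[match.group(1)])
        else:
            result.append(item)
    return result


env = {
    "figures": ["docs/fig/plot1.svg", "docs/fig/plot2.svg"],
    "css": ["docs/template/style.css"],  # hypothetical value
}
srcfiles = expand_splices(["splice(figures)", "splice(css)", "some_other_file"], env)
```

After expansion the template call sees an ordinary flat list, so the existing "one task per list element" behaviour of template calls would apply unchanged.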

Nevertheless, asking some agent for a list of items and then performing a task for each of those items is a pattern that could come up more often. The question is then: where does this line of thinking lead, and how do we prevent it from exploding in our faces?

The thing is, to actually load a list we need the inverse of splice:

[[task]]
stdout = "listvar(css)"
script = "ls docs/template/*.css"

where it is understood that the list is split on newlines.
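
The newline-splitting rule could be sketched as below. The function store_listvar is hypothetical, and dropping blank lines is an added assumption that the proposal does not state.

```python
def store_listvar(env: dict[str, list[str]], name: str, stdout: str) -> None:
    """Store a task's captured stdout in `env[name]`, one entry per line."""
    # Assumption: blank lines (such as a trailing newline) are not entries.
    env[name] = [line for line in stdout.splitlines() if line.strip()]


env: dict[str, list[str]] = {}
store_listvar(env, "css", "docs/template/a.css\ndocs/template/b.css\n")
```

A variable filled this way has the same shape as one written by hand in [environment], so it could feed directly into a splice(...) argument.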

change api

rename pattern to template
rename dependencies to requires
rename targets to creates
rename language to runner

setup coverage
test examples
test exceptions