spietras / rules_conda
Rules for creating conda environments in Bazel :green_heart:
Home Page: https://spietras.github.io/rules_conda
License: MIT No Attribution
Validate if things are fresh
It would be nice to have some test automation to be able to see if new changes break something
The README still uses the old flags from before mamba support was added.
It would be nice to have a GitHub Action set up for code autoformatting. Maybe we could use the existing Buildifier GitHub Action: https://github.com/thompsonja/bazel-buildifier
Hey, any update on that?
It's pretty empty there now
Hi,
thank you for providing these nice rules! I am a newbie with Bazel and would like to ask for help building a py3_image instead of a py_binary.
Thank you!
Think about using mamba instead of conda. That might speed things up.
Could improve user experience (e.g. faster consecutive recreations) for less aware users
- bazelisk binaries
- bazel version
- conda version
- rules_python version in the example
Not supporting Python 2 might even be considered good practice nowadays, as the world wants to move on. I don't use it and don't know anyone who does, so for me it makes no difference. Assuming that only Python 3 will be used might even make things easier.
I think we should wait some time and if no one protests then we'll drop it.
I know that this might be a rather large request, but is there any chance to change rules_conda to not run its heavy work (installing conda and setting up the environment) during the analysis phase?
As far as I understand, basically all other rulesets only set up a scaffold of rules during the analysis phase, which makes them fast to run, and only do the heavy lifting if one of their rules is involved in the action graph of a target. What rules_conda is doing right now may be fine for a very Python-focused Bazel monorepo, but in one that has all kinds of toolchains in it (which is the case for ours), simply including rules_conda increases e.g. the CI time of running lints for the frontend parts of the repo from ~1min to ~6min, because the conda environment is being set up even though it will never be used.
Would love to hear your thoughts on this!
I'm trying to use rules_conda with pybind11_bazel in Hermetic Python mode. This is necessary because pybind11 requires extension modules to be built for the correct Python version.
pybind11_bazel supplies a python_configure rule that takes an attr python_interpreter_target that's of type Label:
python_configure(
    name = "local_config_python",
    python_interpreter_target = "@python_interpreter//:python_bin",
)
There does not seem to be a way to pass the Python interpreter created by rules_conda's environment to python_configure.
In WORKSPACE:
load("@rules_conda//:defs.bzl", "conda_create", "load_conda", "register_toolchain")
load_conda(
    quiet = False,  # use True to hide conda output
    version = "4.10.3",  # optional, defaults to 4.10.3
)
conda_create(
    name = "conda_test_env",
    timeout = 600,  # each execute action can take up to 600 seconds
    clean = False,  # use True if you want to clean conda cache (less space taken, but slower subsequent builds)
    environment = "@//:conda_test_env.yml",  # label pointing to environment.yml file
    quiet = False,  # use True to hide conda output
)
register_toolchain(
    py3_env = "conda_test_env",
)
This generates an external/conda_test_env/BUILD file containing:
py_runtime(
    name = "python_runtime",
    files = glob(
        ["conda_test_env/**/*"],
        exclude_directories = 0,
    ),
    interpreter = "conda_test_env/bin/python",
    python_version = "PY3",
)
On the file system, the Python interpreter is located at: $(bazel info output_base)/external/conda_test_env/conda_test_env/bin/python.
In WORKSPACE:
python_configure(
    name = "local_config_python",
    python_interpreter_target = "@conda_test_env//conda_test_env/bin/python",
)
This doesn't work because it complains that bin is not a package: it requires a BUILD file, which makes sense. Changing .../bin/python to /.../bin:python returns a similar error.
Manually adding a BUILD file to the bin directory now causes a problem for py_runtime, which contains the line: interpreter = "conda_test_env/bin/python"
ERROR: /private/var/tmp/_bazel_jiawen/8678712aa06452e8f0efd934ed354368/external/conda_test_env/BUILD:6:11: Label '@conda_test_env//:conda_test_env/bin/python' is invalid because '@conda_test_env//conda_test_env/bin' is a subpackage; perhaps you meant to put the colon here: '@conda_test_env//conda_test_env/bin:python'?
But adding the colon also fails, because apparently it must be a target name and not a label?
ERROR: /private/var/tmp/_bazel_jiawen/8678712aa06452e8f0efd934ed354368/external/conda_test_env/BUILD:6:11: @conda_test_env//:python_runtime: invalid label 'conda_test_env/bin:python' in attribute 'interpreter' in 'py_runtime' rule: invalid target name 'conda_test_env/bin:python': target names may not contain ':'
Any ideas?
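To make the two errors above easier to read, here is a small Python sketch of how an absolute Bazel label splits into repository, package, and target. The function name parse_label is made up for illustration and only covers the `[@repo]//package[:target]` form; it is not Bazel's own parser. It relies on the rule that a label without a colon, such as `//a/b/c`, is shorthand for `//a/b/c:c`.

```python
from typing import Tuple


def parse_label(label: str) -> Tuple[str, str, str]:
    """Split an absolute label into (repository, package, target).

    Illustrative helper only: handles [@repo]//package[:target].
    """
    repo = ""
    rest = label
    if rest.startswith("@"):
        # "@repo//pkg:target" -> repo="repo", rest="pkg:target"
        repo, sep, rest = rest[1:].partition("//")
        if not sep:
            raise ValueError(f"not an absolute label: {label!r}")
    elif rest.startswith("//"):
        rest = rest[2:]
    else:
        raise ValueError(f"not an absolute label: {label!r}")

    package, colon, target = rest.partition(":")
    if not colon:
        # No colon: the target name defaults to the last package component.
        target = package.rsplit("/", 1)[-1]
    return repo, package, target


# The label passed to python_configure names a *package* conda_test_env/bin/python,
# which is why Bazel asks for a BUILD file there.
print(parse_label("@conda_test_env//conda_test_env/bin/python"))
# The py_runtime attribute refers to a file inside the root package instead.
print(parse_label("@conda_test_env//:conda_test_env/bin/python"))
```

Read this way, the first form looks for a package that does not exist, while the second stops being valid as soon as a BUILD file turns conda_test_env/bin into a package of its own.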
Right now we have only one test: it calls the rules with standard arguments and checks that all packages were installed correctly.
But as we grow, new arguments are added to the rules and possible input configurations are multiplying. We want to test if execution is successful depending on different inputs to the rules, not only with one standard input.
As these are repository rules, they are used inside a WORKSPACE file. This means that each different input configuration requires a different workspace.
It would be best to just have some configuration matrix with different input values. The test framework would make a workspace for each combination of values, check the execution there, and report which ones fail. The test can be the same as it is now - checking if installed package versions are correct.
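As a rough sketch of the matrix idea, the Python below expands a dictionary of argument values into every combination and renders one conda_create call per combination. The argument names (clean, quiet, timeout) are the ones used with conda_create elsewhere on this page; the value lists, the MATRIX name, and both helper functions are assumptions made up for illustration. Writing each rendered snippet into its own workspace directory and running the existing package-version test there is left out.

```python
import itertools

# Hypothetical matrix: argument names come from conda_create usage,
# the value lists are only examples.
MATRIX = {
    "clean": [False, True],
    "quiet": [False, True],
    "timeout": [600, 1200],
}


def expand_matrix(matrix):
    """Return one dict per combination of argument values."""
    keys = sorted(matrix)
    value_lists = [matrix[key] for key in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def render_conda_create(name, environment, args):
    """Render a conda_create(...) call as WORKSPACE text for one combination."""
    lines = [
        "conda_create(",
        f'    name = "{name}",',
        f'    environment = "{environment}",',
    ]
    for key in sorted(args):
        # repr() of Python bools and ints is also valid Starlark.
        lines.append(f"    {key} = {args[key]!r},")
    lines.append(")")
    return "\n".join(lines)


if __name__ == "__main__":
    for index, combo in enumerate(expand_matrix(MATRIX)):
        print(render_conda_create(f"env_{index}", "@//:environment.yml", combo))
        print()
```

A test driver could then loop over the combinations, create one temporary workspace per rendered snippet, and collect which combinations fail.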
I'm going down a Bazel/conda monorepo rabbit hole. I did a bunch of reading and it seems the existing rules, by passing kwargs, should be able to support the use case where we have different platforms requiring different environment files (e.g., GPU support on Linux but not on Mac).
I think this is possible with a combination of:
- --platforms=... as Configurable attributes.
- register_toolchains.
Shall I set up a simple example with a test?
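Stripped of any Bazel machinery, the per-platform choice amounts to a lookup from host platform to environment file. The Python sketch below shows only that lookup; the file names and the environment_file_for helper are hypothetical, and it does not use --platforms, configurable attributes, or toolchain registration.

```python
import platform

# Hypothetical file names: one environment file per host platform.
ENVIRONMENT_FILES = {
    "Linux": "environment_linux_gpu.yml",
    "Darwin": "environment_macos.yml",
}


def environment_file_for(system=None):
    """Pick the environment file for the given (or current) platform."""
    system = system or platform.system()
    try:
        return ENVIRONMENT_FILES[system]
    except KeyError:
        raise ValueError(f"no environment file configured for {system!r}") from None
```

For example, `environment_file_for("Linux")` selects the GPU-enabled file, while an unlisted platform raises an error instead of silently falling back.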
Hi,
first, many thanks for this work, it really helps to make hermetic conda builds! The project does not have a license, which may prevent its use. Could you select one?
Consider adding support for miniforge. It's a conda installer similar to miniconda, but it's developed by the community and has conda-forge as the default channel. Some people might still want to use the usual miniconda installation, so we should make it possible to choose (probably with miniconda being the default).
This idea was originally brought up by @jiawen in #2 (comment)
I see rules_conda now adds mamba support in the main branch, this is fantastic.
However, the example could not be built; it fails with the following errors:
❯ ./bazelw run app
Starting local Bazel server and connecting to it...
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'conda_version' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'mamba_version' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'install_mamba' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:72:13: //external:py2_env: no such attribute 'use_mamba' in 'conda_create_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:82:13: //external:py3_env: no such attribute 'use_mamba' in 'conda_create_rule' rule
ERROR: error loading package 'external': Package 'external' contains errors
FAILED: Build did NOT complete successfully (0 packages loaded)
FAILED: Build did NOT complete successfully (0 packages loaded)
This is because the example WORKSPACE loads the release version of rules_conda by default rather than the one from the main branch, while new features like conda_version have already been added to the example.
As a result, there is a gap between the release version and the main branch.
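To see at a glance which attributes the released rules are missing, the "no such attribute" errors can be grouped per rule. This is only a convenience sketch in Python: the regular expression is written against the error lines quoted in this report, and the missing_attributes helper is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the error wording quoted in the log in this report.
PATTERN = re.compile(r"no such attribute '([^']+)' in '([^']+)' rule")


def missing_attributes(log):
    """Map each rule name to the sorted list of attributes it rejected."""
    found = defaultdict(set)
    for attribute, rule in PATTERN.findall(log):
        found[rule].add(attribute)
    return {rule: sorted(attributes) for rule, attributes in found.items()}


LOG = """\
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'conda_version' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'mamba_version' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:64:11: //external:conda: no such attribute 'install_mamba' in 'load_conda_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:72:13: //external:py2_env: no such attribute 'use_mamba' in 'conda_create_rule' rule
ERROR: .../rules_conda/example/WORKSPACE:82:13: //external:py3_env: no such attribute 'use_mamba' in 'conda_create_rule' rule
"""

if __name__ == "__main__":
    for rule, attributes in missing_attributes(LOG).items():
        print(rule, attributes)
```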
I think it might be good to turn on local_repository by default and leave the release version as another option in the example. Since people are more likely to clone the repo first and then run the example, this would make trying the example smoother.
I tried git_repository as well (using a commit zip). It could help bridge the gap, but bazel would clone the whole repo (including the example, which is 26M with the bazelisk binary).