tomeichlersmith / denv
Uniformly interact with containerized environments across runners
Home Page: https://tomeichlersmith.github.io/denv/
License: GNU General Public License v3.0
Describe the bug
At some clusters, I have a symlink from my home directory (where I end up after SSHing) to the workspace for the experiment (where I have actual access to disk space and the software). This allows me to have the following workflow.
ssh <cluster>
cd work
denv
where work points to some directory on a different filesystem.
This leads to (for example)
tom@cluster-node:~/work$ denv config print
denv_workspace="/home/tom/work"
denv_name="work"
denv_image="tomeichlersmith/hps-env:v3.3.0"
denv_mounts=""
denv_shell="/bin/bash -i"
singularity version 3.8.7-1.el7
Now this leads to normal behavior within the denv while interacting with the shell, but if I were to go to the work directory via some other path, any deduced paths (e.g. paths deduced during a cmake call) within the previous denv session would be invalid.
Expected behavior
I would like to have denv resistant against how the user cds to their workspace directory. I think this could be accomplished by using realpath to get the full, non-symlink path for mounting. I could implement a method to mount both the symlink path and the full, non-symlink path but I'm concerned that would be difficult.
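A minimal sketch of the realpath idea, assuming a POSIX shell: resolve the workspace to its physical path before it is used for mounting. The function name resolve_workspace is hypothetical and not part of denv. It uses cd -P and pwd -P rather than realpath only because those are specified by POSIX and so also exist on systems where realpath is not installed.

```shell
# Hypothetical helper (not part of denv): print the physical, symlink-free
# path of a directory so that every route to the workspace gives one answer.
resolve_workspace() {
    # Run in a subshell so the caller's working directory is untouched.
    # cd -P resolves symlinks while changing directory; pwd -P prints the
    # physical path of where we ended up.
    (cd -P "$1" 2>/dev/null && pwd -P)
}

# Possible use: deduce the workspace from wherever the user currently is.
denv_workspace="$(resolve_workspace "${PWD}")"
echo "denv_workspace=\"${denv_workspace}\""
```

With this, ~/work and the directory it points to would both produce the same denv_workspace value.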
Additional context
$ denv version
denv v0.2.0
Is your feature request related to a problem? Please describe.
I'd like to write scripts that run programs in a denv. The easiest approach to do this is to write a normal script and then execute it with denv:
denv my-script.sh
But sometimes, this execution path is unavailable or bloated. For example, sometimes I want to run my-script.sh in a denv that resides somewhere else. The currently supported way to do this is
denv_workspace=/full/path/to/denv denv my-script.sh
It is a hassle to type out the full denv_workspace path and sometimes it's not possible to do (for example in batch processing contexts).
This forces me to write a second script that can wrap my-script.sh:
#!/bin/sh
denv_workspace=/full/path/to/denv \
denv $@
which unfortunately means another round of shell interpretation when expanding $@ and another file to carry around.
Describe the solution you'd like
It'd be really cool if I could specify denv with a shebang so that my-script.sh is automatically run within the denv.
Something like
#!/usr/bin/env denv
<rest of my-script.sh contents>
This would allow me to have a single file whose first line specifies that it is run by denv. I've tried this, but it does not work out of the box. It appears to hang, probably because I misunderstand how the script is given to denv when it is specified in the shebang.
This solution could be expanded by adding a denv option specifying what should be run within the denv.
#!/usr/bin/env denv --shebang python
print("hello world")
Or using the current "remote" running capability
#!/usr/bin/env denv_workspace=/full/path/to/denv denv
<shell script contents>
Describe alternatives you've considered
A wrapper script like the one shown above may be able to function as a shebang, but it would still introduce the bloat of an additional script to carry around.
Additional context
GNU parallel has a --shebang option: https://www.gnu.org/software/parallel/parallel_tutorial.html#shebang
Looking at the source of parallel and searching for --shebang reveals that it needs to re-execute itself when acting as a shebang.
It looks like we want to mimic --shebang-wrap so that the user can tell denv which program to give the script to within the container.
    # Program is called from #! line in script
    # remove --shebang-wrap if it is set
    $opt::shebang_wrap = ($ARGV[0] =~ s/^--shebang-?wrap *//);
    # remove --shebang if it is set
    $opt::shebang = ($ARGV[0] =~ s/^--shebang *//);
    # remove --hashbang if it is set
    $opt::shebang .= ($ARGV[0] =~ s/^--hashbang *//);
    if($opt::shebang) {
        my $argfile = Q(pop @ARGV);
        # exec myself to split $ARGV[0] into separate fields
        exec "$0 --skip-first-line -a $argfile @ARGV";
    }
    if($opt::shebang_wrap) {
        my @options;
        my @parser;
        if ($^O eq 'freebsd') {
            # FreeBSD's #! puts different values in @ARGV than Linux' does
            my @nooptions = @ARGV;
            get_options_from_array(\@nooptions);
            while($#ARGV > $#nooptions) {
                push @options, shift @ARGV;
            }
            while(@ARGV and $ARGV[0] ne ":::") {
                push @parser, shift @ARGV;
            }
            if(@ARGV and $ARGV[0] eq ":::") {
                shift @ARGV;
            }
        } else {
            @options = shift @ARGV;
        }
        my $script = Q(Q(shift @ARGV)); # TODO - test if script = " "
        my @args = map{ Q($_) } @ARGV;
        # exec myself to split $ARGV[0] into separate fields
        exec "$0 --_pipe-means-argfiles @options @parser $script ".
            "::: @args";
    }
}
if($ARGV[0] =~ / --shebang(-?wrap)? /) {
    ::warning("--shebang and --shebang-wrap must be the first ".
              "argument.\n");
}
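As a rough illustration of why the splitting in the parallel source above is needed: on Linux, everything after the interpreter on the #! line arrives as one single argument, followed by the path to the script. A sketch of how a shell program could pull that apart is below. The --shebang option and the function name parse_shebang are hypothetical (nothing above says denv has them), and falling back to /bin/sh when no program is named is an assumption.

```shell
# Hypothetical sketch: split the arguments a program receives when it is
# named on a #! line. On Linux, "#!/path/to/prog --shebang python" runs
#   /path/to/prog "--shebang python" /path/to/script [args...]
# i.e. the option and its value share a single argument.
parse_shebang() {
    case "$1" in
        "--shebang "*) shebang_program="${1#--shebang }" ;;  # value fused into $1
        "--shebang")   shebang_program="/bin/sh" ;;          # assumed default
        *) return 1 ;;                                       # not a shebang call
    esac
    shebang_script="$2"
}

if parse_shebang "--shebang python" ./my-script.py; then
    echo "would run '${shebang_program} ${shebang_script}' inside the denv"
fi
```

Note that putting env in between changes the picture: with #!/usr/bin/env denv --shebang python, it is env that receives "denv --shebang python" as one argument and looks for a program with that whole name. Recent GNU coreutils versions of env offer -S to split such a string.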
man page for each sub-command
Describe the bug
The installation script on macOS (13.3.1) fails with a permission error.
To Reproduce
Steps to reproduce the behavior:
denv with...
Expected behavior
denv should install without errors
Screenshots
Here is the output of a session where I try to install denv:
░ tamasgal@silentbox:~
░ 17:42:18 > curl -s https://raw.githubusercontent.com/tomeichlersmith/denv/main/install | sh
ERROR: Cannot write into /Users/tamasgal/.local/share/man/man1, permission denied
░ tamasgal@silentbox:~
░ 17:43:04 1 > mkdir .denv
░ tamasgal@silentbox:~
░ 17:43:13 > curl -s https://raw.githubusercontent.com/tomeichlersmith/denv/main/install | \
sh -s -- --prefix .denv --next --simple
ERROR: Cannot write into .denv/share/man/man1, permission denied
░ tamasgal@silentbox:~
░ 17:43:24 1 > mkdir -p .denv/share/man/man1
░ tamasgal@silentbox:~
░ 17:47:03 > curl -s https://raw.githubusercontent.com/tomeichlersmith/denv/main/install | \
sh -s -- --prefix .denv --next --simple
INFO: Checking dependencies...
INFO: Downloading...
INFO: Unpacking...
install: illegal option -- D
usage: install [-bCcpSsv] [-B suffix] [-f flags] [-g group] [-m mode]
[-o owner] file1 file2
install [-bCcpSsv] [-B suffix] [-f flags] [-g group] [-m mode]
[-o owner] file1 ... fileN directory
install -d [-v] [-g group] [-m mode] [-o owner] directory ...
ERROR: Copying to .denv/bin failed.
Do you have permission to write there?
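The "illegal option -- D" message comes from the BSD install shipped with macOS, which lacks GNU's -D (create leading directories). One possible portable replacement, sketched under the assumption that the installer only needs to copy single files with a given mode, is to create the parent directory explicitly. The function name install_file is hypothetical and not taken from the denv install script.

```shell
# Hypothetical helper: portable stand-in for GNU `install -D -m MODE SRC DEST`.
# Both GNU and BSD install accept -m, and mkdir -p is POSIX.
install_file() {
    mode="$1"; src="$2"; dest="$3"
    # create the leading directories that -D would have created
    mkdir -p "$(dirname "${dest}")" || return 1
    install -m "${mode}" "${src}" "${dest}"
}

# Possible use inside an installer (paths are illustrative only):
#   install_file 0755 denv "${prefix}/bin/denv"
```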
need to get idmap support: containers/podman@82a050a
already merged into podman main, just waiting on release
I haven't done a very exhaustive search, but I have stumbled upon other runners that may be useful for denv to support.
write up some completions that can do the minimum of
I don't foresee an easy way to tab-complete image tags, although I could maybe hook into docker/podman's tab complete functions if that is the runner being used.
I think it makes sense to have only one way to set each configuration variable. A logical separation is that the config file holds the normal and necessary options for running, while the optional parameters are looked for in the environment.
Some folks may not have a ~/.local already created, so I should make it if it doesn't exist yet.
Is your feature request related to a problem? Please describe.
Unpacked images are supported natively by singularity/apptainer but I'd like to also support them with docker/podman so that a user's workflow doesn't need to change between machines.
Describe the solution you'd like
Something within denv that would allow docker/podman to load these unpacked images (perhaps docker load?). This would then allow users to have CVMFS on their laptops with docker/podman and use the same commands they would use on an HPC with apptainer/singularity.
Want to ensure uniformity across runners when opening a specifically-configured denv. What is the minimal feature set of denv?
~/.local are already in the path
.bashrc can be controlled per-workspace and specialized
denv in scripts as well as allowing the user to be forced into a newer image version
Is your feature request related to a problem? Please describe.
I'd like to make the start-up process more user friendly. Specifically, having a denv check (or similar) command to verify installation and availability of a runner will be helpful. In addition, this runner check could take the time to compare version numbers on top of just checking for existence of commands within the PATH.
Describe alternatives you've considered
The only alternative I can think of is to manually do this checking outside of denv and document this process in the manual. This is icky.
Additional context
LDMX-Software/ldmx-sw#1232: using denv in ldmx-sw's ldmx-env.sh would benefit from a check command.
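A rough sketch of what such a check might do, assuming it only needs to test whether each runner is on the PATH and report the first line of its --version output. The function name check_runner and the message wording are hypothetical; comparing version numbers is left out.

```shell
# Hypothetical sketch of a runner check: report whether a command is on PATH
# and, if so, the first line of its --version output.
check_runner() {
    printf 'Looking for %s... ' "$1"
    if command -v "$1" >/dev/null 2>&1; then
        printf "found '%s'\n" "$("$1" --version 2>/dev/null | head -n 1)"
    else
        printf 'not found\n'
        return 1
    fi
}

# Check every runner and keep going when one is missing.
for runner in docker podman apptainer singularity; do
    check_runner "${runner}" || true
done
```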
Seems like a nice bash-focused test writing toolkit.
https://bats-core.readthedocs.io/en/stable/tutorial.html#
It will be better than rolling my own as I do currently, hopefully making it easier to write more thorough tests.
Is your feature request related to a problem? Please describe.
HPS's slic falls back to downloading GDML and LCDD schemas from the internet if they are not found locally. This has caused me issues when running on SLAC's cluster since sometimes this internet connection is disrupted. The reason these schemas are not found locally is simply due to a mis-configuration of the container environment (the actual files were there), and the most direct way for me to quickly test slic is to prevent it from connecting to the internet at all so I can see if it fails at this fallback or continues successfully.
Describe the solution you'd like
apptainer can be given --net --network none, which puts the container into a network-less environment. I expect the other runners have something similar.
Describe alternatives you've considered
From the denv side, there isn't really an alternative. I think keeping the network connected is a sensible default but it is helpful to run without it occasionally.
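A sketch of how the per-runner flags might be looked up. The function name no_network_flags is hypothetical. docker and podman both accept --network none on run; the apptainer flags are the ones quoted above, and treating singularity the same as apptainer is an assumption.

```shell
# Hypothetical mapping from runner name to the flags that disable networking.
# Assumption: singularity accepts the same flags as apptainer.
no_network_flags() {
    case "$1" in
        docker|podman)         echo "--network none" ;;
        apptainer|singularity) echo "--net --network none" ;;
        *) echo "unknown runner: $1" >&2; return 1 ;;
    esac
}

no_network_flags apptainer
```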
Imagine a world where we are developing using denv and we have installed a program to the logical place of ~/.local (within the denv). This means a program we want to run is probably within ~/.local/bin/. We can add this path to the PATH variable in a .bash_aliases file which will be sourced by bash when we launch an interactive shell in the denv.
denv init ubuntu:22.04
mkdir -p .local/bin
cat > .local/bin/hello <<\WORLD
echo world
WORLD
chmod +x .local/bin/hello
echo 'export PATH=${HOME}/.local/bin:${PATH}' > .bash_aliases
This setup works and we can call our newly installed program.
denv
hello
# world is printed
exit
The issue arises when we want to run the program directly from outside the denv. Perhaps due to our workflow or perhaps in another script or something.
denv hello
# entrypoint complains about not being able to find the program 'hello'
We can get around this issue by launching bash explicitly, but it would be cool if denv did this automatically.
denv bash -ic hello
# world is printed
Describe the bug
On some clusters, podman is installed in an "emulation" mode such that the program docker is also installed, which simply redirects to podman (presumably with some translation layers inserted). This leads to the following output of denv check:
Entrypoint found alongside denv
Looking for docker... Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
found 'podman version 1.6.4' <- would use without DENV_RUNNER defined
Looking for podman... found 'podman version 1.6.4'
Looking for apptainer... found 'apptainer version 1.2.5-1.el7'
Looking for singularity... not found
I think handling this could be an easy case of processing the text returned by docker --version a bit more than I do already. It outputs podman inside it anyways.
Solution
Update denv check to output an "emulation via podman" message if this case is found. Maybe do a similar message for singularity symlinks to apptainer?
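A sketch of the text processing this would need, assuming the only signal is whether the output of docker --version mentions podman. The function name is_podman_emulation and the message wording are hypothetical.

```shell
# Hypothetical helper: decide from the text of `docker --version` whether
# the docker command is really podman behind the scenes.
is_podman_emulation() {
    case "$1" in
        *podman*|*Podman*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with the two version strings seen in these reports.
for version_text in "podman version 1.6.4" "Docker version 24.0.6, build ed223bc"; do
    if is_podman_emulation "${version_text}"; then
        echo "docker (emulation via podman): ${version_text}"
    else
        echo "docker: ${version_text}"
    fi
done
```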
need to figure out which code from distrobox is necessary and which is there for its extra features denv won't support
Sylabs' SingularityCE has the same name as the old name of apptainer. Currently, we are only testing the last release of singularity as maintained by apptainer, but we could also check if denv works with Sylabs' fork.
Describe the bug
With the new branch protection rules on main, the GitHub actions fail to push directly to it. I have attempted to make the action open a PR rather than push directly to main, but I am getting a funky error.
Switched to a new branch 'auto-man-update'
[auto-man-update 652c946] Auto Man Page Update from
1 file changed, 7 deletions(-)
pull request create failed: GraphQL: Head sha can't be blank, Base sha can't be blank, No commits between main and auto-man-update, Head ref must be a branch (createPullRequest)
Error: Process completed with exit code 1.
It's weird that gh is not seeing the commit I just made. Not sure why that is...
Is your feature request related to a problem? Please describe.
On many HPCs with CVMFS, there is a cvmfs repo called unpacked.cern.ch which has "unpacked" SIF images. These are "unpacked" because they are not compressed into a single file that is easier to pass around. Being able to run directly from these unpacked images would be helpful since it would save the user from waiting for the image file to be built.
Describe the solution you'd like
singularity and apptainer can already run these images so I'd just need to update the denv code to be able to avoid building the intermediate file unless necessary. In addition, I'd like to check if docker/podman can run these images. If they can't, I'd need to seriously consider whether these types of images should be supported by denv, whose primary purpose is unifying the experience across these different runners.
Describe alternatives you've considered
Alternatively, we could require unpacked images to be built into a single file. This is undesirable because it is making unnecessary copies, but it would enforce a certain level of stability on the denv users are using.
Additional context
Is your feature request related to a problem? Please describe.
It is helpful when using the image as an environment to do some mild introspection.
Describe the solution you'd like
Something like denv config print --image or whatever that calls the necessary docker inspect --format .... and apptainer inspect ... commands.
Describe alternatives you've considered
An alternative is just to document it, but I think this is a simple enough task that it can be done across the current four runners. I'll probably have it be an optional feature of the runners so that future runners don't have too many requirements beyond downloading images and actually running them.
Idk something easy to type, maybe > or _ or \
Context
In ldmx-env.sh, we need to go through a whole rigamarole to deduce the OS so we can pass the correct DISPLAY environment variable to docker:
Currently, I've only tested denv manually on Linux hosts whose GUI applications are connected pretty cleanly by the auto-mounting of /tmp and the forwarding of the DISPLAY environment variable from the host environment.
I'd like to make sure denv can share GUI applications with the host whether the host is Windoze via WSL or MacOS. I don't have a direct way of testing this, so I may need to rope some friends into helping test this for me 👀
It looks like WSL has been updated to support forwarding GUI applications with the help of a few environment variables. https://github.com/microsoft/wslg/blob/main/samples/container/Containers.md
I can't find anything on MacOS signalling an update relative to what's in ldmx-env.sh on the first page of google.
In order to help isolate the denv from the host environment, it will be helpful to allow the user to be specific about which variables are shared between the host and the denv. We could also have a config mode that enables the current behavior: sharing all environment variables (except special and weird ones).
I'm thinking we'd introduce three new variables to the config.
env_var_copy_all: 1 if doing the current behavior and 0 otherwise
env_var_copy: space-separated string of environment variables to copy into the denv; special variables will throw an error (e.g. HOME cannot be in this list) and weird values will throw an error (e.g. can't copy in variables with newlines); ignored if env_var_copy_all is set to 1
printenv | ... stuff that is currently being used to deduce the environment
env_var_set: space-separated key=value pairs to define within the denv; allows the user to set a variable in the denv that may have a different value on the host
In a normal terminal, I can enter an apptainer image with an interactive shell and get a nice prompt
apptainer exec --hostname hps-env.$(uname -n) --home $PWD --env PS1="${PS1}" .denv/images/tomeichlersmith_hps-env-v3.2.0.sif /bin/bash -i
sourced .bashrc
eichl008@hps-env ~>
but this seemingly requires me to pass the PS1 into the container which is lame. I want the PS1 to be set by the .bashrc that is in the workspace. I know the bashrc is being sourced since I put echo "sourced .bashrc" at the bottom. Adding set -x to the bottom of the .bashrc shows that apptainer is setting the prompt itself somewhere
apptainer exec --hostname hps-env.$(uname -n) --home $PWD .denv/images/tomeichlersmith_hps-env-v3.2.0.sif /bin/bash -i
sourced .bashrc and has PS1=${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$
++ PS1='Apptainer> '
++ unset PROMPT_COMMAND
The extra + signs make me think that apptainer is spawning an extra shell somewhere. I can edit it by defining --env PS1="<prompt>" at apptainer run time.
This is because apptainer defines the PROMPT_COMMAND environment variable. https://apptainer.org/docs/user/latest/environment_and_metadata.html#environment-from-the-host
I like distrobox's https://github.com/89luca89/distrobox/blob/main/.shellcheckrc
and it allows for me to document why we ignore checks that we commonly ignore
Noticed this while trying to use combine which needs the container-defined LD_LIBRARY_PATH in order to load the shared libraries when running.
Alternative
Currently, you can avoid this manually by running in a more environment-restricted mode.
denv config env copy all off
denv config env copy hostkey key=var ...
Solution
I think denv should make the opinionated choice to not share *PATH variables by default. This is the opposite of the choice made by distrobox; however, I think it makes sense in order to isolate the programs in the denv from the programs in the host. I still think there should be some path (lol) for sharing specific *PATH variables if the user desires, i.e. allow them via denv config env copy but exclude them from denv config env copy all.
Edit: To be more clear, I think distrobox combines both the host and the internal *PATH variables so that both are available within the containerized environment. I'll need to look into that to see if it's possible, but I don't think it is since the limiting factor seems to be how apptainer defines the environment variables (i.e. with some sh init scripts that check if they are already defined).
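The exclusion described above could be sketched as a filter over variable names, assuming "*PATH variables" means every name that ends in PATH. The function name drop_path_vars is hypothetical and this is not how denv currently deduces the environment.

```shell
# Hypothetical filter: read variable names (one per line) on stdin and drop
# every name ending in PATH (PATH, LD_LIBRARY_PATH, PYTHONPATH, ...).
drop_path_vars() {
    # grep exits 1 when no line survives the filter; that is not an error here
    grep -v 'PATH$' || true
}

# Only HOME and DISPLAY survive the filter in this example.
printf '%s\n' HOME PATH LD_LIBRARY_PATH DISPLAY PYTHONPATH | drop_path_vars
```

A "copy all" mode would run the full list of host variables through such a filter, while names given explicitly to denv config env copy would skip it.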
just so users have access to them
In order to unify the environment, especially with the user setting a home directory on the command line, we'll need to run a special entrypoint after opening the container, similar to distrobox. However, it will be simplified relative to distrobox since many apptainer/singularity installations are configured to not allow writable images or containers, and so users are unable to install packages after creating an image file (usually downloaded from DockerHub). For this reason, the only purpose of the entrypoint script would be to check whether the attached HOME is set up with some RC files and, if not, set up a few default RC files before starting the interactive shell. I could also see an option where the interactive shell is not started in favor of the user supplying a command to run in the container a la the ldmx command.
The rough outline would look something like below but it is still lacking
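if [ ! -f "${HOME}/.bashrc" ]; then
  # try to use system tools like usermod/useradd to setup default RC files
  if command -v usermod >/dev/null 2>&1; then
    usermod -d "${HOME}" "${USER}"
  else
    # single quotes so the backticks are printed instead of being run
    echo 'No `usermod` available, falling back to plain cp'
    # copy the contents of /etc/skel including dotfiles (a bare * would skip them)
    cp -R /etc/skel/. "${HOME}/"
  fi
fi
if [ $# -eq 0 ]; then
  /bin/bash -i
else
  exec "$@"
fi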
Describe the bug
Attempting to use any tool that checks /etc/passwd for user IDs seems to not work on Mac devices. Examples include git or just whoami.
To Reproduce
Steps to reproduce the behavior:
denv with...
whoami
You also see your username in the prompt as I have no name
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Additional context
Add output of denv version and denv config print here.
denv v0.5.0
denv_workspace="/Users/einarelen/ldmx/umpg/ldmx-sw/SimCore/G4Py8"
denv_name="G4Py8"
denv_image="ldmx/local:91-pythia8-support-dev-mt-debug"
denv_mounts=""
denv_shell="/bin/bash -i"
Docker version 24.0.6, build ed223bc
By default, the current order prefers docker/podman over apptainer/singularity; however, this tends to lead to issues on systems that have both apptainer and a restricted, rootless podman installed. The installed podman has an ID restriction preventing it from running some images. (A more detailed printout is below.)
My general idea is to, at minimum, have denv prefer apptainer/singularity, using its presence as a proxy for testing whether podman is restricted. I could also investigate adding a prompt for the user to choose the runner they wish to use and then writing that runner into the config (or maybe a user config in ~/.config?). This is somewhat tied to #93, since a check function that actually shows which runner denv will use will be helpful for testing a more complicated deduction procedure.
I've seen this on SLAC's S3DF and JLab's ifarm.
$ podman run --rm hello-world
!... Hello Podman World ...!
.--"--.
/ - - \
/ (O) (O) \
~~~| -=(,Y,)=- |
.---. /` \ |~~
~/ o o \~~~~.----. ~~
| =(X)= |~ / (O (O) \
~~~~~~~ ~| =(Y_)=- |
~~~~ ~~~| U |~~
Project: https://github.com/containers/podman
Website: https://podman.io
Desktop: https://podman-desktop.io
Documents: https://docs.podman.io
YouTube: https://youtube.com/@Podman
X/Twitter: @Podman_io
Mastodon: @Podman_io@fosstodon.org
$ podman image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/podman/hello latest b1c06f48960c 4 days ago 1.7 MB
$ podman pull busybox:latest
Resolved "busybox" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/busybox:latest...
Getting image source signatures
Copying blob 7b2699543f22 done
Error: writing blob: adding layer with blob "sha256:7b2699543f22d5b8dc8d66a5873eb246767bca37232dee1e7a3b8c9956bceb0c": Error processing tar file(exit status 1): potentially insufficient UIDs or GIDs available in user namespace (requested 65534:65534 for /home): Check /etc/subuid and /etc/subgid if configured locally and run podman-system-migrate: lchown /home: invalid argument
$ podman image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/podman/hello latest b1c06f48960c 4 days ago 1.7 MB
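The preference order proposed above could be sketched as follows, assuming an explicitly set DENV_RUNNER always wins and that the runner is otherwise the first command found on the PATH. The function name pick_runner is hypothetical, and no attempt is made here to test whether podman is actually restricted.

```shell
# Hypothetical sketch of the proposed deduction order: an explicit
# DENV_RUNNER wins, otherwise prefer apptainer/singularity over docker/podman.
pick_runner() {
    if [ -n "${DENV_RUNNER:-}" ]; then
        echo "${DENV_RUNNER}"
        return 0
    fi
    for candidate in apptainer singularity docker podman; do
        if command -v "${candidate}" >/dev/null 2>&1; then
            echo "${candidate}"
            return 0
        fi
    done
    echo "no supported runner found" >&2
    return 1
}

pick_runner || true
```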
Describe the bug
Running denv check on a macOS system is likely to result in
sh: 1: check: not found
since check isn't installed by default. This can be resolved by the user installing it (e.g. brew install check), but it might be worth working around to make denv easier to use for people.
Is your feature request related to a problem? Please describe.
I don't like that I have to manually edit three files just to set the version number. I think a small shell script would be able to do it as well as add in a few more checks.
Describe the solution you'd like
Something that can be run as
./ci/set_version X.Y.Z
and changes the code in the necessary places to set the version to X.Y.Z
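One of the "few more checks" such a script could do is reject malformed version strings before touching any file. A sketch, assuming versions are always three dot-separated non-negative integers; the function name is_valid_version is hypothetical, and which files would then be edited is not stated here, so that part is left out.

```shell
# Hypothetical check for a ./ci/set_version style script: accept only
# versions of the form X.Y.Z where each part is a non-negative integer.
is_valid_version() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

for candidate_version in 0.5.0 v0.5.0 1.2; do
    if is_valid_version "${candidate_version}"; then
        echo "${candidate_version}: ok"
    else
        echo "${candidate_version}: rejected"
    fi
done
```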
Describe alternatives you've considered
Is your feature request related to a problem? Please describe.
I'd like to host jupyter labs in denv alongside my other crap I have in it. One denv to rule them all.
Describe the solution you'd like
Apptainer already connects ports by default, I think I just need to add a specific flag for docker.
Describe alternatives you've considered
We could have the lab outside the denv or connect to the denv over ssh for port forwarding a la vs code but that's lame.
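A sketch of the per-runner difference, assuming the same port number is wanted on the host and in the container. docker and podman publish a port with -p HOST:CONTAINER; for apptainer/singularity no flag is emitted, following the observation above that ports are already connected. The function name publish_port_flags is hypothetical.

```shell
# Hypothetical mapping from runner to the flags that expose a port.
# docker/podman publish with -p HOST:CONTAINER; apptainer/singularity are
# assumed to need nothing extra since they share the host network by default.
publish_port_flags() {
    runner="$1"; port="$2"
    case "${runner}" in
        docker|podman)         echo "-p ${port}:${port}" ;;
        apptainer|singularity) echo "" ;;
        *) echo "unknown runner: ${runner}" >&2; return 1 ;;
    esac
}

publish_port_flags docker 8888
```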
https://github.com/sigoden/argc
It's mainly marketed as a way to avoid boilerplate in scripts, but it also has a "build" option which can generate the boilerplate CLI code. I want to see if we can use this because then the CLI, the help messages, and the man pages would all be generated from the same source of truth.
both via the config command and the init command
maybe change the default to something more helpful (like the workspace directory basename?)
Currently, I have templated out MacOS testing within the testing workflow.
denv/.github/workflows/test.yml
Lines 63 to 77 in 80efb6d
It is commented out due to a few complexities.
ro at a special (root) location. This got around the issue of being unable to launch the container, but the tests still failed.
One program denv running commands.
init WORKSPACE [ -i, --image IMAGE ] [--no-gitignore] [--home HOMEDIR] [--mount DIR0 [DIR1...]]
WORKSPACE/.denv
WORKSPACE/.denv/config file for container runners to read
.gitignore inside .denv to ignore local-only files (can be disabled on the CLI)
config: safely view and edit config
print show config to user
image: image manipulation commands
pull: re-pull image without changing tag (e.g. in the case of *:latest-type image tags)
use: set which image to use (and pull if not available locally)
mounts: add the command line arguments to set of mounts to put into the denv
home: change the home directory for the denv (discouraged)
Run commands inside the opened container. We should copy over all environment variables except for any that are found to break the workspace-is-home configuration we are trying to focus on.
(no other inputs)
/etc/skel if file WORKSPACE/.dbx/skel-init does not exist
<command> [args ...]
<command> [args ...] in container
.denv/
config - holds container configuration variables, written by init and set, read by container running commands
.gitignore - ignore all files in this directory except config
skel-init - empty, exists to signal to container entrypoint to not copy over skel files
images/ - only exists for apptainer/singularity runners, holds *.sif files
Name | Description |
---|---|
image | container image to run |
home | full path to workspace directory which will be container home |
mounts | additional mounts besides workspace directory |
shell | the command to execute when starting an interactive session |