Comments (16)
These are my opinions as a Docker and IPython user.
I am not an official member of your project.
Re: (3) Does Docker tagging help us with anything?
As a user, it would help if I could rely on certain numbered versions being unchanging, i.e., designated releases. I could then use them to reliably build functionality that keeps working for processes or customers that need a frozen environment.
:latest should be for jupyter devs, testers, and bleeding-edgers who can afford a malfunction or can investigate it.
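For illustration, here is the difference a frozen tag makes at the command line. The tag names below are hypothetical examples, not necessarily tags the project publishes:

```shell
# A production deployment pins a numbered tag that is never re-pushed,
# so recreating the environment always yields the same image:
docker pull jupyter/minimal-notebook:3.2

# Developers, testers, and bleeding-edgers track the moving tag instead,
# accepting that it may change or break under them:
docker pull jupyter/minimal-notebook:latest
```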
As to the myriad versions of everything else, the choices seem to be: (a) aggregation into one container, (b) splitting across several containers, or (c) a customizable build script left in the container.
(a) Aggregate into one container. This is good for end-user ease of use, until users decide they want something different from the stock arrangement. One issue is that this style tends to produce huge containers that take forever to download. Some of the spark/hadoop docker containers released by others already suffer from being too big and having too many layers. One solution would be to use a suitable base image everyone using docker already has, or should have, but as I write this the ubuntu:latest and debian:latest images have python3 and no python2, and centos:latest has python2 and no python3... not to mention the other lesser-used components.
(b) Split across containers. Here an environment would be built from several docker containers linked to the jupyter container, perhaps as docker volumes. This is often done to link, over tcp/ip, a database running in one container with an app like a web front-end running in another; the mysql/maria containers provide examples. But that doesn't seem to be the primary problem faced here. Instead, the problem here seems to be: "what lang/environment is this jupyter for? Can I adjust that without downloading the entirety of jupyter again?" For that, a -v volume option exists that allows mounting one container's filesystem inside another container. This configuration would seem to be harder on developers and end users, though perhaps more flexible for creating various configurations. I'm unaware of a way for a bunch of containers to each dump executables into /usr/bin of a single container. In the absence of that functionality, some clever planning and setting of PATH, PYTHONPATH, etc., would be needed.
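The PATH/PYTHONPATH stitching mentioned above might look roughly like this. The /opt/kernels/* mount points are hypothetical, not from the project; substitute wherever the linked volumes actually land:

```shell
# Sketch: each linked container's volume is mounted under /opt/kernels/<name>,
# and the notebook container prepends those directories to its search paths
# so the executables and libraries they carry become visible.
KERNEL_VOLUMES="/opt/kernels/python2 /opt/kernels/r"
for vol in $KERNEL_VOLUMES; do
  PATH="$vol/bin:$PATH"
  PYTHONPATH="$vol/lib/python:${PYTHONPATH:-}"
done
export PATH PYTHONPATH
echo "$PATH"
```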
(c) Customizable build script left in the container. The idea is to leave a script in the container that uses root privilege along with apt, yum, pip, and similar tools to customize the base environment into something that works, and then run the resulting environment as an ordinary user.
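A script of this kind might look roughly like the following. The script name, package choices, and user name are illustrative assumptions, not from the project:

```shell
#!/bin/bash
# customize.sh -- sketch of option (c): run once as root inside the
# container to adapt the base environment, then drop to a normal user.
set -e

# System-level additions via the distro package manager (root required).
apt-get update && apt-get install -y --no-install-recommends graphviz

# Language-level additions on top of the base environment.
pip install seaborn

# Hand the customized environment over to the ordinary user.
exec su jovyan -c "jupyter notebook --ip=0.0.0.0"
```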
As a potential end user, (a) and (c) currently sound best to me.
from docker-stacks.
I believe we're on track for (c), but please check this assumption.
The initial Docker image definitions in this repo install conda and pip in an environment so that the unprivileged jovyan container user can install new packages. The minimal-notebook definition upon which the others are based accepts a GRANT_SUDO env var on start to give that jovyan user the ability to run apt-get if the user is in a trusted environment (e.g., his/her own VM). In addition, once these images are published to Docker Hub, end users can script the creation of new images to install libs that will not be lost across container destroy/create cycles, à la:
FROM jupyter/python-notebook-stack
USER $NB_USER
RUN conda install <some additional lib for Python 3 that you want in your custom image>
docker build -t parente/my-python-notebook-stack .
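Once built, the custom image runs like any of the stock stacks, and the GRANT_SUDO env var mentioned above is passed at start time. The port and image names follow the example above; check the image documentation for the exact switches it honors:

```shell
# Run the customized image, publishing the notebook port.
docker run -d -p 8888:8888 parente/my-python-notebook-stack

# In a trusted environment, grant the jovyan user sudo so that
# apt-get works inside the running container.
docker run -d -p 8888:8888 -e GRANT_SUDO=yes parente/my-python-notebook-stack
```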
from docker-stacks.
Most of the issues we had with automated builds were about how permissions on Docker Hub worked. We weren't getting deterministic builds for jupyter/demo permissions-wise, and sometimes our builds took too long on the hub, so we started building them manually instead.
I'm a bigger fan of automated trusted (as much as they can be) builds, triggered via the normal github -> docker webhook.
from docker-stacks.
I'd like to give the automated builds another shot. I started by getting minimal-notebook working properly under my personal namespace on Docker Hub. It built without a hitch. I'll try scipy-notebook too (hacked to point to parente/*) and ensure the automation properly rebuilds scipy-notebook when a new build of minimal finishes.
If all that works, it would be good to get the builds going under jupyter/*.
from docker-stacks.
I set up parente/minimal-notebook and parente/pyspark-notebook, with the latter triggered by the former. I made changes to the minimal notebook in my fork and confirmed that the latter rebuilt. I've been using the resulting pyspark image locally all day without permission-related or other problems.
I think we should give the automated builds a shot again and only fall back on a bespoke solution if necessary.
Under what org should we build the images on Docker Hub? Should we create a new jupyterstacks org to avoid naming conflicts with existing images? Or would you like them all under jupyter?
from docker-stacks.
I think they can just be under jupyter. @rgbkrk?
from docker-stacks.
I think they can be under jupyter.
from docker-stacks.
Works for me. @rgbkrk can you grant me permissions in the jupyter org to set it up?
from docker-stacks.
I set up the automated builds for the "latest" versions of the stacks we have. Everything went smoothly except r-notebook, which has the new behavior (on Docker Hub and locally for me) of hanging while conda solves package specs. I'll look into debugging it locally.
from docker-stacks.
The r-notebook problem was related to r-devtools. Bumping it to 1.8 and adjusting for the newer R 3.2 release and incompatible packages solved the problem. It's now on Docker Hub. (I'm still not clear why conda was struggling with the older versions, but moving on ...)
Along the way I noticed it was installing a second copy of IPython. PR #8 should fix it.
@rgbkrk The docker hub webhooks for this git repo don't seem to be enabled. I don't have admin permissions on the repo to check. (And I'm a bit lost on where I would configure a docker hub organization to enable them for an organization on github.) Did you or someone else manage to set them up for ipython/docker-images originally?
from docker-stacks.
Let me see what I can do there too.
from docker-stacks.
Want to try setting up hooks now? You now have admin access on this repo instead of just write.
from docker-stacks.
I'll give it a shot later today. Tnx.
from docker-stacks.
I think it's now enabled, but we'll have to wait for the next git push / PR to find out. If it doesn't work, it's honestly not too bad at the moment, with the few stacks we have, to go trigger the Docker Hub builds manually. That gives better control too. As it stands, any merge to any stack folder is going to trigger all of them to rebuild, and there's nothing we can do about it as long as all the stacks are in one repo.
At any rate, all the "latest" images are now pushed to Docker Hub using notebook 3.2.1. I plan to now create a 3.2.1 branch, update all the descendants of minimal-notebook to build FROM jupyter/minimal-notebook:3.2.1, and get those tagged builds going too. If that all works, we can move on to issue #6 once Conda has a 4.0 build.
from docker-stacks.
Created branch 3.2.x (following the pattern of jupyter/notebook branches). Set up builds for that branch on Docker Hub. Manually triggered the build of jupyter/minimal-notebook:3.2. (Notice no patch number, since the conda command allows the patch to vary.) Updated all Dockerfiles in the branch to build FROM that tagged 3.2 image. Pushed that minor change to the 3.2.x branch, and all images started rebuilding automatically on Docker Hub. They all finished successfully.
from docker-stacks.
(1) and (2) from the original description are done. (3) is done enough (tags reflect the main process version). I opened issue #12 for the finer points of how to capture versions of installed libraries in a stable manner.
from docker-stacks.