
ehfd / nvidia-dind


Isolated DinD (Docker in Docker) container for developing and deploying Docker containers using NVIDIA GPUs and the NVIDIA container toolkit. Useful for deploying the Docker engine with NVIDIA in Kubernetes.

Home Page: https://github.com/ehfd/nvidia-dind/pkgs/container/nvidia-dind

License: Apache License 2.0

Languages: Dockerfile 51.18%, Shell 48.82%

Topics: container, container-image, dind, docker, docker-container, docker-image, k8s, kubernetes, nvidia, nvidia-docker, nvidia-gpu, supervisor

nvidia-dind's Issues

minikube cannot detect the GPUs

I used your image to create a container and installed minikube inside it. When I run minikube start, the minikube node does not detect any GPU, even though nvidia-smi works inside the container. How can I fix this?

docker run --gpus 1 -it --privileged --name ElasticDL -d ghcr.io/ehfd/nvidia-dind:latest
docker exec -it ElasticDL /bin/bash
# install minikube
minikube start
alias kubectl="minikube kubectl --"
kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"

The GPU column in the output is <none>.

When I run a pod, its status is:

FailedScheduling  default-scheduler
0/1 nodes are available: 1 Insufficient nvidia.com/gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
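
For anyone reproducing this, the check above can be scripted. The following is a minimal sketch, not part of the original report: the function name nodes_without_gpu is hypothetical, and it assumes the two-column NAME/GPU table that the kubectl command above prints. It lists the nodes whose allocatable nvidia.com/gpu is missing or zero.

```shell
#!/bin/sh
# Hypothetical helper (not from the original issue): reads the NAME/GPU table
# printed by the kubectl custom-columns command on stdin and prints the name
# of every node that reports no allocatable nvidia.com/gpu.
nodes_without_gpu() {
    # Skip the header row; kubectl prints "<none>" when the field is absent.
    awk 'NR > 1 && ($2 == "<none>" || $2 == "0") { print $1 }'
}

# Usage inside the dind container (needs a running cluster):
#   kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu" | nodes_without_gpu
```

An empty result would mean every node advertises at least one GPU. In the case reported here, the function would print minikube.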

Needs cuda (at least for pytorch)

This is less a bug report and more of an FYI for anyone else trying to use this container with PyTorch (or anything else built on torch). I was able to get nvidia-smi to work inside my matryoshka (nested) container, but the CUDA version showed as N/A and torch.cuda.is_available() returned False. I changed the base image from ubuntu to the CUDA base image matching my AMI (11.4 in this case), and now it works!
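
A quick way to spot this symptom is to read the CUDA version field from the nvidia-smi banner. The sketch below is only an illustration: cuda_version_from_smi is a hypothetical helper, and it assumes the usual banner line containing "CUDA Version: ...".

```shell
#!/bin/sh
# Hypothetical helper (not from the original issue): extracts the value of the
# "CUDA Version:" field from nvidia-smi banner text read on stdin.
# Prints e.g. "11.4", or "N/A" when nvidia-smi reports no CUDA version.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([^ |]*\).*/\1/p' | head -n 1
}

# Usage inside the container:
#   nvidia-smi | cuda_version_from_smi
#   python3 -c 'import torch; print(torch.cuda.is_available())'
```

If the first command prints N/A, that matches the situation described above, where torch.cuda.is_available() returned False until the base image was changed.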

Containers running inside the dind don't have access to the GPU

Hi,
This is a great solution for running an isolated Docker environment on top of the host machine's Docker daemon.
Running nvidia-smi inside the dind container works as expected.
However, when I start a GPU container inside the dind container:
docker run --rm --name test --gpus=all --runtime=nvidia nvidia/cuda:12.0.1-base-ubuntu22.04 nvidia-smi
I get the following error:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: mount error: stat failed: /proc/driver/nvidia/gpus/0000:04:00.0: no such file or directory: unknown.
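
To narrow this down, it can help to extract the device path named in the error and compare it with what is actually present under /proc/driver/nvidia/gpus inside the dind container. This is a minimal sketch under that assumption; missing_gpu_path is a hypothetical helper, not something the image provides.

```shell
#!/bin/sh
# Hypothetical helper (not from the original issue): pulls the
# /proc/driver/nvidia/gpus/<PCI address> path out of an nvidia-container-cli
# error message read on stdin.
missing_gpu_path() {
    # grep -o keeps only the path; sed drops the trailing ":" (or ".") that
    # follows the path in the error text.
    grep -o '/proc/driver/nvidia/gpus/[0-9A-Fa-f:.]*' | sed 's/[:.]*$//' | head -n 1
}

# Usage inside the dind container, with the error text saved to error.log:
#   p="$(missing_gpu_path < error.log)"
#   ls /proc/driver/nvidia/gpus/    # GPUs visible to the dind container
#   [ -e "$p" ] || echo "not visible in this container: $p"
```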
