
Home Page: https://github.com/selkies-project/docker-nvidia-egl-desktop/pkgs/container/nvidia-egl-desktop

License: Mozilla Public License 2.0

Languages: Dockerfile 84.38%, Shell 15.62%
Topics: nvidia-docker, nvidia, docker-image, docker, html5, opengl, ubuntu, gpu, nvidia-gpu, kubernetes

docker-nvidia-egl-desktop's Introduction

docker-nvidia-egl-desktop

KDE Plasma Desktop container designed for Kubernetes, with direct access to the GPU using EGL (through VirtualGL) and Vulkan, streamed over WebRTC and HTML5, providing an open-source remote cloud graphics or game streaming platform. Does not require /tmp/.X11-unix host sockets or host configuration.

Use docker-nvidia-glx-desktop for a KDE Plasma Desktop container with better performance and fully optimized OpenGL and Vulkan on NVIDIA GPUs; it spawns its own fully isolated X server instead of using /tmp/.X11-unix host sockets.

Read the Troubleshooting section first before raising an issue. Support is also available on the Selkies Discord. Please redirect issues or discussions regarding the selkies-gstreamer WebRTC HTML5 interface to that project.

Usage

This container is composed entirely of vendor-neutral applications and protocols, except for the NVIDIA base container itself, so nothing stops you from using it with GPUs from other vendors, including AMD and Intel. Use the respective vendor's container toolkit/runtime or Kubernetes device plugin, make sure that it provisions /dev/dri/card[n] devices, and set the environment variable WEBRTC_ENCODER to x264enc, vp8enc, or vp9enc if using the selkies-gstreamer WebRTC interface. However, this is not officially supported and you must solve your own problems. The container also runs without any GPUs using software fallback (again, set WEBRTC_ENCODER to x264enc, vp8enc, or vp9enc if using the selkies-gstreamer WebRTC interface), as in the sketch below.
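For example, a minimal, unofficial sketch of running the container on an AMD or Intel GPU, or with no GPU at all, by passing the DRI devices through and forcing the x264enc software encoder (omit --device=/dev/dri when running without any GPU; the password is a placeholder):
docker run -it --tmpfs /dev/shm:rw --device=/dev/dri -e PASSWD=mypasswd -e BASIC_AUTH_PASSWORD=mypasswd -e WEBRTC_ENCODER=x264enc -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest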

Wine, Winetricks, Lutris, and PlayOnLinux are bundled by default. If you want to remove them from the container, comment out the section of the Dockerfile where they are installed.

There are two web interfaces to choose from in this container: the default selkies-gstreamer WebRTC HTML5 interface (which requires a TURN server or host networking), and the fallback noVNC WebSocket HTML5 interface. While the noVNC interface does not support audio forwarding or remote cursors for gaming, it can be useful for troubleshooting the selkies-gstreamer WebRTC interface or for low-bandwidth environments.

The noVNC interface can be enabled by setting NOVNC_ENABLE to true. When using the noVNC interface, all environment variables related to the selkies-gstreamer WebRTC interface are ignored, with the exception of BASIC_AUTH_PASSWORD. As with the selkies-gstreamer WebRTC interface, the noVNC interface password is set to BASIC_AUTH_PASSWORD, which defaults to PASSWD if not set. The noVNC interface additionally accepts the NOVNC_VIEWPASS environment variable, which sets a view-only password that allows observing the desktop without controlling it, as in the example below.
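A sketch of a Docker command that enables the noVNC interface with a view-only password (all passwords here are placeholders):
docker run --gpus 1 -it --tmpfs /dev/shm:rw -e NOVNC_ENABLE=true -e PASSWD=mypasswd -e BASIC_AUTH_PASSWORD=mypasswd -e NOVNC_VIEWPASS=viewonlypasswd -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest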

The container requires host NVIDIA GPU driver versions of at least 450.80.02, and preferably 470.42.01, with the NVIDIA Container Toolkit also configured on the host for allocating GPUs. All Maxwell or later generation GPUs in the consumer, professional, or datacenter lineups should run this container without significant issues, although the selkies-gstreamer high-performance NVENC backend may not be available (see the next paragraph). Kepler GPUs are untested and likely do not support the NVENC backend, but can be mostly functional using fallback software acceleration.

The high-performance NVENC backend for the selkies-gstreamer WebRTC interface is only supported on GPUs listed as supporting H.264 (AVCHD) under the NVENC - Encoding section of NVIDIA's Video Encode and Decode GPU Support Matrix. If you are using software fallback without allocated GPUs, or your GPU is not listed as supporting H.264 (AVCHD), add the environment variable WEBRTC_ENCODER with the value x264enc, vp8enc, or vp9enc to your container configuration to fall back to software acceleration, which also performs well depending on your CPU.

The username is user for both the container user account and the web authentication prompt. The environment variable PASSWD is the password of the container user account, and BASIC_AUTH_PASSWORD is the password for the HTML5 interface authentication prompt. If ENABLE_BASIC_AUTH is set to true for selkies-gstreamer (not required for noVNC) but BASIC_AUTH_PASSWORD is unspecified, the HTML5 interface password will default to PASSWD.

NOTES: Only one web browser can be connected at a time with the selkies-gstreamer WebRTC interface. If the signaling connection works, but the WebRTC connection fails, read the Using a TURN Server section.

Running with Docker

  1. Run the container with Docker (or other similar container CLIs like Podman):
docker run --gpus 1 -it --tmpfs /dev/shm:rw -e TZ=UTC -e SIZEW=1920 -e SIZEH=1080 -e REFRESH=60 -e DPI=96 -e CDEPTH=24 -e PASSWD=mypasswd -e WEBRTC_ENCODER=nvh264enc -e BASIC_AUTH_PASSWORD=mypasswd -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest

NOTES: The container tags available are latest and 22.04 for Ubuntu 22.04, and 20.04 for Ubuntu 20.04. Persistent container tags are available in the form 22.04-20210101010101. Replace all instances of mypasswd with your desired password. BASIC_AUTH_PASSWORD will default to PASSWD if unspecified. The container must not be run in privileged mode.

The environment variable VGL_DISPLAY can also be passed to the container, but only do so after you understand its implications with VirtualGL. Valid values are egl[n], or /dev/dri/card[n] only when --device=/dev/dri was passed to the container.
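For example, a sketch that passes the DRI devices through and points VirtualGL at a specific card (the card index and password are only illustrative):
docker run --gpus 1 -it --tmpfs /dev/shm:rw --device=/dev/dri -e VGL_DISPLAY=/dev/dri/card0 -e PASSWD=mypasswd -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest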

Change WEBRTC_ENCODER to x264enc, vp8enc, or vp9enc when using the selkies-gstreamer interface if you are using software fallback without allocated GPUs or your GPU does not support H.264 (AVCHD) under the NVENC - Encoding section in NVIDIA's Video Encode and Decode GPU Support Matrix.

  2. Connect to the web server with a browser on port 8080. You may also separately configure a reverse proxy to this port for external connectivity.

NOTES: Additional configurations and environment variables for the selkies-gstreamer WebRTC HTML5 interface are listed in lines that start with parser.add_argument within the selkies-gstreamer main script.

  3. (Not Applicable for noVNC) Read carefully if the selkies-gstreamer WebRTC HTML5 interface does not connect. Choose whether to use host networking or a TURN server. The selkies-gstreamer WebRTC HTML5 interface will likely just start working if you add --network host to the above docker run command (see the sketch below). However, this may be restricted or undesirable for security reasons. If so, check whether the container starts working after omitting --network host. If it does not, you need a TURN server. Read the Using a TURN Server section and add the environment variables -e TURN_HOST= and -e TURN_PORT=, plus either -e TURN_SHARED_SECRET= or both -e TURN_USERNAME= and -e TURN_PASSWORD=, to the docker run command based on your authentication method.
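For reference, a sketch of the host networking variant described above; the -p port mapping is unnecessary because the container shares the host network (the password is a placeholder):
docker run --gpus 1 -it --tmpfs /dev/shm:rw --network host -e PASSWD=mypasswd -e BASIC_AUTH_PASSWORD=mypasswd -e WEBRTC_ENCODER=nvh264enc ghcr.io/selkies-project/nvidia-egl-desktop:latest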

Running with Kubernetes

  1. Create the Kubernetes Secret with your authentication password:
kubectl create secret generic my-pass --from-literal=my-pass=YOUR_PASSWORD

NOTES: Replace YOUR_PASSWORD with your desired password, and change the name my-pass to your preferred name for the Kubernetes secret, updating the egl.yml file accordingly as well. It is possible to skip the first step and directly provide the password with value: in egl.yml, but this exposes the password in plain text.
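To confirm the secret was stored as intended, you can decode it back (a sketch using the example name my-pass from above):
kubectl get secret my-pass -o jsonpath='{.data.my-pass}' | base64 -d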

  2. Create the pod after editing the egl.yml file to your needs; explanations are available in the file:
kubectl create -f egl.yml

NOTES: The container tags available are latest and 22.04 for Ubuntu 22.04, and 20.04 for Ubuntu 20.04. Persistent container tags are available in the form 22.04-20210101010101. BASIC_AUTH_PASSWORD will default to PASSWD if unspecified.
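To verify that the pod started correctly, check its status and startup logs (the pod name is a placeholder; use the name defined in your egl.yml):
kubectl get pods
kubectl logs -f <your-pod-name>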

Change WEBRTC_ENCODER to x264enc, vp8enc, or vp9enc when using the selkies-gstreamer interface if you are using software fallback without allocated GPUs or your GPU does not support H.264 (AVCHD) under the NVENC - Encoding section in NVIDIA's Video Encode and Decode GPU Support Matrix.

  3. Connect to the web server spawned at port 8080. You may configure the ingress endpoint or reverse proxy that your Kubernetes cluster provides to this port for external connectivity.

NOTES: Additional configurations and environment variables for the selkies-gstreamer WebRTC HTML5 interface are listed in lines that start with parser.add_argument within the selkies-gstreamer main script.
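If you have not configured an ingress endpoint yet, a temporary port-forward from your workstation is a simple way to test the interface at http://localhost:8080 (the pod name is a placeholder):
kubectl port-forward pod/<your-pod-name> 8080:8080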

  4. (Not Applicable for noVNC) Read carefully if the selkies-gstreamer WebRTC HTML5 interface does not connect. Choose whether to use host networking or a TURN server. The selkies-gstreamer WebRTC HTML5 interface will likely just start working if you uncomment hostNetwork: true in egl.yml. However, this may be restricted or undesirable for security reasons. If so, check whether the container starts working after commenting out hostNetwork: true. If it does not, you need a TURN server. Read the Using a TURN Server section and fill in the environment variables TURN_HOST and TURN_PORT, then provide either TURN_SHARED_SECRET or both TURN_USERNAME and TURN_PASSWORD, based on your authentication method.

Using a TURN server

Note that this section is only required for the selkies-gstreamer WebRTC HTML5 interface. As an easy fix when the signaling connection works but the WebRTC connection fails, add the option --network host to your Docker command, or uncomment hostNetwork: true in your egl.yml file when using Kubernetes (note that your cluster may not allow this, resulting in an error). This exposes your container to the host network, which disables network isolation. If this does not fix the connection issue (typically when the host is behind another firewall), or you cannot use this fix for security or technical reasons, read the rest of this section.

In most cases, when either your server or your client has a permissive firewall, the default Google STUN server configuration will work without additional configuration. However, when connecting from networks that cannot be traversed with STUN, a TURN server is required.

Deploying a TURN server

Read the instructions from selkies-gstreamer if you want to deploy a TURN server or use a public TURN server instance.

Configuring with Docker

With Docker (or Podman), use the -e option to add the TURN_HOST and TURN_PORT environment variables: the hostname or IP address of the TURN server and its port (3478 in most cases).

You may set TURN_PROTOCOL to tcp if you are only able to open TCP ports for the coTURN container to the internet, or if the UDP protocol is blocked or throttled in your client network. You may also set TURN_TLS to true with the -e option if TURN over TLS/DTLS was properly configured.

You must also provide either just TURN_SHARED_SECRET for time-limited shared secret TURN authentication, or both TURN_USERNAME and TURN_PASSWORD for legacy long-term TURN authentication, depending on your TURN server configuration. Provide only one of these authentication methods, not both. A combined sketch is shown below.
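Putting this together, a sketch of a Docker command using time-limited shared secret authentication over UDP; the host, port, secret, and passwords are placeholders:
docker run --gpus 1 -it --tmpfs /dev/shm:rw -e PASSWD=mypasswd -e BASIC_AUTH_PASSWORD=mypasswd -e TURN_HOST=turn.example.com -e TURN_PORT=3478 -e TURN_SHARED_SECRET=MY_TURN_SHARED_SECRET -e TURN_PROTOCOL=udp -e TURN_TLS=false -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest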

Configuring with Kubernetes

Your TURN server will use only one of the two ways to authenticate the client, so provide only one type of authentication method. Time-limited shared secret TURN authentication requires only the Base64-encoded TURN_SHARED_SECRET. Legacy long-term TURN authentication requires both the TURN_USERNAME and TURN_PASSWORD credentials.

Time-limited shared secret authentication

  1. Create a secret containing the TURN shared secret:
kubectl create secret generic turn-shared-secret --from-literal=turn-shared-secret=MY_TURN_SHARED_SECRET

NOTES: Replace MY_TURN_SHARED_SECRET with the shared secret of the TURN server, and change the name turn-shared-secret to your preferred name for the Kubernetes secret, updating the egl.yml file accordingly as well.

  2. Uncomment the lines in the egl.yml file related to TURN server usage, updating the TURN_HOST and TURN_PORT environment variables as needed:
- name: TURN_HOST
  value: "turn.example.com"
- name: TURN_PORT
  value: "3478"
- name: TURN_SHARED_SECRET
  valueFrom:
    secretKeyRef:
      name: turn-shared-secret
      key: turn-shared-secret
- name: TURN_PROTOCOL
  value: "udp"
- name: TURN_TLS
  value: "false"

NOTES: It is possible to skip the first step and directly provide the shared secret with value:, but this exposes the shared secret in plain text. Set TURN_PROTOCOL to tcp if you were only able to open TCP ports while creating your own coTURN Deployment/DaemonSet, or if your client network throttles or blocks the UDP protocol.

Legacy long-term authentication

  1. Create a secret containing the TURN password:
kubectl create secret generic turn-password --from-literal=turn-password=MY_TURN_PASSWORD

NOTES: Replace MY_TURN_PASSWORD with the password of the TURN server, and change the name turn-password to your preferred name for the Kubernetes secret, updating the egl.yml file accordingly as well.

  2. Uncomment the lines in the egl.yml file related to TURN server usage, updating the TURN_HOST, TURN_PORT, and TURN_USERNAME environment variables as needed:
- name: TURN_HOST
  value: "turn.example.com"
- name: TURN_PORT
  value: "3478"
- name: TURN_USERNAME
  value: "username"
- name: TURN_PASSWORD
  valueFrom:
    secretKeyRef:
      name: turn-password
      key: turn-password
- name: TURN_PROTOCOL
  value: "udp"
- name: TURN_TLS
  value: "false"

NOTES: It is possible to skip the first step and directly provide the TURN password with value:, but this exposes the TURN password in plain text. Set TURN_PROTOCOL to tcp if you were only able to open TCP ports while creating your own coTURN Deployment/DaemonSet, or if your client network throttles or blocks the UDP protocol.

Troubleshooting

I have an issue related to the WebRTC HTML5 interface.

Please refer to the selkies-gstreamer project, as noted in the introduction above.

I want to use the keyboard layout of my own language.

Run Input Method: Configure Input Method from the start menu, uncheck Only Show Current Language, search and add from available input methods (Hangul, Mozc, Pinyin, and others) by moving to the right, then use Ctrl + Space to switch between the input methods. Raise an issue if you need more layouts.

The container does not work.

Check that the NVIDIA Container Toolkit is properly configured on the host. Next, check whether your host NVIDIA GPU driver is the nvidia-headless variant, which lacks the display and graphics capabilities required by this container.

After that, check the environment variable NVIDIA_DRIVER_CAPABILITIES after starting a shell interface inside the container. NVIDIA_DRIVER_CAPABILITIES should be set to all, or include a comma-separated list of compute (requirement for CUDA and OpenCL, or for the selkies-gstreamer WebRTC remote desktop interface), utility (requirement for nvidia-smi and NVML), graphics (requirement for OpenGL and part of the requirement for Vulkan), video (required for encoding or decoding videos using NVIDIA GPUs, or for the selkies-gstreamer WebRTC remote desktop interface), display (the other requirement for Vulkan), and optionally compat32 if you use Wine or 32-bit graphics applications.
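A quick sketch: from a shell inside the container, print the variable; when starting the container, you can also set it explicitly (the remaining run options are omitted with an ellipsis):
echo "$NVIDIA_DRIVER_CAPABILITIES"
docker run --gpus 1 -e NVIDIA_DRIVER_CAPABILITIES=all ... ghcr.io/selkies-project/nvidia-egl-desktop:latest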

Moreover, if you are using custom configurations, check whether your shared memory path /dev/shm has sufficient capacity; you can expand the capacity by adding --tmpfs /dev/shm:rw to your Docker command, or by merging the lines below into your Kubernetes configuration file.

spec:
  template:
    spec:
      containers:
      - name: <your-container-name>
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory

If you checked everything here, scroll down.

I want to use systemd, polkit, FUSE mounts, or sandboxed (containerized) application distribution systems like Flatpak, Snapcraft (snap), AppImage, and so on.

Use the option --appimage-extract-and-run or --appimage-extract with your AppImage to run it in a container, as in the example below. Alternatively, run export APPIMAGE_EXTRACT_AND_RUN=1 in your current shell. For controlling PulseAudio, use pactl instead of pacmd, as the latter corrupts the audio system within the container. Use sudoedit to edit protected files in the desktop instead of using sudo followed by the name of the editor.
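For example (MyApp.AppImage is a hypothetical file name):
./MyApp.AppImage --appimage-extract-and-run
export APPIMAGE_EXTRACT_AND_RUN=1   # alternatively, applies to every AppImage started from this shell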

Long answer:

For systemd, polkit, FUSE mounts, or sandboxed application distribution systems, do not use them with containers. You can use them if you add unsafe capabilities to your containers, but it will break the isolation of the containers. This is especially bad if you are using Kubernetes. For controlling PulseAudio, use pactl instead of pacmd as the latter corrupts the audio system within the container. Because polkit does not work, use sudoedit to edit protected files with the GUI instead of using sudo followed by the name of the editor. There will likely be an alternative way to install the applications, including Personal Package Archives. For some applications, there will be options to disable sandboxing when running or options to extract files before running.

OpenGL does not work for certain applications.

This is likely an issue with VirtualGL, which is used to translate GLX commands to EGL commands and use OpenGL without Xorg. Some applications, including research workloads, show this problem. This cannot be solved by raising an issue here or contacting me.

First, check that the application works with docker-nvidia-glx-desktop in the same host environment. If it works, it is indeed a problem associated with VirtualGL. If it does not, raise an issue here. Second, use the error messages found with verbose mode (see the sketch below) and search for similar issues for your application. Third, if there are no similar issues, raise the issue in the application's repository or contact its maintainers. Fourth, if the maintainers request that it be redirected to VirtualGL, raise an issue there after confirming that VirtualGL does not already have similar issues raised. Note that in this case, you may have to wait for a new VirtualGL release and for this repository to adopt the new release.
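To obtain the verbose output mentioned above, run the application through VirtualGL with its verbose flag (a sketch; glxgears is only a stand-in for your application):
vglrun +v glxgears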

Vulkan does not work.

Make sure that the NVIDIA_DRIVER_CAPABILITIES environment variable is set to all, or includes both graphics and display. The display capability is especially crucial for Vulkan; despite the capability's name, the container starts without noticeable issues other than broken Vulkan when it is missing. AMD and Intel GPUs are not tested, so Vulkan is not guaranteed to work with them: a Vulkan ICD file probably needs to be added and related drivers such as mesa-vulkan-drivers installed inside the container. People are welcome to share their experiences, however.
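For AMD or Intel GPUs, a sketch of installing the Mesa Vulkan drivers and checking which device the Vulkan loader sees from inside the container (package names assume the Ubuntu-based image):
sudo apt-get update && sudo apt-get install -y mesa-vulkan-drivers vulkan-tools
vulkaninfo | grep -i deviceName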

I want to use a specific GPU for OpenGL rendering when I have multiple GPUs in one container.

Use the VGL_DISPLAY environment variable, but only do so after you understand its implications with VirtualGL. Valid values are egl[n], or /dev/dri/card[n] only when --device=/dev/dri was passed to the container ([n] is the index of the GPU; egl without a number is equivalent to egl0). Note that docker --gpus 1 means any single GPU, not the GPU with device ID 1. Use docker --gpus '"device=1,2"' to provision the GPUs with device IDs 1 and 2 to the container.
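For example, a sketch that provisions the GPUs with device IDs 1 and 2, then renders OpenGL on the second provisioned GPU (the IDs and password are placeholders):
docker run --gpus '"device=1,2"' -it --tmpfs /dev/shm:rw -e VGL_DISPLAY=egl1 -e PASSWD=mypasswd -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest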


This work was supported in part by National Science Foundation (NSF) awards CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, OAC-2112167, CNS-2100237, CNS-2120019, the University of California Office of the President, and the University of California San Diego's California Institute for Telecommunications and Information Technology/Qualcomm Institute. Thanks to CENIC for the 100Gbps networks.

docker-nvidia-egl-desktop's People

Contributors

ehfd, numerical2017


docker-nvidia-egl-desktop's Issues

how to modify the entrypoint.sh in container

Hi,
If I modify /etc/entrypoint.sh in the container and then restart the container, there are fewer processes than on the first start. What is the problem?

user@c907d6e6b0c7:~$ ps -ef
UID PID PPID C STIME TTY TIME CMD
user 1 0 1 04:44 pts/0 00:00:00 /usr/bin/python3 /usr/bin/supervisord
user 8 1 0 04:44 pts/0 00:00:00 /bin/bash /etc/entrypoint.sh
root 29 1 0 04:44 ? 00:00:00 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups
message+ 43 1 0 04:44 ? 00:00:00 /usr/bin/dbus-daemon --system
user 105 8 51 04:44 pts/0 00:00:09 /usr/bin/java -Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djdk.tls.ephemeralDHKeySize=2048
user 106 8 0 04:44 pts/0 00:00:00 guacd -b 0.0.0.0 -f
user 161 0 2 04:44 pts/1 00:00:00 /bin/bash
user 172 161 0 04:44 pts/1 00:00:00 ps -ef

I cannot see the VNC-like processes anymore. Any ideas?
Thanks a lot.

Missing gstreamer plugins nice, webrtc, dtls, ...

Hi, thanks a lot for this repo, it's been extremely useful!

I've customized the container considerably, and for some reason I get an error after running selkies-gstreamer --addr="127.0.0.1" --port="8080"

Traceback (most recent call last):
  File "/opt/conda/bin/selkies-gstreamer", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/selkies_gstreamer/__main__.py", line 527, in main
    app = GSTWebRTCApp(stun_servers, turn_servers, enable_audio, audio_channels, curr_fps, args.encoder, curr_video_bitrate, curr_audio_bitrate)
  File "/opt/conda/lib/python3.10/site-packages/selkies_gstreamer/gstwebrtc_app.py", line 83, in __init__
    self.check_plugins()
  File "/opt/conda/lib/python3.10/site-packages/selkies_gstreamer/gstwebrtc_app.py", line 654, in check_plugins
    raise GSTWebRTCAppError('Missing gstreamer plugins:', missing)
gstwebrtc_app.GSTWebRTCAppError: ('Missing gstreamer plugins:', ['nice', 'webrtc', 'dtls', 'srtp', 'rtp', 'sctp', 'rtpmanager', 'ximagesrc', 'nvcodec'])

The plugins seem to be there; I would appreciate any tips on what to look at.

ll /opt/gstreamer/lib/x86_64-linux-gnu/libgstwebrtc-1.0.so
ll /opt/gstreamer/lib/x86_64-linux-gnu/libgstwebrtc-1.0.so.0
ll /opt/gstreamer/lib/x86_64-linux-gnu/libgstwebrtc-1.0.so.0.2005.0
etc...

Will this work without i386 ?

I was wondering: if I remove all i386 references and use only the amd64 architecture, will this project still work, or does it really need all the i386 packages?

Unity 3d-Rendering app does not recognize the display/desktop in Docker env

I'm trying to run a headless render app (Unity3D) inside a Docker container.
I tried the vglrun tool as well, but it doesn't work.
The app logs end with Desktop is 0 x 0 @ 0 Hz.

  1. It does work on the normal ubuntu-desktop (not dockerized).
  2. Works with emulation tool xvfb-run --auto-servernum --server-args='-screen 0 640x480x24' ./app.x86_64, but it doesn't use GPU

Do you have any suggestions? Thanks

Choosing a specific device ID for NVIDIA GPUs

Thanks for your great work!

I have successfully run docker-nvidia-egl-desktop:latest.

However, I ran into the following problem and failed.

I want to run the Calar simulator, which needs to run on Ubuntu 18.04.

It would be great to provide an 18.04 version too.

Here is some information from running it in Docker.

docker run --gpus 4 --device=/dev/dri:rw -it  -e SHARED=TRUE -e VNCPASS=vncpasswd -p 5901:5901 ehfd/nvidia-glx-desktop:18.04
user@b769976cec0e:~$ printf "3\nn\nx\n" | sudo /opt/VirtualGL/bin/vglserver_config

1) Configure server for use with VirtualGL (GLX + EGL back ends)
2) Unconfigure server for use with VirtualGL (GLX + EGL back ends)
3) Configure server for use with VirtualGL (EGL back end only)
4) Unconfigure server for use with VirtualGL (EGL back end only)
X) Exit

Choose:

Restrict framebuffer device access to vglusers group (recommended)?
[Y/n]
... Creating /etc/modprobe.d/virtualgl.conf to set requested permissions for
    /dev/nvidia* ...
... Attempting to remove nvidia module from memory so device permissions
    will be reloaded ...
rmmod: ERROR: Module nvidia is in use by: nvidia_uvm nvidia_modeset
... Granting write permission to /dev/nvidia-modeset /dev/nvidia-uvm /dev/nvidia-uvm-tools /dev/nvidia0 /dev/nvidia1 /dev/nvidia2 /dev/nvidia3 /dev/nvidiactl for all users ...
chmod: changing permissions of '/dev/nvidia-modeset': Read-only file system
chmod: changing permissions of '/dev/nvidia-uvm': Read-only file system
chmod: changing permissions of '/dev/nvidia-uvm-tools': Read-only file system
chmod: changing permissions of '/dev/nvidia0': Read-only file system
chmod: changing permissions of '/dev/nvidia1': Read-only file system
chmod: changing permissions of '/dev/nvidia2': Read-only file system
chmod: changing permissions of '/dev/nvidia3': Read-only file system
chmod: changing permissions of '/dev/nvidiactl': Read-only file system
chown: changing ownership of '/dev/nvidia-modeset': Read-only file system
chown: changing ownership of '/dev/nvidia-uvm': Read-only file system
chown: changing ownership of '/dev/nvidia-uvm-tools': Read-only file system
chown: changing ownership of '/dev/nvidia0': Read-only file system
chown: changing ownership of '/dev/nvidia1': Read-only file system
chown: changing ownership of '/dev/nvidia2': Read-only file system
chown: changing ownership of '/dev/nvidia3': Read-only file system
chown: changing ownership of '/dev/nvidiactl': Read-only file system
... Granting write permission to /dev/dri/card0 /dev/dri/card1 /dev/dri/card2 /dev/dri/card3 /dev/dri/card4 /dev/dri/card5 /dev/dri/card6 /dev/dri/card7 /dev/dri/card8 for all users ...
... Granting write permission to /dev/dri/renderD128 /dev/dri/renderD129 /dev/dri/renderD130 /dev/dri/renderD131 /dev/dri/renderD132 /dev/dri/renderD133 /dev/dri/renderD134 /dev/dri/renderD135 for all users ...

1) Configure server for use with VirtualGL (GLX + EGL back ends)
2) Unconfigure server for use with VirtualGL (GLX + EGL back ends)
3) Configure server for use with VirtualGL (EGL back end only)
4) Unconfigure server for use with VirtualGL (EGL back end only)
X) Exit

Choose:

The screen locker is broken and unlocking is not possible anymore

When using NOVNC_ENABLE=true, the screen locks after some time, showing a black screen with "The screen locker is broken and unlocking is not possible anymore". Does anyone know how to get past this?

Using: ghcr.io/selkies-project/nvidia-egl-desktop:latest

Google Chrome with GPU

Hello, I am trying to run google-chrome with GPU acceleration, but it reports the following errors.

libEGL warning: DRI2: failed to authenticate
[87:87:0115/224538.411482:ERROR:vaapi_wrapper.cc(830)] Could not get a valid VA display
[87:87:0115/224538.411770:ERROR:gpu_init.cc(523)] Passthrough is not supported, GL is egl, ANGLE is 
[87:87:0115/224538.566925:ERROR:gpu_memory_buffer_support_x11.cc(49)] dri3 extension not supported.
Warning: loader_scanned_icd_add: Driver /usr/lib/x86_64-linux-gnu/libvulkan_intel.so supports Vulkan 1.2, but only supports loader interface version 4. Interface version 5 or newer required to support this version of Vulkan (Policy #LDP_DRIVER_7)
Warning: loader_scanned_icd_add: Driver /usr/lib/x86_64-linux-gnu/libvulkan_radeon.so supports Vulkan 1.2, but only supports loader interface version 4. Interface version 5 or newer required to support this version of Vulkan (Policy #LDP_DRIVER_7)
Warning: loader_scanned_icd_add: Driver /usr/lib/x86_64-linux-gnu/libvulkan_lvp.so supports Vulkan 1.1, but only supports loader interface version 4. Interface version 5 or newer required to support this version of Vulkan (Policy #LDP_DRIVER_7)
Warning: Layer VK_LAYER_MESA_device_select uses API version 1.2 which is older than the application specified API version of 1.3. May cause issues.
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
[87:87:0115/224547.422974:ERROR:gl_display.cc(508)] EGL Driver message (Error) eglMakeCurrent: EGL_BAD_CONTEXT error: In eglMakeCurrent: Invalid EGLContext (0x3c08012326f0)

[87:87:0115/224547.459609:ERROR:gl_display.cc(508)] EGL Driver message (Error) eglMakeCurrent: EGL_BAD_CONTEXT error: In eglMakeCurrent: Invalid EGLContext (0x3c0805665a11)

Error: eglChooseConfig returned zero configs
    at Create (../../third_party/dawn/src/dawn/native/opengl/ContextEGL.cpp:53)

Error: EGL_EXT_create_context_robustness must be supported
    at Create (../../third_party/dawn/src/dawn/native/opengl/ContextEGL.cpp:67)

Despite those errors in the console, Google Chrome reports that everything works fine.

I am using the following arguments and starting it with vglrun:

  --disable-software-rasterizer
  --disable-frame-rate-limit
  --disable-gpu-driver-bug-workarounds
  --disable-gpu-driver-workarounds
  --disable-gpu-vsync
  --enable-accelerated-2d-canvas
  --enable-accelerated-video-decode
  --enable-accelerated-mjpeg-decode
  --enable-unsafe-webgpu
  --enable-features=Vulkan,UseSkiaRenderer,VaapiVideoEncoder,VaapiVideoDecoder,CanvasOopRasterization
  --disable-features=UseOzonePlatform,UseChromeOSDirectVideoDecoder
  --enable-gpu-compositing
  --enable-native-gpu-memory-buffers
  --enable-gpu-rasterization
  --enable-oop-rasterization
  --enable-raw-draw
  --enable-zero-copy
  --ignore-gpu-blocklist
  --use-gl=egl

Do you have any idea?

Thank you for this project, btw!

Choosing a specific GPU device rendering in docker console? / VGL_DISPLAY for VirtualGL

Hi,

Thanks for this great project. It helps a lot and we really appreciate it. We followed the instructions to run the Docker container and it works (see also: How to have GPU-based rendering in docker console?).

However, when we try to choose a specific GPU device for rendering in the Docker console, it does not seem to work.

NVIDIA_VISIBLE_DEVICES=2 vglrun /opt/VirtualGL/bin/glxspheres64
CUDA_VISIBLE_DEVICES=2 vglrun /opt/VirtualGL/bin/glxspheres64

We tried the commands above, but vglrun still runs on device 1. Could you give us some advice on that?
Thank you very much.

No --device=/dev/dri, glxgears is not running on nvidia gpu

My host machine is Ubuntu 18.04 with 4 NVIDIA GPUs. I don't have a /dev/dri node; with the NVIDIA driver, I have /dev/nvidia0, /dev/nvidia1, /dev/nvidia2, and /dev/nvidia3 instead. I can run the container and connect over VNC, but cannot use the GPU.

I can run glxgears, or vglrun glxgears, but not on the GPU.
The command to start the container is:
docker run --gpus 1 -it -e SIZEW=1920 -e SIZEH=1080 -e CDEPTH=24 -e SHARED=TRUE -e VNCPASS=vncpasswd -p 5901:5901 -v "/tmp/.docker.xauth:/tmp/.docker.xauth" -e XAUTHORITY=/tmp/.docker.xauth --name vv --rm --runtime=nvidia ehfd/nvidia-egl-desktop:latest

Video stream issue

Hello,

When I try using the command you gave in the repo to run a container from the image, the web login works and the functions work, but it gets stuck at "waiting for video stream". Please look into this.

thank you,

Question regarding building a pod based on this image

Dear @ehfd, thank you so much for this great repo. You really saved my life!

But I have a few questions regarding the images:

  1. As I'm training some deep learning models using Unity simulation (it has some renderings and visual observations), a remote server with a GUI would be truly helpful. In this case, if I build my pod/job to train my networks with your configuration YAML files, xgl.yml or egl.yml (both provide Vulkan and OpenGL), I don't need to install or deal with anything like Xvfb, right? And from the documentation, I could directly use my (client) web browser to open the UI, similar to VNC?
  2. However, when using these two YAML files to create my pod, both failed with the status CreateContainerConfigError. Do I need to modify anything in your example YAML files? Or is this error caused by yesterday's Nautilus breakage or corruption issues?

Thank you again. Look forward to your reply!

Unable to choose node affinity

Hello,

I am trying to choose node affinity in order to specify which GPU I want to use for a specific program. Whenever I try to set the node affinity, the container gets stuck at the Pending status. Is there any way to set the node affinity / select the specific type of GPU to be used?

EDIT: Fixed! My CPU requirements were too high, so no node satisfied them. Thank you!

NVIDIA headless drivers do not support the container

INFO:signaling:Listening on http://0.0.0.0:8080
INFO:signaling:websocket server started
INFO:webrtc_input:Resetting keyboard modifiers.
INFO:webrtc_input:starting clipboard monitor
INFO:webrtc_input:Found XFIXES version 4.0
INFO:webrtc_input:starting cursor monitor
INFO:webrtc_input:watching for cursor changes
INFO:webrtc_input:sending clipboard content, length: 34
ERROR:asyncio:Future exception was never retrieved
future: <Future finished exception=IndexError('list index out of range')>
Traceback (most recent call last):
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/selkies_gstreamer/__main__.py", line 732, in <lambda>
    loop.run_in_executor(None, lambda: gpu_mon.start())
  File "/usr/local/lib/python3.10/dist-packages/selkies_gstreamer/gpu_monitor.py", line 42, in start
    gpu = GPUtil.getGPUs()[0]
IndexError: list index out of range
INFO:signaling:Connected to ('127.0.0.1', 59490)
INFO:signaling:Registered peer '0' at ('127.0.0.1', 59490)
INFO:signaling:'0' command 'SESSION 1'
WARNING:webrtc_input:exception from fetching cursor image: <class 'Xlib.error.BadAccess'>: code = 10, resource_id = 1293, sequence_number = 20, major_opcode = 138, minor_opcode = 4
INFO:signaling:'0' command 'SESSION 1'
INFO:signaling:Connected to ('172.38.0.3', 36944)
INFO:signaling:'0' command 'SESSION 1'
INFO:signaling:Registered peer '1' at ('172.38.0.3', 36944)
INFO:signaling:'0' command 'SESSION 1'
INFO:signaling:Session from '0' (('127.0.0.1', 59490)) to '1' (('172.38.0.3', 36944))
INFO:gstwebrtc_app:starting pipeline
0:00:06.375390028    60 0x55645ad7c690 WARN     GST_ELEMENT_FACTORY gstelementfactory.c:701:gst_element_factory_make_with_properties: no such element factory "cudaupload"!
0:00:06.375402718    60 0x55645ad7c690 WARN     GST_ELEMENT_FACTORY gstelementfactory.c:701:gst_element_factory_make_with_properties: no such element factory "cudaconvert"!
0:00:06.375430511    60 0x55645ad7c690 WARN     GST_ELEMENT_FACTORY gstelementfactory.c:754:gst_element_factory_make_valist: no such element factory "nvh264enc"!
ERROR:main:Caught exception: 'NoneType' object has no attribute 'set_property'
INFO:webrtc_input:stopping clipboard monitor
INFO:webrtc_input:stopping cursor monitor
INFO: gst-python install looks OK
WARNING:webrtc_input:exception from fetching cursor image: <class 'Xlib.error.BadAccess'>: code = 10, resource_id = 1293, sequence_number = 23, major_opcode = 138, minor_opcode = 4
INFO:webrtc_input:exiting cursor monitor

How to have GPU-based rendering in docker console?

Hi,

Thanks for this great project. It helps a lot and I really appreciate it. I followed the instructions to run the Docker container and logged in through webVNC. I can confirm that GPU-based rendering is working, as I got the following output:

user@351d9e1b605d:/opt/VirtualGL/bin$ ./glxspheres64 
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
GLX FB config ID of window: 0x11 (8/8/8/0)
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: NVIDIA GeForce RTX 3090/PCIe/SSE2
1041.901382 frames/sec - 1162.761942 Mpixels/sec
1005.491010 frames/sec - 1122.127967 Mpixels/sec

However, when I log in to that container through the Docker console instead, the rendering seems to happen on the CPU.

This is how I log in to the container through docker console:

docker exec -it <container_id> /bin/bash

And the output for the same command:

user@351d9e1b605d:/opt/VirtualGL/bin$ ./glxspheres64
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
GLX FB config ID of window: 0x13d (8/8/8/0)
Visual ID of window: 0x3d0
Context is Direct
OpenGL Renderer: llvmpipe (LLVM 12.0.0, 256 bits)
71.574763 frames/sec - 79.877436 Mpixels/sec

I understand there might be some extra setup if I go through the console instead of webVNC. Can you help me with a pointer?

LMK if you need more information from me.

NVIDIA Drivers required with nvidia-headless

Hello

I have created an image which would not have been possible without the work you have done here (and the gstreamer interface) - Amazing job.

I have discovered that EGL is only possible with /dev/dri exposed within the container, unless the NVIDIA drivers are installed in the same way as in the GLX variant.

I'm happy to submit a PR to add this on, but I'll be mostly giving your own code back to you. I wanted to check what you wanted first.

Rob

ASK: how to start single window application

Hello, I love this repo. Thanks so much!

I'm trying to start a single X client instead of the full xfce4-session. So far I've tried xstart, but with no result. Any ideas?

My end goal is to run:
startx /usr/bin/qemu-system-x86_64 ...

selkies-gstreamer hits 100% CPU

Wishing you a Happy New Year! I've set up a node using Standard_NC16as_T4_v3 in AKS. However, we're encountering an issue where the pod runs for a few minutes with CPU usage under 9% for the selkies-gstreamer process, but then it abruptly jumps to over 100%, causing the pod to freeze and become inaccessible. Could you please advise whether anything has been missed while deploying this app within the Azure environment?

Instructions to test

I am testing this repository for remote desktop with GPU, using the g3.medium instance size from https://docs.jetstream-cloud.org/general/vmsizes/#jetstream2-gpu.

This instance has NVIDIA driver 525.60.13. As per the support matrix, this GPU (A100) doesn't support NVENC, so I replaced WEBRTC_ENCODER with vp9enc. Here is the full Docker command:

docker run --gpus 1 -it --tmpfs /dev/shm:rw -e TZ=UTC -e SIZEW=1920 -e SIZEH=1080 -e REFRESH=60 -e DPI=96 -e CDEPTH=24 -e PASSWD=mypasswd -e WEBRTC_ENCODER=vp9enc -e BASIC_AUTH_PASSWORD=mypasswd -p 8080:8080 ghcr.io/selkies-project/nvidia-egl-desktop:latest

Here is the output of this:

2023-01-19 19:32:05,876 INFO Set uid to user 1000 succeeded
2023-01-19 19:32:05,878 INFO supervisord started with pid 1
2023-01-19 19:32:06,882 INFO spawned: 'entrypoint' with pid 7
2023-01-19 19:32:06,885 INFO spawned: 'pulseaudio' with pid 8
2023-01-19 19:32:06,888 INFO spawned: 'selkies-gstreamer' with pid 9
2023-01-19 19:32:06,898 INFO reaped unknown pid 15 (exit status 1)
2023-01-19 19:32:06,980 INFO reaped unknown pid 21 (exit status 1)
2023-01-19 19:32:06,980 INFO reaped unknown pid 24 (exit status 1)
2023-01-19 19:32:06,980 INFO reaped unknown pid 27 (exit status 1)
2023-01-19 19:32:06,981 INFO reaped unknown pid 31 (exit status 1)
2023-01-19 19:32:06,981 INFO reaped unknown pid 36 (exit status 1)
2023-01-19 19:32:06,981 INFO reaped unknown pid 39 (exit status 1)
2023-01-19 19:32:06,981 INFO reaped unknown pid 43 (exit status 1)
2023-01-19 19:32:06,988 INFO reaped unknown pid 46 (exit status 1)
2023-01-19 19:32:07,900 INFO success: entrypoint entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-01-19 19:32:07,900 INFO success: pulseaudio entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-01-19 19:32:07,901 INFO success: selkies-gstreamer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-01-19 19:32:07,931 INFO reaped unknown pid 65 (exit status 1)
2023-01-19 19:32:07,931 INFO reaped unknown pid 62 (exit status 1)
2023-01-19 19:32:07,931 INFO reaped unknown pid 69 (exit status 1)
2023-01-19 19:32:10,103 INFO reaped unknown pid 144 (exit status 0)
2023-01-19 19:32:10,239 INFO reaped unknown pid 151 (exit status 0)
2023-01-19 19:32:10,752 INFO reaped unknown pid 165 (exit status 0)
2023-01-19 19:32:10,962 INFO reaped unknown pid 254 (exit status 0)
2023-01-19 19:32:12,550 INFO reaped unknown pid 333 (exit status 0)

When I log in with the provided user name and password, things seem to be stuck at the "Waiting for video stream" step. I am not getting any desktop, even after waiting for 10 minutes. Am I missing some steps from the instructions?

And these are the status logs from the client, if they are of any help:

[11:33:17] [webrtc] [ERROR] attempt to send data channel message before channel was open.

[11:33:19] [signalling] Connecting to server.

[11:33:21] [signalling] Registering with server, peer ID: 1

[11:33:23] [signalling] Registered with server.

[11:33:23] [signalling] Waiting for video stream.

[11:33:25] [webrtc] [ERROR] attempt to send data channel message before channel was open.

[11:33:25] [webrtc] Received incoming video stream from peer

[11:33:25] [webrtc] Received incoming audio stream from peer

[11:33:25] [webrtc] Completed ICE candidates from peer connection

[11:33:43] [webrtc] [ERROR] attempt to send data channel message before channel was open.

[11:33:46] [webrtc] [ERROR] attempt to send data channel message before channel was open.

[11:33:46] [webrtc] [ERROR] attempt to send data channel message before channel was open.

[11:33:48] [webrtc] [ERROR] attempt to send data channel message before channel was open.

