git submodule update --init --recursive
- add the current user to the sudo group
sudo usermod -aG sudo ${USER}
- add the current user to the docker group
sudo usermod -aG docker ${USER}
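Group changes made with usermod only apply to new login sessions, so it can help to confirm that the current session already has the groups before running docker without sudo. A minimal sketch; in_session_group is a hypothetical helper name, not part of the original setup:

```shell
# Hypothetical helper: succeeds if the current session belongs to group $1.
# "id -nG" without a user argument reports the groups of the running session.
in_session_group() {
    id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

for group in sudo docker; do
    if in_session_group "$group"; then
        echo "current session is in $group"
    else
        echo "current session is NOT in $group (log out and back in after usermod)"
    fi
done
```

If a group is reported as missing right after usermod, starting a new login session is usually enough.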
Install the DGX-based PyTorch Docker container
Pull the Docker image
docker pull nvcr.io/nvidia/pytorch:21.08-py3
Start the PyTorch Docker container
docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/pytorch:21.08-py3
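In the command above, local_dir and container_dir are placeholders. Docker treats the host side of -v as a bind mount only when it is an absolute path; a bare name is interpreted as a named volume. The sketch below fills in the paths and prints the resulting command instead of running it. LOCAL_DIR, CONTAINER_DIR, and the default /workspace/project are assumptions for illustration.

```shell
IMAGE="nvcr.io/nvidia/pytorch:21.08-py3"

# Assumed values: mount the current directory at /workspace/project.
LOCAL_DIR="${LOCAL_DIR:-$PWD}"
CONTAINER_DIR="${CONTAINER_DIR:-/workspace/project}"

# Hypothetical helper: print the full docker command (dry run only).
docker_cmd() {
    printf 'docker run --gpus all -it --rm -v %s:%s %s\n' \
        "$LOCAL_DIR" "$CONTAINER_DIR" "$IMAGE"
}

docker_cmd
```

Run the printed command once the paths look right.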
- load module
module load Singularity
- pull container
singularity pull dgx_pytorch.sif docker://nvcr.io/nvidia/pytorch:21.08-py3
- start container
-- for a quick check only, on the front-end node
singularity shell <singularity_name>.sif
-- run through SBATCH scripts
singularity exec --nv <singularity_name>.sif <command>
# --nv enables NVIDIA GPU support in the container and is needed on the DGX node. In a multi-cluster setup, put srun in front of the command.
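The SBATCH route can be sketched as a small batch script. The snippet below writes one to run_pytorch.sbatch; the job name, resource values, log file name, and train.py are assumptions for illustration, and the image name matches the dgx_pytorch.sif file pulled earlier. Your cluster may also require a partition or account option.

```shell
# Write a minimal Slurm batch script. Resource values and train.py are
# hypothetical; adjust them to your cluster and project.
cat > run_pytorch.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=dgx_pytorch
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --output=dgx_pytorch_%j.log

module load Singularity
srun singularity exec --nv dgx_pytorch.sif python3 train.py
EOF

echo "Wrote run_pytorch.sbatch; submit it with: sbatch run_pytorch.sbatch"
```

The quoted 'EOF' keeps the shell from expanding anything inside the script while it is written.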
pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
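After the install, a quick check that torch is importable and can see a GPU may save a failed job later. A minimal sketch, assuming python3 is on the PATH; check_torch is a hypothetical helper name.

```shell
# Hypothetical helper: print the installed torch version and CUDA visibility,
# or a short message if torch cannot be imported.
check_torch() {
    python3 -c 'import torch; print("torch", torch.__version__); print("cuda available:", torch.cuda.is_available())' 2>/dev/null \
        || echo "torch is not importable with this python3"
}

check_torch
```

On a node without a GPU, such as a front-end node, the CUDA line is expected to report False even when the install is fine.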