
Imaginaire - COCO-FUNIT TUTORIAL

A step-by-step tutorial for training COCO-FUNIT on the animal faces dataset. Forked from the imaginaire project. The original README can be found here

This tutorial was tested on Pop!_OS 22.04

License

Imaginaire is released under the NVIDIA Software license. For commercial use, please consult NVIDIA Research Inquiries.

Linux Installation

Installing Dependencies

  1. Install Docker

    Install docker engine using the official guide. The link for each platform is shown under the server heading.

    After installation, add your user to the docker group:

    sudo groupadd docker
    sudo usermod -aG docker $USER

    Then restart so that your group membership is re-evaluated.

    You can verify that docker has been set up correctly by running the hello-world image:

    docker run hello-world
    
  2. Install NVIDIA Container Toolkit

    At the time of writing, I could not get the official installation method to work. The following commands install NVIDIA Container Toolkit version 1.10.0-1:

    • Ubuntu:

          distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
          && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
          && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
              sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
              sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
          sudo apt-get update
          sudo apt-get install nvidia-container-toolkit=1.10.0-1
          sudo apt-get install libnvidia-container1=1.10.0-1
          sudo apt-get install libnvidia-container-tools=1.10.0-1
    • Pop!_OS:

      Because the nvidia-container-toolkit is only supported on a handful of distributions, a few extra steps are needed to install it on Pop!_OS. The commands below reuse the Ubuntu 22.04 repository:

          distribution="ubuntu22.04" \
          && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
          && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
          sudo apt-get update
          sudo apt-get install nvidia-container-toolkit=1.10.0-1
          sudo apt-get install libnvidia-container1=1.10.0-1
          sudo apt-get install libnvidia-container-tools=1.10.0-1
    • Testing:

      A working setup can be tested by running a base CUDA container:

      docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
      

      This should result in console output similar to the one shown below:

      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
      | N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      

Setup COCO-FUNIT

  1. Clone the repo

    git clone https://github.com/Jesse27/imaginaire-coco-funit.git
  2. Build docker image

    After cloning, navigate to the path/to/imaginaire-coco-funit/ directory and run the build script:

    Note all scripts should be run from this directory.

    bash scripts/build_docker.sh 21.06
  3. Start docker image

    bash scripts/start_local_docker.sh 21.06

    This should result in the console output shown below, where 0f388ec0d8b2 is the docker CONTAINER ID:

    root@0f388ec0d8b2:/workspace/coco-funit#
    

    If you have run the code previously, you may receive the error below.

    docker: Error response from daemon: Conflict. The container name "/coco-funit" is already in use by container ...
    

    This can be solved by running the following commands to stop and remove the existing container.

    docker stop coco-funit
    docker rm coco-funit

    If you wish to keep the existing container, you can use the following command to open its terminal shell:

    docker exec -it coco-funit /bin/bash

Training on the animal faces dataset

  1. Downloading the data

    The example animal-faces dataset can be downloaded using the download_dataset.py script. This should be run inside the docker container, from the /workspace/coco-funit/ directory:

    python scripts/download_dataset.py --dataset animal_faces
  2. Build the lmdbs

    for f in train train_all val; do
        python scripts/build_lmdb.py \
            --config configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml \
            --data_root dataset/animal_faces/${f} \
            --output_root projects/coco_funit/data/lmdb/training/animal_faces/${f} \
            --overwrite
    done
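    The loop above expands to three separate invocations of build_lmdb.py, one per data split. A minimal Python sketch of that expansion (paths are taken from the tutorial; this only builds the command lines, it does not run them):

```python
# Expand the train/train_all/val loop into explicit build_lmdb.py commands.
config = "configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml"
splits = ["train", "train_all", "val"]

commands = [
    [
        "python", "scripts/build_lmdb.py",
        "--config", config,
        "--data_root", f"dataset/animal_faces/{split}",
        "--output_root", f"projects/coco_funit/data/lmdb/training/animal_faces/{split}",
        "--overwrite",
    ]
    for split in splits
]

for cmd in commands:
    print(" ".join(cmd))
```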
  3. Start Training

    --nproc_per_node=1 sets the number of GPUs used for training; it is set to 1 by default.

    Other configuration parameters are found in configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml

    python -m torch.distributed.launch --nproc_per_node=1 train.py --config configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml --logdir logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml

    Note that you may encounter a git config issue. This can be solved by running the command displayed in the terminal, then re-running the train command:

    git config --global --add safe.directory path/to/imaginaire-coco-funit/
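    Note that in the train command, --logdir mirrors the config path (including the .yaml suffix), so each experiment's logs get a unique directory. A small Python sketch of that convention (this mapping is an observation from the tutorial's commands, not an imaginaire API):

```python
# Derive the tutorial's logdir from a config path by swapping the top-level
# "configs/" prefix for "logs/".
def logdir_for(config_path: str) -> str:
    prefix = "configs/"
    assert config_path.startswith(prefix)
    return "logs/" + config_path[len(prefix):]

config = "configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml"
print(logdir_for(config))
```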
  4. Output

    The output contains images, TensorBoard logs and model checkpoints.

    The training output is found in path/to/imaginaire-coco-funit/logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml/

    The number of rows shown in the output image is equal to the batch size per GPU.
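    As a concrete check of that relationship: if the bs8 in the config name is read as a total batch size of 8 (an assumption from the file name), a single GPU shows 8 rows and two GPUs would show 4 rows each. A minimal sketch of the arithmetic:

```python
# Rows in the output image = per-GPU batch size = total batch / number of GPUs.
# Assumes "bs8" in the config name means a total batch size of 8.
def rows_per_output_image(total_batch_size: int, num_gpus: int) -> int:
    assert total_batch_size % num_gpus == 0, "batch must divide evenly across GPUs"
    return total_batch_size // num_gpus

print(rows_per_output_image(8, 1))  # single GPU: 8 rows
print(rows_per_output_image(8, 2))  # two GPUs: 4 rows each
```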

  5. TensorBoard

    TensorBoard logs should be opened within the docker container. To access the docker container while the model is training open another terminal and run the following command:

    Note that coco-funit is the default name of the docker container; you can confirm it by running docker container ls.

    docker exec -it coco-funit /bin/bash

    This should result in the console output shown below, where 0f388ec0d8b2 is the docker CONTAINER ID:

    root@0f388ec0d8b2:/workspace/coco-funit#
    

    To start TensorBoard run the following command in the docker container:

    tensorboard --logdir logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml/tensorboard

    TensorBoard can then be opened at 0.0.0.0:8083 on the local machine.

Contributors

arunmallya, mingyuliutw, jesse27, xunhuang1995, tcwang0509
