
Imaginaire - COCO-FUNIT TUTORIAL

A step-by-step tutorial for training COCO-FUNIT on the animal faces dataset. Forked from the imaginaire project. The original README can be found here

This tutorial was tested on Pop!_OS 22.04

License

Imaginaire is released under the NVIDIA Software license. For commercial use, please consult NVIDIA Research Inquiries.

Linux Installation

Installing Dependencies

  1. Install Docker

    Install docker engine using the official guide. The link for each platform is shown under the server heading.

    After installation, add your user to the docker group:

    sudo groupadd docker
    sudo usermod -aG docker $USER

    Then restart so that your group membership is re-evaluated.

    You can verify that docker has been set up correctly by running the hello-world image:

    docker run hello-world
    
  2. Install NVIDIA Container Toolkit

    At the time of writing, I could not get the official installation method to work. The following commands install NVIDIA Container Toolkit version 1.10.0-1:

    • Ubuntu:

          distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
          && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
          && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
              sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
              sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
          sudo apt-get update
          sudo apt-get install nvidia-container-toolkit=1.10.0-1
          sudo apt-get install libnvidia-container1=1.10.0-1
          sudo apt-get install libnvidia-container-tools=1.10.0-1
    • Pop!_OS:

      Because the nvidia-container-toolkit is only supported on a handful of distributions, a few extra steps are needed to install it on Pop!_OS. The commands below reuse the Ubuntu 22.04 repository:

          distribution="ubuntu22.04" \
          && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
          && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
          sudo apt-get update
          sudo apt-get install nvidia-container-toolkit=1.10.0-1
          sudo apt-get install libnvidia-container1=1.10.0-1
          sudo apt-get install libnvidia-container-tools=1.10.0-1
    • Testing:

      A working setup can be tested by running a base CUDA container:

      docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
      

      This should result in console output similar to the one shown below:

      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
      | N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
      

Setup COCO-FUNIT

  1. Clone the repo

    git clone https://github.com/Jesse27/imaginaire-coco-funit.git
  2. Build docker image

    After cloning, navigate to the path/to/imaginaire-coco-funit/ directory and run the build script:

    Note all scripts should be run from this directory.

    bash scripts/build_docker.sh 21.06
  3. Start docker image

    bash scripts/start_local_docker.sh 21.06

    This should result in the console output shown below, where 0f388ec0d8b2 is the docker CONTAINER ID:

    root@0f388ec0d8b2:/workspace/coco-funit#
    

    If you have run the code previously, you may receive the error below.

    docker: Error response from daemon: Conflict. The container name "/coco-funit" is already in use by container ...
    

    This can be solved by running the following commands to stop and remove the existing container.

    docker stop coco-funit
    docker rm coco-funit

    If you wish to keep the existing container, you can use the following command to open its terminal shell:

    docker exec -it coco-funit /bin/bash

Training on the animal faces dataset

  1. Downloading the data

    The example animal-faces dataset can be downloaded using the download_dataset.py script. This should be run inside the docker container, from the /workspace/coco-funit/ directory:

    python scripts/download_dataset.py --dataset animal_faces
  2. Build the lmdbs

    for f in train train_all val; do
        python scripts/build_lmdb.py \
            --config configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml \
            --data_root dataset/animal_faces/${f} \
            --output_root projects/coco_funit/data/lmdb/training/animal_faces/${f} \
            --overwrite
    done
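    The loop above expands to three separate invocations of build_lmdb.py, one per data split. A minimal Python sketch of that expansion (paths are taken from the tutorial; this only builds the command lines, it does not run them):

```python
# Expand the train/train_all/val loop into explicit build_lmdb.py commands.
config = "configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml"
splits = ["train", "train_all", "val"]

commands = [
    [
        "python", "scripts/build_lmdb.py",
        "--config", config,
        "--data_root", f"dataset/animal_faces/{split}",
        "--output_root", f"projects/coco_funit/data/lmdb/training/animal_faces/{split}",
        "--overwrite",
    ]
    for split in splits
]

for cmd in commands:
    print(" ".join(cmd))
```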
  3. Start Training

    --nproc_per_node=1 sets the number of GPUs used for training; it is set to 1 by default.

    Other configuration parameters are found in configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml

    python -m torch.distributed.launch --nproc_per_node=1 train.py --config configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml --logdir logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml

    Note that you may encounter a git config issue. This can be solved by running the command displayed in the terminal, then re-running the train command:

    git config --global --add safe.directory path/to/imaginaire-coco-funit/
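    Note that in the train command, --logdir mirrors the config path (including the .yaml suffix), so each experiment's logs get a unique directory. A small Python sketch of that convention (this mapping is an observation from the tutorial's commands, not an imaginaire API):

```python
# Derive the tutorial's logdir from a config path by swapping the top-level
# "configs/" prefix for "logs/".
def logdir_for(config_path: str) -> str:
    prefix = "configs/"
    assert config_path.startswith(prefix)
    return "logs/" + config_path[len(prefix):]

config = "configs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml"
print(logdir_for(config))
```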
  4. Output

    The output contains images, TensorBoard logs and model checkpoints.

    The training output is found in path/to/imaginaire-coco-funit/logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml/

    The number of rows shown in the output image is equal to the batch size per GPU.
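    As a concrete check of that relationship: if the bs8 in the config name is read as a total batch size of 8 (an assumption from the file name), a single GPU shows 8 rows and two GPUs would show 4 rows each. A minimal sketch of the arithmetic:

```python
# Rows in the output image = per-GPU batch size = total batch / number of GPUs.
# Assumes "bs8" in the config name means a total batch size of 8.
def rows_per_output_image(total_batch_size: int, num_gpus: int) -> int:
    assert total_batch_size % num_gpus == 0, "batch must divide evenly across GPUs"
    return total_batch_size // num_gpus

print(rows_per_output_image(8, 1))  # single GPU: 8 rows
print(rows_per_output_image(8, 2))  # two GPUs: 4 rows each
```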

  5. TensorBoard

    TensorBoard logs should be opened within the docker container. To access the docker container while the model is training open another terminal and run the following command:

    Note that coco-funit is the default name of the docker container; you can confirm it by running docker container ls.

    docker exec -it coco-funit /bin/bash

    This should result in the console output shown below, where 0f388ec0d8b2 is the docker CONTAINER ID:

    root@0f388ec0d8b2:/workspace/coco-funit#
    

    To start TensorBoard run the following command in the docker container:

    tensorboard --logdir logs/projects/coco_funit/animal_faces/base64_bs8_class119.yaml/tensorboard

    TensorBoard can then be opened at 0.0.0.0:8083 on the local machine.

Contributors

arunmallya, mingyuliutw, jesse27, xunhuang1995, tcwang0509
