Giter VIP home page Giter VIP logo

poda's Introduction

PØDA: Prompt-driven Zero-shot Domain Adaptation

Mohammad Fahes1, Tuan-Hung Vu1,2, Andrei Bursuc1,2, Patrick Pérez1,2, Raoul de Charette1
1 Inria, Paris, France.

2 valeo.ai, Paris, France.

TL; DR: PØDA (or PODA) is a simple feature augmentation method for zero-shot domain adaptation guided by a single textual description of the target domain.

Project page: https://astra-vision.github.io/PODA/
Paper: https://arxiv.org/abs/2212.03241

Citation

@article{fahes2022poda,
  title={P{\O}DA: Prompt-driven Zero-shot Domain Adaptation},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  journal={arXiv preprint arXiv:2212.03241},
  year={2022}
}

Overview

Overview of PØDA

Method

Optimizing AdaIN layer based on "feature/target domain description" similarity

Teaser

Test on unseen youtube video of night driving:
Training dataset: Cityscapes
Prompt: "driving at night"

Table of Content

Installation

Dependencies

First create a new conda environment with the required packages:

conda env create --file environment.yml

Then activate environment using:

conda activate poda_env

Datasets

  • CITYSCAPES: Follow the instructions in Cityscapes to download the images and semantic segmentation ground-truths. Please follow the dataset directory structure:

    <CITYSCAPES_DIR>/             % Cityscapes dataset root
    ├── leftImg8bit/              % input image (leftImg8bit_trainvaltest.zip)
    └── gtFine/                   % semantic segmentation labels (gtFine_trainvaltest.zip)
  • ACDC: Download ACDC images and ground truths from ACDC. Please follow the dataset directory structure:

    <ACDC_DIR>/                   % ACDC dataset root
    ├── rbg_anon/                 % input image (rgb_anon_trainvaltest.zip)
    └── gt/                       % semantic segmentation labels (gt_trainval.zip)
  • GTA5: Download GTA5 images and ground truths from GTA5. Please follow the dataset directory structure:

    <GTA5_DIR>/                   % GTA5 dataset root
    ├── images/                   % input image 
    └── labels/                   % semantic segmentation labels

Pretrained models

The source-only pretrained models are available here

Running PODA

Source training

python3 main.py \
  --dataset <source_dataset> \
  --data_root <path_to_source_dataset> \
  --data_aug \
  --lr 0.1 \
  --crop_size 768 \
  --batch_size 2 \
  --freeze_BB \
  --ckpts_path saved_ckpts

Feature optimization

python3 f_aug_with_CLIP.py \
--dataset <source_dataset> \
--data_root <path_to_source_dataset> \
--total_it 100 \
--resize_feat \
--domain_desc <target_domain_description>  \
--save_dir <directory_for_saved_statistics>

Model adaptation

python3 main.py \
--dataset <source_dataset> \
--data_root <path_to_source_dataset> \
--ckpt <path_to_source_checkpoint> \
--batch_size 16 \
--lr 0.001 \
--ckpts_path adapted \
--freeze_BB \
--train_aug \
--total_itrs 2000 \ 
--path_mu_sig <path_to_augmented_statistics> \
--mix <put_if_style_mixing>

Evaluation

python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_path> \
--ckpt <path_to_tested_model> \
--test_only \
--val_batch_size 1 \
--ACDC_sub <ACDC_subset_if_tested_on_ACDC>   

Inference & Visualization

To test any model on any image and visualize the output, please add the images to predict_test directory and run:

python3 predict.py \
--ckpt <ckpt_path> \
--save_val_results_to <directory_for_saved_output_images>

License

PODA is released under the Apache 2.0 license.

Acknowledgement

The code is based on this implementation of DeepLabv3+, and uses code from CLIP


↑ back to top

poda's People

Contributors

rdecharette avatar mfahes avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.