PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training

Description

Unsupervised domain adaptation is a promising technique for semantic segmentation and other computer vision tasks for which large-scale data annotation is costly and time-consuming. In semantic segmentation particularly, it is attractive to train models on annotated images from a simulated (source) domain and deploy them on real (target) domains. In this work, we present a novel framework for unsupervised domain adaptation based on the notion of target-domain consistency training. Intuitively, our work is based on the insight that in order to perform well on the target domain, a model’s output should be consistent with respect to small perturbations of inputs in the target domain. Specifically, we introduce a new loss term to enforce pixelwise consistency between the model's predictions on a target image and on a perturbed version of the same image. In comparison to popular adversarial adaptation methods, our approach is simpler, easier to implement, and more memory-efficient during training. Experiments and extensive ablation studies demonstrate that our simple approach achieves remarkably strong results on two challenging synthetic-to-real benchmarks, GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes.
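
The consistency term described above can be illustrated with a minimal, framework-free sketch. It assumes a pseudo-label formulation: the argmax of the prediction on the clean target image serves as the label for the prediction on the perturbed image, and the loss is the mean per-pixel cross-entropy. This README does not spell out the exact form of the loss, so the function name `pixelwise_consistency_loss` and the input layout (a list of pixels, each a list of per-class logits) are hypothetical, and this is not the repository's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_consistency_loss(clean_logits, perturbed_logits):
    """Mean cross-entropy between perturbed predictions and pseudo-labels.

    Assumption (not taken from the README): pseudo-labels are the argmax of
    the clean-image logits. Each argument is a list of pixels, and each pixel
    is a list of per-class logits.
    """
    if not clean_logits or len(clean_logits) != len(perturbed_logits):
        raise ValueError("inputs must be non-empty and have the same number of pixels")
    total = 0.0
    for clean, perturbed in zip(clean_logits, perturbed_logits):
        # Pseudo-label: the class the model prefers on the unperturbed pixel.
        pseudo_label = max(range(len(clean)), key=clean.__getitem__)
        # Penalize the perturbed prediction for disagreeing with that label.
        total += -math.log(softmax(perturbed)[pseudo_label])
    return total / len(clean_logits)


if __name__ == "__main__":
    clean = [[5.0, 0.0], [0.0, 5.0]]
    consistent = [[4.0, 0.0], [0.0, 4.0]]
    inconsistent = [[0.0, 4.0], [4.0, 0.0]]
    print(pixelwise_consistency_loss(clean, consistent))
    print(pixelwise_consistency_loss(clean, inconsistent))
```

In an actual training loop the same idea would be applied to network outputs on batches of target images, typically without backpropagating through the clean branch; that detail is also an assumption here.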

How to run

Dependencies

  • PyTorch (tested on version 1.7.1, but should work on any version)
  • Hydra 1.1: pip install hydra-core --pre
  • Other: pip install albumentations tqdm tensorboard
  • WandB (optional): pip install wandb

General

We use Hydra for configuration and Weights and Biases for logging. With Hydra, you can specify a config file (found in configs/) with --config-name=myconfig.yaml. You can also override the config from the command line by specifying the overriding arguments (without --). For example, you can disable Weights and Biases with wandb=False and you can name the run with name=myname.

We have prepared example configs for GTA5 and SYNTHIA in configs/gta5.yaml and configs/synthia.yaml.
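
To make the override behaviour described above concrete, here is a toy sketch of how `key=value` arguments could be merged into a nested config. It is not Hydra's implementation and does not use Hydra's API; the helpers `parse_value` and `apply_overrides` are hypothetical and only mimic the general idea of dotted-key overrides such as `wandb=False` or `model.checkpoint=...`.

```python
import copy


def parse_value(raw):
    """Convert a raw override string to bool, int, or float when possible."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # fall back to a plain string


def apply_overrides(config, overrides):
    """Return a copy of `config` with 'dotted.key=value' overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Walk (or create) the nested dictionaries named by the dotted key.
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return result


if __name__ == "__main__":
    # Illustrative defaults only; the real values live in configs/*.yaml.
    base = {"wandb": True, "name": "default", "model": {"checkpoint": None}}
    print(apply_overrides(base, ["wandb=False", "name=myname"]))
```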

Data Preparation

To run on GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes, you need to download the respective datasets. Once they are downloaded, you can either modify the config files directly, or organize/symlink the data in the datasets/ directory as follows:

datasets
├── cityscapes
│   ├── gtFine
│   │   ├── train
│   │   │   ├── aachen
│   │   │   └── ...
│   │   └── val
│   └── leftImg8bit
│       ├── train
│       └── val
├── GTA5
│   ├── images
│   ├── labels
│   └── list
├── SYNTHIA
│   └── RAND_CITYSCAPES
│       ├── Depth
│       │   └── Depth
│       ├── GT
│       │   ├── COLOR
│       │   └── LABELS
│       ├── RGB
│       └── synthia_mapped_to_cityscapes
├── city_list
├── gta5_list
└── synthia_list
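
Before launching a run, it can help to confirm that the layout above is in place. The following is a small sketch, assuming the data lives under `datasets/` relative to the working directory; the helper `missing_dataset_paths` is hypothetical and not part of this repository. It checks only a subset of the entries shown in the tree.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the datasets/ root.
EXPECTED_PATHS = [
    "cityscapes/gtFine/train",
    "cityscapes/gtFine/val",
    "cityscapes/leftImg8bit/train",
    "cityscapes/leftImg8bit/val",
    "GTA5/images",
    "GTA5/labels",
    "GTA5/list",
    "SYNTHIA/RAND_CITYSCAPES/RGB",
    "SYNTHIA/RAND_CITYSCAPES/GT/LABELS",
    "city_list",
    "gta5_list",
    "synthia_list",
]


def missing_dataset_paths(root="datasets"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    # Symlinked directories count as present because exists() follows links.
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_dataset_paths()
    if missing:
        print("Missing dataset paths:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```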

Initial Models

  • For GTA5-to-Cityscapes, we start with a model pretrained on the source (GTA5): Download
  • For SYNTHIA-to-Cityscapes, we start with a model pretrained on ImageNet: Download

SYNTHIA-to-Cityscapes

To run a baseline PixMatch model with standard data augmentations, we can use a command such as:

python main.py --config-name=synthia lam_aug=0.10 name=synthia_baseline

It is also easy to run a model with multiple augmentations:

python main.py --config-name=synthia lam_aug=0.00 lam_fourier=0.10 lam_cutmix=0.10 name=synthia_fourier_and_cutmix
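
The `lam_*` arguments in the commands above read as per-augmentation weights. As a rough sketch, assuming each weight scales one consistency term that is added to the supervised source loss (the README does not state this explicitly), the combination might look like the following. The function `total_loss` and the dictionary keys are hypothetical.

```python
def total_loss(source_loss, consistency_losses, weights):
    """Combine a source loss with weighted consistency terms.

    Assumption: each lam_* value multiplies the consistency loss of the
    matching perturbation; terms without a weight contribute nothing.
    """
    weighted = sum(
        weights.get(name, 0.0) * value
        for name, value in consistency_losses.items()
    )
    return source_loss + weighted


if __name__ == "__main__":
    # Mirrors lam_aug=0.00 lam_fourier=0.10 lam_cutmix=0.10 with made-up loss values.
    weights = {"aug": 0.00, "fourier": 0.10, "cutmix": 0.10}
    losses = {"aug": 2.0, "fourier": 3.0, "cutmix": 4.0}
    print(total_loss(1.0, losses, weights))
```

Under this reading, setting a weight to 0.00 disables the corresponding perturbation's contribution to training.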

GTA5-to-Cityscapes

python main.py --config-name=gta5 lam_aug=0.10 name=gta5_baseline

Evaluation

To evaluate, simply set the train argument to False:

python main.py train=False

Pretrained models

To evaluate a pretrained/trained model, you can run:

# GTA (default)
CUDA_VISIBLE_DEVICES=3 python main.py train=False wandb=False model.checkpoint=$(pwd)/pretrained/GTA5-to-Cityscapes-checkpoint.pth

# SYNTHIA
CUDA_VISIBLE_DEVICES=3 python main.py --config-name synthia train=False wandb=False model.checkpoint=$(pwd)/pretrained/SYNTHIA-to-Cityscapes-checkpoint.pth

Citation

@inproceedings{melaskyriazi2021pixmatch,
  author    = {Melas-Kyriazi, Luke and Manrai, Arjun},
  title     = {PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}

pixmatch's Issues

gta5_source_pretrained

Hi author, thanks for your great work! I noticed that for GTA5-to-Cityscapes you start from a model pretrained on the source domain (GTA5). Could you share the training details for this GTA5 source-pretrained model? I would like to train it from scratch myself.

About target size

Dear author, thanks for your great work. I see that you use [1280, 640] as the target image size, whereas common UDA semantic segmentation methods use [1024, 512]. When I tried [1024, 512] as the target image size, I got worse results. Is the target image size an important hyperparameter, and why did you choose [1280, 640]?
