yasserben / clouds

[CVPR 2024] Official Implementation of Collaborating Foundation models for Domain Generalized Semantic Segmentation

Home Page: https://arxiv.org/abs/2312.09788

License: Apache License 2.0

Python 84.18% Shell 0.17% C++ 1.56% Cuda 14.10%
deep-learning detectron2 domain-adaptation domain-generalization foundation-models mask2former semantic-segmentation transformer

Collaborating Foundation models for Domain Generalized Semantic Segmentation

This repository contains the code for the paper: Collaborating Foundation models for Domain Generalized Semantic Segmentation.

Overview

Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference. Existing DGSS methods typically obtain robust features by means of Domain Randomization (DR). Such an approach is often limited as it can only account for style diversification and not content. In this work, we take an orthogonal approach to DGSS and propose to use an assembly of CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation (CLOUDS). In detail, CLOUDS is a framework that integrates FMs of various kinds: (i) CLIP backbone for its robust feature representation, (ii) text-to-image generative models to diversify the content, thereby covering various modes of the possible target distribution, and (iii) Segment Anything Model (SAM) for iteratively refining the predictions of the segmentation model. Extensive experiments show that our CLOUDS excels in adapting from synthetic to real DGSS benchmarks and under varying weather conditions, notably outperforming prior methods by 5.6% and 6.7% on averaged mIoU, respectively.
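For readers unfamiliar with the mIoU metric quoted above, here is a minimal sketch of how mean Intersection-over-Union is commonly computed from flattened label maps. It is an illustration only, not the evaluation code of this repository: the function names `confusion_matrix` and `mean_iou`, and the ignore label 255 (a common convention in Cityscapes-style datasets), are assumptions.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (ground-truth class, predicted class) pairs over flattened label maps.

    Pixels whose ground-truth label equals `ignore_index` are skipped
    (255 is an assumed convention, not taken from this repository).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        cm[t][p] += 1  # rows: ground truth, columns: prediction
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # class-c pixels predicted as something else
        fp = sum(row[c] for row in cm) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:                         # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with 3 classes; the last pixel is ignored.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(mean_iou(confusion_matrix(pred, target, num_classes=3)))
```

An "averaged mIoU" over several target datasets would then be the plain mean of the per-dataset mIoU values, assuming each dataset is weighted equally.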

Installation

See installation instructions.

Getting Started

See Preparing Datasets for CLOUDS.

See Getting Started with CLOUDS.

Relevant Files:

train_net.py: The training script for CLOUDS

clouds/clouds.py: Defines the model class and its forward function, the core of the model's architecture and forward-pass logic

generate_txt_im.py: The script that generates a dataset using Stable Diffusion

prompt_llama70b.txt: A text file containing 100 prompts generated with Llama70b-Chat

Checkpoints & Generated dataset

We provide the following checkpoints for CLOUDS:

Citation

If you find our work useful in your research, please consider citing:

@misc{benigmim2023collaborating,
      title={Collaborating Foundation models for Domain Generalized Semantic Segmentation}, 
      author={Yasser Benigmim and Subhankar Roy and Slim Essid and Vicky Kalogeiton and Stéphane Lathuilière},
      year={2023},
      eprint={2312.09788},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

CLOUDS builds on the following open-source projects, and we'd like to thank their authors for making their source code available:

FC-CLIP

Mask2Former

HRDA
