Shatter and Gather: Learning Referring Image Segmentation with Text Supervision

This repository contains the official source code for our paper:

Shatter and Gather: Learning Referring Image Segmentation with Text Supervision
Dongwon Kim1, Namyup Kim1, Cuiling Lan2, and Suha Kwak1
1POSTECH CSE, 2Microsoft Research Asia
ICCV, Paris, 2023.

Dataset setup

Setting

  • Download the MS COCO images and place them under data/coco/images/train2014/
  • Download the ReferItGame data and place it under data/referit/images and data/referit/mask
  • Download TF-resnet and TF-deeplab under external folder. Then strictly foll
  • Download refer under external, then strictly follow its Setup and Download section. Also add the refer folder to PYTHONPATH
  • Download the MS COCO API under external as well (i.e. external/coco/PythonAPI/pycocotools)
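As an alternative to exporting PYTHONPATH in the shell, the external packages can be made importable from inside a script. The following is a minimal sketch, assuming the layout described in the list above (external/refer and external/coco/PythonAPI under the repository root). The helper name add_external_to_path is hypothetical and not part of this repository.

```python
import sys
from pathlib import Path


def add_external_to_path(repo_root="."):
    """Prepend the external dependencies to sys.path (hypothetical helper).

    Assumes the layout from the setup list: external/refer and
    external/coco/PythonAPI under the repository root.
    """
    root = Path(repo_root).resolve()
    candidates = [
        root / "external" / "refer",
        root / "external" / "coco" / "PythonAPI",
    ]
    added = []
    for path in candidates:
        entry = str(path)
        # Skip directories that have not been downloaded yet,
        # and avoid inserting the same entry twice.
        if path.is_dir() and entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added


if __name__ == "__main__":
    for entry in add_external_to_path():
        print("added to sys.path:", entry)
```

The function returns the entries it actually added, so an empty list is a quick hint that the external folder has not been populated yet.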

Data preparation

python build_batches.py -d Gref -t train 
python build_batches.py -d Gref -t val 
python build_batches.py -d unc -t train 
python build_batches.py -d unc -t val 
python build_batches.py -d unc -t testA 
python build_batches.py -d unc -t testB 
python build_batches.py -d unc+ -t train 
python build_batches.py -d unc+ -t val 
python build_batches.py -d unc+ -t testA 
python build_batches.py -d unc+ -t testB
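The ten commands above cover every dataset and split pair, so they can also be generated in a loop. The following is a minimal sketch that only reuses the -d and -t flags shown above; the names SPLITS, build_commands, and run_all are illustrative and do not exist in the repository.

```python
import subprocess
import sys

# Dataset -> splits, copied from the command list above.
SPLITS = {
    "Gref": ["train", "val"],
    "unc": ["train", "val", "testA", "testB"],
    "unc+": ["train", "val", "testA", "testB"],
}


def build_commands(python=sys.executable):
    """Return one build_batches.py invocation per (dataset, split) pair."""
    return [
        [python, "build_batches.py", "-d", dataset, "-t", split]
        for dataset, splits in SPLITS.items()
        for split in splits
    ]


def run_all(dry_run=True):
    """Print every command; execute it too when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing split.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Running the script as-is only prints the commands. Call run_all(dry_run=False) from the repository root to execute them in order.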

Final ./data directory structure

./data              
├─ refcoco   
│   ├─ Gref
│   │   ├─ train_batch
│   │   │   ├─ Gref_train_0.npz
│   │   │   ├─ Gref_train_1.npz
│   │   │   └─ ...
│   │   ├─ train_image
│   │   ├─ train_label 
│   │   ├─ val_batch
│   │   ├─ val_image
│   │   └─ val_label
│   ├─ unc
│   │   └─ ...
│   └─ unc+
│       └─ ...
├─ phrasecut
│   └─ images
│       ├─ refer_train_ris.json
│       ├─ refer_val_ris.json
│       └─ refer_test_ris.json
├─ Gref_emb.npy
├─ referit_emb.npy
├─ vocabulary_Gref.txt
└─ vocabulary_referit.txt
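Before training, it can help to confirm that the data directory matches the tree above. The following is a minimal sketch, assuming ./data as the root. The list EXPECTED and the function missing_entries are hypothetical names, and the check covers only the entries spelled out in the tree, not the ones elided with "...".

```python
from pathlib import Path

# Relative paths taken from the tree above. Contents elided there
# with "..." (inside unc and unc+) are deliberately not checked.
EXPECTED = [
    "refcoco/Gref/train_batch",
    "refcoco/Gref/train_image",
    "refcoco/Gref/train_label",
    "refcoco/Gref/val_batch",
    "refcoco/Gref/val_image",
    "refcoco/Gref/val_label",
    "refcoco/unc",
    "refcoco/unc+",
    "phrasecut/images",
    "Gref_emb.npy",
    "referit_emb.npy",
    "vocabulary_Gref.txt",
    "vocabulary_referit.txt",
]


def missing_entries(data_root="./data"):
    """Return the expected files or folders that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  ", rel)
    else:
        print("Data directory matches the expected layout.")
```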

Environment setup

  • Python 3.10.9
  • PyTorch 1.13.1+cu117

Instructions:

conda create -n sag python=3.10 -y
conda activate sag
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu11
pip install einops tqdm wandb transformers
pip install matplotlib timm opencv-python
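After installation, the resolved package versions can be compared against the ones listed above (Python 3.10.9, PyTorch 1.13.1+cu117). The following is a minimal sketch using the standard-library importlib.metadata module; the package list mirrors the pip commands above, and the function name installed_versions is hypothetical.

```python
import sys
from importlib import metadata

# Distribution names taken from the pip commands above.
PACKAGES = [
    "torch", "torchvision", "torchaudio",
    "einops", "tqdm", "wandb", "transformers",
    "matplotlib", "timm", "opencv-python",
]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version if version is not None else "NOT INSTALLED")
```

Because the check reads package metadata instead of importing the packages, it runs quickly and does not need a GPU.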

Train & eval

sh ./train_eval_gref.sh # Gref
sh ./train_eval_unc.sh # UNC
sh ./train_eval_unc+.sh # UNC+

Acknowledgement

Parts of our code are adapted from other repositories.

The dataset setup instructions are from the TF-phrasecut-public repository.
