2021-neurips-ncr's Introduction

PyTorch implementation for Learning with Noisy Correspondence for Cross-modal Matching (NeurIPS 2021 Oral).

Introduction

NCR framework

Requirements

  • Python 3.7
  • PyTorch ~1.7.1
  • numpy
  • scikit-learn
  • Punkt Sentence Tokenizer:
import nltk
nltk.download('punkt')  # fetch the Punkt tokenizer directly, without the interactive downloader prompt

Datasets

MS-COCO and Flickr30K

We follow SCAN to obtain image features and vocabularies.

CC152K

We use a subset of Conceptual Captions (CC), named CC152K. CC152K contains 150,000 training samples from the CC training split, plus 1,000 validation samples and 1,000 test samples from the CC validation split. We follow the pre-processing steps in SCAN to obtain the image features and vocabularies.

Download Dataset

Training and Evaluation

Training new models from scratch

Modify the data_path and vocab_path, then train and evaluate the model(s):


# CC152K
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 10 --data_name cc152k_precomp --data_path data_path --vocab_path vocab_path

# MS-COCO: noise_ratio = {0, 0.2, 0.5}
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 10 --data_name coco_precomp --num_epochs 20 --lr_update 10 --noise_ratio 0.2 --data_path data_path --vocab_path vocab_path

# Flickr30K: noise_ratio = {0, 0.2, 0.5}
python ./NCR/run.py --gpu 0 --workers 2 --warmup_epoch 5 --data_name f30k_precomp --noise_ratio 0.2 --data_path data_path --vocab_path vocab_path

Each command trains the model from scratch and then evaluates the best model.
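The `--noise_ratio` values above correspond to the fraction of training pairs with a corrupted image-text correspondence. As a rough illustration only, the sketch below assumes that synthetic noise is created by shuffling the images assigned to a random subset of captions. The helper name `inject_noise` and this exact scheme are assumptions made for the example, not code taken from the repository.

```python
import random

def inject_noise(num_pairs, noise_ratio, seed=0):
    """Return, for each caption i, the index of the image paired with it.

    Hypothetical helper: every caption starts paired with its own image,
    then a random `noise_ratio` fraction of positions have their image
    indices shuffled among themselves.
    """
    rng = random.Random(seed)
    image_idx = list(range(num_pairs))
    num_noisy = int(noise_ratio * num_pairs)
    # Choose which pairs to corrupt.
    noisy_pos = rng.sample(range(num_pairs), num_noisy)
    # Shuffle the image indices among the chosen positions only.
    shuffled = [image_idx[p] for p in noisy_pos]
    rng.shuffle(shuffled)
    for pos, img in zip(noisy_pos, shuffled):
        image_idx[pos] = img
    return image_idx

idx = inject_noise(num_pairs=1000, noise_ratio=0.2)
mismatched = sum(1 for i, img in enumerate(idx) if i != img)
# A shuffle can leave a few items in place, so this is at most 200.
print(f"{mismatched} of 1000 pairs are mismatched")
```

With `noise_ratio=0` the pairing is left untouched, which matches the 0% noise setting listed above.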

Pre-trained models and evaluation

The pre-trained models are available here:

  1. CC152K model Download
  2. MS-COCO 0% noise model Download
  3. MS-COCO 20% noise model Download
  4. MS-COCO 50% noise model Download
  5. F30K 0% noise model Download
  6. F30K 20% noise model Download
  7. F30K 50% noise model Download

Modify the model_path, data_path, and vocab_path in the evaluation.py file, then run evaluation.py:

python ./NCR/evaluation.py

For MS-COCO, set split to testall, and set fold5 to false (5K evaluation) or true (five-fold 1K evaluation).
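To make the difference between the two MS-COCO protocols concrete, here is a minimal sketch of Recall@K computed on the full test set versus averaged over five equal folds. It assumes one caption per image (the real test set has several captions per image) and that the five-fold score is the mean over folds. The functions `recall_at_k` and `evaluate` are hypothetical and are not the functions in evaluation.py.

```python
def recall_at_k(sims, k):
    """Image-to-text Recall@K, assuming caption i is the match for image i."""
    hits = 0
    for i, row in enumerate(sims):
        # Rank caption indices by descending similarity to image i.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(sims)

def evaluate(sims, k, fold5):
    """Score the full matrix, or average the score over five diagonal folds."""
    if not fold5:
        return recall_at_k(sims, k)
    fold = len(sims) // 5
    scores = []
    for f in range(5):
        lo, hi = f * fold, (f + 1) * fold
        # Keep only the images and captions that belong to this fold.
        block = [row[lo:hi] for row in sims[lo:hi]]
        scores.append(recall_at_k(block, k))
    return sum(scores) / 5

# Toy similarity matrix: 10 images, each most similar to its own caption,
# except image 0, which is confused with caption 9 from another fold.
sims = [[1.0 if i == j else 0.0 for j in range(10)] for i in range(10)]
sims[0][9] = 2.0

print(evaluate(sims, k=1, fold5=False))  # 90.0
print(evaluate(sims, k=1, fold5=True))   # 100.0
```

In this toy case the fold-wise score is higher because each image competes against fewer candidate captions, which is why scores from the two protocols should not be compared directly.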

Experiment Results:

Citation

If NCR is useful for your research, please cite the following paper:

@article{huang2021learning,
  title={Learning with Noisy Correspondence for Cross-modal Matching},
  author={Huang, Zhenyu and Niu, Guocheng and Liu, Xiao and Ding, Wenbiao and Xiao, Xinyan and Wu, Hua and Peng, Xi},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

License

Apache License 2.0

Acknowledgements

The code is based on SGRAF and SCAN licensed under Apache 2.0.
