2022/01/05: Download links for the pre-trained models have been updated. Sorry for the wait.

Context Embeddings for Cross-Modal Retrieval

PyTorch code for the cross-modal retrieval part of our ICMR 2019 paper Context-Aware Embeddings for Automatic Art Analysis. For the classification part, see this other repository.

Setup

  1. Download the SemArt dataset from here.

  2. Clone the repository:

    git clone https://github.com/noagarcia/context-art-retrieval.git

  3. Install dependencies:

    • Python 2.7
    • pytorch (conda install pytorch=0.4.1 cuda90 -c pytorch)
    • torchvision (conda install torchvision)
    • visdom (check tutorial here)
    • pandas (conda install -c anaconda pandas)
    • nltk (conda install -c anaconda nltk)
    • sklearn (conda install scikit-learn)
  4. Download our pre-trained context-aware models, obtained with the classification code, and save them in the Models/ folder:

Train

  • To train the cross-modal retrieval model with MTL context embeddings, run:

    python main.py --mode train --model mtl --dir_dataset $semart

  • To train the cross-modal retrieval model with KGM context embeddings, run:

    python main.py --mode train --model kgm --att $attribute --dir_dataset $semart

Here $semart is the path to the SemArt dataset and $attribute is the classifier type (type, school, time, or author).

Test

  • To test the cross-modal retrieval model with MTL context embeddings, run:

    python main.py --mode test --model mtl --dir_dataset $semart

  • To test the cross-modal retrieval model with KGM context embeddings, run:

    python main.py --mode test --model kgm --att $attribute --dir_dataset $semart --model_path $model-file

Here $semart is the path to the SemArt dataset, $attribute is the classifier type (type, school, time, or author), and $model-file is the path to the trained model.
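To run every configuration without retyping the commands, the invocations above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper `build_command`, the `ATTRIBUTES` tuple, and the `/data/SemArt` path are hypothetical, and it assumes the literal values accepted by `--att` are exactly the four attribute names listed above. It only prints the commands; pass each list to `subprocess.run` if you want to execute them.

```python
import shlex

# Assumption: these are the literal values accepted by --att.
ATTRIBUTES = ("type", "school", "time", "author")


def build_command(mode, model, semart_dir, attribute=None, model_path=None):
    """Build the argv list for one main.py run (hypothetical helper)."""
    if model == "kgm" and attribute is None:
        raise ValueError("KGM runs need an attribute for --att")
    cmd = ["python", "main.py", "--mode", mode, "--model", model]
    if model == "kgm":
        cmd += ["--att", attribute]
    cmd += ["--dir_dataset", semart_dir]
    if model_path is not None:
        # Only added when supplied, mirroring the KGM test command above.
        cmd += ["--model_path", model_path]
    return cmd


# One MTL training run plus one KGM training run per attribute.
commands = [build_command("train", "mtl", "/data/SemArt")]
commands += [
    build_command("train", "kgm", "/data/SemArt", attribute=a) for a in ATTRIBUTES
]

for cmd in commands:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the arguments as a list rather than a single string avoids quoting problems when the dataset path contains spaces.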

You can download our pre-trained cross-modal retrieval models with context embeddings from:

Results

Text-to-Image retrieval results on SemArt dataset:

Model         R@1     R@5     R@10    MedR
CML           0.164   0.384   0.505   10
MTL Type      0.145   0.358   0.474   12
MTL School    0.196   0.428   0.536   8
MTL TF        0.171   0.394   0.525   9
MTL Author    0.232   0.452   0.567   7
KGM Type      0.152   0.367   0.506   10
KGM School    0.162   0.371   0.483   12
KGM TF        0.175   0.399   0.506   10
KGM Author    0.247   0.477   0.581   6
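
For readers unfamiliar with the column headers: R@K is the fraction of queries whose correct item appears in the top K results, and MedR is the median rank of the correct item. The sketch below shows one common way to compute both from a text-to-image similarity matrix. It is an illustration, not the repository's evaluation code: the function names are hypothetical, it assumes query i's correct image is image i, and ties are resolved optimistically.

```python
from statistics import median


def rank_of_match(scores, target_index):
    """1-based rank of the target when candidates are sorted by descending score."""
    target = scores[target_index]
    # Count candidates that score strictly higher than the true match.
    return 1 + sum(1 for s in scores if s > target)


def retrieval_metrics(similarity, ks=(1, 5, 10)):
    """Return ({k: recall@k}, median rank).

    similarity[i][j] is the score between text query i and image j.
    Assumption: the correct image for query i is image i.
    """
    ranks = [rank_of_match(row, i) for i, row in enumerate(similarity)]
    recalls = {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}
    return recalls, median(ranks)


# Toy example with three queries and three images.
toy = [
    [0.9, 0.1, 0.2],  # correct image ranked 1st
    [0.8, 0.3, 0.5],  # correct image ranked 3rd
    [0.2, 0.7, 0.6],  # correct image ranked 2nd
]
recalls, med_rank = retrieval_metrics(toy, ks=(1, 2))
print(recalls, med_rank)
```

Higher R@K and lower MedR both indicate better retrieval.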

Citation

@InProceedings{Garcia2017Context,
   author    = {Noa Garcia and Benjamin Renoust and Yuta Nakashima},
   title     = {Context-Aware Embeddings for Automatic Art Analysis},
   booktitle = {Proceedings of the ACM International Conference on Multimedia Retrieval},
   year      = {2019},
}
