mac-network-pytorch

Memory, Attention and Composition (MAC) Network for CLEVR/GQA, from Compositional Attention Networks for Machine Reasoning (https://arxiv.org/abs/1803.03067), implemented in PyTorch.

Requirements:

  • Python 3.6
  • PyTorch 1.0.1
  • torchvision
  • Pillow
  • nltk
  • tqdm
  • block.bootstrap.pytorch
  • murel.bootstrap.pytorch
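
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch and not part of this repository: `missing_modules` is a hypothetical helper, and the import names (`PIL` for Pillow, `block` and `murel` for the two bootstrap packages) are assumptions about how each package is exposed once installed.

```python
import importlib.util

# Import names assumed for the requirements listed above; adjust if they differ.
REQUIRED_MODULES = ["torch", "torchvision", "PIL", "nltk", "tqdm", "block", "murel"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that each module can be located; it does not verify versions such as Python 3.6 or PyTorch 1.0.1.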

To train:

  1. Download and extract either
    CLEVR v1.0 dataset from http://cs.stanford.edu/people/jcjohns/clevr/ or
    GQA dataset from https://cs.stanford.edu/people/dorarad/gqa/download.html

For GQA

cd data
mkdir gqa && cd gqa
wget https://nlp.stanford.edu/data/gqa/data1.2.zip
unzip data1.2.zip

mkdir questions
mv balanced_train_data.json questions/gqa_train_questions.json
mv balanced_val_data.json questions/gqa_val_questions.json
mv balanced_testdev_data.json questions/gqa_testdev_questions.json
cd ..

wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip
wget http://nlp.stanford.edu/data/gqa/objectFeatures.zip
unzip objectFeatures.zip
cd ..
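
After the GQA download commands, a quick check that the renamed question files ended up where the later steps expect them can save a failed preprocessing run. This is a minimal sketch under stated assumptions: `missing_question_files` is a hypothetical helper, the file names are taken from the `mv` commands above, and the script is assumed to be run from the repository root so that the GQA directory is `data/gqa`.

```python
from pathlib import Path

# File names taken from the mv commands above.
EXPECTED_QUESTION_FILES = [
    "gqa_train_questions.json",
    "gqa_val_questions.json",
    "gqa_testdev_questions.json",
]


def missing_question_files(gqa_dir):
    """List expected question files that are absent from <gqa_dir>/questions."""
    questions_dir = Path(gqa_dir) / "questions"
    return [name for name in EXPECTED_QUESTION_FILES
            if not (questions_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    missing = missing_question_files("data/gqa")
    print("Missing:", missing if missing else "none")
```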

  2. Preprocess the question data and extract image features using ResNet-101 (feature extraction is not required for GQA)
    For CLEVR
    a. Extract image features
python image_feature.py data/CLEVR_v1.0

b. Preprocess questions

python preprocess.py CLEVR data/CLEVR_v1.0

For GQA
a. Merge object features (this may take some time)

python merge.py --name objects
mv data/gqa_objects.hdf5 data/gqa_features.hdf5

b. Preprocess questions

python preprocess.py gqa data/gqa

CAUTION: The file created by image_feature.py is very large. You may use HDF5 compression, but it will slow down feature extraction.
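
To get a feel for the size, here is a rough back-of-the-envelope estimate of the uncompressed feature file. The per-image feature shape of 1024 x 14 x 14 float32 values is an assumption (it is not stated in this README and depends on where image_feature.py truncates ResNet-101 and how it resizes inputs); 70,000 is the number of images in the CLEVR v1.0 training split. `estimate_feature_bytes` is a hypothetical helper.

```python
def estimate_feature_bytes(num_images, feature_shape=(1024, 14, 14), bytes_per_value=4):
    """Uncompressed size in bytes of a dense feature array, one entry per image."""
    values_per_image = 1
    for dim in feature_shape:
        values_per_image *= dim
    return num_images * values_per_image * bytes_per_value


if __name__ == "__main__":
    # CLEVR v1.0 training split; feature shape is the assumed default above.
    total = estimate_feature_bytes(70_000)
    print(f"{total / 2**30:.1f} GiB")
```

Under these assumptions the training features alone come to roughly 52 GiB, so check the available disk space before running the extraction.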

  3. Run train.py with the dataset type as its argument (gqa or CLEVR)
python train.py gqa

On CLEVR, this implementation reaches 95.75% accuracy at epoch 10 and 96.5% accuracy at epoch 20.

Parts of the code are borrowed from https://github.com/rosinality/mac-network-pytorch and
https://github.com/stanfordnlp/mac-network.
