
inc-few-shot-attractor-public

This repository contains code for the following paper: Incremental Few-Shot Learning with Attention Attractor Networks. Mengye Ren, Renjie Liao, Ethan Fetaya, Richard S. Zemel. NeurIPS 2019. [arxiv]

Dependencies

  • cv2
  • numpy
  • pandas
  • python 2.7 / 3.5+
  • tensorflow 1.11
  • tqdm

Our code is tested on Ubuntu 14.04 and 16.04.

Setup

First, designate a folder to be your data root:

export DATA_ROOT={DATA_ROOT}

Then, set up the datasets following the instructions in the subsections.

miniImageNet

[Google Drive] (5GB)

# Download and place "mini-imagenet.tar" in "$DATA_ROOT/mini-imagenet".
mkdir -p $DATA_ROOT/mini-imagenet
cd $DATA_ROOT/mini-imagenet
mv ~/Downloads/mini-imagenet.tar .
tar -xvf mini-imagenet.tar
rm -f mini-imagenet.tar

tieredImageNet

[Google Drive] (15GB)

# Download and place "tiered-imagenet.tar" in "$DATA_ROOT/tiered-imagenet".
mkdir -p $DATA_ROOT/tiered-imagenet
cd $DATA_ROOT/tiered-imagenet
mv ~/Downloads/tiered-imagenet.tar .
tar -xvf tiered-imagenet.tar
rm -f tiered-imagenet.tar

Note: Please make sure that the following hardware requirements are met before running tieredImageNet experiments.

  • Disk: 30 GB
  • RAM: 32 GB
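As a quick pre-flight check before a tieredImageNet run, the sketch below prints the available disk space under $DATA_ROOT and the total RAM, and warns if either is below the stated requirement. It assumes a Linux machine with GNU coreutils (as on the tested Ubuntu versions); DATA_ROOT falls back to the current directory if unset.

```shell
# Pre-flight check (illustrative; assumes GNU df and /proc/meminfo).
DATA_ROOT=${DATA_ROOT:-.}
avail_gb=$(df -BG --output=avail "$DATA_ROOT" | tail -n 1 | tr -dc '0-9')
ram_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
echo "disk available under $DATA_ROOT: ${avail_gb} GB"
echo "total RAM: ${ram_gb} GB"
[ "$avail_gb" -ge 30 ] || echo "warning: tieredImageNet needs ~30 GB of disk"
[ "$ram_gb" -ge 32 ] || echo "warning: tieredImageNet experiments expect ~32 GB of RAM"
```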

Config files

Run make to compile the protobuf files:

git clone https://github.com/renmengye/inc-few-shot-attractor.git
cd inc-few-shot-attractor
make
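If make fails with "protoc: not found", the protobuf compiler is missing from your system. A minimal check is sketched below; the apt package name is an assumption for Ubuntu, matching the tested platforms.

```shell
# Check for the protobuf compiler that the Makefile invokes.
if command -v protoc >/dev/null 2>&1; then
  protoc --version
else
  # On Ubuntu, protoc is typically provided by the "protobuf-compiler" package.
  echo "protoc not found; try: sudo apt-get install protobuf-compiler"
fi
```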

Core Experiments

Pretraining

./run.sh {GPUID} python run_exp.py --config {CONFIG_FILE}     \
                  --dataset {DATASET}                         \
                  --data_folder {DATASET_FOLDER}              \
                  --results {SAVE_FOLDER}                     \
                  --tag {EXPERIMENT_NAME}
  • Possible DATASET options are mini-imagenet, tiered-imagenet.
  • Possible CONFIG options are any prototxt file in the ./configs/pretrain folder.
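For instance, a pretraining invocation might look like the following. The GPU id, config filename, and folder paths are illustrative placeholders, not names shipped with the repository:

```shell
# Example pretraining run (config name and paths are illustrative):
./run.sh 0 python run_exp.py \
    --config configs/pretrain/mini-imagenet-resnet.prototxt \
    --dataset mini-imagenet \
    --data_folder $DATA_ROOT/mini-imagenet \
    --results ./results \
    --tag pretrain-mini
```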

Meta-learning

./run.sh {GPUID} python run_exp.py --config {CONFIG_FILE}     \
                  --dataset {DATASET}                         \
                  --data_folder {DATASET_FOLDER}              \
                  --pretrain {PRETRAIN_CKPT_FOLDER}           \
                  --nshot {NUMBER_OF_SHOTS}                   \
                  --nclasses_b {NUMBER_OF_FEWSHOT_WAYS}       \
                  --results {SAVE_FOLDER}                     \
                  --tag {EXPERIMENT_NAME}                     \
                  [--eval]                                    \
                  [--retest]
  • Possible DATASET options are mini-imagenet, tiered-imagenet.
  • Possible CONFIG options are any prototxt file in the ./configs/attractors folder, e.g. *-{mlp|lr}-attn-s{1|5}.prototxt denotes a 1- or 5-shot model using an MLP or logistic regression (LR) as the fast-weights model.
  • You need to pass in PRETRAIN_CKPT_FOLDER option with the pretrained model.
  • Add the --retest flag to restore a fully trained model and re-run evaluation.
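Putting the flags together, a 1-shot meta-learning run might look like the following. The config name and folder paths are illustrative placeholders; the pretrain folder should point at a checkpoint produced in the previous step:

```shell
# Example 1-shot meta-learning run (config name and paths are illustrative):
./run.sh 0 python run_exp.py \
    --config configs/attractors/mini-imagenet-lr-attn-s1.prototxt \
    --dataset mini-imagenet \
    --data_folder $DATA_ROOT/mini-imagenet \
    --pretrain ./results/pretrain-mini \
    --nshot 1 \
    --nclasses_b 5 \
    --results ./results \
    --tag attn-lr-s1
```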

Baselines

  • Baseline configs are in ./configs/lwof and ./configs/imprint.
  • For the ProtoNet baseline, run run_proto_exp.py with the same flags as in the previous section.
  • Configs for ablation studies can be found in ./configs/ablation.

Citation

If you use our code, please consider citing the following:

  • Mengye Ren, Renjie Liao, Ethan Fetaya and Richard S. Zemel. Incremental Few-Shot Learning with Attention Attractor Networks. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
@inproceedings{ren19incfewshot,
  author   = {Mengye Ren and
              Renjie Liao and
              Ethan Fetaya and
              Richard S. Zemel},
  title    = {Incremental Few-Shot Learning with Attention Attractor Networks},
  booktitle= {Advances in Neural Information Processing Systems (NeurIPS)},
  year     = {2019},
}


inc-few-shot-attractor-public's Issues

Question about tiered_imagenet_split csv files name

Hello,

I am a bit confused about the naming rules for these csv files.

Here are some questions:

  1. What is the difference between train_a_phase_train and train_phase_train? Does it refer to tasks a and b?

  2. train_aa and train_a?

  3. trainval and val?

  4. train and the rest of the train_ files?

Thank you and have a great day!

KeyError: 'catname2label'

Thanks for sharing the data and code. However, when I download your code and run it on my own computer, an error occurs:

File "/inc-few-shot-attractor-public-master/fewshot/data/mini_imagenet.py", line 153, in _read_cache
    dic = datafile['catname2label']
KeyError: 'catname2label'

After loading a pkl file such as mini-imagenet-cache-train.pkl, I find that it only contains two keys, 'image_data' and 'class_dict', without the key 'catname2label'. Did I miss something?
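A quick way to see which fields a cache file actually contains is to list its top-level keys. The sketch below is a diagnostic helper, not part of the repository; the file path is an example, and encoding="latin1" is assumed so Python 3 can read pickles written under Python 2.

```python
import pickle

def cache_keys(path):
    """Return the sorted top-level keys of a dataset cache pickle."""
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 read pickles written by Python 2.
        data = pickle.load(f, encoding="latin1")
    return sorted(data.keys())

# Example (path is illustrative):
# print(cache_keys("mini-imagenet-cache-train.pkl"))
```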

Some questions about ResNet.

Really nice work! In our own experiments we tried to use a ResNet backbone, just as you did. However, it did not improve performance as we expected and even performed worse than the 4-layer ConvNet. I wonder if there are any tricks involved when training the model with a ResNet. Thank you very much.

Questions about the mini-Imagenet experiment

When I run the pretraining experiment on mini-ImageNet, I get a KeyError at line 97 of the "/fewshot/data/mini_imagenet.py" file: CSV_FILE does not have a key named 'train_phase_train'. Is this a bug in the code or a mistake in my setup?

Details about results in the paper

Hi, thanks for your work; it is a very interesting paper.
I would like to know more details about Table 2 in your paper. What does Acc in the table mean: top-1 or top-5 accuracy?
In addition, I am also very curious about the Train-A-Val top-1 accuracy of your ResNet-18 model pretrained on the Train-A-Train split.

Thanks in advance.

Query regarding implementation

Hi, Thank you for this inspiring work.
I would like to know exactly how to reproduce the results for the baseline experiments; could you please provide the command to be used? Could you also clarify what accuracy is reported in Table 2 of the paper: what do New, New2, Old, and Old2 stand for, and how is the accuracy in Table 2 computed from these values?

Thank you in advance.

makefile error

I tried to run make to build the protobuf files, but got an error:

protoc fewshot/configs/*.proto --python_out .
/bin/sh: 1: protoc: not found
Makefile:3: recipe for target 'proto' failed
make: *** [proto] Error 127

Process of pretrain

Hi, thanks for providing such an awesome project!

For the pretraining part, you mention in your paper that it is a regular supervised classification task that learns the parameters of the backbone, which are fixed afterwards.

But during pretraining, do you train the whole network and then fix the backbone parameters, or do you train only the backbone?

Thanks in advance for your response!
