
Dynamic Few-Shot Visual Learning without Forgetting

Introduction

This project page provides PyTorch code that implements the following CVPR 2018 paper:
Title: "Dynamic Few-Shot Visual Learning without Forgetting"
Authors: Spyros Gidaris, Nikos Komodakis
Institution: Universite Paris Est, Ecole des Ponts ParisTech
Code: https://github.com/gidariss/FewShotWithoutForgetting
Arxiv: https://arxiv.org/abs/1804.09458

Abstract:
The human visual system has the remarkable ability to effortlessly learn novel concepts from only a few examples. Mimicking the same behavior in machine learning vision systems is an interesting and very challenging research problem with many practical advantages for real-world vision applications. In this context, the goal of our work is to devise a few-shot visual learning system that, during test time, can efficiently learn novel categories from only a few training examples while at the same time not forgetting the initial categories on which it was trained (here called base categories). To achieve that goal we propose (a) to extend an object recognition system with an attention-based few-shot classification weight generator, and (b) to redesign the classifier of a ConvNet model as the cosine similarity function between feature representations and classification weight vectors. The latter, apart from unifying the recognition of both novel and base categories, also leads to feature representations that generalize better on unseen categories. We extensively evaluate our approach on Mini-ImageNet, where we improve the prior state of the art on few-shot recognition (i.e., we achieve 56.20% and 73.00% on the 1-shot and 5-shot settings respectively) while at the same time not sacrificing any accuracy on the base categories, a characteristic that most prior approaches lack. Finally, we apply our approach on the recently introduced low-shot benchmark of Hariharan and Girshick, where we also achieve state-of-the-art results.
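To make the cosine-similarity classifier concrete, here is a minimal PyTorch sketch (illustrative only, not the repository's actual module): both the feature vector and each class's weight vector are L2-normalized before the dot product, and a learnable scalar sharpens the resulting scores.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    # Minimal sketch of a cosine-similarity based classifier.
    def __init__(self, num_features, num_classes, init_scale=10.0):
        super(CosineClassifier, self).__init__()
        # One weight vector per class, as in a linear classifier without bias.
        self.weight = nn.Parameter(torch.randn(num_classes, num_features) * 0.01)
        # Learnable scalar that sharpens the cosine scores (which lie in [-1, 1]).
        self.scale = nn.Parameter(torch.tensor(init_scale))

    def forward(self, features):
        # Cosine similarity = dot product of L2-normalized vectors.
        features = F.normalize(features, p=2, dim=1)
        weight = F.normalize(self.weight, p=2, dim=1)
        return self.scale * features.matmul(weight.t())

# Usage: logits for a batch of 4 feature vectors over 64 base classes.
logits = CosineClassifier(num_features=128, num_classes=64)(torch.randn(4, 128))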

Citing FewShotWithoutForgetting

If you find the code useful in your research, please consider citing our CVPR2018 paper:

@inproceedings{gidaris2018dynamic,
  title={Dynamic Few-Shot Visual Learning without Forgetting},
  author={Gidaris, Spyros and Komodakis, Nikos},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={4367--4375},
  year={2018}
}

Requirements

The code was developed and tested with PyTorch version 0.2.0_4.

License

This code is released under the MIT License (refer to the LICENSE file for details).

Running experiments on Mini-ImageNet

First, you must download the MiniImagenet dataset from here and set the path where the dataset resides on your machine in dataloader.py. We recommend creating a datasets directory (mkdir datasets) and placing the downloaded dataset there.
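For example, if the dataset ends up under ./datasets, the relevant setting in dataloader.py would look something like the following (the variable name here is an assumption; adapt it to the actual one in the file):

# In dataloader.py (hypothetical variable name):
_MINI_IMAGENET_DATASET_DIR = './datasets/MiniImagenet'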

Training and evaluating our model on Mini-ImageNet

(1) To run the 1st training stage of our approach (which trains a recognition model with a cosine-similarity-based classifier and a feature extractor with 128 feature channels on its last convolutional layer), run the following command:

CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128CosineClassifier

The above command launches the training routine using the configuration file ./config/miniImageNet_Conv128CosineClassifier.py, specified by the --config argument. Note that all experiment configuration files are placed in the ./config directory.
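A minimal sketch of how such a name-to-configuration mapping can be implemented (an assumption about the mechanism, not the repository's exact loading code): the --config value names a Python module under ./config whose config dictionary is imported at startup.

import argparse
import importlib

parser = argparse.ArgumentParser()
parser.add_argument('--config', type=str, required=True,
                    help='name of a config file (without .py) inside ./config')
args = parser.parse_args()

# Import ./config/<name>.py as a module and read its `config` dictionary.
config = importlib.import_module('config.' + args.config).config
print('Loaded experiment configuration:', args.config)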

(2) To run the 2nd training stage of our approach (which trains the few-shot classification weight generator with attention-based weight inference), run the following commands (a conceptual sketch follows the commands):

# Training the model for the 1-shot case on the training set of MiniImagenet.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128CosineClassifierGenWeightAttN1
# Training the model for the 5-shot case on the training set of MiniImagenet.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128CosineClassifierGenWeightAttN5 
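For intuition, the weight generator builds the classification weight vector of a novel class from two terms: the average of the (normalized) support features, and an attention-weighted mixture of the base-class weight vectors. Below is a condensed sketch of that idea (simplified from the paper's formulation; all names and initialization details here are assumptions, not the repository code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionWeightGenerator(nn.Module):
    # Sketch of w_novel = phi_avg * z_avg + phi_att * w_att (simplified).
    def __init__(self, num_features, num_base):
        super(AttentionWeightGenerator, self).__init__()
        self.phi_avg = nn.Parameter(torch.ones(num_features))   # gates the feature-average term
        self.phi_att = nn.Parameter(torch.zeros(num_features))  # gates the attention term
        self.query = nn.Linear(num_features, num_features)      # query transform (phi_q)
        self.keys = nn.Parameter(torch.randn(num_base, num_features) * 0.01)

    def forward(self, support_features, base_weights):
        # support_features: (n_shot, num_features) for one novel class.
        # base_weights: (num_base, num_features) base classification weights.
        z = F.normalize(support_features, p=2, dim=1)
        z_avg = z.mean(dim=0)
        # Cosine-style attention of each support example over the base keys.
        q = F.normalize(self.query(z), p=2, dim=1)
        k = F.normalize(self.keys, p=2, dim=1)
        att = F.softmax(q.matmul(k.t()), dim=1)           # (n_shot, num_base)
        w_att = att.matmul(base_weights).mean(dim=0)      # mixture of base weights
        return self.phi_avg * z_avg + self.phi_att * w_att

# Usage: a weight vector for one novel class from 5 support examples.
generator = AttentionWeightGenerator(num_features=128, num_base=64)
w_novel = generator(torch.randn(5, 128), torch.randn(64, 128))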

(3) To evaluate the above models, run the following commands:

# Evaluating the model for the 1-shot case on the test set of MiniImagenet.
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128CosineClassifierGenWeightAttN1 --testset
# Evaluating the model for the 5-shot case on the test set of MiniImagenet.
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128CosineClassifierGenWeightAttN5 --testset

(4) To train and evaluate our approach with different types of feature extractors (e.g., Conv32, Conv64, or ResNetLike; see our paper for a description of those feature extractors), run the following commands:

#************************** Feature extractor: Conv32 *****************************
# 1st training stage that trains the cosine-based recognition model.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv32CosineClassifier
# 2nd training stage that trains the few-shot weight generator for the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv32CosineClassifierGenWeightAttN1 
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv32CosineClassifierGenWeightAttN5
# Evaluate the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv32CosineClassifierGenWeightAttN1 --testset
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv32CosineClassifierGenWeightAttN5 --testset

#************************** Feature extractor: Conv64 *****************************
# 1st training stage that trains the cosine-based recognition model.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv64CosineClassifier
# 2nd training stage that trains the few-shot weight generator for the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv64CosineClassifierGenWeightAttN1 
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv64CosineClassifierGenWeightAttN5
# Evaluate the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv64CosineClassifierGenWeightAttN1 --testset
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv64CosineClassifierGenWeightAttN5 --testset

#************************** Feature extractor: ResNetLike *****************************
# 1st training stage that trains the cosine-based recognition model.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_ResNetLikeCosine
# 2nd training stage that trains the few-shot weight generator for the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_ResNetLikeCosineClassifierGenWeightAttN1 
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_ResNetLikeCosineClassifierGenWeightAttN5
# Evaluate the 1-shot and 5-shot models.
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_ResNetLikeCosineClassifierGenWeightAttN1 --testset
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_ResNetLikeCosineClassifierGenWeightAttN5 --testset

Training and evaluating Matching Networks or Prototypical Networks on Mini-ImageNet

To train and evaluate our implementations of Matching Networks [3] and Prototypical Networks [4], run the following commands (a short sketch of the Prototypical Networks scoring rule follows the command list):

# Train and evaluate the matching networks model for the 1-shot case.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128MatchingNetworkN1
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128MatchingNetworkN1 --testset

# Train and evaluate the matching networks model for the 5-shot case.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128MatchingNetworkN5
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128MatchingNetworkN5 --testset

# Train and evaluate the prototypical networks model for the 1-shot case.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128PrototypicalNetworkN1
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128PrototypicalNetworkN1 --testset

# Train and evaluate the prototypical networks model for the 5-shot case.
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128PrototypicalNetworkN5
CUDA_VISIBLE_DEVICES=0 python evaluate.py --config=miniImageNet_Conv128PrototypicalNetworkN5 --testset
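As promised above, here is the core scoring rule of Prototypical Networks in a few lines (an illustrative sketch; the repository's implementation differs in its details): each class prototype is the mean of its support embeddings, and a query is scored by its negative squared Euclidean distance to each prototype.

import torch

def prototypical_logits(support, support_labels, query, num_classes):
    # support: (n_support, d); support_labels: (n_support,) with values in [0, num_classes);
    # query: (n_query, d). Returns (n_query, num_classes) logits.
    prototypes = torch.stack([support[support_labels == c].mean(dim=0)
                              for c in range(num_classes)])
    # Negative squared Euclidean distance as the classification score.
    return -torch.cdist(query, prototypes).pow(2)

# Usage: a 5-way 1-shot episode with 128-d features and 15 query images.
logits = prototypical_logits(torch.randn(5, 128), torch.arange(5),
                             torch.randn(15, 128), num_classes=5)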

Experimental results on the test set of Mini-ImageNet

Here we provide experimental results of our approach, as well as of our implementations of Matching Networks and Prototypical Networks, on the test set of Mini-ImageNet. Note that after cleaning and refactoring the implementation code of the paper and re-running the experiments, the results we obtained are slightly different.

1-shot 5-way classification accuracy of novel categories

| Approach | Feature extractor | Novel | Base | Both |
|---|---|---|---|---|
| Matching Networks [3] | Conv64 | 43.60% | - | - |
| Prototypical Networks [4] | Conv64 | 49.42% +/- 0.78 | - | - |
| Ravi and Larochelle [5] | Conv32 | 43.40% +/- 0.77 | - | - |
| Finn et al. [6] | Conv64 | 48.70% +/- 1.84 | - | - |
| Mishra et al. [7] | ResNet | 55.71% +/- 0.99 | - | - |
| Matching Networks (our implementation) | Conv64 | 53.65% +/- 0.80 | - | - |
| Matching Networks (our implementation) | Conv128 | 54.32% +/- 0.80 | - | - |
| Prototypical Networks (our implementation) | Conv64 | 53.30% +/- 0.79 | - | - |
| Prototypical Networks (our implementation) | Conv128 | 54.14% +/- 0.80 | - | - |
| Prototypical Networks (our implementation) | ResNet | 53.74% +/- 0.91 | - | - |
| (Ours) Cosine & Att. Weight Gen. | Conv32 | 54.49% +/- 0.83 | 61.59% | 44.79% |
| (Ours) Cosine & Att. Weight Gen. | Conv64 | 55.86% +/- 0.85 | 68.43% | 47.75% |
| (Ours) Cosine & Att. Weight Gen. | Conv128 | 56.62% +/- 0.84 | 70.80% | 49.44% |
| (Ours) Cosine & Att. Weight Gen. | ResNet | 56.21% +/- 0.83 | 79.79% | 52.81% |

5-shot 5-way classification accuracy of novel categories

| Approach | Feature extractor | Novel | Base | Both |
|---|---|---|---|---|
| Matching Networks [3] | Conv64 | 55.30% | - | - |
| Prototypical Networks [4] | Conv64 | 68.20% +/- 0.66 | - | - |
| Ravi and Larochelle [5] | Conv32 | 60.20% +/- 0.71 | - | - |
| Finn et al. [6] | Conv64 | 63.10% +/- 0.92 | - | - |
| Mishra et al. [7] | ResNet | 68.88% +/- 0.92 | - | - |
| Matching Networks (our implementation) | Conv64 | 65.76% +/- 0.68 | - | - |
| Matching Networks (our implementation) | Conv128 | 65.97% +/- 0.65 | - | - |
| Prototypical Networks (our implementation) | Conv64 | 70.33% +/- 0.65 | - | - |
| Prototypical Networks (our implementation) | Conv128 | 70.74% +/- 0.66 | - | - |
| (Ours) Cosine & Att. Weight Gen. | Conv32 | 70.12% +/- 0.67 | 60.83% | 53.22% |
| (Ours) Cosine & Att. Weight Gen. | Conv64 | 72.49% +/- 0.62 | 67.64% | 57.21% |
| (Ours) Cosine & Att. Weight Gen. | Conv128 | 72.82% +/- 0.63 | 71.00% | 59.05% |
| (Ours) Cosine & Att. Weight Gen. | ResNet | 70.64% +/- 0.66 | 79.56% | 59.48% |

Running experiments on the ImageNet-based low-shot benchmark

Here we provide instructions on how to train and evaluate our approach on the ImageNet-based low-shot benchmark proposed by Hariharan and Girshick [1].

(1) First, you must download the ImageNet dataset and set the path where the dataset resides on your machine in dataloader.py. We recommend creating a datasets directory (mkdir datasets) and placing the downloaded dataset there.

(2) Launch the 1st training stage of our approach by running the following command:

CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage1.py --config=imagenet_ResNet10CosineClassifier

The above command will train a recognition model with a ResNet10 feature extractor and a cosine-similarity-based classifier for 100 epochs (which takes around 120 hours). Alternatively, you can download our pre-trained recognition model from here; in that case you should place the model inside the ./experiments directory under the name ./experiments/imagenet_ResNet10CosineClassifier.

(3) Extract and save the ResNet10 features (with the model trained above) from the images of the ImageNet dataset (a short sketch of this step follows the commands):

# Extract features from the validation image split of the Imagenet.
CUDA_VISIBLE_DEVICES=0 python lowshot_save_features.py --config=imagenet_ResNet10CosineClassifier --split=val
# Extract features from the training image split of the Imagenet.
CUDA_VISIBLE_DEVICES=0 python lowshot_save_features.py --config=imagenet_ResNet10CosineClassifier --split=train
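Conceptually, this step just runs the frozen feature extractor over every image of a split and caches the resulting vectors on disk, so the 2nd stage can train on stored features instead of raw images. A minimal sketch of the idea (the file format, names, and dataloader here are assumptions, not what lowshot_save_features.py actually does):

import h5py
import torch

def save_features(feature_extractor, data_loader, output_file):
    # Cache extracted features and labels to an HDF5 file (illustrative sketch).
    feature_extractor.eval()
    all_features, all_labels = [], []
    with torch.no_grad():
        for images, labels in data_loader:
            all_features.append(feature_extractor(images))
            all_labels.append(labels)
    with h5py.File(output_file, 'w') as f:
        f.create_dataset('features', data=torch.cat(all_features).numpy())
        f.create_dataset('labels', data=torch.cat(all_labels).numpy())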

(4) Launch the 2nd training stage of our approach (which trains the few-shot classification weight generator with attention-based weight inference) by running the following commands:

# Train the model for the 1-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage2.py --config=imagenet_ResNet10CosineClassifierWeightAttN1
# Train the model for the 2-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage2.py --config=imagenet_ResNet10CosineClassifierWeightAttN2
# Train the model for the 5-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage2.py --config=imagenet_ResNet10CosineClassifierWeightAttN5
# Train the model for the 10-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage2.py --config=imagenet_ResNet10CosineClassifierWeightAttN10
# Train the model for the 20-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_train_stage2.py --config=imagenet_ResNet10CosineClassifierWeightAttN20

(5) Evaluate the above trained models by running the following commands:

# Evaluate the model for the 1-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_evaluate.py --config=imagenet_ResNet10CosineClassifierWeightAttN1 --testset
# Evaluate the model for the 2-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_evaluate.py --config=imagenet_ResNet10CosineClassifierWeightAttN2 --testset
# Evaluate the model for the 5-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_evaluate.py --config=imagenet_ResNet10CosineClassifierWeightAttN5 --testset
# Evaluate the model for the 10-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_evaluate.py --config=imagenet_ResNet10CosineClassifierWeightAttN10 --testset
# Evaluate the model for the 20-shot case.
CUDA_VISIBLE_DEVICES=0 python lowshot_evaluate.py --config=imagenet_ResNet10CosineClassifierWeightAttN20 --testset

Experimental results on the ImageNet-based low-shot benchmark

Here we evaluate our approach on the ImageNet-based low-shot benchmark proposed by Hariharan and Girshick [1], using the improved evaluation metrics proposed by Wang et al. [2]. All approaches use a ResNet10 feature extractor. Note that after cleaning and refactoring the implementation code of the paper and re-running the experiments, the results we obtained are slightly different. A pre-trained ResNet10 model with a cosine-similarity-based classifier is provided here: imagenet_ResNet10CosineClassifier

Top-5 1-shot classification accuracy

| Approach | Novel | All | All with prior |
|---|---|---|---|
| Prototypical Networks (from [2]) | 39.30% | 49.50% | 53.60% |
| Matching Networks (from [2]) | 43.60% | 54.40% | 54.50% |
| Logistic regression (from [2]) | 38.40% | 40.80% | 52.90% |
| Logistic regression w/ H (from [2]) | 40.70% | 52.20% | 53.20% |
| Prototype Matching Nets [2] | 43.30% | 55.80% | 54.70% |
| Prototype Matching Nets w/ H [2] | 45.80% | 57.60% | 56.40% |
| (Ours) Cosine & Att. Weight Gen. | 46.26% +/- 0.20 | 58.29% +/- 0.13 | 56.88% +/- 0.13 |

Top-5 2-shot classification accuracy

| Approach | Novel | All | All with prior |
|---|---|---|---|
| Prototypical Networks (from [2]) | 54.40% | 61.00% | 61.40% |
| Matching Networks (from [2]) | 54.00% | 61.00% | 60.70% |
| Logistic regression (from [2]) | 51.10% | 49.90% | 60.40% |
| Logistic regression w/ H (from [2]) | 50.80% | 59.40% | 59.10% |
| Prototype Matching Nets [2] | 55.70% | 63.10% | 62.00% |
| Prototype Matching Nets w/ H [2] | 57.80% | 64.70% | 63.30% |
| (Ours) Cosine & Att. Weight Gen. | 57.46% +/- 0.16 | 65.11% +/- 0.10 | 63.67% +/- 0.09 |

Top-5 5-shot classification accuracy

| Approach | Novel | All | All with prior |
|---|---|---|---|
| Prototypical Networks (from [2]) | 66.30% | 69.70% | 68.80% |
| Matching Networks (from [2]) | 66.00% | 69.00% | 68.20% |
| Logistic regression (from [2]) | 64.80% | 64.20% | 68.60% |
| Logistic regression w/ H (from [2]) | 62.00% | 67.60% | 66.80% |
| Prototype Matching Nets [2] | 68.40% | 71.10% | 70.20% |
| Prototype Matching Nets w/ H [2] | 69.00% | 71.90% | 70.60% |
| (Ours) Cosine & Att. Weight Gen. | 69.27% +/- 0.09 | 72.70% +/- 0.06 | 71.24% +/- 0.06 |

Top-5 10-shot classification accuracy

| Approach | Novel | All | All with prior |
|---|---|---|---|
| Prototypical Networks (from [2]) | 71.20% | 72.90% | 72.00% |
| Matching Networks (from [2]) | 72.50% | 73.70% | 72.60% |
| Logistic regression (from [2]) | 71.60% | 71.90% | 72.90% |
| Logistic regression w/ H (from [2]) | 69.30% | 72.80% | 71.70% |
| Prototype Matching Nets [2] | 74.00% | 75.00% | 73.90% |
| Prototype Matching Nets w/ H [2] | 74.30% | 75.20% | 74.00% |
| (Ours) Cosine & Att. Weight Gen. | 74.84% +/- 0.06 | 76.51% +/- 0.04 | 75.00% +/- 0.04 |

Top-5 20-shot classification accuracy

| Approach | Novel | All | All with prior |
|---|---|---|---|
| Prototypical Networks (from [2]) | 73.90% | 74.60% | 73.80% |
| Matching Networks (from [2]) | 76.90% | 76.50% | 75.60% |
| Logistic regression (from [2]) | 76.60% | 76.90% | 76.30% |
| Logistic regression w/ H (from [2]) | 76.50% | 76.90% | 76.30% |
| Prototype Matching Nets [2] | 77.00% | 77.10% | 75.90% |
| Prototype Matching Nets w/ H [2] | 77.40% | 77.50% | 76.20% |
| (Ours) Cosine & Att. Weight Gen. | 78.11% +/- 0.05 | 78.74% +/- 0.03 | 77.28% +/- 0.03 |

References

[1] B. Hariharan and R. Girshick. Low-shot visual recognition by shrinking and hallucinating features. ICCV 2017.
[2] Y.-X. Wang, R. Girshick, M. Hebert, and B. Hariharan. Low-shot learning from imaginary data. CVPR 2018.
[3] O. Vinyals et al. Matching networks for one shot learning. NIPS 2016.
[4] J. Snell, K. Swersky, and R. S. Zemel. Prototypical networks for few-shot learning. NIPS 2017.
[5] S. Ravi and H. Larochelle. Optimization as a model for few-shot learning. ICLR 2017.
[6] C. Finn, P. Abbeel, and S. Levine. Model-agnostic meta-learning for fast adaptation of deep networks. ICML 2017.
[7] N. Mishra, M. Rohaninejad, X. Chen, and P. Abbeel. A simple neural attentive meta-learner. ICLR 2018.


Issues

The dataset

Can anyone share the MiniImagenet pickle files with me? I can't download them!

Hard-coded parameters in evaluate.py lead to errors

In evaluate.py:

dloader_test = FewShotDataloader(
     dataset=MiniImageNet(phase=test_split),
     nKnovel=5, # number of novel categories on each training episode.
     nKbase=64, # number of base categories.
     nExemplars=nExemplars, # num training examples per novel category
     nTestNovel=15 * 5, # num test examples for all the novel categories
     nTestBase=15 * 5, # num test examples for all the base categories
     batch_size=1,
     num_workers=0,
     epoch_size=epoch_size, # num of batches per epoch
)

When evaluating the Proto Nets/Matching Nets, the hard-coded parameters above lead to an error related to missing Kbase_ids in FewShot.py, because nKbase is set to a non-zero value (64). A possible fix is:

data_test_opt  = config['data_test_opt']
dloader_test = FewShotDataloader(
    dataset=MiniImageNet(phase=test_split),
    nKnovel=data_test_opt['nKnovel'], # number of novel categories on each training episode.
    nKbase=data_test_opt['nKbase'], # number of base categories.
    nExemplars=data_test_opt['nExemplars'], # num training examples per novel category
    nTestNovel=data_test_opt['nTestNovel'], # num test examples for all the novel categories
    nTestBase=data_test_opt['nTestBase'], # num test examples for all the base categories
    batch_size=data_test_opt['batch_size'],
    num_workers=args_opt.num_workers, #0
    epoch_size=epoch_size # num of batches per epoch # data_test_opt['epoch_size']
)

ValueError: The provided metric AccuracyNovel for keeping the best model is not computed by the evaluation routine.

Can you give me some advice about this error?

2019-07-31 08:51:00,403 - algorithms.Algorithm - INFO   - Training: miniImageNet_Conv32CosineClassifier
 20%|█▉        | 197/1000 [00:05<00:21, 37.46it/s]2019-07-31 08:51:06,434 - algorithms.Algorithm 
100%|██████████| 1000/1000 [00:27<00:00, 35.99it/s]
2019-07-31 08:51:28,193 - algorithms.Algorithm - INFO   - ==> Training stats: {'loss': 2.469}
2019-07-31 08:51:28,200 - algorithms.Algorithm - INFO   - Evaluating: miniImageNet_Conv32CosineClassifier
2019-07-31 08:51:28,200 - algorithms.Algorithm - INFO   - ==> Dataset: MiniImageNet_val [2000 batches]
  0%|          | 0/2000 [00:00<?, ?it/s]/data1/zjj/meta-code/withoutForgetting/algorithms/FewShot.py:185: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  images_test_var = Variable(images_test, volatile=is_volatile)
/data1/withoutForgetting/algorithms/FewShot.py:190: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  images_train_var = Variable(images_train, volatile=is_volatile)
100%|██████████| 2000/2000 [01:14<00:00, 26.36it/s]
2019-07-31 08:52:43,083 - algorithms.Algorithm - INFO   - ==> Results: {'loss': 3.1131, 'AccuracyNovel_cnf': 0.4092}
2019-07-31 08:52:43,084 - algorithms.Algorithm - INFO   - ==> Evaluation stats: {'loss': 3.1131, 'AccuracyNovel_cnf': 0.4092}
Traceback (most recent call last):
  File "train.py", line 110, in <module>
    algorithm.solve(dloader_train, dloader_test)
  File "/data1/withoutForgetting/algorithms/Algorithm.py", line 288, in solve
    self.keep_record_of_best_model(eval_stats, self.curr_epoch)
  File "/data1/withoutForgetting/algorithms/Algorithm.py", line 359, in keep_record_of_best_model
    .format(metric_name))
ValueError: The provided metric AccuracyNovel for keeping the best model is not computed by the evaluation routine.

The error occurred in:

    def keep_record_of_best_model(self, eval_stats, current_epoch):
        if self.keep_best_model_metric_name is not None:
            metric_name = self.keep_best_model_metric_name
            if (metric_name not in eval_stats):
                raise ValueError('The provided metric {0} for keeping the best '
                                 'model is not computed by the evaluation routine.'
                                 .format(metric_name))
            metric_val = eval_stats[metric_name]

Thanks a lot!

Dynamic Few-Shot Visual Learning without Forgetting

It is really nice work! In our own experiment we tried to use a ResNet as the backbone, just like you did. However, it didn't improve performance as we expected, and it even performed worse than the 4-conv network. I wonder if there are some tricks to training your model with a ResNet. Thank you very much.

A question about the training process

Hello, this is really good work. What confuses me is that in training stage 2, when we need to train a weight generator, you keep training weight_base; it seems weight_base was already trained well in stage 1 (the pretraining stage), so is there any special reason for this operation? How can we ensure the compatibility between weight_base and the generated parameters, and the compatibility among the generated parameters themselves?
Also, can such a method be used in situations where N is very large (N-way K-shot)? In extreme cases, N might be larger than the number of weight_base vectors. If possible, I hope you can give me some suggestions.
Thank you~

Question about data_train_opt

  1. In training stage one, although the base-category setting is nTestBase=32 with batch_size=8, the actual batch size is 32*8, as in a general training procedure. So why did you split the single argument 'batch_size' into the two arguments 'nTestBase' and 'batch_size'?
  2. In training stage two, the settings are data_train_opt['nTestNovel'] = nKnovel * 3 and data_train_opt['nTestBase'] = nKnovel * 3. Why is the * 3 needed? I cannot find any information about it in the original paper.

It would be very kind of you to help me understand the code. Thanks a lot.

Obtained test accuracy lower than reported for miniImageNet

Hello, thank you for the implementation of your work. For the first stage, I used the Conv128CosineClassifier configuration. Later, with attention-based 5-way 1-shot training (untouched configuration), I get 55.9 on the novel classes and 70.3 on the base classes for the test set. This is lower than your report. I wonder whether the result I got is reasonable or not?

Installation Requirements

Hi @gidariss, could you please list the installation requirements/dependencies needed to run your code? i.e., Python 2.x/3.x, PyTorch 0.x, other packages, etc.?

Test the model on own dataset.

OK, let's assume that I have created the pickle files with the data for the training phase. Which parameters should I change in order to match my numbers of base and novel classes?

Training issue with PyTorch 1.1, Python 2.7

Hi,
Thanks for sharing the code. I have some problems when I run the second stage:
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128CosineClassifierGenWeightAttN1
The error is: File "/xulan/code/few_shot_withoutforgetting/FewShotWithoutForgetting-master/algorithms/FewShot.py", line 65, in set_tensors
self.tensors['labels_train_1hot'].resize_(labels_train_1hot_size).fill_(0).scatter_(
I checked labels_train_1hot_size; it is [8, 5, tensor([5, 5, 5, 5, 5, 5, 5, 5])]. I think there might be an issue with tensor([5, 5, 5, 5, 5, 5, 5, 5]). Do you know what the problem is here? My environment is Python 2.7 + PyTorch 1.1.
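A plausible cause (an assumption, not a confirmed diagnosis): in PyTorch >= 0.4, reductions such as .max() return 0-dim tensors rather than Python numbers as in 0.2, so nKnovel stays a tensor and leaks into the size list passed to resize_(). Under that assumption, a hedged fix is to cast explicitly:

# Hedged fix sketch for FewShot.py (assumes the tensor leaks in via .max()):
nKnovel = int(1 + labels_train.max().item() - self.nKbase)
labels_train_1hot_size = list(labels_train.size()) + [nKnovel]
self.tensors['labels_train_1hot'].resize_(labels_train_1hot_size).fill_(0)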

Bias in the Classifier Class

First off, thanks for sharing your code!
I was looking at the Classifier class in ClassifierWithFewShotGenerationModule.py and I realized that self.bias is a scalar rather than a vector. In the simple dot product classifier (or linear classifier), the bias should be a vector with size equal to the number of classes. Am I missing something or is this a bug in the code?

A code question about the paper: how is weight_base obtained?

Dear author,

We would like to cite your article. After reading your code in architectures/ClassifierWithFewShotGenerationModule,

I am a little confused about how weight_base is obtained in the Classifier class, at lines 112 and 185.

In def __init__(self, opt), only initializations are made.

In def get_classification_weights, I don't know how weight_base is obtained without passing parameters.

weight_base = torch.FloatTensor(nKall, nFeat).normal_(

weight_base = self.weight_base[Kbase_ids.view(-1)]

Implementation differs from what is reported in the paper

Hi, I am studying your code. I found some places in the code that differ from what is reported in the paper.

  1. There are overlaps between the base and novel classes. In the implementation, some classes are used both as novel and base classes, which is not the case as described in the paper.
  2. The weights obtained in the first training stage are not used in the second stage. The weights in the second stage are randomly sampled from a normal distribution, rather than using what was obtained in the first stage.

Could you please explain the purpose of doing so? Thanks.

Another place I am unclear about: it seems the novel classes are always the last five classes (labeled 59~63), which should not be the case in practice.

I will keep reading your code -- maybe I misunderstood something. But your explanations could enlighten me toward fully understanding your algorithm. Thanks.

Could you please give me some advice for improving acc_both?

Hi gidariss,
Thanks for your shared code.
I tested this mechanism with my own feature model and dataset, and got high acc_base and acc_novel but not acc_both. Could you please give me some advice?

I trained a model on my own dataset (trainset samples > 110k) and got acc_base 91.22%, acc_novel 90.48%, and acc_both 74.97% in stage 1. The model is training in stage 2 now,
but I find acc_both is still very low.
Also, should I use the feature net with the best acc_both, instead of the best acc_novel, from stage 1 in stage 2?

Thanks.

Details about the implementation

Dear authors, I tried to re-implement the paper but failed to reach the reported performance. I think there may be some details I missed. I trained a cosine-based classifier on the base classes (64 classes), and the accuracy is about 57% on the validation set. When I extract the feature before classification, which is 3200-d for C128, the 5-way 1-shot accuracy is only about 46% on the test classes and 43% on the val classes. Do you have some suggestions on the implementation? Thanks.

Asking for environment details

Thanks for the code! I am wondering if I can have the details of the environment, such as the list of packages and their versions, CUDA 9 or CUDA 8, etc.
I am trying to run your code for my school project, but I've got too many errors about that...
Thank you.

The first training stage reports an error

I got the following errors during training. If you know how to solve them, please tell me, thank you!

Traceback (most recent call last):
  File "I:/scienceresearch/lowshotlearn/05FewShotWithoutForgetting2/FewShotWithoutForgetting/train.py", line 100, in <module>
    algorithm = alg.FewShot(config)
  File "I:\scienceresearch\lowshotlearn\05FewShotWithoutForgetting2\FewShotWithoutForgetting\algorithms\FewShot.py", line 32, in __init__
    Algorithm.__init__(self, opt)
  File "I:\scienceresearch\lowshotlearn\05FewShotWithoutForgetting2\FewShotWithoutForgetting\algorithms\Algorithm.py", line 24, in __init__
    self.set_log_file_handler()
  File "I:\scienceresearch\lowshotlearn\05FewShotWithoutForgetting2\FewShotWithoutForgetting\algorithms\Algorithm.py", line 67, in set_log_file_handler
    self.log_fileHandler = logging.FileHandler(self.log_file)
  File "E:\anacoda3\lib\logging\__init__.py", line 1030, in __init__
    StreamHandler.__init__(self, self._open())
  File "E:\anacoda3\lib\logging\__init__.py", line 1059, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
OSError: [Errno 22] Invalid argument: 'I:\scienceresearch\lowshotlearn\05FewShotWithoutForgetting2\FewShotWithoutForgetting\experiments\miniImageNet_Conv128CosineClassifier\logs\LOG_INFO_2021-07-21_20:47:21.087383.txt'

Error when training

When executing the command below:
CUDA_VISIBLE_DEVICES=0 python train.py --config=miniImageNet_Conv128CosineClassifier

It prompts:

Exception KeyError: KeyError(<weakref at 0x7f619db132b8; to 'tqdm' at 0x7f619db23090>,) in <bound method tqdm.__del__ of
  0%|                                                                 | 0/2000 [00:00<?, ?it/s]> ignored
Traceback (most recent call last):
  File "train.py", line 110, in <module>
    algorithm.solve(dloader_train, dloader_test)
  File "/teamscratch/msravcshare/v-weijxu/code/few-shot/DynamicFewShot/algorithms/Algorithm.py", line 286, in solve
    eval_stats = self.evaluate(data_loader_test)
  File "/teamscratch/msravcshare/v-weijxu/code/few-shot/DynamicFewShot/algorithms/Algorithm.py", line 330, in evaluate
    eval_stats_this = self.evaluation_step(batch)
  File "/teamscratch/msravcshare/v-weijxu/code/few-shot/DynamicFewShot/algorithms/FewShot.py", line 84, in evaluation_ste
p
    return self.process_batch(batch, do_train=False)
  File "/teamscratch/msravcshare/v-weijxu/code/few-shot/DynamicFewShot/algorithms/FewShot.py", line 87, in process_batch
    process_type = self.set_tensors(batch)
  File "/teamscratch/msravcshare/v-weijxu/code/few-shot/DynamicFewShot/algorithms/FewShot.py", line 60, in set_tensors
    nKnovel = 1 + labels_train.max() - self.nKbase
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #3 'other'

Environment:
Python 2.7
PyTorch 0.4 @ CUDA 9.1
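The last line of the traceback suggests a CUDA tensor (from labels_train.max()) is being combined with a CPU LongTensor (presumably self.nKbase), which PyTorch 0.4 refuses to mix. A hedged workaround under that assumption is to reduce to a plain Python number before the arithmetic:

# Hedged workaround (assumes self.nKbase lives on the CPU while labels_train
# is on the GPU): extract a Python int before subtracting.
nKnovel = 1 + int(labels_train.max().item()) - self.nKbase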

Code release time

It's really excellent work! Do you have a concrete date for releasing the code? Thanks!

Obtained test accuracy higher than reported for miniImageNet

Hello, thank you for the implementation of your work. For the first stage, I used the ResNetLikeCosineClassifier configuration and changed the batch size to 6 (due to the memory limitation of my GPU). Later, with attention-based 5-way 1-shot training (untouched configuration), I get 56.89 +/- 0.83 on the novel classes and 78.70 on the base classes for the test set. What I'm wondering is: is there any additional step involved?

Reminder: installation tips

Many of you may be using PyTorch 1.0+, but using the required version can save you a lot of time; with PyTorch 1.2 you will need to spend a lot of time fixing bugs.
To successfully run this code, please download cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl and then run:

pip install torch*.whl
pip install torchvision==0.2.0

The above versions have been tested on an RTX 2080 Ti without modifying any lines of the code.
PyTorch 0.4 also works.
