Meta-Transfer Learning TensorFlow
This repository contains the TensorFlow implementation for the CVPR 2019 paper "Meta-Transfer Learning for Few-Shot Learning" by Qianru Sun*, Yaoyao Liu*, Tat-Seng Chua and Bernt Schiele.
If you have any problems running this repository, feel free to send me an email or open an issue. I will reply as soon as I see it. (Email: liuyaoyao at tju.edu.cn)
Meta-learning has been proposed as a framework to address the challenging few-shot learning setting. The key idea is to leverage a large number of similar few-shot tasks in order to learn how to adapt a base-learner to a new task for which only a few labeled samples are available. As deep neural networks (DNNs) tend to overfit using a few samples only, meta-learning typically uses shallow neural networks (SNNs), thus limiting its effectiveness. In this paper we propose a novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few-shot learning tasks. Specifically, meta refers to training multiple tasks, and transfer is achieved by learning scaling and shifting functions of DNN weights for each task. In addition, we introduce the hard task (HT) meta-batch scheme as an effective learning curriculum for MTL. We conduct experiments using (5-class, 1-shot) and (5-class, 5-shot) recognition tasks on two challenging few-shot learning benchmarks: miniImageNet and Fewshot-CIFAR100. Extensive comparisons to related works validate that our meta-transfer learning approach trained with the proposed HT meta-batch scheme achieves top performance. An ablation study also shows that both components contribute to fast convergence and high accuracy.
Figure: Meta-Transfer Learning. (a) Parameter-level fine-tuning (FT) is a conventional meta-training operation, e.g. in MAML. Its update works for all neuron parameters, W and b. (b) Our neuron-level scaling and shifting (SS) operations in meta-transfer learning. They reduce the number of learning parameters and avoid overfitting problems. In addition, they keep large-scale trained parameters (in yellow) frozen, preventing "catastrophic forgetting".
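As a rough illustration of panel (b), the sketch below applies scaling and shifting to a single frozen fully connected layer in plain Python. It is not the repository's TensorFlow code: the function name `ss_dense`, the choice of one scale and one shift value per output neuron, and the toy numbers are all assumptions made for this example.

```python
def ss_dense(x, weights, bias, scale, shift):
    """Apply a frozen dense layer with scaled weights and a shifted bias.

    weights[j][i] connects input i to output neuron j and stays frozen.
    scale[j] and shift[j] are the only values that would be learned.
    """
    outputs = []
    for w_row, b, s, t in zip(weights, bias, scale, shift):
        # Scale the frozen weights of neuron j, then shift its frozen bias.
        activation = sum(s * w * xi for w, xi in zip(w_row, x)) + (b + t)
        outputs.append(activation)
    return outputs


# Frozen (pre-trained) parameters of a toy layer with 3 inputs and 2 outputs.
W = [[0.5, -1.0, 2.0],
     [1.5, 0.0, -0.5]]
b = [0.1, -0.2]
x = [1.0, 2.0, 3.0]

# Scale = 1 and shift = 0 reproduce the frozen layer exactly.
identity_out = ss_dense(x, W, b, scale=[1.0, 1.0], shift=[0.0, 0.0])

# A task-specific adaptation touches 4 numbers instead of the 8 in W and b.
adapted_out = ss_dense(x, W, b, scale=[2.0, 0.5], shift=[0.0, 1.0])
```

Starting from scale 1 and shift 0 means adaptation begins at the pre-trained behaviour, and the frozen values in `W` and `b` are never overwritten.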
To run this repository, we advise you to install Python 2.7 and TensorFlow 1.3.0 with Anaconda.
You may download Anaconda and read the installation instructions on the official website: https://www.anaconda.com/download/
Create a new environment and install TensorFlow in it:
conda create --name mtl python=2.7
conda activate mtl
conda install tensorflow-gpu==1.3.0
Clone this repository:
git clone https://github.com/y2l/meta-transfer-learning-tensorflow.git
cd meta-transfer-learning-tensorflow
Install other requirements:
pip install scipy
pip install tqdm
pip install opencv-python
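After installation, a quick import check can confirm that the environment is usable. The helper below is only a convenience sketch and is not part of the repository; `check_packages` is a hypothetical name, and the module names passed in are the usual import names for the packages installed above.

```python
import importlib


def check_packages(module_names):
    """Return {module_name: version string, or None if the import fails}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing or broken install both count as unavailable here.
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    # opencv-python is imported under the name cv2.
    results = check_packages(["tensorflow", "scipy", "tqdm", "cv2"])
    for name, version in sorted(results.items()):
        print("%-12s %s" % (name, version if version else "MISSING"))
```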
The miniImageNet dataset was proposed by Vinyals et al. for few-shot learning evaluation. Its complexity is high due to the use of ImageNet images, but it requires fewer resources and less infrastructure than running on the full ImageNet dataset. In total, there are 100 classes with 600 samples of 84×84 color images per class. These 100 classes are divided into 64, 16, and 20 classes for sampling tasks for meta-training, meta-validation, and meta-test, respectively.
To generate this dataset from ImageNet, you may use the repository miniImageNet tools. You may also directly download processed images. [Download Page]
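To make the task sampling concrete, here is a minimal sketch of drawing one (5-class, 1-shot) task from the meta-training split. It does not reproduce the repository's data generator: `sample_episode`, the `query_num=15` default, and the synthetic file names are assumptions for illustration only.

```python
import random


def sample_episode(images_by_class, way_num=5, shot_num=1, query_num=15, rng=random):
    """Draw one few-shot task: way_num classes, with shot_num support and
    query_num query samples per class and no overlap between the two sets."""
    classes = rng.sample(sorted(images_by_class), way_num)
    support, query = [], []
    for label, cls in enumerate(classes):
        picked = rng.sample(images_by_class[cls], shot_num + query_num)
        support += [(path, label) for path in picked[:shot_num]]
        query += [(path, label) for path in picked[shot_num:]]
    return support, query


# Synthetic stand-in for the 64 meta-training classes with 600 images each.
train_split = {
    "class_%02d" % c: ["class_%02d/img_%03d.jpg" % (c, i) for i in range(600)]
    for c in range(64)
}

support_set, query_set = sample_episode(train_split, way_num=5, shot_num=1,
                                        rng=random.Random(0))
```

Labels are re-indexed from 0 to `way_num - 1` inside each task, so the base-learner only ever sees a 5-way problem regardless of which classes were drawn.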
Fewshot-CIFAR100 (FC100) is based on the popular object classification dataset CIFAR100. The splits were proposed by TADAM. It offers a more challenging scenario with lower image resolution and more challenging meta-training/test splits that are separated according to object super-classes. It contains 100 object classes and each class has 600 samples of 32×32 color images. The 100 classes belong to 20 super-classes. Meta-training data are from 60 classes belonging to 12 super-classes. The meta-validation and meta-test sets each contain 20 classes belonging to 4 super-classes.
You may directly download processed images. [Download Page]
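The super-class separation can be checked with a few lines of Python. The sketch below uses a synthetic class numbering (5 consecutive classes per super-class) and a hypothetical assignment of super-classes to splits; the actual FC100 assignment is the one defined by TADAM and is not reproduced here.

```python
def split_classes_by_superclass(class_to_super, split_of_super):
    """Group class ids into splits according to their super-class."""
    splits = {}
    for cls, sup in sorted(class_to_super.items()):
        splits.setdefault(split_of_super[sup], []).append(cls)
    return splits


# CIFAR100 layout: 100 classes, 5 per super-class, 20 super-classes.
# The numbering is synthetic, not the real CIFAR100 label order.
class_to_super = {cls: cls // 5 for cls in range(100)}

# Hypothetical assignment: 12 super-classes for train, 4 for val, 4 for test.
split_of_super = {}
for sup in range(20):
    split_of_super[sup] = "train" if sup < 12 else ("val" if sup < 16 else "test")

fc100_splits = split_classes_by_superclass(class_to_super, split_of_super)
split_sizes = {name: len(classes) for name, classes in fc100_splits.items()}
print(split_sizes)
```

Because whole super-classes are assigned to a split, no meta-test class shares a super-class with any meta-training class, which is what makes FC100 harder than a random class split.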
The tieredImageNet dataset is a larger subset of ILSVRC-12 with 608 classes (779,165 images) grouped into 34 higher-level nodes in the ImageNet human-curated hierarchy.
To generate this dataset from ImageNet, you may use the repository tieredImageNet tools. You may also directly download processed images. [Download Page]
.
├── data_generator              # dataset generator
|   ├── pre_data_generator.py   # data generator for pre-train phase
|   └── meta_data_generator.py  # data generator for meta-train phase
├── models                      # tensorflow model files
|   ├── models.py               # basic model class
|   ├── pre_model.py            # pre-train model class
|   └── meta_model.py           # meta-train model class
├── trainer                     # tensorflow trainer files
|   ├── pre.py                  # pre-train trainer class
|   └── meta.py                 # meta-train trainer class
├── utils                       # a series of tools used in this repo
|   └── misc.py                 # miscellaneous tool functions
├── main.py                     # the python file with main function and parameter settings
└── run_experiment.py           # the script to run the whole experiment
To run the experiments:
python run_experiment.py
You may edit the run_experiment.py file to change the hyperparameters and options.
- LOG_DIR: Name of the folder to save the log files
- GPU_ID: GPU device id
- PRE_TRA_LABEL: Additional label for the pre-train model
- PRE_TRA_ITER_MAX: Number of iterations for the pre-train phase
- PRE_TRA_DROP: Dropout keep rate for the pre-train phase
- PRE_DROP_STEP: Iteration step for reducing the pre-train learning rate
- PRE_LR: Pre-train learning rate
- SHOT_NUM: Number of samples for each class
- WAY_NUM: Number of classes for the few-shot tasks
- MAX_MAX_ITER: Number of iterations for the meta-train phase
- META_BATCH_SIZE: Meta batch size
- PRE_ITER: Iteration number of the pre-train model used in the meta-train phase
- UPDATE_NUM: Number of epochs for the base learning
- SAVE_STEP: Iteration step for saving the meta model
- META_LR: Meta learning rate
- META_LR_MIN: Minimum value of the meta learning rate
- LR_DROP_STEP: Iteration step for reducing the meta learning rate
- BASE_LR: Base learning rate
- PRE_TRA_DIR: Directory for the pre-train phase images
- META_TRA_DIR: Directory for the meta-train images
- META_VAL_DIR: Directory for the meta-validation images
- META_TES_DIR: Directory for the meta-test images
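To show how META_LR, META_LR_MIN and LR_DROP_STEP can interact, here is a minimal step-decay sketch. It assumes the learning rate is halved every LR_DROP_STEP iterations and clipped at META_LR_MIN; the halving factor, the function name `meta_learning_rate`, and the example values are assumptions and not the repository defaults, so please check trainer/meta.py for the actual rule.

```python
def meta_learning_rate(iteration, meta_lr, meta_lr_min, lr_drop_step, decay=0.5):
    """Step decay: multiply by `decay` once per lr_drop_step iterations,
    never going below meta_lr_min."""
    drops = iteration // lr_drop_step
    return max(meta_lr * (decay ** drops), meta_lr_min)


# Illustrative values only (assumed, not taken from run_experiment.py).
checkpoints = (0, 999, 1000, 2000, 5000)
schedule = [
    meta_learning_rate(i, meta_lr=0.001, meta_lr_min=0.0001, lr_drop_step=1000)
    for i in checkpoints
]
```

With these values the rate stays at 0.001 for the first 1000 iterations, halves at iterations 1000 and 2000, and is held at the 0.0001 floor by iteration 5000.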
The file run_experiment.py is just a script to generate commands for main.py. If you want to change other settings, please see the comments and descriptions in main.py.
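The idea of generating a command from a set of options can be sketched as follows. This is not the content of run_experiment.py: `build_command` is a hypothetical helper, and the flag names in the example are placeholders, since the real flags are the ones defined in main.py.

```python
def build_command(script, settings):
    """Join settings into one 'python <script> --key=value ...' string."""
    flags = ["--%s=%s" % (key, settings[key]) for key in sorted(settings)]
    return " ".join(["python", script] + flags)


# Hypothetical flag names; the real ones are defined in main.py.
example_settings = {"shot_num": 1, "way_num": 5, "meta_batch_size": 2}
command = build_command("main.py", example_settings)
print(command)
```

Sorting the keys keeps the generated command stable between runs, which makes it easier to compare log files.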
In the default setting, if you run python run_experiment.py, the pre-train process will be conducted before the meta-train phase starts. If you want to use the model pre-trained by us, you may download it from the following link and then replace the pre-train model loading directory in trainer/meta.py.
Download Pretrained Model (miniImageNet): [Google Drive] [Baidu Netdisk] (extraction code: efsv)
We will release more pre-trained models later.
- Hard task meta-batch. The implementation of hard task meta-batch is not included in the published code. I still need time to rewrite the hard task meta-batch code for the current framework.
- More network architectures. We will add new backbones to the framework, such as ResNet18 and ResNet34.
- PyTorch version. We will release the code for MTL in PyTorch. It may take several months to complete.
Please cite our paper if it is helpful to your work:
@inproceedings{sun2019mtl,
title={Meta-Transfer Learning for Few-Shot Learning},
author={Qianru Sun and Yaoyao Liu and Tat{-}Seng Chua and Bernt Schiele},
booktitle={CVPR},
year={2019}
}
Our implementation uses the source code from the following repositories: