iclr23-memo's Introduction

A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning

The code repository for "A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning" (ICLR'23 Spotlight) in PyTorch. If you use any content of this repo for your work, please cite the following bib entry:

@inproceedings{zhou2023model,
  title={A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning},
  author={Zhou, Da-Wei and Wang, Qi-Wei and Ye, Han-Jia and Zhan, De-Chuan},
  booktitle={ICLR},
  year={2023}
}

MEMO: Memory-Efficient expandable MOdel

Real-world applications require the classification model to adapt to new classes without forgetting old ones. Correspondingly, Class-Incremental Learning (CIL) aims to train a model with limited memory size to meet this requirement. Typical CIL methods save representative exemplars from former classes to resist forgetting, while recent works find that storing models from history can substantially boost performance. However, the stored models are not counted toward the memory budget, which implicitly results in unfair comparisons. We find that when the model size is counted into the total budget and methods are compared with aligned memory size, saving models does not consistently work, especially when the memory budget is limited. As a result, we need to holistically evaluate different CIL methods at different memory scales, considering accuracy and memory size simultaneously. On the other hand, we dive deeply into the construction of the memory buffer for memory efficiency. By analyzing the effect of different layers in the network, we find that shallow and deep layers have different characteristics in CIL. Motivated by this, we propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel. MEMO extends specialized layers on top of shared, generalized representations, efficiently extracting diverse representations at modest cost while maintaining representative exemplars. Extensive experiments on benchmark datasets validate MEMO's competitive performance.
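A minimal sketch of the core idea in PyTorch (the class and helper names below are hypothetical, not the repo's actual modules): the shallow, generalized block is shared across all tasks, and only the deep, specialized block is duplicated when a new task arrives.

```python
import torch
import torch.nn as nn

class MEMOSketch(nn.Module):
    """Illustrative only: shared shallow block + per-task deep blocks."""

    def __init__(self, generalized_block: nn.Module, make_specialized_block):
        super().__init__()
        self.generalized = generalized_block       # shared across all tasks
        self.make_specialized_block = make_specialized_block
        self.specialized = nn.ModuleList()         # grows by one block per task

    def add_task(self):
        # Expand only the deep, specialized layers for the new task;
        # the shallow layers stay shared, saving memory vs. full expansion.
        self.specialized.append(self.make_specialized_block())

    def forward(self, x):
        h = self.generalized(x)
        # Concatenate the diverse deep features; a unified classifier
        # (not shown) would sit on top of this concatenated representation.
        return torch.cat([block(h) for block in self.specialized], dim=1)
```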

Prerequisites

Training scripts

  • Train CIFAR100
python main_memo.py -model memo -init 10 -incre 10 -ms 3312 -net memo_resnet32 -p fair --train_base -d 0 1 2 3

Acknowledgment

We thank the following repos for providing helpful components/functions used in our work.

Contact

If there are any questions, please feel free to contact the authors: Da-Wei Zhou ([email protected]) and Qi-Wei Wang ([email protected]). Enjoy the code.


iclr23-memo's Issues

Some confusion about parameters and loss functions

Thanks for your excellent work!

I have some confusion about the parameter "-ms 3312": can you explain why you chose this memory size?
Also, about the loss function "loss_aux = F.cross_entropy(aux_logits, aux_targets)": I noticed the definition of the FC layer, "self.aux_fc = self.generate_fc(self.out_dim, new_task_size+1)". Can you provide more explanation about aux_fc?

Many thanks, and I look forward to your reply.
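For context, the `new_task_size+1` output dimension matches the auxiliary-classifier pattern popularized by DER, where all previously seen classes are collapsed into a single extra "old" class. Below is a minimal sketch of that label mapping, assuming this convention holds here (the `aux_loss` helper is illustrative, not the repo's code):

```python
import torch
import torch.nn.functional as F

def aux_loss(aux_logits, targets, known_classes):
    # Hypothetical helper: collapse all old classes to label 0 and remap
    # the new task's classes to 1..new_task_size, so the auxiliary head
    # only needs to separate new classes from "anything old".
    aux_targets = torch.where(
        targets >= known_classes,
        targets - known_classes + 1,     # new class i -> i - known + 1
        torch.zeros_like(targets),       # any old class -> 0
    )
    return F.cross_entropy(aux_logits, aux_targets)
```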

About outputs becoming NaN

I encountered a problem while reproducing this code. In the second training stage, the old model's outputs for new data were NaN. I debugged the code and found that the distribution of the model's output logits differed significantly, ultimately leading to numerical overflow. May I ask what tricks were used in your implementation to avoid such issues?
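One common, numerically stable pattern for comparing old- and new-model outputs is to keep the distillation computation in log-space rather than exponentiating raw logits. The sketch below illustrates that general pattern; it is an assumption for discussion, not necessarily the trick used by the authors:

```python
import torch
import torch.nn.functional as F

def stable_kd_loss(student_logits, teacher_logits, T: float = 2.0):
    # log_softmax works in log-space, so large logits do not overflow exp().
    log_p = F.log_softmax(student_logits / T, dim=1)
    # Teacher probabilities are bounded in [0, 1] after softmax.
    q = F.softmax(teacher_logits / T, dim=1)
    # kl_div expects log-probabilities as its first argument.
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)
```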
