continualai / avalanche
Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
Home Page: http://avalanche.continualai.org
License: MIT License
I tried to run getting_started.py with Split MNIST instead of Permuted MNIST, but EvalProtocol crashes.
It is probably a problem caused by the classes missing from each task, but I haven't looked deeply at the code yet.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/avalanche/examples/getting_started.py in <module>
59
60 # testing
---> 61 results.append(clmodel.test(test_full))
~/avalanche/avalanche/training/strategies/strategy.py in test(self, test_set)
183 self.after_task_test()
184
--> 185 self.eval_protocol.update_tb_test(res, self.batch_processed)
186
187 self.after_test()
~/avalanche/avalanche/evaluation/eval_protocol.py in update_tb_test(self, res, step)
100 in_out_scalars = {
101 "in_class": np.average(in_class_diff),
--> 102 "out_class": np.average(out_class_diff)
103 }
104
<__array_function__ internals> in average(*args, **kwargs)
~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in average(a, axis, weights, returned)
391
392 if weights is None:
--> 393 avg = a.mean(axis)
394 scl = avg.dtype.type(a.size/avg.size)
395 else:
~/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
149 is_float16_result = True
150
--> 151 ret = umr_sum(arr, axis, dtype, out, keepdims)
152 if isinstance(ret, mu.ndarray):
153 ret = um.true_divide(
ValueError: operands could not be broadcast together with shapes (4,) (6,)
We need to add the CCA metric (described here). I think adding it only to Tensorboard would be fine.
We should add a simple metric that keeps track of the elapsed time of the experiment.
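A minimal sketch of what this could look like (the ElapsedTime name and its result() interface are assumptions, not existing Avalanche API):

import time

class ElapsedTime:
    """Keeps track of the wall-clock time elapsed since the experiment started."""

    def __init__(self):
        self._start = time.monotonic()

    def result(self):
        # Seconds elapsed since this metric was instantiated.
        return time.monotonic() - self._start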
Unlike other datasets such as MNIST, CORe50 must be explicitly downloaded by the user. It would be better to have the same interface for all datasets.
Moreover, this behavior also breaks the script examples/simple_core50.py, since the provided data folder does not exist.
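A sketch of the shared auto-download behavior, using torchvision's archive helper (the ensure_dataset name is illustrative, and the actual CORe50 archive URL would have to be filled in):

import os
from torchvision.datasets.utils import download_and_extract_archive

def ensure_dataset(root, url, filename):
    """Download and extract the archive only if the data folder is missing."""
    if not os.path.isdir(root):
        download_and_extract_archive(url, download_root=root, filename=filename)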
CWR* and AR1 can either use an optimizer passed by the caller, or create one from the lr and momentum parameters.
I think the optimizer should not be an argument, because right now these strategies silently override the user's choice. Let me know if there is a reason for this; otherwise I will remove the parameter.
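A sketch of the proposed constructor, with the strategy always building its own optimizer (class name and defaults are illustrative, not the actual Avalanche signatures):

from torch.optim import SGD

class AR1Strategy:
    def __init__(self, model, lr=0.001, momentum=0.9):
        self.model = model
        # Built internally from lr/momentum, so a caller-supplied optimizer
        # can never be silently overridden.
        self.optimizer = SGD(model.parameters(), lr=lr, momentum=momentum)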
Check if it's possible to create a conda package (with dependencies) from GitHub.
This will be useful to create a "scenario" that follows the NCGenericScenario
API and works with any CL dataset based on filelists.
We need to add a metric that tracks CPU/GPU usage over time.
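A minimal sketch of the sampling side, using psutil for CPU and the standard CUDA memory counters (logging the values over time would be the metric's job):

import psutil
import torch

def usage_snapshot():
    """Return current CPU utilization (%) and allocated GPU memory (MB)."""
    cpu_percent = psutil.cpu_percent(interval=None)
    gpu_mb = 0.0
    if torch.cuda.is_available():
        gpu_mb = torch.cuda.memory_allocated() / (1024 ** 2)
    return cpu_percent, gpu_mb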
Add additional usage examples to the "examples" directory to showcase Avalanche's functionality.
Matplotlib complains that figures are left open when creating images of confusion matrices. The exact warning is:
avalanche/evaluation/metrics.py:292: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning).
We should consider closing them in a "finally" block.
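A minimal sketch of the fix, assuming the metric renders the matrix through pyplot:

import matplotlib.pyplot as plt

def render_confusion_matrix(cm):
    fig, ax = plt.subplots()
    try:
        ax.imshow(cm)
        fig.canvas.draw()
        # ... convert the rendered canvas into an image for Tensorboard ...
    finally:
        # Always release the figure so pyplot does not accumulate them.
        plt.close(fig)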
We need to add the CIFAR-10 dataset (split into tasks of 2 classes each).
Up to now, CifarSplit and ICifar require the user to have the dataset stored locally. We should make the interface the same as for all the other datasets and download the data automatically if needed.
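torchvision already supports this transparently, so the change should be small:

from torchvision.datasets import CIFAR10

# download=True fetches the data only if it is not already under root.
train_set = CIFAR10(root="./data", train=True, download=True)
test_set = CIFAR10(root="./data", train=False, download=True)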
Add the iCaRL strategy to the main CL baselines.
Add ImageNet as a benchmark based on the PyTorch data loader.
Add the EWC strategy as an additional CL baseline.
We need to create instructions to install the avalanche package directly from master on GitHub, so we don't have to package it every time.
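The standard pip-from-git form should be all that's needed (assuming master is the default branch):

pip install git+https://github.com/ContinualAI/avalanche.git@master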
This could be an interesting dataset to have / explore.
Paper here: https://arxiv.org/abs/1909.04951
What do you think?
Hi @AntonioCarta, I've noticed that this function in the LearningWithoutForgetting class is never used; do we need it?
Add and test a basic rehearsal strategy!
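A minimal sketch of the rehearsal idea (a reservoir-style buffer; class and method names are illustrative):

import random

class RehearsalBuffer:
    """Fixed-size memory of past (x, y) examples, filled by reservoir sampling."""

    def __init__(self, capacity=200):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Replace a random slot with probability capacity / seen.
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.buffer[idx] = example

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))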
Add Tiny-ImageNet to the benchmarks.
We need a custom Tensorboard logging class to pass to the evaluation protocol class. This class is important to give the user more control over tb logging regardless of the chosen metrics.
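A sketch of what such a wrapper could look like (the TensorboardLogger name and methods are assumptions, not existing Avalanche API):

from torch.utils.tensorboard import SummaryWriter

class TensorboardLogger:
    """Thin wrapper the evaluation protocol can call without knowing tb details."""

    def __init__(self, log_dir="./tb_logs"):
        self.writer = SummaryWriter(log_dir=log_dir)

    def log_scalar(self, name, value, step):
        self.writer.add_scalar(name, value, step)

    def close(self):
        self.writer.close()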
I am starting to play around with the Strategy class and I would like to propose some changes:
- Each strategy currently overrides the train method with the entire training loop. This causes a lot of code duplication. I would like to have a compute_loss method that gets called inside train (see the sketch below).
- The callbacks (after_train, before_train, ...) are all implemented as abstract methods. This means that each new strategy must define them itself, and most strategies will implement them as empty methods. Can't we give a default empty implementation?
- multi_head is not used. What is it doing?
Finally, I think that we should try to separate logging code (tensorboard, print statements) from training code. The EvalProtocol should be the only one doing the logging. However, this is less urgent right now.
I can do the changes, but first I wanted to discuss them with you.
Notice that already existing code is not affected by these changes.
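A sketch of the proposed structure for the first two points (hypothetical names, just to make the idea concrete):

class Strategy:
    def train(self, data):
        self.before_train()
        for batch in data:
            loss = self.compute_loss(batch)
            # ... backward pass and optimizer step would use `loss` here ...
        self.after_train()

    def compute_loss(self, batch):
        # The only method a concrete strategy must implement.
        raise NotImplementedError

    # Default no-op callbacks: subclasses override only what they need.
    def before_train(self):
        pass

    def after_train(self):
        pass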
Before releasing our project to the public, the API should be well documented and easy to understand.
Check if all data loaders have the same API: in particular, they should return PyTorch tensors, not numpy arrays!
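Where a loader still yields numpy arrays, the fix is typically a cheap, copy-free conversion:

import numpy as np
import torch

batch_np = np.zeros((32, 3, 32, 32), dtype=np.float32)
batch_t = torch.from_numpy(batch_np)  # shares memory with the numpy array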
Add unit tests using unittest.
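A minimal skeleton, which would also catch the numpy/tensor issue above (a real Avalanche loader would replace the stand-in tensors):

import unittest
import torch

class TestLoaderOutputs(unittest.TestCase):
    def test_returns_torch_tensors(self):
        # Stand-in for a batch produced by an Avalanche data loader.
        x, y = torch.zeros(4, 1, 28, 28), torch.zeros(4, dtype=torch.long)
        self.assertIsInstance(x, torch.Tensor)
        self.assertIsInstance(y, torch.Tensor)

if __name__ == "__main__":
    unittest.main()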
Add the AR1 strategy to the other available baselines.
Metrics and EvalProtocol are a little bit unclear to me. Each metric currently prints its results inside its compute method, and each time we add a new metric we also have to add a new if case inside EvalProtocol's get_results. I would prefer a generic EvalProtocol that controls printing and logging and only delegates the computations to the metrics (e.g. instead of printing inside compute, EvalProtocol calls the __str__ method). I would also prefer to be able to choose where to print the metrics (output file, tensorboard, stdout).
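A sketch of the delegation idea (all names here are hypothetical, not the current API):

import sys

class Metric:
    def compute(self, results):
        """Update internal state only; no printing or logging in here."""
        raise NotImplementedError

class EvalProtocol:
    def __init__(self, metrics, sinks=(sys.stdout,)):
        self.metrics = metrics
        self.sinks = sinks  # e.g. stdout, an open file, a tensorboard adapter

    def report(self, results, step):
        for metric in self.metrics:
            metric.compute(results)
            for sink in self.sinks:
                # The protocol decides where output goes; the metric only
                # knows how to render itself via __str__.
                sink.write("step %d: %s\n" % (step, metric))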
Add a basic LwF strategy.
When a pull request is opened, it is built by Travis CI to evaluate the correctness of the code. Right now Travis only signals whether a build passed or failed, without any further information (e.g. why it failed, what the errors are, etc.). It would be nice to have more feedback from Travis, in order to immediately know why a build failed without having to open Travis and inspect the console.
Already implemented in my private codebase, working on porting it to Avalanche.
This class will allow the user to create a NC (New Classes) scenario given a couple of generic train and test Datasets.
The user will be able to create a manager instance that is an iterable. This iterable will output the incremental "task"s or "batch"es (terminology to be defined) and will also allow the user to execute certain complex task/batch management operations.
This is very similar to the loader currently being implemented in the Avalanche codebase, but it will allow users to plug in their own datasets. Also, being extremely generic, it will speed up the integration of new datasets.
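A condensed sketch of the core splitting logic (class and argument names are illustrative, not the final API):

from torch.utils.data import Subset

class NCScenario:
    """Splits a classification dataset into tasks with disjoint classes."""

    def __init__(self, dataset, targets, classes_per_task=2):
        self.tasks = []
        classes = sorted(set(targets))
        for i in range(0, len(classes), classes_per_task):
            task_classes = set(classes[i:i + classes_per_task])
            indices = [j for j, y in enumerate(targets) if y in task_classes]
            self.tasks.append(Subset(dataset, indices))

    def __iter__(self):
        return iter(self.tasks)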
The code I've already implemented in my private codebase works fine, but is complex. I'm working on slimming it down a bit. Here is a list of already implemented features, please feel free to comment if you feel we need even more features!
Features (already implemented):
Side features (already implemented):
To be defined (even in future development phases):
Feel free to comment.
Keep up with the excellent work you've been doing!
Similar to the generic New Classes manager, a New Instances manager will allow us to streamline the creation of New Instances benchmarks.
The NI Manager should mainly focus on the SIT scenario.
Key features:
The NI manager should also include features found in the NCScenario. For instance:
We should release Avalanche with a proper Continuous Integration system.
The CI setup should include:
@ggraffieti is already working on #3 for docs: I think that it'll be a good starting point.
Major obstacles are:
I'm creating this issue as "low priority", but we should definitely not release version 0.1.0 of Avalanche before a decent CI setup has been defined.
I created a new environment for the project.
Using the environment.yml file, I was not able to install pytorchcv through conda, so I used pip to install pytorchcv 0.0.58.
However, the example in examples/getting_started.py is not working for me:
ImportError: cannot import name 'DwsConvBlock' from 'pytorchcv.models.mobilenet' (/home/carta/anaconda3/envs/avalanche-env/lib/python3.8/site-packages/pytorchcv/models/mobilenet.py)
Maybe I need a different version of the package? @vlomonaco, or anyone else who is able to run the examples, can you tell me your pytorchcv version? You can use pip list | grep pytorchcv.
I do not know the library but from a quick look at the implementation it seems that they refactored their code, changing the name and location of the convolutional blocks.
The ICifar100 dataset is present both in benchmarks/cdata_loaders/cifar_split.py and in benchmarks/cdata_loaders/icifar100.py. The versions, however, are different.
Is there a reason to keep two different versions? Otherwise we should either keep the correct version or merge them into icifar100.py.
Moreover, in cifar_split.py there is a typo: the get_grow_test_set method should be named get_growing_testset in order to be compliant with the Avalanche interface.
The Strategy API should be changed to suit the new Datasets API, i.e. it should be possible to train on a batch_info object. Since all the strategies will break because of this, we can create a new Strategy file for now!
It would be nice to also have a MAC metric. It's difficult to compute in native PyTorch for every possible layer, but it would be a nice, hardware-independent metric.
Any idea on how to do this easily in PyTorch?
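One possible approach, sketched for Conv2d and Linear only (other layer types would need their own counting rules):

import torch
import torch.nn as nn

def count_macs(model, input_shape):
    """Estimate multiply-accumulate operations for one forward pass."""
    macs = [0]

    def hook(module, inputs, output):
        if isinstance(module, nn.Conv2d):
            # MACs = kernel ops per output element * number of output elements.
            kernel_ops = (module.in_channels // module.groups
                          * module.kernel_size[0] * module.kernel_size[1])
            macs[0] += kernel_ops * output.numel()
        elif isinstance(module, nn.Linear):
            macs[0] += module.in_features * module.out_features

    handles = [m.register_forward_hook(hook) for m in model.modules()]
    with torch.no_grad():
        model(torch.zeros(1, *input_shape))
    for h in handles:
        h.remove()
    return macs[0]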
Create a usage tutorial for Google Colab using the conda environment.
Add additional tests for each continual data loader.
We need to add some feature statistics in tensorboard.
The user should be able to specify them during the creation of the tensorboard object. For now, the TensorboardLogging object is created in eval_protocol.py.
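For example, per-layer activation histograms can be logged through the standard SummaryWriter API:

import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="./tb_logs")
# `features` stands in for an activation tensor captured during the forward pass.
features = torch.randn(128, 64)
writer.add_histogram("features/layer1", features, global_step=0)
writer.close()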
I think it could be useful to add this quite popular approach. It could highlight pros and cons when monitored through the Memory and CPU Usage metrics.
We need to add a metric related to disk usage: it would be nice to have both I/O usage and a check on the size of an additional directory that each strategy can use to store things that don't fit in RAM.
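A sketch of both pieces with psutil (the strategy scratch directory is an assumption):

import os
import psutil

def disk_metrics(strategy_dir):
    """Return cumulative disk reads/writes (MB) and the scratch dir size (MB)."""
    io = psutil.disk_io_counters()
    read_mb = io.read_bytes / (1024 ** 2)
    write_mb = io.write_bytes / (1024 ** 2)
    dir_mb = sum(
        os.path.getsize(os.path.join(root, f))
        for root, _, files in os.walk(strategy_dir) for f in files
    ) / (1024 ** 2)
    return read_mb, write_mb, dir_mb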
Create the Sphinx auto-documentation on gh-pages, to be built with Travis CI.