
Comments (11)

nttung1110 commented on September 2, 2024

And here is the JSON config that I used in my experiment:

{
    "prefix": "reproduce",
    "dataset": "cifar100",
    "memory_size": 2000,
    "memory_per_class": 20,
    "fixed_memory": false,
    "shuffle": true,
    "init_cls": 10,
    "increment": 10,
    "model_name": "icarl",
    "convnet_type": "resnet32",
    "device": ["1"],
    "seed": [1993]
}
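
For context, a config like this is normally launched through PyCIL's entry point, assuming the file is saved under exps/ as in the repo layout:

python main.py --config=./exps/icarl.json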





zhoudw-zdw commented on September 2, 2024

I am not quite sure which result you would like to reproduce.

It seems your final NME accuracy is about 49.4, which matches what we reported in figure (b) (https://github.com/G-U-N/PyCIL/blob/master/resources/cifar100.png) and in Figure 1(a) of the paper (https://arxiv.org/pdf/2112.12533v2.pdf).


nttung1110 commented on September 2, 2024

Thanks for your reply. I was confused about NME and CNN, and now realize that NME, not CNN, is the classifier used for icarl. How about the following foster results? I could not reproduce the numbers reported in the figure (the final CNN accuracy is about 55, but it should be above 60 based on your reported figures, right?).

2023-05-23 01:18:52,588 [foster.py] => Exemplar size: 2000
2023-05-23 01:18:52,589 [trainer.py] => CNN: {'total': 54.55, '00-09': 48.8, '10-19': 33.5, '20-29': 51.3, '30-39': 46.8, '40-49': 59.0, '50-59': 52.7, '60-69': 66.9, '70-79': 60.5, '80-89': 64.0, '90-99': 62.0, 'old': 53.72, 'new': 62.0}
2023-05-23 01:18:52,589 [trainer.py] => NME: {'total': 48.13, '00-09': 46.9, '10-19': 32.9, '20-29': 49.5, '30-39': 42.9, '40-49': 52.0, '50-59': 42.4, '60-69': 50.2, '70-79': 43.1, '80-89': 45.6, '90-99': 75.8, 'old': 45.06, 'new': 75.8}
2023-05-23 01:18:52,589 [trainer.py] => CNN top1 curve: [90.8, 81.2, 76.27, 67.72, 63.86, 60.65, 59.81, 56.8, 55.58, 54.55]
2023-05-23 01:18:52,589 [trainer.py] => CNN top5 curve: [99.2, 95.75, 93.83, 90.25, 87.82, 85.35, 84.29, 83.19, 82.04, 81.68]
2023-05-23 01:18:52,589 [trainer.py] => NME top1 curve: [90.2, 82.35, 77.07, 68.8, 64.28, 59.82, 57.11, 51.94, 51.09, 48.13]
2023-05-23 01:18:52,589 [trainer.py] => NME top5 curve: [99.2, 96.85, 95.03, 91.02, 88.38, 85.8, 83.57, 81.66, 79.91, 76.7]
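
For context on the CNN vs. NME distinction above: NME (nearest-mean-of-exemplars) is iCaRL's classifier, which assigns a test sample to the class whose mean exemplar feature is closest, rather than using the CNN's fully connected head. A minimal sketch of the idea, not PyCIL's exact implementation:

import numpy as np

def nme_predict(features, class_means):
    # iCaRL L2-normalizes features and class means before comparing.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    means = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    # Distance from every feature to every class mean; pick the nearest.
    dists = np.linalg.norm(feats[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1)

Here features is an (N, D) array of extracted test features and class_means is a (C, D) array of per-class exemplar feature means.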


nttung1110 commented on September 2, 2024

And here is the experiment configuration of foster:

{
    "prefix": "cil",
    "dataset": "cifar100",
    "memory_size": 2000,
    "memory_per_class": 20,
    "fixed_memory": false,
    "shuffle": true,
    "init_cls": 10,
    "increment": 10,
    "model_name": "foster",
    "convnet_type": "resnet32",
    "device": ["3"],
    "seed": [1993],
    "beta1":0.96,
    "beta2":0.97,
    "oofc":"ft",
    "is_teacher_wa":false,
    "is_student_wa":false,
    "lambda_okd":1,
    "wa_value":1,
    "init_epochs": 200,
    "init_lr" : 0.1,
    "init_weight_decay" : 5e-4,
    "boosting_epochs" : 170,
    "compression_epochs" : 130,
    "lr" : 0.1,
    "batch_size" : 128,
    "weight_decay" : 5e-4,
    "num_workers" : 8,
    "T" : 2
}


zhoudw-zdw commented on September 2, 2024

https://github.com/G-U-N/PyCIL/blob/0011ca4658d779224f0a5d499063ba5356347d9e/models/foster.py#L13C1-L13C103

Check this line in foster. The data augmentation used in the original source repo is needed to reproduce foster.
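
For anyone else trying to reproduce foster: the augmentation in question is the stronger AutoAugment-style pipeline from the original FOSTER repo. A rough sketch of such a CIFAR-100 training transform, using torchvision's built-in AutoAugment as a stand-in (the original repo ships its own policy implementation, so this is an approximation, not the exact recipe):

from torchvision import transforms
from torchvision.transforms import AutoAugment, AutoAugmentPolicy

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    # Stand-in for the source repo's CIFAR augmentation policy.
    AutoAugment(AutoAugmentPolicy.CIFAR10),
    transforms.ToTensor(),
    # Commonly used CIFAR-100 channel statistics.
    transforms.Normalize((0.5071, 0.4867, 0.4408),
                         (0.2675, 0.2565, 0.2761)),
])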


nttung1110 commented on September 2, 2024

Thanks, I'll check that. How about the following results of bic? Neither NME nor CNN reaches the results reported in your paper, and there is something strange with the CNN performance on classes 60-69, where 0.0 is returned.

2023-05-23 01:51:11,906 [bic.py] => Exemplar size: 2000
2023-05-23 01:51:11,906 [trainer.py] => CNN: {'total': 39.06, '00-09': 53.9, '10-19': 36.5, '20-29': 47.1, '30-39': 39.9, '40-49': 51.6, '50-59': 42.1, '60-69': 0.0, '70-79': 11.5, '80-89': 46.4, '90-99': 61.6, 'old': 36.56, 'new': 61.6}
2023-05-23 01:51:11,906 [trainer.py] => NME: {'total': 46.11, '00-09': 49.6, '10-19': 32.3, '20-29': 50.6, '30-39': 38.5, '40-49': 50.4, '50-59': 38.4, '60-69': 46.0, '70-79': 45.5, '80-89': 52.6, '90-99': 57.2, 'old': 44.88, 'new': 57.2}
2023-05-23 01:51:11,906 [trainer.py] => CNN top1 curve: [86.5, 71.3, 68.4, 62.35, 61.42, 58.17, 48.27, 42.31, 40.74, 39.06]
2023-05-23 01:51:11,906 [trainer.py] => CNN top5 curve: [99.0, 94.25, 91.77, 88.98, 87.98, 86.15, 72.44, 70.03, 68.52, 66.57]
2023-05-23 01:51:11,906 [trainer.py] => NME top1 curve: [86.3, 71.6, 68.73, 62.7, 61.72, 57.62, 55.69, 50.46, 48.24, 46.11]
2023-05-23 01:51:11,906 [trainer.py] => NME top5 curve: [98.7, 94.0, 91.43, 88.88, 87.56, 84.62, 82.54, 79.0, 77.08, 75.35]
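
One way to confirm that the '60-69': 0.0 bucket is a prediction collapse rather than a logging artifact is to histogram the predicted labels on the test set; empty bins mean those classes are never predicted at all. A generic sketch, assuming the model returns raw logits (PyCIL models may return a dict instead):

import torch

@torch.no_grad()
def predicted_label_histogram(model, loader, num_classes, device="cuda"):
    counts = torch.zeros(num_classes, dtype=torch.long)
    model.eval()
    for inputs, _ in loader:
        logits = model(inputs.to(device))
        # Count how often each class index wins the argmax.
        counts += torch.bincount(logits.argmax(dim=1).cpu(),
                                 minlength=num_classes)
    return counts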


nttung1110 commented on September 2, 2024

Here is the bic configuration:

{
    "prefix": "reproduce",
    "dataset": "cifar100",
    "memory_size": 2000,
    "memory_per_class": 20,
    "fixed_memory": false,
    "shuffle": true,
    "init_cls": 10,
    "increment": 10,
    "model_name": "bic",
    "convnet_type": "resnet32",
    "device": ["0","1","2","3"],
    "seed": [1993]
}


zhoudw-zdw commented on September 2, 2024

Not quite sure about the error. In another survey (https://github.com/zhoudw-zdw/CIL_Survey) we successfully reproduced the results of bic with the same code. We shall check it later.


nttung1110 commented on September 2, 2024

Thanks, that would be great. Do you think it could be due to some unexpected error when running on a single GPU? The only thing I modified when running your code was switching to single-GPU training.


zhoudw-zdw commented on September 2, 2024

It seems that the default number of epochs for bias tuning may be too high, which can result in poor performance on new classes. You can lower it to achieve better performance.

https://github.com/G-U-N/PyCIL/blob/0b99ae5c4ea34df1c913c15e1d366a9e718e7569/models/bic.py#L177C10-L177C10
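
For context, BiC's second stage trains only a two-parameter affine correction (alpha, beta) on the new-class logits, using a small balanced validation split; tuning it for too many epochs lets those two parameters over-suppress parts of the output. A minimal sketch of the layer with illustrative names, not PyCIL's exact module:

import torch
import torch.nn as nn

class BiasLayer(nn.Module):
    # BiC-style correction: rescale and shift only the logits of the
    # classes introduced in the current task.
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.zeros(1))

    def forward(self, logits, new_class_slice):
        corrected = logits.clone()
        corrected[:, new_class_slice] = (
            self.alpha * logits[:, new_class_slice] + self.beta
        )
        return corrected

Since only alpha and beta are learned in this stage, a much smaller epoch count than the default at the linked line is typically sufficient.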


nttung1110 commented on September 2, 2024

@zhoudw-zdw Then what is the exact configuration you used to reproduce the bic results in your paper? I would assume that the parameters (epochs, lr, ...) specified in the bic training script should reproduce the reported results approximately. It would be great if you could clarify the correct reproduction configuration so that other works can benefit from your project. Thanks.

