Comments (14)
Thanks for your question!
Iterating on the dataset is indeed a good idea. Although we did not try this with PGDF, our previous work [1] did something similar based on the same intuition; see Fig. 9 in [1]. In [1], the experiment was run at a low noise ratio (20%), and after the first pass over the original dataset the noise ratio dropped to a very low value (<1%). As a result, further iterations brought very little performance gain. But I think it may work in heavy-noise scenarios, and our team's future research may explore this.
Reference:
[1] Zhu, Chuang, et al. "Hard sample aware noise robust learning for histopathology image classification." IEEE Transactions on Medical Imaging 41.4 (2021): 881-894.
from pgdf.
Hi ;)
Thanks for your well-documented reply!
I think switching to a different model architecture at each iteration (ResNet, ViT, MobileNet, CLIP, etc.) could also help. Each model has its own "perception of vision": the more the architectures differ, the more their perceptions differ.
Another suggestion is to replace the SGD optimizer with Adam or AdaBelief (faster, and often better, convergence).
Would you be interested in working on this with me?
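The optimizer swap above can be sketched in a few lines, assuming a generic PyTorch setup (the backbone and hyperparameter values are placeholders, not PGDF's actual settings):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)  # placeholder for the actual backbone

# Drop-in replacement for SGD; Adam adapts per-parameter step sizes,
# which often speeds up convergence in the early (warmup) epochs.
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
```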
Moreover, I think early stopping could improve warmup convergence and your results. As I see it, you have fixed a variable that sets the number of warmup steps. It could be better to carry the best checkpoint on the validation set into the second training stage. That way you would have a "soft" parameter instead of a hard one.
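A sketch of that early-stopping warmup, where `train_one_epoch` and `evaluate` are hypothetical hooks standing in for the actual PGDF training code:

```python
import copy

def warmup_with_early_stopping(model, train_one_epoch, evaluate,
                               max_epochs=30, patience=3):
    """Run warmup until validation accuracy stops improving for
    `patience` epochs; return the best checkpoint and its accuracy."""
    best_acc, best_state, bad_epochs = -1.0, None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        acc = evaluate(model)
        if acc > best_acc:
            # New best checkpoint: snapshot it and reset the counter.
            best_acc, best_state, bad_epochs = acc, copy.deepcopy(model), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_state, best_acc
```

The second training stage would then start from `best_state` rather than from whatever weights the fixed warmup-step count happened to end on.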
Another thing:
I think using the prediction entropy as a confidence measure (relative distance from a threshold) could be a better way to decide whether a sample's label is noisy: a filter based on the model's actual ability to make good predictions. It should also remove some hard hyperparameters 😉 and give better results.
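One way to sketch such an entropy-based filter; the function names and the 0.5 threshold are illustrative, not taken from the PGDF code:

```python
import numpy as np

def entropy_confidence(probs, eps=1e-12):
    """Confidence in [0, 1]: 1 minus the normalized Shannon entropy of
    the softmax output. 1.0 = fully confident, 0.0 = uniform prediction."""
    probs = np.clip(probs, eps, 1.0)
    num_classes = probs.shape[-1]
    entropy = -(probs * np.log(probs)).sum(axis=-1)
    return 1.0 - entropy / np.log(num_classes)

def split_clean_noisy(probs, labels, conf_threshold=0.5):
    """Flag a sample as 'clean' when the model both predicts its given
    label and is confident (low entropy) about that prediction."""
    conf = entropy_confidence(probs)
    agrees = probs.argmax(axis=-1) == labels
    return agrees & (conf > conf_threshold)
```

A near-uniform prediction gets confidence near 0 regardless of which class wins, so it is never counted as clean even if the argmax happens to match the label.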
Hi,
Your ideas are very interesting and impressive! Thanks a lot for your reply and invitation. However, I will graduate and start working at a company next month, so I may not have enough time to work on this in the future. Thanks again for your kindness, and I wish you success in your research 😊.
Another tip:
as I see it, your competitors used larger models, so doing the same might improve results.
I have a question: when you talk about CIFAR-10 sym-90, do you mean that only 10% of the samples are correctly labeled and 90% carry random labels from the 10 classes? If so, I imagine your method could label any image-classification dataset without any labeled data!
So maybe trying to handle the problem with no labeled data at all could be a good thing to test. If you are confident about this, the only remaining problem would be matching the model's outputs to the true classes using the validation set. If that does not work, some self-supervised learning, as in one of the recent papers, might help.
I also think that extending your future work to text classification and tabular data would make some noise in the domain.
The answer is yes. But when the noise ratio is at a high level, the performance becomes unstable, which is a common issue in many LNL algorithms. Work [1] mentioned that pretraining the model weights with contrastive learning can bring significant performance gains. You could also try that.
Reference:
[1] Zheltonozhskii, Evgenii, et al. "Contrast to divide: Self-supervised pre-training for learning with noisy labels." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022.
I would try another thing:
instead of using the same model twice in each iteration, use two different architectures,
and during the warmup step wait until the two models have converged independently.
Another thing (sorry for disturbing):
it may be better to quantify the overall confidence over all the classes rather than just using the best prediction for prob_his1 and prob_his2 in
prob1 = m * prob1_gmm + (1 - m) * prob_his1
prob2 = m * prob2_gmm + (1 - m) * prob_his2
where prob_gmm is a probability of membership.
Along the lines of NegEntropy (or similar), I think it would be better not to use prob_his1 directly but the probability P from this kind of formula: log(p_i) + sum_{j != i} log(1 - p_j) = log(P).
This is like a binary cross-entropy.
Or maybe use the NegEntropy directly and convert it to a scalar P, i.e. from sum(p_i log(p_i)) to P.
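The log(P) formula above can be sketched as follows, assuming `probs` are per-class softmax outputs (the function name is illustrative):

```python
import numpy as np

def label_likelihood(probs, labels, eps=1e-12):
    """P = p_y * prod_{j != y} (1 - p_j), computed in log space:
    log P = log(p_y) + sum_{j != y} log(1 - p_j).
    This scores 'class y is right AND every other class is wrong',
    like a per-class binary cross-entropy, instead of using p_y alone."""
    probs = np.clip(probs, eps, 1.0 - eps)
    n = np.arange(probs.shape[0])
    log_p = np.log(probs[n, labels])
    # Sum log(1 - p_j) over all classes, then remove the labeled class.
    log_not = np.log(1.0 - probs).sum(axis=-1) - np.log(1.0 - probs[n, labels])
    return np.exp(log_p + log_not)
```

A sharply peaked prediction on the given label scores close to p_y itself, while a prediction that spreads mass over other classes is penalized even if the argmax still matches.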
For args.md, maybe try:
- a linear scheduler that converges linearly from 0 to 1;
- using a metric such as validation accuracy to set m (e.g. 96% accuracy → m = 0.96);
- clustering the two-dimensional points (prob_gmm, prob_his) with a GMM and predicting the probability of membership in each cluster (this is close to one of your main ideas).
With that, you would no longer have the args.md hyperparameter to fix.
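The first option can be sketched as below; the start/end values and the usage line are assumptions, not PGDF's actual schedule:

```python
def linear_m(epoch, total_epochs, m_start=0.0, m_end=1.0):
    """Linearly ramp the mixing coefficient m from m_start to m_end
    over training, replacing the fixed args.md hyperparameter."""
    t = min(max(epoch / max(total_epochs - 1, 1), 0.0), 1.0)
    return m_start + t * (m_end - m_start)

# Hypothetical usage inside the training loop:
#   m = linear_m(epoch, total_epochs)
#   prob1 = m * prob1_gmm + (1 - m) * prob_his1
```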
On the lines
pred1 = prob1 > 0.5
pred2 = prob2 > 0.5
I think determining the threshold that maximizes F1 score, precision, or recall (your choice) on the validation (or training) set could improve on the fixed value of 0.5. A better threshold should improve convergence and reduce the number of epochs.
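A sketch of such a threshold sweep against a trusted validation split, maximizing F1 (names and the grid are hypothetical):

```python
import numpy as np

def best_threshold(probs, is_clean, grid=None):
    """Pick the threshold on clean-probabilities that maximizes F1
    against known clean/noisy flags, instead of fixing 0.5."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        pred = probs > t
        tp = np.sum(pred & is_clean)
        fp = np.sum(pred & ~is_clean)
        fn = np.sum(~pred & is_clean)
        f1 = 2 * tp / max(2 * tp + fp + fn, 1)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

The same loop works for precision or recall by swapping the scoring line.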
It might be better not to use this hyperparameter-based schedule:
lr = args.learning_rate
if epoch >= args.lr_switch_epoch:
    lr /= 10
but rather an LR scheduler such as lr_scheduler.LinearLR. It can deliver the same results without a hand-tuned hyperparameter, or even better ones.
See this link: https://pytorch.org/docs/stable/optim.html
Maybe searching the state of the art in optimizers and LR schedulers for classification would also improve your results (faster, better convergence).
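A sketch of that replacement using torch.optim.lr_scheduler.LinearLR; the model, base LR, and epoch count are placeholders, not PGDF's actual values:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)  # placeholder for the actual backbone
optimizer = optim.SGD(model.parameters(), lr=0.02, momentum=0.9)

# Instead of "divide lr by 10 at args.lr_switch_epoch", decay the LR
# smoothly from 1.0x to 0.1x of the base rate over the whole run.
num_epochs = 300
scheduler = optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.1, total_iters=num_epochs)

for epoch in range(num_epochs):
    # ... one epoch of training, with optimizer.step() per batch ...
    optimizer.step()
    scheduler.step()  # advance the schedule once per epoch
```

After `total_iters` epochs the learning rate sits at `base_lr * end_factor`, so the end point matches the original /10 drop without choosing a switch epoch.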
In the end, the fewer hyperparameters you have, the more stable your results will be.
If you want some help in the coming months, I can put together a state-of-the-art survey in the meantime to help you write the code :)
It would be a pleasure to participate!
Thanks again for your helpful advice! 👍 I wish your research goes well!