Comments (20)
- In our paper, the 52.3% linear-probe result for InfoNCE-RGB was trained for 800 epochs; if InfoNCE-RGB is trained for only 500 epochs, I get a 46.8% linear-probe result. I have updated the NeurIPS final version and will (soon) update the arXiv version to correct this. You can also check this helpful issue: #3 (comment)
- The RandomResizeCrop question was also discussed in the issue above. The "consistent" augmentation is still clip-wise: in the pre-training stage, I concatenate two tensors, apply "RandAug1" to the first half, and "RandAug2" to the second half (see the sketch below). In the fine-tuning stage, this consistent augmentation has no effect.
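To make the concatenation trick concrete, here is a minimal sketch of my reading of it (not the repo's exact code; `rand_aug1`/`rand_aug2` stand for two independently sampled clip-wise augmentations):

```python
import torch

def augment_two_clips(clip1, clip2, rand_aug1, rand_aug2):
    # clip1, clip2: [B, C, T, H, W] video tensors from the same videos.
    # Concatenate along the batch dimension, then apply a different
    # randomly sampled augmentation to each half: frames within a clip are
    # augmented consistently (clip-wise), but the two halves differ.
    x = torch.cat([clip1, clip2], dim=0)
    half = x.shape[0] // 2
    return rand_aug1(x[:half]), rand_aug2(x[half:])
```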
I do not use dropout in the pre-training stage. Self-supervised pre-training is expected to overfit the huge dataset (though it is usually limited by model capacity), so there is no need to constrain network capacity by dropping out nodes.
I use dropout in the downstream classification tasks to avoid quickly overfitting the UCF101 and HMDB51 training sets, which are much smaller.
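For illustration, a hedged sketch of that setup: dropout lives only in the downstream classification head, never in the pre-training objective (the 1024-dim features, 101 classes, and the 0.9 rate are assumptions matching S3D, UCF101, and the dp0.9 flag seen later in this thread):

```python
import torch.nn as nn

class DownstreamClassifier(nn.Module):
    """Classification head used only for the downstream task."""
    def __init__(self, feature_dim=1024, num_classes=101, dropout=0.9):
        super().__init__()
        self.dropout = nn.Dropout(dropout)  # high dropout to slow overfitting
        self.fc = nn.Linear(feature_dim, num_classes)

    def forward(self, features):
        # features: pooled backbone output, [B, feature_dim]
        return self.fc(self.dropout(features))
```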
Hi! Sorry for the late reply.
1. Yes, the output you provided shows the model is initialized. You can check here: https://github.com/TengdaHan/CoCLR/blob/main/utils/utils.py#L88 The "weights not used from pretrained file" list is actually empty (None); the "weights not loaded into new model" entries are all related to the momentum queue, and I choose to re-accumulate these variables for a better-quality queue (see the sketch after this list).
2. Alternation stage: my accuracy is always between 0 and 1 (before converting to a percentage). Do you mean your accuracy is less than 0.01? That would be strange; accuracy in the alternation stage should be similar to or better than the InfoNCE stage.
3. I did not get the "AttributeError: 'str' object has no attribute 'decode'" with the same code.
4. By the way, I have just slightly updated the code, since I am also running more experiments with the same version. You can have a look.
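A hedged sketch of what such partial checkpoint loading typically looks like (the `'queue'` key filter and the checkpoint layout are assumptions for illustration, not the repo's exact code):

```python
import torch

def load_pretrained(model, ckpt_path):
    state = torch.load(ckpt_path, map_location='cpu')['state_dict']  # assumed layout
    model_state = model.state_dict()
    # Keep weights that exist in the new model; skip momentum-queue buffers
    # so they are re-accumulated from scratch during training.
    kept = {k: v for k, v in state.items()
            if k in model_state and 'queue' not in k}
    skipped = [k for k in model_state if k not in kept]
    print('weights not loaded into new model (re-accumulated):', skipped)
    model_state.update(kept)
    model.load_state_dict(model_state)
```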
Thank you for your answer!
I understand most of your answers, including the accuracy question: I get the same accuracy as you, between 0 and 1.
But I have one more question:
- I ran it again with the code you posted a few hours ago, but I got "AttributeError: 'str' object has no attribute 'decode'". Do you use Python 2?
It seems OK to delete the 'decode' call directly. But now the model seems hard to converge: 20 epochs for 0.09 top-1 accuracy. Have you run into this situation? Thanks.
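For reference, this error usually means `.decode()` is being called on a value that is already `str` under Python 3. A defensive fix (a generic sketch, not the repo's code) is to decode only actual bytes instead of deleting the call:

```python
def to_str(value):
    # LMDB reads and older pickles often return bytes; newer paths return str.
    return value.decode('ascii') if isinstance(value, bytes) else value
```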
Thank you for your reply.
In your case, do you mean that you get 0.09 top-1 accuracy when you run it for 20 epochs?
Yeah, it has now run for 200 epochs and reached 0.69 top-1 accuracy during training. Does that make sense? I am a little confused about why it converges so slowly with Adam and lr 0.001.
I use Python 3.
This is the train/val curve from one of my experiments fine-tuning InfoNCE-UCF101-RGB pre-trained models, with the lr reduced by 0.1x at epoch 300. At 20 epochs I get 40+% accuracy, but it is true that at 200 epochs I get ~70% accuracy.
I think the reason for the slow convergence is that I use 0.9 dropout (to prevent fast overfitting).
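A minimal sketch of the schedule described here (Adam at lr 1e-3 with a x0.1 step at epoch 300; the weight-decay value is an assumption taken from the wd0.001 flag elsewhere in this thread, and the linear layer is a stand-in for the real network):

```python
import torch
import torch.nn as nn

model = nn.Linear(1024, 101)  # stand-in for the S3D backbone + classifier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[300], gamma=0.1)

for epoch in range(500):
    # ... one fine-tuning epoch here ...
    scheduler.step()  # lr is 1e-3 until epoch 300, then 1e-4
```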
Got it. Thank you!
Hi Tengda, I finished the pre-training and fixed several minor bugs in eval/main_classifier.py. I now get 0.452 linear-evaluation performance with 100 epochs, which is lower than the 0.523 reported in the paper. Do I misunderstand some methods or configs? Besides, I found that you use non-consistent RandomResizeCrop in pre-training but consistent RandomResizeCrop in fine-tuning; could you please tell me what your hypothesis is for this setting? Thanks, looking forward to your reply. The command I used for evaluation is:
CUDA_VISIBLE_DEVICES=0 python main_classifier.py --net s3d \
--dataset ucf101 --ds 1 --batch_size 32 -j 0 --center_crop \
--test log-eval-linclr/ucf101-128_sp1_lincls_s3d_Adam_bs32_lr0.001_dp0.9_wd0.001_seq1_len32_ds1_train-last_pt\=..-log-pretrain-infonce_k2048_ucf101-2clip-128_s3d_bs32_lr0.001_seq2_len32_ds1-model-model_best_epoch292.pth.tar/model/model_best_epoch95.pth.tar
Got it. Thanks for your prompt and clear answers.
Did you mean that you train for 300 epochs in the pre-training stage and 800 epochs in the fine-tuning stage?
Hi Tengda, I found another thing that confuses me: the training set of UCF101 split 1 has 9537 videos, but when I set bs=32, the total number of batches per epoch is 149, which is half of 9537//32=298. I haven't figured out why this happens.
- The roadmap of our paper's Table 1 experiments is (pre-training epochs in parentheses):
InfoNCE-rgb(300) --> CoCLR-Cyclex2(100x2) --> our CoCLR-rgb: 500 epochs in total, 70.2% linear probe.
InfoNCE-rgb(300) --> continue training InfoNCE(200) --> the InfoNCE-rgb baseline for a fair comparison: 500 epochs in total, 46.8% linear probe.
The 800 epochs I mentioned above are also pre-training epochs. InfoNCE-rgb(300+200) is the fair comparison and should be in Table 1, but I unnecessarily put an InfoNCE-rgb(800) result there, which I have corrected in the NeurIPS final version.
I hope this is clear now. By the way, thanks for the feedback.
- My batch_size is the batch size per GPU:
Line 494 in c95eba9
Are you using 2 GPUs? If yes, then 9537 // (32 * 2) = 149 is correct.
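A quick check of that arithmetic (a sketch, assuming a typical PyTorch DDP setup where each of the N GPUs sees batch_size samples per step and the partial last batch is dropped):

```python
num_videos = 9537   # UCF101 split-1 training videos
batch_per_gpu = 32
num_gpus = 2
print(num_videos // (batch_per_gpu * num_gpus))  # -> 149 batches per epoch
```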
Got it, thanks. I found that there is no dropout in main_nce.py and main_coclr.py, but there is dropout in eval/main_classifier.py (when fine-tuning the entire network). Does this mean that you only use dropout when fine-tuning the entire network?
Got it. Thanks.
Hi Tengda, I found that A.RandomSizedCrop is used in validation and test (Lines 738 to 744 in 110c83d), while people usually use an isotropic resize + center crop for inference. Could you please tell me why you chose this setting? I tested isotropic resize + center crop, and there is not much performance difference. Thanks.
Val is just for monitoring performance, so it doesn't really matter.
For testing, I actually use an "isotropic" 10-crop ((4 corners + center) * 2 flips):
Line 457 in 110c83d
The line you pointed out is rewritten during the final inference.
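To illustrate the 10-crop idea, here is a generic torchvision sketch for a single image, not the repo's video pipeline (the 146/128 sizes are assumptions matching the 128-pixel inputs mentioned in this thread, and `model` is any classifier taking image batches):

```python
import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from PIL import Image

resize_and_crop = T.Compose([
    T.Resize(146),    # isotropic resize of the shorter side
    T.TenCrop(128),   # 4 corners + center, plus their horizontal flips: 10 crops
])

def predict_ten_crop(model, image: Image.Image) -> torch.Tensor:
    crops = resize_and_crop(image)                          # tuple of 10 PIL crops
    batch = torch.stack([TF.to_tensor(c) for c in crops])   # [10, C, H, W]
    with torch.no_grad():
        logits = model(batch)                               # [10, num_classes]
    return logits.mean(dim=0)                               # average the 10 predictions
```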
Got it. Thanks.
Related Issues (20)
- 2-Stream evaluation on UCF101
- Possible to share S3D model trained on UCF101-RGB only, end-to-end fine-tuned (reported accuracy 81.4)
- Making Kinetics-400 RGB lmdb dataset.
- Command for Linear Probe and Finetuning
- How to generate corresponding 'video_source.json'?
- AttributeError: 'str' object has no attribute 'decode'
- information about requirements and dataset preparations
- lmdb_dataset.py txn.get(self.get_video_id[name].encode('ascii')))
- Issue about DistributedDataParallel (DDP)
- Reproducing CoCLR RGB Results???
- fusing two-stream predictions
- Question about the code in eval/main_classifier.py
- question about ten crop
- Can we design a downstream task for action detection?
- Two-stream feature
- TypeError: order must be str, not int
- main_classifier.py: error: unrecognized arguments: --final_bn
- I can't download the pretrained model
- A question about convert_video_to_lmdb.py
- Download Link of Checkpoint