
BIKE's Issues

CUDA out of memory

I am using four 3090 Ti cards and have set the batch size very small, but this out-of-memory error occurs every time the first epoch starts.

Traceback (most recent call last):
  File "train.py", line 522, in <module>
    main(args)
  File "train.py", line 316, in main
    prec1, output_list, labels_list = validate(epoch, val_loader, classes, device, model, video_head, config, n_class, logger, save_score)
  File "train.py", line 433, in validate
    cls_feature, text_features = model.module.encode_text(text_inputs, return_token=True)  # [n_cls, feat_dim]
  File "/home/chenshengyi/depthstudy/gesturecode/BIKE-main/clip/model.py", line 443, in encode_text
    x = self.transformer(x)
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chenshengyi/depthstudy/gesturecode/BIKE-main/clip/model.py", line 253, in forward
    x = checkpoint(r, x)
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 235, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 96, in forward
    outputs = run_function(*args)
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chenshengyi/depthstudy/gesturecode/BIKE-main/clip/model.py", line 227, in forward
    x = x + self.drop_path(self.attention(self.ln_1(x)))
  File "/home/chenshengyi/depthstudy/gesturecode/BIKE-main/clip/model.py", line 219, in attention
    return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 1153, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/opt/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 5131, in multi_head_attention_forward
    v = v.contiguous().view(v.shape[0], bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: CUDA out of memory. Tried to allocate 1.96 GiB (GPU 0; 23.70 GiB total capacity; 15.94 GiB already allocated; 657.56 MiB free; 16.10 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
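Since the crash happens inside validate(), one thing worth checking is whether the validation forward passes run with autograd disabled. A minimal sketch (not the repository's code; validate_step is a hypothetical name) of what that would look like:

```python
import torch

# A minimal sketch: validation only needs forward passes, so running it under
# torch.inference_mode() stops autograd from retaining activations, which is
# often enough to avoid validation-time OOM like the one in this traceback.
@torch.inference_mode()
def validate_step(model, images):
    return model(images)

# The error message's own suggestion can also be tried before launching:
#   export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
```

If the existing validate() already runs under torch.no_grad(), another common culprit is accumulating per-batch outputs on the GPU across the whole epoch; moving them to CPU each batch with .cpu() keeps GPU memory flat.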

Frozen label encoder

What is the difference and connection between the frozen label encoder and the category encoder? I see in Table 6(a) that adding a frozen label encoder improves the result. Also, what is (technical) Transf?

key models

Congratulations on your work. May I ask if you could upload some key models to the cloud for download (e.g. k400-vit-l-14-f16.pt)?

Questions about the pre-generated attributes on UCF101/HMDB51: I couldn't replicate your results on UCF101

Congratulations on your excellent work! I saw that you provided the attribute JSON file for K400. May I ask if you could also provide the train/val JSON files for UCF and HMDB51? The main reason is that when conducting few-shot training on the UCF101 dataset (shot = 1/2/5), I cannot reproduce your results (95.2/96.1/96.5); my results are (90.22/94.15/95.32). I have trained multiple times and taken the highest value. All of the above are results of the video branch alone, without the attribute branch mentioned in your paper. Also, I wish you a happy National Day! Looking forward to your reply. Thank you!

Questions about the pre-generated attributes

Congratulations on your excellent work! It seems that the currently published code doesn't include the 'pre-generated attributes', which should be in a JSON file, and I also find that the test code doesn't include the attributes branch. Will the complete code be published soon?

num_sample: 1 → 4

When changing num_sample from 1 to 4, how should the code in train.py be modified? In particular, these lines:

images = images.view((-1, config.data.num_segments, 3) + images.size()[-2:])  # bt 3 h w
b, t, c, h, w = images.size()
images = images.view(-1, c, h, w)

and

image_embedding, cls_embedding, text_embedding, logit_scale = model(images, texts, return_token=True)

After this change, the inputs passed to the model become a list instead of a single tensor.
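One possible adaptation, sketched under the assumption that with num_sample = 4 the loader yields a list of augmented clips, each shaped [B, T*3, H, W] (all shape names here are hypothetical, not the repository's): fold the extra samples into the batch dimension before the original view() logic.

```python
import torch

# Hypothetical shapes: batch B, num_segments T, num_sample S.
B, T, S, H, W = 2, 8, 4, 32, 32
images = [torch.randn(B, T * 3, H, W) for _ in range(S)]  # what the loader might yield

images = torch.cat(images, dim=0)                        # [B*S, T*3, H, W]
images = images.view((-1, T, 3) + images.size()[-2:])    # [B*S, T, 3, H, W]
b, t, c, h, w = images.size()
images = images.view(-1, c, h, w)                        # [B*S*T, 3, H, W]
```

The texts/labels would then need to be repeated S times along the batch dimension so that the loss still sees matching batch sizes after model(images, texts, return_token=True).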

About warning: "None of the inputs have requires_grad=True. Gradients will be None"

I'm reaching out regarding an issue I've encountered while working with the model. I'm receiving a warning that says, 'None of the inputs have requires_grad=True. Gradients will be None.' I've been trying to troubleshoot this and was wondering if you might have insights into resolving this particular warning.

From my understanding, there might be an issue with how the requires_grad attribute of the inputs is set, leading to the absence of gradients during training. But the inputs (images) should not need requires_grad=True.

Could you kindly offer guidance or share any specific steps or considerations to address this warning? Any advice or direction you could provide would be greatly appreciated.

Thank you very much for your time and assistance.
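For what it's worth, this exact warning is raised by check_backward_validity() inside torch.utils.checkpoint: it fires whenever checkpoint() is called and none of its tensor inputs require grad, e.g. when the checkpointed module is entirely frozen, as a frozen label encoder would be. A minimal reproduction:

```python
import warnings
import torch
from torch.utils.checkpoint import checkpoint

layer = torch.nn.Linear(4, 4)
for p in layer.parameters():
    p.requires_grad_(False)  # fully frozen module, as with a frozen text encoder

x = torch.randn(2, 4)  # ordinary input; requires_grad stays False, as it should

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    y = checkpoint(layer, x, use_reentrant=True)  # triggers the warning
```

If nothing inside the checkpointed transformer is trainable, the warning is harmless: it only signals that the recomputation done by checkpoint() buys nothing there, so it can be silenced or the checkpoint call skipped for frozen modules.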

About testing Charades: should the final result use the last recorded mAP?

Hi!

I noticed that AverageMeter() is used when calculating mAP and the output is the global average; this is not a problem when calculating accuracy.

However, mAP should be computed over all predictions at once. Therefore, the value of the last maper.value().numpy() should be used directly as the final result.

Am I understanding this correctly?
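That reading matches how AP behaves numerically: averaging per-batch APs is generally not equal to the AP computed over the pooled predictions, so only the meter's final value over the full set is correct. A toy sketch (hypothetical scores, single class) illustrating the gap:

```python
def average_precision(scores, labels):
    # AP for one class: mean of precision at each positive, ranked by score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

batch1 = ([0.9, 0.2], [1, 0])
batch2 = ([0.8, 0.7], [0, 1])

# Running average of per-batch APs (what an AverageMeter reports):
per_batch = (average_precision(*batch1) + average_precision(*batch2)) / 2  # 0.75

# AP over all predictions pooled together (what mAP should be):
global_ap = average_precision(batch1[0] + batch2[0], batch1[1] + batch2[1])  # 5/6
```

Here the per-batch average is 0.75 while the pooled AP is about 0.833, because pooling lets a confident positive from one batch outrank a false positive from another.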

question about Kinetics400 88.7

ViT-L/14, 16×336, 4×3 views is reported at top-1 = 88.7 on Kinetics-400. However, the accuracy in the corresponding log link is only 87.9, not 88.7.

about inference

Hi! I want to know whether the attributes branch is used during inference.
