Comments (7)
I should note that I have successfully processed a video in batches of 30 frames; however, this introduces inconsistencies that grow with the number of batches.
from tokenflow.
Below is an example output of a video cut into 3 batches of 24 frames.
Note the very noticeable inconsistencies, or jumps, between batches.
Running the full 450 frames of this video results in a CUDA out-of-memory error.
tokenflow_PnP_fps_30_1.mp4
Can you show the results after preprocessing?
It's worth mentioning that I also tried processing the video on Google Colab with an A100 40GB and still ran out of memory.
I'm wondering if I am missing something when it comes to processing longer videos.
Hi there!
Just to make sure -- the inconsistencies in your result come from treating the video as 3 different videos when running our method. You shouldn't see such inconsistencies if you ran our method on the full video.
In terms of memory, the main bottleneck is the computation of extended attention on the keyframes, which is a massive matrix multiplication.
I think it can be lightened (at the expense of run time) by replacing the batch matrix multiplication in this computation with more for loops: https://github.com/omerbt/TokenFlow/blob/06f51a0d0c19bef88f0b9b521146b5b849fbfb76/tokenflow_utils.py#L168C13-L168C16
It's currently written such that, above 96 frames, it loops over the frames to compute the cross-frame attention of each keyframe, and it also loops over the different attention heads. This was tuned for our resources; you can add a further loop over the sequence-length dimension of the attention.
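To make the suggestion concrete, the idea of looping over the sequence-length dimension can be sketched as follows. This is a generic, hypothetical NumPy illustration of chunked attention (computing softmax(QKᵀ/√d)V one query chunk at a time so that only a chunk-by-sequence score matrix is ever materialized), not TokenFlow's actual code; the function name and chunk size are made up for the example:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def chunked_attention(q, k, v, chunk_size=64):
    """Compute softmax(q @ k.T / sqrt(d)) @ v in chunks over the query
    sequence length. Peak memory for the score matrix drops from
    O(seq_len^2) to O(chunk_size * seq_len), at the cost of a Python loop."""
    scale = q.shape[-1] ** -0.5
    out = []
    for i in range(0, q.shape[0], chunk_size):
        scores = (q[i:i + chunk_size] @ k.T) * scale  # (chunk, seq_len)
        out.append(softmax(scores) @ v)               # (chunk, head_dim)
    return np.concatenate(out, axis=0)
```

The result is identical to the unchunked computation; only the peak memory changes, which is why this kind of loop trades speed for the ability to fit longer videos.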
For reference, I was able to run the method on 200+ frames using 48GB of GPU memory.
Hope this helps!
Thank you for such a detailed reply.
I am somewhat of a beginner, but what you have mentioned makes sense.
Your clarification is completely correct. This is not an issue with the consistency of your method at all, but rather inconsistencies introduced by my workaround to keep VRAM under control.
Unfortunately, I do not have the skillset or know-how to implement the additional for loops as you have suggested.
Just curious, how are you able to have a card with 48GB of memory?
I tried provisioning an A100 80GB from Google but was denied as I'm not a business, ahaha.
Big fan of your method and would love to put it into practice on higher-resolution, longer videos.
Closing this as it has largely been answered.