Since I got it to work on my GForce 1050GTX / 2GB , at least for videos not longer tha

Limitation in processing number of video frames according to GPU memory? about simswap HOT 13 CLOSED

neuralchen commented on May 22, 2024

Limitation in processing number of video frames according to GPU memory?

from simswap.

Comments (13)

instant-high commented on May 22, 2024

Just found a simpler solution for the "cuda out of memory" problem while running SimSwap on a 2GB GPU:

I only inserted a with torch.no_grad(): statement in ../util/videoswap.py between lines 48 and 49
(and indented every following line, from 49 to 84, by 4 more spaces)

and it works perfectly.
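A minimal sketch of the same idea, for anyone who wants to see it in isolation. Without `torch.no_grad()`, every forward pass keeps intermediate activations around for a backward pass that inference never runs. The tiny `nn.Linear` model, the random `frames`, and the `swap_frames` helper are stand-ins invented for this illustration; they are not SimSwap code.

```python
import torch
from torch import nn

# Hypothetical stand-in for the swap model; the real SimSwap generator is far larger.
model = nn.Linear(8, 8).eval()

# Stand-ins for the cropped face tensors produced for each frame.
frames = [torch.randn(1, 8) for _ in range(4)]


def swap_frames(model, frames):
    """Run the model on each frame without recording autograd history."""
    results = []
    with torch.no_grad():  # everything indented under this line skips graph building
        for frame in frames:
            results.append(model(frame))
    return results


results = swap_frames(model, frames)
# None of the outputs carries gradient bookkeeping.
print(all(not r.requires_grad for r in results))
```

In videoswap.py the equivalent change is to put the frame loop inside the `with` block, which is why the following lines need the extra indent.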

ExponentialML commented on May 22, 2024

There are multiple things you could do.

1. Lower the size of your input videos.
2. Split the chunks into separate files, then loop over them or do it one by one (painful).
3. Modify videoswap.py using the below as a starting point.

SimSwap/util/videoswap.py, line 38 in fc4b701:

frame_count = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
For the second option, use a subprocess and ffmpeg to split the video into chunks, run the Python script on each chunk in a for loop, then merge the results afterwards in a video editor or with ffmpeg. For example:

~pseudo code~

for video_file in video_file_directory:
    python test_video_swapsingle.py video_file ...
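The split-and-loop idea could be sketched roughly as below. This only builds the command lines; nothing runs until `run_all` is called. The flag names `--video_path` and `--output_path` are assumed from the `opt.video_path` and `opt.output_path` attributes used elsewhere in this thread, and the helper names, chunk naming pattern, and 10-second chunk length are made up for the example. Any other arguments the script needs would go in `extra_args`.

```python
import subprocess
from pathlib import Path


def build_split_command(video_path, chunk_dir, chunk_seconds=10):
    """Build an ffmpeg command that cuts the input into fixed-length chunks."""
    pattern = str(Path(chunk_dir) / "chunk_%03d.mp4")
    return [
        "ffmpeg", "-i", str(video_path),
        "-c", "copy",                 # copy streams without re-encoding (cuts land on keyframes)
        "-f", "segment",              # ffmpeg's segment muxer
        "-segment_time", str(chunk_seconds),
        "-reset_timestamps", "1",
        pattern,
    ]


def build_swap_commands(chunk_paths, output_dir, extra_args=()):
    """Build one test_video_swapsingle.py invocation per chunk (flag names assumed)."""
    commands = []
    for chunk in sorted(chunk_paths):
        out_path = Path(output_dir) / f"swapped_{Path(chunk).name}"
        commands.append([
            "python", "test_video_swapsingle.py",
            "--video_path", str(chunk),
            "--output_path", str(out_path),
            *extra_args,
        ])
    return commands


def run_all(commands):
    """Run the commands one by one; each chunk gets its own process."""
    for command in commands:
        subprocess.run(command, check=True)


split_cmd = build_split_command("input.mp4", "chunks", chunk_seconds=10)
swap_cmds = build_swap_commands(["chunks/chunk_001.mp4", "chunks/chunk_000.mp4"], "out")
```

Because every chunk runs in a separate process, GPU memory held by one run is released when that process exits, before the next chunk starts.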

I would go with number 3, with pseudo code being something like the following.

# This isn't tested; it's just to give you an idea.

current_frame = 0
max_frame = 14

for frame_index in tqdm(range(frame_count)):
    ret, frame = video.read()
    if ret:
        current_frame += 1
        if current_frame == max_frame:
            # Do something to empty video memory here
            current_frame = 0

        detect_results = detect_model.get(frame, crop_size)

        if detect_results is not None:
            .....

Like I said, I haven't tested it, and it could be a bit of work to implement from scratch, as I haven't looked into how the models are loaded into memory yet. The above should be enough to work out your own solution without messing with torch, though.

ExponentialML commented on May 22, 2024

torch.cuda.empty_cache() only releases cached memory that is no longer in use; memory still held by live tensors stays allocated. Remember what I said about not being aware of how the models are loaded into memory in this project? This is what I was referring to. They may be instantiated in different parts of the script, so it may be a bit more work, but you can try what I suggest below.

Also, you're running that torch call every frame, which isn't necessary and can lead to some issues. You also don't need user input to go to the next iteration. It probably runs out of memory because it's still executing in the background while waiting for your input. Try this instead (untested, as I'm away from my machine).

# Add these two lines above the for loop.
current_frame = 0
max_frame = 14

for frame_index in tqdm(range(frame_count)):
    ret, frame = video.read()
    if ret:
        # If the read succeeded, increment the current_frame counter by 1.
        current_frame += 1
        # If the counter equals the max frame count, do something.
        if current_frame == max_frame:
            # Empty the cache.
            torch.cuda.empty_cache()
            # Reset the counter back to 0.
            current_frame = 0

instant-high commented on May 22, 2024

OK.
I added this at line 50 of util/videoswap.py:
torch.cuda.empty_cache()
This lets me process 99 frames before running out of memory.
I'll also try to free memory in the second for loop.

ExponentialML commented on May 22, 2024

> Ok.
> Added Line #50 in util/videoswap.py
> torch.cuda.empty_cache()
> This lets me process 99 frames before out of memory....
> I'll try to free memory also in the second for loop

Great to hear. I would try to create a little wrapper function where you can tune your own parameters (max frame count), and plug it into line 50 so that it executes torch.cuda.empty_cache() every nth frame.
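One way such a wrapper might look, as a sketch only. `PeriodicCleanup` and its `tick` method are hypothetical names, and the demo uses a list-appending callback so it runs without a GPU; in videoswap.py the callback would be `torch.cuda.empty_cache`.

```python
class PeriodicCleanup:
    """Call `cleanup` once every `every_n` processed frames."""

    def __init__(self, cleanup, every_n=14):
        if every_n < 1:
            raise ValueError("every_n must be at least 1")
        self.cleanup = cleanup
        self.every_n = every_n
        self.count = 0

    def tick(self):
        """Register one processed frame; return True if cleanup ran."""
        self.count += 1
        if self.count == self.every_n:
            self.cleanup()
            self.count = 0
            return True
        return False


# Demo with a stand-in callback that just records each cleanup call.
calls = []
cleaner = PeriodicCleanup(cleanup=lambda: calls.append("emptied"), every_n=14)
for frame_index in range(50):
    cleaner.tick()
```

Inside the frame loop, a single `cleaner.tick()` call after each successful `video.read()` replaces the manual counter, and `every_n` becomes the one number to tune.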

instant-high commented on May 22, 2024

Yes.
But why does it run out of memory after 99 frames even if I call empty_cache after each frame?
I cannot find anything filling the cache. I searched all the other scripts in SimSwap.
By the way:
I don't know much about Python... just a beginner after 30 years of coding in (Visual) Basic and a little C++.

instant-high commented on May 22, 2024

So I need a little bit of help.
I've inserted the following code:

for frame_index in tqdm(range(frame_count)):
    torch.cuda.empty_cache()
    ret, frame = video.read()
    if frame_index == 98:
        print(frame_index)
        input("Press Enter to continue...")
        break

Then it begins to write video_file(1), containing the first 98 frames.

Is there a way to jump back to video_swap but continue with frame 99 for the next 98 frames?
Call video_swap again, or something like goto video_swap?
After the break it would just have to write video_file(2),
and so on...
I don't know if this would work, or how to do it.

EDIT:
Got it to work as written above, but when calling video_swap again (break after 10 frames) it runs out of memory immediately.

instant-high commented on May 22, 2024

The user input is just for testing purposes after 98 frames.
Resetting the frame index to 0 would process the same part of the input video and overwrite the temporary image sequence...
I think I found a way to process the whole input via a batch file and some additional parameters in video_swapsingle.py (start frame, end frame), without the need to split it into shorter parts.
But I have a daytime job now.....
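A small sketch of how the start/end frame pairs for such a batch could be computed. The `frame_ranges` helper is hypothetical, and the figures (250 frames, 98 per run) are only examples; each pair would be handed to one run of the script as its start and end frame.

```python
def frame_ranges(frame_count, chunk_size):
    """Split [0, frame_count) into consecutive (start, end) pairs, end exclusive."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, frame_count))
        for start in range(0, frame_count, chunk_size)
    ]


# Example: a 250-frame video processed 98 frames at a time.
ranges = frame_ranges(250, 98)
for start, end in ranges:
    print(f"process frames {start}..{end - 1}")
```

Using an exclusive end keeps the ranges contiguous, so no frame is processed twice and none is skipped.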

ExponentialML commented on May 22, 2024

> Resetting frame index to 0 would process the same part of the input video and overwrite the temporary image sequence...

Read my proposed code again please. It's not about setting the frame index to 0; it's about creating a separate counter variable that is incremented on each pass through the loop, and once it hits a certain limit (max_frame), the counter resets.

You said that your GPU runs out of memory every 15th frame or so. In theory, clearing your GPU cache via torch's methods (which may not work), or doing something else to free GPU resources every 15 frames, would save you from all of the extra steps you've mentioned.

instant-high commented on May 22, 2024

Looks like I accidentally got it to work on 2GB of GPU VRAM....

Not the way I initially planned, but it works.
No problem processing a test video of 23 sec. / 1396 frames.

As soon as I have cleaned up the code I will post the changes I've made.
(test_video_swapsingle.py / videoswap.py / test_options.py)

EDIT:
I will write a simple GUI (VB6 :-)

instant-high commented on May 22, 2024

Here are the changes I made to run SimSwap on 2GB VRAM:

./options/test_options.py

self.parser.add_argument("--first_frame", dest="first_frame", type=int, default=0, help="Set frame to start from.")

./util/videoswap.py
.
.
.
from util.add_watermark import watermark_image
#frame_index = 0
first_frame = 0
.
.
.
def video_swap(first_frame, video_path, id_vetor, swap_model, detect_model, save_path, temp_results_dir='./temp_results', crop_size=224, no_simswaplogo=False):
.
.
.
    for frame_index in tqdm(range(first_frame, frame_count)):
        torch.cuda.empty_cache()
        ret, frame = video.read()
        if frame_index == 1:
            break
.
.
.
    video.release()
    if frame_index > 1:
        image_filename_list = []
        path = os.path.join(temp_results_dir, '*.jpg')
        image_filenames = sorted(glob.glob(path))
        clips = ImageSequenceClip(image_filenames, fps=fps)

test_video_swapsingle.py
.
.
.
first_frame = 0
video_swap(first_frame, opt.video_path, latend_id, model, app,
           opt.output_path, temp_results_dir=opt.temp_path, no_simswaplogo=opt.no_simswaplogo)

first_frame = 2
video_swap(first_frame, opt.video_path, latend_id, model, app,
           opt.output_path, temp_results_dir=opt.temp_path, no_simswaplogo=opt.no_simswaplogo)

test_video_swapsingle.py calls video_swap in ./util/videoswap.py, which processes the first 2 frames before the break.
It then calls video_swap again, starting at frame 2, and runs until the end of the input file. Tested so far on 2600 frames, but there seems to be no limit.
torch.cuda.empty_cache() frees cached VRAM before processing every single frame...

Not perfect, but it works for me.
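In the snippets above `first_frame` is hard-coded in test_video_swapsingle.py, so the new `--first_frame` option is defined but not yet read. A rough sketch of how the option parses is shown here with a standalone `argparse` parser; `build_parser` is a hypothetical helper, not SimSwap's own options class, and only the `add_argument` call is taken from the post.

```python
import argparse


def build_parser():
    """Standalone parser carrying only the option added in test_options.py."""
    parser = argparse.ArgumentParser(description="Frame-range option (illustrative only)")
    parser.add_argument("--first_frame", dest="first_frame", type=int, default=0,
                        help="Set frame to start from.")
    return parser


# Parse an explicit value and the default, without touching sys.argv.
opt = build_parser().parse_args(["--first_frame", "2"])
default_opt = build_parser().parse_args([])
print(opt.first_frame, default_opt.first_frame)
```

With the option parsed this way, the script could pass `opt.first_frame` to `video_swap` instead of a literal value.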

NNNNAI commented on May 22, 2024

> Just found a simpler solution for the "cuda out of memory" problem while running SimSwap on a 2GB GPU:
>
> I only inserted a with torch.no_grad(): statement in ../util/videoswap.py between lines 48 and 49
> (and indented every following line, from 49 to 84, by 4 more spaces)
>
> and it works perfectly.

OMG, I forgot to add this. I will fix it in the next update.

instant-high commented on May 22, 2024

:-)
Came across it making some more mods to first order motion model and co-part segmentation
