hi, how come you calculate the fvd score with the output of the logits layer? Does

FVD calculation about common_metrics_on_video_quality HOT 4 CLOSED

oszilevi commented on May 27, 2024 1

FVD calculation

from common_metrics_on_video_quality.

Comments (4)

JunyaoHu commented on May 27, 2024 2

Hello, @oszilevi, does my answer explain this question clearly? You can use my recently updated repository, which has been stripped of redundant code. 😊

JunyaoHu commented on May 27, 2024

Thank you for your question.

TLDR

For the styleganv implementation of the FVD calculation, the function get_fvd_logits is not needed.

our implementation

Our calculation process in calculate_fvd.py has three steps.

Step 1: preprocess the video to ensure the proper shape and value range.

common_metrics_on_video_quality/calculate_fvd.py

Lines 30 to 31 in 241f953

    videos1 = trans(videos1)
    videos2 = trans(videos2)

Step 2: use the pre-trained I3D model to get the feature-space tensor of each video.

common_metrics_on_video_quality/calculate_fvd.py

Lines 47 to 48 in 241f953

    feats1 = get_fvd_feats(videos_clip1, i3d=i3d, device=device)
    feats2 = get_fvd_feats(videos_clip2, i3d=i3d, device=device)

Step 3: calculate FVD score

common_metrics_on_video_quality/calculate_fvd.py

Line 51 in 241f953

    fvd_results[clip_timestamp] = frechet_distance(feats1, feats2)
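To make Step 3 concrete, here is a minimal NumPy sketch of a Fréchet distance between two feature sets, using the standard formula ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)). The name frechet_distance_sketch is hypothetical. Computing the trace of the matrix square root from eigenvalues is a choice made here for self-containment, so the repo's own frechet_distance may differ in its numerical details.

```python
import numpy as np


def frechet_distance_sketch(feats1, feats2):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats1, feats2: arrays of shape (num_videos, feature_dim).
    This is an illustrative sketch, not the repo's frechet_distance.
    """
    mu1, mu2 = feats1.mean(axis=0), feats2.mean(axis=0)
    sigma1 = np.cov(feats1, rowvar=False)
    sigma2 = np.cov(feats2, rowvar=False)

    # Tr((S1 S2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S1 S2. Numerical error can give tiny negative or complex parts,
    # so keep the real part and clip at zero.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


# Example with random stand-in features (real I3D features are 400-dimensional).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
fake = rng.normal(loc=0.5, size=(500, 4))
print(frechet_distance_sketch(real, fake))
```

The distance is close to zero when both sets have the same mean and covariance, and grows as the two distributions move apart.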

about Step 2

Our function get_fvd_feats simply calls the function get_feats.

common_metrics_on_video_quality/fvd/fvd.py

Lines 53 to 57 in 241f953

    def get_fvd_feats(videos, i3d, device, bs=10):
        # videos in [0, 1] as torch tensor BCTHW
        # videos = [preprocess_single(video) for video in videos]
        embeddings = get_feats(videos, i3d, device, bs)
        return embeddings

The input of the function get_feats is the video, and the output is the I3D embedding. The parameter return_features=True makes the model return features rather than logits.

common_metrics_on_video_quality/fvd/fvd.py

Lines 43 to 50 in 241f953

    def get_feats(videos, detector, device, bs=10):
        # videos : torch.tensor BCTHW [0, 1]
        detector_kwargs = dict(rescale=False, resize=False, return_features=True)  # Return raw features before the softmax layer.
        feats = np.empty((0, 400))
        with torch.no_grad():
            for i in range((len(videos)-1)//bs + 1):
                feats = np.vstack([feats, detector(torch.stack([preprocess_single(video) for video in videos[i*bs:(i+1)*bs]]).to(device), **detector_kwargs).detach().cpu().numpy()])
        return feats

So there is no need to use the function get_fvd_logits. I didn't comment out this function, and I'm sorry if that misled you, but this code is redundant.
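The batching loop in get_feats can be tried without the I3D weights. The sketch below is only an illustration: dummy_detector is a hypothetical stand-in for the real model, NumPy arrays replace torch tensors, and the per-batch outputs are concatenated once at the end rather than stacked inside the loop.

```python
import numpy as np


def dummy_detector(batch):
    """Hypothetical stand-in for I3D: maps (N, C, T, H, W) to (N, 400)."""
    n = batch.shape[0]
    per_video_mean = batch.reshape(n, -1).mean(axis=1, keepdims=True)
    return np.repeat(per_video_mean, 400, axis=1)


def get_feats_sketch(videos, detector, bs=10):
    """Run `detector` over `videos` in batches of at most `bs` videos."""
    num_batches = (len(videos) - 1) // bs + 1  # ceiling division
    chunks = [detector(videos[i * bs:(i + 1) * bs]) for i in range(num_batches)]
    if not chunks:
        # No videos: return an empty feature matrix, like np.empty((0, 400)).
        return np.empty((0, 400))
    # One concatenation at the end avoids re-copying the array per batch.
    return np.concatenate(chunks, axis=0)


videos = np.random.default_rng(0).random((23, 3, 4, 8, 8))  # 23 videos, BCTHW
feats = get_feats_sketch(videos, dummy_detector, bs=10)
print(feats.shape)  # (23, 400)
```

With 23 videos and bs=10 the loop runs three times (10, 10, and 3 videos), and the result always has one 400-dimensional row per video.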

oszilevi commented on May 27, 2024

yes! Thank u very much 👍🏻

JunyaoHu commented on May 27, 2024

Hello, I updated the repo just now, and it can support 2 pytorch FVD calculation methods (styleganv and videogpt).

As you said, the method that calculates 'the fvd score with the output of the logits layer' is the videogpt implementation.

I must say that the videogpt method is not wrong; it is also correct. Perhaps its function name is just not ideal...

Actually, logits are features

In Google's original I3D model, the tail of the model structure is 'Mixed_5c', 'Logits', and 'Predictions'.

Both models remove the 'Predictions' (softmax) module:

As for i3d_pretrained_400.pt, it uses the I3D model file in our repo, and the model ends with 'Logits', without 'Predictions' (softmax). So it is right.
As for i3d_torchscript.pt, it uses the parameter return_features=True. So it is right too.
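As a small illustration of the difference between the 'Logits' and 'Predictions' outputs, the sketch below applies a softmax to two hypothetical logit vectors that differ only by a constant offset. The values are made up for the example and the softmax helper is written here rather than taken from either implementation.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Two hypothetical logit vectors that differ only by a constant offset.
logits_a = np.array([2.0, 1.0, 0.1])
logits_b = logits_a + 5.0

probs_a = softmax(logits_a)
probs_b = softmax(logits_b)

# The 'Predictions' (softmax) output is identical for both inputs ...
print(np.allclose(probs_a, probs_b))    # True
# ... while the 'Logits' output still tells them apart.
print(np.allclose(logits_a, logits_b))  # False
```

The softmax squeezes every vector onto probabilities that sum to 1, so part of the information in the logits is lost. That is consistent with treating the pre-softmax output as the feature vector.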

Finally, in our repo, I copied the file and kept the original function name, and I only import it under an alias to refer to it.

common_metrics_on_video_quality/calculate_fvd.py

Lines 15 to 22 in 7098471

    def calculate_fvd(videos1, videos2, device, method='mcvd'):
        if method == 'mcvd':
            from fvd.styleganv.fvd import get_fvd_feats, frechet_distance, load_i3d_pretrained
        elif method == 'videogpt':
            from fvd.videogpt.fvd import load_i3d_pretrained
            from fvd.videogpt.fvd import get_fvd_logits as get_fvd_feats
            from fvd.videogpt.fvd import frechet_distance

I apologize for my incomplete and incorrect answers above.
