score range for each dimension? (vbench, open issue)

rebuttalpapers commented on July 28, 2024

score range for each dimension?

from vbench.

Comments (5)

ziqihuangg commented on July 28, 2024

Hi, thanks for your question! The score range for all dimensions is 0 to 1.

For samples at different scores in each dimension, you can refer to Section G of the supplementary materials: https://arxiv.org/pdf/2311.17982, where we provide samples at varying scores for each dimension.

rebuttalpapers commented on July 28, 2024

Thanks, Ziqi, for your helpful answer. May I ask why dynamic_degree is false, while subject_consistency and imaging_quality are much larger than 1?

ziqihuangg commented on July 28, 2024

Hi, I assume you are asking about the scores for individual videos in the generated eval_results.json file.

For dynamic_degree, each video undergoes binary classification, where true means dynamic and false means static. The final score for the dynamic_degree dimension is defined as the percentage of videos classified as dynamic.
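As a minimal sketch of this aggregation (the function name and the sample flags below are illustrative and are not taken from the VBench code), the per-video true/false results can be reduced to a single score in the 0 to 1 range like this:

```python
def dynamic_degree_score(per_video_flags):
    """Return the fraction of videos classified as dynamic (True)."""
    flags = [bool(flag) for flag in per_video_flags]
    if not flags:
        raise ValueError("no per-video results to aggregate")
    # True counts as 1 and False as 0, so the sum is the number of dynamic videos.
    return sum(flags) / len(flags)


# Hypothetical per-video results: three dynamic videos, one static video.
example_flags = [True, False, True, True]
print(dynamic_degree_score(example_flags))  # 0.75
```

So a single video showing false only means that this one video was classified as static; it contributes 0 to the fraction.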

For other dimensions that have values larger than 1, there are two possible reasons: (1) the individual video's raw score is in the range of 0-100; (2) the individual video's raw score has not yet been divided by the frame count.

We retain these raw scores for individual videos in case users need them for debugging. However, you should refer to the final aggregated score for each dimension to assess the model's performance in that particular dimension.
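To read only the final aggregated scores, something like the following sketch may help. It assumes that each dimension in eval_results.json maps to a two-element list of the form [aggregated_score, per_video_results]; that layout, the field names, and all numbers in the sample are assumptions for illustration, so please check them against your own file.

```python
import json


def aggregated_scores(results):
    """Return {dimension: aggregated_score}.

    Assumes each value has the form [aggregated_score, per_video_results].
    """
    return {dimension: entry[0] for dimension, entry in results.items()}


# Hypothetical file content; the numbers are made up for illustration.
sample_json = """
{
  "imaging_quality": [0.62, [{"video_path": "a.mp4", "video_results": 62.0}]],
  "dynamic_degree": [0.5, [{"video_path": "a.mp4", "video_results": true},
                           {"video_path": "b.mp4", "video_results": false}]]
}
"""

results = json.loads(sample_json)
for dimension, score in aggregated_scores(results).items():
    # Only the aggregated value is expected to lie in the 0 to 1 range.
    print(dimension, score)
```

With a real run you would replace sample_json with the contents of your own eval_results.json, for example via json.load on the opened file.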

Jason-xin commented on July 28, 2024

How can I get the original classification probability for dynamic_degree?

ziqihuangg commented on July 28, 2024

Hi, the classification is not probability-based; it is based on a threshold.
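To illustrate the difference, here is a rough sketch of a threshold-based decision. The quantity being compared (called motion_magnitude here), the function name, and the threshold value are all hypothetical and are not the actual VBench implementation; the point is only that the output is a hard true/false with no probability attached.

```python
def classify_dynamic(motion_magnitude, threshold):
    """Return True (dynamic) if the measured motion exceeds the threshold."""
    return motion_magnitude > threshold


# Hypothetical motion measurements and threshold, for illustration only.
hypothetical_threshold = 1.0
for magnitude in (0.2, 1.5):
    label = classify_dynamic(magnitude, hypothetical_threshold)
    print(magnitude, "dynamic" if label else "static")
```

If a continuous value is needed, the closest analogue in a scheme like this would be the measured quantity before thresholding, rather than a probability.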
