
dvqa's Introduction

DVQA - Deep learning-based Video Quality Assessment

News

  • 12/17/2019 Added a pretrained model for PGC videos

Installation

We recommend running the code in a virtualenv. The code is developed with Python 3.

After activating the virtual environment, install the prerequisites with the following command.

pip install -r requirements.txt

All packages are required to run the code.

Dataset

Please prepare a dataset if you want to evaluate in batch or train the code from scratch on your own GPUs. The dataset should be described in a JSON file, e.g. your_dataset.json:

{
    "test": {
        "dis": ["dis_1.yuv", "dis_2.yuv"],
        "ref": ["ref_1.yuv", "ref_2.yuv"],
        "fps": [30, 24],
        "mos": [94.2, 55.8],
        "height": [1080, 720],
        "width": [1920, 1280]
    },
    "train": {
        "dis": ["dis_3.yuv", "dis_4.yuv"],
        "ref": ["ref_3.yuv", "ref_4.yuv"],
        "fps": [50, 24],
        "mos": [85.2, 51.8],
        "height": [320, 720],
        "width": [640, 1280]
    }
}

For the time being, only raw YUV input is supported. We will add modules to read compressed bitstreams.

Eval a dataset

Put all YUV files (both dis and ref) in a folder and prepare your_dataset.json accordingly. Activate the virtualenv and run:

python eval.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --load_model ./save/model_pgc.pt

Train from scratch

Prepare a dataset as above and simply run:

python train.py --multi_gpu --video_dir /dir/to/yuv --score_file_path /path/to/your_dataset.json --save_model ./save/your_new_trained.pt

Please check train.sh and opts.py if you would like to tweak other hyper-parameters.

Known issues

The pretrained model was trained on 720p PGC videos compressed with H.264/AVC. It works well on videos with resolutions of 1920x1080 and below.

We are not sure about the performance when the code is run in the following scenarios:

  1. PGC with other distortion types, especially temporal distortions.
  2. PGC with post-processing filters, like de-noising, super-resolution, artifact reduction, etc.
  3. UGC videos with pre-processing filters.
  4. UGC videos compressed with common codecs.

We will try to answer the above questions. Stay tuned.


dvqa's Issues

Minimum hardware requirement for the evaluation of 720p/1080p video

Thanks for the release of DVQA.
I have run eval.py on my workstation, which has the following specs, but I always hit the error "CUDA out of memory" when evaluating 720p and 1080p videos. Only videos with a resolution of 540p or lower work fine. I am wondering if there is any configuration I can modify to evaluate 720p/1080p videos on my workstation, or whether I simply need to upgrade the hardware.

OS: Windows 10 home
CPU: Intel i5-10400
RAM: 16GB
GPU: RTX2060 with 6GB memory
SSD: 256GB
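For rough context (my own back-of-envelope estimate, not a figure from the authors): a single 60-frame 1080p luma clip stored as float32 already takes close to half a gigabyte before any network activations, and evaluating a dis/ref pair doubles that:

```python
# Approximate raw tensor footprint of one 60-frame 1080p luma clip in float32.
frames, height, width, bytes_per_float = 60, 1080, 1920, 4
clip_bytes = frames * height * width * bytes_per_float
clip_mib = clip_bytes / 2**20   # ~474.6 MiB for one clip
pair_mib = 2 * clip_mib         # dis + ref, before any 3D-conv activations
```

Intermediate 3D-convolution feature maps multiply this further, which is consistent with 6 GB of GPU memory sufficing for 540p but not for 720p/1080p.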

Looking forward to your reply. Thanks a lot.

"dis" and "ref" in dataset json file

Very grateful for the release of DVQA.
I have several questions.

  1. The dis YUV and its corresponding ref YUV (same index in the JSON file) are clipped from one source video, and must have the same size when converted to tensors. Am I right?
  2. What is the meaning of ref and dis? If I have an MP4 video and want to predict its MOS, by what rule can I create the ref and dis files? Please tell me so I can create my own dis and ref YUV files, thank you.

The performance on LIVE-VQA dataset.

Hi, I have trained the C3DVQA model on the LIVE-VQA dataset, and the performance is very strange.

I find that in this line:

tst_mos.append(100.0 - float(mos[0]))

the MOS should not be subtracted from 100.

As shown in the following table:

|           | change  | SROCC | PLCC  |
|-----------|---------|-------|-------|
| in paper  | \       | 92.61 | 91.22 |
| origin    | 100-mos | 30.46 | 29.30 |
| bug fixed | mos     | 31.43 | 42.20 |

And the performance is not as good as reported in the C3DVQA paper.

Can you help me? I'm new to this field. Thanks a lot!

How to use it with a CPU?

I have prepared the environment, but I want to run it on a CPU. Could you tell me, or add, a way to run it on a CPU instead of a GPU?

Why does scoring a video need so much memory!

The video I need to test is a YUV file of about 433 MB, 1366x768, roughly 280 frames. I score it on a CPU machine with 4 cores and 32 GB of RAM. After running for a while, it reports that 52 GB of memory is needed. Why does it need so much memory? Could you explain, or is there a parameter that can be tuned?

a question about our eval results

Hi tommyhq,

We are now using your model to evaluate our own video samples.

The only difference between the samples is the video bitrate: the reference video is 1600 kbit/s, and the three distorted ones are 400, 800, and 1200 kbit/s.

Why do higher bitrates get a lower PRED score?

Is the MOS provided subjectively collected or pseudo?

Very grateful for the release of DVQA.

It seems that the provided dataset is from the JND VideoSet. But to my knowledge, VideoSet only provides JND thresholds. So, is the MOS provided here subjectively collected from an experiment, or a pseudo one derived with some tricks?

the performance when using some filters (USM) is not good

When I test it on a video processed with an unsharp mask filter, the subjective quality is better, but the DVQA score is lower while the VMAF score is higher. This may be caused by the training data (more likely), or by the network (the use of residual frames).

Is there any solution for pre-filtered or post-filtered content?

A question about the training settings of C3DVQA

Hello, while reproducing the paper I found that my results differ considerably from those reported (using the train/test split, learning rate, and other parameters given in the original code). I would like to ask about the details behind the reported performance.
1. First, frame selection: the original code uses frame skipping, but the paper says "Training segments are randomly cropped from videos for data augmentation. We select a random temporal position and sample a clip with 60 frames." If my understanding is correct, this means a random contiguous 60-frame segment per video. So, were the reported results obtained with the open-source code's approach or the paper's approach? Or is my understanding of this part wrong?
2. Also, I have tried both methods, but a large performance gap remains. For the repeated experiments mentioned in the paper, how exactly were they repeated? If the dataset was re-split for each run, could you provide the 10 corresponding split JSON files?
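The paper's augmentation as quoted above (a random contiguous 60-frame clip per video) could be sketched as follows (hypothetical helper, not the repository's code):

```python
import random

def random_clip_start(num_frames, clip_len=60):
    """Pick a random start index for one contiguous clip of clip_len frames.

    Videos shorter than clip_len just start at frame 0, per the paper's
    description of sampling a single random temporal position.
    """
    if num_frames <= clip_len:
        return 0
    return random.randint(0, num_frames - clip_len)
```

This differs from the frame-skipping sampler in the released code, which is exactly the discrepancy the question raises.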
