zzh-tech / bit

[CVPR2023] Blur Interpolation Transformer for Real-World Motion from Blur

Home Page: https://zzh-tech.github.io/BiT/

License: MIT License

Python 98.83% Shell 1.17%

beam-splitter computer-vision cvpr cvpr2023 dataset deblurring deep-learning image-enhancement image-restoration image-to-video low-level-vision pytorch pytorch-implementation real-world-data video-deblurring video-enhancement video-frame-interpolation video-restoration

bit's Introduction

Hi there 👋

🌱 I’m currently a researcher at Shanghai AI Lab (AI for Sports team).
🔭 My current research interests:
- image/video restoration and enhancement
- 4D motion reconstruction and editing
✉️ Email: zhongzhihang [at] pjlab.org.cn
🍉 Website: https://zzh-tech.github.io/

We are looking for highly self-motivated students and interns!
👉 If you are interested, please email me with your resume ;-)

bit's People

Forkers

jackzhousz rqfzpy juwairen33 ip-restoration yiyi1333 hanzc989 nooobkevin yeh-h

bit's Issues

I want to test with a single image, can you help me?

A minor issue with the config .yaml files

Hello, I am trying to run the example code you provided and noticed a potential issue with the .yaml config files. It seems that all of them contain absolute paths instead of relative paths:

I believe this might not be intended and wanted to bring it to your attention. Thank you for your work, and I'm excited to try it out!
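
For anyone who hits the same problem before the configs are updated, here is a minimal sketch of one way to rewrite absolute paths in a loaded config so they become relative to the repository root. The key names (`dataset`, `root`, `checkpoint`) and the `/home/user/BiT` prefix are hypothetical, since the actual contents of the .yaml files are not shown here. The function works on a plain nested dict, such as the result of `yaml.safe_load`.

```python
import posixpath


def relativize_paths(cfg, old_root, new_root="."):
    """Return a copy of cfg where absolute paths under old_root become relative to new_root."""
    if isinstance(cfg, dict):
        return {k: relativize_paths(v, old_root, new_root) for k, v in cfg.items()}
    if isinstance(cfg, list):
        return [relativize_paths(v, old_root, new_root) for v in cfg]
    if isinstance(cfg, str) and posixpath.isabs(cfg):
        root = posixpath.normpath(old_root)
        path = posixpath.normpath(cfg)
        # Only touch paths that really live under old_root.
        if path == root or path.startswith(root + "/"):
            rel = posixpath.relpath(path, root)
            return posixpath.normpath(posixpath.join(new_root, rel))
    # Numbers, booleans, relative paths, and unrelated absolute paths are kept.
    return cfg


# Hypothetical config contents, for illustration only.
example_cfg = {
    "dataset": {"root": "/home/user/BiT/data/RBI"},
    "checkpoint": "/home/user/BiT/checkpoints/model.pth",
    "lr": 0.0001,
}
fixed_cfg = relativize_paths(example_cfg, "/home/user/BiT")
print(fixed_cfg)
```

Paths outside the given prefix are left unchanged, so the function can be run over a whole config without touching unrelated entries.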

How to deblur a video?

Thanks for your excellent work.
I have found that the demo is capable of deblurring images using a set of three images.
However, I am wondering if you could offer some guidance on how to deblur a video instead.
Would it be appropriate to deblur each frame using information from the previous and next frames?
I appreciate your help with this matter.
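
The sliding-window idea in this question can be sketched as follows: for each frame of the video, form a (previous, current, next) triplet and pass it to the model. This is only an illustration of the windowing. How the demo actually consumes its three images is not shown here, and repeating the first and last frame at the video boundaries is an assumption.

```python
def frame_triplets(frames):
    """Yield (previous, current, next) for every frame, repeating edge frames at the ends."""
    n = len(frames)
    for i in range(n):
        prev_frame = frames[max(i - 1, 0)]      # first frame has no predecessor
        next_frame = frames[min(i + 1, n - 1)]  # last frame has no successor
        yield prev_frame, frames[i], next_frame


# Frame names stand in for decoded images.
triplets = list(frame_triplets(["f0", "f1", "f2", "f3"]))
for t in triplets:
    print(t)
```

Each triplet would then be fed to the model in place of the three demo images, and the outputs concatenated in order to rebuild the video.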

Questions about Pre-BiT++

Many thanks for the RBI dataset with real motion blur. In my opinion, this is a real revolution! It will finally be possible to train Joint Video Deblurring and Frame Interpolation models on a dataset with real motion blur. Also thanks for developing the BiT models and making them available for download.

I am creating Video Frame Interpolation Rankings and Video Deblurring Rankings on GitHub, where each ranking includes only the single best model for each method.

I now intend to add rankings based on the RBI dataset and I make no secret that these will be the most important rankings in my repository. The best results based on your paper were achieved by the Pre-BiT++ model. I have a couple of questions in relation to this:

Did I understand correctly that this Pre-BiT++ model is the model trained on Adobe240 and then on RBI?
Does Pre-BiT++ also achieve better visual results than BiT++(RBI) on the RBI dataset? That is, does Pre-BiT++ avoid the artifacts shown in Figure 5 of your paper for BiT++(Adobe240)?
If you were to apply your method to a real movie, to be judged by human vision, would you choose Pre-BiT++ instead of BiT++(RBI)?

May I ask what computing resources you used during the training process and how long the training took?

Questions about the dataset

First of all, congratulations on such great public work. I have a question:

I would like to use your dataset to train a deblurring model that takes a single image as input and outputs a single image. How can I find the sharp image corresponding to each blurred image?

huggingface example

Thank you for the interesting research, your results look very convincing.
Could you please provide an example on Hugging Face (https://huggingface.co/)?

Request for a practical Pre-BiT++ model trained on perceptual loss

I'm a big fan of the practical use of video frame interpolation AI models to make watching movies, TV series and other video content as close to real life as possible. I also believe that for an even better representation of real life, the next step towards even better realism is to use joint video deblurring and frame interpolation AI models, due to the fact that almost all footage recorded at 24fps contains around 20.8ms of motion blur in each frame.
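
The 20.8 ms figure follows from simple arithmetic if one assumes the conventional 180-degree shutter, which is an assumption added here and not stated above: the shutter is open for half of each frame interval. A small sketch of the calculation:

```python
def exposure_ms(fps, shutter_angle_deg=180.0):
    """Exposure time per frame in milliseconds for a given frame rate and shutter angle."""
    frame_interval_ms = 1000.0 / fps
    # The shutter is open for shutter_angle / 360 of each frame interval.
    return frame_interval_ms * shutter_angle_deg / 360.0


print(round(exposure_ms(24), 1))  # 20.8
```

Footage shot with a different shutter angle would carry proportionally more or less blur per frame.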

I am hugely grateful to you for the RBI dataset with real motion blur, as this will finally make it possible to develop models that will perform well with real video footage. Thank you also for the information about the Pre-BiT++ model, which is trained on Adobe240 and then on RBI in order to get even better results.

Only a practical Pre-BiT++ model trained on perceptual loss is missing to make it perfect. Why is it so important to train on perceptual loss to use models in practice? I described it in detail in the introduction to the rankings here: https://github.com/AIVFI/Video-Frame-Interpolation-Rankings-and-Video-Deblurring-Rankings

In short: training on perceptual loss recovers more fine details, which is more pleasing to the human eye. This is particularly important for models such as BiT, where all video frames will be replaced by new frames, unlike video frame interpolation models where the original frames are preserved. In addition, BiT, by removing motion blur and giving clear and sharp output frames, will further benefit from the ability to recover fine details through training on perceptual loss.

So here is my big request to you to train a practical Pre-BiT++ model on perceptual loss. Unfortunately I am not a programmer myself and have no knowledge or skills in this area. These rankings of mine above are the pinnacle of my abilities and a way to connect with those who do model development on a daily basis. In this way, I want to help enthusiasts like me to find the best model for practical applications. I believe that a practical Pre-BiT++ trained on perceptual loss may be the best model for practical use, hence my request.

I also think that such a model would attract even more attention to your repository, which matters to me because I want to see more models trained on the RBI dataset with real motion blur in the future.

At the moment, of the 3 most popular frame interpolation methods on GitHub https://github.com/search?o=desc&q=Frame+Interpolation&s=stars&type=Repositories :

7.9k stars - DAIN (CVPR 2019)
3.4k stars - RIFE (ECCV 2022)
2.1k stars - FILM (ECCV 2022)

the developers of two of them, RIFE and FILM, have provided additional practical models that, although they do not reach as high a PSNR and SSIM as the primary models of these methods, offer much better perceptual quality.

Thus, I believe that a practical Pre-BiT++ model trained on perceptual loss can gain very wide interest not only from researchers but also from a wide range of enthusiasts for restoring realism to movies, TV series and other video footage.

Issue when I try to train BiT++ on Colab

I am trying to train your model on Google Colab using the following command:
!python -m torch.distributed.launch --nproc_per_node=1 train_bit.py --config ./configs/bit++_rbi.yaml
But I get the following error (most probably related to the GPU available on Colab):

`ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 7212) of binary: /usr/local/bin/python
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/usr/local/lib/python3.10/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train_bit.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2023-06-19_09:52:32
host : aea723e3180b
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 7212)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================`

As I don't know which backend settings are needed to run this code on a GPU, can you please guide me on how to solve this error?
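
The log above only reports that the child process exited with code 1. The real Python exception from `train_bit.py` is hidden, which is what the `error_file: <N/A>` and `traceback` lines are pointing at. The PyTorch page linked in the log describes the `record` decorator for this purpose. Below is a minimal sketch of how it is applied. The `main` function is a hypothetical stand-in, since the structure of `train_bit.py` is not shown here, and the fallback branch exists only so the snippet runs where PyTorch is not installed.

```python
try:
    # Real PyTorch API: records the child's exception so the launcher can report it.
    from torch.distributed.elastic.multiprocessing.errors import record
except ImportError:
    # PyTorch is not installed here; use a no-op decorator so the sketch still runs.
    def record(fn):
        return fn


@record
def main():
    # Hypothetical placeholder for the training entry point of the script.
    print("training would start here")
    return 0


if __name__ == "__main__":
    main()
```

With the entry point wrapped this way, the launcher's failure summary should include the underlying traceback, which usually makes the actual cause (for example a missing dataset path or an out-of-memory error) visible.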

Channel Similarity

Hi there, I read your paper and was intrigued by Figure 6b. I was wondering, how did you visualise this and is there any code to do this?
Thanks
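
How Figure 6b was actually produced is not described here. As a generic starting point only, one common way to inspect channel similarity is to flatten each channel of a feature map and compute pairwise cosine similarities, then show the resulting matrix as a heatmap. The sketch below does the first part in plain Python. The input layout (a list of channels, each a flat list of values) is an assumption.

```python
import math


def channel_cosine_similarity(feature_map):
    """Return a C x C matrix of cosine similarities between flattened channels."""
    norms = [math.sqrt(sum(v * v for v in ch)) for ch in feature_map]
    sim = []
    for i, a in enumerate(feature_map):
        row = []
        for j, b in enumerate(feature_map):
            dot = sum(x * y for x, y in zip(a, b))
            denom = norms[i] * norms[j]
            # All-zero channels have no direction; report similarity 0 for them.
            row.append(dot / denom if denom > 0 else 0.0)
        sim.append(row)
    return sim


# Three toy channels with two values each.
sim = channel_cosine_similarity([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
for row in sim:
    print(row)
```

The matrix can then be passed to a plotting library, for example matplotlib's `imshow`, to get a heatmap.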