rnd-codec-dev

RnD Codec

Deep architecture for H.264 I-Frame decoding

This repository contains the implementation of a deep architecture for H.264 I-frame decoding.

Requirements

To install requirements:

pip install -r requirements.txt

To download the dataset: tar.gz

This dataset is also accessible on Hugging Face, under the name BotniVision/h264_images.
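As a minimal sketch of fetching the Hugging Face copy, the snippet below downloads a snapshot of the dataset repository with the `huggingface_hub` package. The helper name `download_h264_images` and the default `local_dir` are assumptions made for illustration; they are not part of this repository, and `huggingface_hub` may need to be installed separately if it is not pulled in by requirements.txt.

```python
from pathlib import Path

# Dataset name as given in this README.
REPO_ID = "BotniVision/h264_images"


def download_h264_images(local_dir: str = "data/h264_images") -> Path:
    """Download a snapshot of the dataset repository and return its local path.

    `local_dir` is a hypothetical default; point it wherever the dataset
    folders are expected on your machine.
    """
    # Imported lazily so the module can be imported without the package installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_h264_images()` requires network access and returns the directory holding the downloaded files.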

Data preprocessing:

  • Untar the dataset.
  • Open h264/src/utils/cleanup_samples.py and make sure the dataset folders are correct.
  • Run:
python3 h264/src/utils/cleanup_samples.py
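The untar step above can also be done from Python with the standard library. This is only a sketch: the helper name `untar_dataset` and the paths in the usage note are placeholders, not names used by this repository.

```python
import tarfile
from pathlib import Path


def untar_dataset(archive: str, dest: str) -> list[str]:
    """Extract a .tar.gz archive into `dest` and return the member names."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Only extract archives from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_path)
    return names
```

For example, `untar_dataset("dataset.tar.gz", "data")` would unpack a hypothetical `dataset.tar.gz` into `data/`; the resulting folder should match the dataset folders set in cleanup_samples.py.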

Training

To train the model(s) in the paper, run this command:

python3 h264/src/stages/train_h264_ae_dalle.py --config params_bytes_ae_mssiml1_vgg.yaml --basedir h264

Note: you need to enter the wandb password.

Sample data

To generate samples on the test set, run:

python h264/src/stages/eval_h264_ae_dalle.py --config h264/params_bytes_ae_eval.yaml --basedir h264

Note: you can configure the test samples in config.yaml/dataset/traintest, config.yaml/dataset/validtest, or dataset.csv.
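The slash-separated names above read as key paths into the YAML config (file, then nested keys). The sketch below shows one way such a path could be resolved once the config has been loaded into a dictionary. Both the helper `get_by_path` and the contents of `example_config` are hypothetical; the real config may hold different value types under these keys.

```python
from typing import Any


def get_by_path(config: dict, path: str) -> Any:
    """Follow a slash-separated key path such as "dataset/traintest"."""
    node: Any = config
    for key in path.split("/"):
        node = node[key]  # raises KeyError if a key is missing
    return node


# Hypothetical stand-in for a loaded config; the empty lists are placeholders.
example_config = {"dataset": {"traintest": [], "validtest": []}}

train_test_samples = get_by_path(example_config, "dataset/traintest")
valid_test_samples = get_by_path(example_config, "dataset/validtest")
```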

Pre-trained Models

You can download pretrained models here:

  • model 1, trained for 800 epochs on the 1.8k (duplicated samples) dataset h264_v20231127.
  • model 2, trained for 500 epochs on the 1.3k dataset h264_v20231127.
  • model 3, trained for 250 epochs on the 1M dataset: h264_v20231127, h264_v20240206_1, h264_v20240206_2, h264_v20240206_3, h264_v20240206_4, h264_v20240206_5, h264_v20240206_6, h264_v20240206_7.

Results

Our models achieve the following performance:

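| Model name | Epochs | Training loss | Validation loss | Dataset                   |
|------------|--------|---------------|-----------------|---------------------------|
| model 1    | 800    | 7.93          | 30.632          | 1.8k (duplicated samples) |
| model 2    | 500    | 12.863        | 30.448          | 1.3k                      |
| model 3    | 250    | 20.03         | 27.348          | 1M                        |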
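One way to read these numbers is to look at the gap between validation and training loss for each model. The short sketch below computes that difference from the values reported above; it assumes the losses are comparable across the three runs, which this README does not state.

```python
# (training loss, validation loss) as reported in the results above.
results = {
    "model 1": (7.93, 30.632),
    "model 2": (12.863, 30.448),
    "model 3": (20.03, 27.348),
}

# Gap between validation and training loss, rounded to three decimals.
gaps = {name: round(val - train, 3) for name, (train, val) in results.items()}

for name, gap in gaps.items():
    print(f"{name}: validation - training = {gap}")
```

Under that assumption, model 3, trained on the 1M dataset, has both the lowest validation loss and the smallest gap of the three.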

Related Work

https://github.com/apple/ml-cvnets

https://github.com/AntixK/PyTorch-VAE/blob/master/models/mssim_vae.py#L182

https://github.com/crowsonkb/vgg_loss

https://github.com/psyrocloud/MS-SSIM_L1_LOSS/tree/main

https://github.com/openai/DALL-E

Contributors

annyfan
