
dh_live's Introduction

Real-time Live Streaming Digital Human

Chinese name: 实时直播数字人. A demo video is available on bilibili.

Video Example

demo.mp4

Overview

This project is a real-time live streaming digital human powered by few-shot learning. It is designed to run smoothly on NVIDIA 30 and 40 series graphics cards, providing a seamless, interactive live streaming experience.

Key Features

  • Real-time Performance: The digital human interacts in real time at 25+ fps on common NVIDIA 30 and 40 series GPUs.
  • Few-shot Learning: The system is capable of learning from a few examples to generate realistic responses.

Usage

Unzip the Model File

First, navigate to the checkpoint directory and decompress the gzip-compressed model file:

cd checkpoint
gzip -d -c render.pth.gz.001 > render.pth
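On systems without the gzip command-line tool (for example, a plain Windows shell), the same decompression step can be done from Python. This is a minimal sketch using only the standard library; the helper name `decompress_gzip` is hypothetical and not part of the project, and the file names are taken from the command above.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gzip(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream-decompress the gzip file `src` into `dst` and return the bytes written."""
    # Copy in chunks so a large checkpoint is never held fully in memory.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out, chunk_size)
    return dst.stat().st_size


if __name__ == "__main__":
    # Run from the repository root; paths mirror the shell command above.
    checkpoint_dir = Path("checkpoint")
    size = decompress_gzip(
        checkpoint_dir / "render.pth.gz.001",
        checkpoint_dir / "render.pth",
    )
    print(f"wrote {size} bytes to {checkpoint_dir / 'render.pth'}")
```

Unlike `gzip -d` without `-c`, this leaves the compressed file in place, matching the behavior of the command above.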

Prepare Your Video

Next, prepare your video using the data_preparation.py script. Replace YOUR_VIDEO_PATH with the path to your video:

python data_preparation.py YOUR_VIDEO_PATH

The result (video_info) will be stored in the ./video_data directory.

Run with Audio File

Run the demo script with an audio file. The audio file must be a .wav file with a 16 kHz sample rate, 16-bit samples, and a single (mono) channel. Replace video_data/test with the path to your video_info directory, video_data/audio0.wav with the path to your audio file, and 1.mp4 with the desired output video path:

python demo.py video_data/test video_data/audio0.wav 1.mp4
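Since the demo expects a specific audio format, it can help to verify a file before running it. The following is a minimal sketch using Python's standard `wave` module; `check_wav` is a hypothetical helper, not part of the project, and it only inspects the WAV header against the requirements stated above.

```python
import wave


def check_wav(path: str) -> list:
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    # wave.open raises wave.Error if the file is not an uncompressed PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000 Hz")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits, expected 16 bits")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels found, expected 1 (mono)")
    return problems


if __name__ == "__main__":
    import sys

    issues = check_wav(sys.argv[1])
    if issues:
        print("Audio file does not match the required format:")
        for issue in issues:
            print(" -", issue)
    else:
        print("Audio file looks OK.")
```

If the script is saved as, say, check_wav.py (an assumed name), it can be run as `python check_wav.py video_data/audio0.wav`.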

Real-Time Run with Microphone

For real-time operation using a microphone, simply run the following command:

python demo_avatar.py

Acknowledgements

We would like to thank the contributors to the Wav2Lip, DINet, and LiveSpeechPortraits repositories for their open research and contributions.

License

This project is licensed under the MIT License.

Contact

For any questions or suggestions, please contact us at [[email protected]].

dh_live's People

Contributors

kleinlee


dh_live's Issues

Add a TTS text-driven lip-sync feature?

This is an excellent open-source project, sufficiently concise and lightweight, with a very friendly license. Thank you for your efforts. Are there any plans to add a TTS text-driven lip-sync feature? Thank you very much!

dataset to train the model

Dear author, thanks for your great work. Could you share some information about the training dataset, such as its size and the number of people in it? How did you collect the Chinese dataset? Thanks very much.

python demo_avatar.py

How is the microphone supposed to be used? I have already enabled my microphone locally, but there is still no response at all.

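Video disappears after inference; output folder is empty

During inference, a folder containing the video file is created inside the output folder. Once inference finishes, that folder disappears and the output folder is completely empty. No error is reported. I don't know what is going on.
Here is the cmd output: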

(dhlive) F:\SoVITS\DH_live>python demo.py video_data/test video_data/audio0.wav 1.mp4
(256, 256, 3)
Video path is set to: video_data/test
Audio path is set to: video_data/audio0.wav
output video name is set to: 1.mp4
C:\Users\Administrator\anaconda3\envs\dhlive\Lib\site-packages\sklearn\base.py:376: InconsistentVersionWarning: Trying to unpickle estimator PCA from version 1.3.0 when using version 1.5.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
F:\SoVITS\DH_live\talkingface\audio_model.py:47: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
self.__net.load_state_dict(torch.load(ckpt_path))
F:\SoVITS\DH_live\talkingface\render_model.py:37: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(ckpt_path)
0%| | 0/186 [00:00<?, ?it/s]C:\Users\Administrator\anaconda3\envs\dhlive\Lib\site-packages\torch\nn\functional.py:4373: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
warnings.warn(
100%|████████████████████████████████████████████████████████████████████████████████| 186/186 [00:09<00:00, 18.74it/s]

Memory leak?

I noticed that memory usage keeps increasing while the program is running.
