kleinlee / dh_live

A digital human that everyone can use

Python 100.00%

dh_live's Introduction

Real-time Live Streaming Digital Human

Real-time live streaming digital human (bilibili video)

Video Example

demo.mp4

Overview

This project is a real-time live streaming digital human powered by few-shot learning. It is designed to run smoothly on all NVIDIA 30 and 40 series graphics cards, giving a seamless, interactive live streaming experience.

Key Features

Real-time Performance: The digital human interacts in real time at 25+ fps on common NVIDIA 30 and 40 series GPUs.
Few-shot Learning: The system learns from a few examples to generate realistic responses.
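
To make the 25+ fps figure concrete, the sketch below converts a target frame rate into a per-frame time budget and checks a measured throughput against it. The function names and the 25 fps default are illustrative assumptions and are not part of this repository.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def meets_real_time(measured_fps: float, target_fps: float = 25.0) -> bool:
    """True when the measured throughput is at least the target frame rate.

    Assumption: one processed iteration corresponds to one output frame.
    """
    return measured_fps >= target_fps
```

At 25 fps the budget is 40 ms per frame, so audio feature extraction and rendering together have to fit inside that window for live use.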

Usage

Unzip the Model File

First, navigate to the checkpoint directory and unzip the model file:

cd checkpoint
gzip -d -c render.pth.gz.001 > render.pth
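
If the gzip command-line tool is not available (for example on a stock Windows install), the same step can be sketched with Python's standard library. This is a minimal equivalent of the command above, not a script shipped with the project; the function name decompress_gzip is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gzip(src: str, dst: str) -> int:
    """Decompress the gzip file `src` into `dst` and return the output size in bytes."""
    # Stream the data in chunks so a large checkpoint is never held fully in memory.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return Path(dst).stat().st_size
```

From the repository root, it would be called as decompress_gzip("checkpoint/render.pth.gz.001", "checkpoint/render.pth").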

Prepare Your Video

Next, prepare your video using the data_preparation script. Replace YOUR_VIDEO_PATH with the path to your video:

python data_preparation.py YOUR_VIDEO_PATH

The result (video_info) will be stored in the ./video_data directory.

Run with Audio File

Run the demo script with an audio file. The audio must be a .wav file with a 16 kHz sample rate, 16-bit samples, and a single channel (mono). Replace video_data/test with the path to your video_info directory, video_data/audio0.wav with the path to your audio file, and 1.mp4 with the desired output video path:

python demo.py video_data/test video_data/audio0.wav 1.mp4
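
Before running the demo, it can help to confirm that the audio file really matches the required format. The following is a small sketch using Python's standard wave module; the helper name check_wav_format is hypothetical and is not part of the project.

```python
import wave


def check_wav_format(path: str) -> list:
    """Return a list of format problems; an empty list means 16 kHz, 16-bit, mono."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes == 16 bits
        channels = wav.getnchannels()
    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
    if width != 2:
        problems.append(f"sample width is {width * 8} bits, expected 16 bits")
    if channels != 1:
        problems.append(f"{channels} channels found, expected 1 (mono)")
    return problems
```

An empty result from check_wav_format("video_data/audio0.wav") means the file can be passed to demo.py as-is; otherwise the audio needs to be resampled or downmixed first.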

Real-Time Run with Microphone

For real-time operation using a microphone, simply run the following command:

python demo_avatar.py

Acknowledgements

We would like to thank the contributors of the Wav2Lip, DINet, and LiveSpeechPortraits repositories for their open research and contributions.

License

This project is licensed under the MIT License.

Contact

For any questions or suggestions, please contact us at [[email protected]].

dh_live's Issues

Is real-time interaction implemented?

Does python demo_avatar.py implement real-time interaction?

Unzipping the render.pth.gz files fails

1. git clone this project
2. Unzip the two files --> fails

Can it be driven by a single image?

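A great project, but there is almost no documentation

A great project, but there is almost no documentation. I hope the author can add training instructions.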

Add a TTS text-driven lip-sync feature?

This is an excellent open-source project, sufficiently concise and lightweight, with a very friendly license. Thank you for your efforts. Are there any plans to add a TTS text-driven lip-sync feature? Thank you very much!

dataset to train the model

Dear author, thanks for your great work. Could you share some information about the training dataset, such as its size and the number of people in it? How can a Chinese dataset be collected? Thanks very much.

In the video_data folder, only circle.mp4 is generated; keypoint_rotate.pkl is not

After running python data_preparation.py ***, only circle.mp4 was generated; keypoint_rotate.pkl was not.

How was the model trained? Is there any documentation?

python demo_avatar.py

How is this used with a microphone? I have already enabled my microphone locally, but there is still no response.

Can this project do real-time S-R?

real-time super resolution

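Video disappears after inference; the output folder is empty

During inference, a folder containing the video file is created inside the output folder. Once inference finishes, that folder disappears and output is completely empty. No error is reported in the console. I don't know what is going on.
Here is the cmd output: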

(dhlive) F:\SoVITS\DH_live>python demo.py video_data/test video_data/audio0.wav 1.mp4
(256, 256, 3)
Video path is set to: video_data/test
Audio path is set to: video_data/audio0.wav
output video name is set to: 1.mp4
C:\Users\Administrator\anaconda3\envs\dhlive\Lib\site-packages\sklearn\base.py:376: InconsistentVersionWarning: Trying to unpickle estimator PCA from version 1.3.0 when using version 1.5.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
F:\SoVITS\DH_live\talkingface\audio_model.py:47: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
self.__net.load_state_dict(torch.load(ckpt_path))
F:\SoVITS\DH_live\talkingface\render_model.py:37: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(ckpt_path)
0%| | 0/186 [00:00<?, ?it/s]C:\Users\Administrator\anaconda3\envs\dhlive\Lib\site-packages\torch\nn\functional.py:4373: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
warnings.warn(
100%|████████████████████████████████████████████████████████████████████████████████| 186/186 [00:09<00:00, 18.74it/s]