moorethreads / moore-animateanyone
Character Animation (AnimateAnyone, Face Reenactment)
License: Apache License 2.0
I generated the video using
1. image: config/ref_images/anyone-3.png
2. pose video: config/pose_videos/anyone-video-5_kps.mp4
However, the face effect on the video is inconsistent with the face effect shown
The result is:
https://github.com/MooreThreads/Moore-AnimateAnyone/assets/98437692/d7bafef8-d13c-4f67-b966-587b9d20de3d
I have a question: unet_3d, resnet_3d, and transform_3d only deal with dimensional transformations, yet there is no indication anywhere that 3D computation is necessary.
I noticed that different transformation operations are used for pose and image.
For pose:
self.cond_transform = transforms.Compose(
    [
        transforms.RandomResizedCrop(
            self.img_size,
            scale=self.img_scale,
            ratio=self.img_ratio,
            interpolation=transforms.InterpolationMode.BILINEAR,
        ),
        transforms.ToTensor(),
    ]
)
For image:
self.transform = transforms.Compose(
    [
        transforms.RandomResizedCrop(
            self.img_size,
            scale=self.img_scale,
            ratio=self.img_ratio,
            interpolation=transforms.InterpolationMode.BILINEAR,
        ),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ]
)
Why does pose not require the final normalization step?
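For reference, a minimal sketch of what the two pipelines produce numerically. `ToTensor()` scales 8-bit pixel values to [0, 1], and `Normalize([0.5], [0.5])` computes `(x - 0.5) / 0.5`, which maps that range to [-1, 1]. The sketch uses plain Python instead of torchvision so it runs anywhere, and the helper names are hypothetical. Whether the pose branch was left in [0, 1] on purpose is something only the authors can confirm; a [-1, 1] input range is the usual convention for the Stable Diffusion VAE.

```python
def to_tensor_value(pixel):
    """Mimic the scaling ToTensor() applies to one 8-bit pixel: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def normalize_value(x, mean=0.5, std=0.5):
    """Mimic Normalize([mean], [std]) for a single value: (x - mean) / std."""
    return (x - mean) / std


pixels = [0, 255]  # darkest and brightest 8-bit values

# Pose pipeline: ToTensor() only.
pose_range = [to_tensor_value(p) for p in pixels]

# Image pipeline: ToTensor() followed by Normalize([0.5], [0.5]).
image_range = [normalize_value(to_tensor_value(p)) for p in pixels]

print(pose_range)   # [0.0, 1.0]
print(image_range)  # [-1.0, 1.0]
```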
move to poetry.
I tested several pose videos. The generated video is only 2 seconds long. How can I modify the code to generate a longer one?
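As a rough sketch of the arithmetic involved: clip duration is the number of generated frames divided by the playback frame rate, so a longer video needs more frames (or a lower frame rate). The numbers below are illustrative assumptions, not values read from this repository's configuration, and the function names are hypothetical.

```python
import math


def clip_duration_seconds(num_frames, fps):
    """Duration of a clip with `num_frames` frames played back at `fps` frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def frames_needed(seconds, fps):
    """Smallest number of frames that covers `seconds` at `fps` frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return math.ceil(seconds * fps)


# Illustrative values only: 24 frames at 12 fps gives a 2-second clip.
print(clip_duration_seconds(24, 12))  # 2.0
print(frames_needed(10, 12))          # 120
```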
Thank you for sharing the inference code and models @lixunsong ; the results are good, but further optimization is needed for stable outcomes. I would highly appreciate it if you could open source the training code as well.
https://github.com/MooreThreads/Moore-AnimateAnyone#%EF%B8%8F-examples
Also, was any preconditioning or post-editing applied?
TypeError: 'weights_only' is an invalid keyword argument for Unpickler()
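This error usually points at the installed PyTorch being older than the caller expects: to my knowledge the `weights_only` argument of `torch.load` was added in PyTorch 1.13, and older releases forward unknown keywords to the unpickler. A minimal sketch of a version check follows. `MIN_TORCH` and the helper names are assumptions for illustration, and in practice the string would come from `torch.__version__`.

```python
import re

# Assumption: first PyTorch release whose torch.load accepts weights_only.
MIN_TORCH = (1, 13)


def parse_version(version):
    """Extract (major, minor) from strings such as '2.0.1+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_weights_only(version):
    """True if the given version string is at least MIN_TORCH."""
    return parse_version(version) >= MIN_TORCH


print(supports_weights_only("1.12.1+cu113"))  # False
print(supports_weights_only("2.0.1+cu118"))   # True
```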
Congratulations on achieving such amazing results!!!
Both cartoons and real people are animated with smooth motion, so I would like to ask which datasets you used for training, such as the UBC dataset or a dataset from TikTok?
RuntimeError: PytorchStreamReader failed reading file data/383: invalid header or archive is corrupted
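An error like this often means the checkpoint was only partially downloaded. Checkpoints written by recent PyTorch versions are zip archives, so the standard library can run a quick integrity check before loading. A minimal sketch; the function name is hypothetical, and passing this check does not guarantee the weights are the right ones.

```python
import zipfile


def checkpoint_archive_ok(path):
    """Return True if `path` is a readable zip archive whose members all pass their CRC check."""
    if not zipfile.is_zipfile(path):
        # Truncated downloads usually lose the end-of-archive record and fail here.
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member, or None if all are fine.
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        return False
```

If the check fails, downloading the file again is the first thing to try.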
Is there any way to make the output video longer than 4 seconds?
Hello,
Thank you so much for releasing the training code. How much GPU VRAM is required for training? For example, if one trains on a single A100 (40GB), how long will it take to get very good results?
OSError: Error no file named config.json found in directory ./pretrained_weights/stable-diffusion-v1-5/.
We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like ./pretrained_weights/sd-vae-ft-mse is not the path to a directory containing a config.json file.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode
Traceback (most recent call last):
File "Moore-AnimateAnyone\app.py", line 16, in <module>
from src.models.unet_2d_condition import UNet2DConditionModel
File "Moore-AnimateAnyone\src\models\unet_2d_condition.py", line 40, in <module>
from .unet_2d_blocks import (
File "Moore-AnimateAnyone\src\models\unet_2d_blocks.py", line 15, in <module>
from .transformer_2d import Transformer2DModel
File "Moore-AnimateAnyone\src\models\transformer_2d.py", line 7, in <module>
from diffusers.models.embeddings import CaptionProjection
ImportError: cannot import name 'CaptionProjection' from 'diffusers.models.embeddings'
Hello Moore AnimateAnyone team,
I've been exploring your remarkable project and am interested in applying it to a specific domain by fine-tuning the pre-trained model on a small, domain-specific dataset. I would appreciate some guidance on the best practices for fine-tuning the model effectively. My questions are as follows:
Model Weight Initialization: For fine-tuning, is it recommended to initialize the model with the provided pre-trained weights and then continue training on the new dataset? If so, could you provide an example or guidance on loading the pre-trained weights correctly before starting the fine-tuning process?
Two-Stage Training Process: The training process for the model is described as two-stage. Should fine-tuning on a new dataset also follow this two-stage approach, or are there any modifications or considerations we should be aware of for fine-tuning?
Data Preparation and Augmentation: For fine-tuning on a small dataset, are there any specific data preparation or augmentation techniques you recommend to prevent overfitting and ensure the model generalizes well to the new domain?
Hyperparameter Adjustments: Are there any specific hyperparameters (e.g., learning rate, batch size) that you suggest tweaking for fine-tuning as opposed to training from scratch?
Evaluation during Fine-tuning: What are the best practices for evaluating the model during the fine-tuning process to ensure that it's adapting well to the new dataset without forgetting the knowledge gained during pre-training?
Any guidance, examples, or additional resources you could provide would be greatly appreciated. Fine-tuning deep learning models can be nuanced, and insights from the creators would be invaluable.
Thank you for your work on this innovative project and for your support to the community.
I have tried all torch versions. Which version works?
mat1 and mat2 shapes cannot be multiplied (2x1024 and 768x320)
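The message says a batch of 2 vectors with 1024 features was fed to a linear layer whose weight expects 768 input features. A likely cause, though this is an assumption, is pairing an encoder that emits 1024-dimensional embeddings with a UNet built for 768-dimensional conditioning. The sketch below just reproduces the shape rule in plain Python; the function name is hypothetical.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise if the inner dimensions differ."""
    (m, k1), (k2, n) = a_shape, b_shape
    if k1 != k2:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied ({m}x{k1} and {k2}x{n})"
        )
    return (m, n)


# Matching inner dimensions: a 768-dim embedding projected to 320 channels.
print(matmul_shape((2, 768), (768, 320)))  # (2, 320)

# Mismatched inner dimensions reproduce the reported message.
try:
    matmul_shape((2, 1024), (768, 320))
except ValueError as err:
    print(err)
```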
Any plans to support MPS? Thanks!
The demo at https://huggingface.co/spaces/xunsong/Moore-AnimateAnyone fails, even when using the example inputs.
When clicking the animate button, we only see the word "Error". Chrome reports this message:
iframeResizer.contentWindow.js:171 [iFrameSizer][iFrameResizer0] No tagged elements (data-iframe-height) found on page
Could someone be so kind as to make a Google Colab to test this? Thanks :)
Thank you very much for providing your valuable code and model. The current model test results may need further optimization.
In addition, can you open source the training code? I would be very grateful.
Dear Moore-AnimateAnyone Contributors,
I hope this message finds you well. I have been thoroughly exploring the capabilities of the Moore-AnimateAnyone repository and am deeply impressed by the strides made in animating still images with such remarkable results. The demo hosted on HuggingFace Spaces is particularly indicative of the potential this technology holds.
However, upon delving into the examples provided and running my own tests, I have observed certain limitations that I believe, if addressed, could significantly elevate the quality of the animations produced. I would like to propose a few enhancements that could potentially mitigate these issues and refine the overall animation process.
Background Artifacts: The presence of artifacts in animations, especially when the reference image has a clean background, can be quite distracting. Could we consider implementing a more robust background detection and preservation algorithm to maintain the integrity of the original image?
Scale Mismatch: The suboptimal results due to scale mismatch between the reference image and keypoints are noticeable. While the paper suggests preprocessing techniques, their implementation is not yet apparent in the current version. Could we prioritise the integration of these preprocessing techniques to improve the handling of scale variations?
Motion Subtleties: The flickering and jittering in animations with subtle motions or static scenes detract from the fluidity of the animation. Would it be possible to introduce a smoothing mechanism or a motion threshold to ensure that only significant movements are translated into the animation sequence?
I understand that these enhancements may involve considerable research and development efforts, but I believe they could be instrumental in pushing the boundaries of what Moore-AnimateAnyone can achieve. Additionally, these improvements could be pivotal in the deployment of this technology on the MoBi MaLiang AIGC platform, ensuring a more polished and professional output for end-users.
I am keen to follow the progress of this project and am more than willing to contribute to discussions or testing, should you find my feedback of value.
Thank you for your dedication to this innovative project, and I look forward to your thoughts on the potential for these enhancements.
Best regards,
yihong1120
Every time I run this I get the following error:
Traceback (most recent call last):
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\diffusers\models\modeling_utils.py", line 109, in load_state_dict
return torch.load(checkpoint_file, map_location="cpu")
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1028, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\diffusers\models\modeling_utils.py", line 122, in load_state_dict
raise ValueError(
ValueError: Unable to locate the file ./pretrained_weights/stable-diffusion-v1-5/unet\diffusion_pytorch_model.bin which is necessary to load this pretrained model. Make sure you have saved the model properly.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\routes.py", line 534, in predict
output = await route_utils.call_process_api(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1554, in process_api
result = await self.call_function(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1192, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 659, in wrapper
response = f(*args, **kwargs)
File "C:\AI\AnimateAnyone\Moore-AnimateAnyone\app.py", line 52, in animate
reference_unet = UNet2DConditionModel.from_pretrained(
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\diffusers\models\modeling_utils.py", line 800, in from_pretrained
state_dict = load_state_dict(model_file, variant=variant)
File "C:\Users\henso\AppData\Local\Programs\Python\Python310\lib\site-packages\diffusers\models\modeling_utils.py", line 127, in load_state_dict
raise OSError(
OSError: Unable to load weights from checkpoint file for './pretrained_weights/stable-diffusion-v1-5/unet\diffusion_pytorch_model.bin' at './pretrained_weights/stable-diffusion-v1-5/unet\diffusion_pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
Any solutions?
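`EOFError: Ran out of input` from the unpickler typically means the file it was handed is empty or cut short, which would fit an interrupted download of `diffusion_pytorch_model.bin`. A small sketch that lists zero-byte files under a weights directory; the function name is hypothetical, and the directory argument is whatever path the weights were downloaded into.

```python
from pathlib import Path


def find_empty_files(root):
    """Return a sorted list of zero-byte files anywhere under `root`."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )
```

For example, `find_empty_files("./pretrained_weights")` would list any files that need to be downloaded again.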
Very good job! I ran your code in Colab using the anyone-video-2 keypoints from your repo and just chose my own reference image, but the results do not look good. Can you check it?
But the iterations continue.
Once completed, the output was just a load of noise. This was with length 32, from a 25-frame source video.
Is there an easy way to use safetensor models with the pipeline?
I have a few merges I would like to try.
As the title says, I want to use other motion sequences to test the model.
In src/pipelines/utils.py, the function set_tensor_interpolation_method doesn't look like it's ever been used. I searched for this function name globally in VSCode and found that it only appeared once, when it was defined.
Then I found out that the variable tensor_interpolation will only be modified in this function, which means that the return value of the get_tensor_interpolation_method function is always None, if I understand correctly.
The function get_tensor_interpolation_method is used when building the Pose2VideoPipeline, and I'm not sure if this will affect the results.
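The behaviour described can be reproduced with a few lines of plain Python. This is a simplified sketch of the pattern, not the code in `src/pipelines/utils.py`: the function and variable names follow the issue text, while the setter's `method` parameter and the `linear` helper are assumptions made for illustration.

```python
tensor_interpolation = None  # module-level default, as described in the issue


def get_tensor_interpolation_method():
    """Return whatever the module-level variable currently holds."""
    return tensor_interpolation


def set_tensor_interpolation_method(method):
    """The only place the module-level variable is ever reassigned."""
    global tensor_interpolation
    tensor_interpolation = method


def linear(v0, v1, t):
    """Hypothetical interpolation function used only for this demonstration."""
    return (1.0 - t) * v0 + t * v1


# If the setter is never called, the getter can only return the default.
before = get_tensor_interpolation_method()
print(before)  # None

set_tensor_interpolation_method(linear)
after = get_tensor_interpolation_method()
print(after(0.0, 10.0, 0.25))  # 2.5
```

Whether a `None` result matters depends on whether the caller ever invokes the returned value; that part needs checking against the pipeline code itself.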
As the title says. Which config file is needed there?
File "Moore-AnimateAnyone/src/models/mutual_self_attention.py", line 180, in hacked_basic_transformer_inner_forward
norm_hidden_states[_uc_mask],
IndexError: The shape of the mask [2] at index 0 does not match the shape of the indexed tensor [3, 9216, 320] at index 0
Steps: 1%|▎ | 249/30000 [06:30<12:57:21, 1.57s/it, lr=1e-5, step_loss=0.107]
Such an open source effort with amazing results!
I have some questions about training data. What is the approximate amount of video data used to train the model?
Thanks for your great work. Have you ever encountered overfitting?
Hello!
Could you tell me why I'm getting an error when launching the vid2pose module? I used the command that you provided.
python tools/vid2pose.py --video_path my_path/to_file.mp4
Console log:
Traceback (most recent call last):
File "/content/Moore-AnimateAnyone/tools/vid2pose.py", line 1, in <module>
from src.dwpose import DWposeDetector
ModuleNotFoundError: No module named 'src'
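When a script is started as `python tools/vid2pose.py`, Python puts the script's own directory (`tools/`) on `sys.path`, not the directory it was launched from, so the top-level `src` package is not importable. One common workaround is to add the repository root to `sys.path` before the import; running from the repository root with `python -m tools.vid2pose` or setting `PYTHONPATH` are alternatives, assuming the layout allows them. A minimal sketch with a hypothetical helper name:

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file, levels_up=1):
    """Put the directory `levels_up` levels above the script's folder on sys.path.

    For <repo>/tools/vid2pose.py, levels_up=1 resolves to <repo>.
    """
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

Inside the script this would be called as `add_repo_root_to_path(__file__)` before any `from src...` import.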
great!
When processing data, the script cannot run because controlnet_aux.util is missing.
When I tried to load from safetensors, I found a problem with the training weights.
Although it works in diffusers, it carries some hidden risk.
Moore-AnimateAnyone/tools/download_weights.py
Line 106 in 6d0c2da
You guys forgot to call prepare_vae() in the main function.
With unconditional generation during training, should the reference embedding be concatenated to the normal_hidden_states?
Can you release your training code?
Hello, thanks for your open weights. I would like to use some features from the result of the first training stage; will you share these weights?
I am also curious how well the first-stage result preserves identity and generates high-quality images; could you share your experience with this?