MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
License: MIT License
Hi,
I really appreciate that you released the source code, but would you mind testing demo.py yourselves? There are several issues when running it:
(1) After cd Refinement as suggested, several paths are wrong.
(2) The audio dim (the size of the phoneme) is 28, not the 97 in your config_demo.yaml.
(3) Line 290 in data.py should be './MFCC_test' instead of '/MFCC_test'.
(4) Line 308 in data.py: sample is a string that looks like 'M003_73_1_output_03/054.pickle reference.jpg'. I don't think sample[0] in the following lines refers to the character 'M'. Currently I split the sample (which is a string) by space.
(5) Line 60 in trainer_demo.py: heatmap = self.transform(draw_heatmap_from_78_landmark(fake_ldmk, 384, 384)). You forgot the batchsize argument (the first argument of draw_heatmap_from_78_landmark). I assumed batchsize=1 during the demo.
(6) Following (5), fake_ldmk should be reshaped to a 2-D numpy array of shape (1, 78).
(7) Following (6), draw_heatmap_from_78_landmark should return a numpy array / PIL image, but the current function returns a tensor.
(8) Line 66 in demo.py: trainer.page2emo instead of trainer.page2em.
(9) In demo.py: image_dir = os.path.join(image_directory, str(em_fc)) instead of image_dir = os.path.join(image_directory, em_fc).
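A minimal sketch of the fix described in item (4), assuming sample is the raw line read from the test list (the example string is taken from this report; the variable names are mine):

```python
# Hypothetical fix for item (4): sample is one space-separated string,
# so split it instead of indexing single characters like sample[0].
sample = 'M003_73_1_output_03/054.pickle reference.jpg'
pickle_path, ref_image = sample.split()
print(pickle_path)  # -> M003_73_1_output_03/054.pickle
print(ref_image)    # -> reference.jpg
```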
I really appreciate your great work, but would be even more grateful if you could test the demo program first. Thanks!
Thank you for your contribution and for publishing part of the data. I was wondering when Part 1 will be available?
Hi,
Thanks for releasing the dataset and the code. The link for the pretrained models needed for running the test code is currently not working, could you kindly update the repository with the correct hyperlink?
Hi, when I use preprocess_mfcc.py to create audio.pickle, I find that the number of pickles does not match the number of images extracted from the video. E.g., angry/level_1/001.m4a creates 97 pickles, but video/front/angry/level_1/001.mp4 yields 98 images.
Have you encountered this problem, and how did you solve it?
Thank you!
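If it helps anyone hitting the same off-by-one, one workaround (my own sketch, not the authors' code) is simply to truncate both sequences to the common length before pairing them:

```python
def align_pairs(audio_items, frame_items):
    """Pair MFCC pickles with video frames, dropping trailing extras
    so an off-by-one count (e.g. 97 pickles vs 98 frames) is tolerated."""
    n = min(len(audio_items), len(frame_items))
    return list(zip(audio_items[:n], frame_items[:n]))

# 97 pickles vs 98 frames -> 97 aligned pairs
pairs = align_pairs(list(range(97)), list(range(98)))
print(len(pairs))  # -> 97
```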
Hi, thank you for your great work!
What is the license for the MEAD dataset?
Is it the same as the code (MIT)?
Hello,
https://wywu.github.io/projects/MEAD/MEAD.html
The [Download-Part0 (Baidu Drive)] dataset link has expired. Could you share it again?
I am trying to download video.tar of the M019 folder from Google Drive. Untarring results in an error saying the tar is corrupted. I tried downloading multiple times and met with the same error.
Were the cameras that shot these videos placed at fixed positions in the light cage? Apart from the change in angle, does the distance between the camera and the subject vary? Could you provide the specific camera parameters?
Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?
Thank you for making your dataset and method available. I would like to ask if the text of the corpus in txt (or other form) is available somewhere, or do we need to take it from the supplementary material of the pdf?
Thanks for your efforts in producing the MEAD dataset. We are looking forward to working with it.
It seems to me that some audio clips are longer than the maximum duration suggested in the paper. The supplementary material Fig. 1 plots, as well as the text on the first page, suggest that the maximum "sentence duration" is 7 seconds.
https://wywu.github.io/projects/MEAD/support/MEAD-supp.pdf
From my example below, I believe you should be able to replicate a case where the duration is 17 seconds. To replicate, download the audio.tar file for M034 from Google Drive, extract it, and run the following Python code:
>>> import librosa
>>> sentence_path = './fear/level_3/028.m4a'
>>> y, sr = librosa.load(sentence_path, sr=None)
>>> librosa.get_duration(y=y, sr=sr)
17.237333333333332
>>> sr
48000
>>> librosa.__version__
'0.8.1'
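To check how widespread this is, here is a small sketch that scans an extracted audio.tar tree for over-long clips. The layout pattern is an assumption based on the path above, and the duration callback is injected so you can plug in the librosa calls from the REPL session:

```python
import glob

def long_clips(root, duration_of, max_s=7.0):
    """Return .m4a paths under root whose duration exceeds max_s seconds.

    duration_of: callable path -> seconds. With librosa, as above:
        def duration_of(p):
            y, sr = librosa.load(p, sr=None)
            return librosa.get_duration(y=y, sr=sr)
    """
    paths = sorted(glob.glob(f'{root}/**/*.m4a', recursive=True))
    return [p for p in paths if duration_of(p) > max_s]
```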
Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?
Actually, when running demo.py, you only need a single file. I have uploaded it (audio_demo.txt) along with the reference image. By the way, please feel free to download the testing audio data, which is provided on Google Drive.
Originally posted by @uniBruce in #3 (comment)
Thank you for your great work!
(1) What data exactly is stored in '/lists/mouth_ldmk.txt' and './lists/mouth_ldmk_test.txt', and in what format is it stored?
(2) Is './lists/phoneme_list.txt' necessary? How was it generated?
Hi,
I downloaded part of the dataset and found that the audio snippets do not correspond the way I expected.
I thought the number in the filename indicated the content of the audio, e.g., 001.mp4 in disgusted should have the same content as 001.mp4 in neutral. Unfortunately, they are not the same in M003. I also don't know why 30 snippets are provided for emotions other than neutral, while there are 40 snippets for neutral.
Could you explain why this is, and provide the correspondence between the different snippets?
It is really hard to use your dataset without this correspondence.
Thanks
I downloaded the full Part 0 from Google Drive, and am planning to use this dataset for my research. However, while preprocessing it I noticed that many views are missing from speaker W017 (it looks like the only view that exists is "down"). Is this intended? Do you plan on releasing these extra views in the future? Thanks.
Hi,
Thanks for sharing, but it seems that the MEAD download link on Baidu Drive is invalid now. Could you please provide a new one?
Thanks,
Julian
Hi, congrats on the fantastic dataset! It would be really helpful if it were possible to extract 3D data from these videos. Do you have the camera calibrations for the cameras you used?
Thank you for your excellent MEAD work. Could you provide a download link to the MEAD test set for research purposes? Thanks for your reply.
Hi, thank you for this great dataset. But when I tried to train this work step by step, I ran into some problems.
(1) Audio2Landmark:
dataloader: how do I generate audio_demo.list? Where is ./lists/landmarks.pickle? Could you tell me how to generate these files? E.g., in audio_demo.txt, why is it
"M003_07_1_output_01/000.pickle reference.jpg"?
(2) video_list.txt:
How do I generate video_list? E.g., in video_list.txt, why is it "M003/M003_01_1_output_01/000.jpg"?
Could you release the generation code?
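Not the authors' script, but a sketch of how such a frame index could be regenerated from a directory of extracted frames, assuming the speaker/clip/frame layout implied by the example line "M003/M003_01_1_output_01/000.jpg":

```python
import glob
import os

def build_video_list(frames_root):
    """List frame paths relative to frames_root, one per line of a
    video_list.txt-style index, e.g. 'M003/M003_01_1_output_01/000.jpg'."""
    pattern = os.path.join(frames_root, '*', '*', '*.jpg')
    return [os.path.relpath(p, frames_root)
            for p in sorted(glob.glob(pattern))]
```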
I found some bugs in demo.py. For example, a parameter of draw_heatmap_from_78_landmark is missing when the function is called, and the parameters w and h are inverted.
When I tried to fix these bugs, the following output appeared:
N2E result:
It is obvious that the final result, i.e. the Refinement result, has too much noise. I wonder whether the N2E and Audio2Landmark networks are working normally, and how to get an accurate output from the Refinement network.
In case you want to download this dataset and don't want to click through the GDrive links one by one, I did it for you. You are welcome :-). This only includes the videos, no audio files. GDrive might still deny the download, but if you try multiple times over the course of several days, you'll probably be able to download it all eventually.
pip install gdown
then run this:
gdown --fuzzy https://drive.google.com/file/d/1gZnRqkub1Zt_ao0Jgf4URAH4ufI4PIU_/view?usp=sharing -O W040.tar
gdown --fuzzy https://drive.google.com/file/d/1wCmuzDaD1bkAjSbPiKb-rJCqBQVlZflC/view?usp=sharing -O W038.tar
gdown --fuzzy https://drive.google.com/file/d/16CEn_2fjnOMcegXgiXKNw1nDxydP_FbZ/view?usp=sharing -O W037.tar
gdown --fuzzy https://drive.google.com/file/d/1pe4CmrselXMFFj1JputEF8_PqddtZWi1/view?usp=sharing -O W036.tar
gdown --fuzzy https://drive.google.com/file/d/1Io8xMQt3-9wWj_OZ1e7o1BYChj6DvdGQ/view?usp=sharing -O W035.tar
gdown --fuzzy https://drive.google.com/file/d/1u5zSxa3zOwPFdXMc8Teu929jgOnXk5pD/view?usp=sharing -O W033.tar
gdown --fuzzy https://drive.google.com/file/d/1yE6ArRNJdTEDQBcxLZEPbfhiY46hH_bW/view?usp=sharing -O W029.tar
gdown --fuzzy https://drive.google.com/file/d/1WuAjnxNWuZaC7Hb1S20W-X-LvsqCqFdm/view?usp=sharing -O W028.tar
gdown --fuzzy https://drive.google.com/file/d/10NElRhzstGYCeNghdI6StcFaFU2EexqV/view?usp=sharing -O W026.tar
gdown --fuzzy https://drive.google.com/file/d/18Ac3XdJJqkD7iBxRT9l82uuNdd_0KWaH/view?usp=sharing -O W025.tar
gdown --fuzzy https://drive.google.com/file/d/1lKb_wv200Ss3QecRa-MVDsddYaG245UI/view?usp=sharing -O W024.tar
gdown --fuzzy https://drive.google.com/file/d/16LzfPpQYl6_Yc2bCfwQyJ9Tg5mJuhUG6/view?usp=sharing -O W023.tar
gdown --fuzzy https://drive.google.com/file/d/15KGkel1sAKmcjkHICm5gmP_b2QOKEK7J/view?usp=sharing -O W021-2.tar
gdown --fuzzy https://drive.google.com/file/d/1oPFVfVhtllT39cRkrut7MbPksjnj6XjN/view?usp=sharing -O W021-1.tar
gdown --fuzzy https://drive.google.com/file/d/1li2eNgDD9V6HSlhU38S7rEM8jT8o9Tz5/view?usp=sharing -O W019.tar
gdown --fuzzy https://drive.google.com/file/d/1R4w8WQYXLd145W4WNoRjTF04NeiRW8A_/view?usp=sharing -O W018.tar
gdown --fuzzy https://drive.google.com/file/d/1NfcQSYQXwW_ypENtxtJIqejaCHxutYMO/view?usp=sharing -O W017.tar
gdown --fuzzy https://drive.google.com/file/d/1LXuJ80T2i5k6HKV2ZAclV-bt-JTE0Bie/view?usp=sharing -O W016.tar
gdown --fuzzy https://drive.google.com/file/d/1QQN8lCFXez0NlZDJF77XL736jt2_H4CZ/view?usp=sharing -O W015.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W014.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W011.tar
gdown --fuzzy https://drive.google.com/file/d/1UFCS0yRaAAP4Aqr1QoaI_Pak9x_A0Qa3/view?usp=sharing -O W009.tar
gdown --fuzzy https://drive.google.com/file/d/1CFzV_mT509KiaPc0zdOMdZCW22Gg_BHV/view?usp=sharing -O M042-2.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M042-1.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M041.tar
gdown --fuzzy https://drive.google.com/file/d/1oUaIV81pWedu2aJ7-Zlc1VrKeY5or8uj/view?usp=sharing -O M040.tar
gdown --fuzzy https://drive.google.com/file/d/1vny_Nk6Vg0VWWaQxMQMpXYahvdYRLoMR/view?usp=sharing -O M039.tar
gdown --fuzzy https://drive.google.com/file/d/1Q6o8eR2rN4hjxlwbcqP6NzCJlTLNypc4/view?usp=sharing -O M037.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M035.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M034.tar
gdown --fuzzy https://drive.google.com/file/d/1vS3QgKSYhGTGAkTvuVMua-cAlgtNfvO5/view?usp=sharing -O M033.tar
gdown --fuzzy https://drive.google.com/file/d/1w6-R6WneiRBxG2GMbZU3yAezSxAgNI8T/view?usp=sharing -O M032-2.tar
gdown --fuzzy https://drive.google.com/file/d/175mieJBwjEbiB6yF6LktFZCuuFtVHDCy/view?usp=sharing -O M032-1.tar
gdown --fuzzy https://drive.google.com/file/d/1SSuDkrby0Ev_Ssj3M4s5hsR9WkbMNwgV/view?usp=sharing -O M031.tar
gdown --fuzzy https://drive.google.com/file/d/1FQPmuFTqwfKuVuFAtolX8Ge2sM7QW7BY/view?usp=sharing -O M030.tar
gdown --fuzzy https://drive.google.com/file/d/11EG6eC03tX97rLCObUzxJH6mJu0_a1GP/view?usp=sharing -O M029.tar
gdown --fuzzy https://drive.google.com/file/d/1oguNuh3ev8-6KjeVKuCZ3l0WnwdjkFAM/view?usp=sharing -O M028.tar
gdown --fuzzy https://drive.google.com/file/d/1mjtAOh_XAGmOB6CrnuJAbhPUMmBZBSaK/view?usp=sharing -O M027.tar
gdown --fuzzy https://drive.google.com/file/d/1t9zhGWUGHvb1MalM8L7VqF92iRUT-RIa/view?usp=sharing -O M026-2.tar
gdown --fuzzy https://drive.google.com/file/d/11xVIAJbErEf9RrniqhamHGVNdaGa9-Kd/view?usp=sharing -O M026-1.tar
gdown --fuzzy https://drive.google.com/file/d/1sICjHruXFSj3ib29iShEw0LMOVluTeqj/view?usp=sharing -O M025.tar
gdown --fuzzy https://drive.google.com/file/d/1VDQxx7saZLOIMl-7fXq8TuZFxzDh-vZ9/view?usp=sharing -O M024.tar
gdown --fuzzy https://drive.google.com/file/d/1TMj6Lks3IKtn1T5DI8jUPDLQYx5BZ2RI/view?usp=sharing -O M023.tar
gdown --fuzzy https://drive.google.com/file/d/1R856cJEgbw0mv4mWsCDZRBHa03DH0oP2/view?usp=sharing -O M022.tar
gdown --fuzzy https://drive.google.com/file/d/1YPUech64LhZTmQhj0irQ6oxOxTOWngxf/view?usp=sharing -O M019.tar
gdown --fuzzy https://drive.google.com/file/d/1HdpRfyKvU3nT8B6XwYjUxKxbzG7FHLI_/view?usp=sharing -O M013.tar
gdown --fuzzy https://drive.google.com/file/d/1r_02Rr7qd9zcXGcuOWmkCn-X4gwNdTSY/view?usp=sharing -O M012.tar
gdown --fuzzy https://drive.google.com/file/d/1PlYwxkgRwrDISiIsOYuP6_Yw8RxCImfq/view?usp=sharing -O M011.tar
gdown --fuzzy https://drive.google.com/file/d/1tJMq4W-dQ4Z6IMEs5kK0lYd0s6UjAmrz/view?usp=sharing -O M009.tar
gdown --fuzzy https://drive.google.com/file/d/1HO6n_jIyZXiZBmywk3vN7LcNSeqUWnCh/view?usp=sharing -O M007.tar
gdown --fuzzy https://drive.google.com/file/d/1ubqPoVb2f2dPEqBm07fAr-H-mqiSHoOV/view?usp=sharing -O M005.tar
gdown --fuzzy https://drive.google.com/file/d/1_sTB4DoljwU4IY2YTVkRQ_8bj28Le9-c/view?usp=sharing -O M003.tar
You are welcome
Hi, thanks for your work! However, your code does not include the code that generates images from the videos. I think this code is important for the work. Could you release it? Thank you very much!
Thanks for releasing the MEAD dataset. Can you please mention the subject IDs of the test set used in the paper?
Hi
I found that the link to the dataset provided in the article and in this repo is broken. https://wywu.github.io/projects/MEAD/MEAD.html
Could you provide a new one?
Does MEAD provide text caption ground truth?
Hello,
Thank you for sharing your work. I tried to run the demo (Refinement/demo.py) but found that some files are missing, especially "./lists/pca.pickle". I think I understand what it means, but there is no code showing how to compute it from the training data; could you please share the pickle file if possible?
Also, when I tried to run Refinement/demo.py, there seems to be no page2em() in Refinement/utils_parallel.py. Can I use page2emo() from Refinement/trainer_demo.py instead?
Thank you!
Does anyone have Baidu Drive resources?