MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
License: MIT License
Hi,
I really appreciate that you released the source code, but would you mind testing demo.py yourselves? There are several issues when running it:
(1) After cd Refinement as suggested, several paths are wrong.
(2) The audio dim (the size of the phoneme) is 28, not the 97 in your config_demo.yaml.
(3) Line 290 in data.py should be './MFCC_test' instead of '/MFCC_test'.
(4) Line 308 in data.py: sample is a string that looks like 'M003_73_1_output_03/054.pickle reference.jpg'. I don't think sample[0] in the following lines refers to the character 'M'. Currently I split the sample (which is a string) by space.
(5) Line 60 in trainer_demo.py: heatmap = self.transform(draw_heatmap_from_78_landmark(fake_ldmk, 384, 384)). You forgot the batchsize argument (the first argument of draw_heatmap_from_78_landmark). I assumed batchsize=1 during the demo.
(6) Following (5), fake_ldmk should be reshaped to a 2-D numpy array of shape (1, 78).
(7) Following (6), draw_heatmap_from_78_landmark should return a numpy array / PIL image, but the current function returns a tensor.
(8) Line 66 in demo.py: trainer.page2emo instead of trainer.page2em.
(9) In demo.py: image_dir = os.path.join(image_directory, str(em_fc)) instead of image_dir = os.path.join(image_directory, em_fc).
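A minimal sketch of the fix described in item (4), assuming sample is the raw line read from the test list (the example string is taken from this report; the variable names are mine):

```python
# Hypothetical fix for item (4): sample is one space-separated string,
# so split it instead of indexing single characters like sample[0].
sample = 'M003_73_1_output_03/054.pickle reference.jpg'
pickle_path, ref_image = sample.split()
print(pickle_path)  # -> M003_73_1_output_03/054.pickle
print(ref_image)    # -> reference.jpg
```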
I really appreciate your great work, but would be even more grateful if you could test the demo program first. Thanks!
Thank you for your contribution and for publishing part of the data. I was wondering when Part 1 will be available?
Hi,
Thanks for releasing the dataset and the code. The link for the pretrained models needed for running the test code is currently not working, could you kindly update the repository with the correct hyperlink?
Hi, when I use preprocess_mfcc.py to create audio.pickle, I find that the number of pickles does not match the number of images extracted from the video. E.g., angry/level_1/001.m4a creates 97 pickles, but video/front/angry/level_1/001.mp4 yields 98 images.
Have you encountered this problem, and how did you solve it?
Thank you!
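If it helps anyone hitting the same off-by-one, one workaround (my own sketch, not the authors' code) is simply to truncate both sequences to the common length before pairing them:

```python
def align_pairs(audio_items, frame_items):
    """Pair MFCC pickles with video frames, dropping trailing extras
    so an off-by-one count (e.g. 97 pickles vs 98 frames) is tolerated."""
    n = min(len(audio_items), len(frame_items))
    return list(zip(audio_items[:n], frame_items[:n]))

# 97 pickles vs 98 frames -> 97 aligned pairs
pairs = align_pairs(list(range(97)), list(range(98)))
print(len(pairs))  # -> 97
```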
Hi, thank you for your great work!
What is the license for the MEAD dataset?
Is it the same as the code (MIT)?
Hello,
https://wywu.github.io/projects/MEAD/MEAD.html
The [Download-Part0 (Baidu Drive)] dataset link has expired. Could you share it again?
I am trying to download video.tar of the M019 folder from Google Drive. Untarring results in an error saying the tar is corrupted. I tried downloading multiple times and met with the same error.
Were the cameras that shot these videos placed at fixed positions in the light cage? Apart from the change in angle, does the distance between the camera and the subject vary? Could you provide the specific camera parameters?
Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?
Thank you for making your dataset and method available. I would like to ask if the text of the corpus in txt (or other form) is available somewhere, or do we need to take it from the supplementary material of the pdf?
Thanks for your efforts in producing the MEAD dataset. We are looking forward to working with it.
It seems to me that some audio clips are longer than the maximum duration suggested in the paper. The supplementary material Fig. 1 plots, as well as the text on the first page, suggest that the maximum "sentence duration" is 7 seconds.
https://wywu.github.io/projects/MEAD/support/MEAD-supp.pdf
From my example below, I believe you should be able to replicate a case where the duration is 17 seconds. To replicate, download the audio.tar file for M034 from Google Drive, extract it, and run the following Python code:
>>> import librosa
>>> sentence_path = './fear/level_3/028.m4a'
>>> y, sr = librosa.load(sentence_path, sr=None)
>>> librosa.get_duration(y=y, sr=sr)
17.237333333333332
>>> sr
48000
>>> librosa.__version__
'0.8.1'
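To check how widespread this is, here is a small sketch that scans an extracted audio.tar tree for over-long clips. The layout pattern is an assumption based on the path above, and the duration callback is injected so you can plug in the librosa calls from the REPL session:

```python
import glob

def long_clips(root, duration_of, max_s=7.0):
    """Return .m4a paths under root whose duration exceeds max_s seconds.

    duration_of: callable path -> seconds. With librosa, as above:
        def duration_of(p):
            y, sr = librosa.load(p, sr=None)
            return librosa.get_duration(y=y, sr=sr)
    """
    paths = sorted(glob.glob(f'{root}/**/*.m4a', recursive=True))
    return [p for p in paths if duration_of(p) > max_s]
```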
Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?
Actually, when running demo.py, you only need a single file. I have uploaded it (audio_demo.txt) along with the reference image. By the way, please feel free to download the testing audio data, which is provided on Google Drive.
Originally posted by @uniBruce in #3 (comment)
Thank you for your great work!
(1) What data exactly is stored in '/lists/mouth_ldmk.txt' and './lists/mouth_ldmk_test.txt', and in what format is it stored?
(2) Is './lists/phoneme_list.txt' necessary? How was it generated?
Hi,
I downloaded part of the dataset and found that the audio snippets do not correspond the way I expected.
I thought the number in the filename indicated the content of the audio, e.g., 001.mp4 in disgusted should have the same content as 001.mp4 in neutral. Unfortunately, they are not the same in M003. I also don't know why 30 snippets are provided for emotions other than neutral, while there are 40 snippets for neutral.
Could you explain why this is, and provide the correspondence between the different snippets?
It is really hard to use your dataset without this correspondence.
Thanks
I downloaded the full Part 0 from Google Drive, and am planning to use this dataset for my research. However, while preprocessing it I noticed that many views are missing from speaker W017 (it looks like the only view that exists is "down"). Is this intended? Do you plan on releasing these extra views in the future? Thanks.
Hi,
Thanks for sharing, but it seems that the MEAD download link on Baidu Drive is invalid now. Could you please provide a new one?
Thanks,
Julian
Hi, congrats on the fantastic dataset! It would be really helpful if it were possible to extract 3D data from these videos. Do you have the camera calibrations for the cameras you used?
Thank you for your excellent MEAD work. Could you provide a download link to the MEAD test set for research purposes? Thanks for your reply.
Hi, thank you for this great dataset. But when I tried to train this work step by step, I ran into some problems.
(1) Audio2Landmark:
dataloader: how do I generate audio_demo.list? Where is ./lists/landmarks.pickle? Could you tell me how to generate these files? E.g., in audio_demo.txt, why is it
"M003_07_1_output_01/000.pickle reference.jpg"?
(2) video_list.txt:
How do I generate video_list? E.g., in video_list.txt, why is it "M003/M003_01_1_output_01/000.jpg"?
Could you release the generation code?
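Not the authors' script, but a sketch of how such a frame index could be regenerated from a directory of extracted frames, assuming the speaker/clip/frame layout implied by the example line "M003/M003_01_1_output_01/000.jpg":

```python
import glob
import os

def build_video_list(frames_root):
    """List frame paths relative to frames_root, one per line of a
    video_list.txt-style index, e.g. 'M003/M003_01_1_output_01/000.jpg'."""
    pattern = os.path.join(frames_root, '*', '*', '*.jpg')
    return [os.path.relpath(p, frames_root)
            for p in sorted(glob.glob(pattern))]
```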
I found some bugs in demo.py. For example, a parameter of draw_heatmap_from_78_landmark is missing when the function is called, and the parameters w and h are inverted.
When I tried to fix these bugs, the following output appeared:
N2E result:
It is obvious that the final result, i.e. the Refinement result, has too much noise. I wonder whether the N2E and Audio2Landmark networks are working normally, and how to get an accurate output from the Refinement network.
In case you want to download this dataset and don't want to click through the GDrive links one by one, I did it for you. You are welcome :-). This only includes the videos, no audio files. GDrive might still deny the download, but if you try multiple times over the course of several days, you'll probably be able to download it all eventually.
pip install gdown
then run this:
gdown --fuzzy https://drive.google.com/file/d/1gZnRqkub1Zt_ao0Jgf4URAH4ufI4PIU_/view?usp=sharing -O W040.tar
gdown --fuzzy https://drive.google.com/file/d/1wCmuzDaD1bkAjSbPiKb-rJCqBQVlZflC/view?usp=sharing -O W038.tar
gdown --fuzzy https://drive.google.com/file/d/16CEn_2fjnOMcegXgiXKNw1nDxydP_FbZ/view?usp=sharing -O W037.tar
gdown --fuzzy https://drive.google.com/file/d/1pe4CmrselXMFFj1JputEF8_PqddtZWi1/view?usp=sharing -O W036.tar
gdown --fuzzy https://drive.google.com/file/d/1Io8xMQt3-9wWj_OZ1e7o1BYChj6DvdGQ/view?usp=sharing -O W035.tar
gdown --fuzzy https://drive.google.com/file/d/1u5zSxa3zOwPFdXMc8Teu929jgOnXk5pD/view?usp=sharing -O W033.tar
gdown --fuzzy https://drive.google.com/file/d/1yE6ArRNJdTEDQBcxLZEPbfhiY46hH_bW/view?usp=sharing -O W029.tar
gdown --fuzzy https://drive.google.com/file/d/1WuAjnxNWuZaC7Hb1S20W-X-LvsqCqFdm/view?usp=sharing -O W028.tar
gdown --fuzzy https://drive.google.com/file/d/10NElRhzstGYCeNghdI6StcFaFU2EexqV/view?usp=sharing -O W026.tar
gdown --fuzzy https://drive.google.com/file/d/18Ac3XdJJqkD7iBxRT9l82uuNdd_0KWaH/view?usp=sharing -O W025.tar
gdown --fuzzy https://drive.google.com/file/d/1lKb_wv200Ss3QecRa-MVDsddYaG245UI/view?usp=sharing -O W024.tar
gdown --fuzzy https://drive.google.com/file/d/16LzfPpQYl6_Yc2bCfwQyJ9Tg5mJuhUG6/view?usp=sharing -O W023.tar
gdown --fuzzy https://drive.google.com/file/d/15KGkel1sAKmcjkHICm5gmP_b2QOKEK7J/view?usp=sharing -O W021-2.tar
gdown --fuzzy https://drive.google.com/file/d/1oPFVfVhtllT39cRkrut7MbPksjnj6XjN/view?usp=sharing -O W021-1.tar
gdown --fuzzy https://drive.google.com/file/d/1li2eNgDD9V6HSlhU38S7rEM8jT8o9Tz5/view?usp=sharing -O W019.tar
gdown --fuzzy https://drive.google.com/file/d/1R4w8WQYXLd145W4WNoRjTF04NeiRW8A_/view?usp=sharing -O W018.tar
gdown --fuzzy https://drive.google.com/file/d/1NfcQSYQXwW_ypENtxtJIqejaCHxutYMO/view?usp=sharing -O W017.tar
gdown --fuzzy https://drive.google.com/file/d/1LXuJ80T2i5k6HKV2ZAclV-bt-JTE0Bie/view?usp=sharing -O W016.tar
gdown --fuzzy https://drive.google.com/file/d/1QQN8lCFXez0NlZDJF77XL736jt2_H4CZ/view?usp=sharing -O W015.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W014.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W011.tar
gdown --fuzzy https://drive.google.com/file/d/1UFCS0yRaAAP4Aqr1QoaI_Pak9x_A0Qa3/view?usp=sharing -O W009.tar
gdown --fuzzy https://drive.google.com/file/d/1CFzV_mT509KiaPc0zdOMdZCW22Gg_BHV/view?usp=sharing -O M042-2.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M042-1.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M041.tar
gdown --fuzzy https://drive.google.com/file/d/1oUaIV81pWedu2aJ7-Zlc1VrKeY5or8uj/view?usp=sharing -O M040.tar
gdown --fuzzy https://drive.google.com/file/d/1vny_Nk6Vg0VWWaQxMQMpXYahvdYRLoMR/view?usp=sharing -O M039.tar
gdown --fuzzy https://drive.google.com/file/d/1Q6o8eR2rN4hjxlwbcqP6NzCJlTLNypc4/view?usp=sharing -O M037.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M035.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M034.tar
gdown --fuzzy https://drive.google.com/file/d/1vS3QgKSYhGTGAkTvuVMua-cAlgtNfvO5/view?usp=sharing -O M033.tar
gdown --fuzzy https://drive.google.com/file/d/1w6-R6WneiRBxG2GMbZU3yAezSxAgNI8T/view?usp=sharing -O M032-2.tar
gdown --fuzzy https://drive.google.com/file/d/175mieJBwjEbiB6yF6LktFZCuuFtVHDCy/view?usp=sharing -O M032-1.tar
gdown --fuzzy https://drive.google.com/file/d/1SSuDkrby0Ev_Ssj3M4s5hsR9WkbMNwgV/view?usp=sharing -O M031.tar
gdown --fuzzy https://drive.google.com/file/d/1FQPmuFTqwfKuVuFAtolX8Ge2sM7QW7BY/view?usp=sharing -O M030.tar
gdown --fuzzy https://drive.google.com/file/d/11EG6eC03tX97rLCObUzxJH6mJu0_a1GP/view?usp=sharing -O M029.tar
gdown --fuzzy https://drive.google.com/file/d/1oguNuh3ev8-6KjeVKuCZ3l0WnwdjkFAM/view?usp=sharing -O M028.tar
gdown --fuzzy https://drive.google.com/file/d/1mjtAOh_XAGmOB6CrnuJAbhPUMmBZBSaK/view?usp=sharing -O M027.tar
gdown --fuzzy https://drive.google.com/file/d/1t9zhGWUGHvb1MalM8L7VqF92iRUT-RIa/view?usp=sharing -O M026-2.tar
gdown --fuzzy https://drive.google.com/file/d/11xVIAJbErEf9RrniqhamHGVNdaGa9-Kd/view?usp=sharing -O M026-1.tar
gdown --fuzzy https://drive.google.com/file/d/1sICjHruXFSj3ib29iShEw0LMOVluTeqj/view?usp=sharing -O M025.tar
gdown --fuzzy https://drive.google.com/file/d/1VDQxx7saZLOIMl-7fXq8TuZFxzDh-vZ9/view?usp=sharing -O M024.tar
gdown --fuzzy https://drive.google.com/file/d/1TMj6Lks3IKtn1T5DI8jUPDLQYx5BZ2RI/view?usp=sharing -O M023.tar
gdown --fuzzy https://drive.google.com/file/d/1R856cJEgbw0mv4mWsCDZRBHa03DH0oP2/view?usp=sharing -O M022.tar
gdown --fuzzy https://drive.google.com/file/d/1YPUech64LhZTmQhj0irQ6oxOxTOWngxf/view?usp=sharing -O M019.tar
gdown --fuzzy https://drive.google.com/file/d/1HdpRfyKvU3nT8B6XwYjUxKxbzG7FHLI_/view?usp=sharing -O M013.tar
gdown --fuzzy https://drive.google.com/file/d/1r_02Rr7qd9zcXGcuOWmkCn-X4gwNdTSY/view?usp=sharing -O M012.tar
gdown --fuzzy https://drive.google.com/file/d/1PlYwxkgRwrDISiIsOYuP6_Yw8RxCImfq/view?usp=sharing -O M011.tar
gdown --fuzzy https://drive.google.com/file/d/1tJMq4W-dQ4Z6IMEs5kK0lYd0s6UjAmrz/view?usp=sharing -O M009.tar
gdown --fuzzy https://drive.google.com/file/d/1HO6n_jIyZXiZBmywk3vN7LcNSeqUWnCh/view?usp=sharing -O M007.tar
gdown --fuzzy https://drive.google.com/file/d/1ubqPoVb2f2dPEqBm07fAr-H-mqiSHoOV/view?usp=sharing -O M005.tar
gdown --fuzzy https://drive.google.com/file/d/1_sTB4DoljwU4IY2YTVkRQ_8bj28Le9-c/view?usp=sharing -O M003.tar
You are welcome
Hi, thanks for your work! However, your code does not include the code that generates images from the videos. I think this code is important for the work. Could you release it? Thank you very much!
Thanks for releasing the MEAD dataset. Can you please mention the subject IDs of the test set used in the paper?
Hi
I found that the link to the dataset provided in the article and in this repo is broken. https://wywu.github.io/projects/MEAD/MEAD.html
Could you provide a new one?
Does MEAD provide text caption ground truth?
Hello,
Thank you for sharing your work. I tried to run the demo (Refinement/demo.py) but found that some files are missing, especially "./lists/pca.pickle". I think I understand what it means, but there is no code showing how to compute it from the training data; could you please share the pickle file if possible?
Also, when I tried to run Refinement/demo.py, there seems to be no page2em() in Refinement/utils_parallel.py. Can I use page2emo() from Refinement/trainer_demo.py instead?
Thank you!
Does anyone have Baidu Drive resources?