mead's People

Contributors

unibruce

mead's Issues

A few issues when running the demo

Hi:

I really appreciate that you released the source code, but would you mind testing demo.py yourselves? There are several issues when running demo.py:

(1) After running cd Refinement as suggested, several paths are wrong.
(2) The audio dim (the size of phoneme) is 28, not the 97 given in your config_demo.yaml.
(3) Line 290 in data.py should be './MFCC_test' instead of '/MFCC_test' in your code.
(4) On line 308 in data.py, sample is a string that looks like 'M003_73_1_output_03/054.pickle reference.jpg'. I don't think sample[0] in the following lines is meant to refer to the character M. For now I split sample (which is a string) on the space.
(5) Line 60 in trainer_demo.py reads heatmap = self.transform(draw_heatmap_from_78_landmark(fake_ldmk, 384, 384)). You forgot the batchsize argument (the 1st argument of the draw_heatmap_from_78_landmark function). I assume batchsize=1 during the demo.
(6) Following (5), fake_ldmk should be reshaped to a 2-D numpy array of shape (1, 78).
(7) Following (6), draw_heatmap_from_78_landmark should return a numpy array in PIL image format, but the current function returns a tensor.
(8) Line 66 in demo.py should call trainer.page2emo instead of trainer.page2em.
(9) In demo.py, it should be image_dir = os.path.join(image_directory, str(em_fc)) instead of image_dir = os.path.join(image_directory, em_fc).
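
For item (4), the split can be sketched as below. This is only an illustration, assuming each line of the list file holds an audio pickle path and a reference image name separated by whitespace; the helper name parse_sample is hypothetical and does not come from the repository.

```python
def parse_sample(sample: str) -> tuple[str, str]:
    """Split one list-file line into (audio pickle path, reference image)."""
    # Assumption: exactly two whitespace-separated fields per line.
    audio_path, reference_image = sample.split()
    return audio_path, reference_image


audio_path, reference_image = parse_sample(
    "M003_73_1_output_03/054.pickle reference.jpg"
)
print(audio_path)       # M003_73_1_output_03/054.pickle
print(reference_image)  # reference.jpg
```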

I really appreciate your great work, but I would be even more grateful if you could test the demo program first. Thanks!

Data Part1 release date

Thank you for your contribution and publishing part of the data. I was wondering when Part1 will be available?

Link to Pretrained models

Hi,
Thanks for releasing the dataset and the code. The link for the pretrained models needed for running the test code is currently not working. Could you kindly update the repository with the correct hyperlink?

Audio pickles do not match images from video

Hi, when I use preprocess_mfcc.py to create the audio pickles, I find that the number of pickles does not match the number of images from the video. For example, angry/level_1/001.m4a creates 97 pickles, but video/front/angry/level_1/001.mp4 creates 98 images.
Have you encountered this problem, and how can it be solved?
Thank you!
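
One possible workaround is sketched below. It is an assumption, not something confirmed by the authors: pair frames and MFCC pickles by index and drop whatever is left over at the end. The helper name and the file names are hypothetical.

```python
def pair_frames_with_audio(image_files, pickle_files):
    """Pair sorted frames and MFCC pickles by index, dropping the surplus."""
    images = sorted(image_files)
    pickles = sorted(pickle_files)
    n = min(len(images), len(pickles))  # 98 images vs 97 pickles -> keep 97
    return list(zip(images[:n], pickles[:n]))


# Hypothetical file names mirroring the counts reported above.
images = [f"{i:03d}.jpg" for i in range(98)]
pickles = [f"{i:03d}.pickle" for i in range(97)]
pairs = pair_frames_with_audio(images, pickles)
print(len(pairs))  # 97
```

Whether the extra frame belongs at the start or the end of the clip is not known from this report, so truncating at the end is only a guess.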

Mead Data License

Hi, Thank you for your great work!

What is the license for the Mead Dataset?
Is it the same as the code (MIT)?

M019 Video.tar on Google Drive Corrupted

I am trying to download video.tar from the M019 folder on Google Drive. Untarring results in an error saying the tar is corrupted. I tried downloading multiple times and met the same error.
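
A rough way to check a download before untarring it, using only Python's standard tarfile module, is sketched below. It catches files that are not tar archives at all and members whose data ends early, but it is not a full checksum. The function name is hypothetical, and the demo files are created on the fly purely for illustration.

```python
import os
import tarfile
import tempfile


def tar_is_readable(path, chunk_size=1 << 20):
    """Return True if the archive opens and every regular file reads fully."""
    try:
        with tarfile.open(path) as archive:
            for member in archive:
                if not member.isfile():
                    continue
                stream = archive.extractfile(member)
                while stream.read(chunk_size):  # read and discard the data
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError):
        return False


# Illustration with throwaway files: one bogus "archive" and one valid one.
with tempfile.TemporaryDirectory() as tmp:
    bad = os.path.join(tmp, "not_a_tar.tar")
    with open(bad, "wb") as f:
        f.write(b"this is not a tar archive")
    payload = os.path.join(tmp, "payload.txt")
    with open(payload, "w") as f:
        f.write("hello")
    good = os.path.join(tmp, "good.tar")
    with tarfile.open(good, "w") as archive:
        archive.add(payload, arcname="payload.txt")
    bad_ok = tar_is_readable(bad)
    good_ok = tar_is_readable(good)
print(bad_ok, good_ok)  # False True
```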

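Camera parameters

Were the cameras that recorded these videos placed at fixed positions in a light cage? Apart from the change in angle, does the distance between the camera and the subject change? Could you provide the specific camera parameters?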

Demo code error

Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?

Speech Corpus Text Files

Thank you for making your dataset and method available. I would like to ask if the text of the corpus in txt (or other form) is available somewhere, or do we need to take it from the supplementary material of the pdf?

Audio clips with duration >7 seconds

Thanks for your efforts in producing the MEAD dataset. We are looking forward to working with it.

It seems to me that some audio clips are longer than the maximum duration suggested in the paper. Both the plots in Fig. 1 of the supplementary material and the text on its first page suggest that the maximum "sentence duration" is 7 seconds.
https://wywu.github.io/projects/MEAD/support/MEAD-supp.pdf

From my example below I believe you should be able to replicate a case where the duration is 17 seconds. To replicate, you could try downloading the audio.tar file for M034 from Google Drive, extracting and then running the following python code:

>>> import librosa
>>> sentence_path = './fear/level_3/028.m4a'
>>> y, sr = librosa.load(sentence_path, sr=None)
>>> librosa.get_duration(y=y, sr=sr)
17.237333333333332
>>> sr
48000
>>> librosa.__version__
'0.8.1'
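
Once durations have been collected this way, listing every clip over the 7-second limit could look like the sketch below. The function name is hypothetical, and only the 028.m4a value comes from the session above; the other entry is made up for the example.

```python
MAX_SENTENCE_SECONDS = 7.0  # maximum suggested by the supplementary material


def clips_over_limit(durations, limit=MAX_SENTENCE_SECONDS):
    """Return (path, seconds) pairs longer than `limit`, longest first."""
    long_clips = [(path, secs) for path, secs in durations.items() if secs > limit]
    return sorted(long_clips, key=lambda item: item[1], reverse=True)


durations = {
    "./fear/level_3/028.m4a": 17.237333333333332,  # measured above
    "./fear/level_3/000.m4a": 3.5,  # hypothetical value, for illustration only
}
print(clips_over_limit(durations))
# [('./fear/level_3/028.m4a', 17.237333333333332)]
```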

The demo.py file does not work well; how do I set the 'gan_path' parameter?

Hi, I am unable to run the demo code to generate the facial images. When I try to run demo.py under the Refinement module after modifying the paths to the model files and list files appropriately, the files audio_test_all.txt and video_list_test.txt (specified in config_demo.yml) appear to be missing. Also, what should the variable gan_path in config_demo.yml be set to?

Actually, when running demo.py, you just need a single file. I have uploaded it, namely audio_demo.txt, and also the reference image. By the way, please feel free to download the testing audio data, which is provided on Google Drive.

Originally posted by @uniBruce in #3 (comment)

'/lists/mouth_ldmk.txt' and './lists/mouth_ldmk_test.txt'

Thank you for your great work!
(1) What data exactly is stored in '/lists/mouth_ldmk.txt' and './lists/mouth_ldmk_test.txt'? I hope you can also tell me its storage format.
(2) Is './lists/phoneme_list.txt' necessary? How was it generated?

The correspondence of audio snippets between different emotions

Hi,

I downloaded part of the dataset and found that the audio snippets do not correspond to each other in the way I expected.

I thought the number in the filename indicates the content of the audio. E.g. 001.mp4 in disgusted should be the same content as 001.mp4 in neutral. But unfortunately, they are not the same in M003. And I don't know why 30 snippets are provided for emotions other than neutral, but there are 40 snippets for neutral.

Could you explain why that is? And could you provide the correspondence between the different snippets?
It is really hard to use your dataset if the correspondence is not provided.

Thanks

MEAD - Missing views for speaker W017

I downloaded the full Part 0 from Google Drive, and am planning to use this dataset for my research. However, while preprocessing it I noticed that many views are missing from speaker W017 (it looks like the only view that exists is "down"). Is this intended? Do you plan on releasing these extra views in the future? Thanks.

Baidu Drive download link expired

Hi,

Thanks for sharing, but it seems that the MEAD download link from Baidu Drive is invalid now. Would you please provide a new one?

Thanks,
Julian

Camera Calibrations

Hi, congrats on the fantastic dataset! It would be really helpful if it were possible to extract 3D data from these videos. Do you have the camera calibrations for the cameras you used?

How to generate the .list or pickle

Hi, thank you for this great dataset. When I try to train this work step by step, I run into some problems.
(1) Audio2Landmark:
dataloader: how do I generate audio_demo.list? Where is ./lists/landmarks.pickle? Could you tell me how to generate these files? For example, why is the entry in audio_demo.txt
"M003_07_1_output_01/000.pickle reference.jpg"?
(2) video_list.txt:
How do I generate video_list? For example, why is the entry in video_list.txt "M003/M003_01_1_output_01/000.jpg"?
Could you release the generation code?

Some questions about the Refinement network

I found some bugs in demo.py. For example, an argument is missing where the function draw_heatmap_from_78_landmark is called, and the parameters w and h are inverted.

When I tried to fix these bugs, the following outputs appeared:
N2E result:
[image: N2E]

Audio2Landmark result:
[image: HeatMap]

Refinement result:
[image: RefineImage]

The final result, the Refinement result, obviously has too much noise. I wonder whether the N2E and Audio2Landmark networks are behaving normally, and how to get an accurate output from the Refinement network.

Script to download the data from GDrive.

In case you want to download this dataset and don't want to click through the GDrive links one by one, I did it for you. You are welcome :-). Only includes the videos, no audio files. GDrive might still deny you the download, but if you attempt multiple times over the course of several days, probably you'll be able to download it all eventually.

pip install gdown

then run this:

gdown --fuzzy https://drive.google.com/file/d/1gZnRqkub1Zt_ao0Jgf4URAH4ufI4PIU_/view?usp=sharing -O W040.tar
gdown --fuzzy https://drive.google.com/file/d/1wCmuzDaD1bkAjSbPiKb-rJCqBQVlZflC/view?usp=sharing -O W038.tar
gdown --fuzzy https://drive.google.com/file/d/16CEn_2fjnOMcegXgiXKNw1nDxydP_FbZ/view?usp=sharing -O W037.tar
gdown --fuzzy https://drive.google.com/file/d/1pe4CmrselXMFFj1JputEF8_PqddtZWi1/view?usp=sharing -O W036.tar
gdown --fuzzy https://drive.google.com/file/d/1Io8xMQt3-9wWj_OZ1e7o1BYChj6DvdGQ/view?usp=sharing -O W035.tar
gdown --fuzzy https://drive.google.com/file/d/1u5zSxa3zOwPFdXMc8Teu929jgOnXk5pD/view?usp=sharing -O W033.tar
gdown --fuzzy https://drive.google.com/file/d/1yE6ArRNJdTEDQBcxLZEPbfhiY46hH_bW/view?usp=sharing -O W029.tar
gdown --fuzzy https://drive.google.com/file/d/1WuAjnxNWuZaC7Hb1S20W-X-LvsqCqFdm/view?usp=sharing -O W028.tar
gdown --fuzzy https://drive.google.com/file/d/10NElRhzstGYCeNghdI6StcFaFU2EexqV/view?usp=sharing -O W026.tar
gdown --fuzzy https://drive.google.com/file/d/18Ac3XdJJqkD7iBxRT9l82uuNdd_0KWaH/view?usp=sharing -O W025.tar
gdown --fuzzy https://drive.google.com/file/d/1lKb_wv200Ss3QecRa-MVDsddYaG245UI/view?usp=sharing -O W024.tar
gdown --fuzzy https://drive.google.com/file/d/16LzfPpQYl6_Yc2bCfwQyJ9Tg5mJuhUG6/view?usp=sharing -O W023.tar
gdown --fuzzy https://drive.google.com/file/d/15KGkel1sAKmcjkHICm5gmP_b2QOKEK7J/view?usp=sharing -O W021-2.tar
gdown --fuzzy https://drive.google.com/file/d/1oPFVfVhtllT39cRkrut7MbPksjnj6XjN/view?usp=sharing -O W021-1.tar
gdown --fuzzy https://drive.google.com/file/d/1li2eNgDD9V6HSlhU38S7rEM8jT8o9Tz5/view?usp=sharing -O W019.tar
gdown --fuzzy https://drive.google.com/file/d/1R4w8WQYXLd145W4WNoRjTF04NeiRW8A_/view?usp=sharing -O W018.tar
gdown --fuzzy https://drive.google.com/file/d/1NfcQSYQXwW_ypENtxtJIqejaCHxutYMO/view?usp=sharing -O W017.tar
gdown --fuzzy https://drive.google.com/file/d/1LXuJ80T2i5k6HKV2ZAclV-bt-JTE0Bie/view?usp=sharing -O W016.tar
gdown --fuzzy https://drive.google.com/file/d/1QQN8lCFXez0NlZDJF77XL736jt2_H4CZ/view?usp=sharing -O W015.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W014.tar
gdown --fuzzy https://drive.google.com/file/d/1vHOs0Jk-XoJvjMIV8gcqk2QAMZ9zYn20/view?usp=sharing -O W011.tar
gdown --fuzzy https://drive.google.com/file/d/1UFCS0yRaAAP4Aqr1QoaI_Pak9x_A0Qa3/view?usp=sharing -O W009.tar
gdown --fuzzy https://drive.google.com/file/d/1CFzV_mT509KiaPc0zdOMdZCW22Gg_BHV/view?usp=sharing -O M042-2.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M042-1.tar
gdown --fuzzy https://drive.google.com/file/d/1iz6lHRmeDCNJDm7J8mBftpk09z8p0tsH/view?usp=sharing -O M041.tar
gdown --fuzzy https://drive.google.com/file/d/1oUaIV81pWedu2aJ7-Zlc1VrKeY5or8uj/view?usp=sharing -O M040.tar
gdown --fuzzy https://drive.google.com/file/d/1vny_Nk6Vg0VWWaQxMQMpXYahvdYRLoMR/view?usp=sharing -O M039.tar
gdown --fuzzy https://drive.google.com/file/d/1Q6o8eR2rN4hjxlwbcqP6NzCJlTLNypc4/view?usp=sharing -O M037.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M035.tar
gdown --fuzzy https://drive.google.com/file/d/1gnmEkmF7c-CevvAxOO96LEtXJV5FV5y0/view?usp=sharing -O M034.tar
gdown --fuzzy https://drive.google.com/file/d/1vS3QgKSYhGTGAkTvuVMua-cAlgtNfvO5/view?usp=sharing -O M033.tar
gdown --fuzzy https://drive.google.com/file/d/1w6-R6WneiRBxG2GMbZU3yAezSxAgNI8T/view?usp=sharing -O M032-2.tar
gdown --fuzzy https://drive.google.com/file/d/175mieJBwjEbiB6yF6LktFZCuuFtVHDCy/view?usp=sharing -O M032-1.tar
gdown --fuzzy https://drive.google.com/file/d/1SSuDkrby0Ev_Ssj3M4s5hsR9WkbMNwgV/view?usp=sharing -O M031.tar
gdown --fuzzy https://drive.google.com/file/d/1FQPmuFTqwfKuVuFAtolX8Ge2sM7QW7BY/view?usp=sharing -O M030.tar
gdown --fuzzy https://drive.google.com/file/d/11EG6eC03tX97rLCObUzxJH6mJu0_a1GP/view?usp=sharing -O M029.tar
gdown --fuzzy https://drive.google.com/file/d/1oguNuh3ev8-6KjeVKuCZ3l0WnwdjkFAM/view?usp=sharing -O M028.tar
gdown --fuzzy https://drive.google.com/file/d/1mjtAOh_XAGmOB6CrnuJAbhPUMmBZBSaK/view?usp=sharing -O M027.tar
gdown --fuzzy https://drive.google.com/file/d/1t9zhGWUGHvb1MalM8L7VqF92iRUT-RIa/view?usp=sharing -O M026-2.tar
gdown --fuzzy https://drive.google.com/file/d/11xVIAJbErEf9RrniqhamHGVNdaGa9-Kd/view?usp=sharing -O M026-1.tar
gdown --fuzzy https://drive.google.com/file/d/1sICjHruXFSj3ib29iShEw0LMOVluTeqj/view?usp=sharing -O M025.tar
gdown --fuzzy https://drive.google.com/file/d/1VDQxx7saZLOIMl-7fXq8TuZFxzDh-vZ9/view?usp=sharing -O M024.tar
gdown --fuzzy https://drive.google.com/file/d/1TMj6Lks3IKtn1T5DI8jUPDLQYx5BZ2RI/view?usp=sharing -O M023.tar
gdown --fuzzy https://drive.google.com/file/d/1R856cJEgbw0mv4mWsCDZRBHa03DH0oP2/view?usp=sharing -O M022.tar
gdown --fuzzy https://drive.google.com/file/d/1YPUech64LhZTmQhj0irQ6oxOxTOWngxf/view?usp=sharing -O M019.tar
gdown --fuzzy https://drive.google.com/file/d/1HdpRfyKvU3nT8B6XwYjUxKxbzG7FHLI_/view?usp=sharing -O M013.tar
gdown --fuzzy https://drive.google.com/file/d/1r_02Rr7qd9zcXGcuOWmkCn-X4gwNdTSY/view?usp=sharing -O M012.tar
gdown --fuzzy https://drive.google.com/file/d/1PlYwxkgRwrDISiIsOYuP6_Yw8RxCImfq/view?usp=sharing -O M011.tar
gdown --fuzzy https://drive.google.com/file/d/1tJMq4W-dQ4Z6IMEs5kK0lYd0s6UjAmrz/view?usp=sharing -O M009.tar
gdown --fuzzy https://drive.google.com/file/d/1HO6n_jIyZXiZBmywk3vN7LcNSeqUWnCh/view?usp=sharing -O M007.tar
gdown --fuzzy https://drive.google.com/file/d/1ubqPoVb2f2dPEqBm07fAr-H-mqiSHoOV/view?usp=sharing -O M005.tar
gdown --fuzzy https://drive.google.com/file/d/1_sTB4DoljwU4IY2YTVkRQ_8bj28Le9-c/view?usp=sharing -O M003.tar

You are welcome
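
Since GDrive may refuse a download and succeed on a later attempt, the commands above could be wrapped in a small retry loop. The sketch below is one possible approach, assuming that the gdown command-line tool exits with a non-zero status when a download fails; the function names, the number of attempts and the waiting time are all hypothetical choices.

```python
import subprocess
import time


def gdown_command(url, output):
    """Build the same gdown call as the lines above, as an argument list."""
    return ["gdown", "--fuzzy", url, "-O", output]


def download_with_retries(url, output, attempts=3, wait_seconds=60,
                          run=subprocess.run):
    """Try one download up to `attempts` times; return True on success."""
    for attempt in range(attempts):
        # Assumption: a non-zero return code means the download was refused.
        result = run(gdown_command(url, output))
        if result.returncode == 0:
            return True
        if attempt < attempts - 1:
            time.sleep(wait_seconds)  # wait before asking GDrive again
    return False
```

Calling download_with_retries(url, "M003.tar") with one of the URLs listed above would then replace the corresponding gdown line. Passing the URL as a list element also avoids shell quoting problems with the ? character.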

How to generate image(384x384) from video?

Hi, thanks for your work! However, your code does not include the part that generates images from video. I think this code is important for this work. Could you release it? Thank you very much!

Test Data

Thanks for releasing the MEAD dataset. Can you please mention the subject IDs of the test set used in the paper?

How to run inference on my own reference image?

Thank you for the great work! But I have an issue with getting results.

When I run test.py with my own image, I get a poor result like the one below.
[image: gen_459]
What should I do to get a better result?

Missing files for running the demo (Refinement/demo.py)

Hello,

Thank you for sharing your work. I tried to run the demo (Refinement/demo.py), but found that some files are missing, especially "./lists/pca.pickle". I think I understand what it means, but there is no code showing how to compute it from the training data. Could you please share the pickle file if possible?

Also, when I tried to run Refinement/demo.py, there seems to be no page2em() in Refinement/utils_parallel.py. Can I use page2emo() in Refinement/trainer_demo.py instead?

Thank you!
