chahuja / pats
PATS Dataset. Aligned Pose-Audio-Transcripts and Style for co-speech gesture research
Home Page: http://chahuja.com/pats
Thank you for the data you collected. However, I encountered some problems in getting the data.
We noticed that the videos of "Rock" are currently unavailable for download.
I also don't know how to download the videos of "Jon". Could you give me some help?
Hi, thank you very much for the nice dataset.
When I plot the 52 joint points after reading the .h5 file, the first joint (I think it's the 'Neck' joint) appears in an unusual location far from the other joint points (see the fig). Would you please let me know the reason behind it?
I also don't fully understand what 'parents' means.
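For readers with the same question, here is a minimal sketch of what a `parents` list usually means in skeleton data. It rests on an assumption this thread does not confirm: that each joint is stored as an offset from its parent joint, and only the root joint (parent index -1) holds an absolute position. Under that assumption, plotting the raw values would place the root far from the other joints. The function name, joint layout, and numbers below are hypothetical.

```python
def offsets_to_absolute(offsets, parents):
    """Turn parent-relative (dx, dy) offsets into absolute (x, y) positions.

    Assumption (not confirmed for PATS): parents[j] is the index of joint j's
    parent, and the root joint has parent -1 and stores an absolute position.
    """
    absolute = [None] * len(offsets)

    def resolve(j):
        # Resolve the parent first, then add this joint's offset to it.
        if absolute[j] is None:
            dx, dy = offsets[j]
            p = parents[j]
            if p < 0:
                absolute[j] = (dx, dy)
            else:
                px, py = resolve(p)
                absolute[j] = (px + dx, py + dy)
        return absolute[j]

    for j in range(len(offsets)):
        resolve(j)
    return absolute


# Hypothetical 4-joint skeleton: joint 0 is the root, joints 1 and 3 hang
# off the root, and joint 2 hangs off joint 1.
parents = [-1, 0, 1, 0]
offsets = [(320.0, 240.0), (10.0, 0.0), (5.0, 5.0), (-10.0, 0.0)]
absolute = offsets_to_absolute(offsets, parents)
```

If the plotted skeleton looks correct after such a conversion, the stored values are probably parent-relative. If not, the assumption does not hold for this file.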
Hi, thanks for releasing this dataset. I can't find the audio files in the data (specifically, for "oliver"; I haven't tried for the others). I unzipped oliver.zip and the only things under pats/data are cmu_intervals_df.csv, cmu_intervals_df_transforms.csv, missing_intervals.h5, and processed/. How can I get the audio files?
Is there any way that we can download this dataset using wget?
Is there a way to obtain the original resolution of frames in the given dataset? I am interested in normalizing the keypoints according to the resolution, any idea how to achieve this? Any help would be greatly appreciated.
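The thread does not say how to recover the original frame resolution. Once it is known, the normalization itself is simple. The sketch below assumes the keypoints are pixel coordinates; the function name, the 1280x720 frame size, and the sample keypoints are all hypothetical.

```python
def normalize_keypoints(keypoints, width, height):
    """Scale pixel (x, y) pairs by the frame size so they fall in [0, 1]."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [(x / width, y / height) for x, y in keypoints]


# Hypothetical values: a 1280x720 frame and two made-up keypoints.
frame_width, frame_height = 1280, 720
pixels = [(640.0, 360.0), (320.0, 180.0)]
normalized = normalize_keypoints(pixels, frame_width, frame_height)
```

If the stored keypoints are offsets rather than pixel positions, dividing by the frame size still rescales them, but the result is no longer bounded by [0, 1].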
Thanks for sharing the data! I found some annoying errors that might be caused by the data.
When I run this:
from data import Data
from tqdm import tqdm
common_kwargs = dict(path2data = 'data',
                     speaker = ['bee'],
                     modalities = ['pose/data', 'audio/log_mel_512', 'text/bert'],
                     fs_new = [15, 15, 15],
                     batch_size = 4,
                     window_hop = 5)
data = Data(**common_kwargs)
It raises the error below. It seems that the file data/processed/bee/cmu0000025765.h5 does not contain the text/bert part. On further investigation, I found that many of the h5 files in bee are missing their text/bert part.
G:\PATS\pats\data\dataUtils.py:134: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
self.df = self.df.append(pd.read_csv((Path(self.path2data)/'cmu_intervals_df_transforms.csv').as_posix())) ## file with evil twins
0%| | 0/576 [00:00<?, ?it/s]
data/processed/bee/cmu0000025765.h5 text/bert
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File G:\PATS\pats\data\dataUtils.py:485, in MiniData.__init__(self, path2h5, modalities, fs_new, time, modality_classes, window_hop, style, repeat_text, text_in_modalities, filler, **kwargs)
484 try:
--> 485 data, h5 = self.load(self.path2h5, modality)
486 except:
File G:\PATS\pats\data\common.py:39, in HDF5.load(filename, key)
36 # if key not in h5:
37 # print(f"{key} not in P{filename}!")
38 # return [], h5
---> 39 data = h5[key]
40 return data, h5
File h5py\_objects.pyx:54, in h5py._objects.with_phil.wrapper()
File h5py\_objects.pyx:55, in h5py._objects.with_phil.wrapper()
File ~\anaconda3\envs\pats\lib\site-packages\h5py\_hl\group.py:305, in Group.__getitem__(self, name)
304 elif isinstance(name, (bytes, str)):
--> 305 oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
306 else:
File h5py\_objects.pyx:54, in h5py._objects.with_phil.wrapper()
File h5py\_objects.pyx:55, in h5py._objects.with_phil.wrapper()
File h5py\h5o.pyx:190, in h5py.h5o.open()
KeyError: 'Unable to open object (component not found)'
During handling of the above exception, another exception occurred:
SystemExit Traceback (most recent call last)
[... skipping hidden 1 frame]
Input In [3], in <cell line: 1>()
----> 1 data = Data(**common_kwargs)
File G:\PATS\pats\data\dataUtils.py:155, in Data.__init__(self, path2data, speaker, modalities, fs_new, time, split, batch_size, shuffle, num_workers, window_hop, load_data, style_iters, num_training_sample, sample_all_styles, repeat_text, quantile_sample, quantile_num_training_sample, weighted, filler, num_training_iters)
153 #if self.load_data:
154 ## get train-dev-test split
--> 155 self.datasets = self.tdt_split()
156 self.dataLoader_kwargs = {'batch_size':batch_size,
157 'shuffle':shuffle,
158 'num_workers':num_workers,
159 'pin_memory':False}
File G:\PATS\pats\data\dataUtils.py:292, in Data.tdt_split(self)
290 self.test_intervals = test_intervals
--> 292 dataset_train = ConcatDatasetIndex(self.get_minidata_list(train_intervals))
293 dataset_dev = ConcatDatasetIndex(self.get_minidata_list(dev_intervals))
File G:\PATS\pats\data\dataUtils.py:247, in Data.get_minidata_list(self, intervals)
246 def get_minidata_list(self, intervals):
--> 247 return [MiniData(self.getPath2file(interval_id), style=self.getStyle(interval_id), **self.minidataKwargs)
248 for interval_id in tqdm(intervals)]
File G:\PATS\pats\data\dataUtils.py:247, in <listcomp>(.0)
246 def get_minidata_list(self, intervals):
--> 247 return [MiniData(self.getPath2file(interval_id), style=self.getStyle(interval_id), **self.minidataKwargs)
248 for interval_id in tqdm(intervals)]
File G:\PATS\pats\data\dataUtils.py:491, in MiniData.__init__(self, path2h5, modalities, fs_new, time, modality_classes, window_hop, style, repeat_text, text_in_modalities, filler, **kwargs)
488 # print(f"{modality} not in {self.path2h5}")
489 # return
--> 491 sys.exit(1)
493 self.shapes.append(data.shape)
SystemExit: 1
During handling of the above exception, another exception occurred:
AssertionError Traceback (most recent call last)
[... skipping hidden 1 frame]
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\interactiveshell.py:1972, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
1969 if exception_only:
1970 stb = ['An exception has occurred, use %tb to see '
1971 'the full traceback.\n']
-> 1972 stb.extend(self.InteractiveTB.get_exception_only(etype,
1973 value))
1974 else:
1975 try:
1976 # Exception classes can customise their traceback - we
1977 # use this in IPython.parallel for exceptions occurring
1978 # in the engines. This should return a list of strings.
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:578, in ListTB.get_exception_only(self, etype, value)
570 def get_exception_only(self, etype, value):
571 """Only print the exception type and message, without a traceback.
572
573 Parameters
(...)
576 value : exception value
577 """
--> 578 return ListTB.structured_traceback(self, etype, value)
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:436, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
433 chained_exc_ids.add(id(exception[1]))
434 chained_exceptions_tb_offset = 0
435 out_list = (
--> 436 self.structured_traceback(
437 etype, evalue, (etb, chained_exc_ids),
438 chained_exceptions_tb_offset, context)
439 + chained_exception_message
440 + out_list)
442 return out_list
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:1105, in AutoFormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
1103 else:
1104 self.tb = tb
-> 1105 return FormattedTB.structured_traceback(
1106 self, etype, value, tb, tb_offset, number_of_lines_of_context)
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:999, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
996 mode = self.mode
997 if mode in self.verbose_modes:
998 # Verbose modes need a full traceback
--> 999 return VerboseTB.structured_traceback(
1000 self, etype, value, tb, tb_offset, number_of_lines_of_context
1001 )
1002 elif mode == 'Minimal':
1003 return ListTB.get_exception_only(self, etype, value)
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:852, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
850 """Return a nice text document describing the traceback."""
851 assert etb is not None
--> 852 formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
853 tb_offset)
855 colors = self.Colors # just a shorthand + quicker name lookup
856 colorsnormal = colors.Normal # used a lot
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:786, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
784 assert isinstance(tb_offset, int)
785 head = self.prepare_header(etype, self.long_header)
--> 786 records = self.get_records(etb, number_of_lines_of_context, tb_offset)
788 frames = []
789 skipped = 0
File ~\anaconda3\envs\pats\lib\site-packages\IPython\core\ultratb.py:840, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
834 options = stack_data.Options(
835 before=before,
836 after=after,
837 pygments_formatter=formatter,
838 )
839 assert etb is not None
--> 840 return list(stack_data.FrameInfo.stack_data(etb, options=options))[tb_offset:]
File ~\anaconda3\envs\pats\lib\site-packages\stack_data\core.py:546, in FrameInfo.stack_data(cls, frame_or_tb, options, collapse_repeated_frames)
530 @classmethod
531 def stack_data(
532 cls,
(...)
536 collapse_repeated_frames: bool = True
537 ) -> Iterator[Union['FrameInfo', RepeatedFrames]]:
538 """
539 An iterator of FrameInfo and RepeatedFrames objects representing
540 a full traceback or stack. Similar consecutive frames are collapsed into RepeatedFrames
(...)
544 and optionally an Options object to configure.
545 """
--> 546 stack = list(iter_stack(frame_or_tb))
548 # Reverse the stack from a frame so that it's in the same order
549 # as the order from a traceback, which is the order of a printed
550 # traceback when read top to bottom (most recent call last)
551 if is_frame(frame_or_tb):
File ~\anaconda3\envs\pats\lib\site-packages\stack_data\utils.py:98, in iter_stack(frame_or_tb)
96 while frame_or_tb:
97 yield frame_or_tb
---> 98 if is_frame(frame_or_tb):
99 frame_or_tb = frame_or_tb.f_back
100 else:
File ~\anaconda3\envs\pats\lib\site-packages\stack_data\utils.py:91, in is_frame(frame_or_tb)
90 def is_frame(frame_or_tb: Union[FrameType, TracebackType]) -> bool:
---> 91 assert_(isinstance(frame_or_tb, (types.FrameType, types.TracebackType)))
92 return isinstance(frame_or_tb, (types.FrameType,))
File ~\anaconda3\envs\pats\lib\site-packages\stack_data\utils.py:172, in assert_(condition, error)
170 if isinstance(error, str):
171 error = AssertionError(error)
--> 172 raise error
AssertionError:
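A way to find every affected file before constructing `Data` is to scan the interval files for the required keys. This is a sketch, not part of the PATS code: `find_files_missing_keys` and its `open_h5` parameter are hypothetical names, and the default opener relies on `h5py.File` and on h5py's support for `key in file` with slash-separated paths.

```python
def find_files_missing_keys(h5_paths, required_keys, open_h5=None):
    """Return {path: [missing keys]} for every file lacking a required key.

    open_h5 is a hypothetical hook: any callable that returns a context
    manager whose value supports `key in obj`. It defaults to h5py.
    """
    if open_h5 is None:
        import h5py  # imported lazily so the helper can be tried without h5py

        def open_h5(path):
            return h5py.File(path, "r")

    missing = {}
    for path in h5_paths:
        with open_h5(path) as h5:
            absent = [key for key in required_keys if key not in h5]
        if absent:
            missing[str(path)] = absent
    return missing
```

For the case above, this could be called with the paths under data/processed/bee and the three modalities from `common_kwargs`. The returned dictionary lists which files would need to be skipped or regenerated.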
Hello and thank you for this wonderful work. I want to generate a similar dataset for another language and I really need the code that generated this dataset. Is it possible for you to provide me with this code somehow?
Hi, thanks for your great work!
I am now expanding the dataset to support more IDs, and I have a question about human keypoints.
As far as I can tell, OpenPose does not output invisible keypoints; however, in the PATS dataset, invisible keypoints are still labelled.
I would like to know how you labelled the invisible points with OpenPose.
Can you give me some advice, or open-source the data preprocessing scripts? :)
Hi, I was curious to know how the spatial extent average (shown in Figure 2 on the dataset webpage) is calculated. It would be very helpful if you could provide the code used to calculate it.
I received your email and opened the email URL, but it showed that "we could not find the page you were looking for."
Thanks for your great work!
When fps=15 and time=4.3s, the number of video frames can be calculated as 64, but I don’t understand how you extracted the audio as 64 frames.
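The arithmetic in the question can be checked directly: 15 × 4.3 = 64.5, which rounds down to 64 frames. One common way to bring audio features to the same length is to subsample them to the video frame rate. Whether PATS does exactly this is not confirmed in this thread, and the 100 Hz feature rate and 430 source frames below are hypothetical.

```python
import math


def frame_count(duration_s, fps):
    """Number of whole frames that fit in the given duration."""
    return math.floor(duration_s * fps)


def resample_indices(n_source, source_rate, target_rate):
    """Pick, for each target frame, the source frame starting at or before it."""
    n_target = math.floor(n_source * target_rate / source_rate)
    return [
        min(n_source - 1, int(i * source_rate / target_rate))
        for i in range(n_target)
    ]


video_frames = frame_count(4.3, 15)

# Hypothetical: 4.3 s of audio features computed at 100 frames per second.
audio_idx = resample_indices(430, 100, 15)
```

Under these assumptions, both sequences end up with 64 entries, so each video frame is paired with one audio feature frame.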
Hi,
How do I access the original frames (images, not pose) by index?
For example, how can I obtain the relevant frames for the word "you" (frames 7-9)?
pd.read_hdf('pats/data/processed/bee/cmu0000025735.h5', key='text/meta')
| | Word | start_frame | end_frame |
|---|---|---|---|
| 0 | do | 0.0 | 7.0 |
| 1 | you | 7.0 | 9.0 |
| 2 | have | 9.0 | 10.0 |
Thank you
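A possible first step is to convert the frame span into timestamps, which can then be used to seek in the source video. The sketch below assumes the `start_frame` and `end_frame` values are counted at 15 frames per second, the rate used elsewhere on this page, and that the interval's start time within the source video is known. Neither assumption is confirmed here, and `frame_span_to_seconds` and `clip_start_s` are hypothetical names.

```python
def frame_span_to_seconds(start_frame, end_frame, fps, clip_start_s=0.0):
    """Convert a [start_frame, end_frame] span to (start, end) in seconds.

    clip_start_s is the (assumed known) offset of the interval within the
    original video; leave it at 0.0 for times relative to the interval.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return (clip_start_s + start_frame / fps, clip_start_s + end_frame / fps)


# The word "you" from the table above, assuming 15 frames per second.
start_s, end_s = frame_span_to_seconds(7.0, 9.0, fps=15)
```

With these assumptions, the word spans roughly 0.47 s to 0.6 s after the start of the interval.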