ProgressiveTransformersSLP's Issues

2D to 3D pose estimation on RWTH-Boston-104 dataset

Hi,

I am working on a different dataset than the one used in this implementation: RWTH-Boston-104. The videos are grayscale, but the skeletal poses from OpenPose are rendered in RGB, so each frame has dimension (312, 336, 3). This gives me errors when using the suggested 3DposeEstimator. Could you please help me with any preprocessing step to prepare my frames as input to the 3DposeEstimator? I am very new to deep learning and computer vision, and any help would be greatly appreciated. Looking forward to your reply.

Thanks,
Divya C
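For concreteness, a minimal sketch of one possible preprocessing step, assuming the estimator expects single-channel frames (an assumption; the function below is illustrative and not part of the repository):

import cv2
import numpy as np

def to_grayscale(frame: np.ndarray) -> np.ndarray:
    """Collapse a (H, W, 3) RGB frame to a single-channel (H, W) image."""
    if frame.ndim == 3 and frame.shape[2] == 3:
        return cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    return frame  # already single-channel

# Example with a frame of the dimensions mentioned above.
frame = np.zeros((312, 336, 3), dtype=np.uint8)
print(to_grayscale(frame).shape)  # (312, 336)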

how to lift the 2D joint positions to 3D?

Hi, I have extracted the 2D positions using OpenPose, but I have no idea how to lift them to 3D as you mention in the paper. Could you please provide the code or give me some tips? Besides, I wonder how the "trg" file arranges the 150 position values.
thanks :-)
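For readers with the same question about the target layout, a minimal parsing sketch, assuming each frame contributes 150 joint values plus one counter value, all space-separated on a single line per sequence (the 151-value frame width matches the example inputs discussed in the issues below; the 50-joints-times-3-coordinates split is an assumption):

import numpy as np

FRAME_DIM = 151  # 150 joint values per frame plus 1 counter value

def read_trg_line(line: str) -> np.ndarray:
    """Parse one target sequence into a (num_frames, 151) array."""
    values = np.array(line.split(), dtype=np.float32)
    assert values.size % FRAME_DIM == 0, "length must be a multiple of 151"
    return values.reshape(-1, FRAME_DIM)

# frames[:, :150] would then be 50 joints x (x, y, z) per frame (an assumed
# split), and frames[:, 150] the counter value.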

how to get the 151st value

When I was using the 2D-to-3D conversion to process my own dataset, I got the 150 values successfully. But the example input has 151 values per frame, and I have no idea about the 151st point (I know it is the counter). How do I get it, or how do I change the program input to 150 points?
Thanks
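A minimal sketch of one way to produce the 151st value, assuming the counter is simply the frame's normalized position in the sequence (the exact normalization used by the repository is an assumption here):

import numpy as np

def add_counter(joints: np.ndarray) -> np.ndarray:
    """Append a progress counter in [0, 1] as the 151st value of each frame.

    joints: (num_frames, 150) array of lifted 3D joint values.
    """
    num_frames = joints.shape[0]
    # Assumed normalization: 0 for the first frame, 1 for the last.
    counter = np.arange(num_frames, dtype=np.float32) / max(num_frames - 1, 1)
    return np.concatenate([joints, counter[:, None]], axis=1)  # (num_frames, 151)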

A question about Back Translation evaluation

hi,
I see that the original SLT model takes images as input and then uses CNNs to get image features. I wonder, in this work, should I take the generated skeleton video or the skeleton sequence as input data?
Could you please also provide the back-translation evaluation code for us to study?
thanks a lot

Getting error while installing requirements.txt

Python version used: 3.10

Trying to install requirements by using pip install -r requirements.txt yields the following error.

Complete error:

Collecting absl-py==0.8.1
  Using cached absl-py-0.8.1.tar.gz (103 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\Users\Adhithiyaa\AppData\Local\Temp\pip-install-jaidvbge\absl-py_38076cf52b3b4e829eeed8eaf2615ca2\setup.py", line 34, in <module>
          raise RuntimeError('Python version 2.7 or 3.4+ is required.')
      RuntimeError: Python version 2.7 or 3.4+ is required.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Running the command after removing all version pinning from requirements.txt yields a different error.

Complete error:

ERROR: Could not find a version that satisfies the requirement Brlapi (from versions: none)
ERROR: No matching distribution found for Brlapi

The number of joints

OpenPose extracts 137 keypoints for each frame, covering "pose_keypoints_2d", "face_keypoints_2d", "hand_right_keypoints_2d", and "hand_left_keypoints_2d". But the examples provide 150 joint values. What is the difference between them?

Thanks!
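One plausible accounting of the difference, stated as an assumption rather than a confirmed fact: the face keypoints are dropped and only an upper-body subset of the body keypoints is kept, leaving 50 joints whose lifted 3D coordinates give 150 values per frame:

# An assumed accounting, not confirmed by the repository:
BODY_JOINTS = 8    # upper-body subset of OpenPose's 25 body keypoints
HAND_JOINTS = 21   # per hand; the 70 face keypoints are dropped

num_joints = BODY_JOINTS + 2 * HAND_JOINTS  # 50 joints
values_per_frame = num_joints * 3           # x, y, z after 2D-to-3D lifting
print(values_per_frame)                     # 150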

Training Issues

Hello,
In your code, /Configs/train.log, I see that you trained for 20000 epochs to reduce the lr below 0.0002, but when I trained with your code and configuration, training stopped after 4000 epochs. Why? Is my setup different from yours?

clarification about future prediction and counter value

Hello,
In your paper, you mention future prediction to improve the G2P transformer. It is not clear to me how to decode the next 10 timesteps given only one timestep, since the progressive decoder input is both the joint embedding and the counter value (Fig. 2). What joint embedding do you provide as input for the next 10 decoding steps during training?

About the counter value: I did not see any specific loss on the counter value prediction. The only loss you use is the MSE on the predicted sequence of poses. How does that enable the network to learn the counter value correctly?

Thanks in advance.
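On the second question, a minimal sketch of how a single MSE loss can supervise the counter, assuming the counter is simply the 151st regression target (the shapes below are hypothetical):

import torch
import torch.nn as nn

mse = nn.MSELoss()

# Hypothetical shapes: (batch, frames, 151), where index 150 is the counter.
pred = torch.randn(2, 40, 151, requires_grad=True)
target = torch.randn(2, 40, 151)

# If the counter is just the 151st regression target, the single MSE loss
# supervises it together with the 150 joint values; no separate loss needed.
loss = mse(pred, target)
loss.backward()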

151 values

Hello Ben,

I was looking at the 2D-to-3D conversion and I am not sure how you get the 151st value. I understand that for each frame we get 150 points from OpenPose, but how is the value for the 151st point deduced? Is that the counter explained in the paper?

Looking forward to your reply.

Best,

Question on normalizing the 3D joints

Hi, can I ask about the scripts you use to normalize the 3D skeletal joints? I am experimenting with other datasets such as How2Sign (see the attached video). However, I find that the extracted skeletal joints cannot be scaled as well as in your PHOENIX14T data, where all plotted videos are normalized into the same range.

probabily.about.right.mp4

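A minimal normalization sketch, assuming a center-on-neck, scale-by-shoulder-width scheme; this is one common approach, not necessarily the authors', and the joint indices are hypothetical:

import numpy as np

def normalize_skeleton(joints, neck_idx=0, l_sh_idx=1, r_sh_idx=2):
    """Center each frame on the neck and scale by shoulder width.

    joints: (num_frames, num_joints, 3). The joint indices are hypothetical;
    map them to your own OpenPose ordering.
    """
    centered = joints - joints[:, neck_idx:neck_idx + 1, :]
    shoulder_width = np.linalg.norm(
        centered[:, l_sh_idx, :] - centered[:, r_sh_idx, :], axis=-1)
    scale = np.maximum(shoulder_width, 1e-6)[:, None, None]
    return centered / scale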

metric

Hello author, could you share your code for using WER and BLEU for sign language evaluation? I can only find the original WER and BLEU code. Thank you very much!
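In the meantime, a minimal sketch using off-the-shelf libraries that implement the same metrics (sacrebleu and jiwer, not the authors' evaluation code):

# pip install sacrebleu jiwer
import sacrebleu
from jiwer import wer

hypotheses = ["the weather will be nice tomorrow"]
references = ["tomorrow the weather will be nice"]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
print(f"WER:  {wer(references[0], hypotheses[0]):.2f}")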

Pretrained model and preprocessed dataset

Hi,
Does anyone have the preprocessed PHOENIX14T data?
I asked for it through the mail address mentioned, but there has been no answer.

Also, the pretrained model isn't working for me (running the test over it gives bad results).
Did anyone manage to run it and get good results (similar to the paper's video)?

Thanks

Can you share your train.log?

The DTW score seems to be important during training, but we do not know what a proper score should be. Can you share some details about your training?
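For readers who want to compute a comparable score themselves, a minimal DTW sketch using the fastdtw library; the per-step normalization shown is one common choice, not necessarily the one used in training:

# pip install fastdtw scipy
import numpy as np
from fastdtw import fastdtw
from scipy.spatial.distance import euclidean

# Hypothetical pose sequences: (frames, 150), predicted vs. reference.
pred = np.random.rand(40, 150)
ref = np.random.rand(55, 150)

distance, path = fastdtw(pred, ref, dist=euclidean)
print(distance / len(path))  # per-step average, one common normalization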

about processed dataset

Hi,
thank you for the amazing work!
If you don't mind, could you please share your processed data?

Thank you!

greedy decode

Is this the right form of greedy decoding?
The model keeps receiving the reference poses.
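For comparison, a minimal sketch of what a purely autoregressive greedy decode looks like, where predictions are fed back in rather than reference poses; the model(src, trg) interface is hypothetical, not the repository's actual API:

import torch

@torch.no_grad()
def greedy_decode(model, src, max_frames=200, frame_dim=151, end_value=1.0):
    """Autoregressive greedy decoding: feed predictions back in, never the
    reference poses. model(src, trg) -> (batch, t, frame_dim) is a
    hypothetical interface, not the repository's actual API.
    """
    batch = src.size(0)
    trg = torch.zeros(batch, 1, frame_dim)  # initial start frame
    for _ in range(max_frames):
        out = model(src, trg)                # predict the next frame
        next_frame = out[:, -1:, :]
        trg = torch.cat([trg, next_frame], dim=1)
        if next_frame[0, 0, -1] >= end_value:  # counter reached the end
            break
    return trg[:, 1:, :]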

Train model

How do I download the SignProdJoey package? I tried pip and conda, but neither has this package.

Counter Value Explanation in Paper

In section 3.2 of the paper, it is written:

At inference, we drive the sequence generation by replacing the predicted counter value, ĉ,
with the ground truth timing information, c*, to produce a stable output sequence.

Is this a typo? Is it supposed to say "at train" instead of "at inference"? Because at inference we would not have access to the length of the pose video, and hence the ground-truth counter value, right?

Text to Pose settings

Hello,

I have been trying to reproduce the results for text to pose; however, the settings provided on the git page are not working for text to pose.

Can you please share the settings and code (if it is different) for T2P and T2G2P?

Looking forward to hearing from you.

Thanks!

How to Run this code

Can anyone help me with how to run this code? I need this for my academic project.

Issues with installing via requirements.txt

I am attempting to get this repository to work on Google Colab. I have the following lines:

!git clone https://github.com/BenSaunders27/ProgressiveTransformersSLP.git
!cd ProgressiveTransformersSLP ; pip install -r requirements.txt

It produces the following errors:

Cloning into 'ProgressiveTransformersSLP'...
**truncated output**
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting absl-py==0.8.1
  Using cached absl-py-0.8.1.tar.gz (103 kB)
  Preparing metadata (setup.py) ... done
Collecting alabaster==0.7.8
  Using cached alabaster-0.7.8-py2.py3-none-any.whl (27 kB)
ERROR: Could not find a version that satisfies the requirement apt-clone==0.2.1 (from versions: none)
ERROR: No matching distribution found for apt-clone==0.2.1

I have experimented some more and have found further issues, including failing to install PyTorch 1.3.0 because it is too old. @BenSaunders27, do you have any more information about the environment in which this was built? Thanks.

SLP for text to Pose

Hi,

I wanted to generate SLP skeletal sequences from sentences. I changed the config file Base.yaml to src = "text".
It trains alright, but the videos generated during validation have a duration of 0 seconds, without even plotting the ground truth.
Please guide me as to where I am going wrong.

Thanks,
Divya
