
pose-residual-network's People

Contributors

eakbas, salihkaragoz

pose-residual-network's Issues

Evaluate detection part with frozen backbone

Hi,
Could you please tell me how you evaluated the detection part after training it with a frozen backbone (one trained for the keypoint net)?
I can't achieve the mAP... not even close.

Reproducing results reported by paper

Given that the full network and training flow have not been released by the authors, has anyone actually succeeded in fully reproducing the results reported in the paper (both the accuracy and the speed of 23 FPS)? Any DL framework is fine. Thank you.

Professor, this turned out really great

Emre, so this does pose estimation and instance segmentation, and it runs in real time.
It would be great if you released the code so we could use it.

About the inference FPS and the network

Hi, I heard that your implementation runs at 23 FPS on a GTX 1080 Ti. Did you use MobileNetV2 or some other lightweight architecture to achieve 23 FPS? Or did you just use standard convolutions, which would mean it could be even faster with MobileNetV2?

A question about the speed

As mentioned in the paper:

In terms of running time, our method appears to be the fastest of all multiperson
2D pose estimation methods. Depending on the number of people in the
input image, our method runs at between 27 frames/sec (FPS) (for one person
detection) and 15 FPS (for 20 person detections). For a typical COCO image,
which contains ∼3 people on average, we achieve ∼23 FPS

But as we can see, the inference time of RetinaNet reported in the paper "Focal Loss for Dense Object Detection" is almost 70 ms. So is there any method to speed it up?

Is there any pre-trained model available?

Hi,
Your work is great. It looks like your method can run at 23 frames/sec while achieving such good results. I want to ask whether you have any plans to publish pre-trained models.
Thank you very much.
Quan Hua

Complete results and a full pretrained model

Hey,
I have read your paper, but it is unfair not to release a pretrained model for the entire network. I agree you need not release any code base, but promising to do something and failing to do it is a bad thing. If you cannot release the entire pipeline, please release the testing framework for testing your model. I am doubtful whether the results reported in the paper are reproducible.
With so much hyperparameter tuning involved, nobody will be able to reproduce the exact results you report.
I do not think you would have a "license" issue releasing the code for testing the entire pretrained model; everyone does it. At least point us to the pretrained models you are using.
It has already been about 6 months since the ECCV camera-ready deadline. It is high time to release the testing framework for the pretrained model, or we would like to bring this to the attention of higher authorities.
We will soon open a Reddit post to discuss this paper, as a lot of people are having trouble reproducing the results.

Weird evaluation code in evaluate.py

Starting from Line #140 in evaluation.py, it seems to me that you are using the groundtruth keypoints to obtain your keypoint estimation, which should not happen when you evaluate your PRN network. This makes the evaluation improper.

In brief: first, you generate indexes from old_weights_bbox (groundtruth). Then you seem to use those indexes to place a window around the groundtruth position and calculate your estimated scores. The output keypoints are then derived from those scores.

I found the issue in the PyTorch version, where someone else had already reported it (Issue #17). Then I came here and found the same issue in this Keras version. @mkocabas please respond to our concerns. Thanks!

No module named 'gaussian'

Traceback (most recent call last):
  File "main.py", line 6, in <module>
    from src.utils import train_bbox_generator, val_bbox_generator
  File "/home/dh/github/pose-residual-network/src/utils.py", line 3, in <module>
    from gaussian import gaussian, gaussian_multi_input_mp, gaussian_multi_output
ModuleNotFoundError: No module named 'gaussian'
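
One likely cause, assuming gaussian.py sits next to utils.py inside src/ (the traceback does not confirm this), is that a bare "from gaussian import ..." only resolves when the src/ directory itself is on sys.path. The self-contained sketch below reproduces the error with a throwaway package named demo_src (a hypothetical stand-in for src) and then shows that adding the package directory to sys.path makes the import succeed.

```python
import importlib
import os
import sys
import tempfile

# Throwaway layout mirroring the traceback: <root>/demo_src/utils.py imports
# a sibling module with a bare "from gaussian import ...".
# "demo_src" is a hypothetical stand-in for the repository's src/ package.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_src")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "gaussian.py"), "w") as f:
    f.write("def gaussian(x):\n    return x\n")
with open(os.path.join(pkg, "utils.py"), "w") as f:
    f.write("from gaussian import gaussian\n")

# Only the project root is importable, as when running a script from the root.
sys.path.insert(0, root)
try:
    importlib.import_module("demo_src.utils")
    first_attempt = "ok"
except ModuleNotFoundError as exc:
    first_attempt = str(exc)

# Workaround: also put the package directory on sys.path, then retry.
sys.path.insert(0, pkg)
importlib.invalidate_caches()
utils = importlib.import_module("demo_src.utils")

print(first_attempt)
print(utils.gaussian(3))
```

If the assumption holds, an alternative is to change the import itself to a package-qualified form (for example "from src.gaussian import ..."), which avoids touching sys.path.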

Predict on my own data

Hi,
Thank you very much for your amazing job.

I've used python main.py to train the model. How can I use the weights to run inference on my own images?
I do know how to load the weights, but the input has to be of shape (56, 36, 17).

If anyone has a code snippet, it would be much appreciated.

Thank you!
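
A minimal sketch of one way to build a (56, 36, 17) input for a single person box, assuming the PRN input is a grid with one channel per keypoint type into which the detected keypoints inside the box are rasterized. The function name make_prn_input, the (x_min, y_min, width, height) box format and the (x, y, keypoint_type, score) tuples are assumptions made for illustration, not the repository's API. Keypoints and boxes must come from a separate detector, which this repository does not include.

```python
GRID_H, GRID_W, NUM_KPTS = 56, 36, 17  # the (56, 36, 17) shape from the question


def make_prn_input(bbox, keypoints):
    """Rasterize the detected keypoints that fall inside one person box.

    bbox      -- (x_min, y_min, width, height) in image pixels (assumed format)
    keypoints -- iterable of (x, y, keypoint_type, score) in image pixels
    """
    x_min, y_min, box_w, box_h = bbox
    grid = [[[0.0] * NUM_KPTS for _ in range(GRID_W)] for _ in range(GRID_H)]
    for x, y, kpt_type, score in keypoints:
        if not (x_min <= x < x_min + box_w and y_min <= y < y_min + box_h):
            continue  # this keypoint falls outside the box
        # Scale box-relative coordinates to grid cells, clamped to the grid.
        col = min(int((x - x_min) / box_w * GRID_W), GRID_W - 1)
        row = min(int((y - y_min) / box_h * GRID_H), GRID_H - 1)
        # Keep the strongest detection if two keypoints land in the same cell.
        grid[row][col][kpt_type] = max(grid[row][col][kpt_type], score)
    return grid


# One keypoint at the box center, one outside the box.
example = make_prn_input((100, 50, 72, 112), [(136, 106, 0, 0.9), (10, 10, 1, 0.5)])
print(len(example), len(example[0]), len(example[0][0]))
```

The nested list can be converted to an array and given a leading batch dimension before it is passed to the loaded model. The repository's own pipeline may additionally smooth each channel with its gaussian helpers, which this sketch omits.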

How to evaluate the human segmentation?

Hi,
How do you evaluate the human segmentation part as described in the paper? Do you just use the last channel of the (K+1)-channel keypoint heatmap? How is the segmentation extracted from the PRN network?
Is there any post-processing step to generate the final segmentation?

Thanks!

Questions about the speed reported in the paper

Your paper's abstract says "the fastest real time system with ∼23 frames/sec", and Section 4.5 (runtime analysis) says "Keypoint and person detections take 35 ms while PRN takes 2 ms per instance". So a total of about 40 ms gives 1000 / 40 = 25 FPS, roughly in line with the reported ∼23 FPS.

My questions are:
1. Does the 35 ms include loading the image, resizing it, converting it to a tensor, normalizing it, moving it to CUDA, and model inference? Or is the 35 ms for model inference only?

2. I tested the speed of PRN and it matches the 2 ms/instance reported in the paper. But what about the time for selecting the boxes from RetinaNet, cropping the feature map for every instance, and resizing every instance?

I am trying to implement the whole repo in PyTorch, but loading the image, resizing, converting to a tensor and so on, as well as selecting boxes, NMS, and cropping heatmaps for each person, take too much time. How long does your code take for those operations?
Thanks!
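
The arithmetic in this question can be sketched as a simple cost model: a fixed 35 ms for keypoint and person detection plus 2 ms of PRN time per detected person, both taken from the runtime figures quoted above. The function name estimated_fps is hypothetical, and the model deliberately ignores image loading, preprocessing, NMS and cropping, which are exactly the costs the question asks about.

```python
def estimated_fps(num_people, detection_ms=35.0, prn_ms_per_person=2.0):
    """Frames per second under a fixed detection cost plus a per-person PRN cost."""
    total_ms = detection_ms + prn_ms_per_person * num_people
    return 1000.0 / total_ms


for n in (1, 3, 20):
    print(n, round(estimated_fps(n), 1))
# 1 27.0
# 3 24.4
# 20 13.3
```

Under these assumptions the model gives about 27 FPS for one person, which matches the figure quoted from the paper, while the estimates for 3 and 20 people (24.4 and 13.3 FPS) only roughly track the quoted ∼23 and 15 FPS.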

Run the model on webcam

@mkocabas Hey, thanks for the interesting paper. I am reviewing the code and wondering whether there is a function that directly returns the 2D joints. I need to run the program on a webcam to check its real-time performance. Thanks

Will the rest of the MultiPoseNet framework be assembled into this?

Hi. I'm very interested in this code, but unfortunately it seems to be usable only once the major part of the undertaking is assembled: the backbone, the keypoint detector and the person detector described in the paper.

They all seem to come from different GitHub repositories. Will there be a streamlined integration to allow for easy testing and reproduction?

Thanks.

What is the input data of PRN?

For each instance, I use input heatmaps cropped from the global heatmaps and resized to 1*K*36*56. When I visualize the output of the PRN, it is not right.

Code Release

The method and results shown in the paper are amazing. When do you plan to release the code?

Congratulations for your work!
