Wonderful job, as a researcher in the same field, I would like to express my appreciat

A question about person_box and action_predictor about alphaction HOT 6 CLOSED

pxssw commented on May 24, 2024

A question about person_box and action_predictor

from alphaction.

Comments (6)

yelantf commented on May 24, 2024

Thank you for your attention! As shown in the paper, our model takes the bounding box on the center frame and uses it for RoIAlign on all frames of the input video clip. We do this mainly to follow previous work, and also because the AVA dataset only provides boxes annotated on the center frame. Of course, we could use a tracking algorithm to generate more accurate bounding boxes on every single frame and then use them to get more robust results. In fact, some previous works [link] have tried that. However, we did not find a sufficiently robust tracker (especially for fast-motion scenes), so we chose the current design for our method.
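To illustrate the idea of reusing the center-frame box on every frame, here is a minimal sketch in plain Python. It is not the AlphAction implementation: `replicate_center_box`, `crop_mean`, and `pool_clip` are hypothetical names, frames are toy 2D grids instead of feature maps, and a simple integer crop-and-average stands in for real RoIAlign.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout
Frame = Sequence[Sequence[float]]        # toy 2D grid standing in for a feature map


def replicate_center_box(box: Box, num_frames: int) -> List[Tuple[int, float, float, float, float]]:
    """Return one RoI per frame, all sharing the box annotated on the center frame."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return [(t, *box) for t in range(num_frames)]


def crop_mean(frame: Frame, box: Box) -> float:
    """Average the values inside an integer-rounded box (crude stand-in for RoIAlign)."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    values = [frame[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not values:
        raise ValueError("box covers no cells")
    return sum(values) / len(values)


def pool_clip(clip: Sequence[Frame], center_box: Box) -> float:
    """Pool every frame with the same center-frame box, then average over time."""
    rois = replicate_center_box(center_box, len(clip))
    per_frame = [crop_mean(clip[t], (x1, y1, x2, y2)) for t, x1, y1, x2, y2 in rois]
    return sum(per_frame) / len(per_frame)
```

If the person moves quickly, the fixed box drifts off the person on frames far from the center, which is the case where per-frame boxes from a tracker would help.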


pxssw commented on May 24, 2024

Copy that. I wish you even greater success in this field!
Meanwhile, I ran into a few small problems in the project. (They may just be my own misunderstandings rather than bugs; if so, please ignore them.)

1. update_action_dictionary in visualizer.py: the final self.action_dictionary holds the results for every ID seen so far (starting from the first person). If the program runs for a long time, or on crowded scenes, could this demand a lot of computing resources? Results for IDs that have not appeared for a long time may need to be cleaned up.
2. cur_millis = stream.get(cv2.CAP_PROP_POS_MSEC) in video_detection_loader.py: in my webcam mode, the initial value of cur_millis is very large (on the order of 4×10^8). I don't know why it is not 0 ms, and the starting value keeps increasing across separate runs of the program (4.x×10^8, 5.x×10^8, ...). Is this a common problem?
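The cleanup suggested in the first point could look roughly like the sketch below. It is only an illustration under assumptions: the real layout of self.action_dictionary is not shown in this thread, so `ActionCache`, `max_age`, and the ID-to-labels mapping are hypothetical.

```python
from typing import Dict, List


class ActionCache:
    """Keeps the latest action labels per person ID and forgets stale IDs."""

    def __init__(self, max_age: int = 100) -> None:
        self.max_age = max_age                    # frames an ID may stay unseen
        self.actions: Dict[int, List[str]] = {}   # person ID -> latest labels
        self.last_seen: Dict[int, int] = {}       # person ID -> last frame index

    def update(self, frame_idx: int, results: Dict[int, List[str]]) -> None:
        # Record the newest results and when each ID was last observed.
        for person_id, labels in results.items():
            self.actions[person_id] = labels
            self.last_seen[person_id] = frame_idx
        # Drop IDs that have not been observed for more than max_age frames.
        stale = [pid for pid, seen in self.last_seen.items()
                 if frame_idx - seen > self.max_age]
        for pid in stale:
            del self.actions[pid]
            del self.last_seen[pid]
```

With this kind of pruning, memory is bounded by the number of people seen within the last `max_age` frames rather than by the total number of IDs since startup.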


yelantf commented on May 24, 2024

Thanks for pointing out these problems! First, I have to admit that our current demo program is not well designed. It may have some small bugs and is also hard to read. As for the two problems you mentioned:

1. Yes, you are right. This is indeed a problem for long-running sessions. We will try to improve it following your suggestions when we have time. Of course, pull requests are also welcome.
2. We did not notice this issue before. In fact, we did not fully test the demo script in webcam mode, because that requires a server with a graphical interface and a camera, which is not always available to us. According to the OpenCV documentation, this flag gives the current position of the video file in milliseconds, or the video capture timestamp. I'm inclined to think the value is a valid video timestamp whose starting point depends on the specific camera.
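If the webcam timestamp does start at a large camera-dependent value, one possible workaround is to measure time relative to the first reading. The sketch below assumes that only differences between timestamps matter to the demo; `RelativeClock` is a hypothetical helper, not part of the project or of OpenCV.

```python
from typing import Optional


class RelativeClock:
    """Converts raw capture timestamps into milliseconds since the first reading."""

    def __init__(self) -> None:
        self._origin: Optional[float] = None

    def elapsed(self, raw_millis: float) -> float:
        # The first reading defines time zero; later readings are offsets from it.
        if self._origin is None:
            self._origin = raw_millis
        return raw_millis - self._origin


# Intended use inside the capture loop (not executed here):
#   clock = RelativeClock()
#   cur_millis = clock.elapsed(stream.get(cv2.CAP_PROP_POS_MSEC))
```

For video files, where the position already starts near 0 ms, this wrapper leaves the values essentially unchanged.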


pxssw commented on May 24, 2024

Good job! The flaws do not obscure the merits.


jun0wanan commented on May 24, 2024

> Thank you for your attention! As shown in the paper, our model takes the bounding box on the center frame and uses it for RoIAlign on all frames of the input video clip. We do this mainly to follow previous work, and also because the AVA dataset only provides boxes annotated on the center frame. Of course, we could use a tracking algorithm to generate more accurate bounding boxes on every single frame and then use them to get more robust results. In fact, some previous works [link] have tried that. However, we did not find a sufficiently robust tracker (especially for fast-motion scenes), so we chose the current design for our method.

Hi,
Sorry to disturb you. I would like to ask how a person's bbox in the 1st clip is linked to the same person's bbox in the 2nd clip.

best,
jun
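This question is not answered in the thread. As a generic illustration only, and not a description of how AlphAction links people, boxes from two consecutive clips are often associated by greedy IoU matching. The names `iou` and `match_boxes` and the 0.5 threshold are assumptions made for this sketch.

```python
from typing import Dict, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(prev: Sequence[Box], curr: Sequence[Box],
                threshold: float = 0.5) -> Dict[int, int]:
    """Greedily map each current box index to at most one previous box index."""
    pairs = sorted(((iou(p, c), i, j)
                    for i, p in enumerate(prev)
                    for j, c in enumerate(curr)), reverse=True)
    matches: Dict[int, int] = {}
    used_prev = set()
    for score, i, j in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same person
        if i in used_prev or j in matches:
            continue
        matches[j] = i
        used_prev.add(i)
    return matches
```

Current boxes that receive no match would be treated as new people. Like any overlap-based linking, this can fail under fast motion or occlusion.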


