dirtyharrylyl / transferable-interactiveness-network
Code for Transferable Interactiveness Knowledge for Human-Object Interaction Detection. (CVPR'19, TPAMI'21)
License: MIT License
Thanks for your great work! @HuangOwen
I got some quick questions.
Thanks for your great work!
I want to know how to choose the best iteration to test, since I found that you didn't set up a validation set in your code.
Hello! Thank you very much for your open-source work!
I would like to know how to get the ..._TIN_VCOCO_D.pkl file. It seems I can only get the ...naked.pkl, whose data format is not suitable for direct visualization.
Many thanks for your help in advance!
Hello, thank you for your open-source code. For V-COCO evaluation, a sample is positive when the IoU between the detected human/object boxes and the GT human/object boxes exceeds 0.5 and the action prediction is correct. I want to know: what does it mean for the action prediction to be correct?
In project/lib/networks/TIN_VCOCO.py,
self.predictions["cls_prob_HO_final"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
The scoring method for the final action is: action_score = self.predictions["cls_prob_HO_final"] * LIS(H_det_score) * LIS(O_det_score). (Is my understanding correct?) And what is the value range of action_score?
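Since each stream output is a sigmoid, the value range can be reasoned out directly. A minimal numpy sketch of the fusion (the toy probabilities and the identity `lis` placeholder are assumptions for illustration, not values from the repo):

```python
import numpy as np

# Toy per-HOI probabilities for one human-object pair (600 classes in
# HICO-DET; shortened to 5 here). Each stream output is a sigmoid, so
# every entry lies in [0, 1].
cls_prob_sp = np.array([0.9, 0.2, 0.1, 0.7, 0.4])
cls_prob_H = np.array([0.8, 0.1, 0.3, 0.6, 0.2])
cls_prob_O = np.array([0.7, 0.3, 0.2, 0.5, 0.1])

# Late fusion as quoted above: the spatial stream gates the sum of H and O.
cls_prob_HO_final = cls_prob_sp * (cls_prob_O + cls_prob_H)

# sp is in [0, 1] and (O + H) is in [0, 2], so the fused score lies in
# [0, 2] before the detection scores are multiplied in.
def action_score(fused, h_det_score, o_det_score, lis=lambda s: s):
    # `lis` is an identity placeholder here; the repo applies its own
    # LIS function to the raw detection scores.
    return fused * lis(h_det_score) * lis(o_det_score)
```

Under this reading, action_score stays within [0, 2] scaled by whatever range LIS maps the detection scores into.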
[ {'image_id': # the coco image id,
'person_box': #[x1, y1, x2, y2] the box prediction for the person,
'[action]_agent': # the score for the action corresponding to the person prediction,
'[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for the role and
# associated score for the action-role pair.
} ]
In [x1, y1, x2, y2, s], is s the action_score above?
In the V-COCO evaluation code, above what action_score (the s in '[action]_[role]') is a pair considered to have the action?
Thank you
Hi @DirtyHarryLYL ! Thanks a lot for your great work!
I noticed that in lib/networks/TIN_HICO.py
, you've added two extra weights self.HO_weight
and self.binary_weight
to the classification scores from both the HOI and binary classifiers, which is different from the iCAN code. May I ask why you multiply the weights with the raw classification scores, and how are the weights generated?
Thanks!
Hi, @DirtyHarryLYL , thanks for the great work!
I roughly browsed the code and have some questions; could you give me some advice?
In the VCOCO model:
Thanks very much! Looking forward to your reply!
Hi,
Thanks for reading this mail.
Please help me!
I cannot find the files 'vcoco_test.json', 'instances_vcoco_all_2014.json' and 'v-coco/data/splits/vcoco_test.ids'.
Can you tell me how to download these files, or send them to me at "[email protected]"?
Thanks very much.
Hi @HuangOwen
Sorry to bother you. I have a quick question that maybe you can help with.
I wonder how I can generate my own "Trainval_GT_VCOCO.pkl" and "Trainval_Neg_VCOCO.pkl"?
To my understanding, I can install Faster R-CNN and run it on the V-COCO test set to get the test data (to obtain "Test_Faster_RCNN_R-50-PFN_2x_VCOCO.pkl").
But how can I generate the training data?
Thanks a lot.
Hi, I simply ran the command 'python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO' for testing, but encountered a problem: HICO_Benchmark_Binary.py cannot find '-Results/80000_TIN_D_noS.pkl'. I have no idea where this file is saved or what format it is saved in.
Hello, and thank you for your work.
I have a problem with the code in this article. In late fusion, the pool5_O shape is (46, 7, 7, 1024); in early fusion, pool5_O has shape (16, 7, 7, 1024). I have the following question:
I changed the feature-extraction backbone from ResNet-50 to ResNet-101. Why did the AP decrease when the depth increased?
Hi,
When I feed-forward the images with the provided human-object detections, the interactiveness network always outputs a confidence of 1 for interactiveness and 0 for no interactiveness.
I use the provided network weights with the evaluation code (Test_TIN_HICO), and did not do any training myself.
Is this expected? Or does it mean a degenerate solution?
Thank you.
Hi,
Thank you for making your code available. I've one question.
During test time test_net function in lib/models/test_HICO_pose_pattern_all_wise_pair.py generates multiple hoi detections for each image. Only a few of them are correct predictions, while others are not. But all of them have very high prediction scores. How are you selecting the correct predictions while ignoring the wrong predictions?
This is the information from the code that is curated into the .pkl file:
temp.append(Human_out[2]) # Human box
temp.append(Object[2]) # Object box
temp.append(Object[4]) # Object class
temp.append(prediction_HO[0]) # Score (600)
temp.append(Human_out[5]) # Human score
temp.append(Object[5]) # Object score
temp.append(prediction_binary[0]) # binary score
I'm looking to visualize the HOI detections. Any suggestions are really appreciated.
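One way to decide which detections to draw, given the record layout above, is to rank HOI classes per pair by a combined score. A hedged sketch (the combination rule and the toy values are assumptions, not the exact weighting used in the repo's visualizer):

```python
import numpy as np

# One curated detection record, mirroring the fields appended in the loop
# above (boxes are [x1, y1, x2, y2]; the 600-dim HOI score vector is
# shortened to 6 entries for illustration).
det = [
    np.array([10, 20, 110, 220]),                 # human box
    np.array([120, 40, 200, 160]),                # object box
    17,                                           # object class id
    np.array([0.9, 0.05, 0.6, 0.1, 0.8, 0.02]),   # per-HOI scores
    0.95,                                         # human detection score
    0.88,                                         # object detection score
    np.array([0.97]),                             # binary interactiveness score
]

def top_hois(record, k=2):
    # Rank HOI classes for one human-object pair by a combined score.
    # The exact weighting is an assumption, not necessarily the repo's.
    hoi = record[3] * record[4] * record[5] * record[6][0]
    order = np.argsort(hoi)[::-1][:k]
    return [(int(i), float(hoi[i])) for i in order]
```

Drawing only the top-k (or thresholded) classes per pair keeps the visualization readable even when many pairs score highly.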
Hi,
Thanks for your wonderful contribution.
I'm using Google Colab and used these commands:
!python ./script/Download_data.py 1f_w7HQxTfXGxOPrkriu7jTyCTC-KPEH3 Weights/TIN_HICO.zip
!python ./script/Download_data.py 1iU9dN9rLtekcHX2MT_zU_df3Yf0paL9s Weights/TIN_VCOCO.zip
I got this error:
Traceback (most recent call last):
File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_HICO.zip'
Traceback (most recent call last):
File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_VCOCO.zip'
Thanks in advance.
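The FileNotFoundError above comes from open(destination, "wb") when the Weights/ directory does not yet exist, so creating it first avoids the crash. A minimal sketch (the empty write stands in for the actual downloaded bytes):

```python
import os

destination = "Weights/TIN_HICO.zip"

# open(destination, "wb") cannot create missing parent directories,
# so create Weights/ before saving the downloaded content.
os.makedirs(os.path.dirname(destination), exist_ok=True)
with open(destination, "wb") as f:
    f.write(b"")  # placeholder for the real download content
```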
Hello,
Thank you for your good works.
After reading some papers on this task, I realized they all use Faster R-CNN as the instance detector. I am not sure of the main reason for this, because YOLO and SSD are also good detectors. That has really confused me these days. Could anyone please explain the reason?
Thanks.
Hi! Thanks for your great work!
I've read your code and noticed that in your model, you import the detection results from Detectron in Data/Test_Faster_RCNN_R-50-PFN_2x_HICO_DET.pkl, which contains bbox annotations of humans and objects, and use another simplified ResNet-50 (just stage 1 to stage 4) to extract features from the proposals cropped from the raw image according to those annotations. Is my description of the model correct? I would really appreciate it if you could correct me if I am wrong.
Many thanks for your attention! 😁
Error using Generate_detection>(parfor body) (line 19)
Unable to read file
'/home/magus/huxw-workspace/Transferable-Interactiveness-Network/-Results/TIN_HICONIS_thres_x0.1_y0.9/detections_14.mat'.
No such file or directory.
Error in Generate_detection (line 17)
parfor obj_idx = 1 : 80
Really wonderful work!
I have 2 questions about the architecture details.
(1) In Sec. 5.2, "Relatively, the spatial stream is composed of two convolutional layers with max pooling, and two 1024 sized FCs": two 1024-d FCs are used in the spatial stream of C.
In Figure 3, it seems four are used in the H and O streams.
Which one is correct?
(2) In the iCAN paper, the Residual Block has 2048 channels. In my understanding, you used 1024 in your paper instead, followed by four 1024-d FCs. Am I right?
Hi guys, thanks for your nice code!
I'm trying to check the contribution of each stream, but the results are totally different from those in your paper. Here is my method:
- For the human stream, use prediction_H generated by net.test_image_H, repeated for all objects paired with the current human instance during the test;
- for the object stream, use self.predictions["cls_prob_O"] as the prediction;
- for the sp stream, use self.predictions["cls_prob_sp"] as the prediction.
I train the network jointly and adopt the above settings during the test. Finally I got: AP = 37.85 for the human stream, AP = 31.63 for the object stream, and AP = 47.19 for the sp stream. I think there must be something wrong with my method, but I have no idea what. Would you mind sharing your strategy for the ablation study? How do you get the results for the different streams?
Hi,
First of all I would like to thank you for the awesome work you have done. It is quite impressive to see such great results.
I need to run the network on some custom images and a custom video, but I am a bit confused as to whether or not this is possible with the code shipped in your repository. There's a pkl file which you import to test the V-COCO detections, but how do we generate this for our custom data?
I retrained and tested your code on V-COCO with my GPU, but got a bad result:
Average Role [scenario_1] AP = 30.39
while the result in your paper is:
AP=47.8 (RPdCd)
Would you mind telling me if I did something wrong?
Here are the scripts I used:
python tools/Train_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO_test
python tools/Test_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO
Thanks again
Hello, and thank you for the great work.
I have a problem with the code for this article. In late fusion, the shape of fc7_O is (16, 2048); however, the shape of fc7_H in early fusion is (46, 2048). In the network, the two feature vectors are generated in the same way. Where in the code is this difference implemented?
Hi, I am very confused about the evaluation metric for the HICO-DET dataset. As I understand it, the mAP is calculated as the mean average precision over every HOI class. Are the bboxes from the ground truth or from the detector's results? If you use the detector's results to compute the mAP, can you tell me your detector's mAP on HICO-DET, and did you train the detector yourself?
I will pay the highest respect.
When I run "python tools/Train_TIN_HICO.py --num_iteration 2000000 --model TIN_HICO_test",
I get this error: "tensorflow.python.framework.errors_impl.NotFoundError: FindFirstFile failed for: E:/Transferable-Interactiveness-Network-master/Transferable-Interactiveness-Network-master/Weights/res50_faster_rcnn : The system cannot find the path specified."
Hello, may I ask how the GT and Neg in the pre-training model were obtained respectively?
50+ mAP? Wow, that is so high. No idea; maybe something went wrong, or it is the benefit of better object detections. Maybe you can convert iCAN's detections to your format and use the same model to verify the 10+% difference.
I just made a mistake in the evaluation code. I rewrote it using the accuracy method and then evaluated on the val data, where it reaches 43 AP(role). I have read some papers and they differ from each other: do you evaluate on the val data or on the test data?
Originally posted by @ZHUXUHAN in #3 (comment)
Hi, I used your pre-trained weights and ran the script
python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO
but the mAP is lower than your reported value, and I didn't change anything.
This is what I get:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0642 / 0.1615
mAP / mRec (rare): 0.0493 / 0.1433
mAP / mRec (non-rare): 0.0686 / 0.1669
setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0745 / 0.1615
mAP / mRec (rare): 0.0542 / 0.1433
mAP / mRec (non-rare): 0.0805 / 0.1669
So where did I go wrong?
Hello
Negative samples have wrong relationships. Doesn't minimizing the loss then push the predictions for these negative samples toward the wrong relations? What is the reason for using negative examples?
Hello,
Thanks for the code repo. So in the paper on V-COCO set, did you guys report the map of scenario 1 or scenario 2?
Hi, thanks for the great work!
I have read your code. it seems your baseline structure of
I wonder why the results of
Thanks very much!
hi
thanks for your code
I have some questions
1. Is there only one HOI class per image? When I run demo_visualize, several humans and several objects may be identified in each image, but their HOI class is the same.
I don't understand the dataset; please help me.
2. Please explain the Trainval_GT_HICO.pkl file.
thanks
Hi, thanks for the great work!
I browsed your code for HICO-DET dataset, and I've seen the code for iCAN before.
I saw that you used the .pkl file in Train_TIN_HICO.py, and (QUESTION_1) I want to know whether your Trainval_GT and Trainval_Neg .pkl files add pose information compared with iCAN's.
Because I saw that your pickle.load call loads '*_with_pose.pkl', but (QUESTION_2) I didn't find where to download this .pkl file for training (the one with pose info).
i.e.
'''
Trainval_GT = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_GT_HICO_with_pose.pkl', "rb" ))
Trainval_N = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_Neg_HICO_with_pose.pkl', "rb" ) )
'''
I can't work with the Trainval_GT and Trainval_N .pkl files of iCAN because they cause an error:
'''
File "..\lib\ult\ult.py", line 793, in Augmented_HO_Neg_HICO_pose_pattern_version2
Pattern = Get_next_sp_with_pose(Human_augmented_temp[j][1:], Object_augmented_temp[j][1:], GT[GT_count - 1][5]).reshape(1, 64, 64, 3)
IndexError: list index out of range
'''
If you added pose information to Trainval_GT.pkl, I have another question (QUESTION_3): Trainval_GT.pkl should be ground-truth annotation, but the official anno_bbox.mat file has no pose information, so how did you handle that?
Also, I'm not clear on how to generate the Trainval_Neg_HICO_with_pose.pkl files. Can you provide some relevant code or the idea/process? I would be very grateful.
thank you very much!
Thanks for your great work!
I want to know how you generated Trainval_Neg_HICO.pkl.
And do the negative (human, verb, object) triplets have incorrect verbs?
Hello, sorry to bother you; I have a small question. About self.spatial = tf.placeholder(tf.float32, shape=[None, 64, 64, 3], name='sp') in your code: you changed the 2 from the iCAN code to 3. May I ask the reason for this change? Thank you.
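From the function name Get_next_sp_with_pose used elsewhere in the repo, the third channel appears to carry pose information on top of iCAN's two binary masks. A simplified sketch of such a 64x64x3 map (rasterizing bare keypoints instead of drawing a full skeleton is an assumption here):

```python
import numpy as np

def spatial_map_with_pose(h_box, o_box, keypoints, size=64):
    # Channels 0/1 are iCAN-style binary human/object masks; channel 2
    # rasterizes pose keypoints (the real code draws a skeleton pattern).
    # Boxes are (x1, y1, x2, y2) already rescaled to the 64x64 grid.
    sp = np.zeros((size, size, 3), dtype=np.float32)
    for c, (x1, y1, x2, y2) in enumerate([h_box, o_box]):
        sp[y1:y2, x1:x2, c] = 1.0
    for x, y, score in keypoints:
        if 0 <= x < size and 0 <= y < size:
            sp[int(y), int(x), 2] = score
    return sp
```

So the extra channel lets the spatial stream see pose configuration alongside the relative box layout.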
I can't find it in the script
Thanks for your job!
Why do you fix some parameters when loading the model in from_previous_ckpt and from_best_trained_model?
Hey,
Thanks for the repo. In TIN_VCOCO.py, you introduced HO_weight (line 77) and H_weight (line 83). Where did you get those values, or is it just trial and error?
Sorry to bother you; I have some doubts about the code in TIN_HICO.py.
Question1:
I see the code:
pool5_O = self.crop_pool_layer(head, self.O_boxes[:self.H_num, :], 'Crop_O')
You only use the positive labels here, and I want to ask about the shape of your O branch (cls_prob_O) in the code below:
self.predictions["cls_prob_HO"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
Because I found that the shapes of cls_prob_sp and cls_prob_H are (pos+neg, 600) while the shape of cls_prob_O is (pos, 600) when rewriting it in PyTorch, so they can't be added.
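For reference, here is a numpy sketch of the mismatch and one workaround: slicing the sp and H streams down to the first H_num (positive) rows before fusing. Whether this matches the original TF graph's behavior is an assumption on my part:

```python
import numpy as np

pos, neg, num_classes = 3, 2, 600
H_num = pos  # positive human-object pairs come first in the batch

cls_prob_sp = np.random.rand(pos + neg, num_classes)
cls_prob_H = np.random.rand(pos + neg, num_classes)
cls_prob_O = np.random.rand(pos, num_classes)  # O branch sees positives only

# Fuse only over the positive pairs, where all three streams are defined.
cls_prob_HO = cls_prob_sp[:H_num] * (cls_prob_O + cls_prob_H[:H_num])
```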
Question2:
tf.reshape(cls_prob_binary, [1, self.num_binary])
And:
tf.reshape(cls_prob_H, [1, self.num_classes])
tf.reshape(cls_prob_O, [1, self.num_classes])
tf.reshape(cls_prob_sp, [1, self.num_classes])
Are they useless, since I found they are not assigned to a variable?
Hello, may I ask how I should insert my own pose estimation results into the .pkl file?
self.HO_weight = np.array([
9.192927, 9.778443, 10.338059, 9.164914, 9.075144, 10.045923, 8.714437, 8.59822, 12.977117, 6.2745423,
11.227917, 6.765012, 9.436157, 9.56762, 11.0675745, 11.530198, 9.609821, 9.897503, 6.664475, 6.811699,
6.644726, 9.170454, 13.670264, 3.903943, 10.556748, 8.814335, 9.519224, 12.753973, 11.590822, 8.278912,
5.5245695, 9.7286825, 8.997436, 10.699849, 9.601237, 11.965516, 9.192927, 10.220277, 6.056692, 7.734048,
8.42324, 6.586457, 6.969533, 10.579222, 13.670264, ...]
Hey,
Thanks for the repo. In Section 4.3 of the paper, you introduced the LIS function. Can you point me to the location where this function is implemented in the code? I am mainly curious about how to get the values of T, k and W in the logistic function.
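For intuition, a logistic curve with the shape described in the paper can be sketched as below; note the T, k, w values here are illustrative placeholders, not the tuned constants from the repo:

```python
import math

def lis(score, T=8.3, k=12.0, w=10.0):
    # Logistic low-grade instance suppression sketch: low detection scores
    # are pushed toward 0, high scores saturate near T. The parameter
    # values are placeholders, not the paper's tuned constants.
    return T / (1.0 + math.exp(k - w * score))
```

Monotonicity is the point: lis(0.3) is far smaller than lis(0.9), so low-confidence detections are suppressed much more aggressively than a linear rescaling would.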
File "tools/vcoco_lis_nis.py", line 184, in generate_pkl
score_binary_d = np.array(dic_d['binary_score'])
KeyError: 'binary_score'
the dic_d is 60000_TIN_VCOCO_D.pkl, and I downloaded it from https://docs.google.com/uc?export=download&id=1sJipmoZ-5u0ymm8diqYd5Yqk2A-QQBXN
How do you get the HO_weight and H_weight?
Hello, thank you for your work. Your code only has a setting for the number of iterations, not for batch size or epochs. How are these two parameters set?
Thank you very much.
Hello
When I fix the errors and run "python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO",
it creates a text file named eval_result.txt with the following values:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0000 / 0.0000
mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000
setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0000 / 0.0000
mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000
Why are all the values zero?
Good job!
1. Do you use the AlphaPose results trained on the MPII or the COCO dataset?
2. I would like to know if you filter keypoints whose score is below the threshold (maybe 0.05 in COCO). If so, how do you construct the human skeleton when some keypoints are missing?
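A common fallback, which may or may not match what the repo does (an assumption on my part), is to drop low-score keypoints and draw only the skeleton edges whose endpoints both survive:

```python
def filter_keypoints(keypoints, thresh=0.05):
    # Keep keypoints whose confidence exceeds thresh (0.05 is the COCO
    # convention mentioned above); filtered ones become None.
    return [kp if kp[2] > thresh else None for kp in keypoints]

# Toy edge list over keypoint indices; the real COCO skeleton has 17 joints.
SKELETON = [(0, 1), (1, 2)]

def drawable_edges(keypoints, thresh=0.05):
    # Draw an edge only when both of its endpoints survived the filter,
    # so a missing keypoint simply removes its incident edges.
    kept = filter_keypoints(keypoints, thresh)
    return [(a, b) for a, b in SKELETON
            if kept[a] is not None and kept[b] is not None]
```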