dirtyharrylyl / transferable-interactiveness-network
Code for Transferable Interactiveness Knowledge for Human-Object Interaction Detection. (CVPR'19, TPAMI'21)
License: MIT License
Thanks for your great work! @HuangOwen
I got some quick questions.
Thanks for your great work!
I want to know how to choose the best iteration to test, since I found that you didn't set up a validation set in your code.
Hello! Thank you very much for your open-source work!
I would like to know how to get the ..._TIN_VCOCO_D.pkl file. It seems I can only get the ...naked.pkl, whose data format is not suitable for direct visualization.
Many thanks for your help in advance!
Hello, thank you for your open-source code. For V-COCO evaluation, a sample is positive when the IoU between the detected human/object boxes and the GT human/object boxes exceeds 0.5 and the action prediction is correct. I want to know: what does it mean for the action prediction to be correct?
In project/lib/networks/TIN_VCOCO.py,
self.predictions["cls_prob_HO_final"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
The scoring method for the final action is: action_score = self.predictions["cls_prob_HO_final"] * LIS(H_det_score) * LIS(O_det_score). (Is my understanding correct?) And what is the value range of action_score?
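Since each stream output is a sigmoid, the value range can be reasoned out directly. A minimal numpy sketch of the fusion (the toy probabilities and the identity `lis` placeholder are assumptions for illustration, not values from the repo):

```python
import numpy as np

# Toy per-HOI probabilities for one human-object pair (600 classes in
# HICO-DET; shortened to 5 here). Each stream output is a sigmoid, so
# every entry lies in [0, 1].
cls_prob_sp = np.array([0.9, 0.2, 0.1, 0.7, 0.4])
cls_prob_H = np.array([0.8, 0.1, 0.3, 0.6, 0.2])
cls_prob_O = np.array([0.7, 0.3, 0.2, 0.5, 0.1])

# Late fusion as quoted above: the spatial stream gates the sum of H and O.
cls_prob_HO_final = cls_prob_sp * (cls_prob_O + cls_prob_H)

# sp is in [0, 1] and (O + H) is in [0, 2], so the fused score lies in
# [0, 2] before the detection scores are multiplied in.
def action_score(fused, h_det_score, o_det_score, lis=lambda s: s):
    # `lis` is an identity placeholder here; the repo applies its own
    # LIS function to the raw detection scores.
    return fused * lis(h_det_score) * lis(o_det_score)
```

Under this reading, action_score stays within [0, 2] scaled by whatever range LIS maps the detection scores into.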
[ {'image_id': # the coco image id,
'person_box': #[x1, y1, x2, y2] the box prediction for the person,
'[action]_agent': # the score for the action corresponding to the person prediction,
'[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for the role and
# associated score for the action-role pair.
} ]
In [x1, y1, x2, y2, s], is s the action_score above?
In the V-COCO evaluation code, above what action_score (the s in '[action]_[role]') is a pair considered to have the action?
Thank you
Hi @DirtyHarryLYL ! Thanks a lot for your great work!
I noticed that in lib/networks/TIN_HICO.py
, you've added two extra weights self.HO_weight
and self.binary_weight
to the classification scores from both the HOI and binary classifiers, which is different from the iCAN code. May I ask why you multiply the weights with the raw classification scores, and how are the weights generated?
Thanks!
Hi, @DirtyHarryLYL , thanks for the great work!
I roughly browsed the code and have some questions; could you give me some advice?
In the VCOCO model:
Thanks very much! Looking forward to your reply!
Hi,
Thanks for reading this mail.
Please help me!
I cannot find the files 'vcoco_test.json', 'instances_vcoco_all_2014.json' and 'v-coco/data/splits/vcoco_test.ids'.
Can you tell me how to download these files, or send them to me at "[email protected]"?
Thanks very much.
Hi @HuangOwen
Sorry to bother you. I have a quick question that maybe you can help with.
I wonder how I can generate my own "Trainval_GT_VCOCO.pkl" and "Trainval_Neg_VCOCO.pkl"?
To my understanding, I can install Faster R-CNN and run it on the V-COCO test set to get the test data (to obtain "Test_Faster_RCNN_R-50-PFN_2x_VCOCO.pkl").
But how can I generate the training data?
Thanks a lot.
Hi, I simply ran the command 'python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO' for testing, but encountered a problem: HICO_Benchmark_Binary.py cannot find '-Results/80000_TIN_D_noS.pkl'. I have no idea where this file is saved or what format it is saved in.
Hello, and thank you for your work.
I have a problem with the code in this article. In late fusion, the pool5_O shape is (46, 7, 7, 1024); in early fusion, pool5_O has shape (16, 7, 7, 1024). I have the following question:
I changed the feature-extraction backbone from ResNet-50 to ResNet-101. Why did the AP decrease when the depth increased?
Hi,
When I feed-forward the images with the provided human-object detections, the interactiveness network always outputs a confidence of 1 for interactiveness and 0 for no interactiveness.
I use the provided network weights with the evaluation code (Test_TIN_HICO), and did not do any training myself.
Is this expected? Or does it mean a degenerate solution?
Thank you.
Hi,
Thank you for making your code available. I've one question.
During test time test_net function in lib/models/test_HICO_pose_pattern_all_wise_pair.py generates multiple hoi detections for each image. Only a few of them are correct predictions, while others are not. But all of them have very high prediction scores. How are you selecting the correct predictions while ignoring the wrong predictions?
This is the information from the code that is curated into the .pkl file:
temp.append(Human_out[2]) # Human box
temp.append(Object[2]) # Object box
temp.append(Object[4]) # Object class
temp.append(prediction_HO[0]) # Score (600)
temp.append(Human_out[5]) # Human score
temp.append(Object[5]) # Object score
temp.append(prediction_binary[0]) # binary score
I'm looking to visualize the HOI detections. Any suggestions are really appreciated.
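One way to decide which detections to draw, given the record layout above, is to rank HOI classes per pair by a combined score. A hedged sketch (the combination rule and the toy values are assumptions, not the exact weighting used in the repo's visualizer):

```python
import numpy as np

# One curated detection record, mirroring the fields appended in the loop
# above (boxes are [x1, y1, x2, y2]; the 600-dim HOI score vector is
# shortened to 6 entries for illustration).
det = [
    np.array([10, 20, 110, 220]),                 # human box
    np.array([120, 40, 200, 160]),                # object box
    17,                                           # object class id
    np.array([0.9, 0.05, 0.6, 0.1, 0.8, 0.02]),   # per-HOI scores
    0.95,                                         # human detection score
    0.88,                                         # object detection score
    np.array([0.97]),                             # binary interactiveness score
]

def top_hois(record, k=2):
    # Rank HOI classes for one human-object pair by a combined score.
    # The exact weighting is an assumption, not necessarily the repo's.
    hoi = record[3] * record[4] * record[5] * record[6][0]
    order = np.argsort(hoi)[::-1][:k]
    return [(int(i), float(hoi[i])) for i in order]
```

Drawing only the top-k (or thresholded) classes per pair keeps the visualization readable even when many pairs score highly.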
Hi,
Thanks for your wonderful contribution.
I'm using Google Colab and used these commands:
!python ./script/Download_data.py 1f_w7HQxTfXGxOPrkriu7jTyCTC-KPEH3 Weights/TIN_HICO.zip
!python ./script/Download_data.py 1iU9dN9rLtekcHX2MT_zU_df3Yf0paL9s Weights/TIN_VCOCO.zip
I got this error:
Traceback (most recent call last):
File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_HICO.zip'
Traceback (most recent call last):
File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_VCOCO.zip'
Thanks in advance.
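The FileNotFoundError above comes from open(destination, "wb") when the Weights/ directory does not yet exist, so creating it first avoids the crash. A minimal sketch (the empty write stands in for the actual downloaded bytes):

```python
import os

destination = "Weights/TIN_HICO.zip"

# open(destination, "wb") cannot create missing parent directories,
# so create Weights/ before saving the downloaded content.
os.makedirs(os.path.dirname(destination), exist_ok=True)
with open(destination, "wb") as f:
    f.write(b"")  # placeholder for the real download content
```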
Hello,
Thank you for your good works.
After reading some papers on this task, I realized they all use Faster R-CNN as the instance detector. I am not sure of the main reason for this, because YOLO and SSD are also good detectors. That has really confused me these days. Could anyone please explain the reason?
Thanks.
Hi! Thanks for your great work!
I've read your code and noticed that in your model, you import the detection results from Detectron in Data/Test_Faster_RCNN_R-50-PFN_2x_HICO_DET.pkl, which contains bbox annotations of humans and objects, and use another simplified ResNet-50 (just stage 1 to stage 4) to extract features from the proposals cropped from the raw image according to those annotations. Is my description of the model correct? I would really appreciate it if you could correct me if I am wrong.
Many thanks for your attention! 😁
Error using Generate_detection>(parfor body) (line 19)
Unable to read file
'/home/magus/huxw-workspace/Transferable-Interactiveness-Network/-Results/TIN_HICONIS_thres_x0.1_y0.9/detections_14.mat'.
No such file or directory.
Error in Generate_detection (line 17)
parfor obj_idx = 1 : 80
Really wonderful work!
I have 2 questions about the architecture details.
(1) In Sec. 5.2, "Relatively, the spatial stream is composed of two convolutional layers with max pooling, and two 1024 sized FCs": two 1024-d FCs are used in the spatial stream of C.
In Figure 3, it seems four are used in the H and O streams.
Which one is correct?
(2) In the iCAN paper, the Residual Block has 2048 channels. In my understanding, you used 1024 in your paper instead, followed by four 1024-d FCs. Am I right?
Hi guys, thanks for your nice code!
I'm trying to check the contribution of each stream, but the results are totally different from those in your paper. Here is my method:
- For the human stream, use prediction_H generated by net.test_image_H, repeated for all objects paired with the current human instance during the test;
- for the object stream, use self.predictions["cls_prob_O"] as the prediction;
- for the sp stream, use self.predictions["cls_prob_sp"] as the prediction.
I train the network jointly and adopt the above settings during the test. Finally I got: AP = 37.85 for the human stream, AP = 31.63 for the object stream, and AP = 47.19 for the sp stream. I think there must be something wrong with my method, but I have no idea what. Would you mind sharing your strategy for the ablation study? How do you get the results for the different streams?
Hi,
First of all I would like to thank you for the awesome work you have done. It is quite impressive to see such great results.
I need to run the network on some custom images and a custom video, but I am a bit confused as to whether or not this is possible with the code shipped in your repository. There's a pkl file which you import to test the V-COCO detections, but how do we generate this for our custom data?
I retrained and tested your code on V-COCO with my GPU, but got a bad result:
Average Role [scenario_1] AP = 30.39
while the result in your paper is:
AP=47.8 (RPdCd)
Would you mind telling me if I did something wrong?
Here are the scripts I used:
python tools/Train_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO_test
python tools/Test_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO
Thanks again
Hello, and thank you for the great work.
I have a problem with the code for this article. In late fusion, the shape of fc7_O is (16, 2048); however, the shape of fc7_H in early fusion is (46, 2048). In the network, the two feature vectors are generated in the same way. Where in the code is this difference implemented?
Hi, I am very confused about the evaluation metric for the HICO-DET dataset. As I understand it, the mAP is calculated as the mean average precision over every HOI class. Are the bboxes from the ground truth or from the detector's results? If you use the detector's results to compute the mAP, can you tell me your detector's mAP on HICO-DET, and did you train the detector yourself?
I will pay the highest respect.
When I run "python tools/Train_TIN_HICO.py --num_iteration 2000000 --model TIN_HICO_test",
I get this error: "tensorflow.python.framework.errors_impl.NotFoundError: FindFirstFile failed for: E:/Transferable-Interactiveness-Network-master/Transferable-Interactiveness-Network-master/Weights/res50_faster_rcnn : The system cannot find the path specified."
Hello, may I ask how the GT and Neg in the pre-training model were obtained respectively?
50+ mAP? Wow, that is so high. No idea; maybe something went wrong, or it is the benefit of better object detections. Maybe you can convert iCAN's detections to your format and use the same model to verify the 10+% difference.
I just made a mistake in the evaluation code. I rewrote it using the accuracy method and then evaluated on the val data, where it reaches 43 AP(role). I have read some papers and they differ from each other: do you evaluate on the val data or on the test data?
Originally posted by @ZHUXUHAN in #3 (comment)
Hi, I used your pre-trained weights and ran the script
python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO
but the mAP is lower than your reported value, and I didn't change anything.
This is what I get:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0642 / 0.1615
mAP / mRec (rare): 0.0493 / 0.1433
mAP / mRec (non-rare): 0.0686 / 0.1669
setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0745 / 0.1615
mAP / mRec (rare): 0.0542 / 0.1433
mAP / mRec (non-rare): 0.0805 / 0.1669
So where did I go wrong?
Hello
Negative samples have wrong relationships. Doesn't minimizing the loss then push the predictions for these negative samples toward the wrong relations? What is the reason for using negative examples?
Hello,
Thanks for the code repo. So in the paper on V-COCO set, did you guys report the map of scenario 1 or scenario 2?
Hi, thanks for the great work!
I have read your code. it seems your baseline structure of
I wonder why the results of
Thanks very much!
hi
thanks for your code
I have some questions
1. Is there only one HOI class per image? When I run demo_visualize, several humans and several objects may be identified in each image, but their HOI class is the same.
I don't understand the dataset; please help me.
2. Please explain the Trainval_GT_HICO.pkl file.
thanks
Hi, thanks for the great work!
I browsed your code for HICO-DET dataset, and I've seen the code for iCAN before.
I saw that you used the .pkl file in Train_TIN_HICO.py, and (QUESTION_1) I want to know whether your Trainval_GT and Trainval_Neg .pkl files add pose information compared with iCAN's.
Because I saw that your pickle.load call loads '*_with_pose.pkl', but (QUESTION_2) I didn't find where to download this .pkl file for training (the one with pose info).
i.e.
'''
Trainval_GT = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_GT_HICO_with_pose.pkl', "rb" ))
Trainval_N = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_Neg_HICO_with_pose.pkl', "rb" ) )
'''
I can't work with the Trainval_GT and Trainval_N .pkl files of iCAN because they cause an error:
'''
File "..\lib\ult\ult.py", line 793, in Augmented_HO_Neg_HICO_pose_pattern_version2
Pattern = Get_next_sp_with_pose(Human_augmented_temp[j][1:], Object_augmented_temp[j][1:], GT[GT_count - 1][5]).reshape(1, 64, 64, 3)
IndexError: list index out of range
'''
If you added pose information to Trainval_GT.pkl, I have another question (QUESTION_3): Trainval_GT.pkl should be ground-truth annotation, but the official anno_bbox.mat file has no pose information, so how did you handle that?
Also, I'm not clear on how to generate the Trainval_Neg_HICO_with_pose.pkl files. Can you provide some relevant code or the idea/process? I would be very grateful.
thank you very much!
Thanks for your great work!
I want to know how you generated Trainval_Neg_HICO.pkl.
And do the negative (human, verb, object) triplets have incorrect verbs?
Hello, sorry to bother you; I have a small question. About self.spatial = tf.placeholder(tf.float32, shape=[None, 64, 64, 3], name='sp') in your code: you changed the 2 from the iCAN code to 3. May I ask the reason for this change? Thank you.
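From the function name Get_next_sp_with_pose used elsewhere in the repo, the third channel appears to carry pose information on top of iCAN's two binary masks. A simplified sketch of such a 64x64x3 map (rasterizing bare keypoints instead of drawing a full skeleton is an assumption here):

```python
import numpy as np

def spatial_map_with_pose(h_box, o_box, keypoints, size=64):
    # Channels 0/1 are iCAN-style binary human/object masks; channel 2
    # rasterizes pose keypoints (the real code draws a skeleton pattern).
    # Boxes are (x1, y1, x2, y2) already rescaled to the 64x64 grid.
    sp = np.zeros((size, size, 3), dtype=np.float32)
    for c, (x1, y1, x2, y2) in enumerate([h_box, o_box]):
        sp[y1:y2, x1:x2, c] = 1.0
    for x, y, score in keypoints:
        if 0 <= x < size and 0 <= y < size:
            sp[int(y), int(x), 2] = score
    return sp
```

So the extra channel lets the spatial stream see pose configuration alongside the relative box layout.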
I can't find it in the script
Thanks for your job!
Why do you fix some parameters when loading the model in from_previous_ckpt and from_best_trained_model?
Hey,
Thanks for the repo. In TIN_VCOCO.py, you introduced HO_weight (line 77) and H_weight (line 83). Where did you get those values, or is it just trial and error?
Sorry to bother you; I have some doubts about the code in TIN_HICO.py.
Question1:
I see the code:
pool5_O = self.crop_pool_layer(head, self.O_boxes[:self.H_num, :], 'Crop_O')
You only use the positive labels here, and I want to ask about the shape of your O branch (cls_prob_O) in the code below:
self.predictions["cls_prob_HO"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
Because I found that the shapes of cls_prob_sp and cls_prob_H are (pos+neg, 600) while the shape of cls_prob_O is (pos, 600) when rewriting it in PyTorch, so they can't be added.
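For reference, here is a numpy sketch of the mismatch and one workaround: slicing the sp and H streams down to the first H_num (positive) rows before fusing. Whether this matches the original TF graph's behavior is an assumption on my part:

```python
import numpy as np

pos, neg, num_classes = 3, 2, 600
H_num = pos  # positive human-object pairs come first in the batch

cls_prob_sp = np.random.rand(pos + neg, num_classes)
cls_prob_H = np.random.rand(pos + neg, num_classes)
cls_prob_O = np.random.rand(pos, num_classes)  # O branch sees positives only

# Fuse only over the positive pairs, where all three streams are defined.
cls_prob_HO = cls_prob_sp[:H_num] * (cls_prob_O + cls_prob_H[:H_num])
```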
Question2:
tf.reshape(cls_prob_binary, [1, self.num_binary])
And:
tf.reshape(cls_prob_H, [1, self.num_classes])
tf.reshape(cls_prob_O, [1, self.num_classes])
tf.reshape(cls_prob_sp, [1, self.num_classes])
Are they useless, since I found they are not assigned to a variable?
Hello, may I ask how I should insert my own pose estimation results into the .pkl file?
self.HO_weight = np.array([
9.192927, 9.778443, 10.338059, 9.164914, 9.075144, 10.045923, 8.714437, 8.59822, 12.977117, 6.2745423,
11.227917, 6.765012, 9.436157, 9.56762, 11.0675745, 11.530198, 9.609821, 9.897503, 6.664475, 6.811699,
6.644726, 9.170454, 13.670264, 3.903943, 10.556748, 8.814335, 9.519224, 12.753973, 11.590822, 8.278912,
5.5245695, 9.7286825, 8.997436, 10.699849, 9.601237, 11.965516, 9.192927, 10.220277, 6.056692, 7.734048,
8.42324, 6.586457, 6.969533, 10.579222, 13.670264, ...]
Hey,
Thanks for the repo. In Section 4.3 of the paper, you introduced the LIS function. Can you point me to the location where this function is implemented in the code? I am mainly curious about how to get the values of T, k and W in the logistic function.
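For intuition, a logistic curve with the shape described in the paper can be sketched as below; note the T, k, w values here are illustrative placeholders, not the tuned constants from the repo:

```python
import math

def lis(score, T=8.3, k=12.0, w=10.0):
    # Logistic low-grade instance suppression sketch: low detection scores
    # are pushed toward 0, high scores saturate near T. The parameter
    # values are placeholders, not the paper's tuned constants.
    return T / (1.0 + math.exp(k - w * score))
```

Monotonicity is the point: lis(0.3) is far smaller than lis(0.9), so low-confidence detections are suppressed much more aggressively than a linear rescaling would.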
File "tools/vcoco_lis_nis.py", line 184, in generate_pkl
score_binary_d = np.array(dic_d['binary_score'])
KeyError: 'binary_score'
the dic_d is 60000_TIN_VCOCO_D.pkl, and I downloaded it from https://docs.google.com/uc?export=download&id=1sJipmoZ-5u0ymm8diqYd5Yqk2A-QQBXN
How do you get the HO_weight and H_weight?
Hello, thank you for your work. Your code only has a setting for the number of iterations, not for batch size or epochs. How are these two parameters set?
Thank you very much.
Hello
When I fix the errors and run "python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO",
it creates a text file named eval_result.txt with the following values:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0000 / 0.0000
mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000
setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0000 / 0.0000
mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000
Why are all the values zero?
Good job!
1. Do you use the AlphaPose results trained on the MPII or the COCO dataset?
2. I would like to know if you filter keypoints whose score is below the threshold (maybe 0.05 in COCO). If so, how do you construct the human skeleton when some keypoints are missing?
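A common fallback, which may or may not match what the repo does (an assumption on my part), is to drop low-score keypoints and draw only the skeleton edges whose endpoints both survive:

```python
def filter_keypoints(keypoints, thresh=0.05):
    # Keep keypoints whose confidence exceeds thresh (0.05 is the COCO
    # convention mentioned above); filtered ones become None.
    return [kp if kp[2] > thresh else None for kp in keypoints]

# Toy edge list over keypoint indices; the real COCO skeleton has 17 joints.
SKELETON = [(0, 1), (1, 2)]

def drawable_edges(keypoints, thresh=0.05):
    # Draw an edge only when both of its endpoints survived the filter,
    # so a missing keypoint simply removes its incident edges.
    kept = filter_keypoints(keypoints, thresh)
    return [(a, b) for a, b in SKELETON
            if kept[a] is not None and kept[b] is not None]
```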