dirtyharrylyl / transferable-interactiveness-network

Code for Transferable Interactiveness Knowledge for Human-Object Interaction Detection. (CVPR'19, TPAMI'21)

License: MIT License

Python 88.01% MATLAB 9.99% Shell 2.00%
human-object-interaction hoi-detection action-recognition hoi-classification hoi-datasets interactiveness-knowledge hoi

transferable-interactiveness-network's Introduction

TIN: Transferable Interactiveness Network

News: (2022.12.19) HAKE 2.0 is accepted by TPAMI!

(2022.11.19) We release the interactive object bounding boxes & classes in the interactions within AVA dataset (2.1 & 2.2)! HAKE-AVA, [Paper]. BTW, we also release a CLIP-based human body part states recognizer in CLIP-Activity2Vec!

(2022.07.29) Our new work PartMap (ECCV'22) is released! Paper, Code

(2022.04.23) Two new works on HOI learning are released! Interactiveness Field (CVPR'22) and a new HOI metric mPD (AAAI'22).

(2022.02.14) We release the human body part state labels based on AVA: HAKE-AVA.

(2021.2.7) Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Description]

(2021.1.15) Our extended version of TIN is accepted by TPAMI!

(2020.10.27) The code of IDN (Paper) in NeurIPS'20 is released!

(2020.6.16) Our larger version HAKE-Large (>120K images, activity and part state labels) is released!

We have opened a tiny repo: HOI learning list (https://github.com/DirtyHarryLYL/HOI-Learning-List). It includes most of the recent HOI-related papers, code, datasets, and leaderboards on widely used benchmarks. We hope it can help everyone interested in HOI.

Code of "Transferable Interactiveness Knowledge for Human-Object Interaction Detection".

Created by Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu.

Link: [CVPR arXiv], [TPAMI arXiv]

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{li2019transferable,
  title={Transferable Interactiveness Knowledge for Human-Object Interaction Detection},
  author={Li, Yong-Lu and Zhou, Siyuan and Huang, Xijie and Xu, Liang and Ma, Ze and Fang, Hao-Shu and Wang, Yanfeng and Lu, Cewu},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3585--3594},
  year={2019}
}
@article{li2022transferable,
  title={Transferable Interactiveness Knowledge for Human-Object Interaction Detection},
  author={Li, Yong-Lu and Liu, Xinpeng and Wu, Xiaoqian and Huang, Xijie and Xu, Liang and Lu, Cewu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022}
}

Introduction

Interactiveness knowledge indicates whether a human and an object interact with each other or not. It can be learned across HOI datasets, regardless of HOI category settings. We exploit an interactiveness network to learn general interactiveness knowledge from multiple HOI datasets and perform Non-Interaction Suppression before HOI classification at inference time. Owing to the generalization ability of interactiveness, TIN (Transferable Interactiveness Network) is a transferable knowledge learner and can be combined with any HOI detection model to achieve desirable results. TIN outperforms state-of-the-art HOI detection results by a great margin, verifying its efficacy and flexibility.
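
To make the Non-Interaction Suppression idea concrete, here is a minimal sketch (not the repository's exact code; the threshold and score names are placeholders): pairs whose predicted interactiveness falls below a threshold are zeroed out before the HOI scores are reported.

```python
import numpy as np

def non_interaction_suppression(hoi_scores, interactiveness_scores, thresh=0.1):
    """Suppress the HOI scores of human-object pairs predicted to be
    non-interactive. The threshold value here is a placeholder."""
    hoi_scores = np.asarray(hoi_scores, dtype=np.float32)          # (num_pairs, num_hoi_classes)
    interactiveness = np.asarray(interactiveness_scores)[:, None]  # (num_pairs, 1)
    keep = (interactiveness >= thresh).astype(np.float32)
    return hoi_scores * keep  # suppressed pairs score 0 for every HOI class

# Example: 3 candidate pairs, 600 HICO-DET HOI classes
suppressed = non_interaction_suppression(np.random.rand(3, 600), [0.9, 0.05, 0.4])
```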

Overview of Our Framework

Results on HICO-DET and V-COCO (CVPR)

Our results on the HICO-DET dataset (mAP, %):

| Method | Full (def) | Rare (def) | Non-Rare (def) | Full (ko) | Rare (ko) | Non-Rare (ko) |
| --- | --- | --- | --- | --- | --- | --- |
| RCD (paper) | 13.75 | 10.23 | 15.45 | 15.34 | 10.98 | 17.02 |
| RPDCD (paper) | 17.03 | 13.42 | 18.11 | 19.17 | 15.51 | 20.26 |
| RCT (paper) | 10.61 | 7.78 | 11.45 | 12.47 | 8.87 | 13.54 |
| RPT1CD (paper) | 16.91 | 13.32 | 17.99 | 19.05 | 15.22 | 20.19 |
| RPT2CD (paper) | 17.22 | 13.51 | 18.32 | 19.38 | 15.38 | 20.57 |
| Interactiveness-optimized | 17.54 | 13.80 | 18.65 | 19.75 | 15.70 | 20.96 |

Our results on the V-COCO dataset (AP_role, Scenario 1, %):

| Method | AP_role |
| --- | --- |
| RCD (paper) | 43.2 |
| RPDCD (paper) | 47.8 |
| RCT (paper) | 38.5 |
| RPT1CD (paper) | 48.3 |
| RPT2CD (paper) | 48.7 |
| Interactiveness-optimized | 49.0 |

Please note that we have reimplemented TIN (e.g., replacing the vanilla HOI classifier with iCAN and using a cosine-decay learning rate), so the results here are slightly different from, and slightly better than, those in the [arXiv] version.

Extended Version (TPAMI 2021): [Paper], [Code]

Part Interactiveness Attention

Besides the instance-level interactiveness between humans and objects, we further propose part-level interactiveness between body parts and objects (whether a body part is in interaction with an object or not). A new large-scale HOI benchmark based on the data from HAKE (CVPR 2020), i.e., PaStaNet-HOI, is also constructed. It contains 110K+ images with 520 HOIs (excluding the 80 "no_interaction" HOIs of HICO-DET to avoid incomplete labeling) and is more difficult than HICO-DET. We hope it can help benchmark HOI detection methods better.

  • For more details, please refer to our PAMI paper (arXiv).
  • The PAMI code is maintained in this [branch].

New results of TPAMI version

RCD in the new version: R (representation extractor), C (interaction classifier), D (interactiveness discriminator); this is slightly different from the CVPR 2019 version.

HICO-DET

| Method | Full (def) | Rare (def) | Non-Rare (def) | Full (ko) | Rare (ko) | Non-Rare (ko) |
| --- | --- | --- | --- | --- | --- | --- |
| R+iCAN+D3 | 17.58 | 13.75 | 18.33 | 19.13 | 15.06 | 19.94 |
| RCD | 17.84 | 13.08 | 18.78 | 20.58 | 16.19 | 21.45 |
| RCD1 | 17.49 | 12.23 | 18.53 | 20.28 | 15.25 | 21.27 |
| RCD2 | 18.43 | 13.93 | 19.32 | 21.10 | 16.56 | 22.00 |
| RCD3 | 20.93 | 18.95 | 21.32 | 23.02 | 20.96 | 23.42 |

V-COCO

| Method | Scenario 1 |
| --- | --- |
| R+iCAN+D3 | 45.8 (46.1) |
| RCD | 48.4 |
| RCD1 | 48.5 |
| RCD2 | 48.7 |
| RCD3 | 49.1 |

PaStaNet-HOI

| Method | mAP |
| --- | --- |
| iCAN | 11.0 |
| R+iCAN+D3 | 13.13 |
| RCD | 15.38 |

Getting Started

Installation

1. Clone this repository.

git clone https://github.com/DirtyHarryLYL/Transferable-Interactiveness-Network.git

2. Download the dataset and set up the evaluation code and API. (The detection results (person and object bounding boxes) are collected from iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection [website].)

chmod +x ./script/Dataset_download.sh 
./script/Dataset_download.sh

3. Install Python dependencies.

pip install -r requirements.txt

If you have trouble installing the requirements, try updating pip or using conda/virtualenv.

4. Download our pre-trained weights (optional).

<!-- old links-->
<!-- python script/Download_data.py 1f_w7HQxTfXGxOPrkriu7jTyCTC-KPEH3 Weights/TIN_HICO.zip
python script/Download_data.py 1iU9dN9rLtekcHX2MT_zU_df3Yf0paL9s Weights/TIN_VCOCO.zip -->
<!-- new links-->
python script/Download_data.py 1AJIyLETjcHF4oxjZKk1KtXteAqHdK-De Weights/TIN_HICO.zip
python script/Download_data.py 13559njUIizkqd9Yu8CtWAqX2ZPRSIp7R Weights/TIN_VCOCO.zip

Training

1. Train on the HICO-DET dataset

python tools/Train_TIN_HICO.py --num_iteration 2000000 --model TIN_HICO_test

2. Train on the V-COCO dataset

python tools/Train_TIN_VCOCO.py --num_iteration 20000 --model TIN_VCOCO_test

Testing

1. Test on the HICO-DET dataset

python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO

2. Test on the V-COCO dataset

python tools/Test_TIN_VCOCO.py --num_iteration 6000 --model TIN_VCOCO

Notes on training and Q&A

Since the interactiveness branch converges more easily, the preferred schedule for best performance is to first pre-train the whole model with the HOI classification loss only, then fine-tune with both the HOI and interactiveness losses.
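
A minimal sketch of such a two-stage schedule (plain Python stand-ins for the actual TensorFlow training loop; the loss values and stage lengths are placeholders):

```python
def total_loss(hoi_loss, interactiveness_loss, stage):
    """Stage 1: pre-train with the HOI classification loss only.
    Stage 2: fine-tune with both the HOI and interactiveness losses."""
    return hoi_loss if stage == 1 else hoi_loss + interactiveness_loss

for stage in (1, 2):
    for step in range(10):               # placeholder iteration counts
        hoi_loss, inter_loss = 0.7, 0.3  # placeholders for values computed by the model
        loss = total_loss(hoi_loss, inter_loss, stage)
        # an optimizer step on `loss` would go here
```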

Q: How are the loss weights generated?

A: Please refer to this issue for detailed explanation.

You may also be interested in our new work **HAKE** [[website]](http://hake-mvig.cn/home/). HAKE is a new large-scale knowledge base and engine for human activity understanding. HAKE provides elaborate and abundant **body part state** labels for active human instances in a large number of images and videos. With HAKE, we boost the HOI recognition performance on HICO and several other widely used human activity benchmarks. We are still enlarging and enriching it, and we look forward to working with outstanding researchers around the world on its applications and further improvements. If you have any advice or interest, please feel free to contact [Yong-Lu Li](https://dirtyharrylyl.github.io/) ([email protected]).

Acknowledgement

Some of the code is built upon iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection [website]. Thanks for their great work! The pose estimation results are obtained from AlphaPose, an accurate multi-person pose estimator, which is the first real-time open-source system that achieves 70+ mAP (72.3 mAP) on the COCO dataset and 80+ mAP (82.1 mAP) on the MPII dataset. You may also use your own pose estimation results to train the interactiveness predictor: directly download the train and test pkl files from the iCAN [website] and insert your pose results.
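
If you do want to swap in your own pose results, a rough sketch of the idea is below. The per-entry layout of the iCAN pkl files is not spelled out here, so the dict-of-lists structure and the appended pose field are assumptions to adapt, not the repository's documented format:

```python
import pickle

def attach_pose(detection_pkl_in, pose_by_image, detection_pkl_out):
    """Load an iCAN-style detection pickle and attach your own pose results.
    Assumes a dict keyed by image id whose values are list-like detection
    entries; adjust the indexing to the actual pkl structure."""
    with open(detection_pkl_in, "rb") as f:
        detections = pickle.load(f)
    for image_id, entries in detections.items():
        pose = pose_by_image.get(image_id)  # e.g. AlphaPose keypoints for this image
        for entry in entries:
            entry.append(pose)              # hypothetical extra field holding the pose
    with open(detection_pkl_out, "wb") as f:
        pickle.dump(detections, f)
```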

If you get any problems or if you find any bugs, don't hesitate to comment on GitHub or make a pull request!

TIN (Transferable Interactiveness Network) is freely available for non-commercial use and may be redistributed under these conditions. For commercial queries, please drop an e-mail; we will send you the detailed agreement.

transferable-interactiveness-network's People

Contributors

dirtyharrylyl, enlighten0707, foruck, huangowen, wanna-fly


transferable-interactiveness-network's Issues

How to select the correct HOI from multiple predictions at test time?

Hi,
Thank you for making your code available. I've one question.

At test time, the test_net function in lib/models/test_HICO_pose_pattern_all_wise_pair.py generates multiple HOI detections for each image. Only a few of them are correct predictions, while the others are not, yet all of them have very high prediction scores. How do you select the correct predictions while ignoring the wrong ones?

This is the information from the code that is written into the pkl file.

            temp.append(Human_out[2])           # Human box
            temp.append(Object[2])              # Object box
            temp.append(Object[4])              # Object class
            temp.append(prediction_HO[0])       # Score (600)
            temp.append(Human_out[5])           # Human score
            temp.append(Object[5])              # Object score
            temp.append(prediction_binary[0])   # binary score

I'm looking to visualize the HOI detections. Any suggestions are really appreciated.
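
One way to start (a sketch under assumptions, not the repository's answer): threshold on the stored binary/interactiveness score and rank the remaining pairs by a combined score. The index layout follows the snippet above; the combination rule, the meaning of binary[0], and the threshold are assumptions.

```python
import numpy as np

def top_hoi_predictions(detections, binary_thresh=0.5, top_k=3):
    """Rank the per-pair HOI predictions stored in the test pkl (hypothetical helper)."""
    results = []
    for det in detections:
        h_box, o_box, o_cls, hoi_score, h_score, o_score, binary = det[:7]
        binary = np.asarray(binary).ravel()
        if binary[0] < binary_thresh:      # assume binary[0] is the "interactive" probability
            continue                       # suppress likely non-interactive pairs
        final = np.asarray(hoi_score).ravel() * float(h_score) * float(o_score)
        best = np.argsort(-final)[:top_k]  # top-k HOI classes for this pair
        results.append((h_box, o_box, o_cls, best, final[best]))
    return results
```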

What is the reason for using negative examples?

Hello
Negative samples have wrong relationships, so doesn't minimizing the loss push the predictions for these negative samples toward those wrong relations? What is the reason for using negative examples?

Question About TIN_VCOCO.py

Hey,
Thanks for the repo. In TIN_VCOCO.py, you introduce HO_weight (line 77) and H_weight (line 83). Where did you get those values, or is it just trial and error?

Test on custom images/video

Hi,
First of all I would like to thank you for the awesome work you have done. It is quite impressive to see such great results.

I need to run the network on some custom images and a custom video, but I am a bit confused as to whether this is possible with the code shipped in your repository. There is a pkl file which you import to test the V-COCO detections, but how do we generate it for our custom data?
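
A possible starting point (a sketch, assuming a recent torchvision; this is a stand-in detector, not the repository's pipeline): run an off-the-shelf Faster R-CNN on your images, then pack the boxes into the same pkl layout as the shipped Test_Faster_RCNN_R-50-PFN_2x_VCOCO.pkl, which you can inspect with pickle.load to mirror its per-image entry format.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(image_path, score_thresh=0.5):
    """Return boxes, COCO labels and scores for one custom image (hypothetical helper)."""
    img = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]
    keep = out["scores"] > score_thresh
    return out["boxes"][keep].tolist(), out["labels"][keep].tolist(), out["scores"][keep].tolist()

# The returned detections still have to be converted into the repository's pkl
# structure before they can be fed to the test scripts.
```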

Error using Generate_detection>(parfor body) (line 19). please

Error using Generate_detection>(parfor body) (line 19)
Unable to read file
'/home/magus/huxw-workspace/Transferable-Interactiveness-Network/-Results/TIN_HICONIS_thres_x0.1_y0.9/detections_14.mat'.
No such file or directory.

Error in Generate_detection (line 17)
parfor obj_idx = 1 : 80

About the HICO-DET metric

Hi, I am very confused about the metric for the HICO-DET dataset. As I understand it, the mAP is calculated as the mean average precision over every HOI class. Are the bounding boxes taken from the ground truth or from the detector's results? If you use the detector's results to compute the mAP, can you tell me your detector's mAP on HICO-DET, and do you train the detector yourselves?
I will pay the highest respect.
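
For reference, HICO-DET mAP is the mean over per-class APs, where a detection counts as a true positive when both the human and the object box overlap a ground-truth pair of that HOI class with IoU > 0.5. A minimal VOC-style AP sketch (not the official evaluation code):

```python
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """11-point interpolated AP for one HOI class over ranked detections."""
    order = np.argsort(-np.asarray(scores, dtype=np.float64))
    tp = np.asarray(is_true_positive, dtype=np.float64)[order]
    fp = 1.0 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / max(num_gt, 1)
    precision = tp_cum / np.maximum(tp_cum + fp_cum, 1e-12)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):  # integrate precision over recall
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return ap

# mAP is then the mean of the per-class APs over all 600 HOI categories.
```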

About human pose feats and HICO-Det Dataset Annotation

Good job!
1. Do you use the AlphaPose results trained on the MPII or the COCO dataset?
2. I would like to know whether you filter out keypoints whose score is below a threshold (maybe 0.05 in COCO). If so, how do you construct the human skeleton when some keypoints are missing?
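
Regarding keypoint filtering, one plausible handling (an assumption for illustration, not necessarily the authors' strategy) is to drop low-confidence keypoints and fall back to the human-box center:

```python
import numpy as np

def clean_keypoints(keypoints, human_box, conf_thresh=0.05):
    """keypoints: (K, 3) array of (x, y, confidence). Low-confidence keypoints
    are replaced by the human-box center as a simple fallback (an assumption,
    not the repository's documented behavior)."""
    kp = np.asarray(keypoints, dtype=np.float32).copy()
    x1, y1, x2, y2 = human_box
    center = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0], dtype=np.float32)
    missing = kp[:, 2] < conf_thresh
    kp[missing, :2] = center
    return kp
```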

Questions about Code in TIN_HICO.py

Sorry to bother you; I have some doubts about the code in TIN_HICO.py.

Question 1:
I see the code:
pool5_O = self.crop_pool_layer(head, self.O_boxes[:self.H_num, :], 'Crop_O')
You only use the positive labels, and I want to ask about the shape of your O branch (cls_prob_O) in the code below:
self.predictions["cls_prob_HO"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
I found that the shapes of cls_prob_sp and cls_prob_H are (pos+neg, 600), while the shape of cls_prob_O is (pos, 600) when rewriting this in PyTorch, so they cannot be added.

Question 2:
tf.reshape(cls_prob_binary, [1, self.num_binary])
And:
tf.reshape(cls_prob_H, [1, self.num_classes])
tf.reshape(cls_prob_O, [1, self.num_classes])
tf.reshape(cls_prob_sp, [1, self.num_classes])
Are these useless? I found that their results are not assigned to any variable.

Problem installing

Hi. Thank you for your open-source code and wonderful work.
I am having a problem installing this repo on Colab: pip is not able to find model==0.6.0.
Can you guide me on how to get around this, or is there a typo in the library name?
I shall be very thankful for your help.

Questions about the ablation studies

Hi guys, thanks for your nice code!
I'm trying to check the contribution of each stream, but the result is totally different from that in your paper. Here is my method:

  1. For human stream, I just store prediction_H generated by net.test_image_H and repeat it for all objects paired with the current human instance during the test;
  2. For object stream, I use self.predictions["cls_prob_O"] as prediction;
  3. Similarly, I use self.predictions["cls_prob_sp"] as prediction for sp stream;

I train the network jointly and adopt the above settings during the test, and finally I got results like this: AP = 37.85 for the human stream, AP = 31.63 for the object stream, and AP = 47.19 for the sp stream. I think there must be something wrong with my method, but I have no idea what. So would you mind sharing your strategy for the ablation study? How do you get the results of the different streams?

What does "HO_weight" and "binary_weight" mean?

Hi @DirtyHarryLYL ! Thanks a lot for your great work!

I noticed that in lib/networks/TIN_HICO.py, you add two extra weights, self.HO_weight and self.binary_weight, to the classification scores from both the HOI and binary classifiers, which is different from the iCAN code. May I ask why you multiply the weights with the raw classification scores, and how are the weights generated?

Thanks!

About the batchsize and epoch

Hello, thank you for your work. Your code only has settings for the number of iterations, but not for the batch size and the number of epochs. So how are these two parameters set?
Thank you very much.

Zero values for mAP

Hello
When I fix the errors and run "python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO", it creates a text file named eval_result.txt with the following values:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a

mAP / mRec (full): 0.0000 / 0.0000

mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000

setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a

mAP / mRec (full): 0.0000 / 0.0000

mAP / mRec (rare): 0.0000 / 0.0000
mAP / mRec (non-rare): 0.0000 / 0.0000

Why are all the values zero?

Reading comprehension of the paper

Thanks for your great work! @HuangOwen
I have some quick questions.

  1. Why do you use sp-pose features (the sp-pose stream) in network P, but only sp features (the sp stream) in network C? Doesn't the pose map benefit C?
  2. You may know some object detection methods, e.g., Faster R-CNN (two-stage, R-CNN based) and SSD (one-stage, without a region proposal stage). Can I say that, in principle, TIN vs. iCAN is quite similar to Faster R-CNN vs. SSD, although they belong to different tasks?
    The tiny difference is: in Faster R-CNN, the binary score from the binary classification (RPN) does not participate in calculating the score of the multi-class classification (final detection), while in TIN, the final HOI score of a pair is obtained by Sc * Sp.

Why not use YOLO or SSD as the instance detector?

Hello,

Thank you for your good works.

After reading some papers on this task, I realized they all leverage Faster R-CNN as the instance detector. I am not sure of the main reason for this, because YOLO and SSD are also good detectors. That has really confused me these days. Could anyone please explain this if you are aware of the reason?

Thanks.

How to generate my "Trainval_GT_VCOCO.pkl" and "Trainval_Neg_VCOCO.pkl"?

Hi @HuangOwen
Sorry to bother you. I have a quick question that maybe you can help with.
I wonder how I can generate my own "Trainval_GT_VCOCO.pkl" and "Trainval_Neg_VCOCO.pkl"?
To my understanding, I can install Faster R-CNN and run it on the V-COCO test set to get the test data (to obtain "Test_Faster_RCNN_R-50-PFN_2x_VCOCO.pkl").
But how can I generate the training data?

Thanks a lot.

60000_TIN_VCOCO_D.pkl

Hello! Thank you very much for your open-source work!
I would like to know how to get the ..._TIN_VCOCO_D.pkl file. It seems I can only get the ...naked.pkl, whose data format is not suitable for direct visualization.
Many thanks for your help in advance!

2 questions about CNN architecture

Really wonderful work!

I have 2 questions about the architecture details.
(1) In Sec. 5.2, "Relatively, the spatial stream is composed of two convolutional layers with max pooling, and two 1024 sized FCs", two 1024-d FCs are used in the spatial stream of C.
In Figure 3, it seems four are used, as in the H and O streams.

Which one is correct?

(2) In the iCAN paper, the residual block has 2048 channels. In my understanding, you used 1024 in your paper instead, followed by four 1024-d FCs. Am I right?

The shape of pool5_O in early and late fusion

Hello, and thank you for your job well done.
I have a problem with the code. In late fusion, the shape of pool5_O is (46, 7, 7, 1024), while in early fusion, pool5_O has the shape (16, 7, 7, 1024). I have the following questions:

  1. pool5_O is produced by the same code in both fusions, so am I missing something? Why are the sample sizes inconsistent?
  2. Why are the sample sizes chosen differently in the two fusion methods?
    I hope we can get a solution.
    Thank you very much.

What is the file '-Results/80000_TIN_D_noS.pkl'?

Hi, I simply ran the command 'python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO' for testing, but encountered a problem: '-Results/80000_TIN_D_noS.pkl' cannot be found in HICO_Benchamrk_Binary.py. I have no idea where this file is saved or in what format it is saved.

Always 1 prediction for the interactivity class?

Hi,

When I feed the images forward with the provided human-object detections, the interactiveness network always outputs a confidence of 1 for interactiveness and 0 for no-interaction.

I use the provided network weights with the evaluation code (Test_TIN_HICO), and did not do any training myself.

Is this expected? Or does it mean a degenerate solution?

Thank you.

Query About Low-grade Instance Suppressive Function

Hey,
Thanks for the repo. In Section 4.3 of the paper, you introduce the LIS function. Can you point me to where this function is implemented in the code? I am basically curious about how you get the values of T, k and W in the logistic function.
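
For context, the paper defines LIS as a logistic curve over the detection score, LIS(s) = T / (1 + exp(k - w·s)). A minimal sketch with placeholder parameter values (see the paper and code for the actual settings and where it is applied):

```python
import numpy as np

def lis(score, T=1.0, k=6.0, w=12.0):
    """Low-grade Instance Suppressive function: LIS(s) = T / (1 + exp(k - w*s)).
    The T, k, w values here are placeholders, not the repository's settings."""
    return T / (1.0 + np.exp(k - w * np.asarray(score, dtype=np.float64)))

# Low detection scores are suppressed far more strongly than high ones:
print(lis(0.3), lis(0.9))
```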

dataset

Hi, thanks for your code. I have some questions.
1. Is there just one HOI class per image? When I run demo_visualize, several humans and several objects may be identified in each image, but their HOI class is the same. I don't understand the dataset; please help me.
2. Please explain the Trainval_GT_HICO.pkl file.
Thanks.

The feature vector shape problem of early fusion and late fusion

Hello, and thank you for the great job.
I have a problem with the code. In late fusion, the shape of fc7_O is (16, 2048); however, the shape of fc7_H in early fusion is (46, 2048). In the network, the two feature vectors are generated in the same way. I would like to ask where in the code this difference is implemented?

About the .pkl files for training

Hi, thanks for the great work!

I browsed your code for the HICO-DET dataset, and I've seen the iCAN code before.
I saw that you use .pkl files in Train_TIN_HICO.py, and (QUESTION 1) I want to know whether your Trainval_GT and Trainval_Neg .pkl files add pose information to the .pkl files (compared with iCAN).
I saw that your pickle.load calls load '*_with_pose.pkl' files, but (QUESTION 2) I didn't find where to download these training .pkl files (the ones with pose information), i.e.
'''
Trainval_GT = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_GT_HICO_with_pose.pkl', "rb" ) )
Trainval_N = pickle.load( open( cfg.DATA_DIR + '/' + 'Trainval_Neg_HICO_with_pose.pkl', "rb" ) )
'''
I can't use the Trainval_GT and Trainval_Neg .pkl files from iCAN because they produce an error:
'''
File "..\lib\ult\ult.py", line 793, in Augmented_HO_Neg_HICO_pose_pattern_version2
Pattern = Get_next_sp_with_pose(Human_augmented_temp[j][1:], Object_augmented_temp[j][1:], GT[GT_count - 1][5]).reshape(1, 64, 64, 3)
IndexError: list index out of range
'''
If you add pose information to Trainval_GT.pkl, I have another question (QUESTION 3): Trainval_GT.pkl should be the ground-truth annotation, but the official anno_bbox.mat file has no information about pose, so how do you deal with that?

Also, I'm not clear on how to generate the Trainval_Neg_HICO_with_pose.pkl files. Can you provide some relevant code or describe the ideas and process? I will be very grateful.
Thank you very much!

Question about the reported mAP

Hello,
Thanks for the code repo. In the paper, did you report the mAP of Scenario 1 or Scenario 2 on the V-COCO set?

Some question about the implementation

Hi @DirtyHarryLYL, thanks for the great work!
I roughly browsed the code and have some questions; could you give me some advice?

In the V-COCO model:

  1. What is the difference between GT[i][1] and GT[i][4]?
  2. I cannot understand the implementation of the function 'binary_discriminator': why does 'fc8_binary_2' only contain the 'positive' human features?
  3. It seems that you do not compute the 'object_roles' classification, but V-COCO has the GT for 'object roles'.
  4. And why do you set the number of actions to 29?
  5. When you do the evaluation, there are many human-object combinations even after some have been removed by interactiveness. Do you put these negative samples into the results to compute the mAP, or just keep the best combination with the highest action score? (I am confused about the evaluation; will the negative samples influence the mAP?)

Thanks very much! Looking forward to your reply!

> 50+ mAP? Wow, that is so high. No idea, maybe something went wrong, or it is the benefit of better object detections. Maybe you can convert iCAN's detections to your format and use the same model to verify the 10+% difference.

I just made a mistake in the evaluation code; I rewrote it correctly and then evaluated on the val data, and it reaches 43 AP_role. I have read some papers and they differ from each other: do you evaluate on the val data or on the test data?

Originally posted by @ZHUXUHAN in #3 (comment)

Having trouble installing

When I'm installing the requirements, it stops at datasets=0.0.9 with an error (screenshot omitted here).

Anyone experiencing the same problem?

Question about the baseline structure compared to iCAN

Hi, thanks for the great work!
I have read your code, and it seems your baseline structure $RC_D$ is the same as iCAN (please correct me if I'm wrong).
I wonder why the results of $RC_D$ seem lower than iCAN's; if you could address that, it would be great.
Thanks very much!

About testing the model: the mAP value is too low

Hi, I used your pre-trained weights and ran the script
python tools/Test_TIN_HICO.py --num_iteration 1700000 --model TIN_HICO
but the mAP is lower than your reported value, and I didn't change anything.
This is what I get:
setting: def
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0642 / 0.1615
mAP / mRec (rare): 0.0493 / 0.1433
mAP / mRec (non-rare): 0.0686 / 0.1669
setting: ko
exp_name: rcnn_caffenet_ho_pconv_ip1_s
score_blob: n/a
mAP / mRec (full): 0.0745 / 0.1615
mAP / mRec (rare): 0.0542 / 0.1433
mAP / mRec (non-rare): 0.0805 / 0.1669
So where did I go wrong?

How to choose the best iter?

Thanks for your great work!
I want to know how to choose the best iteration to test, as I found you didn't set up a validation set in your code.

Excuse me, how do you get self.HO_weight? Thank you

self.HO_weight = np.array([
9.192927, 9.778443, 10.338059, 9.164914, 9.075144, 10.045923, 8.714437, 8.59822, 12.977117, 6.2745423,
11.227917, 6.765012, 9.436157, 9.56762, 11.0675745, 11.530198, 9.609821, 9.897503, 6.664475, 6.811699,
6.644726, 9.170454, 13.670264, 3.903943, 10.556748, 8.814335, 9.519224, 12.753973, 11.590822, 8.278912,
5.5245695, 9.7286825, 8.997436, 10.699849, 9.601237, 11.965516, 9.192927, 10.220277, 6.056692, 7.734048,
8.42324, 6.586457, 6.969533, 10.579222, 13.670264,.......]
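
As a rough illustration only (an assumption, not confirmed as how HO_weight was produced), one common heuristic for rare-class weighting is an inverse log-frequency weight per HOI class:

```python
import numpy as np

def inverse_log_frequency_weights(class_counts):
    """Hypothetical heuristic: w_c = log(N / n_c), so rare classes get larger weights."""
    counts = np.asarray(class_counts, dtype=np.float64)
    return np.log(counts.sum() / np.maximum(counts, 1.0))

print(inverse_log_frequency_weights([10, 1000, 50000]))  # three hypothetical class frequencies
```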

Question about the detector and feature extractor

Hi! Thanks for your great work!

I've read your code and noticed that in your model, you import the detection results from Detectron in Data/Test_Faster_RCNN_R-50-PFN_2x_HICO_DET.pkl, which contains bbox annotations of humans and objects, and then use another simplified ResNet-50 (just stage 1 to stage 4) to extract features from the proposals cropped from the raw image according to those annotations. Is my description of the model correct? I would really appreciate it if you could correct me if I am wrong.

Many thanks for your attention! 😁

Retrained the code and got a bad result. Did I do something wrong?

@HuangOwen

I retrained and tested your code on V-COCO with my GPU, but got a bad result:
Average Role [scenario_1] AP = 30.39
while the result in your paper is:
AP = 47.8 (RPdCd)
Would you mind telling me if I did something wrong?
Here are the scripts I used:
python tools/Train_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO_test
python tools/Test_TIN_VCOCO.py --num_iteration 300000 --model TIN_VCOCO

Thanks again.

GT and Neg?

Hello, may I ask how the GT and Neg in the pre-training model were obtained respectively?

I can't download the pre-trained weights from Google Drive!

Hi,
Thanks for your wonderful contribution.

I'm using Google Colab..

I've used this command.

!python ./script/Download_data.py 1f_w7HQxTfXGxOPrkriu7jTyCTC-KPEH3 Weights/TIN_HICO.zip
!python ./script/Download_data.py 1iU9dN9rLtekcHX2MT_zU_df3Yf0paL9s Weights/TIN_VCOCO.zip

I got this error:

Traceback (most recent call last):

File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_HICO.zip'
Traceback (most recent call last):
File "./script/Download_data.py", line 52, in
download_file_from_google_drive(file_id, destination)
File "./script/Download_data.py", line 40, in download_file_from_google_drive
save_response_content(response, destination)
File "./script/Download_data.py", line 24, in save_response_content
with open(destination, "wb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Weights/TIN_VCOCO.zip'

Thanks in advance.

For vcoco evaluation

Hello, thank you for your open-source code. For V-COCO evaluation, when the IoU between the detected human and object boxes and the ground-truth human and object boxes is more than 0.5, and the action prediction is correct, the sample is positive. I want to know what it means for the action prediction to be correct.

In project/lib/networks/TIN_VCOCO.py,
self.predictions["cls_prob_HO_final"] = cls_prob_sp * (cls_prob_O + cls_prob_H)
The scoring method for the final action is: action_score = self.predictions["cls_prob_HO_final"] * LIS(H_det_score) * LIS(O_det_score) (is my understanding correct?), and what is the value range of the action_score?


[ {'image_id': # the coco image id,
'person_box': # [x1, y1, x2, y2], the box prediction for the person,
'[action]_agent': # the score for the action corresponding to the person prediction,
'[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for the role and
# the associated score for the action-role pair.
} ]
In [x1, y1, x2, y2, s], is s the action_score above?
In the V-COCO evaluation code, how high must an action score (s in '[action]_[role]') be for the pair to be considered to have that action?
Thank you
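
A tiny self-contained sketch of the score combination described in the question (whether it matches the repository exactly should be confirmed against TIN_VCOCO.py; the LIS parameters are placeholders):

```python
import numpy as np

def lis(score, T=1.0, k=6.0, w=12.0):
    """Logistic suppression of a detection score (placeholder parameters)."""
    return T / (1.0 + np.exp(k - w * np.asarray(score, dtype=np.float64)))

def action_score(cls_prob_ho_final, h_det_score, o_det_score):
    """Combine the HOI probability with LIS-transformed detection scores.
    If every factor lies in [0, 1], the product lies in [0, 1] as well; the
    actual range depends on how cls_prob_HO_final is formed."""
    return np.asarray(cls_prob_ho_final) * lis(h_det_score) * lis(o_det_score)

print(action_score(0.8, 0.95, 0.7))
```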

where is res50_faster_rcnn?

when run "python tools/Train_TIN_HICO.py --num_iteration 2000000 --model TIN_HICO_test"
give this error "tensorflow.python.framework.errors_impl.NotFoundError: FindFirstFile failed for: E:/Transferable-Interactiveness-Network-master/Transferable-Interactiveness-Network-master/Weights/res50_faster_rcnn : The system cannot find the path specified."

Please help me! I cannot find the files.

Hi,
Thanks for reading this.
Please help me!
I cannot find the files 'vcoco_test.json', 'instances_vcoco_all_2014.json' and 'v-coco/data/splits/vcoco_test.ids'.

Can you tell me how to download these files, or send them to me by mail at "[email protected]".
Thanks very much.
