xlliu7 / tadtr

[TIP 2022] End-to-end Temporal Action Detection with Transformer

License: Apache License 2.0

Python 93.45% C++ 1.09% Cuda 5.12% Shell 0.34%

transformer temporal-action-localization temporal-action-detection action-recognition pytorch

tadtr's Issues

About th14_i3d2s_ft_info.json

Hello, thank you for your work!
I would like to know how feature_length can be read directly from the video feature files, because I am trying this code on my own dataset.

Different lengths of Thumos14 I3D Features

Hi, Xiaolong. I'm very interested in your work. As you mentioned in another issue, you use the I3D features from P-GCN for the Thumos14 experiment. I find that some features for the same video have different lengths, so I can't concatenate them directly. The difference is always 1. Have you ever encountered this situation? If so, how did you deal with it? Thanks~
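One common workaround for an off-by-one length mismatch is to trim both sequences to the shorter length before concatenating them per snippet. The sketch below is an assumption, not the author's confirmed method; `stream_a` and `stream_b` are hypothetical names for the two feature sequences of one video, represented as plain lists.

```python
def align_and_concat(stream_a, stream_b):
    """Trim two per-snippet feature sequences to a common length,
    then concatenate the feature vectors snippet by snippet."""
    n = min(len(stream_a), len(stream_b))
    return [list(a) + list(b) for a, b in zip(stream_a[:n], stream_b[:n])]


# Toy data: 3 snippets in one stream, 2 in the other (lengths differ by 1).
stream_a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
stream_b = [[1.0], [2.0]]
fused = align_and_concat(stream_a, stream_b)
```

Trimming drops at most one trailing snippet, which is usually negligible compared with the video length.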

Reproducibility of ActivityNet

Hi, first thanks for your great work.
I am trying to reproduce your results on ActivityNet. I followed the procedure in your paper, using TSP features and adding some code to the Dataset module. I can run the whole pipeline on ActivityNet, but I cannot get results as good as those presented in the paper. In my runs, all results drop by about 3-4%.
I am wondering whether you plan to open-source the training code for ActivityNet?

Nondeterministic results

Hello,

Thank you for sharing the code. I checked the code, and all of the seeds are set.
I further added torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False to make the code produce the same results in different runs. However, the results differ between different runs.
Do you have any idea why?
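A possible explanation, not confirmed in this thread: the cuDNN flags only cover cuDNN kernels. Other CUDA operations, and custom CUDA extensions that accumulate gradients with floating-point atomic adds, can still produce run-to-run differences. The sketch below shows a stricter setup; `enable_determinism` is a hypothetical helper, and `torch.use_deterministic_algorithms` does not control custom extensions.

```python
import os
import random


def enable_determinism(seed=42):
    """Apply the strictest determinism settings available.
    Returns True if PyTorch was found and configured."""
    # cuBLAS needs this set before its first call to be deterministic (CUDA >= 10.2).
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    random.seed(seed)
    try:
        import torch
    except ImportError:
        return False
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Available in PyTorch >= 1.8: built-in ops without a deterministic
    # implementation will raise an error, which helps locate the culprit.
    if hasattr(torch, "use_deterministic_algorithms"):
        torch.use_deterministic_algorithms(True)
    return True


torch_available = enable_determinism(seed=0)
```

If the results still differ with these settings, the remaining suspect would be a custom kernel, since its accumulation order is outside PyTorch's control.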

Thanks in advance.

how to combine with classifier?

Hi @xlliu7,

Interesting paper! I would like to know how to combine your model with a classifier, e.g. PGCN in Table 1.
Would you mind sharing the code? Thanks.

E2E-TAD code

Do you have a planned date for releasing the code of E2E-TAD?

Actionness Regression not working

The claimed improvement from actionness regression does not seem to materialize based on my implementation using this code repository. The results with and without actionness regression are very similar.

Upon inspecting the implementation, I noticed a potential issue:

TadTR/models/tadtr.py

Lines 314 to 325 in 983ae14

    src_segments = outputs['pred_segments'].view((-1, 2))
    target_segments = torch.cat([t['segments'] for t in targets], dim=0)

    losses = {}

    iou_mat = segment_ops.segment_iou(
        segment_ops.segment_cw_to_t1t2(src_segments),
        segment_ops.segment_cw_to_t1t2(target_segments))

    gt_iou = iou_mat.max(dim=1)[0]

    pred_actionness = outputs['pred_actionness']
    loss_actionness = F.l1_loss(pred_actionness.view(-1), gt_iou.view(-1).detach())

On line 315, all target segments in the batch are concatenated, and on line 323 the maximum IoU between a predicted segment and all target segments is taken as the actionness ground truth. However, the IoUs are computed across videos, likely producing a maximum IoU between a predicted segment in video A and a ground truth segment in video B.

Even after correcting this issue, there was still no performance improvement from the actionness regression in my runs (the performance actually drops a lot). From my inspection, this is because the actionness regression suffers from a serious label imbalance problem, as most target IoUs are zero.
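To illustrate the cross-video effect described above, here is a minimal pure-Python sketch that takes the maximum IoU within each video only. It is not the repository's code: `cw_to_t1t2`, `segment_iou`, and `actionness_targets` are simplified stand-ins written for this example, and segments are assumed to be (center, width) pairs.

```python
def cw_to_t1t2(seg):
    """Convert a (center, width) segment to (start, end)."""
    c, w = seg
    return (c - w / 2.0, c + w / 2.0)


def segment_iou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def actionness_targets(pred_segments, gt_segments):
    """pred_segments[v] and gt_segments[v] hold the (center, width) segments
    of video v. Each prediction is matched only against its own video."""
    targets = []
    for preds, gts in zip(pred_segments, gt_segments):
        gts_t = [cw_to_t1t2(g) for g in gts]
        for p in preds:
            p_t = cw_to_t1t2(p)
            targets.append(max((segment_iou(p_t, g) for g in gts_t), default=0.0))
    return targets


# Video A's prediction coincides with video B's ground truth, and vice versa.
preds_a, gts_a = [(0.5, 0.2)], [(0.2, 0.2)]
preds_b, gts_b = [(0.2, 0.2)], [(0.5, 0.2)]

per_video = actionness_targets([preds_a, preds_b], [gts_a, gts_b])
# Pooling everything into one "video" mimics the batch-wide concatenation.
pooled = actionness_targets([preds_a + preds_b], [gts_a + gts_b])
```

In this toy batch the per-video targets are both zero, while the pooled version assigns each prediction a near-perfect IoU taken from the other video's ground truth.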

Request code for ActivityNet

Hello, thanks for the nice work.
I have sent you an email requesting the code for ActivityNet.
Could you share your code?

Best wishes.

Redundant computation of reference point

https://github.com/xlliu7/TadTR/blob/master/models/transformer.py#L287
The reference point is computed here, inside the decoder's forward,
but here
https://github.com/xlliu7/TadTR/blob/master/models/tadtr.py#L185
it is computed again. These two parts appear to be redundant.

Inference on single video

Hello! I am about to start research in this direction. Is there any code for running inference on a single video?

ImportError: cannot import name 'TemporalDeformableAttention' from 'models.ops.temporal_deform_attn'

I encountered the above error when trying to run the demo code.

How to evaluate the FLOPS?

Could you please share the counting code? Thanks!

Training on my dataset with mistakes

When I train on my dataset, something goes wrong: the map_raw is nan.
Can you give me some suggestions?
Thanks!

Regarding the features

Can you provide the link to download the features?

code releasing date

Thanks for your great work. When will the training & inference code be released? Can you give an approximate date? Thanks!

Modification of focal loss to work with mix-up augmentation?

I'm trying to train on relatively small datasets. Mix-up is one way to reduce overfitting, but it seems that focal loss is not designed to work with labels given as probabilities. It seems that this line

TadTR/models/tadtr.py

Line 274 in 3af0abc

target_classes_onehot.scatter_(2, target_classes.unsqueeze(-1), 1)

is specifically designed for binary classification.

Do you have any idea how to modify focal loss for labels given as probabilities?
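One possible generalization, offered as an assumption rather than anything from this repository: keep the binary cross-entropy term, which already accepts soft targets, and replace the modulating factor (1 - p_t)^gamma with |target - p|^gamma. With hard 0/1 labels this reduces to the usual sigmoid focal loss. `soft_sigmoid_focal_loss` is a hypothetical name, and the scalar pure-Python form is for illustration only.

```python
import math


def soft_sigmoid_focal_loss(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for one class logit with a soft target in [0, 1]."""
    p = 1.0 / (1.0 + math.exp(-logit))
    eps = 1e-12  # guards log(0)
    # Binary cross-entropy, which is well defined for soft targets.
    ce = -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))
    # Down-weight predictions that are already close to the target.
    modulator = abs(target - p) ** gamma
    # Interpolate the class-balance weight between positive and negative.
    alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
    return alpha_t * modulator * ce


hard_loss = soft_sigmoid_focal_loss(0.0, 1.0)   # ordinary positive label
mixed_loss = soft_sigmoid_focal_loss(0.0, 0.7)  # mix-up style soft label
```

With this form, the one-hot targets built by the quoted `scatter_` line would be replaced by the mixed label weights, while the rest of the loss stays per-class and sigmoid-based.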

Reproduction on ActivityNet1.3 Dataset

Hi, thank you for your brilliant work!

Can you provide your test results on the ActivityNet1.3 dataset?

Thank you very much!

No training/inference code or weights

Hi! I'm really interested in using this work for action detection - is there any way I could get access to your training scripts and pretrained weights?

How to generate th14_i3d2s_ft_info.json?

Hello, thank you for your good work!
I want to know how to generate th14_i3d2s_ft_info.json for thumos14 video features, and how to compute "feature_length", "feature_second" and "feature_fps" for each video.
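A rough sketch of how such a file could be generated for a custom dataset. The meaning of the three fields is assumed from their names and is not confirmed by the author: "feature_length" as the number of feature vectors, "feature_fps" as feature vectors per second, and "feature_second" as the duration they cover. `build_feature_info`, the `stride` parameter, and the video name are hypothetical.

```python
import json


def build_feature_info(num_features, video_fps, stride):
    """num_features: number of feature vectors for the video, e.g. the first
    dimension of the saved feature array.
    video_fps: frame rate of the source video.
    stride: frames between consecutive feature vectors (assumed constant)."""
    feature_fps = video_fps / stride
    return {
        "feature_length": num_features,
        "feature_second": num_features / feature_fps,
        "feature_fps": feature_fps,
    }


# Toy example: 100 feature vectors from a 30 fps video, one every 8 frames.
info = {"my_video_001": build_feature_info(100, video_fps=30.0, stride=8)}
info_json = json.dumps(info, indent=2)
```

Writing `info_json` to a file would give one entry per video; the values should be checked against the released Thumos14 file before relying on them.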

Request for source code

When will the source code be released?

Missing datasets

Dear @xlliu7, thank you for your valuable contribution to the community.
I know cleaning code and supporting all datasets require much work.
However, I would greatly appreciate it if you were to release the rest of the code for reproducing results in HACS and ActivityNet.

Could you kindly let me know the time horizon for this?

Best,
Mattia

One question about the loss backward of temporal_deform_attn

Thanks for open-sourcing this good work.

However, I ran into a problem.

models/ops/temporal_deform_attn/functions/temporal_deform_attn_func.py", line 40, in backward
    value, value_spatial_shapes, value_level_start_index, sampling_locations, attention_weights, grad_output, ctx.seq2col_step)
RuntimeError: Not implemented

I wonder if it is convenient for you to answer.

...when will the code be released? thanks~

The network weights

Dear researchers,

Thank you for your work.

The links to your network weights don't work, which prevents us from reproducing your work.

best regards,

Lines 45 to 46 in 3af0abc

    losses = sum(loss_dict[k] * weight_dict[k]
                 for k in loss_dict.keys() if k in weight_dict)

Looking at the weight_dict, loss_seg is used rather than loss_segments

TadTR/models/tadtr.py

Lines 498 to 501 in 3af0abc

    weight_dict = {
        'loss_ce': args.cls_loss_coef,
        'loss_seg': args.seg_loss_coef,
        'loss_iou': args.iou_loss_coef}

TadTR/models/tadtr.py

Line 299 in 3af0abc

losses['loss_segments'] = loss_segment.sum() / num_segments

The actionness loss is stored under the key loss_iou instead of loss_actionness, which overwrites the loss_iou computed by the segment loss.

TadTR/models/tadtr.py

Line 327 in 3af0abc

losses['loss_iou'] = loss_actionness

Are these bugs? Could you confirm it? Thanks.
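The effect of such a key mismatch can be reproduced in isolation. In the sketch below the loss values and coefficients are placeholders, and `weighted_total` and `unweighted_keys` are hypothetical helpers that mirror the quoted weighted sum; whether the names actually mismatch in the repository is the question this issue asks.

```python
def weighted_total(loss_dict, weight_dict):
    """Same pattern as the quoted sum: keys missing from weight_dict are skipped."""
    return sum(loss_dict[k] * weight_dict[k] for k in loss_dict if k in weight_dict)


def unweighted_keys(loss_dict, weight_dict):
    """List the losses that would be silently left out of the total."""
    return sorted(k for k in loss_dict if k not in weight_dict)


# Placeholder values; only the key names follow the snippets quoted above.
loss_dict = {"loss_ce": 1.0, "loss_segments": 2.0, "loss_iou": 3.0}
weight_dict = {"loss_ce": 2.0, "loss_seg": 5.0, "loss_iou": 2.0}

total = weighted_total(loss_dict, weight_dict)
dropped = unweighted_keys(loss_dict, weight_dict)
```

Here `loss_segments` never reaches the total because its name has no entry in `weight_dict`, so a check like `unweighted_keys` makes the omission visible.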
