mcg-nju / rtd-action
[ICCV 2021] Relaxed Transformer Decoders for Direct Action Proposal Generation
License: Apache License 2.0
Hi, thanks for your wonderful work. If I apply this work to my own data, how do I get the TEM_scores, such as the start score and end score?
Can the RTD-Action code for ActivityNet-1.3 be adapted from the RTD-Action code for THUMOS14?
I have another question.
In my experiments, the model generates proposals at different positions as the window slides, but the proposals all have basically the same length. Is this caused by the positional embedding setting, or by something else? Note that I added the start and end score convolutions to the overall model so that they are trained jointly. @tony2016uestc
Running the code produced the error below, and I really don't know what went wrong.
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_0p4sbyi9/none_egcn9ob1/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_0p4sbyi9/none_egcn9ob1/attempt_0/1/error.json
[W ProcessGroupNCCL.cpp:1569] Rank 0 using best-guess GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
[W ProcessGroupNCCL.cpp:1569] Rank 1 using best-guess GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
QStandardPaths: XDG_RUNTIME_DIR points to non-existing path '/run/user/1065', please create it with 0700 permissions.
QStandardPaths: XDG_RUNTIME_DIR points to non-existing path '/run/user/1065', please create it with 0700 permissions.
qt.qpa.screen: QXcbConnection: Could not connect to display localhost:11.0
Could not connect to any X display.
qt.qpa.screen: QXcbConnection: Could not connect to display localhost:11.0
Could not connect to any X display.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 161242) of binary: /home/10601006/apps/anaconda3/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 3/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
Can the GPU handle the training?
Thank you very much for open-sourcing this work.
I have a question about training: why do the results still differ between runs after I set the random seed? This makes comparisons unreliable, because they are affected by random factors.
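A possible explanation, offered as a sketch rather than a diagnosis of this repository: seeding only one random number generator leaves the others unseeded, and some GPU kernels are nondeterministic even with a fixed seed. The helper below is hypothetical (`seed_everything` is not a function from this repo). It seeds Python, and also NumPy and PyTorch when they are installed, and switches cuDNN to its deterministic mode. Data-loader worker processes and some CUDA operations may still introduce variation.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed every random number generator available in this environment."""
    random.seed(seed)
    # Only affects Python subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # Silently ignored without CUDA.
        # Trade some speed for repeatable cuDNN convolutions.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch is optional for this sketch.


# Re-seeding with the same value reproduces the same random sequence.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

If runs still differ after this, the remaining variation most likely comes from nondeterministic GPU operations or from multi-process data loading, not from the seed itself.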
from datasets import build_dataset
from models import build_model
I don't see files with these names in the two folders. Where is the source code of the two functions build_dataset and build_model?
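One thing worth checking, assuming the repository follows the usual Python package layout: `from datasets import build_dataset` does not need a file named `build_dataset.py`. When `datasets` is a folder, the name is looked up in `datasets/__init__.py`. The sketch below demonstrates this with a throwaway package called `demo_datasets` (a made-up name, not part of this repo).

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create a package folder whose only file is __init__.py.
    pkg_dir = os.path.join(root, "demo_datasets")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("def build_dataset(name):\n    return 'dataset:' + name\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_datasets")
        # Same import form as in the question; resolved via __init__.py.
        from demo_datasets import build_dataset
        result = build_dataset("thumos14")
        defined_in = os.path.basename(module.__file__)
    finally:
        sys.path.remove(root)

print(result, defined_in)
```

So the first place to look for both functions would be the `__init__.py` file inside the `datasets` and `models` folders.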
I am confused about the labels. In your code, the label of the ground truth is 0, which means "action". Does that mean the network outputs label 0 for action, rather than 0 for background? If the predicted label is 0, does it match the ground truth?
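For illustration only, here is how the labels would be read under a DETR-style convention, where the action classes come first and the last index is reserved for "no action". Whether this repository uses exactly that convention is an assumption and should be checked against the loss code; `NUM_ACTION_CLASSES` and `BACKGROUND_INDEX` are names invented for this sketch.

```python
import math

NUM_ACTION_CLASSES = 1                  # assumption: class-agnostic proposals
BACKGROUND_INDEX = NUM_ACTION_CLASSES   # assumption: last index means "no action"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits for one query: [action, background].
query_logits = [2.0, 0.0]
probs = softmax(query_logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
is_action = predicted != BACKGROUND_INDEX
print(predicted, is_action)
```

Under this convention a predicted label of 0 is an action prediction and agrees with a ground-truth label of 0.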
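I found that the sliding-window size in RTD is constrained. During validation, each batch has to contain at least one ground truth, which limits the size of the sliding window. But validation is run as if it were testing, so there is no guarantee that at least one ground truth corresponds to each batch. Should the code be modified so that the loss is computed differently when a batch of sliding windows has no corresponding ground truth? When I used GTAD on another dataset, I also ran into the problem that training was normal but the validation loss became NaN. The cause was the same: no ground truth matched, which produced a zero denominator. I handled that special case by setting the denominator to 1.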
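The workaround described above (forcing the denominator to 1 when nothing matches) can be sketched as follows. `normalized_loss` is a hypothetical helper, not code from RTD or GTAD. It only shows the guard itself.

```python
def normalized_loss(per_match_losses, num_ground_truth):
    """Average matched losses, guarding against windows with no ground truth."""
    # Clamp to 1 so an empty window yields 0.0 instead of a division by zero.
    denominator = max(num_ground_truth, 1)
    return sum(per_match_losses) / denominator


empty_window = normalized_loss([], 0)            # no ground truth in the batch
regular_window = normalized_loss([0.5, 1.5], 2)  # two matched ground truths
print(empty_window, regular_window)
```

With tensors, the same idea is usually written as a clamp on the normalizer before the division.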
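Hello. Last year I reproduced the RTD code on the THUMOS14 dataset, but when I run the ANet code in the same environment I get an environment-related error: nccl278 error. I just noticed that in util/misc.py of the ANet code, this call
torch.distributed.init_process_group(
backend=args.dist_backend,
init_method=args.dist_url,
world_size=args.world_size,
rank=args.rank,
)
has a comma after the last argument (rank=args.rank,), whereas the THUMOS14 code does not. Is this extra comma a mistake?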
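On the comma itself: a trailing comma after the last argument of a call is valid Python and does not change what is passed, so it is unlikely to be the source of the NCCL error. The sketch below uses a stand-in function with the same keyword names. It is not the real `torch.distributed.init_process_group`, and the argument values are placeholders.

```python
def init_process_group(backend, init_method, world_size, rank):
    """Hypothetical stand-in that only records the arguments it receives."""
    return {
        "backend": backend,
        "init_method": init_method,
        "world_size": world_size,
        "rank": rank,
    }


without_comma = init_process_group(
    backend="nccl",
    init_method="env://",
    world_size=2,
    rank=0
)
with_comma = init_process_group(
    backend="nccl",
    init_method="env://",
    world_size=2,
    rank=0,  # trailing comma: purely stylistic
)
print(without_comma == with_comma)
```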
Can you provide the code for the ActivityNet dataset?
Hi,
Thanks for making the code public. I would like to change the temporal resolution of the current model, but there are too many hyperparameters to tune. Which one actually changes the resolution? I tried changing the window size from 100 to 70, but then evaluation fails.
What do the action, start, and end values in the TEM_scores files mean?
Thank you for your reply.
Good job!! I like it.
When will you publish the code?
thanks
Hi @tony2016uestc ,
Thanks for your wonderful work. I plan to follow your work, and try out Action Proposal Generation. But, it is hard for me to implement your approach on my own.
When would you release your code? I am going to use your code to accelerate my reproduction.
Thanks again :)
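Hello. Training the ANet code on my GPU takes too long: the whole pipeline needs about 5 days, which makes trial and error too costly for me.
I would like to ask the following. activitynet1.2 is a subset of activitynet1.3. If I train RTD on activitynet1.2 by taking, from the activitynet1.3 video features, the features of the videos that belong to activitynet1.2, would that be acceptable? At present, TSN features are only available for activitynet1.3, not for activitynet1.2. I also compared the annotation files of activitynet1.2 and activitynet1.3: every activitynet1.2 video also exists in activitynet1.3, the video durations are identical, and the number of annotations per video is the same, except for three videos that have one additional annotation in activitynet1.3.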
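The subset extraction described above could look roughly like this. It is a sketch under assumptions: features are taken to be stored in a mapping keyed by video ID, and the IDs and values below are toy placeholders, not real ActivityNet entries. `select_subset` is a hypothetical helper.

```python
def select_subset(features_by_video, subset_video_ids):
    """Keep only the features whose video ID belongs to the subset."""
    selected = {
        vid: features_by_video[vid]
        for vid in subset_video_ids
        if vid in features_by_video
    }
    # Report subset videos without features so they are not dropped silently.
    missing = sorted(set(subset_video_ids) - set(features_by_video))
    return selected, missing


# Toy placeholders standing in for the 1.3 features and the 1.2 video list.
anet13_features = {"v_a": [0.1, 0.2], "v_b": [0.3, 0.4], "v_c": [0.5, 0.6]}
anet12_ids = ["v_a", "v_c"]

subset, missing = select_subset(anet13_features, anet12_ids)
print(sorted(subset), missing)
```

An empty `missing` list confirms that every video in the smaller split has features. The three videos whose annotation counts differ would still need their labels taken from the 1.2 annotation file.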
Hi, can I get the code of BSN in thumos14?
Hi~
Are the input features and the start/end scores computed in advance by pretrained models?
Hello, I see that your paper combines the proposals generated by RTD on activitynet1.3 with UntrimmedNet to get the detection results. Can you provide the video classification results, or the UntrimmedNet model trained on activitynet1.3?
[2024-02-22 19:20:17,149][datasets.builder][WARNING] - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-70dc00f935d3701b/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 138.50it/s]
[2024-02-22 19:20:22,163][torch.nn.parallel.distributed][INFO] - Reducer buckets have been rebuilt in this iteration.
My model gets stuck here. After I interrupt it, it prints the output below. Is this problem caused by the code or by the environment?
WARNING:torch.distributed.elastic.agent.server.api:Received 2 death signal, shutting down workers
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 4797 closing signal SIGINT
Traceback (most recent call last):
File "main.py", line 84, in <module>
main()
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 119, in run
ret = run_job(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "main.py", line 77, in main
train(model, train_dataloader, validation_dataloader, test_dataloader, accelerator,
File "/root/Biot5/biot5/utils/train_utils.py", line 289, in train
accelerator.backward(loss / args.optim.grad_acc)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/accelerate/accelerator.py", line 1966, in backward
loss.backward(**kwargs)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
KeyboardInterrupt
Traceback (most recent call last):
File "/root/anaconda3/envs/biot5/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 241, in launch_agent
result = agent.run()
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/elastic/metrics/api.py", line 129, in wrapper
result = f(*args, **kwargs)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 723, in run
result = self._invoke_run(role)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/elastic/agent/server/api.py", line 864, in _invoke_run
time.sleep(monitor_interval)
File "/root/anaconda3/envs/biot5/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 62, in _terminate_process_handler
raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 4763 got signal: 2
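Hello. In the thumos14_anno__action.json and thumos14_anno_action_class_idx.json files you provide, the duration_second and fps of video_test_0001459 do not match the values in your thumos14_test_groundtruth.csv. Comparing with GTAD, the values in thumos14_test_groundtruth.csv appear to be the correct ones. Is this a mistake you overlooked? I have only checked this one video so far, because I found that some THUMOS14 videos do not have a frame rate of 30 fps, whereas in thumos14_anno__action.json and thumos14_anno_action_class_idx.json you set the frame rate of every video to 30 fps.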
Can you provide the ActivityNet 1.3 features that are rescaled to length 100 via linear interpolation? The link you provide cannot be opened, and the other link is hard to download from.
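Until rescaled features are available, rescaling can be done locally. The sketch below linearly interpolates a feature sequence of shape [T][D] to a fixed length of 100. It aligns the first and last time steps of input and output, which is one common choice. The exact interpolation scheme used by the authors is not stated here, so results may differ slightly from theirs. `resize_features` is a hypothetical helper, and the input is toy data.

```python
def resize_features(features, target_len=100):
    """Linearly interpolate a [T][D] feature sequence to [target_len][D]."""
    source_len = len(features)
    if source_len == 0:
        raise ValueError("features must contain at least one time step")
    if source_len == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first time step.
        return [list(features[0]) for _ in range(target_len)]

    resized = []
    for i in range(target_len):
        # Map output index i onto the source time axis (endpoints aligned).
        position = i * (source_len - 1) / (target_len - 1)
        left = int(position)
        right = min(left + 1, source_len - 1)
        weight = position - left
        resized.append([
            (1.0 - weight) * a + weight * b
            for a, b in zip(features[left], features[right])
        ])
    return resized


# Toy input: 5 time steps, 2 feature dimensions.
video = [[float(t), 2.0 * t] for t in range(5)]
rescaled = resize_features(video, target_len=100)
print(len(rescaled), rescaled[0], rescaled[-1])
```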
Could you explain how to perform temporal action detection with the generated proposals? What was modified in the model? It would be great if you could share the code.
Hi, thanks for your interesting work! I'm wondering when would you share the evaluation code for calculating the Mean Average Precision (mAP)? Thanks so much!