chinayi / asformer
Official repo for BMVC2021 paper ASFormer: Transformer for action segmentation
License: MIT License
Hi, thank you for your code
Line 370 in 3940443
Hi,
Thanks for sharing the code. I noticed that the bg_class in the evaluation code is not set properly. The default name of the background class is background, which is correct for GTEA, but it needs to be changed to SIL for Breakfast, and to action_start and action_end for 50Salads. It seems these were not changed for the results in the paper.
With the correct class names and the released models, I obtained lower results:
|  | F1@10 | F1@25 | F1@50 |
| --- | --- | --- | --- |
| Breakfast | 70.9 | 67.5 | 56.7 |
| 50salads | 83.7 | 81.8 | 73.7 |
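To illustrate the point about background classes, here is a minimal, hypothetical sketch (the helper name and dictionary are mine, not the repo's actual evaluation code) of how per-dataset background labels could be excluded from a frame-wise metric:

```python
# Hypothetical mapping, based on the class names discussed above.
BG_CLASSES = {
    "gtea": ["background"],
    "breakfast": ["SIL"],
    "50salads": ["action_start", "action_end"],
}

def frame_accuracy(pred, gt, dataset):
    """Frame-wise accuracy that ignores frames whose ground truth is background."""
    bg = set(BG_CLASSES[dataset])
    kept = [(p, g) for p, g in zip(pred, gt) if g not in bg]
    if not kept:
        return 0.0
    return sum(p == g for p, g in kept) / len(kept)
```

If the evaluation script keeps the default `background` name on Breakfast or 50Salads, no frame is ever excluded, which would explain the inflated numbers.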
Hello,
I am adapting your code for my own dataset, which usually trains relatively fast when using only ASRF, but with your transformer model it takes approximately 10x longer. Do you observe similar behaviour with the 50Salads/Breakfast/GTEA datasets?
Thank you :)
Hello author, thank you for this series of contributions. Could you please share the code used for Table 9 of your paper, where you compute the params, FLOPs, and GPU memory? I am not sure how to obtain the FLOPs. Thank you!
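For the parameter count at least, a hedged sketch (my own helper, not the authors' Table 9 code; FLOPs typically require a separate profiler pass, e.g. with a library such as thop or fvcore):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # Counts trainable parameters only; this does NOT give FLOPs,
    # which depend on the input length and need a profiling tool.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

For example, `count_params(nn.Linear(10, 5))` returns 55 (10*5 weights + 5 biases).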
I tried to run the pretrained models but i keep getting the following error:
(myenv) E:\ASN\ASFormer>python main.py --action=predict --dataset=50salads --split=1
Model Size: 1134476
Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ld_tenso'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 2297, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 556, in _load
return legacy_load(f)
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 467, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1589, in open
return func(name, filemode, fileobj, **kwargs)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1619, in taropen
return cls(name, mode, fileobj, **kwargs)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1482, in __init__
self.firstmember = self.next()
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 2309, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 97, in <module>
trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
File "E:\ASN\ASFormer\model.py", line 399, in predict
self.model.load_state_dict(torch.load(model_dir + "/epoch-" + str(epoch) + ".model"))
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 387, in load
return _load(f, map_location, pickle_module, pickle_load_args)
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 560, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: ./models/50salads/split_1/epoch-120.model is a zip archive (did you mean to use torch.jit.load()?)
I tried changing torch.load to torch.jit.load, but then I get another error saying that my PyTorch version is too old to run this. I am using Python 3.6.10, PyTorch 1.1.0, and torchvision 0.3.0, and for now I am just trying to run on CPU, not GPU. I kindly need your assistance on this matter. Thank you.
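For anyone hitting the same error: the traceback suggests the checkpoint was saved with the zipfile-based format introduced in PyTorch 1.6, which PyTorch 1.1.0 cannot read. A possible workaround (a sketch, assuming you can temporarily use an environment with PyTorch >= 1.6) is to re-save the state dict in the legacy format and then load the converted file from the old environment:

```python
import torch

def convert_to_legacy(src_path: str, dst_path: str) -> None:
    # Run this once with PyTorch >= 1.6; the resulting file should be
    # readable by older PyTorch versions such as 1.1.0.
    state = torch.load(src_path, map_location="cpu")
    torch.save(state, dst_path, _use_new_zipfile_serialization=False)
```

For example: `convert_to_legacy("./models/50salads/split_1/epoch-120.model", "./models/50salads/split_1/epoch-120_legacy.model")`, then point the predict script at the converted file.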
Hi, thanks for your work.
I was able to train and test the model and achieve performance similar to the paper when I use both the encoder and decoder. However, when I don't use the decoder, the results are much worse than those in Table 5 (first row).
I was wondering if I need to change any settings to get the same performance (especially for Acc)?
I notice that without the decoder, the accuracy drops below 80.
Hello,
Thank you for your amazing work !
I was wondering if there is any particular reason for imposing a batch size of 1 in model.py:
Line 138 in 89e72d8
In my testing, ASFormer learns fine with bigger batch sizes.
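If it helps others trying larger batches, here is a hypothetical collate function (my own sketch, not the repo's code) showing how batches larger than 1 can be formed by padding variable-length feature sequences to a common length and carrying a mask so padded frames are ignored:

```python
import torch

def collate(features_list, pad_value=0.0):
    # features_list: list of (C, T_i) tensors with varying lengths T_i.
    C = features_list[0].shape[0]
    T = max(f.shape[1] for f in features_list)
    batch = torch.full((len(features_list), C, T), pad_value)
    mask = torch.zeros(len(features_list), 1, T)
    for i, f in enumerate(features_list):
        batch[i, :, : f.shape[1]] = f   # copy real frames
        mask[i, :, : f.shape[1]] = 1.0  # 1 = valid frame, 0 = padding
    return batch, mask
```

The loss and any attention normalization would then need to be multiplied by (or restricted to) this mask, so padded frames do not contribute to gradients.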
Hello, thank you for sharing your amazing work!
I have a question when analysing the results.
For images generated like the one below:
What does each row mean? And what do the stages 0, 1, 2, 3 mean?
Also, since the method uses each frame's action label for evaluation, you might compare the model with action recognition models too. Is there any specific reason for not comparing against action recognition results?
Thank you in advance!
Hi, can you provide more information about the feature extraction? I would like to use this fantastic model on my own dataset, but I don't know how to extract the features to feed to the encoder.
Hi !
Firstly, thanks for sharing this repo! I'm struggling to download the model (3. Download the pre-trained models at https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). Indeed, the site says you need to create an account to download the file. The thing is, I cannot create an account with a French phone number 😅 Is there any other way to download the pretrained models?
Many thanks !
I installed the environment as specified: PyTorch == 1.1.0, torchvision == 0.3.0, Python == 3.6, CUDA == 10.1.
It is certain that the model is loaded, because the model size is printed:
Model Size: 1130860
But the problem is:
Traceback (most recent call last):
File "main.py", line 99, in <module>
trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
File "/home/cpslabrtx3090/zjb/projects/ASFormer/model.py", line 399, in predict
self.model.load_state_dict(torch.load(model_dir + "/epoch-" + str(epoch) + ".model"))
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 560, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: ./models/gtea/split_1/epoch-120.model is a zip archive (did you mean to use torch.jit.load()?)
Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ld_tenso'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2299, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 556, in _load
return legacy_load(f)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 467, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1591, in open
return func(name, filemode, fileobj, **kwargs)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1621, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1484, in __init__
self.firstmember = self.next()
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2311, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header
Hello, I have recently been working on a similar task, found your paper very interesting, and plan to build on it. May I ask: does the pre-extracted feature mentioned in the paper refer to the feature map extracted from each frame? Could you elaborate? Thanks.
Thanks for your nice work. Meanwhile, may I confirm one thing? Using your features and pre-trained models (epoch=120), the scores I obtain are lower than in your BMVC paper on all three datasets. For instance, the edit and F1@10 scores on GTEA only reach 84.0 and 88.9, lower than the 84.6 and 90.1 in your paper. The same holds for the other two datasets.
50Salads: edit=75.7, F1@10=83.4.
Hello, in the decoder code you provide, V comes from the features of the previous decoder or the encoder, while Q and K use x1, i.e., the output of the previous layer. Isn't this different from what is described in the paper?
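As a schematic of the attention pattern described in this question (my own simplification, not the repo's exact code: single head, no masking), with Q and K taken from the previous layer's output x1 and V from the encoder / previous decoder feature:

```python
import torch

def cross_attention(x1, feature):
    # x1:      (B, T, D) output of the previous decoder layer -> Q, K
    # feature: (B, T, D) encoder / previous decoder feature   -> V
    q, k, v = x1, x1, feature
    scores = q @ k.transpose(-1, -2) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v
```

Whether this matches or diverges from the paper's Figure depends on where the paper says Q/K/V originate, which is exactly what the question asks the authors to clarify.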
Hi,
Thank you for your work.
When I try to increase the batch size, the metrics drop a lot. What do you think the possible reasons are?
The GPU is an A100 40G.
Trained with the default settings, split 1 only:
(s1) [83.40807175 81.16591928 72.19730942] 75.934108 83.2241
After just changing the batch size to 8 and the lr to 0.001, then training:
(s1) [68.94977169 67.57990868 55.25114155] 63.931922 72.0049
Hi, I was wondering how it is possible to extract the self-attention weights, as you have done in Figure 2? I am interested in the hierarchical case.
Hello, is the hierarchical attention you mention essentially band attention (as shown below), except that the window size grows exponentially with the layer depth? If so, shouldn't the body of the for loop in this function in model.py be changed to window_mask[:, i, i:i+self.bl] = 1?
```python
def construct_window_mask(self):
    window_mask = torch.zeros((1, self.bl, self.bl + 2 * (self.bl // 2)))
    for i in range(self.bl):
        window_mask[:, :, i:i+self.bl] = 1
    return window_mask.to(device)
```
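For reference, a small self-contained sketch contrasting the two mask variants discussed in this question, using an illustrative bl = 4 (standalone code, with the class attributes replaced by a local variable):

```python
import torch

bl = 4  # local window size (illustrative value)
width = bl + 2 * (bl // 2)  # window plus half-window padding on each side

# Variant as quoted above: every query row gets the same union of bands,
# so almost the whole mask ends up filled.
mask_a = torch.zeros((1, bl, width))
for i in range(bl):
    mask_a[:, :, i:i + bl] = 1

# Variant proposed in the question: row i attends only to its own band
# of bl positions starting at offset i (a true band / sliding window).
mask_b = torch.zeros((1, bl, width))
for i in range(bl):
    mask_b[:, i, i:i + bl] = 1
```

Printing the two masks makes the difference visible: mask_a is dense (each row sees 2*bl - 1 positions), while mask_b restricts each query to exactly bl positions.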