andr345 / frtm-vos
Code accompanying the paper Learning Fast and Robust Target Models for Video Object Segmentation
License: GNU General Public License v3.0
Hi, I can run evaluate.py, but running train.py fails.
(p36) rtm@rtm:~/zc/f/frtm-vos$ python train.py dv17_res101 --ftext resnet101 --dset dv2017 --dev cuda:0
Compiling npp extension
done
Traceback (most recent call last):
File "train.py", line 133, in <module>
trainer.train()
File "/home/rtm/zc/f/frtm-vos/lib/training.py", line 131, in train
stats = self.model(*batch)
File "/home/rtm/Envs/p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/rtm/zc/f/frtm-vos/model/training_model.py", line 92, in forward
cache_hits = self._initialize(images[0], labels[0], specs)
File "/home/rtm/zc/f/frtm-vos/model/training_model.py", line 137, in _initialize
self.save_target_model(specs[i], L, self.tmodels[i].get_state_dict())
AttributeError: 'TargetObject' object has no attribute 'get_state_dict'
My environment:
Python 3.6
torch 1.0.1
Do you have any suggestions?
Thanks!
Sorry to disturb you again. Looking at the output of your code, I found that fine structures and small objects are segmented relatively poorly. I wonder whether we could add conv3_x or conv2_x features to the discriminator so that it also outputs scores for them, and concatenate the corresponding ResNet layer features in the TSE blocks, to improve the segmentation of small targets.
I tried it, but I really don't know how to modify the code. If you could share some ideas, such as which parts need to change and any important details, I would be very grateful!
Sorry, I have no question now.
code ///////////////////////////////////
paths = dict(
    models=Path(__file__).parent / "checkpoints/try",  # The .pth files should be here
    davis="~/disk/MATNet/data/DAVIS2017",  # DAVIS dataset root
    yt2018="/path/to/ytvos2018",  # YouTubeVOS 2018 root
    output="/outputs",  # Output path
)
traceback////////////////////////////
Traceback (most recent call last):
File "/home/fg/disk/frtm-vos/evaluate.py", line 152, in <module>
os.makedirs(out_path)
File "/home/fg/anaconda3/lib/python3.8/os.py", line 213, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/fg/anaconda3/lib/python3.8/os.py", line 223, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/outputs'
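The traceback shows os.makedirs failing on '/outputs', a directory at the filesystem root that a normal user usually cannot create. A minimal sketch of one way around this, assuming the output path can simply point somewhere the user owns (the helper name ensure_output_dir and the directory names are hypothetical, not part of the repository):

```python
import os
import tempfile
from pathlib import Path


def ensure_output_dir(path):
    """Expand '~', create the directory if needed, and check it is writable."""
    out = Path(path).expanduser()
    out.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    if not os.access(out, os.W_OK):
        raise PermissionError(f"{out} exists but is not writable")
    return out


# Demo in a temporary directory; in practice this could be e.g. "~/frtm_outputs".
demo_root = Path(tempfile.mkdtemp())
out_path = ensure_output_dir(demo_root / "outputs" / "dv2017val")
print(out_path.is_dir())
```

The same expansion may matter for the davis entry: a leading "~" in a Python string is only expanded by an explicit expanduser call, and the traceback does not show whether the repository does this.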
Hi, thanks for your excellent work.
While reading your paper and code, I came across a question:
An L2 loss is adopted in Equation (2) of your paper.
But in discriminator.py, the residual is computed as residuals = self.w * (s - self.y), which doesn't seem to be an L2 loss.
Could you explain why the residual is not computed as in your paper? Or are the two equivalent?
Thanks a lot.
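One possible reading, offered as an assumption rather than the authors' answer: a least-squares optimizer works on the residual vector itself, and the L2 loss is the sum of the squared residuals. The run_GN_iter frame in a traceback elsewhere on this page suggests a Gauss-Newton style optimizer, which fits that pattern. Under this assumption, residuals = self.w * (s - self.y) and a weighted L2 loss describe the same objective. The sketch below uses plain Python lists and hypothetical names to show the relationship, and that the sign of (s - y) does not matter once squared.

```python
def weighted_residuals(scores, labels, weights):
    """Element-wise residuals r_i = w_i * (s_i - y_i)."""
    return [w * (s - y) for s, y, w in zip(scores, labels, weights)]


def l2_loss(residuals):
    """Squared L2 norm of the residual vector: sum_i r_i ** 2."""
    return sum(r * r for r in residuals)


scores = [0.9, 0.2, 0.6]   # predicted scores s
labels = [1.0, 0.0, 1.0]   # target labels y
weights = [2.0, 1.0, 0.5]  # per-element weights w

r = weighted_residuals(scores, labels, weights)
r_flipped = [-x for x in r]  # same as using w * (y - s)

# Both sign conventions give the same loss once squared.
print(l2_loss(r) == l2_loss(r_flipped))
```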
Hi, thanks for your nice code. The approach in the paper fine-tunes the target net at test time. I would like to ask whether evaluate.py does this as well. Thank you.
When I run evaluate.py, the following error appears. Can you provide some suggestions?
Compiling npp extension
Traceback (most recent call last):
File "/data2/jaffeProj/frtm/evaluate.py", line 18, in <module>
from lib.datasets import DAVISDataset, YouTubeVOSDataset
File "/data2/jaffeProj/frtm/lib/datasets.py", line 6, in <module>
from lib.image import imread
File "/data2/jaffeProj/frtm/lib/image.py", line 6, in <module>
from ._npp import nppig_cpp
File "/data2/jaffeProj/frtm/lib/_npp/__init__.py", line 17, in <module>
with_cuda=True, build_directory='/home/jaffe/tmp') # _build_dir
File "/home/jaffe/miniconda3/envs/pytracking/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 680, in load
is_python_module)
File "/home/jaffe/miniconda3/envs/pytracking/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 877, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/home/jaffe/miniconda3/envs/pytracking/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1088, in _import_module_from_library
return imp.load_module(module_name, file, path, description)
File "/home/jaffe/miniconda3/envs/pytracking/lib/python3.7/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/jaffe/miniconda3/envs/pytracking/lib/python3.7/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libnppc.so.9.2: cannot open shared object file: No such file or directory
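The final ImportError says the dynamic loader could not find libnppc.so.9.2, the NPP core library as versioned by CUDA 9.2. A rough diagnostic sketch, assuming the cause is a mismatch between the CUDA version the extension was built against and the versions installed (the helper find_npp_versions and the directories passed to it are hypothetical): it lists which libnppc versions exist in a set of library directories.

```python
import re
import tempfile
from pathlib import Path

NPP_PATTERN = re.compile(r"libnppc\.so\.(\d+(?:\.\d+)*)")


def find_npp_versions(lib_dirs):
    """Map each libnppc version found in lib_dirs to the files providing it."""
    found = {}
    for lib_dir in lib_dirs:
        lib_dir = Path(lib_dir).expanduser()
        if not lib_dir.is_dir():
            continue  # skip directories that do not exist
        for lib in sorted(lib_dir.glob("libnppc.so.*")):
            match = NPP_PATTERN.fullmatch(lib.name)
            if match:
                found.setdefault(match.group(1), []).append(str(lib))
    return found


# Demo with a fake directory; on a real machine one might pass something like
# ["/usr/local/cuda/lib64"] (an assumed location, it varies by installation).
fake_dir = Path(tempfile.mkdtemp())
(fake_dir / "libnppc.so.10.1").touch()
versions = find_npp_versions([fake_dir, "/nonexistent/dir"])
print("9.2" in versions, sorted(versions))
```

If the required version is missing, possible remedies include installing a matching CUDA toolkit, adding its library directory to LD_LIBRARY_PATH, or deleting the cached build directory so the extension is recompiled against the installed toolkit. Which of these applies here cannot be determined from the traceback alone.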
Hello, my test model runs very slowly on an Nvidia RTX 3090, but is still relatively fast on a 1080 Ti. Do you know why?
Thank you for your wonderful code. I am a beginner, and I would like to know:
I can't find where the 'tmcache' path is used in training.py.
How can we use this cache (20 GB) in training?
Best wishes! ^_^
Hi,
I was trying to train your model with ResNet-18 as the backbone, but I got the following error:
Traceback (most recent call last):
File "train.py", line 133, in <module>
trainer.train()
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/lib/training.py", line 130, in train
stats = self.model(*batch)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/training_model.py", line 95, in forward
cache_hits = self._initialize(images[0], labels[0], specs)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/training_model.py", line 137, in _initialize
self.tmodels[i].initialize(ft, lb)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/training_model.py", line 20, in initialize
self.discriminator.init(ft[self.discriminator.layer], mask)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/discriminator.py", line 175, in init
optimizer.run(self.init_iters)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/optimizer.py", line 70, in run
self.run_GN_iter(cg_iter)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/optimizer.py", line 81, in run_GN_iter
self.f0 = self.problem(self.x)
File "/media/user_home1/gjeanneret/SOFTWARE/frtm-vos/model/discriminator.py", line 47, in __call__
s = self.net(self.x)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 353, in forward
return self._conv_forward(input, self.weight)
File "/home/gjeanneret/anaconda3/envs/frtm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 349, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Given groups=1, weight of size [32, 1024, 1, 1], expected input[15, 256, 30, 54] to have 1024 channels, but got 256 channels instead
I fixed it easily by adding an if condition on line 59 (at commit 64fe105):
layer="layer4", in_channels=256 if '18' in feature_extractor else 1024, c_channels=32, out_channels=1
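The substring test works for the two backbones mentioned, but an explicit lookup fails loudly on an unknown name. A small sketch, assuming the discriminator reads the output of the conv4_x stage, which has 256 channels in ResNet-18/34 and 1024 in ResNet-50/101 (the table and function names are hypothetical, and the backbone strings are assumed to match the --ftext values):

```python
# Output channels of the conv4_x stage for standard ResNet variants.
CONV4_CHANNELS = {
    "resnet18": 256,
    "resnet34": 256,
    "resnet50": 1024,
    "resnet101": 1024,
}


def discriminator_in_channels(feature_extractor):
    """Return the discriminator input channel count for a backbone name."""
    try:
        return CONV4_CHANNELS[feature_extractor]
    except KeyError:
        raise ValueError(f"Unknown feature extractor: {feature_extractor!r}")


print(discriminator_in_channels("resnet18"))   # 256
print(discriminator_in_channels("resnet101"))  # 1024
```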
Hi, when I run the code, I get ImportError: No module named 'nppig_cpp'.
Compiling npp extension
Traceback (most recent call last):
File "evaluate.py", line 18, in <module>
from lib.datasets import DAVISDataset, YouTubeVOSDataset
File "/home/rtm/zc/frtm-vos/lib/datasets.py", line 6, in <module>
from lib.image import imread
File "/home/rtm/zc/frtm-vos/lib/image.py", line 6, in <module>
from ._npp import nppig_cpp
File "/home/rtm/zc/frtm-vos/lib/_npp/__init__.py", line 16, in <module>
with_cuda=True, build_directory=_build_dir)
File "/home/rtm/Envs/py36env/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 644, in load
is_python_module)
File "/home/rtm/Envs/py36env/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 824, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/home/rtm/Envs/py36env/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 967, in _import_module_from_library
file, path, description = imp.find_module(module_name, [path])
File "/home/rtm/Envs/py36env/lib/python3.6/imp.py", line 297, in find_module
raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'nppig_cpp'
My environment:
torch 1.1.0
Python 3.6
Do you have any suggestions?
Thank you for your great work. Can you give me a link to download YouTube-VOS 2018? I can't find it at the link you have now.
In discriminator.py,
class Discriminator(nn.Module):
    ...
    def apply(self, ft):
        self.frame_num += 1
        cft = self.project(ft)
        self.current_sample = cft
        scores = self.filter(cft)
        return scores
and in seg_network.py,
class SegNetwork(nn.Module):
    ...
    def forward(self, scores, features, image_size):
        num_targets = scores.shape[0]
        num_fmaps = features[next(iter(self.ft_channels))].shape[0]
        if num_targets > num_fmaps:
            multi_targets = True
        else:
            multi_targets = False
        x = None
        for i, L in enumerate(self.ft_channels):
            ft = features[L]
            s = interpolate(scores, ft.shape[-2:])  # Resample scores to match features size
            if multi_targets:
                h, hpool = self.TSE[L](ft.repeat(num_targets, 1, 1, 1), s, x)
            else:
                h, hpool = self.TSE[L](ft, s, x)
            h = self.RRB1[L](h)
            h = self.CAB[L](hpool, h)
            x = self.RRB2[L](h)
        x = self.project(x, image_size)
        return x
Hello, I printed cft and scores and found that, when evaluating each frame of a sequence with several targets, cft and scores are printed once per target. The size of scores is [1, 1, m, n], and the values differ between targets. I would like to know why this is and where it is set up.
Another question: in seg_network.py, I found that num_targets and num_fmaps are always 1. For a sequence with several targets, the value 1 is printed once per target, so multi_targets is always False, yet the segmentation result is multi-target. Why? I am really confused and would appreciate your answer.
Looking forward to your reply!
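The observation above is consistent with each target being handled by its own target model, one forward pass per target with batch size 1, and the per-target outputs being combined afterwards. How the repository merges them is not shown in the quoted code. The sketch below is only one common way to do it: take the most probable target per pixel and assign background where no target exceeds a threshold. All names are hypothetical and the maps are plain nested lists.

```python
def merge_targets(prob_maps, threshold=0.5):
    """Combine per-target probability maps into one label map.

    prob_maps: list of H x W nested lists, one per target (ids start at 1).
    Returns an H x W map where 0 means background.
    """
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    labels = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            probs = [pm[y][x] for pm in prob_maps]
            best = max(range(len(probs)), key=probs.__getitem__)
            if probs[best] > threshold:
                labels[y][x] = best + 1  # target ids are 1-based
    return labels


# Two targets, each segmented independently on a 2 x 3 image.
target1 = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.1]]
target2 = [[0.1, 0.6, 0.9], [0.2, 0.3, 0.8]]
print(merge_targets([target1, target2]))
```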
RuntimeError: Error building extension 'nppig_cpp':
[1/2] c++ -MMD -MF nppig.o.d -DTORCH_EXTENSION_NAME=nppig_cpp -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/TH -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/hp/anaconda3/envs/open-mmlab/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /media/hp/01a64147-0526-48e6-803a-383ca12a7cad/WH/wh2020/frtm-vos-master/lib/_npp/nppig.cpp -o nppig.o
FAILED: nppig.o
c++ -MMD -MF nppig.o.d -DTORCH_EXTENSION_NAME=nppig_cpp -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/TH -isystem /home/hp/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/hp/anaconda3/envs/open-mmlab/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /media/hp/01a64147-0526-48e6-803a-383ca12a7cad/WH/wh2020/frtm-vos-master/lib/_npp/nppig.cpp -o nppig.o
/media/hp/01a64147-0526-48e6-803a-383ca12a7cad/WH/wh2020/frtm-vos-master/lib/_npp/nppig.cpp:9:37: fatal error: ATen/cuda/CUDAGuard.h: No such file or directory
compilation terminated.
ninja: build stopped: subcommand failed.
done
Evaluating dv2017val
bike-packing: 0%| | 0/69 [00:00<?, ?frames/s]Segmentation fault
Hello, when I run "python evaluate.py --model rn18_all.pth --fast --dset dv2017val", I always get:
"Computing J-scores
1/30: bike-packing: 2 objects
Traceback (most recent call last):
File "evaluate.py", line 164, in <module>
evaluate_dataset(dset, out_path, measure='J')
File "/new/personal/limoran/frtm-vos/lib/evaluation.py", line 66, in evaluate_dataset
_print("joint {obj}: acc {score:.3f} \u250a{apf}\u250a".format(obj=obj_id, score=s, apf=text_bargraph(score)))
File "/new/personal/limoran/frtm-vos/lib/evaluation.py", line 20, in _print
print(msg)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 19-89: ordinal not in range(128)"
I don't know why this happens or how to modify the code. Could you help me?
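The error means the process's standard output uses the ASCII codec, which cannot represent the box-drawing characters in the bar graph (\u250a and similar). Two hedged options: set the environment variable PYTHONIOENCODING=utf-8 before running, or make the print helper fall back to replacement characters. The sketch below illustrates the second option; safe_text and safe_print are hypothetical names, not functions from the repository.

```python
import sys


def safe_text(msg, encoding=None):
    """Return msg with characters the target encoding cannot represent replaced by '?'."""
    enc = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return msg.encode(enc, errors="replace").decode(enc)


def safe_print(msg):
    """Print msg without raising UnicodeEncodeError on limited terminals."""
    print(safe_text(msg))


# With an ASCII-only output, the four bar characters degrade to '?'.
print(safe_text("acc 0.812 \u250a\u2588\u2588\u250a", "ascii"))  # acc 0.812 ????
safe_print("joint 1: acc 0.812 \u250a\u2588\u2588\u250a")
```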
Hello!
Thanks for your wonderful code.
I am new to this field, and I want to run this code on multiple GPUs. I changed the --dev parameter to cuda:0,1, but it has no effect. Should I use nn.DataParallel() somewhere in the code, or take some other approach?
Excuse me, I would like to know when you will release your code. Sorry to disturb you.
When I run the code, it stops at "Compiling npp extension", and I don't know why.
Although this may not be the right place to ask, I have been confused for a long time and hope to get a hint.
The first question: why does the 'target model' use an L2 loss instead of other losses common in segmentation (Dice, cross-entropy, etc.)? The second question: how can only two convolutional layers, without non-linear activation layers, achieve such a strong result? (I noticed that this layer is named 'project'; the reasoning behind the 'projection layer' also puzzles me.)
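On the second question, one fact that may help: two linear layers with no activation between them compose into a single linear map, so the pair is a factorization of one linear filter and not a more expressive model. Applied to the shapes in a traceback elsewhere on this page (a 1x1 convolution with weight [32, 1024, 1, 1]), the first layer would reduce each pixel's 1024-dimensional feature to 32 dimensions and the second would score it. The sketch below checks the composition on tiny integer matrices. It illustrates the algebra only and is not the authors' explanation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in left]


# Per-pixel view of two 1x1 convolutions: project 4 channels to 2, then score.
project = [[1, 0, 2, -1],
           [0, 3, -1, 1]]     # shape (2, 4)
score_filter = [[2, -1]]      # shape (1, 2)
pixel = [1, 2, 3, 4]          # one pixel's feature vector

two_step = matvec(score_filter, matvec(project, pixel))
combined = matmul(score_filter, project)   # a single (1, 4) linear map
one_step = matvec(combined, pixel)
print(two_step, one_step)
```

Both paths give the same score, so the benefit of the split would lie in the smaller intermediate dimension (fewer parameters to optimize per target), not in added expressiveness.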
Hi, thanks for sharing.
When I use the pretrained rn101-ytvos model to evaluate on the YouTube-VOS 2018 dataset with nothing else changed, I only get 0.695, which is far from 0.721. What should I do to get the correct value?
Looking forward to your response, thank you!
Hi, thanks for sharing!
The discriminative method used in the paper is called the target model, and it corresponds to the "Discriminator" class in the code. The term "discriminator" is often used in generative adversarial networks. How should I understand it here, and is it related to GANs? Judging from the loss function and the model, I could not find any relation to GANs, so why is it called "Discriminator"? Looking forward to your response, thank you!