suhwan-cho / tmo
[WACV 2023] Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation
License: MIT License
.
Hi,
Your work is really great, but how can I train on my own dataset? My dataset is similar to DUTS, with binary masks. I noticed that DAVIS is only used as the validation set. Could you tell me where I can modify the color palette used for the segmentation masks?
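For what it's worth, here is a minimal sketch of writing a binary mask as a palettized PNG, assuming DAVIS-style 'P'-mode mask files; the function name and palette choice are my own, not the repo's code:

```python
import io
from PIL import Image

# Hypothetical sketch: a two-color palette with index 0 -> black (background)
# and index 1 -> white (object). Adapt the palette wherever the repo writes
# its prediction PNGs.
def save_binary_mask(mask_rows, fp):
    h, w = len(mask_rows), len(mask_rows[0])
    img = Image.new('P', (w, h))
    img.putdata([v for row in mask_rows for v in row])
    palette = [0, 0, 0, 255, 255, 255] + [0] * (254 * 3)  # 256 RGB entries
    img.putpalette(palette)
    img.save(fp, format='PNG')

buf = io.BytesIO()
save_binary_mask([[0, 1], [1, 0]], buf)
buf.seek(0)
reloaded = Image.open(buf)
print(reloaded.mode, list(reloaded.getdata()))
```

Since the masks are binary, only the first two palette entries matter; multi-object datasets would need more colors.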
.
Hello, I'm sorry to bother you again.
In this project, max_epoch is set to 4000. Although that setting looks large, training converges quickly. In my past experience, though, the number of epochs is usually set between 100 and 300; I have trained other models for 200 epochs, and they train more slowly than yours while reaching similar accuracy.
So what does an "epoch" mean here? Is it actually an iteration, or some other unit? I am quite confused. Could you give me some help?
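To illustrate the question, here is a toy sketch (my own assumption, not the repo's code) of how an "epoch" can mean a fixed number of randomly sampled batches rather than a full pass over the dataset, in which case 4000 "epochs" is far cheaper than 4000 full passes:

```python
import random

# Toy sketch (an assumption, not the repo's code): each "epoch" draws a
# fixed number of random samples instead of iterating the whole dataset.
dataset_size = 10000        # e.g. number of saliency training images
samples_per_epoch = 8       # assumed: one small sampled batch per "epoch"
max_epoch = 4000

random.seed(0)
total_samples = 0
for _ in range(max_epoch):
    batch = random.sample(range(dataset_size), samples_per_epoch)
    total_samples += len(batch)

full_passes = total_samples / dataset_size
print(total_samples, full_passes)  # 32000 samples, i.e. only 3.2 full passes
```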
.
Hello,
I'm coming back to this. I have two questions about output selection:
I trained and tested the new TMO without output selection, and the metrics come out the same as the previous version of TMO, but the visualized binary maps show significant edge jaggedness. I'm curious why the visualizations differ even though I'm computing the same metrics; edge jaggedness should result in a lower mIoU. The binary map predicted by the new TMO is below. (Is it because of the parameter B? Is the output a soft score?)
On my own dataset, performance drops significantly when output selection is used. Looking at the code, the purpose of output selection seems to be to compare the proportion of well-defined (non-blurred) pixels in the saliency maps. Isn't this somewhat unsuitable as a confidence measure? For example, it does not take structural similarity into account. Also, the parameter B here is meant to be binary, so should output selection be applied when computing the final score?
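To make the concern concrete, here is a toy sharpness-based confidence score of the kind described above (pixels near 0 or 1 count as "defined", pixels near 0.5 as ambiguous); this is an illustration of the idea, not the repo's exact formula:

```python
# Toy illustration (not the repo's formula): confidence as the average
# distance of soft-saliency values from the ambiguous midpoint 0.5,
# rescaled to [0, 1]. Note it ignores structure entirely.
def confidence(soft_map):
    flat = [p for row in soft_map for p in row]
    return sum(abs(p - 0.5) * 2 for p in flat) / len(flat)

sharp = [[0.95, 0.05], [0.90, 0.10]]   # well-defined prediction
blurry = [[0.60, 0.40], [0.55, 0.45]]  # ambiguous prediction
print(confidence(sharp), confidence(blurry))
```

A map that is confidently wrong would still score high here, which may relate to the drop seen on domain-shifted data.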
Looking forward to your reply~
Hi, is it possible to generate a single-channel output with TMO? Since TMO currently uses standard cross-entropy loss, it produces a 2-channel output. Can we just change the last layer of TMO to produce a single-channel output? If so, do we also need to change the IoU code after the loss, as well as the code for the J and F values inside mode='val'?
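As a rough sketch of what such a change involves (my own assumption, not the repo's code): a 1-channel head would pair with BCE-with-logits instead of cross-entropy, and the IoU would threshold a sigmoid instead of taking an argmax over two channels:

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not the repo's code): single-channel head with BCE,
# plus the matching IoU on thresholded sigmoid outputs.
torch.manual_seed(0)
logits = torch.randn(2, 1, 8, 8)                    # 1-channel head output
target = torch.randint(0, 2, (2, 1, 8, 8)).float()  # binary ground truth

loss = nn.BCEWithLogitsLoss()(logits, target)       # replaces CrossEntropyLoss

pred = (torch.sigmoid(logits) > 0.5).float()        # threshold instead of argmax
inter = (pred * target).sum()
union = ((pred + target) > 0).float().sum()
iou = (inter / union.clamp(min=1)).item()
print(loss.item(), iou)
```

Any downstream code that assumes a 2-channel softmax (IoU, J&F evaluation in the validation path) would need the same sigmoid/threshold treatment.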
.
Hi, sorry, I got a bit confused while using your updated repo. Previously there was only code for TMO, with no option for choosing the encoder or output selection, so downloading the code simply meant getting TMO's code.
Now that there are many options, I have a few questions:
TMO++ with mitb1 encoder:
# set device
torch.cuda.set_device(0)
# define model
ver = 'mitb1'
aos = True
model = TMO(ver, aos).eval()
# training stage
if options.train:
    model = torch.nn.DataParallel(model)
    train_duts_davis(model, ver)
TMO++ with ResNet-101 encoder:
# set device
torch.cuda.set_device(0)
# define model
ver = 'rn101'
aos = True
model = TMO(ver, aos).eval()
# training stage
if options.train:
    model = torch.nn.DataParallel(model)
    train_duts_davis(model, ver)
TMO with mitb1 encoder:
# set device
torch.cuda.set_device(1)
# define model
ver = 'mitb1'
aos = False
model = TMO(ver, aos).eval()
# training stage
if options.train:
    # model = torch.nn.DataParallel(model)
    train_duts_davis(model, ver)
TMO with ResNet-101 encoder:
# set device
torch.cuda.set_device(1)
# define model
ver = 'rn101'
aos = False
model = TMO(ver, aos).eval()
# training stage
if options.train:
    # model = torch.nn.DataParallel(model)
    train_duts_davis(model, ver)
Hi,
my dataset consists of binary segmentation masks, and I modified util.py accordingly. Training runs and the loss keeps decreasing, but the output IoU does not look normal. I have two datasets: on one, the IoU stabilizes at 1 within fewer than 20 rounds; on the other, the IoU stays at 0. Could the model initialization be causing any problems?
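One common cause worth ruling out (a guess, not a diagnosis of this repo): an IoU pinned at exactly 0 or 1 often means the loaded ground-truth masks are not in {0, 1}, e.g. they were stored as {0, 255} and never rescaled, so the comparison degenerates. A minimal normalizer sketch:

```python
# Hypothetical sanity check (an assumption, not the repo's code): force
# loaded mask values into {0, 1} before computing loss/IoU, so masks
# stored as {0, 255} do not break binary comparisons.
def normalize_mask(mask_rows):
    return [[1 if v > 0 else 0 for v in row] for row in mask_rows]

raw = [[0, 255], [255, 0]]
print(normalize_mask(raw))
```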
.
Hello, I would like to ask: do you have a specific script for generating the optical flow for each dataset?
Hi, sorry for disturbing you so much; it's because I really like your approach. However, I have a few questions:
Is there any difference between TMO and TMO++ in the training stage? As far as I can tell from the papers and the code, there is no difference between them during training: both randomly use RGB images and optical flow as input to the motion encoder.
The major difference I see is the output-selection algorithm, which does not affect the training process. Am I right?
I trained TMO (with the rn101 encoder) and TMO++ (with the rn101 encoder) on ultrasound data, using ultrasound images in place of DUTS and ultrasound videos in place of DAVIS 2016. However, TMO performs better than TMO++ on the same data: for example, TMO reaches 66.4 in terms of the mean of J and F, while TMO++ reaches 62.3. The gap is large and does not seem reasonable; it would only make sense if TMO and TMO++ differed in the training stage. Could you please help me understand this?
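The "motion as option" training trick mentioned above can be sketched as follows (a toy illustration of the idea, not the repo's implementation):

```python
import random

# Toy sketch of "treating motion as option" at training time: the motion
# stream randomly receives either optical flow or the RGB frame, so the
# model cannot become fully flow-dependent. Under this scheme the training
# recipe is shared, and output selection only changes inference.
def motion_input(rgb, flow, p_flow=0.5):
    return flow if random.random() < p_flow else rgb

random.seed(0)
picks = [motion_input('rgb', 'flow') for _ in range(1000)]
print(picks.count('flow'))  # roughly half of the draws use flow
```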
.
Thanks for your interesting work. I want to ask about the inference time. I ran your code on my 2080 Ti with the same environment, but the inference speed reported by your print code is only about 20 FPS, much lower than the 43.2 FPS claimed in your paper. Could you provide more details about inference, or some insight into this difference?
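When comparing FPS numbers like this, the usual suspects are missing warm-up iterations, missing `torch.cuda.synchronize()`, and data loading being counted inside the timed region. A generic timing sketch (the model here is a stand-in, not TMO):

```python
import time
import torch

# Generic timing sketch (an assumption about methodology, not the repo's
# benchmark): warm up first, synchronize around the timed region, and time
# only the forward passes.
def measure_fps(model, inp, n_warmup=3, n_iters=10):
    with torch.no_grad():
        for _ in range(n_warmup):        # warm-up: exclude CUDA init, autotuning
            model(inp)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        start = time.time()
        for _ in range(n_iters):
            model(inp)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
    return n_iters / (time.time() - start)

model = torch.nn.Conv2d(3, 3, 3, padding=1).eval()  # stand-in for the real model
fps = measure_fps(model, torch.randn(1, 3, 64, 64))
print(fps > 0)
```

GPU generation alone can also explain a large part of the gap if the paper's number was measured on faster hardware.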