zacjiang / gma
Learning to Estimate Hidden Motions with Global Motion Aggregation (ICCV 2021)
License: Do What The F*ck You Want To Public License
May I ask whether you would consider releasing your trained models?
I'd like to estimate the optical flow of 4K images, but CUDA runs out of memory (Tesla T4).
Please tell me what I can do.
Thank you!
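Two common workarounds for out-of-memory errors on large frames are downscaling the inputs (and rescaling the predicted flow accordingly) or running inference on overlapping crops and blending the results. A minimal, framework-free sketch of computing overlapping crop boxes; the tile and overlap sizes are arbitrary assumptions, not GMA defaults:

```python
# Hypothetical helper: compute overlapping crop boxes so a large frame can be
# processed in pieces that fit in GPU memory. tile/overlap are illustrative.
def tile_boxes(height, width, tile=1024, overlap=128):
    """Return (top, left, bottom, right) boxes covering the full image."""
    step = tile - overlap
    tops = list(range(0, max(height - tile, 0) + 1, step))
    lefts = list(range(0, max(width - tile, 0) + 1, step))
    # Make sure the last row/column of tiles reaches the image border.
    if tops[-1] + tile < height:
        tops.append(height - tile)
    if lefts[-1] + tile < width:
        lefts.append(width - tile)
    return [(t, l, min(t + tile, height), min(l + tile, width))
            for t in tops for l in lefts]
```

Each crop is run through the model separately; the overlap regions can then be blended (e.g. averaged) to hide seams at tile borders.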
Thanks for the great code!
I try to reproduce the Sintel and KITTI test results reported in the paper. However, I got 1.58 on Sintel clean, 2.64 on Sintel final for GMA(our), and 5.14 on KITTI for GMA(p only). The results seem worse than those reported in the paper (1.39 on Sintel clean, 2.47 on Sintel final for GMA(our), and 4.93 on KITTI for GMA(p only)).
Is it because you find the best iteration checkpoint on the validation set, while I use the last iteration checkpoint? If so, may I know the validation set you choose?
Hi, this is very nice work!
I would like to ask a question about the KITTI submission. After I generate the 'kitti_submission' folder, how do I create the right ZIP file?
The tips on the KITTI website are as follows:
For the optical flow task, do we only need to include the 'flow' folder in the submitted archive? So should I just rename the generated 'kitti_submission' folder to 'flow' and compress it into a ZIP file?
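Assuming the benchmark does expect a top-level folder named 'flow' (as the quoted tips suggest), the packaging step can be sketched with the standard library; the folder and archive names here are illustrative:

```python
# A sketch of packaging a KITTI flow submission, assuming the site expects a
# ZIP whose top-level folder is named "flow". Paths are illustrative.
import os
import shutil

def make_submission_zip(src="kitti_submission", out="submission"):
    """Copy the generated folder to 'flow/' and zip it as <out>.zip."""
    staging = "flow"
    if os.path.exists(staging):
        shutil.rmtree(staging)
    shutil.copytree(src, staging)
    # root_dir="." keeps the 'flow/' prefix inside the archive.
    return shutil.make_archive(out, "zip", root_dir=".", base_dir=staging)
```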
Thanks for making your source code public.
Could you please share your code for visualizing the attention map in Figure 6 or guide me on how to obtain it?
Hi, thank you for your great work.
I want to test GMA on real-world images (i.e., not synthetic ones). Could you tell me which one of the four checkpoints (chairs, kitti, sintel, things) is expected to generalize best to real-world images?
I created this issue on the RAFT GitHub, but I feel it is applicable to your work as well, since you also adopt their FlowAugmentor. Could you maybe provide reasoning for this?
Thanks in advance!
/home/sunyy/anaconda3/envs/gma/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
/home/sunyy/anaconda3/envs/gma/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:1290: UserWarning: To get the last learning rate computed by the scheduler, please use get_last_lr().
Can you guys share the model?
Hi, can you tell me the training time of each training phase (chairs, things, sintel, kitti) and the data storage device you used (SSD or HDD)? Thank you very much!
Hi! Great work! When I tested on the Sintel test final dataset and ran create_sintel_submission, I got "IndexError: index 0 is out of bounds for axis 0 with size 0" on a pair of images. Why?
Hi, the file named 'things_val_test_set.txt' referenced in core/datasets.py cannot be found, which leads to a failure when validating on the FlyingThings3D dataset.
Please provide this file and explain its source. Thanks.
Hi, I'm trying to reproduce your result on the Sintel benchmark. I notice that you use 'C + T + S/K (+ H)' in the experiment table of the paper. To my knowledge, referring to the RAFT paper, C+T+S/K means that in the Sintel training stage you only use C+T+S. I don't understand what the (+H) is, even with the explanation: "'S/K (+ H)' refers to methods that are fine-tuned on the Sintel and KITTI datasets, with some also fine-tuned on the HD1K dataset." What does 'with some' mean?
Could you please detail the training schedule you used in the Sintel submission? Is it the C+T+S+H?
I was using HD1K when doing Sintel fine-tuning, just as GMA does. I'm surprised that it consists only of grayscale images; that means there's a big domain gap between HD1K and the other training sets. I wonder whether the model would train better if I removed HD1K. My training without HD1K is ongoing; so far the loss on the training data is much smaller and the accuracy is higher. I will update when it finishes.
When I input two images of size 768 x 1856 to the model, I got the error below:
einops.EinopsError: Error while processing rearrange-reduction pattern "(y v) d -> y () v d". Input tensor shape: torch.Size([25600, 128]). Additional info: {'y': 232}. Shape mismatch, can't divide axis of length 25600 in chunks of 232
Any idea why this happens?
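The einops message says the flattened axis of 25600 elements cannot be split into chunks of y=232. A quick, framework-free sanity check of the shapes involved, assuming the standard 1/8 feature stride used by RAFT-style models (a debugging aid, not part of GMA's code):

```python
# Check whether a flattened token count is consistent with the expected
# (H // 8) x (W // 8) feature-map size used at the attention stage.
def expected_tokens(height, width, stride=8):
    return (height // stride) * (width // stride)

# For a 768 x 1856 input, 1/8-resolution features are 96 x 232, i.e. 22272
# tokens -- not the 25600 in the error. 25600 is also not divisible by 232,
# which is exactly what einops complains about; this suggests the tensor
# being rearranged was produced at a different resolution than 'y' assumes.
```

If the check fails like this, it is worth verifying that both frames were padded or resized consistently before being fed to the model.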
I wonder how to accurately count the floating-point operations of the GMA model. The open-source profiling tools basically only count convolutions; for the special operations in the model, are there any good counting methods? In other words, how do you do it?
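Operations that generic profilers miss can be counted by hand from their shapes. The formulas below are the standard approximations (a multiply-add counted as 2 FLOPs); the layer shapes in the test are illustrative, not GMA's actual configuration:

```python
# Hand-counting FLOPs for layers that generic counters skip.
def conv2d_flops(c_in, c_out, k, h_out, w_out, groups=1):
    """2 * (k*k*c_in/groups) multiply-adds per output element."""
    return 2 * (k * k * c_in // groups) * c_out * h_out * w_out

def attention_flops(n_tokens, dim):
    """Q.K^T plus attn.V: two (n x n x d) matmuls, 2 FLOPs per multiply-add."""
    return 2 * 2 * n_tokens * n_tokens * dim
```

Summing such per-layer counts over the network (including each recurrent iteration separately) gives a total that can be cross-checked against a convolution-only profiler.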
Hi, I have run the first two stages in your train.sh, chairs and things, but I only get 1.35 and 2.83 EPE on Sintel's clean and final passes.
I want to know how I can get the same results as in your paper, i.e. 1.30 and 2.74. Are they achieved by training multiple times and taking the best value?
Hi,
I wonder whether GMA can handle the case where subtitles accompany large motion, which is common in movies.
Would the subtitles be preserved well in an interpolation result?
Hi @zacjiang
Would you like to release the 2-view results on Sintel as RAFT does, i.e. the results without warm-start?
Hey @zacjiang ,
Thank you for sharing your work!
@zacjiang, I was looking to evaluate the pre-trained model on the KITTI test set. I have completed the repository setup and got it running according to the instructions on GitHub. I was able to reproduce the results in Table 2 of the paper for the KITTI train split.
But when I run it for the KITTI test split, the execution of evaluation.py fails, because evaluation.py expects 4 outputs from the data loader:
Lines 348 to 355 in 2f1fd29
Lines 38 to 46 in 2f1fd29
In that case, how do I get numbers for the KITTI test split?
Regards,
Nitin Bansal
Thank you for your concise and efficient work!
I would like to ask about the impact of the number of transformer heads. I found no ablation for this variable in the published paper, and you set it to 1 in your code.
May I ask whether you have run experiments on it, and whether performance improves if the number of heads is increased?
Hello,
Thank you very much for sharing your precious work with us.
I had two questions regarding the code.
Thanks a lot.
Azin
The paper says, "we project the context feature map to a query feature map and a key feature map. We then take the dot product of the two feature maps and a softmax to obtain an attention matrix",
but in network.py line 99 I just found "attention = self.att(inp)". This is what puzzled me.
Hi, good work! I have a question: why don't you use "attention = self.att(fmp1)" instead of "attention = self.att(inp)"?
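The quoted description can be sketched in a few lines: project features to queries and keys, take dot products, and softmax over the keys. This is a minimal, dependency-free illustration of that mechanism, not GMA's actual implementation (which does it inside self.att on the context features); the 2-token, 2-dimensional numbers are purely illustrative:

```python
# Minimal sketch of query/key attention as described in the paper.
import math

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention_matrix(features, w_q, w_k):
    """features: list of token vectors; w_q, w_k: projection matrices."""
    project = lambda x, w: [sum(xi * wij for xi, wij in zip(x, col))
                            for col in zip(*w)]
    q = [project(f, w_q) for f in features]   # query feature map
    k = [project(f, w_k) for f in features]   # key feature map
    # Dot product of every query with every key, then row-wise softmax.
    scores = [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    return [softmax(row) for row in scores]
```

The sketch makes the question concrete: both projections here act on the same feature map, which matches computing attention from the context features (inp) rather than from fmp1.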