With the VGG16 model you should get an accuracy of around 78%. More advanced models will give you higher accuracy. For example, with ResNet152 you should be able to reach around 83%.
For optical flow, my implementation may have bugs right now, and the preprocessing may need modification. I am running experiments and will update the code soon.
from two-stream-pytorch.
May I ask which parameters you used to reach 78% accuracy, and after how many epochs you reached that value?
I've used the default parameters (and tried other settings as well), but my best result is still 73.6%, which is far from your 78%.
These are the parameters I tried: modality rgb, dataset ucf101, split 1, workers 4, epochs 1000, start-epoch 0, iter-size 5, new_length 1, new_width 340, new_height 256, save-freq 20, resume data/checkpoints, gpus 0, arch vgg16, batch-size 32, learning-rate 0.001, momentum 0.9, weight-decay 5e-4.
For optical flow the results are even worse (near 60%). Do you know what the bugs you mentioned are related to?
Hi, your parameters look fine. I think the difference may come from the input quality. I use OpenCV to decode the videos into images with the image quality set to 95, which is the default value. If you use ffmpeg to decode the videos, the image quality is between 60 and 75. The lower image quality leads to worse performance. I used ffmpeg for decoding before and also got 74% accuracy, similar to yours, so I think this is the reason.
For optical flow, I think the cause is the preprocessing. For now I use 0.5 and 0.5 for the mean and std, but that may not be appropriate: the original Caffe implementation only subtracts the mean and does not divide by the std. The pre-trained VGG16 models in Caffe and PyTorch expect differently preprocessed inputs. Hope this is clear.
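A small pure-Python sketch of the two preprocessing conventions may help show why they are not interchangeable. It assumes the flow images are stored as 8-bit values, that the PyTorch-style path first scales them to [0, 1], and that the Caffe-style mean is 128. Those three details are assumptions for illustration, not something confirmed in this thread, and both function names are hypothetical.

```python
def normalize_mean_std(pixels, mean=0.5, std=0.5):
    """PyTorch-style (assumed): scale 8-bit values to [0, 1], then (x - mean) / std."""
    return [((p / 255.0) - mean) / std for p in pixels]


def subtract_mean_only(pixels, mean=128.0):
    """Caffe-style (assumed mean of 128): keep the 0-255 scale, subtract the mean only."""
    return [p - mean for p in pixels]


flow_row = [0, 64, 128, 255]           # one row of an 8-bit flow image
print(normalize_mean_std(flow_row))    # values end up in [-1, 1]
print(subtract_mean_only(flow_row))    # values end up in [-128, 127]
```

The two outputs differ in scale by roughly a factor of 128, so weights trained on one convention see very different input magnitudes when fed the other.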
Thanks for your reply.
To compute the input (both RGB frames and optical flow), I used the extract_optical_flow.sh script from this related project: https://github.com/yjxiong/temporal-segment-networks
The input works with your data parser without any issues, so I guess it is OK!
That should work. I also used the same script to compute the input.
I wonder what causes the difference in accuracy. After how many epochs did you get 78%?
I'm also working on using multiple (25/50/100...) frame samples for each test video and then averaging all the predictions to get the final one, as almost all the papers do. I might share the code here once it works.
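The multi-sample idea could look roughly like the sketch below: pick evenly spaced frames, score each one, and average the score vectors before taking the argmax. This is pure Python with made-up scores, not the test script from this repo. The names `sample_indices`, `average_scores` and `predict_video` are hypothetical, and in practice the per-frame scores would come from the trained network.

```python
def sample_indices(num_frames, num_samples=25):
    """Pick num_samples frame indices spread evenly over a video of num_frames frames."""
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    step = num_frames / num_samples
    # Short videos yield repeated indices; clamp to the last valid frame.
    return [min(num_frames - 1, int(i * step)) for i in range(num_samples)]


def average_scores(per_frame_scores):
    """Average class-score vectors (one per sampled frame) element-wise."""
    n = len(per_frame_scores)
    num_classes = len(per_frame_scores[0])
    return [sum(s[c] for s in per_frame_scores) / n for c in range(num_classes)]


def predict_video(per_frame_scores):
    """Return the class index with the highest averaged score."""
    avg = average_scores(per_frame_scores)
    return max(range(len(avg)), key=avg.__getitem__)


# Toy example with 2 classes and 3 sampled frames (scores are invented).
scores = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
print(predict_video(scores))
```

In the toy example the first frame alone favours class 0, while the averaged scores favour class 1, which is the kind of single-frame noise the averaging is meant to smooth out.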
@dataintensiveapplication I updated the code, added my test script, and reported my accuracy. You can try it and see whether you can reproduce the results.