
two-stream-pytorch's People

Contributors

bryanyzhu, qijiezhao, turingyizhu, wenwu313


two-stream-pytorch's Issues

Is there a demo to recognize the action?

Hi, may I know whether there is a demo to detect human actions in a short video? Not reporting the accuracy of the network, but the actual predictions, like jump, run, fall, dance. Thanks.

Are optical flow components rescaled to [0,255]?

Hello @bryanyzhu ,

Thank you for the work.

In the paper you mentioned that you rescaled the optical flow components to [0,255] before feeding them to the temporal network. I want to make sure whether that is also the case for the pretrained flow resnet152 model.

Thank you
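
For reference, a minimal sketch of the usual rescaling, assuming the common convention of clipping displacements to a fixed bound (e.g. ±20 pixels, as in TSN-style pipelines) and mapping zero motion to roughly 128. The bound is an assumption, not confirmed by this repo:

import numpy as np

def flow_to_uint8(flow_component, bound=20.0):
    # Clip displacements to [-bound, bound], then map linearly to [0, 255]
    # so that zero motion lands near 128. bound=20 is a common choice,
    # not necessarily what this repo's pretrained model used.
    clipped = np.clip(flow_component, -bound, bound)
    return np.round((clipped + bound) * 255.0 / (2 * bound)).astype(np.uint8)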

Loss changes little (~4.5) and Prec@1 stays extremely low (~2%) when training with flow images

Hi. The code works well with RGB frames after I made some changes. However, it runs into problems when training with flow images: the loss and Prec@1 seem to stay unchanged.

I ran this code on 4 GPUs with a batch size of 224. I set the learning rate to the initial LR (0.005), decayed by 10 every 30 epochs. new_length was set to 5 and in_channels was changed to 10 (5*2). The flow images were computed with OpenCV and saved as '.jpg':
flow = cv2.calcOpticalFlowFarneback(prevGray, nextGray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
# cv2.normalize returns floats; cast to uint8 before writing JPEGs
flow_x = cv2.normalize(flow[..., 0], None, 0, 255, cv2.NORM_MINMAX).astype('uint8')
flow_y = cv2.normalize(flow[..., 1], None, 0, 255, cv2.NORM_MINMAX).astype('uint8')

The following figure shows the results at the beginning.
[screenshot: training log at the start of training]

And the following figure shows the results after 30 epochs.
[screenshot: training log after 30 epochs]

I tried vgg16 and inception_v3 (pre-trained and training from scratch). I also tried different initial LRs, from 0.005 down to 0.0001. Same issue. It's weird. Does anyone have comments on this?
[screenshot: training log for the other settings]

About flow_vgg16

Thanks for your open-source code! I've met a problem with flow_vgg16. I train it with
python main_single_gpu.py /data/ltj/tsn -m flow -a flow_vgg16 --new_length=10 --epochs 350 --lr 0.001 --lr_steps 200 300
but the result of the last training epoch is only
* Prec@1 48.744 Prec@3 69.469
and I don't think I can reach 80% even if I run temporal_demo.py.
Can you give me some advice?
I changed only one line of your code:
rgb_weight_mean = torch.mean(rgb_weight, dim=1, keepdim=True)
It is in the change_key_name function of flow_vgg16.py. I added keepdim=True because torch.mean squeezes dim 1 otherwise.
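
For context, a minimal sketch of what this weight transfer (cross-modality pre-training) does; the function name here is hypothetical, the real logic lives in change_key_name of flow_vgg16.py. The pretrained RGB conv1 weights are averaged over the 3 input channels and tiled across the 2*new_length flow channels, which is why keepdim=True matters:

import torch

def rgb_to_flow_conv1(rgb_weight, flow_channels=20):
    # rgb_weight: (out_channels, 3, kH, kW) pretrained RGB conv1 weights.
    # keepdim=True keeps dim 1, giving (out_channels, 1, kH, kW),
    # so the average can be repeated below; without it the repeat fails.
    mean = torch.mean(rgb_weight, dim=1, keepdim=True)
    # Tile the average across all stacked-flow input channels.
    return mean.repeat(1, flow_channels, 1, 1)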

I have another question about
clip_mean = [0.5, 0.5] * args.new_length
clip_std = [0.226, 0.226] * args.new_length
in main_single_gpu.py. Why do clip_mean and clip_std need to be multiplied by new_length? Shouldn't the mean be constant even though there are 10 samples?
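
The per-channel mean is indeed constant; the list is tiled only so its length matches the channel count. A stacked flow clip has 2 * new_length input channels, and a torchvision-style Normalize expects one mean/std entry per channel. A minimal sketch, assuming the repo's transform behaves like torchvision's Normalize:

import torchvision.transforms as transforms

new_length = 10
# One (x, y) pair of statistics per stacked flow frame:
clip_mean = [0.5, 0.5] * new_length   # 20 entries for a 20-channel tensor
clip_std = [0.226, 0.226] * new_length
normalize = transforms.Normalize(mean=clip_mean, std=clip_std)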

A problem when using VGG as the motion model

Hi, I'm trying to use VGG16 as the motion model to fine-tune on the UCF101 dataset. First I stack 10 x-axis optical flow images and 10 y-axis images to get a 20-channel clip, then feed it to a VGG16 pretrained on ImageNet (changing in_channels from 3 to 20 and the last classifier layer to 101 outputs) for the classification task. I met the same problem as #1.
But when I use a ResNet architecture, everything works well. Does this relate to the preprocessing of the optical flow images? I don't normalize them.
My training and test Prec@1 are both around 1% and don't go up.

FileNotFoundError: [Errno 2] No such file or directory: '/home/taimoor/.keras/datasets/cifar-10-batches-py/data_batch_1'

I am running cifar10_cnn.py with this command and get the following exception:
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python cifar10_cnn.py
Using TensorFlow backend.
Traceback (most recent call last):
File "cifar10_cnn.py", line 37, in
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
File "/home/taimoor/anaconda3/envs/my_env/lib/python3.6/site-packages/keras/datasets/cifar10.py", line 26, in load_data
data, labels = load_batch(fpath)
File "/home/taimoor/anaconda3/envs/my_env/lib/python3.6/site-packages/keras/datasets/cifar.py", line 18, in load_batch
f = open(fpath, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/home/taimoor/.keras/datasets/cifar-10-batches-py/data_batch_1'

The file is present in that directory. I tried giving an absolute path but couldn't run it; I get the same error.
I tried to get help from Google and tried many fixes found by searching the issue description, but it is still not resolved.
I am stuck.

Question about training the models together

Hi, I was wondering whether the models can be trained together with one single loss instead of being trained separately.
I have seen other two-stream UCF101 codebases, but all of them seem to train the streams separately and then fuse by adding the output logits or something similar. Do you think it's possible to train both at once? Has this been done?
And if it has, are there any recommended hyperparameters? They seem to differ for each modality.

Thanks in advance!
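
Joint training is possible in principle by summing both streams' losses into one objective, though whether it helps is an empirical question (the original two-stream paper trains the streams separately). A minimal sketch; the model names and the weighting alpha are hypothetical and would need tuning:

import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def joint_step(rgb_model, flow_model, rgb_input, flow_input, target, alpha=0.5):
    # One backward pass through both streams with a single combined loss.
    rgb_logits = rgb_model(rgb_input)
    flow_logits = flow_model(flow_input)
    loss = alpha * criterion(rgb_logits, target) + (1 - alpha) * criterion(flow_logits, target)
    loss.backward()
    return loss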

The performance of the flow-vgg16

Hi, for flow-vgg you obtain 80%, while the performance reported in the paper you reproduce is 85.7%. What is the back-propagation depth of your experiments? It may be a factor affecting performance. Also, you didn't mention the training details for VGG-16.

Batch normalization in VGG16

Hi, I just want to know whether batch_norm influences the performance of rgb_vgg16, because it looks like you didn't use batch norm in rgb_vgg16:
def make_layers(cfg, batch_norm=False):
model = VGG(make_layers(cfg['D']), **kwargs)
I saw there is an rgb_vgg16_bn function, but it has no pretrained weights available, so how does it perform?
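
For what it's worth, torchvision does ship ImageNet-pretrained weights for the batch-norm variant, which could serve as a starting point; whether it drops into this repo's rgb_vgg16_bn wrapper unchanged is an assumption:

import torchvision.models as models

# ImageNet-pretrained VGG16 with batch norm (torchvision's own weights)
vgg_bn = models.vgg16_bn(pretrained=True)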

speed and pytorch version

Excuse me, I want to know the running speed of your project and the PyTorch version it requires. If you don't mind, the other requirements as well.

How to obtain better precision

I set the parameters as in the README but obtained 81.76% for the spatial stream on split 1 of the UCF101 dataset using ResNet152. How can I achieve better performance? Thanks.

fusion two stream feature?

Thank you for your code! How can I use an SVM to fuse the two streams' features? I have searched a lot of material but still can't complete the SVM feature fusion. Can you provide some code or suggestions? Thank you!
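
One common recipe (not from this repo) is late fusion: extract a fixed-length feature vector per video from each stream, concatenate them, and train a linear SVM on top. A minimal sketch with scikit-learn; the .npy file names are hypothetical and stand for features you have already extracted, each of shape (num_videos, feat_dim):

import numpy as np
from sklearn.svm import LinearSVC

rgb_feats = np.load('rgb_feats.npy')    # (N, D1), hypothetical
flow_feats = np.load('flow_feats.npy')  # (N, D2), hypothetical
labels = np.load('labels.npy')          # (N,)

fused = np.concatenate([rgb_feats, flow_feats], axis=1)  # (N, D1 + D2)
clf = LinearSVC(C=1.0)
clf.fit(fused, labels)
predictions = clf.predict(fused)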

ModuleNotFoundError: No module named 'torch'

Unable to run the source code; I'm getting an error and have tried many ways to solve it without success.
I am using the latest Anaconda and running this command:
conda install -c peterjc123 pytorch

pytorch-0.2.1- 100% |###############################| Time: 0:12:11 727.92 kB/s
pytorch-0.2.1- 100% |###############################| Time: 0:04:07 2.15 MB/s
pytorch-0.2.1- 100% |###############################| Time: 0:01:14 7.19 MB/s

CondaError: CondaHTTPError: HTTP 000 CONNECTION FAILED for url https://conda.anaconda.org/peterjc123/win-64/pytorch-0.2.1-py36he6bf560_0.2.1cu80.tar.bz2
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

.....
I also disabled SSL verification in .condarc and tried again, but the same issue occurs again and again. I have been googling for the last three days but am stuck :(

prec@1 prec@3

Hello. I trained the resnet152 model on my own dataset. The training accuracy is quite good: Prec@1 in the training phase is 86%. But in the validation phase the accuracy is much lower, and the gap between Prec@1 and Prec@3 is relatively large: 54% and 76% respectively. What is the reason for this? And what exactly do Prec@1 and Prec@3 represent? Looking forward to your reply, thank you!
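
For reference, Prec@k is the fraction of samples whose true label appears among the model's top k predictions, so Prec@1 is ordinary accuracy and Prec@3 is always at least as high. A minimal sketch:

import torch

def precision_at_k(output, target, k=3):
    # output: (batch, num_classes) logits; target: (batch,) class indices.
    _, topk = output.topk(k, dim=1)          # (batch, k) predicted classes
    correct = topk.eq(target.view(-1, 1))    # True where target is in top-k
    return correct.any(dim=1).float().mean().item() * 100.0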

out of memory

(1) When I train the flow stream, it often runs out of memory during validation (at epoch 23 or 47, ...). Can you give me some advice? Thanks!
(2) How do I run it on multiple GPUs?
Log:
Current learning rate is 0.001000:
Epoch: [23][120/383] Time 10.661 (12.373) Loss 1.7992 (1.8488) Prec@1 47.200 (51.433)
Epoch: [23][240/383] Time 10.836 (12.372) Loss 1.7938 (1.8701) Prec@1 50.400 (50.050)
Epoch: [23][360/383] Time 10.346 (12.612) Loss 1.9167 (1.8574) Prec@1 52.000 (50.722)
main_single_gpu.py:282: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
input_var = torch.autograd.Variable(input, volatile=True)
main_single_gpu.py:283: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
target_var = torch.autograd.Variable(target, volatile=True)
main_single_gpu.py:291: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
losses.update(loss.data[0], input.size(0))
main_single_gpu.py:292: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
top1.update(prec1[0], input.size(0))
main_single_gpu.py:293: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
top3.update(prec3[0], input.size(0))
Test: [0/152] Time 18.244 (18.244) Loss 1.9410 (1.9410) Prec@1 44.000 (44.000) Prec@3 76.000 (76.000)
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "main_single_gpu.py", line 362, in
main()
File "main_single_gpu.py", line 192, in main
prec1 = validate(val_loader, model, criterion)
File "main_single_gpu.py", line 286, in validate
output = model(input_var)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/media/ml/G/panna/two-stream-pytorch-master/models/flow_resnet.py", line 154, in forward
x = self.layer3(x)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/media/ml/G/panna/two-stream-pytorch-master/models/flow_resnet.py", line 87, in forward
out = self.conv3(out)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/pn-pytorch/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
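
The warnings in the log point at the cause: volatile=True no longer disables autograd in newer PyTorch, so validation keeps the full computation graph in memory until it overflows. A minimal sketch of the modern fix, with the validation loop body abbreviated:

import torch

def validate(val_loader, model, criterion):
    model.eval()
    with torch.no_grad():  # no graph is built, so activations are freed
        for input, target in val_loader:
            input, target = input.cuda(), target.cuda()
            output = model(input)
            loss = criterion(output, target)
            # loss.item() replaces loss.data[0] for 0-dim tensors
            print(loss.item())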

About learning rate setting

Hi Yi,
How did you decide the LR steps?
Did you follow another work or tune them yourself?
Thanks in advance!

Different running env?

Hello, I want to know whether this repository can run in a different environment, such as CUDA 10 and Python 3.7 or higher.

What versions of PyTorch and CUDA?

Thank you for your code! Can you tell me which versions of PyTorch and CUDA it needs? I ran the code with pytorch==0.3.0 and CUDA 10.1, and after training for about 50 epochs this error occurred:
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/torch/lib/THC/THCBlas.cu:246

A problem with the pre-trained model

I downloaded the pre-trained model (ucf101_s1_rgb_resnet152.pth.tar) you provide, but I can't extract the file. Could you please give me some advice? Thanks!
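
Despite the .tar suffix, PyTorch checkpoints like this are not archives to extract; they are loaded directly with torch.load. A minimal sketch; the 'state_dict' key is an assumption based on common save conventions, so inspect the dict first:

import torch

checkpoint = torch.load('ucf101_s1_rgb_resnet152.pth.tar', map_location='cpu')
print(checkpoint.keys())  # usually contains 'state_dict', but verify
# model.load_state_dict(checkpoint['state_dict'])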

dense_flow

I am facing this error while installing dense_flow:

.
.
[ 57%] Building CXX object CMakeFiles/extract_gpu.dir/tools/extract_flow_gpu.cpp.o
[ 64%] Building CXX object CMakeFiles/extract_warp_gpu.dir/tools/extract_warp_flow_gpu.cpp.o
[ 71%] Building CXX object CMakeFiles/pydenseflow.dir/src/py_denseflow.cpp.o
CMakeFiles/Makefile2:141: recipe for target 'CMakeFiles/extract_warp_gpu.dir/all' failed
make[1]: *** [CMakeFiles/extract_warp_gpu.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/extract_cpu.dir/all' failed
make[1]: *** [CMakeFiles/extract_cpu.dir/all] Error 2
CMakeFiles/Makefile2:178: recipe for target 'CMakeFiles/extract_gpu.dir/all' failed
make[1]: *** [CMakeFiles/extract_gpu.dir/all] Error 2
CMakeFiles/Makefile2:215: recipe for target 'CMakeFiles/pydenseflow.dir/all' failed
make[1]: *** [CMakeFiles/pydenseflow.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

How to solve this issue?

Flipping in flow

I have seen that in the test code, for horizontal flipping of the x component of the optical flow, the image is reversed and the reversed image is subtracted from 255. Why don't we just take the reversed image instead? This is line 71 of VideoTemporalPrediction, which is
255 - img_x[:, ::-1]

Thanks in advance
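
The subtraction is not redundant: horizontally flipping a frame also negates the horizontal motion, and with flow encoded in [0, 255] around a midpoint of ~128, negating a displacement value v maps it to 255 - v. Reversing alone would flip the spatial layout but leave the displacements pointing the wrong way. A minimal numpy sketch:

import numpy as np

img_x = np.random.randint(0, 256, (224, 224)).astype(np.uint8)  # encoded x-flow
flipped = 255 - img_x[:, ::-1]
# A pixel encoding displacement d now encodes roughly -d after the flip,
# i.e. v -> 255 - v, reflected around the 127.5 midpoint.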

Errors about input.size[1]

Hello. When I run main_single_gpu.py with RGB frames, I get the following error:

Traceback (most recent call last):
  File "/home/zxf/Desktop/learn/reference/two-stream-pytorch/main_single_gpu.py", line 357, in <module>
    main()
  File "/home/zxf/Desktop/learn/reference/two-stream-pytorch/main_single_gpu.py", line 182, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "/home/zxf/Desktop/learn/reference/two-stream-pytorch/main_single_gpu.py", line 229, in train
    output = model(input_var)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zxf/Desktop/learn/reference/two-stream-pytorch/models/rgb_vgg16.py", line 30, in forward
    x = self.features(x)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 254, in forward
    self.padding, self.dilation, self.groups)
  File "/home/zxf/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 52, in conv2d
    return f(input, weight, bias)
RuntimeError: Need input.size[1] == 3 but got 30 instead.

The input size is (2, 30, 224, 224): each video's 10 RGB frames are stacked along the channel dimension before being fed to VGG, so the network cannot handle the input.
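
A standard RGB VGG expects 3 input channels, so a (2, 30, 224, 224) batch of 10 stacked frames per clip must either be fed frame by frame or reshaped so the frame dimension folds into the batch. A minimal sketch of the reshape approach; averaging the per-frame logits is one common choice, not necessarily what this repo does:

import torch

clip = torch.randn(2, 30, 224, 224)        # (batch, 10 frames * 3 ch, H, W)
b, c, h, w = clip.shape
frames = clip.view(b * (c // 3), 3, h, w)  # (20, 3, 224, 224), VGG-friendly
# logits = model(frames).view(b, c // 3, -1).mean(dim=1)  # average over frames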

two stream fusion

Hi bryanyzhu,
thanks for your open-source code!
The spatial and temporal streams are both implemented. I want to know whether the fusion of these two streams is implemented as well; I didn't find the relevant source code in the project.

dense_flow issue

When I run: python build_of.py --src_dir ./UCF-101 --out_dir ./ucf101_frames --df_path
I get an error: build_of.py: error: unrecognized arguments
build_of.py [-h] [--src_dir SRC_DIR] [--out_dir OUT_DIR]
[--df_path DF_PATH] [--new_width NEW_WIDTH]
[--new_height NEW_HEIGHT] [--num_worker NUM_WORKER]
[--num_gpu NUM_GPU] [--out_format {dir,zip}]
[--ext {avi,mp4}]
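
argparse complains when a flag such as --df_path is given without its value, and the usage string above shows it expects one (the pasted command looks truncated after --df_path). A hedged example, where the dense_flow build directory is a placeholder you must fill in yourself:

python build_of.py --src_dir ./UCF-101 --out_dir ./ucf101_frames --df_path <path/to/dense_flow/build/>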

test video

Can the model take a video directly as input for testing? How do I do that?

Expected accuracy

Hi,
May I ask what the expected best accuracy is? So far I have achieved 73.64% validation accuracy with RGB, and I am currently running the flow experiments.

Thank you

A new error about input.size[1]

Hello! It's very nice of you to update main_single_gpu.py. I noticed you changed the parameter new_length from 10 to 1. Does new_length represent the number of frames fed into the network? When I change new_length's value, I get an error: RuntimeError: Need input.size[1] == 3 but got xx instead

the result of ssn_test.py

Hi yaunjun,
Your work is very useful, thank you for open-sourcing it. I'm trying to run your code, and I want to know the meaning of the ssn_test.py result (the variable rst). Can you help me point it out in this project?
Thanks a lot.

Problems about VideoSpatialPrediction.py

Hello! Thanks for your code! I have learned a lot from it. But as a beginner, I am a little confused about the code below in VideoSpatialPrediction.py:

# Standard 10-crop evaluation: four corners plus the center of the resized
# frame, and the same five crops of the horizontally flipped frame.
rgb_1 = rgb[:224, :224, :, :]        # top-left
rgb_2 = rgb[:224, -224:, :, :]       # top-right
rgb_3 = rgb[16:240, 60:284, :, :]    # center
rgb_4 = rgb[-224:, :224, :, :]       # bottom-left
rgb_5 = rgb[-224:, -224:, :, :]      # bottom-right
rgb_f_1 = rgb_flip[:224, :224, :, :]
rgb_f_2 = rgb_flip[:224, -224:, :, :]
rgb_f_3 = rgb_flip[16:240, 60:284, :, :]
rgb_f_4 = rgb_flip[-224:, :224, :, :]
rgb_f_5 = rgb_flip[-224:, -224:, :, :]

# Stack all ten crops along the last (sample) axis so they are scored in
# one pass; their predictions are averaged afterwards.
rgb = np.concatenate((rgb_1, rgb_2, rgb_3, rgb_4, rgb_5, rgb_f_1, rgb_f_2, rgb_f_3, rgb_f_4, rgb_f_5), axis=3)

_, _, _, c = rgb.shape

Why is the code written like this? I don't understand why we should do this and what the crops mean.
Looking forward to your answer!

The number of GPUs?

Hello, I wonder how I can train this code with several GPUs. When I ran main_single_gpu.py, only one GPU started working. How can I solve this problem?
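
main_single_gpu.py is written for one GPU, as the name suggests. A minimal sketch of the standard multi-GPU approach in PyTorch of this era (torch.nn.DataParallel); the resnet152 here is just a stand-in for however the script builds its model:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet152()            # stand-in for the script's model
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)    # splits each batch across visible GPUs
model = model.cuda()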

[Question] Splits used for pretrained model...

Thanks for sharing this repo, it is very helpful. Just a quick question about the pretrained temporal resnet152 model you have for download:

  • Was the model ONLY trained on the split01 train set?
  • Are the splits you are using the official splits from the UCF101 website?

Thanks for the info!

Flow model accuracy

Hi @bryanyzhu, thanks for your nice share!
Wang et al. [1] propose a method called cross-modality pre-training which may improve the flow model's performance.

[1]. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

Stuck when I run "python main_single_gpu.py datasets/"

2017-09-21 04:19:56,066 - INFO - Building model ...
/home/ytan/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py:360: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
own_state[name].copy_(param)
2017-09-21 04:20:14,050 - INFO - Model flow_vgg16 is loaded.
2017-09-21 04:20:14,051 - INFO - Saving everything to directory ./checkpoints.
2017-09-21 04:20:14,100 - INFO - 13320 samples found, 9537 train samples and 3783 test samples.
2017-09-21 04:20:14,100 - INFO - 0.005
Could not load file datasets/JumpingJack/v_JumpingJack_g25_c04/flow_x_00029.jpg or datasets/JumpingJack/v_JumpingJack_g25_c04/flow_y_00029.jpg
Could not load file datasets/CliffDiving/v_CliffDiving_g17_c01/flow_x_00060.jpg or datasets/CliffDiving/v_CliffDiving_g17_c01/flow_y_00060.jpg
Could not load file datasets/RopeClimbing/v_RopeClimbing_g17_c05/flow_x_00074.jpg or datasets/RopeClimbing/v_RopeClimbing_g17_c05/flow_y_00074.jpg
Could not load file datasets/PlayingSitar/v_PlayingSitar_g24_c07/flow_x_00186.jpg or datasets/PlayingSitar/v_PlayingSitar_g24_c07/flow_y_00186.jpg
It just stops here and makes no further progress.

What is the accuracy on UCF101?

Hi, thank you very much for providing the code!
Could you tell me what the accuracy on UCF101 is, and under what parameters?
Thanks again!

About pre-trained Model

If I want to obtain a new pre-trained flow_vgg16 model on other video data, what should I do?

What's your next plan?

Hi, @bryanyzhu :

I recently wrote a similar repo for two-stream video classification using PyTorch; last month I reached 85.5% spatial-stream accuracy on split 1 of UCF101. I want to cooperate with you to help build a more useful framework (implementing some famous works such as TSN, C3D, and so on), so I'd like to communicate more with you if you are interested in my idea.
