
ahmetgunduz / real-time-gesrec


Real-time Hand Gesture Recognition with PyTorch on EgoGesture, NvGesture, Jester, Kinetics and UCF101

Home Page: https://arxiv.org/abs/1901.10323

License: MIT License

Python 92.30% Shell 7.70%
gesture-recognition cnn pytorch video-processing machine-learning deep-neural-networks hand-gesture-recognition resnet jester egogesture

real-time-gesrec's Introduction

Hi there 👋 I'm Ahmet

  • 🔭 I’m currently working at @aixplain as a Data Scientist
  • 🎓 I have an educational background from Bogazici, TUM, and LMU
  • 📜 I have two Master's degrees in Data Science and Telecommunications Engineering
  • ⚡ Fun fact: I do not like to code, but I love to solve problems with code


real-time-gesrec's People

Contributors

ahmetgunduz, hypothesis2304, okankop, parkjh688


real-time-gesrec's Issues

Queries about training

Hello,

I am trying to implement your paper from scratch as part of my project and have some questions which I was hoping you could answer. I am only trying to train the detector half of the network for now and using the JESTER dataset to do so.

  1. How is the data getting fed in? Each folder has 'n' frames which belong to a category (i.e. gesture or no gesture). Taking the detector queue as 8 frames, do you then split the 'n' frames into n/8 chunks, each having the gesture or no-gesture label?
  2. How long did you pre-train on JESTER for? Your paper mentions 25 epochs, but I am guessing that is for the classifier? Your code seems to indicate 100 epochs instead.

I am hoping you can help me with this. Thanks in advance!

Regards,
Nishant Bhattacharya
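For question 1, a minimal sketch of the kind of sliding-window labelling being described, assuming each video folder yields an ordered list of frames carrying a single gesture / no-gesture label; this is only an illustration, not necessarily how the repository's loader is implemented:

    # split a video's n frames into non-overlapping 8-frame windows,
    # each inheriting the video's gesture / no-gesture label
    # (window size chosen to match the detector queue mentioned above)
    def make_detector_windows(frames, label, window=8):
        windows = []
        for start in range(0, len(frames) - window + 1, window):
            windows.append((frames[start:start + window], label))
        return windows

    # e.g. a 40-frame "gesture" clip becomes 5 labelled 8-frame windows
    windows = make_detector_windows(list(range(40)), label=1)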

Test the model with RGB_D camera

Hi, thanks for sharing your amazing model.

I am now trying to test the model with my RGB-D camera. However, I am a beginner in pytorch. So, I need some help to go through the code:

  1. I plan to feed the model with depth images, which are obtained from the camera with OpenNI and OpenCV. The shape of each frame is (112, 112, 3). If I want to detect and classify n frames in each iteration, what shape should the input be?

  2. What does "sample_duration" mean? What is the difference between "sample_duration_det" and the detector queue?

I am using egogesture depth model.
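For what it's worth, a minimal sketch of shaping n depth frames into the 5D layout a PyTorch 3D CNN expects, (batch, channels, sample_duration, height, width); the exact preprocessing (normalization, cropping) in this repository may differ, so treat this only as an illustration of the tensor shape:

    import numpy as np
    import torch

    n = 8  # number of frames per iteration (hypothetical)
    frames = [np.zeros((112, 112, 3), dtype=np.uint8) for _ in range(n)]  # frames from the camera

    # stack to (n, H, W, C), move channels first, and add a batch dimension:
    # final shape (1, 3, n, 112, 112) = (batch, channels, sample_duration, H, W)
    clip = torch.from_numpy(np.stack(frames)).float()   # (n, 112, 112, 3)
    clip = clip.permute(3, 0, 1, 2).unsqueeze(0)         # (1, 3, n, 112, 112)
    # outputs = model(clip)  # hypothetical forward pass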

underfitting for nvGesture?

Hi, I tried to train the classifier for nvgesture from scratch (the hyperparams come from run_online.sh: batchsize=8, resnext101, cls=25, lr=0.01, duration=32). But it barely fits: after tens of epochs, the accuracy on both the training and validation sets is about 5%.
I also tried increasing the batch size to 16; the accuracy then converges to about 9%, which is still very low. I also tried setting norm_value=255 to normalize the input data to a smaller range, and a smaller lr, but it didn't help.
Did I miss something?
BTW, the detector trained from scratch works well, with accuracy around 80%.

Training on the jester dataset - questions

Hi,

I have some question regarding the training process:

  1. Did you train the two models (detection & classification) separately?
  2. Can I use the jester dataset to train the two models? (Something that confuses me is that one of the classes in the jester dataset is 'no gesture'.)
  3. As I understand it, you used the same code in main.py to train the two models separately?
  4. If so, what do I need to change to switch between training the two models, apart from the network parameters?

Thank you,
Olga

RGB-D result

Hi,

In the paper I only saw RGB and Depth results; how about RGB-D? Are you able to release pre-trained RGB-D models?

Thanks.

accuracy

Hi Ahmet,

I trained the classifier using the EgoGesture dataset, but the validation accuracy is only around 50% and the training accuracy is around 60%.

I am using the ResNeXt-101 architecture.

Am I missing anything?

RGB pre-trained models

Are the models in the drive also usable for RGB prediction/classification?

If not, could I kindly ask you to upload these models as well?

I am asking since the names would suggest that every model (except the Jester one) is for Depth data.

Thank you very much

Reshaping error "shape '[32, -1, 112, 112]' is invalid for input of size 865536"

I am trying to do both detection and classification for the Jester dataset. I have trained the detector part and saved its checkpoint, and for the classification part I am using the checkpoint that you have provided. For inference purposes I am running the run_online.sh file with both checkpoints. For that I made a Jester_online.py file, just like your egogesture_online.py, to provide the dataset for evaluation, but it shows a reshape error in the following line -
clip = torch.cat(clip, 0).view((self.sample_duration, -1) + im_dim).permute(1, 0, 2, 3)
the error is -
RuntimeError: shape '[32, -1, 112, 112]' is invalid for input of size 865536

For simple classification, I didn't get this error. I don't know where this number 865536 came from.
Can you help me out with this problem? I am attaching the screenshot here.
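As a side note, the number in the error can be decomposed to see how many frames actually arrived in the clip; assuming 3-channel 112x112 frames, 865536 elements correspond to 23 frames rather than the expected sample_duration of 32, which is why view([32, -1, 112, 112]) cannot reshape it:

    # 865536 / (112 * 112) = 69 channel-frames; 69 / 3 channels = 23 frames
    per_frame = 112 * 112 * 3
    print(865536 / per_frame)   # 23.0, not 32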

Model accuracy with Jester dataset is poor

Hi, I have tried to validate the pretrained model on the Jester dataset.

Preconditions:

  1. Pretrained model used: jester_resnext_101_RGB_32.pth
  2. Dataset: Jester
  3. Configurations: opts.zip
  4. Source modifications: diff.zip
  5. PyTorch version: 1.1.0
  6. Python version: 3.7.3

Test:

  1. python utils/jester_json.py 'annotation_Jester' to prepare the dataset
  2. python offline_test.py to start the execution

But the output precision is very poor

[11/3721] Time 1.07421 (1.13381) prec@1 0.03409 prec@5 0.20455 precision 0.00000 (0.03213) recall 0.00000 (0.01278)
[12/3721] Time 1.09013 (1.13017) prec@1 0.03646 prec@5 0.20312 precision 0.03030 (0.03198) recall 0.03030 (0.01424)
[13/3721] Time 1.07996 (1.12631) prec@1 0.03365 prec@5 0.20192 precision 0.00000 (0.02952) recall 0.00000 (0.01315)
[14/3721] Time 1.08615 (1.12344) prec@1 0.03125 prec@5 0.20089 precision 0.00000 (0.02741) recall 0.00000 (0.01221)

Could you please help me find what I am missing to get the proper output?
Regards,
Albin

How can the pre-trained Jester model be used to train on EgoGesture?

When I tried to use the classification model pre-trained on Jester to train on the EgoGesture dataset, it showed:

RuntimeError: Error(s) in loading state_dict for DataParallel:
size mismatch for module.fc.weight: copying a param with shape torch.Size([27, 2048]) from checkpoint, the shape in current model is torch.Size([83, 2048]).
size mismatch for module.fc.bias: copying a param with shape torch.Size([27]) from checkpoint, the shape in current model is torch.Size([83]).

It seems this is because Jester and EgoGesture have different numbers of gesture classes. So how should I change this parameter?

My code is shown like this:
#!/bin/bash
python main.py \
--root_path ~/ \
--video_path /home/wisccitl/Desktop/EgoGesture \
--annotation_path Real-time-GesRec/annotation_EgoGesture/egogestureall_but_None.json \
--result_path Real-time-GesRec/results \
--resume_path Real-time-GesRec/models/jester_resnext_101_RGB_32.pth \
--dataset egogesture \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--resnet_shortcut B \
--batch_size 64 \
--n_classes 83 \
--n_finetune_classes 83 \
--n_threads 16 \
--checkpoint 1 \
--modality RGB \
--train_crop random \
--n_val_samples 1 \
--test_subset test \
--n_epochs 100
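For reference, one generic PyTorch workaround for this kind of final-layer size mismatch (not necessarily how main.py handles it) is to drop the 27-class fc weights from the checkpoint and load the remaining backbone weights non-strictly, so the new 83-class fc layer is trained from scratch; here model is assumed to be the DataParallel-wrapped network already built for 83 classes:

    import torch

    checkpoint = torch.load('Real-time-GesRec/models/jester_resnext_101_RGB_32.pth')
    state_dict = checkpoint['state_dict']

    # drop the 27-class fully connected layer so only the backbone is loaded
    backbone_only = {k: v for k, v in state_dict.items()
                     if not k.startswith('module.fc.')}

    # strict=False tolerates the missing fc weights; the new 83-class fc layer
    # keeps its random initialization and is learned during finetuning
    model.load_state_dict(backbone_only, strict=False)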

n_frames

Hi,

I am trying to use your AMAZING code with the jester dataset.
I have some questions:

  1. I see that you assume that for each video there is a directory named "n_frames".
    How can I create those directories?
    I downloaded the jester dataset and extracted the data as described in the link, but there aren't any directories with the name "n_frames".
  2. Can you give the parameters for testing on the jester dataset?

Thank you!
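For reference, if this codebase follows the 3D-ResNets-PyTorch data convention, n_frames is usually a small text file written inside each video folder that holds the number of extracted frames, not a directory; a minimal sketch for generating it (paths are hypothetical, and the repo may already ship a utility for this) would be:

    import os

    root = '20bn-jester-v1'  # hypothetical path to the extracted Jester frame folders

    for video_dir in sorted(os.listdir(root)):
        path = os.path.join(root, video_dir)
        if not os.path.isdir(path):
            continue
        # count the extracted .jpg frames and record the total in an 'n_frames' file
        n = len([f for f in os.listdir(path) if f.endswith('.jpg')])
        with open(os.path.join(path, 'n_frames'), 'w') as f:
            f.write(str(n))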

opts

Hey Ahmet,

I am trying to replicate your work. I am probably having problems in the dataloader. I haven't changed anything in your code except for the paths and a few minor changes where I was getting errors.

My opt looks like:

annotation_path='/home/ndhingra/Real-time-GesRec/Real-time-GesRec/annotation_EgoGesture/egogestureall.json', arch='resnet-10', batch_size=128, begin_epoch=1, checkpoint=10, crop_position_in_test='c', dampening=0.9, dataset='egogesture', ft_begin_index=0, initial_scale=1.0, learning_rate=0.1, lr_patience=10, lr_steps=[10, 25, 50, 80, 100], manual_seed=1, mean=[114.7748, 107.7354, 99.475], mean_dataset='activitynet', modality='RGB', model='resnet', model_depth=10, momentum=0.9, n_classes=400, n_epochs=200, n_finetune_classes=400, n_scales=5, n_threads=4, n_val_samples=3, nesterov=False, no_cuda=False, no_hflip=False, no_mean_norm=False, no_softmax_in_test=False, no_train=False, no_val=False, norm_value=1, optimizer='sgd', pretrain_path='', resnet_shortcut='B', resnext_cardinality=32, result_path='/home/ndhingra/Real-time-GesRec/Real-time-GesRec/results', resume_path='', root_path='/home/ndhingra/Real-time-GesRec/Real-time-GesRec', root_video_path='/media/storage/ndhingra/EgoGesture', sample_duration=16, sample_size=112, scale_in_test=1.0, scale_step=0.84089641525, scales=[1.0, 0.84089641525, 0.7071067811803005, 0.5946035574934808, 0.4999999999911653], std=[38.7568578, 37.88248729, 40.02898126], std_norm=False, store_name='model', test=False, test_subset='val', train_crop='corner', train_validate=False, video_path='/home/ndhingra/Real-time-GesRec/Real-time-GesRec/images', weight_decay=0.001, weighted=False, wide_resnet_k=2)

I get error in main.py

train_loader = torch.utils.data.DataLoader(
training_data,
batch_size=opt.batch_size,
shuffle=True,
num_workers=opt.n_threads,
pin_memory=True)

i.e.,

ValueError: num_samples should be a positive integeral value, but got num_samples=0

Can you suggest what changes I have to make? If possible, could you also upload the opts.py you used for the egogesture dataset? Since I haven't made any changes to your code, I expect it to work as it worked for you.
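As an aside, num_samples=0 means the constructed dataset is empty, which usually points to a video_path or annotation_path that does not line up with the data on disk; a quick sanity check before building the loader (using the training_data object from the snippet above) might be:

    # if this prints 0, the annotation JSON and opt.video_path do not match,
    # and the DataLoader has nothing to sample from
    print(len(training_data))
    assert len(training_data) > 0, 'check opt.video_path / opt.annotation_path'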

size mismatch for jester pre-trained model

Hi,

I am trying to apply offline_test.py on the jester dataset with your pre-trained model.
and I got:
"size mismatch for module.conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 7, 7])."

I think that maybe I have some problem with my parameters.
Can you please help me?

Thank you again!
I appreciate all your help!

Variables in online_test.py

Hi again,

Can you please give more details about some of the variables in the online_test script?
I think I understand them, but after looking at the code, I am not sure...

  1. passive_count => # consecutive number of 'no gesture' classifier
  2. active
  3. active_index
  4. pre_predict
  5. finished_prediction
  6. prev_active

Is there a situation where finished_prediction = False but we have reached the last window frame?
In that case results is empty (predicted = np.array(results)[:, 1]) and the code fails...
I am wondering how to deal with that, and how to calculate the levenshtein_distance in this case?

Thank you again!!!!

RuntimeError: input and weight type

I am trying to train a detector on the Jester dataset. However, when I run run_offline.sh I encounter the following error right after the dataset is loaded:

Traceback (most recent call last):
File "main.py", line 177, in
train_logger, train_batch_logger)
File "/home/khasmamad/Desktop/kimo/Real-time-GesRec/train.py", line 34, in train_epoch
outputs = model(inputs)
File "/home/khasmamad/miniconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/khasmamad/miniconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/khasmamad/miniconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/khasmamad/Desktop/kimo/Real-time-GesRec/models/resnetl.py", line 177, in forward
x = self.conv1(x)
File "/home/khasmamad/miniconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/khasmamad/miniconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 448, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Googling showed me that this happens when the input and the model are on different devices (in this case, the input is on the GPU while the model is on the CPU). But I still cannot figure out a solution. Please help.
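For what it's worth, the usual generic fix for this error (not specific to this repository) is to make sure the model is moved to the same device as the inputs before training, along the lines of the sketch below; model, inputs, and targets are assumed to be the objects already created in main.py/train.py:

    import torch

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)        # move parameters and buffers to the GPU
    # inside the training loop, inputs and targets go to the same device
    inputs = inputs.to(device)
    targets = targets.to(device)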

NvGesture dataset

There are 31 files named:

                      nvGesture_v1.7z.001
                     to nvGesture_v1.7z.031

I am looking to extract these files to video format. Since these files are split into .7z parts, I tried using

                      cat nvGesture_v1.7z.0?? | 7za x

or

                      cat nvGesture_v1.7z.0?? | 7za e

but in both cases I get the error:

                        Error:
                        Incorrect command line

Get stuck at running online_test.py with pretrained model on CPU

I was trying to run online_test.py with a pretrained model on CPU (w/o CUDA). I made some modifications in model.py and online_test.py, including:

  1. added opt.no_cuda = True right after opt = parse_opts_online()
  2. added map_location=torch.device('cpu') to torch.load(opt.pretrain_path)
  3. modified model.load_state_dict(pretrain['state_dict']) at line 120ish to
            state_dict = pretrain['state_dict']
            from collections import OrderedDict
            new_state_dict = OrderedDict()
            for k, v in state_dict.items():
                # strip the 'module.' prefix added by DataParallel, if present
                name = k[7:] if k.startswith('module.') else k
                new_state_dict[name] = v
            model.load_state_dict(new_state_dict)

This solved the issue of

Missing key(s) in state_dict: "conv1.weight",...
Unexpected key(s) in state_dict: "module.conv1.weight", ...

which I solved by looking into a suggestion I found elsewhere.

However, now it gives me an error:

Traceback (most recent call last):
  File "online_test.py", line 138, in <module>
    detector,classifier = load_models(opt)
  File "online_test.py", line 76, in load_models
    detector, parameters = generate_model(opt)
  File "...../Real-time-GesRec-master/model.py", line 132, in generate_model
    model.load_state_dict(new_state_dict)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNetL:
        size mismatch for conv1.weight: copying a param with shape torch.Size([16, 1, 7, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 3, 7, 7, 7]).

Thanks in advance!!!

Missed detection when the same gesture is shown twice in online mode?

Hi, ahmetgunduz:
I tested the model in online mode with my own video, and everything looks fine except when I show the same gesture twice: the model fails to predict the second gesture (no gesture is detected). I suspect the reason may be the rule-based filter, but I'm not sure. Could you please give me some advice?

Start frame and end frame missing from trainlist01.txt and vallist01.txt.

In README.md it says that N frames format is as following: "path to the folder" "class index" "start frame" "end frame".
However that information seems to be missing from annotation_Jester/trainlist01.txt and annotation_Jester/vallist01.txt. Is it somewhere else? Am I looking at the right files?

Thanks.

cuda gpu device Error

Hi.

I have one GPU in my computer, but I got this error.
I'm a PyTorch newbie, so I don't know what this error means.

Traceback (most recent call last):
  File "main.py", line 177, in <module>
    train_logger, train_batch_logger)
  File "/home/eden/Real-time-GesRec/train.py", line 34, in train_epoch
    outputs = model(inputs)
  File "/home/eden/anaconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/eden/anaconda3/envs/gesrec/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
    "them on device: {}".format(self.src_device_obj, t.device))
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

Unable to extract Jester datasets

Hi,
I am getting an error while extracting the Jester dataset files: every file except the 20bn-jester-v1-00 data file gives the error. When I checked the types of the data files, I observed that the file type of 20bn-jester-v1-00 is different from the other files. I am attaching a screenshot of the error I am getting, which also includes the file types; please help if you have resolved the same issue.
[screenshot of the error and file types]

egogesture_online.py--IndexError

When I ran online_test.py, the error "IndexError: index -3 is out of bounds for axis 0 with size 0" occurred at line 152 of egogesture_online.py
(counts = np.bincount(label_list[np.array(list(range(_ - int(sample_duration/8), _)))])).
I do not know how to resolve it.

How to unzip the nvGesture dataset? I used 7za e datafilename

7-Zip (A) [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,72 CPUs)

Processing archive: /home/lxj/Gesture_recognition/data/nvGesture/nvGesture_v1.7z.001

Error: E_FAIL

Prediction Accuracy on Nvidia Gesture dataset is very poor.

Hi Ahmet Gündüz,

I was using your pre-trained model (nv_resnext_101_Depth_32.pth) to test on the Nvidia gesture dataset. My accuracy on this dataset is very poor (not even 20%). Can you explain whether this is the correct model to test with and, if so, why the prediction accuracy is so poor?
I have followed the steps mentioned by you in your github post.

Not able to run offline_test.py for Jester dataset

Thank you so much for the great solution.

I am in the process of validating the solution and understanding more. I tried to test the pretrained model jester_resnext_101_RGB_32.pth with the Jester dataset.

Downloaded dataset and performed frame creation with python utils/jester_json.py 'annotation_Jester'.

But the command python offline_test.py gives the error below:

dataset loading [14780/14787]
run
Traceback (most recent call last):
  File "offline_test.py", line 161, in <module>
    outputs = model(inputs)
  File "/home/albin/anaconda3/envs/l3c_env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/albin/anaconda3/envs/l3c_env/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
    "them on device: {}".format(self.src_device_obj, t.device))
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

Please someone help me to resolve this issue

Regards,
Albin

I modified run_offline.sh to fit the jester dataset, but got precision 0.03, recall 0.03

Hi,
Here's my modified run_offline.sh. If I have set any parameters incorrectly, please help me correct them. Thanks!

  • python offline_test.py
    --root_path ~/
    --video_path /home/ps/NewDisk1/Public_open/Jester/20bn-jester-v1
    --annotation_path ~/Codes/Real-time-GesRec/annotation_Jester/jester.json
    --result_path ~/Codes/Real-time-GesRec/results
    --pretrain_path Codes/Real-time-GesRec/pretrained_models/jester_resnext_101_RGB_32.pth
    --dataset jester
    --sample_duration 32
    --learning_rate 0.01
    --model resnext
    --model_depth 101
    --batch_size 1
    --n_classes 27
    --n_finetune_classes 27
    --modality RGB
    --n_threads 8
    --checkpoint 1
    --train_crop random
    --n_val_samples 1
    --test_subset val
    --n_epochs 100

[14787/14787] Time 0.04845 (0.09407) prec@1 0.03293 prec@5 0.18780 precision 0.00000 (0.03293) recall 0.00000 (0.03293)
-----Evaluation is finished------
Overall Prec@1 0.03293% Prec@5 0.18780%

jester pretrained model

Hello, you did great work, good job!
My question is about the jester resnext pretrained weights:

  1. Given, for example, a single folder with 35 RGB frames, what transformations do I need to apply to these frames to get the right input and then make a prediction?
  2. And how can I load the resnext model properly?
  3. And are the pretrained weights compatible with PyTorch 1.0.1 and CUDA 10?

No correct results printed

Hi Ahmet,
I ran online_test.py on egogesture with CPU only by setting "opt.no_cuda=True" and "opt.n_threads = 0",
but it didn't produce the right results. I changed the code like this:

[screenshot of the code change]

and one of the results printed by the console looks like this:

[screenshot of the console output]

It seems that the switch is never activated, so no result gets appended:

[screenshot of the relevant code]

I'd appreciate it if you could help me with this

How to feed input to classifier in online_test.py using tensor.float

Hi,
How do I feed input to the classifier in online_test.py using tensor.float?
I tried:

frame = np.reshape(frame, (1, 1, 1, 512, 512))
frame = cv2.normalize(frame, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
input_clf = torch.from_numpy(frame).float()
outputs_det = classifier(inputs_clf)

I get the following error,

RuntimeError: invalid argument 2: input image (T: 1 H: 32 W: 16) smaller than kernel size (kT: 2 kH: 3 kW: 3) at /pytorch/aten/src/THCUNN/generic/VolumetricAveragePooling.cu:57

Originally posted by @sathiez in #17 (comment)
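For reference, the pooling error suggests the tensor handed to the 3D network does not have the (batch, channels, sample_duration, height, width) layout it expects; a minimal sketch of preparing a clip for the classifier, assuming a 3-channel model with sample_size 112 and sample_duration 32 (the repository's own transforms may differ, and raw_frames is a hypothetical list of camera frames), is:

    import cv2
    import numpy as np
    import torch

    sample_duration, sample_size = 32, 112

    # 'raw_frames' is a hypothetical list of camera frames of shape (H, W, 3)
    processed = []
    for frame in raw_frames[-sample_duration:]:
        frame = cv2.resize(frame, (sample_size, sample_size))
        processed.append(frame.astype(np.float32))

    clip = torch.from_numpy(np.stack(processed))   # (T, H, W, C)
    clip = clip.permute(3, 0, 1, 2).unsqueeze(0)    # (1, C, T, H, W)
    # outputs_clf = classifier(clip)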

EgoGesture Dataset is not available

Hi,

Currently I am learning your code and trying to get your result.

However, I could not download the EgoGesture dataset (the author's email is wrong).

Could you please provide another link to download the dataset, or any other help?

Thank you!

ResNet Detector Model

Hi Ahmet,

Could you quickly send the .PTH file for the ResNet10 detector model you talk about in your paper? This would help a lot with replicating what you did! Thanks!

Explanation of Testing

Hey Ahmet,
thanks for this amazing work.

I was going to test your pretrained model, but there are a lot of ambiguities! Can you please explain how, or give instructions on running and testing your model?

AssertionError

I'm new to pytorch, and I ran into a lot of errors while debugging the program. Most of them have been resolved, but I'm stuck with an AssertionError

caffe-repository

Hi, congratulations on your wonderful work, but I wonder if you have any plans to release a Caffe repository?
Thanks!

pre-trained model

Hi, would you like to share your pre-trained model that can be finetuned for both detection and classification?
Thanks.

any readme?

Hi:
Thanks for your shared code.
But could you please write a readme?

Could you please give me some advice for finetuning from Jester with more classes?

Hi, ahmetgunduz:
I tried to finetune a model (classifier) trained on Jester on my own gesture dataset, but the performance is awful. Could you please give me some advice? And could you please share some details of your EgoGesture finetuning experiments (lr? freeze some layers? freeze bn?)
My own gesture dataset has 88 classes, almost the union of the Jester and EgoGesture gesture classes, and only a few samples per class (train: 42 samples, val: 6 samples). The dataset is small, has distortion (wide-angle camera), uses a second-person perspective, and is similar to Jester.

In my experiment, the architecture of the trained model is resnet34_0.5channels (4 block layers: [3, 4, 6, 3]). Here are the results:
a. only the fc layer trained, all conv and bn frozen, lr 0.001, dropout 0.7; performance: train 0.573, val 0.463
b. layer4 block and fc trained, conv1 and layer1~3 (including bn) frozen, lr 0.01, dropout 0.7; performance: train 0.743, val 0.447
c. entire model, large lr 0.01 due to my bug, no dropout, but the model got the best performance: train 0.909, val 0.582
d. trained from scratch, lr 0.01, dropout 0.7; performance: train 0.712, val 0.329

I also found that the model pretrained on Jester may predict some swiping cases in my own dataset as sliding, due to the distortion.

It seems the model has not benefited much from finetuning, because of the larger number of classes than Jester and the distortion. Could you give me some advice?

Thanks.

Torch size error with Jester Dataset

Hi.
I got this error message with the Jester dataset:

[screenshot of the error]

I don't know the meaning of the second number in torch.Size([64, 3, 7, 7, 7]) or torch.Size([64, 1, 7, 7, 7]).

and this is my run_offline.sh file.

#!/bin/bash
python main.py \
        --root_path ~/ \
        --video_path /home/eden/20BN-jester/20bn-jester-v1/videos \
        --annotation_path ~/Real-time-GesRec/annotation_Jester/jester.json\
        --result_path ~/Real-time-GesRec/results \
        --resume_path ~/Real-time-GesRec/jester_resnext_101_RGB_32.pth \
        --dataset jester \
        --sample_duration 8 \
    --learning_rate 0.01 \
    --model resnext \
        --model_depth 101 \
        --resnet_shortcut A \
        --batch_size 16 \
        --n_classes 27 \
        --n_finetune_classes 27 \
        --n_threads 16 \
        --checkpoint 1 \
        --modality Depth \
        --train_crop random \
        --n_val_samples 3 \
        --test_subset test \
     --n_epochs 100 \
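For reference, in a Conv3d weight of shape (out_channels, in_channels, kT, kH, kW) the second number is the number of input channels, so [64, 3, 7, 7, 7] corresponds to a 3-channel (RGB) first layer and [64, 1, 7, 7, 7] to a 1-channel (Depth) one; a likely cause of the mismatch is therefore combining the RGB checkpoint with --modality Depth in the script above. A quick way to confirm the shape convention:

    import torch.nn as nn

    # Conv3d weight layout: (out_channels, in_channels, kT, kH, kW)
    rgb_conv1 = nn.Conv3d(3, 64, kernel_size=7)    # 3-channel (RGB) input
    depth_conv1 = nn.Conv3d(1, 64, kernel_size=7)  # 1-channel (Depth) input
    print(rgb_conv1.weight.shape)    # torch.Size([64, 3, 7, 7, 7])
    print(depth_conv1.weight.shape)  # torch.Size([64, 1, 7, 7, 7])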

nvGesture training

Hello Ahmet, I am trying to run your code on Nvidia dataset. On running the main.py, the train.log looks like this.
epoch loss acc precision recall lr

1 0 0 0 0 0.1

2 0 0 0 0 0.1

3 0 0 0 0 0.1

which I don't think is right. Can you please tell me what I am doing wrong?
Other than setting the paths and reducing the epoch value from 100 to 50, I haven't changed anything.
GesRec.pdf
These are the parameters and a part of train.log after running main.py.

Testing on real-time RGB video using jester Pretrained model

I am trying to understand your code. I have understood that for the other datasets there are two models, a detector and a classifier. I only have the Jester dataset available, and for that there is only one model.

Can you please tell me how we can do real-time detection on RGB camera video without the detector?
And can we recognize gestures with the Jester model?
Or which model from the other datasets can be used for the same purpose?

run online_test.py

Hi Mr Ahmet,
thanks for sharing your perfect project.
I was going to test your pretrained model in online mode, but I encounter an error when loading the model.
Please help me.
Namespace(annotation_path='/home/sattarian/Documents/projects/hand-guesture/annotation_EgoGesture/egogestureall.json', arch='resnetl-10', batch_size=1, begin_epoch=1, checkpoint=1, clf_queue_size=16, clf_strategy='median', clf_threshold_final=0.15, clf_threshold_pre=0.6, crop_position_in_test='c', dampening=0.9, dataset='egogesture', det_counter=2.0, det_queue_size=4, det_strategy='median', ft_begin_index=0, initial_scale=1.0, learning_rate=0.1, lr_patience=10, lr_steps=[10, 20, 30, 40, 100], manual_seed=1, mean=[114.7748, 107.7354, 99.475], mean_dataset='activitynet', modality='Depth', modality_clf='Depth', modality_det='Depth', model='resnetl', model_clf='resnext', model_depth=10, model_depth_clf=101, model_depth_det=10, model_det='resnetl', momentum=0.9, n_classes=2, n_classes_clf=83, n_classes_det=2, n_epochs=200, n_finetune_classes=2, n_finetune_classes_clf=83, n_finetune_classes_det=2, n_scales=5, n_threads=16, n_val_samples=1, nesterov=False, no_cuda=False, no_hflip=False, no_mean_norm=False, no_softmax_in_test=False, no_train=False, no_val=False, norm_value=1, optimizer='sgd', pretrain_path='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnetl_10_Depth_8.pth', pretrain_path_clf='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnext_101_Depth_32.pth', pretrain_path_det='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnetl_10_Depth_8.pth', resnet_shortcut='A', resnet_shortcut_clf='B', resnet_shortcut_det='A', resnext_cardinality=32, resnext_cardinality_clf=32, resnext_cardinality_det=32, result_path='/home/sattarian/Documents/projects/hand-guesture/results', resume_path='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnetl_10_Depth_8.pth', resume_path_clf='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnext_101_Depth_32.pth', resume_path_det='/home/sattarian/Documents/projects/hand-guesture/egogesture_resnetl_10_Depth_8.pth', root_path='/home/sattarian/Documents/projects/hand-guesture/', sample_duration=8, sample_duration_clf=32, sample_duration_det=8, sample_size=112, scale_in_test=1.0, scale_step=0.84089641525, scales=[1.0, 0.84089641525, 0.7071067811803005, 0.5946035574934808, 0.4999999999911653], std=[38.7568578, 37.88248729, 40.02898126], std_norm=False, store_name='model', stride_len=1, test=True, test_subset='test', train_crop='random', video_path='/home/sattarian/Documents/projects/hand-guesture/video_kinetics_jpg', weight_decay=0.001, whole_path='video_kinetics_jpg', wide_resnet_k=2, wide_resnet_k_clf=2, wide_resnet_k_det=2)
loading pretrained model /home/sattarian/Documents/projects/hand-guesture/egogesture_resnetl_10_Depth_8.pth
Traceback (most recent call last):
File "online_test.py", line 137, in
detector,classifier = load_models(opt)
File "online_test.py", line 75, in load_models
detector, parameters = generate_model(opt)
File "/home/sattarian/Documents/projects/hand-guesture/model.py", line 68, in generate_model
model.load_state_dict(pretrain['state_dict'])
File "/home/sattarian/anaconda3/envs/deep-learning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
size mismatch for module.conv1.weight: copying a param with shape torch.Size([16, 1, 7, 7, 7]) from checkpoint, the shape in current model is torch.Size([16, 3, 7, 7, 7]).
