jitgru's Introduction

Hi there 👋

My name is Mehran Maghoumi and I'm a senior deep learning engineer at NVIDIA. My primary area of work is parking space perception using surround-camera setups for autonomous driving. I also hold a Ph.D. in computer science from the University of Central Florida. Feel free to check out my full profile on my homepage.

What's all this? 🤔

Below is a list of the open-source projects that ✨ I'm the most proud of ✨. I worked on these either in my spare time or as part of my Ph.D. dissertation. Countless hours have gone into each one, and nothing makes me happier than seeing people use them in their projects.

If you see something you like, please consider ⭐ starring ⭐ the repo. It gives me a better idea of where to focus my efforts!

Happy browsing! 💥

jitgru's People

Contributors: elixir-code, maghoumi

jitgru's Issues

Error when batch is set to 1

First, thank you for sharing this code.

I just copied the original code, and when the batch size is set to 1, I get the following error:

Traceback (most recent call last):
  File "jit.py", line 209, in <module>
    test_script_gru_layer(5, 1, 3, 7)
  File "jit.py", line 161, in test_script_gru_layer
    out, out_state = gru_jit(inp, h)
  File "/home/wu/anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
The above operation failed in interpreter.
Traceback (most recent call last):
  File "<string>", line 10
                  alpha: number = 1.0):
            result = torch.add(self, other, alpha=alpha)
            self_size, other_size = AD_sizes_if_not_equal_multi_1(self, other, result)
                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            def backward(grad_output):
                grad_other = (grad_output * alpha)._grad_sum_to_size(other_size)
  File "<string>", line 10, in AD_sizes_if_not_equal_multi_1
                  alpha: number = 1.0):
            result = torch.add(self, other, alpha=alpha)
            self_size, other_size = AD_sizes_if_not_equal_multi_1(self, other, result)
                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            def backward(grad_output):
                grad_other = (grad_output * alpha)._grad_sum_to_size(other_size)
  File "<string>", line 10, in AD_sizes_if_not_equal_multi_1
                  alpha: number = 1.0):
            result = torch.add(self, other, alpha=alpha)
            self_size, other_size = AD_sizes_if_not_equal_multi_1(self, other, result)
                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            def backward(grad_output):
                grad_other = (grad_output * alpha)._grad_sum_to_size(other_size)

The above operation failed in interpreter.
Traceback (most recent call last):

Do you have any idea why this is happening?
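I haven't debugged the layer, but a common cause of this kind of error is an unqualified squeeze() (or a similar size-1 collapse) somewhere in the loop: with a batch of 1 it removes the batch axis along with the intended one, so a downstream op sees a tensor with one dimension too few. A sketch of the shape behaviour, with sizes borrowed from the failing test_script_gru_layer(5, 1, 3, 7) call:

```python
import torch

# Shapes mirror the failing call: seq_len=5, batch=1, input_size=3.
x = torch.randn(5, 1, 3)

# Indexing one time step keeps the batch dimension:
step = x[0]                  # shape (1, 3)

# An unqualified squeeze() removes *every* size-1 dimension, so with a
# batch of 1 the batch axis disappears too and downstream ops see a
# 1-D tensor -- consistent with "Dimension out of range ... but got 1".
collapsed = step.squeeze()   # shape (3,)

# Passing an explicit dim, e.g. squeeze(0) only where intended, avoids
# collapsing the batch axis by accident.
```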

Support for variable length sequences and bi-directional GRU

I am working with an NLP model that uses bi-directional GRUs and a higher-order derivative in its loss function.

Is it possible to extend this work to support variable-length sequences and the bi-directional variant of the GRU?

I am interested in understanding whether this is technically possible, and whether a JIT implementation would give speed improvements over the fallback of wrapping nn.GRU calls in with torch.backends.cudnn.flags(enabled=False).

P.S.: I understand that TorchScript does not support PackedSequence, which makes things difficult.
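For reference, a minimal sketch of the baseline mentioned above (all sizes are made up for illustration): the forward pass of a stock bidirectional nn.GRU runs inside the cuDNN-disabling context manager, so autograd uses the decomposed implementation rather than the fused cuDNN kernel.

```python
import torch
import torch.nn as nn

# Made-up sizes: seq_len=5, batch=2, input_size=4, hidden_size=8.
gru = nn.GRU(input_size=4, hidden_size=8, bidirectional=True)
inp = torch.randn(5, 2, 4, requires_grad=True)

# Disable cuDNN's fused GRU kernel for this forward pass; the fallback
# path is built from ordinary autograd ops.
with torch.backends.cudnn.flags(enabled=False):
    out, h = gru(inp)

# Bidirectional output concatenates both directions: 2 * hidden_size.
print(out.shape)  # torch.Size([5, 2, 16])
print(h.shape)    # torch.Size([2, 2, 8])
```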

Speed comparison

Does JITGRU provide any speedup for both CPU and GPU training? Thank you!
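This is not a benchmark of this repo, but a sketch of how one might measure it: time a TorchScript-compiled GRU cell against the same cell in eager mode. All sizes are made up, results vary by hardware, and GPU timing would additionally need torch.cuda.synchronize() around the timed region.

```python
import time
import torch

def gru_cell(x, h, w_ih, w_hh, b_ih, b_hh):
    # Standard GRU cell equations (reset r, update z, candidate n).
    gi = torch.mm(x, w_ih.t()) + b_ih
    gh = torch.mm(h, w_hh.t()) + b_hh
    i_r, i_z, i_n = gi.chunk(3, 1)
    h_r, h_z, h_n = gh.chunk(3, 1)
    r = torch.sigmoid(i_r + h_r)
    z = torch.sigmoid(i_z + h_z)
    n = torch.tanh(i_n + r * h_n)
    return (1 - z) * n + z * h

scripted = torch.jit.script(gru_cell)

batch, input_size, hidden = 32, 16, 64
x = torch.randn(batch, input_size)
h = torch.randn(batch, hidden)
w_ih = torch.randn(3 * hidden, input_size)
w_hh = torch.randn(3 * hidden, hidden)
b_ih = torch.randn(3 * hidden)
b_hh = torch.randn(3 * hidden)

def bench(fn, iters=100):
    fn(x, h, w_ih, w_hh, b_ih, b_hh)  # warm-up (also triggers JIT compilation)
    t0 = time.perf_counter()
    for _ in range(iters):
        out = fn(x, h, w_ih, w_hh, b_ih, b_hh)
    return time.perf_counter() - t0, out

eager_t, out_eager = bench(gru_cell)
jit_t, out_jit = bench(scripted)
print(f"eager: {eager_t:.4f}s  scripted: {jit_t:.4f}s")
```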

Unit test does not work

Hi, just FYI: I get the following with PyTorch v1.4.0a0:

python jit_gru.py 
[2, 3]
[2, 3]
[2, 3]
[2, 3]
[2, 3]
Traceback (most recent call last):
  File "models/gru/jit_gru.py", line 184, in <module>
    test_script_gru_layer(5, 2, 3, 7)
  File "models/gru/jit_gru.py", line 146, in test_script_gru_layer
    assert lstm_param.shape == custom_param.shape
AssertionError

And printing:

print(custom_param.shape)
print(lstm_param.shape)

shows:
torch.Size([21, 7])
torch.Size([21, 3])
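The mismatch above looks like the two parameter lists being zipped in different orders: for nn.GRU(input_size=3, hidden_size=7), weight_ih_l0 has shape (3*hidden, input) = (21, 3) while weight_hh_l0 has shape (3*hidden, hidden) = (21, 7), which are exactly the two shapes printed. A quick check against the stock module:

```python
import torch.nn as nn

# Same sizes as the failing test: input_size=3, hidden_size=7.
gru = nn.GRU(input_size=3, hidden_size=7)
shapes = {name: tuple(p.shape) for name, p in gru.named_parameters()}

# weight_ih_l0 maps the input (3 features) to the 3 stacked gates (21 rows);
# weight_hh_l0 maps the hidden state (7 features) to the same 21 rows.
print(shapes["weight_ih_l0"])  # (21, 3)
print(shapes["weight_hh_l0"])  # (21, 7)
```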
