espnet / warp-ctc

This project forked from jnishi/warp-ctc

Pytorch Bindings for warp-ctc maintained by ESPnet

License: Apache License 2.0

CMake 1.32% Cuda 59.04% C++ 31.07% C 2.08% Python 5.24% Shell 0.62% Dockerfile 0.64%

warp-ctc's Introduction

PyTorch bindings for Warp-ctc

branch	status
`pytorch_bindings`
`pytorch-0.4`
`pytorch-1.0`

This is an extension onto the original repo found here.

Installation

Install PyTorch first.

The warpctc-pytorch wheels use local version identifiers, so users have to specify the full version explicitly when installing.

$ pip install warpctc-pytorch==X.X.X+torchYY.cudaZZ

The latest version is 0.2.1 and if you work with PyTorch 1.6 and CUDA 10.2, you can run:

$ pip install warpctc-pytorch==0.2.1+torch16.cuda102
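As a rough illustration of how the version string is put together, the sketch below builds the pip requirement from a PyTorch and a CUDA version. The helper name `wheel_requirement` is hypothetical, and the tag format (major and minor digits joined, as in `torch16` and `cuda102`) is inferred from the examples above rather than taken from the build scripts.

```python
def wheel_requirement(pkg_version, torch_version, cuda_version):
    """Build a pip requirement such as warpctc-pytorch==0.2.1+torch16.cuda102.

    Hypothetical helper: only the major and minor parts of each version
    are kept, following the examples in this README.
    """
    torch_tag = "".join(torch_version.split(".")[:2])
    cuda_tag = "".join(cuda_version.split(".")[:2])
    return "warpctc-pytorch=={}+torch{}.cuda{}".format(
        pkg_version, torch_tag, cuda_tag)


print(wheel_requirement("0.2.1", "1.6.0", "10.2"))
```

The resulting string can be passed directly to `pip install`.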

for PyTorch 1.4 - 1.6

warpctc-pytorch wheels are provided for Python 3.8, 3.7, 3.6 and CUDA 10.2, 10.1, 10.0, 9.2.

for PyTorch 1.1 - 1.3

warpctc-pytorch wheels are provided for Python 3.7, 3.6 and CUDA 10.2, 10.1, 10.0, 9.2.

for PyTorch 1.0

warpctc-pytorch10-cudaYY wheels are provided for Python 3.7, 3.6 and CUDA 10.1, 10.0, 9.2, 9.1, 9.0, 8.0.

If you work with CUDA 10.1, you can run:

$ pip install warpctc-pytorch10-cuda101

for PyTorch 0.4.1

Wheels for PyTorch 0.4.1 are not provided so users have to build from source manually.

WARP_CTC_PATH should be set to the location of a built WarpCTC (i.e. libwarpctc.so). This defaults to ../build, so from within a new warp-ctc clone you could build WarpCTC like this:

$ git clone https://github.com/espnet/warp-ctc.git
$ cd warp-ctc; git checkout -b pytorch-0.4 remotes/origin/pytorch-0.4
$ mkdir build; cd build
$ cmake ..
$ make

Now install the bindings:

$ cd ../pytorch_binding
$ pip install numpy cffi
$ python setup.py install
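If the binding build cannot find the library, it may help to check where WARP_CTC_PATH points before running setup.py. The following is a minimal sketch, assuming a Linux build that produces libwarpctc.so; the helper name `locate_libwarpctc` is hypothetical and is not part of the bindings.

```python
import os


def locate_libwarpctc(default_dir="../build"):
    """Return the path to libwarpctc.so, or None if it is not there.

    Mirrors the documented behaviour: WARP_CTC_PATH is used when set,
    otherwise the ../build default applies.
    """
    build_dir = os.environ.get("WARP_CTC_PATH", default_dir)
    candidate = os.path.join(build_dir, "libwarpctc.so")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    print(locate_libwarpctc() or "libwarpctc.so not found")
```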

Example

The following example shows how to use the bindings.

import torch
from warpctc_pytorch import CTCLoss
ctc_loss = CTCLoss()
# expected shape of seqLength x batchSize x alphabet_size
probs = torch.FloatTensor([[[0.1, 0.6, 0.1, 0.1, 0.1], [0.1, 0.1, 0.6, 0.1, 0.1]]]).transpose(0, 1).contiguous()
labels = torch.IntTensor([1, 2])
label_sizes = torch.IntTensor([2])
probs_sizes = torch.IntTensor([2])
probs.requires_grad_(True)  # tells autograd to compute gradients for probs
cost = ctc_loss(probs, labels, probs_sizes, label_sizes)
cost.backward()

Documentation

CTCLoss(size_average=False, length_average=False, reduce=True)
    # size_average (bool): normalize the loss by the batch size (default: False)
    # length_average (bool): normalize the loss by the total number of frames in the batch. If True, supersedes size_average (default: False)
    # reduce (bool): average or sum over observation for each minibatch.
        If `False`, returns a loss per batch element instead and ignores `average` options.
        (default: `True`)
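
The interaction of the three flags can be sketched in plain Python. This is only an illustration of the behaviour described above, not the library code; `reduce_ctc_costs` and its arguments are hypothetical names, and `costs` stands for the per-example losses.

```python
def reduce_ctc_costs(costs, act_lens, size_average=False,
                     length_average=False, reduce=True):
    """Combine per-example CTC costs as the flag descriptions suggest."""
    if not reduce:
        # One loss per batch element; the averaging flags are ignored.
        return list(costs)
    total = sum(costs)
    if length_average:
        # Normalize by the total number of frames; supersedes size_average.
        return total / sum(act_lens)
    if size_average:
        # Normalize by the batch size.
        return total / len(costs)
    return total


costs, act_lens = [2.0, 4.0], [3, 5]
print(reduce_ctc_costs(costs, act_lens))
print(reduce_ctc_costs(costs, act_lens, size_average=True))
print(reduce_ctc_costs(costs, act_lens, length_average=True))
```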

forward(acts, labels, act_lens, label_lens)
    # acts: Tensor of (seqLength x batch x outputDim) containing output activations from network (before softmax)
    # labels: 1 dimensional Tensor containing all the targets of the batch in one large sequence
    # act_lens: Tensor of size (batch) containing size of each output sequence from the network
    # label_lens: Tensor of (batch) containing label length of each example
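
Since labels is a single flat sequence, per-example targets have to be concatenated and their lengths recorded in label_lens. A minimal sketch using plain lists follows; `pack_labels` is a hypothetical helper, and the results would still need to be wrapped in torch.IntTensor before being passed to forward.

```python
def pack_labels(label_seqs):
    """Flatten per-example label lists and record each length."""
    flat = [label for seq in label_seqs for label in seq]
    lens = [len(seq) for seq in label_seqs]
    return flat, lens


labels, label_lens = pack_labels([[1, 2], [3], [4, 5, 6]])
print(labels)
print(label_lens)
```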

warp-ctc's People

Contributors

yangmingqi chevaliernoir kamo-naoyuki sw005320 qianlanwyd nicolaspanel robin1001 xxuefeii yuyq96 luchuanze

warp-ctc's Issues

CUDA 11.0 support

Does warp-ctc support CUDA 11.0? (I have CUDA 11.0 on Ubuntu 20.04.)

warp-ctc requests cuda_ver on a CPU-only machine.

@jnishi Since the update to warp-ctc by @ysk24ok, setup.py asks for the CUDA version even on a CPU-only PC.

The output is:

Torch was not built with CUDA support, not building warp-ctc GPU extensions.
Traceback (most recent call last):
  File "setup.py", line 66, in <module>
    get_torch_version(), get_cuda_version())
  File "setup.py", line 55, in get_cuda_version
    proc = Popen(['nvcc', '--version'], stdout=PIPE, stderr=PIPE)
  File "/home/nelson/docstrings/tools/venv/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/home/nelson/docstrings/tools/venv/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'nvcc': 'nvcc'

warp-ctc/pytorch_binding/setup.py

Lines 65 to 66 in 4209ce4

package_name = 'warpctc_pytorch{}_cuda{}'.format(
    get_torch_version(), get_cuda_version())

It would be better to set it to:

if enable_gpu: 
    tag = 'cuda{}'.format(get_cuda_version())
else:
    tag = 'cpu'
package_name = 'warpctc_pytorch{}_{}'.format(
    get_torch_version(), tag)

or similar
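
A self-contained variant of this suggestion, written as a pure function so it can be tried without PyTorch or nvcc installed. The name `build_package_name` is hypothetical; in setup.py the arguments would come from get_torch_version() and, only when GPU support is enabled, get_cuda_version().

```python
def build_package_name(torch_version, cuda_version=None):
    """Return the package name, falling back to a 'cpu' tag.

    cuda_version is None on a CPU-only machine, so nvcc is never
    needed in that case.
    """
    if cuda_version is not None:
        tag = "cuda{}".format(cuda_version)
    else:
        tag = "cpu"
    return "warpctc_pytorch{}_{}".format(torch_version, tag)


print(build_package_name("10", "101"))
print(build_package_name("14"))
```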

get_torch_version raises an error

I hit an error in get_torch_version when using PyTorch 1.0.1.post2.

warp-ctc/pytorch_binding/setup.py

Lines 47 to 49 in e9de7af

def get_torch_version():
    major_ver, minor_ver, _ = torch.__version__.split('.')
    return major_ver + minor_ver

>>> torch.__version__
'1.0.1.post2'
>>> a, b, _ = torch.__version__.split(".")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 3)

It may be better to use major_ver, minor_ver = torch.__version__.split('.')[:2].
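
The suggested fix can be checked in isolation. In the sketch below the version string is passed in as an argument (an assumption made so the example runs without PyTorch installed); in setup.py it would be torch.__version__.

```python
def get_torch_version(version):
    """Return major + minor digits, ignoring any further components."""
    # Slicing keeps the first two fields, so extra parts such as
    # '.post2' no longer break the tuple unpacking.
    major_ver, minor_ver = version.split(".")[:2]
    return major_ver + minor_ver


print(get_torch_version("1.0.1.post2"))
print(get_torch_version("1.6.0"))
```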

warpctc-pytorch 0.2.0 wheels are broken.

[root@677710b2f623 warp-ctc]# pip install warpctc-pytorch==0.2.0+torch13.cuda100
Collecting warpctc-pytorch==0.2.0+torch13.cuda100
  Using cached warpctc_pytorch-0.2.0%2Btorch13.cuda100-cp37-cp37m-manylinux1_x86_64.whl (2.7 MB)
Installing collected packages: warpctc-pytorch
Successfully installed warpctc-pytorch-0.2.0
WARNING: You are using pip version 20.1.1; however, version 20.2.3 is available.
You should consider upgrading via the '/opt/pyenv/versions/3.7.9/bin/python3.7 -m pip install --upgrade pip' command.
[root@677710b2f623 warp-ctc]# cat run.py
import torch
from warpctc_pytorch import CTCLoss
ctc_loss = CTCLoss()
probs = torch.FloatTensor([[[0.1, 0.6, 0.1, 0.1, 0.1], [0.1, 0.1, 0.6, 0.1, 0.1]]]).transpose(0, 1).contiguous()
labels = torch.IntTensor([1, 2])
label_sizes = torch.IntTensor([2])
probs_sizes = torch.IntTensor([2])
probs.requires_grad_(True)  # tells autograd to compute gradients for probs
cost = ctc_loss(probs, labels, probs_sizes, label_sizes)
cost.backward()
[root@677710b2f623 warp-ctc]# python3 run.py
Traceback (most recent call last):
  File "run.py", line 11, in <module>
    cost.backward()
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/site-packages/torch/tensor.py", line 150, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: function _CTCBackward returned an incorrect number of gradients (expected 8, got 7)

Because the change introduced by jnishi#5 is not in SeanNaren/warp-ctc, #27 breaks the compatibility.

Fixing this bug is simple; just add one more None in CTCLoss.backward. But I think we should add a test to ensure backward can work correctly.
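
One possible shape for such a test is sketched below without PyTorch: it compares the number of inputs that forward takes with the number of values that backward returns. `FakeCTC` is a stand-in written for this example, not the real `_CTC` class, and its parameter names are assumptions; the count of eight inputs matches the error message above.

```python
import inspect


class FakeCTC:
    """Stand-in for an autograd Function (hypothetical, for illustration)."""

    @staticmethod
    def forward(ctx, acts, labels, act_lens, label_lens,
                size_average, length_average, blank, reduce):
        return 0.0

    @staticmethod
    def backward(ctx, grad_output):
        # One gradient for acts, None for the seven non-differentiable inputs.
        return ("grad_acts",) + (None,) * 7


def gradient_count_matches(fn_cls):
    """True if backward returns one value per forward input (ctx excluded)."""
    n_inputs = len(inspect.signature(fn_cls.forward).parameters) - 1
    n_grads = len(fn_cls.backward(None, None))
    return n_inputs == n_grads


print(gradient_count_matches(FakeCTC))
```

A real test would instead run cost.backward() on the README example, which exercises the same rule through autograd itself.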

Build wheels for CUDA10.0, 9.1 and 9.0

Currently, the pytorch-1.0.0 branch builds wheels only for CUDA 10.1 and 9.2, the latest minor release of each major version. Wheels for CUDA 10.0, 9.1 and 9.0 should be built as well.

Installation error with cpuonly

Hi, I am trying to install warpctc-pytorch as part of ESPnet (following the install guide here).

However, I cannot install warpctc-pytorch on my machine. When I try to build ESPnet with make, I get the following error:

expr: syntax error
Perform on CPU mode: CPU_ONLY=0
PYTHON=/Users/Ladislas/anaconda3/bin/python3
PYTHON_VERSION=Python 3.7.6
IS_CONDA=0
USE_PIP=0
TH_VERSION=1.4.0
CONDA_PYTORCH=pytorch=1.4.0 cpuonly
PIP_PYTORCH=torch==1.4.0 -f https://download.pytorch.org/whl/cpu/torch_stable.html
CHAINER_VERSION=6.0.0
PIP_CHAINER=chainer==6.0.0
NO_CUPY=0
. ./activate_python.sh && { command -v cmake || conda install -y cmake; }
/usr/local/bin/cmake
touch cmake.done
. ./activate_python.sh && { command -v flac || conda install -y libflac -c conda-forge; }
/usr/local/bin/flac
touch flac.done
. ./activate_python.sh && { command -v ffmpeg || conda install -y ffmpeg -c conda-forge; }
/usr/local/bin/ffmpeg
touch ffmpeg.done
. ./activate_python.sh && { command -v sox || conda install -y sox -c conda-forge; }
/usr/local/bin/sox
touch sox.done
. ./activate_python.sh && { python3 -c "from ctypes.util import find_library as F; assert F('sndfile') is not None" || conda install -y libsndfile=1.0.28 -c conda-forge; }
touch sndfile.done
touch conda_packages.done
. ./activate_python.sh && ./installers/install_warp-ctc.sh
cuda_version=
ERROR: Could not find a version that satisfies the requirement warpctc-pytorch==0.2.1+torch14.cpu (from versions: none)
ERROR: No matching distribution found for warpctc-pytorch==0.2.1+torch14.cpu
make: *** [warp-ctc.done] Error 1

When I try to manually install warpctc-pytorch, I also get this error:

python3 -m pip install warpctc-pytorch==0.2.1+torch16.cpu

ERROR: Could not find a version that satisfies the requirement warpctc-pytorch==0.2.1+torch16.cpu (from versions: none)
ERROR: No matching distribution found for warpctc-pytorch==0.2.1+torch16.cpu

I understand that the version I am trying to install does not exist, but how can I install warpctc-pytorch for CPU-only usage?

Thank you in advance.

Ladislas

Need to set LD_LIBRARY_PATH

I observed that when the CUDA version of the environment differs from the one installed on the PC (conda version 10.0, PC version 10.2), warpctc cannot be loaded.

(base) nelson@nelson-lab2:/export/db/espnet/libri/tools/venv/lib/python3.7/site-packages/warpctc_pytorch$ ldd _warp_ctc.cpython-37m-x86_64-linux-gnu.so 
	linux-vdso.so.1 (0x00007ffd2a958000)
	libwarpctc.so => /export/db/espnet/libri/tools/venv/lib/python3.7/site-packages/warpctc_pytorch/./lib/libwarpctc.so (0x00007f36be0e5000)
	libcudart.so.10.0 => not found
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f36bdd5c000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f36bd9be000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f36bd7a6000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f36bd587000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f36bd196000)
	libcudart.so.10.0 => not found
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f36bcf67000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f36be96d000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f36bcd63000)

The libraries are in <espnet_dir>/tools/venv/lib, but activate does not add that lib directory to LD_LIBRARY_PATH.

I am wondering whether LD_LIBRARY_PATH can be configured by setup.py, or whether it has to be set manually.
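
To spot this situation quickly, the unresolved entries can be pulled out of the ldd output. The sketch below works on the text only; `missing_libs` is a hypothetical helper, and the sample is a shortened copy of the output above.

```python
def missing_libs(ldd_output):
    """Return library names that ldd reports as 'not found', without repeats."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            name = line.split("=>")[0].strip()
            if name not in missing:
                missing.append(name)
    return missing


sample = """\
\tlinux-vdso.so.1 (0x00007ffd2a958000)
\tlibcudart.so.10.0 => not found
\tlibm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f36bd9be000)
\tlibcudart.so.10.0 => not found
"""
print(missing_libs(sample))
```

Any name that shows up here has to be reachable through LD_LIBRARY_PATH (or another loader mechanism) before the extension can be imported.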

Delete warpctc-pytorch11-XXX wheels from pypi.org

Now that espnet/espnet#2453 is merged, users can install the warpctc-pytorch wheel for PyTorch 1.1. The warpctc-pytorch11-XXX wheels should therefore be deleted from pypi.org once the installation instructions in the README have been updated.

Python 3.5 build

Hello folks

Debian 9 ships Python 3.5.3 in its core system. Since Debian 9 is probably still fairly common, could you please build Python 3.5 wheels for PyPI too?

Exclude maintenance version of PyTorch from wheel name

As of now, Travis CI builds a wheel named warpctc_pytorch100_cuda101. The maintenance version of PyTorch should be excluded, so that the wheel is named warpctc_pytorch10_cuda101, meaning it works with PyTorch 1.0.X and CUDA 10.1.Y.
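
The same major.minor truncation applies on the CUDA side. The sketch below extracts the tag from the text printed by nvcc --version; the helper name `cuda_tag_from_nvcc` is hypothetical, and the sample line assumes the usual "release X.Y" wording of that output.

```python
import re


def cuda_tag_from_nvcc(nvcc_output):
    """Return e.g. '101' for CUDA 10.1.Y, or None if no release is found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    # Only major and minor are kept; the patch level is dropped.
    return match.group(1) + match.group(2)


sample = "Cuda compilation tools, release 10.1, V10.1.243"
print("warpctc_pytorch{}_cuda{}".format("10", cuda_tag_from_nvcc(sample)))
```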

No matching distribution found for warpctc-pytorch==0.2.2+torch16.cpu

We have the following issue:
https://github.com/espnet/espnet/actions/runs/3045499004/jobs/4921624873#step:6:1432

@ysk24ok, could you deal with it?

warp-ctc build is failing

@ysk24ok, the warp-ctc build is failing (possibly due to the PEP 440 handling introduced in pip 20.3?)
https://github.com/espnet/espnet/runs/1473939967
Could you check this?

fatal error: cuda_runtime_api.h: No such file or directory

I was building the warp-ctc included in the ESPnet package.
Everything went well until I got this error.

building 'warpctc_pytorch._warp_ctc' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/src
gcc -pthread -B /mls/john/espnet/tools/venv/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/mls/john/espnet/tools/warp-ctc/include -I/mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include -I/mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include/TH -I/mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/bin/include -I/mls/john/espnet/tools/venv/include/python3.7m -c src/binding.cpp -o build/temp.linux-x86_64-3.7/src/binding.o -std=c++11 -fPIC -DWARPCTC_ENABLE_GPU -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_warp_ctc -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include/ATen/cuda/CUDAContext.h:5:0,
from src/binding.cpp:9:
/mls/john/espnet/tools/venv/lib/python3.7/site-packages/torch/lib/include/ATen/cuda/CUDAStream.h:6:30: fatal error: cuda_runtime_api.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
make: *** [warp-ctc.done] Error 1

Even though the cuda_runtime_api.h file exists in its directory, the warp-ctc build could not find it.

Luckily, I solved this error using the method described at the URL below.

https://blog.csdn.net/idwtwt/article/details/98602951

Please fix this problem as soon as you can.
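
The gcc command above passes -I/usr/local/cuda/bin/include, while CUDA headers normally live in the include directory under the CUDA root. This suggests, as an assumption not confirmed in the issue, that the include path was derived from the directory containing nvcc. A minimal sketch of deriving it from the CUDA root instead; `cuda_include_dir` is a hypothetical helper and is not taken from setup.py.

```python
import posixpath


def cuda_include_dir(nvcc_path):
    """Map .../cuda/bin/nvcc to .../cuda/include."""
    bin_dir = posixpath.dirname(nvcc_path)    # .../cuda/bin
    cuda_root = posixpath.dirname(bin_dir)    # .../cuda
    return posixpath.join(cuda_root, "include")


print(cuda_include_dir("/usr/local/cuda/bin/nvcc"))
```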

libcudart.so.10.2: cannot open shared object file: No such file or directory

After installing warpctc-pytorch==0.2.1+torch16.cuda102, this error is reported. How can I solve this problem? Thank you for your help.

Reorganize branch structure

Current branch structure is as follows,

pytorch_binding (default branch, for PyTorch 0.4)
pytorch-1.0.0 for PyTorch 1.0
many other WIP branches

I think branches should be reorganized. One idea is like this,

      *---* pytorch-0.4 branch
     /
*---*---*---*---*---*---* default branch
             \       \
              \       *---* pytorch-1.1 branch
               \
                * pytorch-1.0 branch

In this new branch structure, we work on the default branch to adapt the code to each new PyTorch version. (BTW, we may want to rename the pytorch_binding branch, because the name is redundant.)
Once the work is finished and the code works with the new PyTorch version, we create a new branch for it (e.g. pytorch-X.Y).
After the pytorch-X.Y branch is created, we leave it untouched unless a bug is found, in which case a hotfix commit is added on top of that branch.

To change current branch structure into this new structure, we will follow these steps:

1. Run git checkout -b pytorch-0.4 on the pytorch_binding branch:

$ git checkout pytorch_binding
$ git checkout -b pytorch-0.4

2. Update tools/Makefile of espnet to run git checkout -b pytorch-0.4 remotes/origin/pytorch-0.4 when TH_VERSION is 0.4.
3. After 2 is merged and a new version (v0.6.0?) of espnet is released, merge pytorch-1.0.0 into the default branch. This merge breaks backward compatibility because older versions of espnet build warp-ctc for PyTorch 0.4.0 from the default branch.
4. After 3, I'll work on the default branch to adapt it to newer versions of PyTorch.

If this proposal is valid, I want maintainers to do 1. After that, I'll create a pull request for 2. We have to wait for new release to do 3, so until then I'll work on pytorch-1.0.0 instead of default branch to make wheels for PyTorch 1.0.

Would you please check this issue @sw005320 @jnishi ?
