pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Home Page: https://pytorch.org/audio

License: BSD 2-Clause "Simplified" License

Python 80.61% C++ 14.26% Shell 0.19% Batchfile 0.45% CMake 0.81% Cuda 3.68% C 0.02%

audio python io speech machine-learning pytorch audio-processing

audio's Introduction

torchaudio: an audio library for PyTorch

The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, focusing on trainable features through the autograd system, and keeping a consistent style (tensor names and dimension names). It is therefore primarily a machine learning library and not a general signal processing library. Because all computations are expressed as PyTorch operations, torchaudio is easy to use and feels like a natural extension of PyTorch.

- Support audio I/O (Load files, Save files)
  - Load a variety of audio formats, such as wav, mp3, ogg, flac, opus, sphere, into a torch Tensor using SoX
  - Kaldi (ark/scp)
- Dataloaders for common audio datasets
- Audio and speech processing functions
  - forced_align
- Common audio transforms
  - Spectrogram, AmplitudeToDB, MelScale, MelSpectrogram, MFCC, MuLawEncoding, MuLawDecoding, Resample
- Compliance interfaces: Run code using PyTorch that aligns with other libraries
  - Kaldi: spectrogram, fbank, mfcc

Installation

Please refer to https://pytorch.org/audio/main/installation.html for installation and build process of TorchAudio.

API Reference

API Reference is located here: http://pytorch.org/audio/main/

Contributing Guidelines

Please refer to CONTRIBUTING.md

Citation

If you find this package useful, please cite as:

@article{yang2021torchaudio,
  title={TorchAudio: Building Blocks for Audio and Speech Processing},
  author={Yao-Yuan Yang and Moto Hira and Zhaoheng Ni and Anjali Chourdia and Artyom Astafurov and Caroline Chen and Ching-Feng Yeh and Christian Puhrsch and David Pollack and Dmitriy Genzel and Donny Greenberg and Edward Z. Yang and Jason Lian and Jay Mahadeokar and Jeff Hwang and Ji Chen and Peter Goldsborough and Prabhat Roy and Sean Narenthiran and Shinji Watanabe and Soumith Chintala and Vincent Quenneville-Bélair and Yangyang Shi},
  journal={arXiv preprint arXiv:2110.15018},
  year={2021}
}

@misc{hwang2023torchaudio,
      title={TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch}, 
      author={Jeff Hwang and Moto Hira and Caroline Chen and Xiaohui Zhang and Zhaoheng Ni and Guangzhi Sun and Pingchuan Ma and Ruizhe Huang and Vineel Pratap and Yuekai Zhang and Anurag Kumar and Chin-Yun Yu and Chuang Zhu and Chunxi Liu and Jacob Kahn and Mirco Ravanelli and Peng Sun and Shinji Watanabe and Yangyang Shi and Yumeng Tao and Robin Scheibler and Samuele Cornell and Sean Kim and Stavros Petridis},
      year={2023},
      eprint={2310.17864},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

Disclaimer on Datasets

This is a utility library that downloads and prepares public datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have license to use the dataset. It is your responsibility to determine whether you have permission to use the dataset under the dataset's license.

If you're a dataset owner and wish to update any part of it (description, citation, etc.), or do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thanks for your contribution to the ML community!

Pre-trained Model License

The pre-trained models provided in this library may have their own licenses or terms and conditions derived from the dataset used for training. It is your responsibility to determine whether you have permission to use the models for your use case.

For instance, the SquimSubjective model is released under the Creative Commons Attribution Non Commercial 4.0 International (CC-BY-NC 4.0) license. See the link for additional details.

Other pre-trained models that have different licenses are noted in the documentation. Please check out the documentation page.

audio's Issues

Additional spectral feature transformations

While torchaudio provides a Mel-scaled spectrogram transformation (torchaudio.transforms.MEL), there are a few additional spectral feature transformations that are extremely useful for pre-processing and data augmentation. For example, two feature transformations that I’d love to see in torchaudio are

can not import torchaudio

Hi,
I am trying to use torchaudio on Ubuntu and get the following exception:
lib/python3.7/site-packages/torchaudio-0.1-py3.7-linux-x86_64.egg/_torch_sox.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN2at5ErrorC1ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

Running ldd on the .so gives the following:
ldd _torch_sox.cpython-37m-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffeccdb2000)
libsox.so.3 => /usr/lib/x86_64-linux-gnu/libsox.so.3 (0x00007f9d99960000)
libstdc++.so.6 => /home/uname/.conda/envs/PyTorch/lib/libstdc++.so.6 (0x00007f9d99c93000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9d995c2000)
libgcc_s.so.1 => /home/uname/.conda/envs/PyTorch/lib/libgcc_s.so.1 (0x00007f9d99c7e000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9d993a3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9d98fb2000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f9d98da8000)
libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f9d98b76000)
libz.so.1 => /home/uname/.conda/envs/PyTorch/lib/libz.so.1 (0x00007f9d9895f000)
libmagic.so.1 => /usr/lib/x86_64-linux-gnu/libmagic.so.1 (0x00007f9d9873d000)
libgsm.so.1 => /usr/lib/x86_64-linux-gnu/libgsm.so.1 (0x00007f9d98530000)
libgomp.so.1 => /home/uname/.conda/envs/PyTorch/lib/libgomp.so.1 (0x00007f9d99c56000)
/lib64/ld-linux-x86-64.so.2 (0x00007f9d99be8000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9d9832c000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9d98124000)

Running objdump on the .so gives the following:

objdump -tT _torch_sox.cpython-37m-x86_64-linux-gnu.so | grep _ZN2at5Error
0000000000009850 w F .text 0000000000000047 _ZN2at5ErrorD0Ev
0000000000000000 UND 0000000000000000 _ZN2at5ErrorC1ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
0000000000009510 w F .text 000000000000003f _ZN2at5ErrorD1Ev
0000000000009510 w F .text 000000000000003f _ZN2at5ErrorD2Ev
0000000000000000 D UND 0000000000000000 _ZN2at5ErrorC1ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
0000000000009510 w DF .text 000000000000003f Base _ZN2at5ErrorD1Ev
0000000000009510 w DF .text 000000000000003f Base _ZN2at5ErrorD2Ev
0000000000009850 w DF .text 0000000000000047 Base _ZN2at5ErrorD0Ev
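
The undefined symbol in this report contains `__cxx11`, which is how the newer libstdc++ `std::string` ABI shows up in mangled names. A mismatch between the extension and the `_GLIBCXX_USE_CXX11_ABI` setting PyTorch was built with is one common cause of such errors, although the report itself does not confirm that this is the cause here. The following is a minimal sketch of a heuristic check; the helper name `uses_cxx11_abi` is hypothetical, and the second symbol is taken from a different report on this page.

```python
NEW_ABI_MARKER = "__cxx11"  # inline namespace used by the newer libstdc++ string ABI


def uses_cxx11_abi(mangled_symbol: str) -> bool:
    """Heuristic: True if the mangled name references std::__cxx11 types."""
    return NEW_ABI_MARKER in mangled_symbol


# Symbol from this report (references std::__cxx11::basic_string).
symbol_from_report = (
    "_ZN2at5ErrorC1ENS_14SourceLocationE"
    "NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"
)
# Symbol from another report on this page; "Ss" abbreviates the older std::string.
old_style_symbol = "_ZN2at5ErrorC1ENS_14SourceLocationESs"

for name in (symbol_from_report, old_style_symbol):
    print(uses_cxx11_abi(name), name)
```

If the extension and the installed PyTorch disagree on this marker, rebuilding the extension against the installed PyTorch is a reasonable first thing to try.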

module 'torch' has no attribute 'hann_window' when importing torchaudio

I have installed torchaudio on top of PyTorch 0.3.1 for Python 3.5.
But when I try to import torchaudio, I get the following error:

AttributeError: module 'torch' has no attribute 'hann_window'

Please help.
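
The error suggests that the installed PyTorch predates `torch.hann_window`, so checking for the attribute before importing torchaudio gives a clearer diagnosis. Below is a small sketch using only the standard library; `module_has_attr` is a hypothetical helper, not part of torch or torchaudio.

```python
import importlib
import importlib.util


def module_has_attr(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and exposes `attr`."""
    try:
        if importlib.util.find_spec(module_name) is None:
            return False  # module is not installed
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False  # module is installed but cannot be imported cleanly
    return hasattr(module, attr)


if not module_has_attr("torch", "hann_window"):
    print("torch is missing or lacks hann_window; a newer PyTorch may be needed")
```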

ImportError when using pytorch compiled from the master branch with CUDA 10

I have compiled pytorch from the master branch using Python 3.7.1 and CUDA 10. Everything except pytorch-audio works. I receive this error when trying to import.

~/.pyenv/versions/3.7.1/envs/pytorch-venv/lib/python3.7/site-packages/torchaudio/__init__.py in <module>
      2 
      3 import torch
----> 4 import _torch_sox
      5 
      6 from torchaudio import transforms

ImportError: /home/kureta/.pyenv/versions/3.7.1/envs/pytorch-venv/lib/python3.7/site-packages/_torch_sox.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN2at18SparseCUDATensorIdEv

I can provide additional information if necessary.

Weird build warning about ABI compatibility

Hi there,

When trying to build torchaudio from a Dockerfile, I get this warning, and I suspect it causes pip installation failures afterwards. How can I fix it?
At first glance, the installed versions of gcc and libstdc++ are correct.

François

Step 11/15 : RUN git clone --recursive https://github.com/pytorch/audio.git
---> Running in 1c0152e3a375
Cloning into 'audio'...
Removing intermediate container 1c0152e3a375
---> 1e796407307e
Step 12/15 : RUN cd audio; python setup.py install
---> Running in ffaec8803f4a
running install
running bdist_egg
running egg_info
creating torchaudio.egg-info
writing torchaudio.egg-info/PKG-INFO
writing top-level names to torchaudio.egg-info/top_level.txt
writing dependency_links to torchaudio.egg-info/dependency_links.txt
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/torchaudio
copying torchaudio/transforms.py -> build/lib.linux-x86_64-2.7/torchaudio
copying torchaudio/__init__.py -> build/lib.linux-x86_64-2.7/torchaudio
creating build/lib.linux-x86_64-2.7/torchaudio/datasets
copying torchaudio/datasets/__init__.py -> build/lib.linux-x86_64-2.7/torchaudio/datasets
copying torchaudio/datasets/yesno.py -> build/lib.linux-x86_64-2.7/torchaudio/datasets
copying torchaudio/datasets/vctk.py -> build/lib.linux-x86_64-2.7/torchaudio/datasets
running build_ext
/usr/local/lib/python2.7/dist-packages/torch/utils/cpp_extension.py:118: UserWarning:

                           !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (c++) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 4.9 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.

See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 4.9 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                          !! WARNING !!

warnings.warn(ABI_INCOMPATIBILITY_WARNING.format(compiler))
building '_torch_sox' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/torchaudio
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/torch/lib/include -I/usr/local/lib/python2.7/dist-packages/torch/lib/include/TH -I/usr/local/lib/python2.7/dist-packages/torch/lib/include/THC -I/usr/include/python2.7 -c torchaudio/torch_sox.cpp -o build/temp.linux-x86_64-2.7/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
c++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/torchaudio/torch_sox.o -lsox -o build/lib.linux-x86_64-2.7/_torch_sox.so
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/_torch_sox.so -> build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/torchaudio
copying build/lib.linux-x86_64-2.7/torchaudio/transforms.py -> build/bdist.linux-x86_64/egg/torchaudio
copying build/lib.linux-x86_64-2.7/torchaudio/__init__.py -> build/bdist.linux-x86_64/egg/torchaudio
creating build/bdist.linux-x86_64/egg/torchaudio/datasets
copying build/lib.linux-x86_64-2.7/torchaudio/datasets/__init__.py -> build/bdist.linux-x86_64/egg/torchaudio/datasets
copying build/lib.linux-x86_64-2.7/torchaudio/datasets/yesno.py -> build/bdist.linux-x86_64/egg/torchaudio/datasets
copying build/lib.linux-x86_64-2.7/torchaudio/datasets/vctk.py -> build/bdist.linux-x86_64/egg/torchaudio/datasets
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/transforms.py to transforms.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/datasets/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/datasets/yesno.py to yesno.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/datasets/vctk.py to vctk.pyc
creating stub loader for _torch_sox.so
byte-compiling build/bdist.linux-x86_64/egg/_torch_sox.py to _torch_sox.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
copying torchaudio.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
creating dist
creating 'dist/torchaudio-0.1-py2.7-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing torchaudio-0.1-py2.7-linux-x86_64.egg
Copying torchaudio-0.1-py2.7-linux-x86_64.egg to /usr/local/lib/python2.7/dist-packages
Adding torchaudio 0.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/torchaudio-0.1-py2.7-linux-x86_64.egg
Processing dependencies for torchaudio==0.1
Finished processing dependencies for torchaudio==0.1
Removing intermediate container ffaec8803f4a

Why do you not allow for installation using pip or conda?

There are many installation error issues.

reduce aliasing in downsampling

Currently the VCTK dataset loader downsamples without low-pass filtering (i.e., plain decimation). This results in severe aliasing artifacts and should be avoided.

I propose adding resampy as a dependency and including it in the base modules so that dataset loaders can use high-quality resampling.

I can put together a PR if you like this proposal.
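
To illustrate the aliasing concern, here is a dependency-free sketch that compares plain decimation with low-pass filtering followed by decimation. It is not the resampy algorithm: the Hann-windowed sinc filter, the tap count, the cutoff and the test tone are all assumptions chosen for the demonstration.

```python
import math


def lowpass_taps(num_taps: int, cutoff: float) -> list:
    """Hann-windowed sinc low-pass; cutoff in cycles/sample (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC


def fir_filter(signal, taps):
    """Direct convolution, keeping only the fully overlapped ('valid') part."""
    k = len(taps)
    return [sum(taps[j] * signal[i + k - 1 - j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def rms(signal):
    return math.sqrt(sum(s * s for s in signal) / len(signal))


sample_rate = 16000
tone_hz = 7000  # above the 4000 Hz Nyquist limit of the 8000 Hz target rate
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(2048)]

naive = signal[::2]  # plain decimation: the tone folds down to 1000 Hz
filtered = fir_filter(signal, lowpass_taps(63, 0.2))[::2]  # low-pass, then decimate

print(f"naive RMS:    {rms(naive):.3f}")
print(f"filtered RMS: {rms(filtered):.3f}")
```

The tone cannot be represented at the target rate, so ideally it should vanish. Plain decimation instead keeps it at nearly full strength as a spurious 1000 Hz component, while the filtered path suppresses it.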

MacOS X 10.13 Installation Issue

Obtaining file:///Users/grokmachine/pytorch/audio
Installing collected packages: torchaudio
  Running setup.py develop for torchaudio
    Complete output from command /Users/grokmachine/anaconda3/envs/torch/bin/python -c "import setuptools, tokenize;__file__='/Users/grokmachine/pytorch/audio/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps:
    which: no nvcc in (/Users/grokmachine/anaconda3/envs/torch/bin:/Users/grokmachine/anaconda3/bin:/Users/grokmachine/.cargo/bin:/usr/local/opt/mysql-client/bin:/usr/local/opt/gettext/bin:/usr/local/opt/icu4c/sbin:/usr/local/opt/icu4c/bin:/usr/local/opt/berkeley-db@4/bin:/usr/local/opt/llvm/bin:/usr/local/opt/llvm/bin:/usr/local/sbin:/Users/grokmachine/bin:/usr/local/bin:/usr/local/opt/mysql-client/bin:/usr/local/opt/gettext/bin:/usr/local/opt/icu4c/sbin:/usr/local/opt/icu4c/bin:/usr/local/opt/berkeley-db@4/bin:/usr/local/opt/llvm/bin:/usr/local/opt/llvm/bin:/usr/local/sbin:/Users/grokmachine/bin:/usr/local/bin:/Users/grokmachine/.cargo/bin:/Library/Frameworks/Python.framework/Versions/3.7/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/MacGPG2/bin:/Applications/Wireshark.app/Contents/MacOS)
    running develop
    running egg_info
    writing torchaudio.egg-info/PKG-INFO
    writing dependency_links to torchaudio.egg-info/dependency_links.txt
    writing top-level names to torchaudio.egg-info/top_level.txt
    reading manifest file 'torchaudio.egg-info/SOURCES.txt'
    writing manifest file 'torchaudio.egg-info/SOURCES.txt'
    running build_ext
    building '_torch_sox' extension
    gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/lib/python3.7/site-packages/numpy/core/include/ -I/Users/grokmachine/pytorch/pytorch/torch/lib/include -I/Users/grokmachine/pytorch/pytorch/torch/lib/include/TH -I/Users/grokmachine/pytorch/pytorch/torch/lib/include/THC -I/Users/grokmachine/anaconda3/envs/torch/include/python3.7m -c torchaudio/torch_sox.cpp -o build/temp.macosx-10.7-x86_64-3.7/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -std=c++11
    In file included from torchaudio/torch_sox.cpp:1:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/torch/torch.h:5:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/pybind11.h:43:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/attr.h:13:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/cast.h:13:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/pytypes.h:12:
    /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/detail/common.h:139:10: fatal error: 'forward_list' file not found
    #include <forward_list>
             ^~~~~~~~~~~~~~
    1 error generated.
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/Users/grokmachine/anaconda3/envs/torch/bin/python -c "import setuptools, tokenize;__file__='/Users/grokmachine/pytorch/audio/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps" failed with error code 1 in /Users/grokmachine/pytorch/audio/

So I tried using clang instead, but got the same error:

$ CC=clang CXX=clang++ pip install -e .
Obtaining file:///Users/grokmachine/pytorch/audio
Installing collected packages: torchaudio
  Running setup.py develop for torchaudio
    Complete output from command /Users/grokmachine/anaconda3/envs/torch/bin/python -c "import setuptools, tokenize;__file__='/Users/grokmachine/pytorch/audio/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps:
    which: no nvcc in (/Users/grokmachine/anaconda3/envs/torch/bin:/Users/grokmachine/anaconda3/bin:/Users/grokmachine/.cargo/bin:/usr/local/opt/mysql-client/bin:/usr/local/opt/gettext/bin:/usr/local/opt/icu4c/sbin:/usr/local/opt/icu4c/bin:/usr/local/opt/berkeley-db@4/bin:/usr/local/opt/llvm/bin:/usr/local/opt/llvm/bin:/usr/local/sbin:/Users/grokmachine/bin:/usr/local/bin:/usr/local/opt/mysql-client/bin:/usr/local/opt/gettext/bin:/usr/local/opt/icu4c/sbin:/usr/local/opt/icu4c/bin:/usr/local/opt/berkeley-db@4/bin:/usr/local/opt/llvm/bin:/usr/local/opt/llvm/bin:/usr/local/sbin:/Users/grokmachine/bin:/usr/local/bin:/Users/grokmachine/.cargo/bin:/Library/Frameworks/Python.framework/Versions/3.7/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/usr/local/MacGPG2/bin:/Applications/Wireshark.app/Contents/MacOS)
    running develop
    running egg_info
    writing torchaudio.egg-info/PKG-INFO
    writing dependency_links to torchaudio.egg-info/dependency_links.txt
    writing top-level names to torchaudio.egg-info/top_level.txt
    reading manifest file 'torchaudio.egg-info/SOURCES.txt'
    writing manifest file 'torchaudio.egg-info/SOURCES.txt'
    running build_ext
    building '_torch_sox' extension
    clang -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/lib/python3.7/site-packages/numpy/core/include/ -I/Users/grokmachine/pytorch/pytorch/torch/lib/include -I/Users/grokmachine/pytorch/pytorch/torch/lib/include/TH -I/Users/grokmachine/pytorch/pytorch/torch/lib/include/THC -I/Users/grokmachine/anaconda3/envs/torch/include/python3.7m -c torchaudio/torch_sox.cpp -o build/temp.macosx-10.7-x86_64-3.7/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -std=c++11
    In file included from torchaudio/torch_sox.cpp:1:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/torch/torch.h:5:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/pybind11.h:43:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/attr.h:13:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/cast.h:13:
    In file included from /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/pytypes.h:12:
    /Users/grokmachine/pytorch/pytorch/torch/lib/include/pybind11/detail/common.h:139:10: fatal error: 'forward_list' file not found
    #include <forward_list>
             ^~~~~~~~~~~~~~
    1 error generated.
    error: command 'clang' failed with exit status 1

    ----------------------------------------
Command "/Users/grokmachine/anaconda3/envs/torch/bin/python -c "import setuptools, tokenize;__file__='/Users/grokmachine/pytorch/audio/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps" failed with error code 1 in /Users/grokmachine/pytorch/audio/

But if I use the latest gcc-8, it works:

$ CC=gcc-8 CXX=g++-8 pip install -e .
Obtaining file:///Users/grokmachine/pytorch/audio
Installing collected packages: torchaudio
  Running setup.py develop for torchaudio
Successfully installed torchaudio

Saving stereo files causes fatal error if re-loading generated file

Saving stereo files does not properly add the length info to the saved file. Thus you cannot reload these files with torchaudio, although they are playable in most audio programs. Below is a test case.

import torchaudio
sig, sr = torchaudio.load("test/steam-train-whistle-daniel_simon.mp3")
torchaudio.save("test/file.wav", sig, sr)
sig, sr = torchaudio.load("test/file.wav")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dhpollack/repos/audio/torchaudio/__init__.py", line 30, in load
    func(str(filepath).encode("utf-8"), out, sample_rate_p)
  File "/home/dhpollack/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 177, in safe_call
    result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: [read_audio] Unknown length at torchaudio/src/generic/th_sox.c:14
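
One way to check independently whether a saved WAV file carries its length in the header is to open it with Python's standard `wave` module, which reads the frame count from the header. The sketch below writes its own small stereo file so that it is self-contained; in practice one would point `wave.open` at the file produced by `torchaudio.save`. Whether `wave` would flag the same problem as the reported error is an untested assumption.

```python
import math
import os
import struct
import tempfile
import wave

sample_rate = 8000
num_frames = 800

# Interleave 16-bit left/right samples: L0, R0, L1, R1, ...
frames = bytearray()
for n in range(num_frames):
    left = int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / sample_rate))
    right = int(32767 * 0.5 * math.sin(2 * math.pi * 660 * n / sample_rate))
    frames += struct.pack("<hh", left, right)

path = os.path.join(tempfile.mkdtemp(), "stereo_check.wav")
with wave.open(path, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)  # bytes per sample
    writer.setframerate(sample_rate)
    writer.writeframes(bytes(frames))  # header length fields are filled in on close

# Re-open the file and report what the header says.
with wave.open(path, "rb") as reader:
    print("channels:", reader.getnchannels(), "frames:", reader.getnframes())
```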

undefined symbol when importing torchaudio with pytorch

Hi,
When importing torchaudio with pytorch 0.4.1 I get an undefined symbol. It does however work with v0.4.0. audio version: 7314b36

Successfully installed numpy-1.15.0 torch-cpu-0.4.1 torchaudio-0.1
(test_venv) [~]$ python -c "import torchaudio;"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "test_venv/lib/python3.6/site-packages/torchaudio/__init__.py", line 4, in <module>
    import _torch_sox
ImportError: test_venv/lib/python3.6/site-packages/_torch_sox.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at5ErrorC1ENS_14SourceLocationESs

Thanks

Build failed

I ran this command

python setup.py install &> log

and saw this error log

running install
running bdist_egg
running egg_info
writing torchaudio.egg-info/PKG-INFO
writing dependency_links to torchaudio.egg-info/dependency_links.txt
writing top-level names to torchaudio.egg-info/top_level.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.7-x86_64/egg
running install_lib
running build_py
running build_ext
building '_torch_sox' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/include -arch x86_64 -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/include -arch x86_64 -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/TH -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/THC -I/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/include/python3.6m -c torchaudio/torch_sox.cpp -o build/temp.macosx-10.7-x86_64-3.6/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -std=c++11
In file included from torchaudio/torch_sox.cpp:1:
In file included from /Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/torch/torch.h:5:
In file included from /Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:5:
In file included from /Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/ATen/Allocator.h:6:
/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/lib/include/ATen/Retainable.h:3:10: fatal error: 'atomic' file not found
#include <atomic>
         ^~~~~~~~
1 error generated.
/Users/npkk/.pyenv/versions/anaconda3-5.1.0/envs/pytorch0.4.0/lib/python3.6/site-packages/torch/utils/cpp_extension.py:106: UserWarning: 

                               !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (g++) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 4.9 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.

See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 4.9 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                              !! WARNING !!

  warnings.warn(ABI_INCOMPATIBILITY_WARNING.format(compiler))
error: command 'gcc' failed with exit status 1

I think the build command

gcc -Wno-unused-result ... -std=c++11

should instead be the following:

g++ -Wno-unused-result ... -std=c++11

No module named 'torch.utils.cpp_extension'

I am getting this error when I run python setup.py install.

Please help.
Thanks

Which library is torchaudio consistent with?

Hi, I'm currently updating my torch codebase from librosa to torchaudio for transforms, to take advantage of the (much) faster torch STFT implementation on the GPU. However, I'm running into several cases where Spectrogram vs. librosa.core._spectrogram and MelSpectrogram vs. librosa.melspectrogram give different results. Does this repo ensure consistency with another Python audio library for those transformations? I think it would be good to be consistent with another widely used library. I'm currently figuring out the correct params to ensure consistency, and I can open a PR if that sounds useful.

For example:

# Written against the librosa / torch / torchaudio APIs of the time.
import librosa
import numpy as np
import torch
import torchaudio

sound, sample_rate = torchaudio.load('wav_file.wav')
sound_librosa = sound.cpu().numpy().squeeze().T

sample_rate = 16000
n_mels = 40
window_stride = 0.01
window_size = 0.025
hop_length = int(sample_rate * window_stride)
n_fft = int(sample_rate * window_size)

stft_librosa = librosa.stft(y=sound_librosa,
                            hop_length=hop_length,
                            n_fft=n_fft)
spectro_librosa, n_fft = librosa.core.spectrum._spectrogram(y=sound_librosa,
                            hop_length=hop_length,
                            n_fft=n_fft, power=2)
mel_basis = librosa.filters.mel(sample_rate,
                                n_mels=n_mels,
                                n_fft=n_fft,
                                norm=None, # non-standard
                                htk=True) # non-standard
check = np.dot(mel_basis, spectro_librosa)

# `soundcuda` and `window` were undefined in the snippet as posted; the two
# definitions below are assumptions (Hann is librosa's default window).
soundcuda = sound.cuda()
window = torch.hann_window(n_fft).cuda()
stft_torch = torch.stft(soundcuda,
                        hop_length=hop_length,
                        n_fft=n_fft,
                        window=window).transpose(1, 2)
spectro_torch = stft_torch.pow(2).sum(-1)
melscale = torchaudio.transforms.MelScale(n_mels=n_mels)
# Assumed intent: mel-scale the torch spectrogram (the post passed `check`).
check2 = melscale(spectro_torch)

# expected: check == check2

The torchaudio MelScale uses the non-default librosa options norm=None, htk=True on librosa.filters.mel (https://librosa.github.io/librosa/_modules/librosa/filters.html#mel). I also removed the default spectrogram normalization at https://github.com/pytorch/audio/blob/master/torchaudio/transforms.py#L198, which is not a librosa option.

There are also functional inconsistencies between the librosa and torchaudio function calls: librosa.feature.melspectrogram returns a spectrogram, whereas torchaudio converts the spectrogram to the dB scale.
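
The htk=True option mentioned above selects a different mel scale, which is one reason two libraries can disagree even when every other parameter matches. The sketch below compares the HTK formula with the Slaney-style scale that, to my understanding, librosa uses when htk=False. The function names are hypothetical and the constants are the commonly cited ones, not values copied from either library's source.

```python
import math


def hz_to_mel_htk(freq_hz: float) -> float:
    """HTK-style mel scale: logarithmic over the whole frequency range."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hz_to_mel_slaney(freq_hz: float) -> float:
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above."""
    f_sp = 200.0 / 3.0               # Hz per mel in the linear region
    min_log_hz = 1000.0              # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp  # mel value at that boundary
    logstep = math.log(6.4) / 27.0   # log-region step size
    if freq_hz < min_log_hz:
        return freq_hz / f_sp
    return min_log_mel + math.log(freq_hz / min_log_hz) / logstep


for hz in (250.0, 1000.0, 4000.0, 8000.0):
    print(f"{hz:7.0f} Hz -> HTK {hz_to_mel_htk(hz):8.2f}, "
          f"Slaney {hz_to_mel_slaney(hz):6.2f}")
```

Because the two scales place filter centers at different frequencies, the resulting filterbanks differ. The norm option then additionally changes how each filter is scaled.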

docstring should be worded for audio

I guess 'takes in an PIL image' should be changed to something like 'takes in raw audio'

audio/torchaudio/datasets/vctk.py

Line 83 in 7212f24

transform (callable, optional): A function/transform that takes in an PIL image

audio/torchaudio/datasets/yesno.py

Line 20 in 7212f24

transform (callable, optional): A function/transform that takes in an PIL image

it can not calculate on GPU

I tried to run it on the GPU, but it does not work. Would it work on the GPU if I modified the code?

Thank you!

Consistency between torchvision/torchaudio

When we load a sound file with torchaudio, we get an output Tensor of size (L x C) (L the number of audio frames and C the number of channels).
Wouldn't it be better to get a Tensor of shape (C x L)?

With torchvision, when I load an image and use the function ToTensor, the output tensor has shape (C x H x W), with the channel as the first dimension. Wouldn't it be more coherent if the output of the load function in torchaudio used a similar shape to the output of ToTensor?

In this case, the functions LC2CL and BLC2CBL are no longer necessary.
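
The two layouts differ only by a transpose. The sketch below shows the reordering on plain nested lists so that it runs without torch; `frames_to_channels` is a hypothetical helper, and on a 2-D torch tensor the equivalent operation would be `tensor.transpose(0, 1)`.

```python
def frames_to_channels(frames):
    """Convert an (L x C) list of frames into a (C x L) list of channels."""
    return [list(channel) for channel in zip(*frames)]


# Three stereo frames laid out as (L x C) = (3 x 2).
lc = [[0.1, -0.1], [0.2, -0.2], [0.3, -0.3]]
cl = frames_to_channels(lc)

print("L x C:", lc)
print("C x L:", cl)
```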

Change in `F2M`; more docstrings including examples

Hi, I think I could work on these if they seem useful for users/maintainers.

In transforms.F2M:
- The computation of the filterbanks (fb) can be moved to __init__() to remove its redundancy.
- Would more docstrings be useful?
- It seems there could be -inf or NaN values when applying _tlog10() (Example). Perhaps we could make it a little more stable?
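
Two of these suggestions can be sketched together: build the filterbank once in the constructor, and clamp values before taking the logarithm. This is a pure-Python illustration, not the F2M implementation. The class name, the uniform placeholder filter weights and the `floor` value are all assumptions.

```python
import math


def safe_log10(values, floor=1e-10):
    """log10 with inputs clamped to `floor`, so zeros or negatives cannot
    produce -inf or NaN (plain math.log10 would raise on them)."""
    return [math.log10(max(v, floor)) for v in values]


class CachedFilterbankTransform:
    """Hypothetical transform that builds its filterbank once, in __init__."""

    def __init__(self, n_bins: int, n_filters: int):
        # Placeholder weights: each filter simply averages all bins.
        self.fb = [[1.0 / n_bins] * n_bins for _ in range(n_filters)]

    def __call__(self, power_spectrum):
        # Reuse the cached filterbank instead of recomputing it per call.
        banded = [sum(w * p for w, p in zip(row, power_spectrum))
                  for row in self.fb]
        return [10.0 * x for x in safe_log10(banded)]


transform = CachedFilterbankTransform(n_bins=4, n_filters=2)
print(transform([0.0, 0.0, 0.0, 0.0]))  # silent input stays finite
print(transform([1.0, 1.0, 1.0, 1.0]))
```

The floor bounds the smallest reported level, so a silent frame maps to a large negative but finite value.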

ModuleNotFoundError: No module named 'torchaudio._ext.th_sox._th_sox'

Hi. Even though I've got sox installed, I get a 'module not found' error.
This is on Mac, with Anaconda & Python 3.6...

$ python
Python 3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:44:09) 
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchaudio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/shawley/Downloads/audio/torchaudio/__init__.py", line 9, in <module>
    from ._ext import th_sox
  File "/Users/shawley/Downloads/audio/torchaudio/_ext/th_sox/__init__.py", line 3, in <module>
    from ._th_sox import lib as _lib, ffi as _ffi
ModuleNotFoundError: No module named 'torchaudio._ext.th_sox._th_sox'
>>>

This is after following installation instructions.
Any suggestions? Thanks.

Install log follows below...

$ git clone https://github.com/pytorch/audio.git
Cloning into 'audio'...
remote: Counting objects: 304, done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 304 (delta 24), reused 19 (delta 10), pack-reused 249
Receiving objects: 100% (304/304), 4.85 MiB | 1.24 MiB/s, done.
Resolving deltas: 100% (106/106), done.

$ brew install sox
Error: sox 14.4.2 is already installed
To upgrade to 14.4.2_1, run `brew upgrade sox`

$ brew upgrade sox
==> Upgrading 1 outdated package, with result:
sox 14.4.2 -> 14.4.2_1
==> Upgrading sox 
==> Downloading https://homebrew.bintray.com/bottles/sox-14.4.2_1.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring sox-14.4.2_1.high_sierra.bottle.tar.gz
🍺  /usr/local/Cellar/sox/14.4.2_1: 23 files, 1.8MB

$ pip install cffi
Requirement already satisfied: cffi in /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages (1.11.5)
Requirement already satisfied: pycparser in /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages (from cffi) (2.18)


$ python setup.py install
running install
running bdist_egg
running egg_info
creating torchaudio.egg-info
writing torchaudio.egg-info/PKG-INFO
writing dependency_links to torchaudio.egg-info/dependency_links.txt
writing requirements to torchaudio.egg-info/requires.txt
writing top-level names to torchaudio.egg-info/top_level.txt
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.9-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-3.6
creating build/lib.macosx-10.9-x86_64-3.6/torchaudio
copying torchaudio/transforms.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio
copying torchaudio/__init__.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio
creating build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets
copying torchaudio/datasets/__init__.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets
copying torchaudio/datasets/yesno.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets
copying torchaudio/datasets/vctk.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets
creating build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext
copying torchaudio/_ext/__init__.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext
creating build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/th_sox
copying torchaudio/_ext/th_sox/__init__.py -> build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/th_sox
running build_ext
generating cffi module 'build/temp.macosx-10.9-x86_64-3.6/torchaudio._ext.th_sox._th_sox.c'
creating build/temp.macosx-10.9-x86_64-3.6
building 'torchaudio._ext.th_sox._th_sox' extension
creating build/temp.macosx-10.9-x86_64-3.6/build
creating build/temp.macosx-10.9-x86_64-3.6/build/temp.macosx-10.9-x86_64-3.6
creating build/temp.macosx-10.9-x86_64-3.6/Users
creating build/temp.macosx-10.9-x86_64-3.6/Users/shawley
creating build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads
creating build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads/audio
creating build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads/audio/torchaudio
creating build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads/audio/torchaudio/src
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/shawley/anaconda/envs/py36/include -arch x86_64 -I/Users/shawley/anaconda/envs/py36/include -arch x86_64 -I/Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -Itorchaudio/src -I/Users/shawley/anaconda/envs/py36/include/python3.6m -c build/temp.macosx-10.9-x86_64-3.6/torchaudio._ext.th_sox._th_sox.c -o build/temp.macosx-10.9-x86_64-3.6/build/temp.macosx-10.9-x86_64-3.6/torchaudio._ext.th_sox._th_sox.o
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/shawley/anaconda/envs/py36/include -arch x86_64 -I/Users/shawley/anaconda/envs/py36/include -arch x86_64 -I/Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -Itorchaudio/src -I/Users/shawley/anaconda/envs/py36/include/python3.6m -c /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c -o build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads/audio/torchaudio/src/th_sox.o
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:10:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateFloatTypes.h:10:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:10:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateFloatTypes.h:11:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateIntTypes.h:10:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateIntTypes.h:11:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateIntTypes.h:12:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateIntTypes.h:13:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/shawley/Downloads/audio/torchaudio/src/th_sox.c:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateAllTypes.h:11:
In file included from /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH/THGenerateIntTypes.h:14:
In file included from generic/th_sox.c:1:
torchaudio/src/generic/th_sox.c:11:18: warning: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
    if (nsamples != -1) {
        ~~~~~~~~ ^  ~~
torchaudio/src/generic/th_sox.c:27:14: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
  for (x=0; x<samples_read/nchannels; x++) {
            ~^~~~~~~~~~~~~~~~~~~~~~~
14 warnings generated.
gcc -bundle -undefined dynamic_lookup -Wl,-rpath,/Users/shawley/anaconda/envs/py36/lib -L/Users/shawley/anaconda/envs/py36/lib -headerpad_max_install_names -Wl,-rpath,/Users/shawley/anaconda/envs/py36/lib -L/Users/shawley/anaconda/envs/py36/lib -headerpad_max_install_names -arch x86_64 build/temp.macosx-10.9-x86_64-3.6/build/temp.macosx-10.9-x86_64-3.6/torchaudio._ext.th_sox._th_sox.o build/temp.macosx-10.9-x86_64-3.6/Users/shawley/Downloads/audio/torchaudio/src/th_sox.o -L/Users/shawley/anaconda/envs/py36/lib -lsox -o build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/th_sox/_th_sox.abi3.so
creating build/bdist.macosx-10.9-x86_64
creating build/bdist.macosx-10.9-x86_64/egg
creating build/bdist.macosx-10.9-x86_64/egg/torchaudio
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/transforms.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio
creating build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets/__init__.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets/yesno.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/datasets/vctk.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/__init__.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio
creating build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext
creating build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/th_sox
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/th_sox/_th_sox.abi3.so -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/th_sox
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/th_sox/__init__.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/th_sox
copying build/lib.macosx-10.9-x86_64-3.6/torchaudio/_ext/__init__.py -> build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/transforms.py to transforms.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets/__init__.py to __init__.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets/yesno.py to yesno.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/datasets/vctk.py to vctk.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/__init__.py to __init__.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/th_sox/__init__.py to __init__.cpython-36.pyc
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/__init__.py to __init__.cpython-36.pyc
creating stub loader for torchaudio/_ext/th_sox/_th_sox.abi3.so
byte-compiling build/bdist.macosx-10.9-x86_64/egg/torchaudio/_ext/th_sox/_th_sox.py to _th_sox.cpython-36.pyc
creating build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/PKG-INFO -> build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/SOURCES.txt -> build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/dependency_links.txt -> build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/requires.txt -> build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/top_level.txt -> build/bdist.macosx-10.9-x86_64/egg/EGG-INFO
writing build/bdist.macosx-10.9-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
torchaudio._ext.th_sox.__pycache__._th_sox.cpython-36: module references __file__
creating dist
creating 'dist/torchaudio-0.1-py3.6-macosx-10.9-x86_64.egg' and adding 'build/bdist.macosx-10.9-x86_64/egg' to it
removing 'build/bdist.macosx-10.9-x86_64/egg' (and everything under it)
Processing torchaudio-0.1-py3.6-macosx-10.9-x86_64.egg
creating /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torchaudio-0.1-py3.6-macosx-10.9-x86_64.egg
Extracting torchaudio-0.1-py3.6-macosx-10.9-x86_64.egg to /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages
Adding torchaudio 0.1 to easy-install.pth file

Installed /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages/torchaudio-0.1-py3.6-macosx-10.9-x86_64.egg
Processing dependencies for torchaudio==0.1
Searching for cffi==1.11.5
Best match: cffi 1.11.5
Adding cffi 1.11.5 to easy-install.pth file

Using /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages
Searching for pycparser==2.18
Best match: pycparser 2.18
Adding pycparser 2.18 to easy-install.pth file

Using /Users/shawley/anaconda/envs/py36/lib/python3.6/site-packages
Finished processing dependencies for torchaudio==0.1

Further poking around:

$ cat torchaudio/_ext/th_sox/__init__.py 

from torch.utils.ffi import _wrap_function
from ._th_sox import lib as _lib, ffi as _ffi

__all__ = []
def _import_symbols(locals):
    for symbol in dir(_lib):
        fn = getattr(_lib, symbol)
        if callable(fn):
            locals[symbol] = _wrap_function(fn, _ffi)
        else:
            locals[symbol] = fn
        __all__.append(symbol)

_import_symbols(locals())

...I don't see how this is supposed to import other modules from the th_sox directory if there's nothing else in there:

$ ls torchaudio/_ext/th_sox/
__init__.py      __pycache__/
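One thing worth checking: the traceback above shows the import resolving to /Users/shawley/Downloads/audio/torchaudio, the cloned source tree, while the install log copies _th_sox.abi3.so into the egg under site-packages. This may mean Python was started from inside the clone and picked up the source directory first. The sketch below shows one way to check where a package resolves from. The helper names locate_package and resolves_under are hypothetical, and json is used only as a stand-in package.

```python
import importlib.util
import os


def locate_package(name):
    """Return the directory (or file) a module name resolves to, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose their directory; plain modules expose a file path.
    if spec.submodule_search_locations:
        return os.path.abspath(list(spec.submodule_search_locations)[0])
    return spec.origin


def resolves_under(name, directory):
    """True if the module resolves to a location inside `directory`."""
    location = locate_package(name)
    if location is None:
        return False
    directory = os.path.abspath(directory)
    return os.path.commonpath([location, directory]) == directory


# Example with a standard-library package; substitute the package under test.
print(locate_package("json"))
print(resolves_under("json", os.getcwd()))
```

If the package under test resolves under the current working directory rather than site-packages, starting Python from a different directory is a quick way to test this hypothesis.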

Can't compile on Windows (MinGW)

I tried installing torchaudio with the command python setup.py install using the MinGW64 gcc compiler and got this error:

running install
running bdist_egg
running egg_info
writing torchaudio.egg-info\PKG-INFO
writing dependency_links to torchaudio.egg-info\dependency_links.txt
writing requirements to torchaudio.egg-info\requires.txt
writing top-level names to torchaudio.egg-info\top_level.txt
reading manifest file 'torchaudio.egg-info\SOURCES.txt'
writing manifest file 'torchaudio.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_py
copying torchaudio\_ext\th_sox\__init__.py -> build\lib.win-amd64-3.6\torchaudio\_ext\th_sox
running build_ext
generating cffi module 'build\\temp.win-amd64-3.6\\Release\\torchaudio._ext.th_sox._th_sox.c'
already up-to-date
building 'torchaudio._ext.th_sox._th_sox' extension
C:\Program Files\mingw-w64\x86_64-7.3.0-win32-seh-rt_v5-rev0\mingw64\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\lib\site-packages\torch\utils\ffi\..\..\lib\include -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\lib\site-packages\torch\utils\ffi\..\..\lib\include\TH -Itorchaudio/src -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\include -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\include -c build\temp.win-amd64-3.6\Release\torchaudio._ext.th_sox._th_sox.c -o build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\torchaudio._ext.th_sox._th_sox.o
C:\Program Files\mingw-w64\x86_64-7.3.0-win32-seh-rt_v5-rev0\mingw64\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\lib\site-packages\torch\utils\ffi\..\..\lib\include -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\lib\site-packages\torch\utils\ffi\..\..\lib\include\TH -Itorchaudio/src -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\include -IC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\include -c C:\Users\n15hugr\Documents\code\audio-master\torchaudio/src/th_sox.c -o c:\users\n15hugr\documents\code\audio-master\torchaudio\src\th_sox.o
writing build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\_th_sox.cp36-win_amd64.def
C:\Program Files\mingw-w64\x86_64-7.3.0-win32-seh-rt_v5-rev0\mingw64\bin\gcc.exe -shared -s build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\torchaudio._ext.th_sox._th_sox.o c:\users\n15hugr\documents\code\audio-master\torchaudio\src\th_sox.o build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\_th_sox.cp36-win_amd64.def -LC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\lib\site-packages\torch\utils\ffi\..\..\lib -LC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\libs -LC:\Users\n15hugr\AppData\Local\Continuum\miniconda3\PCbuild\amd64 -lsox -lpython36 -lmsvcr140 -o build\lib.win-amd64-3.6\torchaudio\_ext\th_sox\_th_sox.cp36-win_amd64.pyd
c:\users\n15hugr\documents\code\audio-master\torchaudio\src\th_sox.o:th_sox.c:(.text+0x0): multiple definition of `log1p'
build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\torchaudio._ext.th_sox._th_sox.o:torchaudio._ext.th_sox._th_sox.c:(.text+0x2e83): first defined here
c:\users\n15hugr\documents\code\audio-master\torchaudio\src\th_sox.o:th_sox.c:(.text+0xf): multiple definition of `log2'
build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\torchaudio._ext.th_sox._th_sox.o:torchaudio._ext.th_sox._th_sox.c:(.text+0x2e92): first defined here
c:\users\n15hugr\documents\code\audio-master\torchaudio\src\th_sox.o:th_sox.c:(.text+0x1e): multiple definition of `expm1'
build\temp.win-amd64-3.6\Release\build\temp.win-amd64-3.6\release\torchaudio._ext.th_sox._th_sox.o:torchaudio._ext.th_sox._th_sox.c:(.text+0x2ea1): first defined here
C:/Program Files/mingw-w64/x86_64-7.3.0-win32-seh-rt_v5-rev0/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/7.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lsox
C:/Program Files/mingw-w64/x86_64-7.3.0-win32-seh-rt_v5-rev0/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/7.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lmsvcr140
collect2.exe: error: ld returned 1 exit status
error: command 'C:\\Program Files\\mingw-w64\\x86_64-7.3.0-win32-seh-rt_v5-rev0\\mingw64\\bin\\gcc.exe' failed with exit status 1

different library for loading audio files

I read that PyTorch doesn't want to add complex number tensor types. A lot of audio processing requires Fourier transforms, particularly for computing mel spectrograms. It would be nice to use a library that already handles a lot of these transformations in numpy. The most popular one seems to be librosa, which uses numpy arrays for easy conversion to torch tensors.

Is there any reason this library uses a cffi extension instead of a library such as librosa?

There are actually a lot more reasons to use librosa rather than the current solution, but it'd take too long to list them. I wouldn't mind making a pull request to switch from the current libsox cffi way of doing things to the librosa/numpy way.

EnvNet model and related transforms

Hey,

I'm implementing EnvNet on the ESC50 dataset with PyTorch. It works great, and I would like to pull it into torchaudio.
So far I'm using the utils from bc_learning. What would be the best way to put these utils in torchaudio? Should I rewrite them in pure PyTorch or leave them in NumPy (as torchvision does with PIL Image in transforms.py)?

error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode

The following error occurred when running 'setup.py'

running install
running bdist_egg
running egg_info
writing requirements to torchaudio.egg-info/requires.txt
writing torchaudio.egg-info/PKG-INFO
writing top-level names to torchaudio.egg-info/top_level.txt
writing dependency_links to torchaudio.egg-info/dependency_links.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying torchaudio/_ext/th_sox/__init__.py -> build/lib.linux-x86_64-3.5/torchaudio/_ext/th_sox
running build_ext
generating cffi module 'build/temp.linux-x86_64-3.5/torchaudio._ext.th_sox._th_sox.c'
already up-to-date
building 'torchaudio._ext.th_sox._th_sox' extension
gcc -pthread -B /home/amust/anaconda3/envs/python35/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Itorchaudio/src -I/home/amust/anaconda3/envs/python35/include/python3.5m -c build/temp.linux-x86_64-3.5/torchaudio._ext.th_sox._th_sox.c -o build/temp.linux-x86_64-3.5/build/temp.linux-x86_64-3.5/torchaudio._ext.th_sox._th_sox.o
In file included from /home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THVector.h:5:0,
                 from /home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/TH.h:12,
                 from build/temp.linux-x86_64-3.5/torchaudio._ext.th_sox._th_sox.c:492:
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h: In function ‘TH_polevl’:
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h:134:3: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
   for (size_t i = 0; i <= len; i++) {
   ^
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h:134:3: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h: In function ‘TH_polevlf’:
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h:142:3: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
   for (size_t i = 0; i <= len; i++) {
   ^
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h: In function ‘TH_trigamma’:
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h:260:3: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
   for (int i = 0; i < 6; ++i) {
   ^
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h: In function ‘TH_trigammaf’:
/home/amust/anaconda3/envs/python35/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH/THMath.h:278:3: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
   for (int i = 0; i < 6; ++i) {
   ^
error: command 'gcc' failed with exit status 1
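The compiler note in the log above suggests passing -std=c99 (or a similar option). One way to try that without editing setup.py is to add the flag through the CFLAGS environment variable. The sketch below assumes the build honors CFLAGS, as distutils-based builds on Unix typically do. The helper name env_with_cflag is hypothetical.

```python
import os


def env_with_cflag(flag, base_env=None):
    """Return a copy of the environment with `flag` appended to CFLAGS."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("CFLAGS", "").split()
    if flag not in existing:
        existing.append(flag)
    env["CFLAGS"] = " ".join(existing)
    return env


env = env_with_cflag("-std=c99", {"CFLAGS": "-O2"})
print(env["CFLAGS"])

# Usage, not executed here (assumes it is run from the repository root):
#   import subprocess, sys
#   subprocess.check_call([sys.executable, "setup.py", "install"],
#                         env=env_with_cflag("-std=c99"))
```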

category audio on discuss.pytorch.org

Hi,
I am just curious why there is no audio category on the PyTorch discussion forum.
There are vision and nlp categories, but no audio.
I feel there is a fairly large number of audio-related posts:
https://discuss.pytorch.org/search?q=audio

@apaszke @soumith what do you think? Can you add one?

Segmentation fault (core dumped)

I'm unable to load any file after a first-time installation. I ran into issue #53, which I somehow resolved by modifying the setup.py file, but I am not sure that is the correct way...
OS: "CentOS Linux release 7.2.1511 (Core)"

[1]: import torchaudio
[2]: torchaudio.load('speech-data/can-you-get-it.wav')

Segmentation fault (core dumped)

MEL2 gives incorrect output for n_mels=80

The MEL2 transform seems to work fine for n_mels=40; however, n_mels=80 gives artifacts.

`n_mels=40`

`n_mels=80`

import torchaudio
from torchaudio import transforms
import matplotlib.pyplot as plt


def spect_loader(path):
   y, sr = torchaudio.load(path, normalization=True)
   # n_mels = 40
   n_mels = 80

   to_melspec = transforms.Compose([
       transforms.LC2CL(),
       transforms.MEL2(sr, n_mels=n_mels)
   ])
   melspec = to_melspec(y)
   return melspec


if __name__ == '__main__':
   mel = spect_loader("00f0204f_nohash_0.wav")
   plt.matshow(mel[0].numpy().T)
   plt.colorbar()
   plt.savefig("mel.png")
   plt.clf()
   print(mel.shape)
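One thing that may be worth checking, though it is not established as the cause here: when triangular mel filters are built from edges rounded down to FFT bins, asking for many mel bands relative to the number of bins can put several edges in the same bin. The sketch below counts filters whose three edges collapse into one bin. It is a generic textbook-style construction written for illustration, not the MEL2 code, and all function names are hypothetical.

```python
import math


def hz_to_mel(freq):
    return 2595.0 * math.log10(1.0 + freq / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bin_edges(sample_rate, n_bins, n_mels):
    """FFT-bin index of each of the n_mels + 2 filter edges, rounded down."""
    f_max = sample_rate / 2.0
    m_max = hz_to_mel(f_max)
    edges = []
    for i in range(n_mels + 2):
        freq = mel_to_hz(m_max * i / (n_mels + 1))
        edges.append(int(math.floor((n_bins - 1) * freq / f_max)))
    return edges


def count_degenerate_filters(sample_rate, n_bins, n_mels):
    """Count filters whose left, center and right edges share one bin."""
    edges = mel_bin_edges(sample_rate, n_bins, n_mels)
    # Edges are non-decreasing, so left == right implies all three are equal.
    return sum(1 for left, right in zip(edges, edges[2:]) if left == right)


for n_mels in (40, 80):
    print(n_mels, count_degenerate_filters(16000, 201, n_mels))
```

The sample rate of 16000 and the 201 bins are assumed values for the demonstration. Substitute the values used by the transform under test.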

weird case w/ torchaudio / yaafe / jupyter crashes kernel, not sure where to start

Hi!

I am getting a super weird bug and I have no idea where to start (so I'm here).

As of right now, this is only happening in the following scenario:

Inside a jupyter notebook,

import torchaudio
import yaafelib
yaafelib.FeaturePlan(sample_rate=22000)

I get an invalid pointer. Partial trace at bottom.

If I reverse the import order:

import yaafelib
import torchaudio
yaafelib.FeaturePlan(sample_rate=22000)

It doesn't happen.

If I don't run in a jupyter notebook, it doesn't happen.

Partial stack trace (I can post the rest; it's just long):

*** Error in `/home/brian/anaconda3/envs/py3torch/bin/python': free(): invalid pointer: 0x00007f99ace84ae0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f99d3cd67e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f99d3cdf37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f99d3ce353c]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6(_ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE8overflowEi+0x160)[0x7f99cd4a77e0]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6(_ZNSt15basic_streambufIcSt11char_traitsIcEE6xsputnEPKcl+0x89)[0x7f99cd4fa759]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/site-packages/torch/lib/libshm.so(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE15_M_insert_floatIdEES3_S3_RSt8ios_baseccT_+0x26c)[0x7f99acc234fc]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/site-packages/torch/lib/libshm.so(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE6do_putES3_RSt8ios_basecd+0x10)[0x7f99acc23600]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/site-packages/torch/lib/libshm.so(_ZNSo9_M_insertIdEERSoT_+0xb5)[0x7f99acbfa125]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/lib-dynload/../.././libyaafe-core.so.0(_ZN5YAAFE16ComponentFactory7versionEv+0x189)[0x7f997e6050c9]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c)[0x7f99cdc2b550]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(ffi_call+0x1f5)[0x7f99cdc2acf5]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x3dc)[0x7f99cdc2283c]
/home/brian/anaconda3/envs/py3torch/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x9da3)[0x7f99cdc1ada3]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x9e)[0x7f99d4bc9ade]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(+0x1482bb)[0x7f99d4ca62bb]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x26fd)[0x7f99d4ca915d]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(+0x145e74)[0x7f99d4ca3e74]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(+0x1485e8)[0x7f99d4ca65e8]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x26fd)[0x7f99d4ca915d]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(+0x146a60)[0x7f99d4ca4a60]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyFunction_FastCallDict+0x10c)[0x7f99d4ca4cfc]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x166)[0x7f99d4bc9ba6]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(_PyObject_Call_Prepend+0xcc)[0x7f99d4bc9dfc]
/home/brian/anaconda3/envs/py3torch/bin/../lib/libpython3.6m.so.1.0(PyObject_Call+0x56)[0x7f99d4bc9e96]

Unable to install

Hi,

I am unable to install torchaudio. Getting the following error when I try to install. The dependencies mentioned in the installation instructions have been installed. Any idea why this might be happening?

python setup.py install
running install
running bdist_egg
running egg_info
writing torchaudio.egg-info/PKG-INFO
writing dependency_links to torchaudio.egg-info/dependency_links.txt
writing requirements to torchaudio.egg-info/requires.txt
writing top-level names to torchaudio.egg-info/top_level.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying torchaudio/_ext/th_sox/__init__.py -> build/lib.linux-x86_64-3.6/torchaudio/_ext/th_sox
running build_ext
generating cffi module 'build/temp.linux-x86_64-3.6/torchaudio._ext.th_sox._th_sox.c'
already up-to-date
building 'torchaudio._ext.th_sox._th_sox' extension
x86_64-conda_cos6-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -fPIC -I/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -Itorchaudio/src -I/home/paperspace/anaconda3/include/python3.6m -c build/temp.linux-x86_64-3.6/torchaudio._ext.th_sox._th_sox.c -o build/temp.linux-x86_64-3.6/build/temp.linux-x86_64-3.6/torchaudio._ext.th_sox._th_sox.o
unable to execute 'x86_64-conda_cos6-linux-gnu-gcc': No such file or directory
error: command 'x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1

No handler for .mp4 format.

When I try to read an audio track from .mp4 file (Kinetics dataset),
I get the following error:
formats: no handler for file extension 'mp4'
with the full stack trace:

~/bin/miniconda3/envs/pytorch03/lib/python3.6/site-packages/torch/utils/ffi/__init__.py in safe_call(*args, **kwargs)
    178                      for arg in args)
    179         args = (function,) + args
--> 180         result = torch._C._safe_call(*args, **kwargs)
    181         if isinstance(result, ffi.CData):
    182             typeof = ffi.typeof(result)

FatalError: [read_audio_file] Failure to read file at torchaudio/src/generic/th_sox.c:42

I take it it has something to do with my SoX install, but for the life of me can't figure out what I'm missing. Any ideas?
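
One possible workaround, assuming the SoX build really has no mp4 handler (as the message suggests), is to extract the audio track to WAV first and load that file instead. The sketch below only builds an ffmpeg command line; ffmpeg is not mentioned in this report and is an assumption here, and the helper name and file names are hypothetical.

```python
import subprocess


def ffmpeg_extract_cmd(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that writes the audio track as 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-i", video_path,         # input container, e.g. a Kinetics .mp4 clip
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        "-ar", str(sample_rate),  # output sample rate in Hz
        wav_path,
    ]


cmd = ffmpeg_extract_cmd("clip.mp4", "clip.wav")
# Where ffmpeg is installed, the command can be run with:
# subprocess.run(cmd, check=True)
```

The resulting .wav file can then be passed to torchaudio.load as usual.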

Error while importing torchaudio

Hi,
I am getting this error while importing torchaudio. Can someone please help me solve it?
Python 2.7.14 |Anaconda custom (64-bit)| (default, Oct 16 2017, 17:29:19)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import torchaudio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/torchaudio/__init__.py", line 11, in <module>
    from torchaudio import transforms
  File "build/bdist.linux-x86_64/egg/torchaudio/transforms.py", line 152, in <module>
    class SPECTROGRAM(object):
  File "build/bdist.linux-x86_64/egg/torchaudio/transforms.py", line 166, in SPECTROGRAM
    pad=0, window=torch.hann_window, wkwargs=None):
AttributeError: 'module' object has no attribute 'hann_window'

Music Information Retrieval Evaluation Exchange (MIREX) datasets

Each year the Music Information Retrieval Evaluation Exchange (MIREX) sponsors a number of noteworthy competitions for problems like chord, key change, and tempo estimation. Many of the competitions have longstanding (sometimes 10+ years) training and test sets. If torchaudio provided these datasets in a standard format (similar to VCTK or something like PASCAL from torchvision), PyTorch would become an invaluable toolkit for researchers working on these types of problems.

Need API for saving to file.

Currently we only have a load function. After training a network, it would be great to be able to save the generated tensor to a file.

@soumith I think we can reuse a lot of code from this repo https://github.com/MattVitelli/GRUV

Making Spectrogram as a layer

Having the spectrogram transform be a layer of a custom model has big benefits (GPU computation, no need to store the transforms, etc.). This is similar to kapre (more in this paper). A few modifications to the SPECTROGRAM class would allow such usage (subclassing nn.Module, using nn.Parameter instead of autograd.Variable for the window, etc.). Thoughts?
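
A minimal sketch of what such a layer could look like, assuming a recent PyTorch in which torch.stft accepts return_complex=True. The class name SpectrogramLayer and its defaults are hypothetical, and this is not the library's SPECTROGRAM class. The window is registered as a buffer rather than the nn.Parameter suggested above, so it moves with the module but is not trained; wrapping it in nn.Parameter instead would make it learnable.

```python
import torch
import torch.nn as nn


class SpectrogramLayer(nn.Module):
    """Hypothetical layer: power spectrogram computed inside the model."""

    def __init__(self, n_fft=400, hop_length=200):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        # A buffer follows the module across .to(device) calls but is not trained.
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, waveform):
        # waveform: (batch, time) -> (batch, n_fft // 2 + 1, frames)
        spec = torch.stft(
            waveform,
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        )
        return spec.abs().pow(2)


layer = SpectrogramLayer()
batch = torch.randn(2, 16000)  # two one-second clips at 16 kHz (made-up data)
out = layer(batch)
```

Because the transform is an ordinary module, calling layer.to("cuda") moves the window along with the rest of the model.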

python setup.py install failed

running install
running bdist_egg
running egg_info
writing torchaudio.egg-info/PKG-INFO
writing top-level names to torchaudio.egg-info/top_level.txt
writing dependency_links to torchaudio.egg-info/dependency_links.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
/home/rajeev/environments/speech/local/lib/python2.7/site-packages/torch/utils/cpp_extension.py:118: UserWarning:

                           !! WARNING !!

                          !! WARNING !!

warnings.warn(ABI_INCOMPATIBILITY_WARNING.format(compiler))
building '_torch_sox' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/home/rajeev/environments/speech/local/lib/python2.7/site-packages/torch/lib/include -I/home/rajeev/environments/speech/local/lib/python2.7/site-packages/torch/lib/include/TH -I/home/rajeev/environments/speech/local/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/include/python2.7 -c torchaudio/torch_sox.cpp -o build/temp.linux-x86_64-2.7/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
torchaudio/torch_sox.cpp:1:29: fatal error: torch/extension.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

The build stops with: fatal error: torch/extension.h: No such file or directory. Any idea how to fix this?

Unable to install on OSX - fatal error: 'atomic' file not found #include <atomic>

$ python setup.py install

running install
running bdist_egg
running egg_info
writing torchaudio.egg-info/PKG-INFO
writing dependency_links to torchaudio.egg-info/dependency_links.txt
writing top-level names to torchaudio.egg-info/top_level.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.7-x86_64/egg
running install_lib
running build_py
running build_ext
building '_torch_sox' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda3/include -arch x86_64 -I/anaconda3/include -arch x86_64 -I/anaconda3/lib/python3.6/site-packages/torch/lib/include -I/anaconda3/lib/python3.6/site-packages/torch/lib/include/TH -I/anaconda3/lib/python3.6/site-packages/torch/lib/include/THC -I/anaconda3/include/python3.6m -c torchaudio/torch_sox.cpp -o build/temp.macosx-10.7-x86_64-3.6/torchaudio/torch_sox.o -DTORCH_EXTENSION_NAME=_torch_sox -std=c++11
In file included from torchaudio/torch_sox.cpp:1:
In file included from /anaconda3/lib/python3.6/site-packages/torch/lib/include/torch/torch.h:5:
In file included from /anaconda3/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:5:
In file included from /anaconda3/lib/python3.6/site-packages/torch/lib/include/ATen/Allocator.h:6:
/anaconda3/lib/python3.6/site-packages/torch/lib/include/ATen/Retainable.h:3:10: fatal error: 'atomic' file not found
#include <atomic>
         ^~~~~~~~
1 error generated.
error: command 'gcc' failed with exit status 1

This is after installing sox and cloning the repo. Is there something I am missing?
I also tried upgrading my g++ to g++-8, but with no luck.

Any help appreciated!

DownmixMono channels_first wrong default value

While the docs state that the default value for channels_first is True, it appears to be set to None instead, as shown here.

As a result, the default channel dimension becomes dim 1:

channels_first = None
ch_dim = int(not channels_first)
print(ch_dim)
> 1

Reported in the forum here.

If you want, I could create a fast fix using channels_first=True as the default value or update the docs instead.

Best,
ptrblck
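
The first option proposed above could be sketched as follows. The function channel_dim is a hypothetical stand-in for the relevant lines of DownmixMono, not the library's code; it treats None as "not specified" and falls back to the documented default.

```python
def channel_dim(channels_first=True):
    """Return the index of the channel dimension (0 or 1)."""
    # None means "not specified", so fall back to the documented default.
    if channels_first is None:
        channels_first = True
    return int(not channels_first)


default_dim = channel_dim()         # documented default: channels first
none_dim = channel_dim(None)        # no longer silently selects dim 1
explicit_dim = channel_dim(False)   # caller explicitly asks for channels last
```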

On test example -- TypeError: initializer for ctype 'char *' must be a bytes or list or tuple, not str

For the test script:

import torchaudio
y, sr = torchaudio.load("test/wave_000000_lib.wav")
print(y, sr)
torchaudio.save("test/wave_000000_torch.wav", y, sr)

I get the output:

-1.5139e+07
-1.6122e+07
-1.7498e+07
     ⋮      
 1.7695e+06
 1.7039e+06
 9.8304e+05
[torch.FloatTensor of size 38400x1]
 22000
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-74b7b84c18a8> in <module>()
      2 y, sr = torchaudio.load("test/wave_000000_lib.wav")
      3 print(y, sr)
----> 4 torchaudio.save("test/wave_000000_torch.wav", y, sr)

/usr/local/miniconda3/lib/python3.6/site-packages/torchaudio-0.1-py3.6-macosx-10.7-x86_64.egg/torchaudio/__init__.py in save(filepath, src, sample_rate)
     38     func = getattr(th_sox, 'libthsox_{}_write_audio_file'.format(typename))
     39 
---> 40     func(bytes(filepath, "ascii"), src, extension[1:], sample_rate)

/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/__init__.py in safe_call(*args, **kwargs)
    175                      for arg in args)
    176         args = (function,) + args
--> 177         result = torch._C._safe_call(*args, **kwargs)
    178         if isinstance(result, ffi.CData):
    179             typeof = ffi.typeof(result)

TypeError: initializer for ctype 'char *' must be a bytes or list or tuple, not str

Any idea what's going on?

For reference, the .wav file is saved in float32 format; it seems that sox is configured to read it as int16, hence the large numbers. I don't think that is the problem, though; the string passed in the save method looks like the more likely cause.
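
The traceback points at line 40 of save, where filepath is converted to bytes but extension[1:] is passed on as a str. Assuming that argument is what triggers the 'char *' error, a sketch of encoding both values might look like this (ffi_string_args is a hypothetical helper, not part of torchaudio):

```python
import os


def ffi_string_args(filepath):
    """Return (path, extension) as bytes, as a cffi 'char *' parameter expects."""
    _, extension = os.path.splitext(filepath)
    # extension includes the leading dot, so drop it before encoding.
    return filepath.encode("ascii"), extension[1:].encode("ascii")


path_arg, ext_arg = ffi_string_args("test/wave_000000_torch.wav")
```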

trouble while installing torchaudio

Hi, I am having trouble installing torchaudio. Can someone please tell me how to install it?

Best way to calculate delta and delta-delta for MelSpectrogram?

Any plans to do this?
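
Until something like this exists in the library, deltas can be computed with the usual regression formula: delta[t] is the sum over k = 1..N of k * (c[t+k] - c[t-k]), divided by 2 * (1^2 + ... + N^2). The sketch below is plain Python over a list of frames, with edge frames repeated at the boundaries; the function name, the default N = 2 and the padding choice are assumptions, not torchaudio behaviour.

```python
def delta(frames, n=2):
    """Regression deltas over time for a list of feature frames."""
    num_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, n + 1):
                ahead = frames[min(t + k, num_frames - 1)][d]  # clamp at the last frame
                behind = frames[max(t - k, 0)][d]              # clamp at the first frame
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]  # one coefficient per frame
d1 = delta(ramp)   # delta
d2 = delta(d1)     # delta-delta: apply the same operator again
```

Delta-delta features are obtained by applying the same function to the deltas, as in the last line.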

torchaudio-contrib

Hi all, the torchaudio-contrib repo became public just now. We (@faroit, @ksanjeevan and myself @keunwoochoi) have been working on it so far and there are lots of things to discuss.
Everyone's invited :) let's talk!

Trying to load corrupt file segfaults instead of raising exception.

The problem

Trying to load an invalid file segfaults and crashes the entire process instead of raising an exception.

Steps to reproduce

$ touch test.mp3
$ python -c "import torchaudio; torchaudio.load('test.mp3')"
/home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
formats: can't open input file `test.mp3': Invalid argument
[1]    27224 segmentation fault (core dumped)  python -c "import torchaudio; torchaudio.load('test.mp3')"

GDB Backtrace

/home/kpar/.pyenv/versions/3.7.0/envs/foley/lib/python3.7/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
formats: can't open input file `test.mp3': Invalid argument

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffa92e0cb9 in sox_close () from /usr/lib/x86_64-linux-gnu/libsox.so.3
(gdb) bt
#0  0x00007fffa92e0cb9 in sox_close () from /usr/lib/x86_64-linux-gnu/libsox.so.3
#1  0x00007fffa955da1c in torch::audio::(anonymous namespace)::SoxDescriptor::~SoxDescriptor (this=0x7fffffffd368, __in_chrg=<optimized out>) at torchaudio/torch_sox.cpp:21
#2  torch::audio::read_audio_file (file_name=..., output=..., nframes=<optimized out>, offset=<optimized out>) at torchaudio/torch_sox.cpp:83
#3  0x00007fffa956b291 in pybind11::detail::argument_loader<std::string const&, at::Tensor, long, long>::call_impl<int, int (*&)(std::string const&, at::Tensor, long, long), 0ul, 1ul, 2ul, 3ul, pybind11::detail::void_type> (f=<optimized out>, this=0x7fffffffd3c0) at /home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/torch/lib/include/pybind11/cast.h:1919
#4  pybind11::detail::argument_loader<std::string const&, at::Tensor, long, long>::call<int, pybind11::detail::void_type, int (*&)(std::string const&, at::Tensor, long, long)>(int (*&)(std::string const&, at::Tensor, long, long)) && (f=<optimized out>, this=<optimized out>) at /home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/torch/lib/include/pybind11/cast.h:1896
#5  void pybind11::cpp_function::initialize<int (*&)(std::string const&, at::Tensor, long, long), int, std::string const&, at::Tensor, long, long, pybind11::name, pybind11::scope, pybind11::sibling, char [34]>(int (*&)(std::string const&, at::Tensor, long, long), int (*)(std::string const&, at::Tensor, long, long), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [34])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (call=..., __closure=0x0)
    at /home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/torch/lib/include/pybind11/pybind11.h:154
#6  void pybind11::cpp_function::initialize<int (*&)(std::string const&, at::Tensor, long, long), int, std::string const&, at::Tensor, long, long, pybind11::name, pybind11::scope, pybind11::sibling, char [34]>(int (*&)(std::string const&, at::Tensor, long, long), int (*)(std::string const&, at::Tensor, long, long), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [34])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) () at /home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/torch/lib/include/pybind11/pybind11.h:132
#7  0x00007fffa9568875 in pybind11::cpp_function::dispatcher (self=self@entry=0x7ffff7e44990, args_in=args_in@entry=0x7ffff7f04278, kwargs_in=kwargs_in@entry=0x0)
    at /home/kpar/.pyenv/versions/foley/lib/python3.7/site-packages/torch/lib/include/pybind11/pybind11.h:619
#8  0x00005555555c9f57 in _PyMethodDef_RawFastCallKeywords (kwnames=0x0, nargs=140737352321424, args=0x4574c92a0000000c, self=0x7ffff7e44990, method=0x555556aeaef0) at Objects/call.c:690
#9  _PyCFunction_FastCallKeywords (func=func@entry=0x7fffa98ab120, args=args@entry=0x555555b42630, nargs=nargs@entry=4, kwnames=kwnames@entry=0x0) at Objects/call.c:730
#10 0x00005555555b6c4e in call_function (kwnames=0x0, oparg=4, pp_stack=<synthetic pointer>) at Python/ceval.c:4547
#11 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3086
#12 0x0000555555686e0d in PyEval_EvalFrameEx (throwflag=0, f=0x555555b42488) at Python/ceval.c:547
#13 _PyEval_EvalCodeWithName (_co=0x7ffff7e55300, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x0, kwargs=0x7ffff7f86b78, kwcount=0, kwstep=1,
    defs=0x7ffff7e63ce0, defcount=4, kwdefs=0x0, closure=0x0, name=0x7ffff7f43538, qualname=0x7ffff7f43538) at Python/ceval.c:3923
#14 0x00005555555c9386 in _PyFunction_FastCallKeywords (func=<optimized out>, stack=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at Objects/call.c:433
#15 0x00005555555b6d6b in call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>) at Python/ceval.c:4586
#16 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3086
#17 0x0000555555686e0d in PyEval_EvalFrameEx (throwflag=0, f=0x7ffff7f869f8) at Python/ceval.c:547
#18 _PyEval_EvalCodeWithName (_co=_co@entry=0x7ffff7f136f0, globals=globals@entry=0x7ffff7eda0d8, locals=locals@entry=0x7ffff7eda0d8, args=args@entry=0x0, argcount=argcount@entry=0,
    kwnames=kwnames@entry=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:3923
#19 0x0000555555686f43 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, locals=locals@entry=0x7ffff7eda0d8,
    globals=globals@entry=0x7ffff7eda0d8, _co=_co@entry=0x7ffff7f136f0) at Python/ceval.c:3952
#20 PyEval_EvalCode (co=co@entry=0x7ffff7f136f0, globals=globals@entry=0x7ffff7eda0d8, locals=locals@entry=0x7ffff7eda0d8) at Python/ceval.c:524
#21 0x00005555556c021e in run_mod (arena=0x7ffff7fd4078, flags=0x7ffff7e4eed0, locals=0x7ffff7eda0d8, globals=0x7ffff7eda0d8, filename=0x7ffff7f14e30, mod=<optimized out>) at Python/pythonrun.c:1035
#22 PyRun_StringFlags (flags=0x7ffff7e4eed0, locals=0x7ffff7eda0d8, globals=0x7ffff7eda0d8, start=257, str=0x7ffff7f040a0 "import torchaudio; torchaudio.load('test.mp3')\n") at Python/pythonrun.c:959
#23 PyRun_SimpleStringFlags (command=0x7ffff7f040a0 "import torchaudio; torchaudio.load('test.mp3')\n", flags=flags@entry=0x7fffffffdca0) at Python/pythonrun.c:455
#24 0x00005555555bb250 in pymain_run_command (cf=0x7fffffffdca0, command=<optimized out>) at Modules/main.c:383
#25 pymain_run_python (pymain=0x7fffffffddf0) at Modules/main.c:2514
#26 pymain_main (pymain=pymain@entry=0x7fffffffddf0) at Modules/main.c:2662
#27 0x00005555555bcccb in _Py_UnixMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:2697
#28 0x00007ffff7041b97 in __libc_start_main (main=0x5555555ae620 <main>, argc=3, argv=0x7fffffffe028, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe018)
    at ../csu/libc-start.c:310
#29 0x00005555555b825a in _start ()
(gdb)
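
The backtrace shows the crash inside sox_close, called from the SoxDescriptor destructor, so a real fix presumably belongs in torch_sox.cpp. As a stopgap on the Python side, the empty-file case from the reproduction can be turned into an ordinary exception before the file reaches torchaudio.load. The helper below is a hypothetical sketch and does not detect other kinds of corruption.

```python
import os


def check_loadable(path):
    """Raise a Python exception for inputs that are missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    if os.path.getsize(path) == 0:
        raise ValueError("empty audio file: {}".format(path))
    return path
```

Typical use would be torchaudio.load(check_loadable('test.mp3')), which for the file created by touch raises ValueError instead of crashing the process.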

len(audios)==1000

The download code crashes in VCTK.

Exception at 'import torchaudio'

Hi,
I am trying to use torchaudio on Ubuntu and get the following exception:

>>> import torchaudio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "torchaudio/__init__.py", line 5, in <module>
    from ._ext import th_sox
  File "torchaudio/_ext/th_sox/__init__.py", line 3, in <module>
    from ._th_sox import lib as _lib, ffi as _ffi
ImportError: No module named _th_sox

All dependencies are met and up-to-date:

libsox-dev is already the newest version.
libsox-fmt-all is already the newest version.
sox is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 196 not upgraded.

Also, install process seems ok, no errors or warnings:

running install
running bdist_egg
running egg_info
writing requirements to torchaudio.egg-info/requires.txt
writing torchaudio.egg-info/PKG-INFO
writing top-level names to torchaudio.egg-info/top_level.txt
writing dependency_links to torchaudio.egg-info/dependency_links.txt
reading manifest file 'torchaudio.egg-info/SOURCES.txt'
writing manifest file 'torchaudio.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying torchaudio/_ext/th_sox/__init__.py -> build/lib.linux-x86_64-2.7/torchaudio/_ext/th_sox
running build_ext
generating cffi module 'build/temp.linux-x86_64-2.7/torchaudio._ext._th_sox.c'
already up-to-date
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/torchaudio
creating build/bdist.linux-x86_64/egg/torchaudio/_ext
creating build/bdist.linux-x86_64/egg/torchaudio/_ext/th_sox
copying build/lib.linux-x86_64-2.7/torchaudio/_ext/th_sox/__init__.py -> build/bdist.linux-x86_64/egg/torchaudio/_ext/th_sox
copying build/lib.linux-x86_64-2.7/torchaudio/_ext/_th_sox.so -> build/bdist.linux-x86_64/egg/torchaudio/_ext
copying build/lib.linux-x86_64-2.7/torchaudio/_ext/__init__.py -> build/bdist.linux-x86_64/egg/torchaudio/_ext
copying build/lib.linux-x86_64-2.7/torchaudio/__init__.py -> build/bdist.linux-x86_64/egg/torchaudio
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/_ext/th_sox/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/_ext/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/__init__.py to __init__.pyc
creating stub loader for torchaudio/_ext/_th_sox.so
byte-compiling build/bdist.linux-x86_64/egg/torchaudio/_ext/_th_sox.py to _th_sox.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying torchaudio.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
creating 'dist/torchaudio-0.1-py2.7-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing torchaudio-0.1-py2.7-linux-x86_64.egg
Removing /usr/local/lib/python2.7/dist-packages/torchaudio-0.1-py2.7-linux-x86_64.egg
Copying torchaudio-0.1-py2.7-linux-x86_64.egg to /usr/local/lib/python2.7/dist-packages
torchaudio 0.1 is already the active version in easy-install.pth

Installed /usr/local/lib/python2.7/dist-packages/torchaudio-0.1-py2.7-linux-x86_64.egg
Processing dependencies for torchaudio==0.1
Searching for cffi==1.10.0
Best match: cffi 1.10.0
Adding cffi 1.10.0 to easy-install.pth file

Using /usr/local/lib/python2.7/dist-packages
Searching for pycparser==2.17
Best match: pycparser 2.17
Adding pycparser 2.17 to easy-install.pth file

Using /usr/local/lib/python2.7/dist-packages
Finished processing dependencies for torchaudio==0.1

Any idea what went wrong?

Thanks!

"fatal error: sox.h: No such file or directory" during installation

When I try to install the module on "CentOS Linux release 7.2.1511 (Core)", I get the above error. I have already run yum install sox and yum install sox-devel, and I am not sure what else is missing.

Windows support

Now that PyTorch 0.4.0 supports Windows, does torchaudio support Windows as well? If so, could installation instructions for the Windows dependencies be added to the README? Thanks.

Python 3: Last Commit BREAKS usage of DeepSpeech.PyTorch with PyTorch Audio

The last commit BREAKS using PyTorch Audio with Python 3.

f80d6e3

Traceback (most recent call last):
  File "train.py", line 381, in <module>
    main()
  File "train.py", line 210, in main
    for i, (data) in enumerate(train_loader, start=start_iter):
  File "/home/dlm/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 212, in __next__
    return self._process_next_batch(batch)
  File "/home/dlm/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 239, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/dlm/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 41, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dlm/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 41, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dlm/code/deepspeech.pytorch/data/data_loader.py", line 159, in __getitem__
    spect = self.parse_audio(audio_path)
  File "/home/dlm/code/deepspeech.pytorch/data/data_loader.py", line 106, in parse_audio
    y = load_audio(audio_path)
  File "/home/dlm/code/deepspeech.pytorch/data/data_loader.py", line 17, in load_audio
    sound, _ = torchaudio.load(path.encode('utf-8'))  # py3 fix
  File "/home/dlm/anaconda3/lib/python3.6/site-packages/torchaudio-0.1-py3.6-linux-x86_64.egg/torchaudio/__init__.py", line 26, in load
    func(bytes(filename, "ascii"), out, sample_rate_p)
TypeError: encoding without a string argument
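
The final frame shows bytes(filename, "ascii") being called on a value that the caller had already encoded with path.encode('utf-8'); bytes() rejects an encoding argument when its input is not a str. A sketch of a conversion that accepts both types (as_ascii_bytes is a hypothetical helper, not torchaudio code):

```python
def as_ascii_bytes(filename):
    """Accept either str or bytes and return bytes for the C call."""
    if isinstance(filename, bytes):
        return filename              # already encoded by the caller
    return filename.encode("ascii")


from_str = as_ascii_bytes("audio.wav")
from_bytes = as_ascii_bytes("audio.wav".encode("utf-8"))

# The failing call from the traceback, reproduced in isolation:
try:
    bytes(b"audio.wav", "ascii")
    failed = False
except TypeError:
    failed = True
```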

transforms.py: hann_window is not included in torch

I got the following error when running transforms.py:
AttributeError: module 'torch' has no attribute 'hann_window'
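
The error suggests that the installed torch predates torch.hann_window; hasattr(torch, "hann_window") is a quick way to check. For reference, the window itself is simple to write out. The sketch below is a plain-Python stand-in that returns a list of floats rather than a tensor, with periodic=True as the default to mirror torch.hann_window; it is an illustration, not a drop-in replacement.

```python
import math


def hann_window(n, periodic=True):
    """Hann window of length n as a list of floats."""
    if n == 1:
        return [1.0]
    # A periodic window divides by n; a symmetric one divides by n - 1.
    denom = n if periodic else n - 1
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / denom) for i in range(n)]


periodic_4 = hann_window(4)
symmetric_3 = hann_window(3, periodic=False)
```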

libGOMP not found

I had an error similar to that reported here on trying to load torchaudio: pytorch/pytorch#643

I found that there were two independent things that would each lead to the error. One was compiling with an older version of gcc (4.8.something instead of 5.4.0). The other was importing torch before torchaudio.

cannot download libsox-dev and libsox-fmt-all

I work on a network of Linux clusters and cannot use sudo, so this command is not an option: sudo apt-get install sox libsox-dev libsox-fmt-all
How else can I install libsox-dev and libsox-fmt-all?
I have tried everything I could think of, but I keep getting an error while building the '_torch_sox' extension.
Any help would be appreciated.

tag for torch==0.3.X

I'm trying to install a repo that still uses torchaudio as a dependency and since the new pytorch 0.4 update, it's not possible to do a clean install from a new user point of view.

I guess I won't be the only one having this problem. Is it possible to make a tag on the last commit compatible with torch v0.3?

Thanks,
Miguel

No module named '_torch_sox'

In __init__.py, no _torch_sox module can be found, even though setup.py ran successfully.