rabitt / pysox

Python wrapper around sox.

License: BSD 3-Clause "New" or "Revised" License

pysox's Introduction

pysox

Python wrapper around sox. Read the Docs here.

This library was presented in the following paper:

R. M. Bittner, E. J. Humphrey and J. P. Bello, "pysox: Leveraging the Audio Signal Processing Power of SoX in Python", in Proceedings of the 17th International Society for Music Information Retrieval Conference Late Breaking and Demo Papers, New York City, USA, Aug. 2016.

Install

This requires that SoX version 14.4.2 or higher is installed.

To install SoX on Mac with Homebrew:

brew install sox

If you want support for mp3, flac, or ogg files, add the following flags:

brew install sox --with-lame --with-flac --with-libvorbis

On Linux:

apt-get install sox

or install from source.

To install the most up-to-date release of this module via PyPI:

pip install sox

To install the master branch:

pip install git+https://github.com/rabitt/pysox.git

or clone the repository and install from source:

git clone https://github.com/rabitt/pysox.git
cd pysox
python setup.py install

Tests

If you have a different version of SoX installed, it's recommended that you run the tests locally to make sure everything behaves as expected, by simply running:

pytest

Examples

import sox
# create transformer
tfm = sox.Transformer()
# trim the audio between 5 and 10.5 seconds.
tfm.trim(5, 10.5)
# apply compression
tfm.compand()
# apply a fade in and fade out
tfm.fade(fade_in_len=1.0, fade_out_len=0.5)
# create an output file.
tfm.build_file('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# or equivalently using the legacy API
tfm.build('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# get the output in-memory as a numpy array
# by default the sample rate will be the same as the input file
array_out = tfm.build_array(input_filepath='path/to/input_audio.wav')
# see the applied effects
tfm.effects_log
# > ['trim', 'compand', 'fade']

Transform in-memory arrays:

import numpy as np
import sox
# sample rate in Hz
sample_rate = 44100
# generate a 1-second sine tone at 440 Hz
y = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate * 1.0) / sample_rate)
# create a transformer
tfm = sox.Transformer()
# shift the pitch up by 2 semitones
tfm.pitch(2)
# transform an in-memory array and return an array
y_out = tfm.build_array(input_array=y, sample_rate_in=sample_rate)
# instead, save output to a file
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output.wav'
)
# create an output file with a different sample rate
tfm.set_output_format(rate=8000)
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output_8k.wav'
)

Concatenate 3 audio files:

import sox
# create combiner
cbn = sox.Combiner()
# pitch shift combined audio up 3 semitones
cbn.pitch(3.0)
# convert output to 8000 Hz stereo
cbn.convert(samplerate=8000, n_channels=2)
# create the output file
cbn.build(
    ['input1.wav', 'input2.wav', 'input3.wav'], 'output.wav', 'concatenate'
)
# the combiner does not currently support array input/output

Get file information:

import sox
# get the sample rate
sample_rate = sox.file_info.sample_rate('path/to/file.mp3')
# get the number of samples
n_samples = sox.file_info.num_samples('path/to/file.wav')
# determine if a file is silent
is_silent = sox.file_info.silent('path/to/file.aiff')
# file info doesn't currently support array input

pysox's Issues

Support using same file path for input & output

SoX doesn't support using the same file as both input and output; doing so results in an empty, invalid audio file. While this is SoX behavior and not pysox behavior, it would be nice if pysox took care of it behind the scenes. Right now users have to handle this logic themselves, e.g. like this:

import tempfile
import shutil
import sox
from scaper.util import _close_temp_files

audio_infile = '/Users/justin/Downloads/trimtest.wav'
audio_outfile = '/Users/justin/Downloads/trimtest.wav'
start_time = 2
end_time = 3

tfm = sox.Transformer()
tfm.trim(start_time, end_time)
if audio_outfile != audio_infile:
    tfm.build(audio_infile, audio_outfile)
else:
    # must use temp file in order to save to same file
    tmpfiles = []
    with _close_temp_files(tmpfiles):
        # Create tmp file
        tmpfiles.append(
            tempfile.NamedTemporaryFile(
                suffix='.wav', delete=True))
        # Save trimmed result to temp file
        tfm.build(audio_infile, tmpfiles[-1].name)
        # Copy result back to original file
        shutil.copyfile(tmpfiles[-1].name, audio_outfile)

Pysox does issue a warning when a file is about to be overwritten, which is even more confusing under this scenario since the user (who might be unfamiliar with the quirks of sox) has no reason to think that the overwritten file will be invalid.
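
A rough sketch of how this bookkeeping could be wrapped up is shown below. It is only an illustration under stated assumptions: `build_in_place` is a hypothetical helper, not part of pysox, and `build_fn` stands in for any callable shaped like `tfm.build(input_filepath, output_filepath)`.

```python
import os
import shutil
import tempfile


def build_in_place(build_fn, audio_path):
    """Run build_fn(input_path, output_path) so the result replaces audio_path.

    Hypothetical helper, not a pysox function: build_fn is any callable
    that reads the first path and writes the second.
    """
    # Keep the original extension so the output format can be inferred.
    suffix = os.path.splitext(audio_path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        # Build into the temp file, then copy the result over the original.
        build_fn(audio_path, tmp_path)
        shutil.copyfile(tmp_path, audio_path)
    finally:
        # Clean up the temp file whether or not the build succeeded.
        os.remove(tmp_path)
```

With a transformer configured as above, the call would look like `build_in_place(tfm.build, audio_infile)`. If the build fails, the original file is left untouched because the copy only happens after `build_fn` returns.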

soxi inaccessible from Windows command line

Dear pysox people,
My BirdVox intern (Elizabeth Mendoza) was trying to use scaper on her Windows machine today and scaper produced an error while calling
source_duration = sox.file_info.duration(source_file)

At the back of the trace was something like
Command 'soxi -D "C:\Users\User\path\to\file.wav"' returned non-zero exit status 1

Meanwhile, running the pysox demo code worked perfectly.
Therefore, pysox could read and write data (using the sox command) but not read metadata (using the soxi command).

We realized that this was because neither of the two was in the Windows command prompt path.
My hacky solution was to:

manually add sox to path (in My Computer / Properties / ...)
duplicate the sox.exe on the same folder and call it soxi.exe
restart python, so that the %PATH% global variable would be updated

It would be good to document this problem or offer a more elegant fix to Windows users.
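
As a possible diagnostic for this situation, the sketch below checks whether both executables can be found on the PATH before any pysox call is made. The helper name `find_sox_binaries` is made up for illustration; `shutil.which` is the standard-library lookup it relies on.

```python
import shutil


def find_sox_binaries(names=("sox", "soxi")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


# Report anything that the command prompt would also fail to find.
missing = [name for name, path in find_sox_binaries().items() if path is None]
if missing:
    print("Not found on PATH: " + ", ".join(missing))
```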

@justinsalamon asked to be tagged on this issue, so here it is: Justin, you're tagged now :)

Pass error message from SoX to SoxError

When SoxError is raised, pass along the command-line error message from SoX.
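
A minimal sketch of the idea, assuming the command is run through `subprocess`: capture stderr and put it in the exception message. The `SoxError` class here is a local stand-in so the example is self-contained, and `run_command` is a hypothetical name, not the pysox function.

```python
import subprocess


class SoxError(Exception):
    """Local stand-in for pysox's SoxError, defined so the sketch runs alone."""


def run_command(args):
    """Run a command and raise SoxError carrying its stderr on failure."""
    proc = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.returncode != 0:
        stderr = proc.stderr.decode("utf-8", errors="replace").strip()
        raise SoxError(
            "Command failed with exit status {}: {}".format(
                proc.returncode, stderr
            )
        )
    return proc.stdout.decode("utf-8", errors="replace")
```

The caller then sees the tool's own explanation in the traceback instead of a bare exception.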

Add advanced usage section to docs

Include usage of:

noiseprof / noisered
stat
stats
power_spectrum

Simplified submodule wrapper for one-off operations

So here's an idea for discussion...

I like the transform chaining for complex effects; it's a really smart way of optimizing how sox is controlled. However, I don't like how verbose it is for simple one-off operations.

For example, to convert a file currently:

tfm = sox.Transformer()
tfm.convert(
    samplerate=samplerate, 
    n_channels=n_channels,
    bitdepth=bitdepth)
tfm.build(input_file, output_file)

For simple operations, this doesn't feel very pythonic. I'd be curious how folks feel about abstracting this away for trivial / common pipelines:

sox.effects.convert(input_file, output_file, samplerate, n_channels, bitdepth)

This could also roll-up try-catch logic, if one were so inclined (e.g. strict=False could return a non-zero status, etc).

Thoughts?

wav file that breaks pysox, enigmatic error message

This wav file

This code:

import sox

out_sr = 44100
out_channels = 1
out_bitdepth = 16

infile = '/Users/justin/Downloads/Adagio_sz1795.wav'
outfile = '/Users/justin/Downloads/Adagio_sz1795_sox.wav'

tfm = sox.Transformer(infile, outfile)
tfm.convert(samplerate=out_sr, channels=out_channels, bitdepth=out_bitdepth)
tfm.build()

This error:

---------------------------------------------------------------------------
SoxError                                  Traceback (most recent call last)
<ipython-input-22-15231ac67f0e> in <module>()
     10 tfm = sox.Transformer(infile, outfile)
     11 tfm.convert(samplerate=out_sr, channels=out_channels, bitdepth=out_bitdepth)
---> 12 tfm.build()

/usr/local/lib/python2.7/site-packages/sox-1.1-py2.7.egg/sox/transform.pyc in build(self)
    157 
    158         if status is False:
--> 159             raise SoxError
    160         else:
    161             logging.info(

SoxError:

Implement Transformer.stretch()

Implement Transformer.stat() and Transformer.stats()

get statistics about the output of an effects chain

implement Transformer.bend()

Trim allows start time without an end

Trim can trim a file from the beginning with only the start time (e.g. sox input output trim 10) but pysox's trim requires both start and end. Added #81 as a fix.

Getting wav not supported error

I am getting this error
"This install of SoX cannot process .wav files."

I am using the code in AWS lambda and the script looks like this

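import uuid
import sox

def split_channel(temp_file, channel_num):
    # write the extracted channel to a uniquely named file under /tmp
    result_temp_file = '/tmp/{}.wav'.format(uuid.uuid4())
    tfm = sox.Transformer()
    # map output channel 1 to the requested input channel
    remix_dictionary = {1: [channel_num]}
    tfm.remix(remix_dictionary)
    tfm.build(temp_file, result_temp_file)
    return result_temp_file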

Is there some step needed to make SoX work for .wav files?

Combiner.preview() fails

Combiner inherits preview from Transformer but it needs to be overwritten because the base call is different for multiple inputs.

Trying to use the core API play.

Hello,

I'm trying to play music through sox. For this purpose, I'm using the play API available in core.py, but I get the following error: OSError: Play failed! [WinError 2] The system cannot find the file specified

retval = os.getcwd()
INPUT_FILE = os.path.join(retval + '\\BG\\test.wav')
print(INPUT_FILE)
arg_list = ['play', music_file]
handler = core.play(arg_list)

Please provide an example use case for the core.play() API.

Is `file_info.bitrate` correct?

Hello

This function seems to be returning the bit depth (aka bits per sample) not the bit rate.

soxi -b is the bit depth.

soxi -B is the bitrate.

Happy to submit a pull request with a fix and a second function for the bit depth.

Cheers.

Implement Transformer.splice()

Dependency on grep command which is missing on Windows by default

The problem arises in the _get_valid_formats() function in pysox/sox/core.py.

I propose a fix as follows:

def _get_valid_formats():
    ''' Calls SoX help for a list of audio formats available with the current
    install of SoX.
    Returns:
    --------
    formats : list
        List of audio file extensions that SoX can process.
    '''
    if NO_SOX:
        return []

    pfm = platform.system()
    if pfm == 'Windows':
        shell_output = subprocess.check_output('sox -h | findstr /c:"AUDIO FILE FORMATS"', shell=True)
    else:
        shell_output = subprocess.check_output('sox -h | grep "AUDIO FILE FORMATS"', shell=True)

    formats = shell_output.decode('utf8').split()[3:] # this is better because it does not leave whitespace characters in the string
    return formats
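
Another option, sketched below, is to drop the external filter altogether and pick out the line in Python, which avoids both grep and findstr. This assumes the help text contains a line starting with "AUDIO FILE FORMATS:", as the commands above imply. `parse_valid_formats` is a hypothetical name, and the sample text is an illustrative excerpt, not real output.

```python
def parse_valid_formats(help_text):
    """Extract the file extensions listed after 'AUDIO FILE FORMATS:'."""
    for line in help_text.splitlines():
        if line.startswith("AUDIO FILE FORMATS:"):
            # Everything after the first colon is a space-separated list.
            return line.split(":", 1)[1].split()
    return []


# Illustrative excerpt only; real `sox -h` output is longer and varies by build.
sample_help = (
    "SPECIAL FILENAMES (infile, outfile):\n"
    "AUDIO FILE FORMATS: 8svx aif aifc aiff wav\n"
    "PLAYLIST FORMATS: m3u pls\n"
)
print(parse_valid_formats(sample_help))  # ['8svx', 'aif', 'aifc', 'aiff', 'wav']
```

In practice the help text would come from running `sox -h` through `subprocess` and decoding the result, with no shell pipeline involved.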

Implement Transformer.echos()

License badge doesn't match actual license

Badge -> MIT
LICENSE -> BSD 3-Clause "New" or "Revised" license

Implement Transformer.speed()

Rename file_info.stat to file_info.summary

file_info.stat and Transformer.stat do different things and having the same name is confusing. Eventually rename file_info.stat to file_info.summary.

Apply combiner to an input_filepath_list of length 1

Trying to combine a single file (which basically doesn't alter it unless the combiner applies transformations too):

cbn.build([filename], outfile, 'concatenate')

Raises an error: ValueError: input_filepath_list must have at least 2 files.

But there are scenarios where this is useful. For example, in my case I need to concatenate a file to itself if the file is shorter than a certain value, but otherwise leave it unchanged. The number of concatenations N is determined at runtime, so ideally I'd like to call build() like this:

cbn.build([filename] * N, outfile, 'concatenate')

So that if N=1 it basically leaves the file unchanged. This can of course be achieved using an if statement to determine whether I need to use the combiner or not, but it's much clunkier.

@rabitt Is there a particular reason why the combiner can't be called with an input_filepath_list list of length 1?

Return None rather than 0 when file info (bit rate, duration, etc. ) is NA

There are a number of soxi functions that have special return values if the info is unavailable or not applicable. Here's the current behavior of pysox functions in these cases:

bitrate: zero
comments: empty string
duration: zero
num_samples: zero

In an offline conversation with @rabitt, we figured that it would be good to use Python's None constant to express this kind of value. That will break type stability, but on the other hand it will help catch bugs such as
if duration < max_duration
or
if bitrate < 128000

which really should not evaluate to True when these values are NA. In addition, pysox currently raises a warning in those cases, so I suppose the slowdown from losing type stability is negligible compared to the cost of raising the warning.
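
To make the difference concrete, here is a small sketch of a caller-side check. `is_shorter_than` is a hypothetical helper, not a pysox function. With a 0 placeholder the comparison passes silently, whereas None forces the caller to decide what an unknown value means (in Python 3, comparing `None < 10` directly raises a TypeError).

```python
def is_shorter_than(duration, max_duration):
    """Return True only when the duration is known and below the limit."""
    if duration is None:
        # Unknown duration: refuse to guess rather than treat it as 0.
        return False
    return duration < max_duration


print(is_shorter_than(0, 10.0))     # True: a 0 placeholder slips through
print(is_shorter_than(None, 10.0))  # False: None is handled explicitly
```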

Thoughts?

Is the output format automatically applied according to extension specified in build?

I'm trying to use pysox for a django app.

I want to know if I just need to specify the output extension in .build to get the desired output format.

transformer.stat returns dictionary with unclear structure

transformer.stat returns a dictionary. However, the documentation does not specify the structure (i.e., the keys) of that dictionary.

Global logging params overwritten.

In https://github.com/rabitt/pysox/blob/master/sox/__init__.py#L5 you are overwriting the global logger level,
which destroys whatever is configured elsewhere.

Custom loggers/colorlog/ everything is messed up just by importing sox.
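
For reference, the usual library-side pattern is a named logger with a NullHandler and no root configuration at import time. The sketch below illustrates that pattern in isolation. The logger name "sox" and the function `do_work` are assumptions for the example, not a description of the current pysox code.

```python
import logging

# Library side: a named logger with a NullHandler and no basicConfig() call,
# so importing the library leaves the root logger untouched.
# The name "sox" is an assumption chosen for illustration.
logger = logging.getLogger("sox")
logger.addHandler(logging.NullHandler())


def do_work():
    logger.info("running an effects chain")


# Application side: configure logging once, then tune the library separately.
logging.basicConfig(level=logging.INFO)
logging.getLogger("sox").setLevel(logging.WARNING)
do_work()  # emits nothing: INFO is below the "sox" logger's WARNING level
```

With this layout, application code keeps full control of handlers and formatting, and can still quiet or raise the library's verbosity by name.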

Implement Transformer.vol()

Effects chain volume adjustments

Base call to SoX uses shell=True in subprocess call

Filenames with wildcards have always behaved inconsistently with pysox. Looking into this more closely, I found that the SoX command arguments are being passed to the shell rather than to the binary directly.

Thanks @jongwook for finding the source!
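
The difference can be seen without SoX at all. In the sketch below, a tiny Python child process stands in for the sox binary and prints back the argument it received. Passing an argument list with the default `shell=False` means no shell ever sees the wildcard characters.

```python
import subprocess
import sys

# A tiny child process that prints its first argument back, used here in
# place of the sox binary so the sketch runs without SoX installed.
echo_first_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# With an argument list and the default shell=False, no shell is involved,
# so the wildcard and the space reach the child process untouched.
result = subprocess.run(
    echo_first_arg + ["my *file*.wav"], stdout=subprocess.PIPE, check=True
)
received = result.stdout.decode("utf-8").strip()
print(received)
```

The same approach also removes the need to quote file paths by hand, since each list element is delivered as exactly one argument.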

Output file validation breaks when using temp files

Output file validation, and in particular this line, crashes when trying to write to a temp file (created using tempfile). Consequently, pysox crashes when trying to write to a temporary file (e.g. using a Combiner). Not sure which version this was introduced in, since I didn't use to have this problem.

Minimal example:

import os, tempfile
tf = tempfile.NamedTemporaryFile(suffix='.wav', delete=True)
os.path.dirname(tf)

Possible fixes: ensure output file validation supports temp files, or at least allow disabling output file validation.
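
For what the minimal example itself shows: on Python 3, `os.path.dirname` rejects the temp file object but accepts its `.name` attribute, which holds the path string. The sketch below only illustrates that distinction. Whether it matches the failing line in pysox is an assumption.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(suffix=".wav", delete=True) as tf:
    try:
        os.path.dirname(tf)  # the file object itself is not a path
    except TypeError as exc:
        print("dirname(tf) failed:", type(exc).__name__)
    # The .name attribute holds the path string that path functions expect.
    out_dir = os.path.dirname(tf.name)
    print(os.path.isdir(out_dir))
```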

Implement Transformer.synth()

Windows, error on import sox

The line 89 in core.py:
"sox -h | grep 'AUDIO FILE FORMATS'"
results in an error on Windows.

To make it work, change it to:
'sox -h | grep "AUDIO FILE FORMATS"'

ETA on reverb implementation?

Just curious. Need help maybe?

SoX doesn't seem to like scientific notation

Pysox needs to format its floats to not be scientific notation when formatting arguments for SoX, otherwise SoX seems to choke. I received

SoxError: Stdout: Stderr: sox FAIL pad: usage: {length[@position]}

when pysox was calling SoX with these arguments:

['sox', '-D', '-V2', u'../data/raw/scaper_audio/siren_wailing/5_siren_police.wav', '-c', '1', '/var/folders/xv/6nccdc7151j71bhvzlrvjhqm0000gp/T/tmpYCzbyz.wav', 'rate', '-h', '44100', 'trim', '0', '6.501565999', 'fade', 'q', '0.01', 'reverse', 'fade', 'q', '0.01', 'reverse', 'norm', '-7.36679718537', 'pad', '3.498434', '1.00000008274e-09']

Note the '1.00000008274e-09' at the end. When this number was changed to '0.00000000100000008274', this error went away, and when changed to a different sci notation number, e.g. '1.0e-3', the problem persisted.
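
One way to avoid the exponent when building the argument list is sketched below. `format_float` is a hypothetical helper, not the pysox implementation. It goes through `decimal.Decimal` so that the fixed-point string keeps the digits of the float's `repr` without adding binary rounding noise.

```python
from decimal import Decimal


def format_float(value):
    """Render a number in plain fixed-point notation, with no exponent."""
    # repr() gives the shortest string that round-trips the float; Decimal
    # then expands any exponent without adding binary rounding noise.
    return format(Decimal(repr(float(value))), "f")


print(str(1.00000008274e-09))           # Python's default uses an exponent
print(format_float(1.00000008274e-09))  # 0.00000000100000008274
print(format_float(3.498434))           # 3.498434
```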

Use directly in memory, instead of via files, possible?

Would it be possible to pass audio into pysox directly as ndarrays, as well as files from disk? I'm asking because I'm using librosa for onset detection, and thus already have the audio loaded, but would still like to apply some audio effects afterwards.

Like maybe the constructors could do a type check, and build() could return the destination audio or something?

y = librosa.load(path)[0]
tfm = sox.Transformer(y)
# Do stuff...
y = tfm.build()

I realize this adds extra complexity to pysox (like having librosa as a dependency or similar) which is intended to be a clean wrapper around SoX, so I get if it's deemed out of scope. Just asking!

Return transformer and combiner always?

Just a proposal for the API, it would be neat if all methods in Transformer and Combiner return self, so that one could create chains like this:

import sox
tfm = (
    sox.Transformer('path/to/input_audio.wav', 'path/to/output/audio.aiff')
    .trim(5, 10.5)
    .compand()
    .fade(fade_in_len=1.0, fade_out_len=0.5)
    .build()
)

(reference)

Thread-safe?

It doesn't seem like pysox is thread-safe. Could this be fixed?

Combining a single file

Hello,

I'd like to use the -v (Combination) command on a single file. While this may seem like nonsense it's an easy way to change the volume of a file by a given percentage, like this:
sox -v 0.9 in.wav out.wav
Currently this is not possible since validate_input_file_list in the file_info.py raises an error when the list has less than 2 items.

While this helps people avoid mistakes it also prevents some functionality. Do you think it is worth removing?

Implement Transformer.mcompand()

override set_input_format in Combiner

For a combiner object, set_input_format should be rewritten to set formats for each input file.

Error with filepath containing whitespaces when using sox.file_info.info(...)

Hi,
The filepath does not use "enquote_filepath" in the file_info > info() > silent() call.
I created a small pull request to fix that.
Hope that helps.
Thanks!

Use pysox to generate silence?

It seems like it's possible to use sox to generate silence using the -n flag for null input. Is it possible to do this with pysox right now? If not, could it be added?

Implement Transformer.sinc()

Implement Transformer.remix()

Disable warnings when overwriting output file

There are scenarios where you know in advance that you're going to use pysox to overwrite an existing file, e.g. when using temporary files like in this example from #6. In this situation it's inconvenient to have pysox issue a warning for every file overwrite (as there are many), and right now there's no way to disable these warnings in pysox (you can hack it by changing the logging level as in the example provided above, but that's not ideal).

It would be nice to have an optional flag in pysox (or maybe at the transformer/combiner level?) to disable file overwrite warnings (and maybe another option to disable all warnings from pysox?).

sox --i, --info              Behave as soxi(1)

The error:

>>> sox.file_info.duration('c:\\tmp\\test.flac')
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\site-packages\sox\core.py", line 125, in soxi
    shell=True, stderr=subprocess.PIPE
  File "C:\Program Files\Python36\lib\subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "C:\Program Files\Python36\lib\subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'soxi -D c:\tmp\test.flac' returned non-zero exit status 1.

I have tried this patch and it works:

+++ .\test_core.py        Sun Feb 12 13:57:22 2017
--- .\core.py   Sun Feb 12 13:48:26 2017
@@ -115,7 +115,7 @@
     if argument not in SOXI_ARGS:
         raise ValueError("Invalid argument '{}' to Soxi".format(argument))

+    args = ['sox --i']
-    args = ['soxi']
     args.append("-{}".format(argument))
     args.append(enquote_filepath(filepath))

Thanks!

Logging is very noisy

How can I make the logging quieter? I don't see that you are using a named logger so I cannot mute it with my logging config.