pysox's Introduction

pysox

Python wrapper around sox. Read the Docs here.

This library was presented in the following paper:

R. M. Bittner, E. J. Humphrey and J. P. Bello, "pysox: Leveraging the Audio Signal Processing Power of SoX in Python", in Proceedings of the 17th International Society for Music Information Retrieval Conference Late Breaking and Demo Papers, New York City, USA, Aug. 2016.

Install

This requires that SoX version 14.4.2 or higher is installed.

To install SoX on Mac with Homebrew:

brew install sox

If you want support for mp3, flac, or ogg files, add the following flags:

brew install sox --with-lame --with-flac --with-libvorbis

On Linux:

apt-get install sox

or install from source.

To install the most up-to-date release of this module via PyPI:

pip install sox

To install the master branch:

pip install git+https://github.com/rabitt/pysox.git

or

git clone https://github.com/rabitt/pysox.git
cd pysox
python setup.py install

Tests

If you have a different version of SoX installed, it's recommended that you run the tests locally to make sure everything behaves as expected, by simply running:

pytest

Examples

import sox
# create transformer
tfm = sox.Transformer()
# trim the audio between 5 and 10.5 seconds.
tfm.trim(5, 10.5)
# apply compression
tfm.compand()
# apply a fade in and fade out
tfm.fade(fade_in_len=1.0, fade_out_len=0.5)
# create an output file.
tfm.build_file('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# or equivalently using the legacy API
tfm.build('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# get the output in-memory as a numpy array
# by default the sample rate will be the same as the input file
array_out = tfm.build_array(input_filepath='path/to/input_audio.wav')
# see the applied effects
tfm.effects_log
> ['trim', 'compand', 'fade']

Transform in-memory arrays:

import numpy as np
import sox
# sample rate in Hz
sample_rate = 44100
# generate a 1-second sine tone at 440 Hz
y = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate * 1.0) / sample_rate)
# create a transformer
tfm = sox.Transformer()
# shift the pitch up by 2 semitones
tfm.pitch(2)
# transform an in-memory array and return an array
y_out = tfm.build_array(input_array=y, sample_rate_in=sample_rate)
# instead, save output to a file
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output.wav'
)
# create an output file with a different sample rate
tfm.set_output_format(rate=8000)
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output_8k.wav'
)

Concatenate 3 audio files:

import sox
# create combiner
cbn = sox.Combiner()
# pitch shift combined audio up 3 semitones
cbn.pitch(3.0)
# convert output to 8000 Hz stereo
cbn.convert(samplerate=8000, n_channels=2)
# create the output file
cbn.build(
    ['input1.wav', 'input2.wav', 'input3.wav'], 'output.wav', 'concatenate'
)
# the combiner does not currently support array input/output

Get file information:

import sox
# get the sample rate
sample_rate = sox.file_info.sample_rate('path/to/file.mp3')
# get the number of samples
n_samples = sox.file_info.num_samples('path/to/file.wav')
# determine if a file is silent
is_silent = sox.file_info.silent('path/to/file.aiff')
# file info doesn't currently support array input

pysox's People

Contributors

alubhorta, cjacoby, dt-kylecrayne, ejhumphrey, hadware, hugovk, jonathanmarmor, justinsalamon, ktrzcinskigda, lostanlen, machste, mpuels, nullptr-0, page-david, pseeth, psobot, rabitt, ronhanson, stephwag, vsel

pysox's Issues

Support using same file path for input & output

Sox doesn't support using the same file as both input and output - doing this will result in an empty, invalid audio file. While this is sox behavior and not pysox, it would be nice if pysox took care of this behind the scenes. Right now the user needs to worry about this logic themselves, e.g. like this:

import tempfile
import shutil
import sox
from scaper.util import _close_temp_files

audio_infile = '/Users/justin/Downloads/trimtest.wav'
audio_outfile = '/Users/justin/Downloads/trimtest.wav'
start_time = 2
end_time = 3

tfm = sox.Transformer()
tfm.trim(start_time, end_time)
if audio_outfile != audio_infile:
    tfm.build(audio_infile, audio_outfile)
else:
    # must use temp file in order to save to same file
    tmpfiles = []
    with _close_temp_files(tmpfiles):
        # Create tmp file
        tmpfiles.append(
            tempfile.NamedTemporaryFile(
                suffix='.wav', delete=True))
        # Save trimmed result to temp file
        tfm.build(audio_infile, tmpfiles[-1].name)
        # Copy result back to original file
        shutil.copyfile(tmpfiles[-1].name, audio_outfile)

Pysox does issue a warning when a file is about to be overwritten, which is even more confusing under this scenario since the user (who might be unfamiliar with the quirks of sox) has no reason to think that the overwritten file will be invalid.
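
A minimal sketch of the temp-file workaround described above, using only the standard library. The helper name build_in_place is hypothetical and not part of pysox; build stands for any callable that takes (input_path, output_path), such as the build method of a Transformer.

```python
import os
import tempfile


def build_in_place(build, path):
    """Run build(input_path, output_path) so the result replaces `path`.

    Hypothetical helper: `build` is any callable taking
    (input_path, output_path), e.g. a Transformer's build method.
    """
    suffix = os.path.splitext(path)[1]
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final replace
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=suffix, dir=directory)
    os.close(fd)
    try:
        build(path, tmp_path)
        # Swap the finished output into place only after build succeeds.
        os.replace(tmp_path, path)
    finally:
        # If build failed, remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

With a transformer, this would be called as build_in_place(tfm.build, audio_infile). Because mkstemp creates the temp file before the build runs, the overwrite warning mentioned above may still appear.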

soxi inaccessible from Windows command line

Dear pysox people,
My BirdVox intern (Elizabeth Mendoza) was trying to use scaper on her Windows machine today and scaper produced an error while calling
source_duration = sox.file_info.duration(source_file)

At the back of the trace was something like
Command 'soxi -D "C:\Users\User\path\to\file.wav"' returned non-zero exit status 1

Meanwhile, running the pysox demo code worked perfectly.
Therefore, pysox could read and write data (using the sox command) but not read metadata (using the soxi command).

We realized that this was because neither of the two was in the Windows command prompt path.
My hacky solution was to:

  1. manually add sox to path (in My Computer / Properties / ...)
  2. duplicate sox.exe in the same folder and name the copy soxi.exe
  3. restart python, so that the %PATH% global variable would be updated

It would be good to document this problem or offer a more elegant fix to Windows users.

@justinsalamon asked to be tagged on this issue, so that's it: Justin, you're tagged now :)
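
One way to surface this problem early is to check which executables are on the PATH before calling them. The sketch below uses shutil.which from the standard library; the function name missing_executables is hypothetical and not part of pysox.

```python
import shutil


def missing_executables(names=("sox", "soxi")):
    """Return the executable names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]
```

An empty list means both commands are reachable. A result of ['soxi'] would match the situation described in this report, where sox works but soxi does not.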

Simplified submodule wrapper for one-off operations

So here's an idea for discussion...

I like the transform chaining for complex effects; it's a really smart way of optimizing how sox is controlled. However, I don't like how verbose it is for simple one-off operations.

For example, to convert a file currently:

tfm = sox.Transformer()
tfm.convert(
    samplerate=samplerate, 
    n_channels=n_channels,
    bitdepth=bitdepth)
tfm.build(input_file, output_file)

For simple operations, this doesn't feel very pythonic. I'd be curious how folks feel about abstracting this away for trivial / common pipelines:

sox.effects.convert(input_file, output_file, samplerate, n_channels, bitdepth)

This could also roll-up try-catch logic, if one were so inclined (e.g. strict=False could return a non-zero status, etc).

thoughts?

wav file that breaks pysox, enigmatic error message

This wav file

This code:

import sox

out_sr = 44100
out_channels = 1
out_bitdepth = 16

infile = '/Users/justin/Downloads/Adagio_sz1795.wav'
outfile = '/Users/justin/Downloads/Adagio_sz1795_sox.wav'

tfm = sox.Transformer(infile, outfile)
tfm.convert(samplerate=out_sr, channels=out_channels, bitdepth=out_bitdepth)
tfm.build()

This error:

---------------------------------------------------------------------------
SoxError                                  Traceback (most recent call last)
<ipython-input-22-15231ac67f0e> in <module>()
     10 tfm = sox.Transformer(infile, outfile)
     11 tfm.convert(samplerate=out_sr, channels=out_channels, bitdepth=out_bitdepth)
---> 12 tfm.build()

/usr/local/lib/python2.7/site-packages/sox-1.1-py2.7.egg/sox/transform.pyc in build(self)
    157 
    158         if status is False:
--> 159             raise SoxError
    160         else:
    161             logging.info(

SoxError: 

Trim allows start time without an end

Trim can trim a file from the beginning with only the start time (e.g. sox input output trim 10) but pysox's trim requires both start and end. Added #81 as a fix.
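
For illustration only (this is not the code from #81), an optional end time could be handled when building the argument list roughly as follows. The function name trim_args is hypothetical. SoX treats the second trim value as a length relative to the first, so the sketch passes end_time - start_time.

```python
def trim_args(start_time, end_time=None):
    """Build a SoX 'trim' argument list; end_time may be omitted."""
    if start_time < 0:
        raise ValueError("start_time must be non-negative")
    args = ["trim", "{:f}".format(start_time)]
    if end_time is not None:
        if end_time <= start_time:
            raise ValueError("end_time must be greater than start_time")
        # The second value is a duration measured from start_time.
        args.append("{:f}".format(end_time - start_time))
    return args
```

trim_args(10) corresponds to the command-line form sox input output trim 10.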

Getting wav not supported error

I am getting this error
"This install of SoX cannot process .wav files."

I am using the code in AWS lambda and the script looks like this

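import uuid
import sox

def split_channel(temp_file, channel_num):
    # write the selected channel to a uniquely named file in /tmp
    result_temp_file = '/tmp/{}.wav'.format(uuid.uuid4())
    tfm = sox.Transformer()
    # map output channel 1 to the requested input channel
    remix_dictionary = {1: [channel_num]}
    tfm.remix(remix_dictionary)
    tfm.build(temp_file, result_temp_file)
    return result_temp_file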

Is there an extra step needed to make SoX work with .wav files?

Combiner.preview() fails

Combiner inherits preview from Transformer, but it needs to be overridden because the base call is different for multiple inputs.

Trying to use the core API play.

Hello,

I'm trying to play music through sox. For this purpose, I'm using the play API available in core.py, but I get the following error: OSError: Play failed! [WinError 2] The system cannot find the file specified

retval = os.getcwd()
INPUT_FILE = os.path.join(retval + '\\BG\\test.wav')
print(INPUT_FILE)
arg_list = ['play', music_file]
handler = core.play(arg_list)

Please provide an example use case of the core.play() API.

Is `file_info.bitrate` correct?

Hello

This function seems to be returning the bit depth (aka bits per sample) not the bit rate.

soxi -b is the bit depth.

soxi -B is the bitrate.

Happy to submit a pull request with a fix and a second function for the bit depth.

Cheers.
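
To make the distinction concrete: for uncompressed PCM audio, the bit rate follows from the bit depth, the sample rate, and the channel count. The helper below is a hypothetical illustration and not part of pysox.

```python
def pcm_bitrate(sample_rate, bit_depth, n_channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate * bit_depth * n_channels


# CD-quality stereo: 16-bit samples at 44100 Hz on 2 channels.
cd_bitrate = pcm_bitrate(44100, 16, 2)  # 1411200 bits per second
```

The bit depth here is 16 while the bit rate is 1411200, so a function that returns one in place of the other is off by several orders of magnitude.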

Dependency on grep command which is missing on Windows by default

The problem arises in the _get_valid_formats() function in pysox/sox/core.py.

I propose a fix as follows:

def _get_valid_formats():
    ''' Calls SoX help for a lists of audio formats available with the current
    install of SoX.
    Returns:
    --------
    formats : list
        List of audio file extensions that SoX can process.
    '''
    if NO_SOX:
        return []

    pfm = platform.system()
    if pfm == 'Windows':
        shell_output = subprocess.check_output('sox -h | findstr /c:"AUDIO FILE FORMATS"', shell=True)
    else:
        shell_output = subprocess.check_output('sox -h | grep "AUDIO FILE FORMATS"', shell=True)

    formats = shell_output.decode('utf8').split()[3:] # this is better because it does not leave whitespace characters in the string
    return formats
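
An alternative sketch that avoids grep and findstr altogether: run sox -h without a shell pipeline and filter the output in Python. It assumes, as the code above does, that the help text contains a line beginning with "AUDIO FILE FORMATS:". The function names are hypothetical.

```python
import subprocess


def parse_valid_formats(help_text):
    """Extract the format list from the text printed by `sox -h`."""
    for line in help_text.splitlines():
        if line.startswith("AUDIO FILE FORMATS"):
            # Drop the three label tokens: AUDIO, FILE, FORMATS:
            return line.split()[3:]
    return []


def get_valid_formats():
    """Call `sox -h` directly, with no platform-specific shell tools."""
    try:
        result = subprocess.run(
            ["sox", "-h"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
    except OSError:
        # sox is not installed or not on the PATH.
        return []
    return parse_valid_formats(result.stdout.decode("utf8", errors="replace"))
```

Since the filtering happens in Python, the same code path runs on Windows and on Unix-like systems.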

Apply combiner to an input_filepath_list of length 1

Trying to combine a single file (which basically doesn't alter it unless the combiner applies transformations too):

cbn.build([filename], outfile, 'concatenate')

Raises an error: ValueError: input_filepath_list must have at least 2 files.

But there are scenarios where this is useful. For example, in my case I need to concatenate a file to itself if the file is shorter than a certain value, but otherwise leave it unchanged. The number of concatenations N is determined at runtime, so ideally I'd like to call build() like this:

cbn.build([filename] * N, outfile, 'concatenate')

So that if N=1 it basically leaves the file unchanged. This can of course be achieved using an if statement to determine whether I need to use the combiner or not, but it's much clunkier.

@rabitt Is there a particular reason why the combiner can't be called with an input_filepath_list list of length 1?
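
The if-statement workaround mentioned above might look like the sketch below. Both function names are hypothetical; combine stands for any callable taking (input_filepath_list, output_filepath), for example a small wrapper around cbn.build that passes 'concatenate'.

```python
import math
import shutil


def repeats_needed(duration, min_duration):
    """Number of copies needed for the result to reach min_duration."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return max(1, math.ceil(min_duration / duration))


def concatenate_or_copy(filename, outfile, n, combine):
    """Concatenate n copies of filename, or copy the file when n == 1."""
    if n == 1:
        # A plain copy skips any effects queued on the combiner.
        shutil.copyfile(filename, outfile)
    else:
        combine([filename] * n, outfile)
```

The plain copy in the N=1 branch does not apply any transformations set on the combiner, which is part of why handling a single-file list inside the combiner would be cleaner.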

Return None rather than 0 when file info (bit rate, duration, etc. ) is NA

There are a number of soxi functions that have special return values if the info is unavailable or not applicable. Here's the current behavior of pysox functions in these cases:

  • bitrate: zero
  • comments: empty string
  • duration: zero
  • num_samples: zero

In an offline conversation with @rabitt, we figured that it would be good to use the None constant in Python to express this kind of value. That will break type stability, but on the other hand it will help catching bugs such as
if duration < max_duration
or
if bitrate < 128000

which really should not return True if these values are NA. In addition, pysox currently raises a warning in those cases, so the loss in speed incurred by the lack of type stability is negligible compared to the loss in speed incurred by raising a warning, I suppose.

Thoughts?
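
A small sketch of the proposed behavior, with a hypothetical helper name (na_to_none) that is not part of pysox. It maps the placeholder values listed above to None.

```python
def na_to_none(value):
    """Map the 'not available' placeholders (0 or empty string) to None."""
    return None if value in (0, "") else value


bitrate = na_to_none(0)  # the placeholder value becomes None

# An explicit check makes the missing value impossible to overlook.
if bitrate is None:
    status = "bitrate not available"
else:
    status = "low" if bitrate < 128000 else "ok"
```

On Python 3, a comparison such as bitrate < 128000 with bitrate set to None raises a TypeError, so the bug shows up immediately. Python 2 orders None below every number without raising, so there the explicit is None check is the safer pattern.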

Output file validation breaks when using temp files

Output file validation, and in particular this line, crashes when trying to write to a temp file (created using tempfile). Consequently, pysox crashes when trying to write to a temporary file (e.g. using a Combiner). Not sure which version this was introduced in, since I didn't use to have this problem.

Minimal example:

import os, tempfile
tf = tempfile.NamedTemporaryFile(suffix='.wav', delete=True)
os.path.dirname(tf)

Possible fixes: ensure output file validation supports tempfiles, or at least allow output file validation to be disabled.
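
In the minimal example as written, the object passed to os.path.dirname is the temporary file wrapper rather than a path string. Whether pysox's validation receives the wrapper or the path is not shown here, so the following is only a sketch of a directory check that works on the file's name attribute.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(suffix=".wav", delete=True) as tf:
    # tf is a file wrapper; tf.name is the path string.
    output_dir = os.path.dirname(tf.name)
    # Check that the containing directory is writable.
    writable = os.access(output_dir, os.W_OK)
```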

Windows, error on import sox

Line 89 in core.py:
"sox -h | grep 'AUDIO FILE FORMATS'"
results in an error on Windows.

To make it work, change it to:
'sox -h | grep "AUDIO FILE FORMATS"'

SoX doesn't seem to like scientific notation

Pysox needs to format its floats to not be scientific notation when formatting arguments for SoX, otherwise SoX seems to choke. I received

SoxError: Stdout: Stderr: sox FAIL pad: usage: {length[@position]}

when pysox was calling SoX with these arguments:

['sox', '-D', '-V2', u'../data/raw/scaper_audio/siren_wailing/5_siren_police.wav', '-c', '1', '/var/folders/xv/6nccdc7151j71bhvzlrvjhqm0000gp/T/tmpYCzbyz.wav', 'rate', '-h', '44100', 'trim', '0', '6.501565999', 'fade', 'q', '0.01', 'reverse', 'fade', 'q', '0.01', 'reverse', 'norm', '-7.36679718537', 'pad', '3.498434', '1.00000008274e-09']

Note the '1.00000008274e-09' at the end. When this number was changed to '0.00000000100000008274', this error went away, and when changed to a different sci notation number, e.g. '1.0e-3', the problem persisted.
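
One possible way to avoid scientific notation is to format numbers through decimal.Decimal with the "f" presentation type, which always produces positional notation. The function name format_number is hypothetical; this is a sketch, not the fix adopted by pysox.

```python
from decimal import Decimal


def format_number(value):
    """Format a number in positional notation, never with an exponent."""
    # repr() gives the shortest string that round-trips the float;
    # Decimal's "f" format then expands any exponent into plain digits.
    return format(Decimal(repr(float(value))), "f")
```

Applied to the value from the failing call, format_number(1.00000008274e-09) gives '0.00000000100000008274', the form that SoX accepted.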

Use directly in memory, instead of via files, possible?

Would it be possible to input audio as ndarrays into pysox directly, as well as files from disk? I'm asking because I'm using librosa for onset detection, and thus already have the audio loaded, but would still like to apply some audio effects and stuff afterwards.

Like maybe the constructors could do a type check, and build() could return the destination audio or something?

y = librosa.load(path)[0]
tfm = sox.Transformer(y)
# Do stuff...
y = tfm.build()

I realize this adds extra complexity to pysox (like having librosa as a dependency or similar) which is intended to be a clean wrapper around SoX, so I get if it's deemed out of scope. Just asking!

Return transformer and combiner always?

Just a proposal for the API: it would be neat if all methods in Transformer and Combiner returned self, so that one could create chains like this:

import sox
tfm = (
    sox.Transformer('path/to/input_audio.wav', 'path/to/output/audio.aiff')
    .trim(5, 10.5)
    .compand()
    .fade(fade_in_len=1.0, fade_out_len=0.5)
    .build()
)

(reference)
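
The pattern being proposed can be shown with a toy class. ChainableEffects below is purely illustrative and is not pysox's Transformer; it only records effect names to show how returning self enables chaining.

```python
class ChainableEffects:
    """Toy stand-in for a transformer whose methods return self."""

    def __init__(self):
        self.effects_log = []

    def trim(self, start_time, end_time):
        self.effects_log.append("trim")
        return self  # returning self is what allows chaining

    def compand(self):
        self.effects_log.append("compand")
        return self

    def fade(self, fade_in_len=0.0, fade_out_len=0.0):
        self.effects_log.append("fade")
        return self


chain = (
    ChainableEffects()
    .trim(5, 10.5)
    .compand()
    .fade(fade_in_len=1.0, fade_out_len=0.5)
)
```

Existing call sites that ignore the return value keep working, since returning self adds a return value without changing what each method does.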

Thread-safe?

It doesn't seem like pysox is thread-safe. Could this be fixed?

Combining a single file

Hello,

I'd like to use the -v (Combination) command on a single file. While this may seem like nonsense it's an easy way to change the volume of a file by a given percentage, like this:
sox -v 0.9 in.wav out.wav
Currently this is not possible since validate_input_file_list in file_info.py raises an error when the list has fewer than 2 items.

While this helps people avoid mistakes it also prevents some functionality. Do you think it is worth removing?

Disable warnings when overwriting output file

There are scenarios where you know in advance that you're going to use pysox to overwrite an existing file, e.g. when using temporary files like in this example from #6. In this situation it's inconvenient to have pysox issue a warning for every file overwrite (as there are many), and right now there's no way to disable these warnings in pysox (you can hack it by changing the logging level as in the example provided above, but that's not ideal).

It would be nice to have an optional flag in pysox (or maybe at the transformer/combiner level?) to disable file overwrite warnings (and maybe another option to disable all warnings from pysox?).
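
Until such a flag exists, the logging-level workaround can at least be scoped to a block of code. The context manager below is a sketch built on the standard logging.disable call; the name suppress_logging is hypothetical. Note that it silences every logger in the process, not only pysox's messages.

```python
import contextlib
import logging


@contextlib.contextmanager
def suppress_logging(level=logging.WARNING):
    """Temporarily drop all log records at or below `level`."""
    previous = logging.root.manager.disable
    logging.disable(level)
    try:
        yield
    finally:
        # Restore whatever process-wide setting was active before.
        logging.disable(previous)
```

Calls that are expected to overwrite files would be placed inside a with suppress_logging(): block, leaving warnings enabled everywhere else.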

Implement stat and volume adjustment

stat gives access to useful data about a file.
Volume adjustment is implemented in combine.py for the combiner. Should we add it again in set_input_format in transform.py?

Relative file path support

If the output file path is a.wav, why not just write the output to os.getcwd()? Also, in Python interfaces '~' is sometimes used to shorten an input path. Let's support it.
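
Both cases can be covered with two standard-library calls. The helper name normalize_path is hypothetical and not part of pysox.

```python
import os


def normalize_path(path):
    """Expand a leading '~' and resolve relative paths against the cwd."""
    return os.path.abspath(os.path.expanduser(path))
```

With this, normalize_path('a.wav') resolves to a file inside os.getcwd(), and a path starting with '~' resolves inside the user's home directory.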

sox.file_info fails: soxi binary does not exist on Windows

On Windows, the latest version of SoX (14.4.2) does not include a soxi.exe binary. It looks like it uses the following option instead:

sox --i, --info              Behave as soxi(1)

The error:

>>> sox.file_info.duration('c:\\tmp\\test.flac')
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\site-packages\sox\core.py", line 125, in soxi
    shell=True, stderr=subprocess.PIPE
  File "C:\Program Files\Python36\lib\subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "C:\Program Files\Python36\lib\subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'soxi -D c:\tmp\test.flac' returned non-zero exit status 1.

I have tried this patch and it works:

+++ .\test_core.py        Sun Feb 12 13:57:22 2017
--- .\core.py   Sun Feb 12 13:48:26 2017
@@ -115,7 +115,7 @@
     if argument not in SOXI_ARGS:
         raise ValueError("Invalid argument '{}' to Soxi".format(argument))

+    args = ['sox --i']
-    args = ['soxi']
     args.append("-{}".format(argument))
     args.append(enquote_filepath(filepath))

Thanks!
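
A variant of this patch could keep using soxi where it exists and fall back to sox --i only when it does not. The sketch below uses hypothetical function names and returns an argument list, so it is meant for a subprocess call without shell=True rather than as a drop-in change to core.py.

```python
import shutil


def soxi_base_command(which=shutil.which):
    """Prefer the soxi executable; otherwise use 'sox --i'."""
    if which("soxi") is not None:
        return ["soxi"]
    return ["sox", "--i"]


def soxi_args(argument, filepath, which=shutil.which):
    """Full argument list for querying one soxi field of a file."""
    return soxi_base_command(which) + ["-{}".format(argument), filepath]
```

The which parameter exists only so the lookup can be replaced in tests. Passing the file path as its own list element also removes the need to quote it for the shell.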

Logging is very noisy

How can I make the logging quieter? I don't see that you are using a named logger so I cannot mute it with my logging config.
