justinsalamon / scaper
A library for soundscape synthesis and augmentation
License: BSD 3-Clause "New" or "Revised" License
The simplified annotation (txt) generated by scaper loads fine into Audacity 2.0.5, but doesn't display any labels when loaded into 2.1.3 (macOS).
Example contents:
0.18131 5.127892 siren
1.8938210000000002 5.881943000000001 siren
5.342074 8.890392 siren
Need to find out if this is an Audacity bug, or change of expected format.
If you install with setup.py, the namespaces are not included with the installation, preventing a user from being able to import the scaper module.
Certain scenarios require limiting the amount of overlapping sound events (or prohibiting it altogether). Right now there is no way to explicitly control for sound overlap.
Proposed solution: add a max_overlap kwarg to generate(), which by default is set to None, meaning any overlap is allowed. If set to 1, it basically means no overlap is allowed, 2 means 2 overlapping sounds, etc. 0 would be an invalid value.
(This would address #62)
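To make the proposed semantics concrete, here is a minimal sketch of how a max_overlap check could be evaluated on a list of (onset, offset) pairs. The helper names max_polyphony and satisfies_max_overlap are hypothetical and not part of Scaper; only the meaning of None, 1, 2 and the invalid value 0 is taken from the proposal above.

```python
def max_polyphony(events):
    """Return the largest number of events that sound at the same time.

    `events` is an iterable of (onset, offset) pairs in seconds.
    """
    # Sweep line: +1 at every onset, -1 at every offset.
    boundaries = []
    for onset, offset in events:
        boundaries.append((onset, 1))
        boundaries.append((offset, -1))
    # At equal times, offsets (-1) sort before onsets (+1), so events that
    # merely touch end-to-start are not counted as overlapping.
    boundaries.sort()
    current = peak = 0
    for _, delta in boundaries:
        current += delta
        peak = max(peak, current)
    return peak


def satisfies_max_overlap(events, max_overlap=None):
    """Hypothetical check mirroring the proposed max_overlap kwarg."""
    if max_overlap is None:
        return True  # any amount of overlap is allowed
    if max_overlap < 1:
        raise ValueError("max_overlap must be None or an integer >= 1")
    return max_polyphony(events) <= max_overlap
```

A generator honouring this constraint could, for instance, re-sample an event's time until the check passes, although that strategy is only one possible design.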
Just wondering if we have any thoughts about spatializing mixtures, say with a library of room impulse responses added as a directory for Scaper? Opening the issue early because I think this will be a rather complex change (that possibly won't happen). However, if we had the ability to spatialize sources in a mixture with varying degrees of reverberation or receiver/source placement, we could make some pretty cool stuff I think!
Here's a library out there that could be interesting: https://github.com/LCAV/pyroomacoustics
I'm not sure why snr, pitch_shift, and time_stretch shouldn't work for bg events as well.
Add flag to generate (passed to instantiate) which prevents the same source file being used more than once (can result in unnatural sounding scapes).
The bitdepth of the output mixtures should be a setting, like sample rate, that you can enforce in the Scaper object. Right now, the behavior seems to be to take the bitdepth of the input source files, which can vary greatly (I believe they vary in UrbanSound 8k, for example). When the bitdepth of the output varies or is too big, you get really poor performance when loading the mixtures for processing by something like a deep net. I think the fix is simple: somewhere in sox you can enforce the bit depth to a default like 16 or something.
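Before enforcing a bit depth, it can help to confirm that the source files really do differ. The sketch below uses Python's standard wave module rather than sox, so it only handles plain PCM WAV files; wav_bit_depth and group_by_bit_depth are hypothetical helper names, not Scaper or pysox functions.

```python
import wave


def wav_bit_depth(path):
    """Return the bit depth of a PCM WAV file (bytes per sample * 8)."""
    with wave.open(path, "rb") as wf:
        return wf.getsampwidth() * 8


def group_by_bit_depth(paths):
    """Map each bit depth found to the list of files that use it."""
    groups = {}
    for path in paths:
        groups.setdefault(wav_bit_depth(path), []).append(path)
    return groups
```

If group_by_bit_depth returns more than one key for a source folder, the mixtures generated from it may inherit different bit depths, which is the situation described above.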
Hi, I used my own events to synthesize soundscapes with Scaper, following the example. In the txt file, I find that events with the same label overlap in time. The result is as follows:
4.603466112312178 6.302389493397525 babble
5.111090174762038 7.444187049119387 babble
5.414931605968281 9.862953061306083 music
5.599100623962036 9.988183322552855 music
6.202366268546969 9.510957062119823 music
So, how can I avoid overlapping events with the same label?
Elizabeth Mendoza (@Elizabeth-12324) uses scaper v0.2.0 and has succeeded in using scaper for pasting long sounds on her Windows 10 machine. However, short sounds (below 500 ms) cause a "Permission denied" error while calling sc.generate. See the full backtrace below my signature.
As you can see, the line at fault is
cbn.build([filepath] * n_tiles, concat_file.name, 'concatenate')
in get_integrated_lufs
This line appeared in v0.2.0 in PR #28, which closed issues #13 and #18.
It seems that this PR brought a bug on Windows.
@Elizabeth-12324 and myself looked at the page of SoX known bugs: http://sox.sourceforge.net/Docs/Bugs
and this mailing list thread:
https://sourceforge.net/p/sox/mailman/message/20864618/
IIRC, @rabitt discouraged using the same input file and output file in marl/pysox#27.
I don't know if there is an easy and portable fix for this. Could it be that the concatenated file has the same name as the original, and that keeping the two names distinct would make the LUFS concatenation Windows-friendly?
Vincent
---------------------------------------------------------------------------
SoxError Traceback (most recent call last)
<ipython-input-14-4823d7c3b010> in <module>()
31 disable_sox_warnings=False,
32 no_audio=False,
---> 33 txt_path=txtfile)
34
35
~\Anaconda3\lib\site-packages\scaper\core.py in generate(self, audio_path, jams_path, allow_repeated_label, allow_repeated_source, reverb, disable_sox_warnings, no_audio, txt_path, txt_sep, disable_instantiation_warnings)
1703 if not no_audio:
1704 self._generate_audio(audio_path, ann, reverb=reverb,
-> 1705 disable_sox_warnings=disable_sox_warnings)
1706
1707 # Finally save JAMS to disk too
~\Anaconda3\lib\site-packages\scaper\core.py in _generate_audio(self, audio_path, ann, reverb, disable_sox_warnings)
1570 # NOW compute LUFS
1571 fg_lufs = get_integrated_lufs(
-> 1572 tmpfiles_internal[-1].name)
1573
1574 # Normalize to specified SNR with respect to
~\Anaconda3\lib\site-packages\scaper\audio.py in get_integrated_lufs(filepath, min_duration)
98
99 cbn = sox.Combiner()
--> 100 cbn.build([filepath] * n_tiles, concat_file.name, 'concatenate')
101
102 loudness_stats = r128stats(concat_file.name)
~\Anaconda3\lib\site-packages\sox\combine.py in build(self, input_filepath_list, output_filepath, combine_type, input_volumes)
98 if status != 0:
99 raise SoxError(
--> 100 "Stdout: {}\nStderr: {}".format(out, err)
101 )
102 else:
SoxError: Stdout:
Stderr: sox FAIL formats: can't open output file `C:\Users\User\AppData\Local\Temp\tmpv8959m5l.wav': Permission denied
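One possible factor, not confirmed in this thread: the Python documentation notes that on Windows a file created with tempfile.NamedTemporaryFile cannot be opened a second time by name while it is still open, which would prevent an external program such as sox from writing to it. A minimal sketch of a workaround is to create the path, close the handle, and only then pass the name on. The helper closed_temp_path is hypothetical and not part of Scaper or pysox.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def closed_temp_path(suffix=".wav"):
    """Yield the path of a temporary file whose handle is already closed."""
    # mkstemp returns an open OS-level file descriptor plus the file path.
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # release the handle so another process may open the file
    try:
        yield path
    finally:
        # Clean up even if the body raised an exception.
        if os.path.exists(path):
            os.remove(path)
```

Inside the with block the path could be handed to an external tool as its output file. Whether this resolves the error reported above has not been tested here.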
Right now each event has to be explicitly added to the event specification (e.g. via for loop). It would be helpful to have high-level generators such that you'd only have to specify something along the lines of "generate a soundscape where the number of events is sampled from distribution X obeying temporal distribution Y with constraints Z".
This, in addition to simplifying some use cases, would allow supporting non-iid event distributions, e.g. Hawkes (self-exciting) processes as suggested by @lostanlen
Related: cf. high-level knobs provided in SimScene (e.g. figure 1)
Right now there aren't any checks related to amplifying events based on their specified SNR values. It would be helpful for Scaper to print out a warning when an event distorts.
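As an illustration of the kind of check meant here, the sketch below applies a gain in dB to a list of float samples and warns when the peak exceeds full scale. It assumes samples are normalized to the range [-1.0, 1.0]; apply_gain_db and warn_if_clipping are hypothetical helpers, not Scaper functions.

```python
import warnings


def apply_gain_db(samples, gain_db):
    """Scale samples by a gain given in decibels (amplitude ratio)."""
    factor = 10 ** (gain_db / 20.0)
    return [s * factor for s in samples]


def warn_if_clipping(samples, limit=1.0):
    """Warn and return True if any sample magnitude exceeds `limit`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak > limit:
        warnings.warn(
            "Peak amplitude {:.3f} exceeds {:.3f}: the event will distort "
            "when written to a fixed-point file.".format(peak, limit))
        return True
    return False
```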
Currently when using sox for time stretching, the actual duration of the time stretched event can vary slightly from the estimated duration (estimated = duration * stretch factor). This caused a bug in post-padding, fixed by calculating post-padding based on the actual duration of the stretched event instead of the estimated duration.
Currently there are no tests to guarantee that these two values (estimated and actual duration of stretched event) are within an acceptable epsilon, so need to add tests for that.
Probably not worth implementing before migrating to pyrubberband for time stretching.
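The missing test could be built around a comparison like the following sketch. The estimated duration follows the formula given above (duration * stretch factor); the function name and the default tolerance of 0.01 s are assumptions chosen for illustration.

```python
def stretched_duration_within_tolerance(duration, stretch_factor,
                                        actual_duration, eps=0.01):
    """Check that the actual stretched duration is close to the estimate."""
    estimated = duration * stretch_factor
    return abs(actual_duration - estimated) <= eps
```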
Because yes.
Some foreground sounds (like an engine passing or a dog bark) sound weird/unnatural if trimmed. The idea is to add support for specifying "protected" labels. When instantiating an event with a protected label, scaper is forced to use the complete duration of the source file (but it should be ok to trim when the soundscape ends).
Hi,
I'm trying to generate 1000 soundscapes using scaper, but I realized that while running my code, the generation of soundscapes gets slower after each iteration. For example: after generating 100 soundscapes, the 101st took almost 1 min, whereas the first few soundscapes only took around 2 seconds each. This increase in time is gradual and I can't figure out the root cause.
Please help. The code I'm using is below:
```
# Generate n soundscapes
for i in range(n_start, n_stop):
    print('Generating soundscape: {:d}/{:d}'.format(i+1, n_soundscapes))
    # add random number of foreground events
    n_events = np.random.randint(1, 5)
    for _ in range(n_events):
        event_time = np.random.randint(0, duration)
        event_duration = duration - event_time
        sc.add_event(label = ("const", "chainsaw"),
                     source_file = ("choose", []),
                     source_time = ("uniform", 0, 30 - duration),
                     event_time = ("uniform", 0, event_time),
                     event_duration = ("uniform", 1, event_duration),
                     snr = ("uniform", -5, 0),
                     pitch_shift = ("uniform", -15, 15),
                     time_stretch = None)
    audiofile = outfolder + f'soundscape_{i}.wav'
    jamsfile = jamsfolder + f'soundscape_{i}.jams'
    txtfile = txtfolder + f'soundscape_{i}.txt'
    sc.generate(audiofile, jamsfile,
                allow_repeated_label = True,
                allow_repeated_source = True,
                disable_sox_warnings = True,
                no_audio = False,
                txt_sep = ',',
                txt_path = txtfile)
```
When, after installing the latest version using pip on Windows 10, I import Scaper I get the following error:
'grep' is not recognized as an internal or external command,
operable program or batch file.
with traceback
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\pydevd.py", line 1683, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\pydevd.py", line 1677, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\pydevd.py", line 1087, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "E:/Github/how-noisy/util/create_soundscapes.py", line 6, in <module>
import scaper
File "E:\Development\Python3\Anaconda3\lib\site-packages\scaper\__init__.py", line 4, in <module>
from .core import Scaper
File "E:\Development\Python3\Anaconda3\lib\site-packages\scaper\core.py", line 1, in <module>
import sox
File "E:\Development\Python3\Anaconda3\lib\site-packages\sox\__init__.py", line 19, in <module>
from . import file_info
File "E:\Development\Python3\Anaconda3\lib\site-packages\sox\file_info.py", line 7, in <module>
from .core import VALID_FORMATS
File "E:\Development\Python3\Anaconda3\lib\site-packages\sox\core.py", line 96, in <module>
VALID_FORMATS = _get_valid_formats()
File "E:\Development\Python3\Anaconda3\lib\site-packages\sox\core.py", line 90, in _get_valid_formats
shell=True
File "E:\Development\Python3\Anaconda3\lib\subprocess.py", line 336, in check_output
**kwargs).stdout
File "E:\Development\Python3\Anaconda3\lib\subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'sox -h | grep "AUDIO FILE FORMATS"' returned non-zero exit status 255.
Often I don't store my audio in a flat directory structure so it would be cool if I could generate scapes from nested bg/fg folders.
bg:
    label1
        A
            a.wav
        B
            b.wav
            c.wav
    label2
        A
            a.wav
    ...
Just quickly looking through, I don't think too many modifications would be necessary.
To automatically get nested files, we would need to run glob recursively here up to some depth, or just exhaustively. I know glob isn't a great tool for this so if we want to support an exhaustive index, we could use os.walk or similar. This would involve adding a max_label_depth attribute to the Scaper object. By default, it can just be set to 1 to maintain current behavior.
Line 89 in ec5a5f6
And then we would just have to update _populate_label_list to get the nested labels. The most compatible way of constructing the labels would be to keep them as a directory structure, so the labels could be returned as ['label1/A', 'label1/B', 'label2/A'].
Line 117 in ec5a5f6
Other than that, I don't think it should affect how Scaper works at all! When selecting a source file, you just run os.path.join(file_path, label) so that would work smoothly.
An extension to this would be to allow glob-style pattern matching on the labels so you could specify ('choose', 'label1/*') and have it filter labels matching the pattern. It would be easy to perform using something like fnmatch.filter(labels, pattern), which is, I think, what glob uses under the hood. This would require more intensive changes involving docs and tuple validation so that may not be quite as simple.
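A rough sketch of the label discovery described above, using os.walk with a depth limit and fnmatch.filter for pattern matching. The function list_nested_labels is hypothetical; max_label_depth is the attribute name proposed in this issue, and the decision to also treat shallower leaf folders as labels is an assumption.

```python
import fnmatch
import os


def list_nested_labels(root, max_label_depth=1):
    """Return label names as folder paths relative to `root`, joined by '/'."""
    root = os.path.abspath(root)
    labels = []
    for dirpath, dirnames, _ in os.walk(root):
        if dirpath == root:
            continue  # the root folder itself is not a label
        rel = os.path.relpath(dirpath, root)
        depth = rel.count(os.sep) + 1
        at_limit = depth >= max_label_depth
        # A folder is a label when it sits at the depth limit, or when it is
        # a leaf above the limit (assumption: shallow leaves still count).
        if at_limit or not dirnames:
            labels.append(rel.replace(os.sep, "/"))
        if at_limit:
            dirnames[:] = []  # prune: do not descend any further
    return sorted(labels)
```

With the folder layout from this issue, list_nested_labels(bg_path, max_label_depth=2) would give ['label1/A', 'label1/B', 'label2/A'], and fnmatch.filter(labels, 'label1/*') would narrow that to the two label1 entries. With the default depth of 1 the result is the current flat list of top-level folders.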
Rather than outputting to disk, support pipelining scaper output to subsequent blocks in the training chain (e.g. augmentation, feature extraction, etc.). Simplest version of this is just returning audio/JAMS rather than saving to disk. --> infinite soundscape dataset.
It might be useful to have a way to reset the event specification in a Scaper object so that the same Scaper object can be used over and over to generate soundscapes, instead of making a Scaper object inside the loop for each soundscape like in the tutorial.
Current proposal is to add a function to the Scaper object called Scaper.reset_event_spec() which would accomplish this. Looking through Scaper._instantiate, the only objects that get touched that belong to self are self.bg_spec and self.fg_spec. As these are both lists, I think the body of reset_event_spec would look something like:
    def reset_event_spec(self):
        self.fg_spec = []
        self.bg_spec = []
These are their original settings in the init function.
We would need a corresponding test. Maybe generate something, then reset the object, then generate again? And make sure the generated soundscapes are different from one another?
See error below. No idea why this is happening (just started happening randomly, was fine 2 days ago, nothing has changed in the travis config yml). Temporary solution: comment out pip install pytest-faulthandler in travis.yml, tests still run without it.
1.91s$ pip install pytest-faulthandler
Collecting pytest-faulthandler
Downloading pytest_faulthandler-1.3.1-py2.py3-none-any.whl
Collecting pytest>=2.6 (from pytest-faulthandler)
Downloading pytest-3.2.2-py2.py3-none-any.whl (187kB)
100% |████████████████████████████████| 194kB 4.7MB/s
Collecting faulthandler; python_version == "2.6" or python_version == "2.7" (from pytest-faulthandler)
Downloading faulthandler-3.0.tar.gz (55kB)
100% |████████████████████████████████| 61kB 8.7MB/s
Requirement already satisfied: setuptools in /home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages (from pytest>=2.6->pytest-faulthandler)
Collecting py>=1.4.33 (from pytest>=2.6->pytest-faulthandler)
Using cached py-1.4.34-py2.py3-none-any.whl
Building wheels for collected packages: faulthandler
Running setup.py bdist_wheel for faulthandler ... error
Complete output from command /home/travis/miniconda/envs/test-environment/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-wI1YGd/faulthandler/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpyGVM8Opip-wheel- --python-tag cp27:
running bdist_wheel
running build
running build_ext
building 'faulthandler' extension
creating build
creating build/temp.linux-x86_64-2.7
x86_64-conda_cos6-linux-gnu-gcc -pthread -fno-strict-aliasing -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -DNDEBUG -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/travis/miniconda/envs/test-environment/include/python2.7 -c faulthandler.c -o build/temp.linux-x86_64-2.7/faulthandler.o
unable to execute 'x86_64-conda_cos6-linux-gnu-gcc': No such file or directory
error: command 'x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1
----------------------------------------
Failed building wheel for faulthandler
Running setup.py clean for faulthandler
Failed to build faulthandler
Installing collected packages: py, pytest, faulthandler, pytest-faulthandler
Running setup.py install for faulthandler ... error
Complete output from command /home/travis/miniconda/envs/test-environment/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-wI1YGd/faulthandler/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-lKAMMU-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_ext
building 'faulthandler' extension
creating build
creating build/temp.linux-x86_64-2.7
x86_64-conda_cos6-linux-gnu-gcc -pthread -fno-strict-aliasing -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -DNDEBUG -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/travis/miniconda/envs/test-environment/include/python2.7 -c faulthandler.c -o build/temp.linux-x86_64-2.7/faulthandler.o
unable to execute 'x86_64-conda_cos6-linux-gnu-gcc': No such file or directory
error: command 'x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1
----------------------------------------
Command "/home/travis/miniconda/envs/test-environment/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-wI1YGd/faulthandler/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-lKAMMU-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-wI1YGd/faulthandler/
The command "pip install pytest-faulthandler" failed and exited with 1 during .
Your build has been stopped.
If an exception is raised before _close_temp_files exits, the files won't be deleted, so to fix that, I propose catching the error, closing the files, then re-raising the error.
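A minimal sketch of that fix, written as a stand-alone context manager. It uses try/finally, which has the same effect as catching, cleaning up and re-raising. The name close_temp_files is a stand-in for illustration and does not reproduce Scaper's actual _close_temp_files; it assumes the temporary files were created with delete=False.

```python
import os
from contextlib import contextmanager


@contextmanager
def close_temp_files(tmpfiles):
    """Close and delete the given temporary files, even on error."""
    try:
        yield
    finally:
        # Runs whether the body finished normally or raised; any exception
        # from the body propagates to the caller after the cleanup.
        for tmp in tmpfiles:
            try:
                tmp.close()
                os.unlink(tmp.name)
            except OSError:
                pass  # already removed or not removable; keep cleaning up
```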
It would be convenient to be able to omit the source_time for background files and have it start at any point in the recording. So essentially, have it default to ('uniform', [0, bg_audio_file_duration]). And the same goes for event_time. I'd like to be able to just specify ('uniform',) for example and have them randomly placed throughout the file.
Similarly but less important, it would be nice to be able to omit event_duration and have it default to ('const', fg_audio_file_duration - source_time).
In general, I think providing sensible defaults for parameters (like label and source_file default to ('choose', []), source_time defaults to ('const', 0), etc) would be helpful.
So this isn't a high-priority issue at all and I'm not suggesting we implement it any time soon, it's just something I've had on my mind for a while that I wanted to put on paper. Basically, just factoring out all of the distribution and event parameter validation so it's a bit cleaner and easier to add new distributions. Here's a rough sketch of what I was thinking. Obviously there are some things left to figure out, but I think it could potentially simplify the Scaper core logic nicely.
    import os
    import random
    import warnings

    # Scaper's own exception and warning classes.
    from scaper.scaper_exceptions import ScaperError, ScaperWarning


    def _validate_value(spec, value):
        '''Check one value against a parameter's validation spec.'''
        if value is None:
            if spec.get('can_be_none'):
                return
            raise ScaperError('Value for parameter {} cannot be None.'.format(spec['name']))
        if spec.get('min') is not None and value < spec['min']:
            raise ScaperError('Value {} for parameter {} is below minimum value {}.'.format(
                value, spec['name'], spec['min']))
        if spec.get('max') is not None and value > spec['max']:
            raise ScaperError('Value {} for parameter {} exceeded maximum value {}.'.format(
                value, spec['name'], spec['max']))
        if 'is_file' in spec and os.path.isfile(value) != spec['is_file']:
            raise ScaperError('Value {} for parameter {} should be an existing file: {}'.format(
                value, spec['name'], spec['is_file']))  # not good phrasing but you get the idea.
        if spec.get('allowed_choices') and value not in spec['allowed_choices']:
            raise ScaperError('Value {} for parameter {} not in available values: {}'.format(
                value, spec['name'], spec['allowed_choices']))
        # ... a whole suite of possible tests


    class Distributions:
        '''Distribution Factory'''
        available = {}

        @classmethod
        def register(cls, distribution):
            cls.available[distribution.name] = distribution
            return distribution  # so that register works as a class decorator

        @classmethod
        def from_tuple(cls, dist_tuple):
            return cls.available[dist_tuple[0]](*dist_tuple[1:])


    class Distribution:
        '''Base class: subclasses set `name` and implement all three methods.'''
        name = None

        def __init__(self):
            raise NotImplementedError

        def validate(self, spec):
            raise NotImplementedError

        def __call__(self):
            raise NotImplementedError


    @Distributions.register
    class Const(Distribution):
        name = 'const'

        def __init__(self, value):
            self.value = value

        def validate(self, spec):
            _validate_value(spec, self.value)

        def __call__(self):
            return self.value


    @Distributions.register
    class Choose(Distribution):
        name = 'choose'

        def __init__(self, choices):
            self.choices = choices

        def validate(self, spec):
            for choice in self.choices:
                _validate_value(spec, choice)

        def __call__(self):
            return random.choice(self.choices)


    @Distributions.register
    class Uniform(Distribution):
        name = 'uniform'

        def __init__(self, vmin, vmax):
            self.min = vmin
            self.max = vmax

        def validate(self, spec):
            _validate_value(spec, self.min)
            _validate_value(spec, self.max)

        def __call__(self):
            return random.uniform(self.min, self.max)


    @Distributions.register
    class Normal(Distribution):
        name = 'normal'

        def __init__(self, mean, std):
            self.mean = mean
            self.std = std

        def validate(self, spec):
            if spec.get('min') is not None or spec.get('max') is not None:
                warnings.warn(
                    'A "normal" distribution tuple for {} can result in '
                    'non-positive values, in which case the distribution will be '
                    're-sampled until a positive value is returned: this can result '
                    'in an infinite loop!'.format(spec['name']),
                    ScaperWarning)

        def __call__(self):
            return random.gauss(self.mean, self.std)


    @Distributions.register
    class Truncnorm(Distribution):
        name = 'truncnorm'

        def __init__(self, mean, std, vmin, vmax):
            self.mean = mean
            self.std = std
            self.min = vmin
            self.max = vmax

        def validate(self, spec):
            _validate_value(spec, self.min)
            _validate_value(spec, self.max)

        def __call__(self):
            # Note: this clips a normal sample to [min, max]; it is not a
            # true truncated-normal sampler.
            x = random.gauss(self.mean, self.std)
            x = max(x, self.min) if self.min is not None else x
            x = min(x, self.max) if self.max is not None else x
            return x


    # default_event_validation_spec = dict(
    #     min=None, max=None,
    #     is_real=None, file_exists=None,
    #     allowed_distributions=None,
    #     allowed_choices=None,
    #     can_be_none=None
    # )

    # NOTE: the keys below still need to be matched to the event spec
    # field names before sample_event_spec can run end to end.
    event_validation_spec = {
        'label': dict(allowed_distributions={'const', 'choose'},
                      allowed_choices=()),
        'source_file': dict(is_file=True),
        'time': dict(min=0),
        'duration': dict(min=0, is_real=True),
        'snr': dict(is_real=True),
        'pitch_shift': dict(can_be_none=True, is_real=True),
        'time_stretch': dict(can_be_none=True, is_real=True, min=0),
    }
    # TODO: figure out how to pass allowed_choices

    # add in name as a field (for error reporting)
    for name, spec in event_validation_spec.items():
        spec['name'] = name


    def sample_event_parameter(name, dist_tuple, **kw):
        # get the validation spec for the event parameter
        spec = dict(event_validation_spec[name], **kw)
        # make sure the distribution is valid for this parameter.
        if 'allowed_distributions' in spec and dist_tuple[0] not in spec['allowed_distributions']:
            raise ScaperError('Invalid distribution {} for parameter {}.'.format(
                dist_tuple[0], spec['name']))
        # create, validate, and sample from the distribution
        dist = Distributions.from_tuple(dist_tuple)
        dist.validate(spec)
        return dist()


    def sample_event_spec(event_spec):
        # map each field name to a value sampled from its distribution tuple
        return {
            name: sample_event_parameter(name, dist_tuple)
            for name, dist_tuple in zip(event_spec._fields, event_spec)
        }
It seems that on Windows Sox has trouble with temporary files due to permissions. Perhaps a parameter can be added to Scaper to set the temporary files folder. Sox has this option with the --temp argument.
WARNING:root:output_file: C:\Users\Martin\AppData\Local\Temp\tmp4q3noewg.wav already exists and will be overwritten on build
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\pydevd.py", line 1599, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\pydevd.py", line 1026, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.2.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "E:/Github/how-noisy/run/create_soundscapes.py", line 84, in <module>
txt_path=txtfile)
File "e:\github\how-noisy\lib\scaper\scaper\core.py", line 1661, in generate
disable_sox_warnings=disable_sox_warnings)
File "e:\github\how-noisy\lib\scaper\scaper\core.py", line 1563, in _generate_audio
tfm.build(e.value['source_file'], tmpfiles[-1].name)
File "E:\Development\Python3\Anaconda3\lib\site-packages\sox\transform.py", line 443, in build
"Stdout: {}\nStderr: {}".format(out, err)
sox.core.SoxError: Stdout:
Stderr: sox FAIL formats: can't open output file `C:\Users\Martin\AppData\Local\Temp\tmp4q3noewg.wav': Permission denied
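For code that creates its temporary files through Python's tempfile module, the folder can already be redirected without a new parameter, because tempfile honours the module-level tempfile.tempdir variable. The sketch below shows this; temp_dir_override is a hypothetical helper, and it only changes where files are created. It would not help if the "Permission denied" error is caused by an open file handle rather than by folder permissions.

```python
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_dir_override(directory):
    """Temporarily make `directory` the default for the tempfile module."""
    previous = tempfile.tempdir
    tempfile.tempdir = directory
    try:
        yield
    finally:
        tempfile.tempdir = previous  # restore the earlier default
```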
For events < ~0.5 seconds the LUFS calculation seems wrong (very low, fixed). Potential solution: if event is shorter than some threshold X, duplicate event (concatenate audio to self) prior to computing LUFS --> seems to give consistent values (tested on longer sounds vs concatenated versions of those sounds).
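The number of copies to concatenate follows directly from the threshold. A small sketch, assuming a threshold of 0.5 seconds as suggested by the observation above; the function name n_tiles_for_lufs is hypothetical.

```python
import math


def n_tiles_for_lufs(duration, min_duration=0.5):
    """Number of copies of an event needed to reach `min_duration` seconds."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Events that are already long enough are used once, unchanged.
    return max(1, math.ceil(min_duration / duration))
```

For example, a 0.2 s event would be tiled 3 times (0.6 s in total), while a 1 s event would be left as a single copy.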
Hello,
When trying to generate a long-duration soundscape (1 hour), I get the error printed at the end of this message.
It seems that a .wav file is created for each background and foreground event, but the size of each one of these files is equal to the total size of the final file (in my case, 330MB for a 1 hour long .wav file). As I understand, this is an artefact of using SoX. Scaper must pad every foreground event to the duration of the full soundscape prior to mixing them all together.
Since there are many events being generated, I eventually run out of disk space (the default location for the temporary wav files is /tmp), and the process crashes (It is possible to specify the directory for the temporary files with the TMPDIR environment variable, but that's not a practical solution if there's insufficient disk space anywhere in the system).
There is probably no quick solution other than generating short soundscapes and then concatenating them externally.
Thank you,
Ohad
Error read as follows:
Traceback (most recent call last):
File "py/gen-monophonic.py", line 115, in <module>
main (len(sys.argv), sys.argv)
File "py/gen-monophonic.py", line 109, in main
txt_path=text_file)
File "/usr/lib/python2.7/site-packages/scaper/core.py", line 1707, in generate
disable_sox_warnings=disable_sox_warnings)
File "/usr/lib/python2.7/site-packages/scaper/core.py", line 1575, in _generate_audio
tmpfiles_internal[-1].name)
File "/usr/lib/python2.7/site-packages/scaper/audio.py", line 102, in get_integrated_lufs
loudness_stats = r128stats(concat_file.name)
File "/usr/lib/python2.7/site-packages/scaper/audio.py", line 52, in r128stats
filepath, e.__str__()))
scaper.scaper_exceptions.ScaperError: Unable to obtain LUFS data for /tmp/tmpMYbHQg.wav, error message:
Unable to find LUFS summary, stats string:
ffmpeg version git-2018-11-01-6a034ad Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-11)
configuration: --prefix=/usr/local --extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib --bindir=/usr/local/bin --extra-libs=-ldl --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libvpx --enable-libfreetype --enable-libspeex
libavutil 56. 21.100 / 56. 21.100
libavcodec 58. 34.100 / 58. 34.100
libavformat 58. 19.102 / 58. 19.102
libavdevice 58. 4.107 / 58. 4.107
libavfilter 7. 39.100 / 7. 39.100
libswscale 5. 2.100 / 5. 2.100
libswresample 3. 2.100 / 3. 2.100
libpostproc 55. 2.100 / 55. 2.100
/tmp/tmpMYbHQg.wav: Invalid data found when processing input
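The disk usage described in this report can be estimated ahead of time. The sketch below assumes, as the report states, one full-length uncompressed PCM file per event, and it ignores WAV headers; the function name and the default sample rate, channel count and bit depth are assumptions.

```python
def temp_disk_usage_bytes(n_events, duration_s, sample_rate=44100,
                          n_channels=1, bit_depth=16):
    """Estimate bytes needed if every event is padded to full length."""
    bytes_per_file = int(duration_s * sample_rate) * n_channels * (bit_depth // 8)
    return n_events * bytes_per_file
```

Under these assumptions a one-hour mono soundscape needs roughly 318 MB per event, so 100 events would need about 32 GB of temporary space.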
generate_from_jams() requires updating observations if a new fg/bg path is provided. Currently this is done by updating the value dict directly, which is a hack because in principle the observation object is meant to be immutable. Solution is to pop the observation to be modified and add a new one with the updated field values.
Right now if a soundscape is "empty" (no background or foreground events) a warning will be issued and no audio will be generated. Ideally we want to synthesize a silent file, but not sure yet whether this can be done in pysox?
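Independently of whether pysox supports it, a silent file can be written with Python's standard wave module. This is a sketch of that alternative, not of how Scaper does or should do it; write_silence and its defaults are hypothetical.

```python
import wave


def write_silence(path, duration_s, sample_rate=44100, n_channels=1,
                  sample_width=2):
    """Write a PCM WAV file containing only silence; return the frame count."""
    n_frames = int(round(duration_s * sample_rate))
    # 8-bit WAV audio is unsigned, so its silence value is 0x80, not 0x00.
    silence_byte = b"\x80" if sample_width == 1 else b"\x00"
    with wave.open(path, "wb") as wf:
        wf.setnchannels(n_channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(silence_byte * (n_frames * n_channels * sample_width))
    return n_frames
```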
Currently, there doesn't appear to be a way to seed scaper with something like random.seed(0) so that it produces the same mixtures given the same random seed and set of source files. Just starting this issue to discuss what changes would need to be done to accomplish it.
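One common pattern for this, sketched below under assumptions, is to give the object its own random number generator created from a seed, and to draw every sample from that generator instead of the global one. SeededSampler is a hypothetical stand-in that only handles three of the distribution tuples used in Scaper; it is not the library's implementation.

```python
import random


class SeededSampler:
    """Sample distribution tuples from a private, seedable generator."""

    def __init__(self, random_state=None):
        # A private generator keeps results independent of other code
        # that uses the global `random` module.
        self.rng = random.Random(random_state)

    def sample(self, dist_tuple):
        name, *args = dist_tuple
        if name == "const":
            return args[0]
        if name == "choose":
            return self.rng.choice(args[0])
        if name == "uniform":
            return self.rng.uniform(*args)
        raise ValueError("Unsupported distribution: {}".format(name))
```

Two samplers created with the same seed return the same sequence of values. For the same mixtures to come out, the list of candidate source files would also need a fixed order (for example by sorting it), since directory listings are not guaranteed to be ordered.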
Thanks for the code, it is amazing.
When creating datasets, I'd like to have similar datasets with only one or two arguments changed.
For now, I manually change the JAMS. However, wouldn't it be better to be able to modify some parameters of the JAMS before recreating the audio?
Or is there another way to do it?
Examples (my use cases):
I create a dataset with a background SNR of 0 dB. I want the same dataset, now with a background SNR of 6 dB, to measure the influence of the background SNR on my application.
I create a dataset with Scaper. I want the same dataset but with the foreground event durations reduced.
A useful feature would be for Scaper to be able to trim and copy over other annotations from the source files on demand.
Under some scenarios the user might want to create soundscapes with different durations from the same scaper object. Right now this is not supported and the soundscape duration must be set during initialization (furthermore, the background duration is set as soon as it's added based on the duration value provided during initialization).
This enhancement involves changing the soundscape duration from an object variable to a function argument (e.g. to generate()) to support changing the soundscape duration on the fly. It would also require changing the duration of all background events on the fly.
This decreases the portability of the jams files. Maybe an audio directory can be defined somewhere, and all audio file locations are in reference to that?
Thoughts?
When I generate audio files from a jams files, the jams file metadata says the duration should be 10 seconds, but the generated audio is 12 seconds. I haven't investigated this much, but it seems like this may be due to the padding that sox uses when applying reverb.
It would be nice if Scaper could provide a way to add custom audio filters to the events and backgrounds. I imagine something like this:
add_event(label, source_file, source_time, event_time, event_duration, transformations)
where transformations is a list of functions or objects that transform the audio signal in sequence:
transformations = [ SomeTransform(), LowPass(4), TimeStretch('uniform', 0.8, 1.2), PitchShift('uniform', -2, 2) ]
What do you think?
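The "in sequence" part of this proposal could look like the sketch below, where each transformation is any callable that takes samples and returns samples. Gain and Reverse are toy placeholders invented for illustration; they are not Scaper classes, and real filters such as the LowPass or PitchShift mentioned above would need an audio backend.

```python
def apply_transformations(samples, transformations):
    """Apply each transformation to the samples, in list order."""
    for transform in transformations:
        samples = transform(samples)
    return samples


class Gain:
    """Toy transformation: multiply every sample by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, samples):
        return [s * self.factor for s in samples]


class Reverse:
    """Toy transformation: play the samples backwards."""

    def __call__(self, samples):
        return list(reversed(samples))
```

Because the only requirement is being callable, plain functions and parameterized objects can be mixed freely in the same list.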
I have a lot of different files for my background.
However, if I have a background file that is shorter than the duration specified for a soundscape, the end of my file has no background noise. Am I doing something wrong?
If not, would it be possible to consider one of these solutions:
LUFS computed from complete source file, but if fg event taken from short segment, LUFS might be different. Proposed solution: apply all transformations (except for adding silence), save into temp file, compute LUFS (might need to concatenate to get at least 1s of audio), and then continue generation process.
Since the audio is trimmed independently of the JAMS, when strict=True the JAMS will have boundary events removed but the audio signal won't. The solution is to regenerate the audio from the trimmed JAMS file when strict=True. For now I'm removing the strict option from scaper.trim, such that the default/only behavior supported for now will be the strict=False behavior (i.e. boundary events will be truncated but not removed).
Scaper can generate the wrong number of audio samples (e.g. 220501 instead of 220500 for a 10 s soundscape) when some of the foreground sound events are time stretched. The issue doesn't always happen and seems to depend on the sampling rate (e.g. it happens for 22050 Hz but not for 44100 Hz on a local example).
Support specifying (distribution driven) MUDA transformations for foreground events (and background? and mix?), which will apply augmentation on the fly to FG source prior to mixing into scene.
Hello!
I've just pushed scaper v1.0.0rc1 to pypi: pip install scaper==1.0.0rc1
This major update supports jams>=0.3.2, uses the scaper namespace instead of sound_event, and drops the dependency on pandas.
@lostanlen @Elizabeth-12324 @bmcfee @pseeth if you have the time it'd be great if you could give this pre-release a quick whirl and let me know if there are any issues before I push v1.0.0 out. Ideally I'd like to release the formal v1.0.0 within a week from today.
Things to note:
scaper namespace.
sound_event to scaper. The file should then load fine.
Any/all feedback is welcome.
Cheers!!
Currently test_generate_from_jams tests the output audio, but not the (optional) output jams file, need to add test.
I've been using a declarative YAML API for one of my projects and I think it could be useful to add support for something like it to the core. Here's some snippets from my config.
# default values
scaper:
  fg_folder: 'data/bg'
  bg_folder: 'data/bg'
  ref_db: -25
  duration: 60.0
  n_soundscapes: 10
  fade_in_len: 0.1
  fade_out_len: 0.1
  bg:
    label: ['const', 'motor_normal']
    source_file: ['choose', []]
    source_time: ['uniform', 0, 900]
    n_events: [1, 1] # uniform sample range. ignored if `events:` has elements
    events: []
  fg:
    label: ['choose', []]
    source_file: ['choose', []]
    event_time: ['truncnorm', 30.0, 5.0, 0.0, 60.0]
    event_duration: ['uniform', 6, 12]
    source_time: ['const', 0]
    snr: ['uniform', 0, 2]
    pitch: ['uniform', -3, 3]
    time_stretch: ['uniform', 0.8, 1.2]
    n_events: [0, 0] # uniform sample range. ignored if `events:` has elements
    events: []
# config for each experiment
# each inherits from `scaper:`
experiments:
  plant_normal:
    duration: 60
    n_soundscapes: 200
    bg_folder: 'bg'
    bg:
      label: ['const', 'plant']
      source_time: ['uniform', 0, 200]
  plant_fault:
    extend: plant_normal
    n_soundscapes: 200
    fg_folder: 'bg'
    fg:
      # n_events: [1, 1]
      label: ['const', 'scraping']
      source_time: ['uniform', 0, 200]
      snr: ['uniform', -40, 10]
      event_duration: ['const', 10]
      pitch:
      time_stretch:
      events: # these inherit from `fg:`
        -
          snr: ['const', -40]
          event_time: ['const', 10]
        -
          snr: ['const', -30]
          event_time: ['const', 30]
        -
          snr: ['const', -20]
          event_time: ['const', 50]
And then the Python API could be something like this?
# load the config experiment
scaper_config = ScaperExperimentConfig(config.SCAPER, config.EXPERIMENTS)
params = scaper_config.load('plant_normal')
# create a scaper object and generate a bunch of soundscapes
sc = Scaper.from_config(params)
sc.generate(..., n_soundscapes=params['n_soundscapes'])
# or n_soundscapes could be handled internally in from_config
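The inheritance described in the config (each experiment inherits from the defaults, and `extend` names a parent experiment) could be resolved with a recursive dictionary merge. The sketch below works on already-parsed dictionaries and makes several assumptions: `deep_merge` and `resolve_experiment` are hypothetical names, an empty key such as `pitch:` is taken to mean `None`, and there is no protection against circular `extend` chains.

```python
import copy

def deep_merge(base, override):
    """Return a copy of `base` with `override` applied, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

def resolve_experiment(defaults, experiments, name):
    """Resolve one experiment: defaults, then its `extend` parent, then itself."""
    spec = dict(experiments[name])
    parent_name = spec.pop("extend", None)
    if parent_name is not None:
        parent = resolve_experiment(defaults, experiments, parent_name)
    else:
        parent = defaults
    return deep_merge(parent, spec)

defaults = {
    "duration": 60.0,
    "n_soundscapes": 10,
    "fg": {"snr": ["uniform", 0, 2], "pitch": ["uniform", -3, 3]},
}
experiments = {
    "plant_normal": {"n_soundscapes": 200, "bg_folder": "bg"},
    "plant_fault": {
        "extend": "plant_normal",
        "fg": {"snr": ["uniform", -40, 10], "pitch": None},
    },
}
params = resolve_experiment(defaults, experiments, "plant_fault")
```

Here `plant_fault` picks up `bg_folder` and `n_soundscapes` from `plant_normal`, keeps `duration` from the defaults, and overrides only the `fg` keys it names.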
We should add any scaper parameters to the constructor. It would be a non-breaking change because you can still set the attributes manually. It's not a huge issue, it's just a bit of a pet peeve because it means you can't pass them using **kwargs.
# currently
sc = scaper.Scaper(duration, fg_folder, bg_folder)
sc.sr = sr
sc.ref_db = ref_db
sc.protected_labels = []
sc.fade_in_len = fade_in_len
sc.fade_out_len = fade_out_len
# but should be this
sc = scaper.Scaper(duration, fg_folder, bg_folder,
                   sr=sr,
                   ref_db=ref_db,
                   protected_labels=[],
                   fade_in_len=fade_in_len,
                   fade_out_len=fade_out_len)
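Until the constructor accepts these parameters, a small wrapper can give the same `**kwargs` ergonomics. This is a sketch under assumptions: `make_with_attrs` is a hypothetical helper, and `FakeScaper` is a stand-in class with placeholder values so the example runs without Scaper installed.

```python
def make_with_attrs(cls, *args, **attrs):
    """Construct cls(*args), then set each keyword as an attribute.

    Unknown attribute names raise, to catch typos such as `ref_bd`.
    """
    obj = cls(*args)
    for name, value in attrs.items():
        if not hasattr(obj, name):
            raise AttributeError(f"{cls.__name__} has no attribute {name!r}")
        setattr(obj, name, value)
    return obj

class FakeScaper:
    """Stand-in for scaper.Scaper; the default values are placeholders."""

    def __init__(self, duration, fg_folder, bg_folder):
        self.duration = duration
        self.fg_folder = fg_folder
        self.bg_folder = bg_folder
        self.sr = 44100
        self.ref_db = -12
        self.protected_labels = []

settings = {"sr": 22050, "ref_db": -25}
sc = make_with_attrs(FakeScaper, 10.0, "fg", "bg", **settings)
```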
Right now each foreground event can have very different (or no) reverberation, which results in a scene that does not sound natural. This could be alleviated by applying a filter to add reverberation (to each foreground event, or to the whole soundscape?).
If no bg is added, several issues may arise, including the sox combiner crashing (list of one file), and potentially other problems (not inspected).
generate_from_jams should have an option to generate audio separated by label for training source separation algorithms. You can probably just hack it by filtering the scaper jams files, so it isn't anything urgent, but I think it would be a valuable feature to add at some point.
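The filtering hack mentioned above amounts to grouping the annotated events by label and rendering each group separately. The sketch below shows only the grouping step on plain dictionaries. It does not use the jams API, and the event layout (`label`, `event_time`, `event_duration` keys) and the name `group_events_by_label` are assumptions for illustration.

```python
from collections import defaultdict

def group_events_by_label(events):
    """Group event dicts by their 'label' key, preserving event order."""
    groups = defaultdict(list)
    for event in events:
        groups[event["label"]].append(event)
    return dict(groups)

events = [
    {"label": "siren", "event_time": 0.2, "event_duration": 4.9},
    {"label": "horn", "event_time": 1.9, "event_duration": 4.0},
    {"label": "siren", "event_time": 5.3, "event_duration": 3.5},
]
by_label = group_events_by_label(events)
```

Each group could then be written out as its own annotation and rendered on its own, yielding one stem per label.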
Need to create documentation entries/examples for:
scaper.trim()
scaper.generate_from_jams()
n_events, polyphony_max and polyphony_gini
reverb, fade_in_len, fade_out_len, allow_repeated_label, allow_repeated_source