Comments (19)
You are correct, that insofar as reading the file is concerned, two dimensional indexing is no advantage to just reading all the channels.
However, I think it is much more convenient and much easier to understand to write
wave = SoundFile('file.wav')[22050:66150,1]
instead of
wave = SoundFile('file.wav')[22050:66150][:,1]
In ac983e5 , I implemented two dimensional indexing simply as syntactic sugar that converts the first expression into the second. The whole numpy array is still read, and the second index is only applied after the whole data was read. Indeed, libsndfile does not provide a way to access single channels anyway, so this is probably the only possible solution.
from python-soundfile.
As for wavread-like functionality, I don't see much difference between
s = wavread('file.wav')
and
s = SoundFile('file.wav')[:]
or
s = SoundFile('file.wav').read()
Do you think wavread
-like functionality is really necessary?
from python-soundfile.
You are correct, that insofar as reading the file is concerned, two dimensional indexing is no advantage to just reading all the channels.
However, I think it is much more convenient and much easier to understand to write
wave = SoundFile('file.wav')[22050:66150,1]
instead of
wave = SoundFile('file.wav')[22050:66150][:,1]
True. But what about this:
import PySoundFile as sf
wave = sf.read('file.wav')[22050:66150,1]
In this hypothetical case, read()
is a function that returns the
whole file as NumPy array.
It would internally open the file, read from it and close the file.
Or what about this:
wave = sf.read('file.wav', start=22050, stop=66150)[:, 1]
This wouldn't be the most common use, though, the most common would be:
wave = sf.read('file.wav')
... which in my opinion easier to grasp then
wave = SoundFile('file.wav')[:]
or
wave = SoundFile('file.wav').read()
And, just to be clear, I don't suggest to abandon the class
SoundFile
entirely (just the indexing feature).
In more special cases of course the SoundFile
object can be used
directly, ideally within a context manager.
In ac983e5 , I implemented two dimensional indexing simply as syntactic sugar that converts the first expression into the second. The whole numpy array is still read, and the second index is only applied after the whole data was read. Indeed, libsndfile does not provide a way to access single channels anyway, so this is probably the only possible solution.
I guess this is better than before, because only allowing
one-dimensional indexing seemed odd.
But not caching the data also seems strange, especially if someone
wants to read 2 (or more) channels separately.
Each time a full np.ndarray
with all channels will be created and a
view for one channel will be returned.
If I'm not mistaken, the whole array will be kept in memory (via the
wave.base
reference) until the returned view is destroyed.
That would mean, for example, if I load 2 separate channels from a
7-channel file, 14 channels would be kept im memory, right?
So probably if a subset of channels is selected they should be copied
to a new (and smaller) array?
Which again would incur a performance overhead and additional memory
allocation which also would be very bad.
All in all, regardless of the chosen syntax, I think it is impossible
to get that right, so selecting individual channels should be left to
the user.
from python-soundfile.
I think I finally get your point. Just to make sure: You are saying that indexing a SoundFile object just like a Numpy array is unintuitive, and doing the same on a Numpy array would be better.
I can agree with that. However, I think indexing the SoundFile object is extremely convenient and one of the core features of PySoundFile. Also, as I said, there is no way to read single channels in portaudio, so we'll have to live with the inefficiencies this affords us.
That said, we could easily allocate a new array instead of returning the view. Did you really run into that problem though? Because I have quite a few friends who work with audio stuff and this is the least of their worries.
from python-soundfile.
Also, I think there is some confusion regarding the lifetime of SoundFile objects:
wave = SoundFile('file.wav')[:]
wave = SoundFile('file.wav').read()
are equivalent, and both SoundFile objects will deallocate right on that very line. They will not stay in memory beyond that and the file will not be kept open. I see no reason at all to change this to
wave = wavread()
If you really want wavread
, it's literally just lambda x: SoundFile(x).read()
.
from python-soundfile.
In order to be able to give examples for my answer here I tried the two-dimensional feature and promptly found a bug. In line 445 instead of
if second_frame:
you should write
if second_frame is not None:
Otherwise, whenever you request the first channel (second_frame == 0
), you'll get the whole array.
This is not a biggie, bugs happen all the time, but it illustrates my point: You are implementing stuff that's already implemented in NumPy!
I deliberately didn't open a new issue for that because I think the best fix is to remove the code completely.
About indexing being intuitive:
I do think indexing would be intuitive, but only if it is two-dimensional and only if it's done right.
And, as I mentioned above, I believe two-dimensional indexing cannot be done right (because of caching and copying issues which cause CPU and/or memory overhead).
Indexing on a NumPy array wouldn't be more intuitive, but it would be much easier for people to get what they expect. If they use indexing on a NumPy array, they know what they get (a view, a copy, ...), correctly indexing a SoundFile
object would be something they have to learn.
About indexing being convenient:
It doesn't matter if it is convenient or if it is a core feature.
As long as it cannot be implemented right (which it cannot, as I stated above), it shouldn't be implemented.
About returning new memory:
This would only in a few cases save memory, in others it would waste memory, in all cases it would waste CPU.
I didn't run into problems regarding memory usage, but I'm convinced users can handle their memory requirements better if they are able to use the ndarray
s they are used to instead of indexing into a SoundFile
object which they might not be very familiar with.
It is impossible with libsndfile to read single channels therefore PySoundFile shouldn't claim to do it either.
Regarding lifetime:
Of course the SoundFile
objects in your example don't persist after the line they're on, but the returned array does.
And if it is a view, it still needs the whole original array (which was called np_data
in read()
) and therefore the original CFFI buffer stays in memory (and can be accessed with wave.base
in your example).
If you read two channels from the same SoundFile
object (but indexing two times separately), each time a new CFFI buffer (holding all channels) will be created.
Example (with my strange test file, 15 frames, 7 channels):
from pysoundfile import SoundFile
with SoundFile('data/test_wav_pcm16.wav') as f:
sig1 = f[:, 2]
sig2 = f[:, 5]
sig1.size, sig2.size, sig1.base.size, sig2.base.size
>>> (15, 15, 105, 105)
The two buffers hold the same data (the whole file contents) but at separate memory locations:
all(sig1.base == sig2.base)
>>> True
sig1.base is sig2.base
>>> False
If PySoundFile would never return single channels (which I strongly suggest), this can still happen but then It's the user's fault.
Regarding the necessity of a wavread
-like functionality:
Yes! I really want (even need) such a thing!
I'm trying to convert people from using Matlab to using Python+NumPy+... and every obstacle will make it harder for them and therefore for me.
It might seem like a small thing but it is hard to explain why it is necessary to append [:]
or .read()
to the constructor and why there isn't a simpler way to do that (using a function).
Apart from that, reading and writing audio files is also very common in interactive sessions, and saving a few keystrokes (especially brackets and stuff that cannot be auto-completed) can make a big difference.
Of course it would be easy for me to implement it locally, but regardless how trivial it may be, I'll have to do this for each project, for each example I show someone, each time I quickly want to try something, ...
It would make my life (and that of my colleagues) easier if such convenience functions were included in PySoundFile.
Don't get me wrong, I definitely don't want a Matlab-clone, I just want a clear and easy high-level interface (in addition to a fully-fledged low-level interface).
In closing:
I will implement said convenience functions anyway and you can decide after I show them to you if you like them or not.
from python-soundfile.
Thank you very much for pointing out that bug! Quite a nasty one, that. I fondly remember working with Lua, where only false
and nil
evaluated to False
. That makes so much more sense...
from python-soundfile.
Regarding wavread
, I concede your point. It should not be called wavread
, though, since it can read so much more than wav
s. How about soundread
, audioread
, or just simply read
?
from pysoundfile import read as wavread
wavread('file.wav')
import pysoundfile as sf
sf.read('file.wav')
Looks good to me.
from python-soundfile.
Regarding indexing:
I understand everything you are saying, but I really want that feature. As you rightly point out, there are many issues with indexing in general, and two-dimensional indexing in particular, but that feature is just incredibly powerful for my particular use cases. It is implemented as a simple wrapper over read
, seek
and indexing the Numpy result, and those are the very steps I would have to take manually every time, otherwise.
In the end, I think this comes down to the same point you were making about wavread
: The functionality is trivial and could be easily implemented locally. But it saves a few key strokes and it makes things easier to explain. And just like wavread
is crucial for you, so is indexing for me. I think PySoundFile can accomodate both use cases, and make both of us happy.
from python-soundfile.
Responding as per your Email, Basti, I think I agree on all points here: a trivial read function is quite frankly practical, and if it's so simple why not include it? But the indexing is also quite practical, if, as it turns out, somewhat problematic. However, if I wanted to index separate non-neighbouring channels I would instinctively just get the entire numpy array and index from that, as opposed to the example given by mgeier. Put differently: I don't think I would use two-dimensional indexing on a SoundFile
object to do more than [a:b, c:d]
style indexing. Anything fancier requires a proper numpy array anyway, e.g., the appropriately termed "fancy indexing" (although, since you pass second_frame
to the numpy array, I believe you could do something like SoundFile(...)[(a, b, c), :]
to select separate channels).
from python-soundfile.
I guess both features (SoundFile
indexing and a high-level read()
function) don't interfere with each other, so it shouldn't be a problem for both to coexist.
@bastibe: just out of curiosity ... can you give an example where indexing into a SoundFile
is more powerful/easy/efficient/beautiful/... than, say, indexing into a NumPy array after it has been returned from a (still hypothetical) read()
function?
from python-soundfile.
@mgeier Basically, any time I want to access only part of a file:
with SoundFile as s:
data = s[44100:88200]
This will only read the second second of the file, without reading the second before or anything after.
You could also write it like this:
with SoundFile as s:
s.seek(44100)
data = s.read(44100)
But I think the first example is much cleaner and easier to understand. I had one application that was churning through some 3000 audio files on 8 simultaneus threads. This application was wicked fast and used next to no memory, mainly because PySoundFile allowed us to only load a tiny part of a file at a time and never having to do big allocations. Also, the algorithm could only ever process 1024 samples at a time anyway, so loading more than that would have been a waste.
On a different note, read
, seek
and write
work in relation to an invisible current read pointer. Consecutiive calls to read
seek
and write
will have different effects depending on that invisible pointer. I think that is very confusing. You can't just pass the SoundFile object around and expect random functions to read
and write
with any kind of confidence.
Indexing on the other hand is a deterministic operation that will always do the same thing. You can pass the SoundFile object into some random function and have it play back the file, while simultaneusly doing some signal processing with it on the main thread. This works even for very large files that you wouldn't want to load in their entirity.
On the other hand, there is a whole class of algorithms that just wants to step through a file one block after another. For that, read
, write
and seek
works perfectly: I think both APIs have their merit and use cases. I have used both techniques in different contexts to great effect.
from python-soundfile.
@bastibe: thanks for the examples
To make my counter-examples a bit less hypothetical, I implemented the suggested features in #18.
If you only want to read a certain part of a file, I would suggest the high-level function read()
:
import pysoundfile as sf
myarray = sf.read('myfile.wav', start=44100, stop=88200)
or
myarray = sf.read('myfile.wav', frames=44100, start=44100)
I think that's clearer (more explicit) and more flexible (either frames
or stop
can be specified) than your indexing example.
Admittedly, this (not explicitly closing the file on read shouldn't be a problem):
myarray = sf.open('myfile.wav')[44100:88200]
... is a little less typing than
myarray = sf.read('myfile.wav', start=44100, stop=88200)
... but if you want to avoid some typing, you can use positional parameters:
myarray = sf.read('myfile.wav',44100,44100)
... which is even one character less than the indexing solution, although I wouldn't recommend that outside of an interactive session because it will be confusing what the different numbers actually mean.
As for your multi-threading example: I suspect that it is totally not safe to access the same SoundFile
object from different threads (even for reading) exactly for the reason of the read/write pointer.
It's true that you internally re-set the pointer after the indexing operation, but the whole operation is totally not atomic. Correct me if I'm wrong, I don't know much about Python's threading mechanism.
If I'm not mistaken, your example is unsafe and therefore your point is moot.
If you are only reading, it should be perfectly safe to open the same file twice (into two different SoundFile
objects), which I would suggest in your multi-threading scenario.
If one of the threads is writing, much more care has to be taken anyway.
from python-soundfile.
Oh my, I finally get what you have been saying all along. All this time, you were talking about the read
function, and I was thinking you were talking about the read
method. See, I didn't get why you would want to introduce the keyword arguments for the method. For the function, they make a lot of sense! You are right, I am wrong. Sorry about that.
from python-soundfile.
Yes, and my examples open and close the file each time. If you want to read the file in 1024-frame-chunks, you can just use the read()
method repeatedly (s.read(1024)
) as you mentioned.
You still have to convince me that there is a situation where you want to jump around in an open file where read()
and seek_*()
is not enough and you really need indexing.
from python-soundfile.
When doing spectral processing, I frequently want to read consecutive, overlapping blocks of data. So the first block might be [0:256]
, the next [128:384]
, then [256:512]
and so on. You can implement this using read
and seek
like this:
while seek(0) < len(f)-256:
data = f.read(256)
f.seek(-128)
I think it is much cleaner to do something like
for idx in range(0, len(f)-256, 128):
data = f[idx:idx+256]
from python-soundfile.
Reading overlapping sections is indeed an interesting example!
I agree that you second example is better than the first, but what about this:
for idx in range(0, len(f) - 256, 128):
f.seek_absolute(idx)
data = f.read(256)
Sure, this is one more line, but I think it would be more explicit, less obscure and therefore easier to read and maintain.
from python-soundfile.
I disagree. I think indexing is much easier to understand than seek
and read
. Sorry.
from python-soundfile.
Fair enough.
I'm closing this issue, the actual changes were done by @bastibe in ac983e5.
from python-soundfile.
Related Issues (20)
- Feature request: Support interleaved stereo data HOT 4
- buffer_read doesn't work in basic test HOT 2
- Soundfile read/write wav is not symmetric with default arguments HOT 3
- module 'soundfile' has no attribute 'SoundFileRuntimeError' HOT 5
- Value Error: array is too big when loading .flac file
- soundfile crashes when trying to export long .ogg file HOT 1
- soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x0000022B81C21850>: Format not recognised. HOT 2
- Cannot suppress warnings when decoding mp3
- Cannot open Ogg FLAC file: unknown error in flac decoder HOT 1
- miniforge3/lib/python3.9/site-packages/soundfile.py", line 1229, in _init_virtual_io def vio_get_filelen(user_data): MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks
- Cannot write Ogg files over 96 seconds long at 44.1 kHz HOT 4
- soundfile.info does not work on mp3 HOT 2
- soundfile.LibsndfileError: <exception str() failed> HOT 6
- How to save left and right channel in separate files? HOT 1
- soundfile.LibsndfileError: Error : unknown error in flac decoder. HOT 2
- "Format not recognised." for m4a files HOT 2
- soundfile.LibsndfileError: Error opening <tempfile._TemporaryFileWrapper object at 0x7feadae59070>: Format not recognised. HOT 2
- Part of mp3 is loaded. HOT 2
- unknown error in flac decoder HOT 2
- Capture libmpg123/libsndfile warnings HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from python-soundfile.