Comments (13)
Interesting idea. On the one hand, (frames,)
is certainly easier to work with for mono signals. On the other hand, this would lead to different indexing styles depending on the number of channels in the signal. For this reason alone, I think it is more consistent to always use (frames, channels)
, even if channels
is 1
in some cases.
if signal.ndim == 1:
first_sample = signal[0]
else:
first_sample = signal[0, 1]
# vs
first_sample = signal[0, 1]
It also enables code like mono_signal = signal.sum(axis=1)
to work regardless of the number of channels.
Also, this makes determining the number of channels easier, in particular when reading the whole file on one big matrix:
signal = sf.read()
channels = signal.shape[1]
# vs
signal = sf.read()
channels = 1 if signal.ndim == 1 else signal.shape[1]
from python-soundfile.
Is this still up for discussion or should we close this?
from python-soundfile.
Please keep it open ...
I still think that mono files should be returned as one-dimensional arrays.
And as input, one-dimensional arrays should be supported anyway.
I was trying to find some examples within NumPy, but I still don't have really convincing arguments ...
np.arange(42)
returns a one-dimensional array.
Incidentally, your example signal.sum(axis=1)
also returns a one-dimensional array!
Are there any counter-examples?
My original argument was indexing, but in the meantime I found out there are two ways to do it: a[:, 0]
returns the first column as one-dimensional array, a[:, 0:1]
returns the same, but as two-dimensional array with one column.
Many functions in NumPy have a axis
argument, probably this should be also added to the read()
and write()
methods.
Regarding your examples:
I would check for one-dimensionality and reshape if needed. After that, there are no special cases anymore:
channels = np.asarray(channels)
# check if ndim < 1 or > 2?
if channels.ndim == 1:
channels.reshape(-1, 1)
# now continue as usual
channels = signal.shape[1]
Of course, an axis
argument would make this a bit more complicated.
from python-soundfile.
The point of signal.sum(axis=1)
is to return a one-dimensional array. For example signal.mean(axis=1)
can be used to downmix any signal to a mono array regardless of the number of channels. If we returned an array, the call would lead to an error.
I think it is very valuable that we return a predictable, consistent results. Returning different data types in different situations is confusing and might lead to errors. Thus, I prefer always returning a matrix to sometimes-a-matrix, sometimes-an-array.
I am not sure about the axis
argument. In the context of pysoundfile, channel
might make more sense. Of course, that argument would suffer from the very same issues that two dimensional indexing suffers from. But you are right, if we are to embrace those issues anyway, we might as well provide the same functionality for both indexing and read
.
from python-soundfile.
The point of
signal.sum(axis=1)
is to return a one-dimensional array. For examplesignal.mean(axis=1)
can be used to downmix any signal to a mono array regardless of the number of channels. If we returned an array, the call would lead to an error.
I'm not getting your point here ... the "mono array" is a one-dimensional array ... so should SoundFile.read()
also return a one-dimensional array for mono files (as I'm suggesting) or not?
I think it is very valuable that we return a predictable, consistent results.
I agree.
Returning different data types in different situations is confusing and might lead to errors.
But if NumPy itself returns arrays with different dimensionality in different situations, this is what a user would expect, isn't it?
Thus, I prefer always returning a matrix to sometimes-a-matrix, sometimes-an-array.
With "matrix" I guess you mean a 2-dimensional np.ndarray
and not a np.matrix
, right?
I still think there is nothing wrong with returning a 2-dimensional array for multi-channel files and a 1-dimensional array for mono files.
I am not sure about the
axis
argument. In the context of pysoundfile,channel
might make more sense.
What about channelaxis
?
Of course, that argument would suffer from the very same issues that two dimensional indexing suffers from. But you are right, if we are to embrace those issues anyway, we might as well provide the same functionality for both indexing and read.
I don't see the link to 2-dimensional indexing, can you please elaborate?
from python-soundfile.
All I'm saying is that if I have a piece of code that expects a matrix (meaning an np.ndarray
with two dimensions), very often that code will not work with a vector (meaning an np.ndarray
with one dimension). Thus, I think that always returning a matrix is preferable to returning sometimes a vector, sometimes a matrix.
from python-soundfile.
I still couldn't find any really convincing argument for either standpoint ...
But we could at least make the behavior explicit in the function signatures.
I suggest an argument channelaxis
with default value 0
which would mean the channels are stored in the columns of the arrays. This would also make sense for one-dimensional arrays where the only axis is 0
anyway.
An additional keyword argument force_2d
should be used to explicitly state if mono files are returned as one-dimensional arrays.
If we use force_2d=True
as default value, the default behavior wouldn't change at all, but I think it would be much clearer what happens with mono files and if someone needs it, they can be returned as one-dimensional arrays with force_2d=False
.
I think this would be a reasonable solution for this issue, but probably the names of my proposed arguments could be improved ...
There is still an annoying corner case when returning only one frame of a multi-channel file ... in this case even I would return a two-dimensional array, although I don't really know what the consistent behavior would be ... anyway, this should be extremely rare ...
from python-soundfile.
I am not sure about channelaxis
. I think the canonical term for matrices would be row-major and column-major, which would probably translate to channel-major and frame-major. In audio terms, frame-major is more often called interleaved and it is pretty ubiquitous. At the very least, we should change the name to one of these, though I don't know which one.
Given that interleaved data is so ubiquitous and that setting channelaxis
is more complicated than appending .T
to the call, I don't really see the value of adding it as a separate argument at all though.
Your idea about force_2d
is a very good one (although I would rather call it always_2d
). I agree that this is a very reasonable solution.
The single-frame-of-multi-channel is a great example! It neatly shows why returning a frames x channels matrix is better than returning an np.ndarray
with the minimal amount of dimensions. That frame would simply be returned as [[left, right]]
.
from python-soundfile.
I like the name always_2d
, I was thinking about names with "mono" in it, but I couldn't come up with anything useful.
Of course, channelaxis
could be replaced by appending .T
, but this wouldn't help making PySoundFile's behavior explicit. A user would have to know the behavior or look it up somewhere deep in the documentation.
It is an arbitrary choice that channels are represented as columns, I would like to make this choice explicit by the argument channelaxis=0
. And if someone prefers, they might as well represent channels as rows with channelaxis=1
(or channelaxis=-1
).
The alternative names mentioned above don't work because they all talk about memory order, but channelaxis
doesn't say anything about that.
Of course channelaxis=0
is more efficient with C-contiguous arrays and channelaxis=1
with Fortran-contiguous data, because in those cases the data is already interleaved in memory and can be handed to libsndfile without additional copy.
I was thinking about the argument axis
to fft()
and sum()
, but they have slightly different behavior and default values. And they are meant to also work with more than 2 dimensions, which we will never support.
So maybe an integer argument isn't the right thing here, a boolean argument should provide all options we need (exactly two).
But how to call it?
An argument channels_are_stored_in_columns=True
wouldn't work, because (except from being long and ugly) it doesn't make sense for one-dimensional arrays (which I still would like to support).
from python-soundfile.
You are right. We should advertise matrix orientation and not memory order. How about channel_major=True
then? Or even major_axis=channel
? Sometimes I wish Python had something like rubies :constants
or lisp 'symbols
.
from python-soundfile.
Well, since version 3.4, Python has Enum
s, which doesn't help at all if we want to support 2.6+ ... but in any case, we would still need a name for the argument.
I think everything with "major" in it suggests memory order. In channel_major
I cannot read any information about columns/rows, while in major_axis
I don't see the connection between axes and channels.
I think channels_in_second_axis=False
(or channels_in_first_axis=True
) would mean the exactly right thing, but probably it sounds too cryptic.
This would have the advantage that channels_in_second_axis=True
implies that we have to be dealing with a 2-dimensional array.
This could be shortened to channels_first=True
, but I guess this would be even harder to understand.
More suggestions?
from python-soundfile.
I'm closing this.
We decided to drop the axis
thing because we couldn't find a good name and it would have needed a lot of additional code with very questionable overall benefit.
The original question about mono files was resolved in #36 with always_2d
.
from python-soundfile.
Just for the record:
I stumbled upon an alternative name for always_2d
: numpy.loadtxt() uses ndmin
.
Our default would be ndmin=2
.
I still like always_2d
better, I'm mentioning this here only for completeness' sake.
They also have an alternative for the suggested option channelaxis
: they use unpack
, which is supposed to transpose the array and allow tuple unpacking to get the individual "channels".
Anyway, I don't think this would be helpful in our case.
Again, just for the record ...
from python-soundfile.
Related Issues (20)
- numpy is a dependency, but this is not declared HOT 3
- Feature request: Support interleaved stereo data HOT 4
- buffer_read doesn't work in basic test HOT 2
- Soundfile read/write wav is not symmetric with default arguments HOT 3
- module 'soundfile' has no attribute 'SoundFileRuntimeError' HOT 5
- Value Error: array is too big when loading .flac file
- soundfile crashes when trying to export long .ogg file HOT 1
- soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x0000022B81C21850>: Format not recognised. HOT 2
- Cannot suppress warnings when decoding mp3
- Cannot open Ogg FLAC file: unknown error in flac decoder HOT 1
- miniforge3/lib/python3.9/site-packages/soundfile.py", line 1229, in _init_virtual_io def vio_get_filelen(user_data): MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks
- Cannot write Ogg files over 96 seconds long at 44.1 kHz HOT 4
- soundfile.info does not work on mp3 HOT 2
- soundfile.LibsndfileError: <exception str() failed> HOT 6
- How to save left and right channel in separate files? HOT 1
- soundfile.LibsndfileError: Error : unknown error in flac decoder. HOT 2
- "Format not recognised." for m4a files HOT 2
- soundfile.LibsndfileError: Error opening <tempfile._TemporaryFileWrapper object at 0x7feadae59070>: Format not recognised. HOT 2
- Part of mp3 is loaded. HOT 2
- unknown error in flac decoder HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from python-soundfile.