iluvcapra / wavinfo

Probe WAVE Files for all metadata

Home Page: https://wavinfo.readthedocs.io/

License: MIT License

Python 83.26% Roff 16.74%

filmmaking audio-library wav python3 audio-applications metadata-extraction audio metadata

wavinfo's Introduction

wavinfo

See the note below about version 3.

The wavinfo package allows you to probe WAVE and RF64/WAVE files and extract extended metadata. wavinfo has an emphasis on film, video and professional music production but aspires to be the encyclopedic and final source for all WAVE file metadata.

Metadata Support

wavinfo reads:

- All defined Broadcast-WAVE fields, including embedded program loudness, coding history and SMPTE UMID.
- iXML production recorder metadata, including project, scene, and take tags, recorder notes and file family information.
  - iXML STEINBERG sound library attributes.
- All known RIFF INFO metadata fields.
- Audio Definition Model (ADM) track metadata and schema, including channel, pack formats, object, content and programme, including Dolby Digital Plus and Dolby Atmos dbmd metadata for re-renders and mixdowns.
- Wave embedded cue markers, cue marker labels, notes and timed ranges as used by Zoom, iZotope RX, etc.
- The wav format is also parsed, so you can access the basic sample rate and channel count information.

How To Use

The entry point for wavinfo is the WavInfoReader class.

from wavinfo import WavInfoReader

path = '../tests/test_files/A101_1.WAV'

info = WavInfoReader(path)

adm_metadata = info.adm
ixml_metadata = info.ixml

The package also installs a shell command:

$ wavinfo test_files/A101_1.WAV

Version 3 Coming Soon!

Version 3 is under active development and will be released in the near future. Version 3 will support editing of Broadcast-WAVE and INFO metadata, with more formats to be added.

There will be some minor breaking changes to the interface, which is why I'm bumping to version 3. These will be documented and should be easy to update for.

Contributions!

If you find any new or different kind of metadata, or encounter any new or different use of existing metadata, please submit an Issue or Pull Request!

Other Resources

For other file formats and ID3 decoding, look at audio-metadata.


wavinfo's Issues

Support RF64 files

See #2

Add __str__ methods to Classes

When I call print(obj), where obj is a WavInfoChunkReader or WavInfoReader, the output should be more informative. One option would be to print a dictionary of the object's attributes.
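
The dictionary-style output requested here could be sketched with a small mixin. This is only an illustration: `AttrReprMixin` and `FakeFormat` are hypothetical names, not classes from wavinfo, and the field names are placeholders.

```python
class AttrReprMixin:
    """Hypothetical mixin: render public instance attributes as a dict."""

    def _public_attrs(self):
        # Skip private attributes (those with a leading underscore).
        return {k: v for k, v in vars(self).items() if not k.startswith("_")}

    def __repr__(self):
        # print(obj) falls back to __repr__ when __str__ is not defined.
        return "{}({!r})".format(type(self).__name__, self._public_attrs())


class FakeFormat(AttrReprMixin):
    """Stand-in for a reader class; the field names are illustrative only."""

    def __init__(self, channel_count, bits_per_sample):
        self.channel_count = channel_count
        self.bits_per_sample = bits_per_sample


print(FakeFormat(2, 24))
# FakeFormat({'channel_count': 2, 'bits_per_sample': 24})
```

Defining only `__repr__` keeps the interactive prompt and `print()` consistent with each other.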

Wavinfo doesn't work with filehandles

I have a sound file in memory which I received over a socket as a bytestring: data. By calling io.BytesIO(data) I can convert it to a file-handle-like stream object.

I would like to pass this to wavinfo like this:

soundbytes = io.BytesIO(data)
wav_info = WavInfoReader(soundbytes)

and then process the output:

info_dict = {"channels": wav_info.fmt.channel_count,
             "bits_per_sample": wav_info.fmt.bits_per_sample,
             "size_in_frames": wav_info.data.frame_count,
             "size_in_bytes": wav_info.data.byte_count,
            }

The changes needed to make this possible are modest and leave the filename-based functionality untouched; see pull request #11.

While working on this, it turned out that the __repr__ function was broken because it mixed old-style and new-style string formatting. The pull request changes it to use new-style formatting throughout.
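
For illustration, accepting either a path or a stream can be sketched as below. This is not the code from the pull request; `open_binary` and `read_riff_header` are hypothetical helpers that only read the 12-byte RIFF header.

```python
import io
import struct
from contextlib import contextmanager


@contextmanager
def open_binary(source):
    """Yield a binary stream for either a filesystem path or a file-like object.

    Hypothetical helper, not wavinfo's actual implementation.
    """
    if hasattr(source, "read"):
        # Already a stream (e.g. io.BytesIO): use it and leave closing to the caller.
        yield source
    else:
        with open(source, "rb") as f:
            yield f


def read_riff_header(source):
    """Return (chunk id, declared size, form type) from the first 12 bytes."""
    with open_binary(source) as stream:
        header = stream.read(12)
    if len(header) < 12:
        raise ValueError("not enough data for a RIFF header")
    return struct.unpack("<4sI4s", header)


# An in-memory "file" holding only a minimal RIFF/WAVE header.
data = b"RIFF" + struct.pack("<I", 4) + b"WAVE"
print(read_riff_header(io.BytesIO(data)))
# (b'RIFF', 4, b'WAVE')
```

Checking for a `read` attribute rather than a specific type lets sockets wrapped in file objects, `io.BytesIO`, and ordinary open files all pass through the same code path.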

IXML on Zoom F8

I came across your project whilst looking for ways to read iXML data from WAVs for a task I need to do. It's really good. However, I tried it on some files recorded with a ZOOM F8 multitrack recorder and it always raises an exception while parsing the XML. I ran the tests that came with the project and they all pass, but it still fails with these files, as below:

Traceback (most recent call last):
  File "/Users/declan/git/WAV/wavinfo/wavinfo/wave_reader.py", line 162, in <module>
    info = WavInfoReader(path)
  File "/Users/declan/git/WAV/wavinfo/wavinfo/wave_reader.py", line 62, in __init__
    self.ixml = self._get_ixml(f)
  File "/Users/declan/git/WAV/wavinfo/wavinfo/wave_reader.py", line 141, in _get_ixml
    return WavIXMLFormat(ixml_string)
  File "/Users/declan/git/WAV/wavinfo/wavinfo/wave_ixml_reader.py", line 15, in __init__
    self.parsed = ET.parse(xmlBytes)
  File "/Users/declan/pybuild/py27/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/Users/declan/pybuild/py27/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/Users/declan/pybuild/py27/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/Users/declan/pybuild/py27/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 66, column 9

On further inspection, it seems the Zoom recorder always writes the full 5226 bytes allocated for the chunk and pads the unused space with zero bytes, so I added the following line to wave_reader.py at line 138:

ixml_string = ixml_data[0:ixml_data.find(chr(0))]

I'm not sure what the consequences of this are, but it seems to work and doesn't break the test cases.
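
One consequence worth noting: when the data contains no zero byte, `find()` returns -1 and the slice silently drops the last character. A sketch of a variant that avoids this on Python 3 bytes follows; `trim_nul_padding` is a hypothetical helper name and the XML payload is made up for the example.

```python
import xml.etree.ElementTree as ET


def trim_nul_padding(payload: bytes) -> bytes:
    """Return the payload up to (not including) the first NUL byte."""
    # partition() returns the whole payload unchanged when no NUL is present,
    # unlike slicing with find(), which returns -1 and would drop the last byte.
    return payload.partition(b"\x00")[0]


# Illustrative chunk: a tiny iXML-like document followed by zero padding.
padded = b"<BWFXML><SCENE>12A</SCENE></BWFXML>" + b"\x00" * 16
root = ET.fromstring(trim_nul_padding(padded))
print(root.find("SCENE").text)
# 12A
```

This assumes an 8-bit encoding such as UTF-8. A UTF-16 payload contains NUL bytes as part of its text, so it would need to be decoded before trimming.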

I’m also adding in some more properties to extract track details such as channel number and name.

Let me know if this is of interest and whether I’m on the right track & if you want me to formally add the changes to a branch and add new test cases etc.

Broadcast wav files throw errors

Hi, I am encountering various errors when trying to open broadcast WAV files created with the professional DAW systems Pyramix and Sequoia.

Sequoia uses the RF64 header and calling WavInfoReader on these simply leads to:

Traceback (most recent call last):
  File "C:/Users/xxx/PycharmProjects/test/scratch.py", line 3, in <module>
    info = WavInfoReader("source files/seq_test.wav")
  File "C:\Users\xxx\PycharmProjects\test\venv\lib\site-packages\wavinfo\wave_reader.py", line 41, in __init__
    self.main_list = chunks.children
AttributeError: 'ChunkDescriptor' object has no attribute 'children'

Pyramix, on the other hand, seems to write only the standard RIFF header. Nevertheless, trying to parse one of its files throws:

Traceback (most recent call last):
  File "C:/Users/xxx/PycharmProjects/test/scratch.py", line 3, in <module>
    info = WavInfoReader("source files/test.wav")
  File "C:\Users\xxx\PycharmProjects\test\venv\lib\site-packages\wavinfo\wave_reader.py", line 51, in __init__
    self.ixml   = self._get_ixml(f)
  File "C:\Users\xxx\PycharmProjects\test\venv\lib\site-packages\wavinfo\wave_reader.py", line 129, in _get_ixml
    return WavIXMLFormat(ixml_string)
  File "C:\Users\xxx\PycharmProjects\test\venv\lib\site-packages\wavinfo\wave_ixml_reader.py", line 11, in __init__
    self.parsed = ET.parse(xmlBytes)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\xml\etree\ElementTree.py", line 1197, in parse
    tree.parse(source, parser)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\xml\etree\ElementTree.py", line 598, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 17, column 9

WAV files created by Pro Tools, however, seem to be interpreted just fine.

Incompatible dependencies with PyTest

First of all, thanks for this little tool! It proves very useful in our projects.

However, we now have a dependency conflict; see the error report from Poetry below:

  SolverProblemError

  Because no versions of ear match <1.0.0 || >1.0.0,<1.0.1 || >1.0.1,<1.1.0 || >1.1.0,<1.1.1 || >1.1.1,<1.1.2 || >1.1.2,<1.2.0 || >1.2.0,<2.0.0 || >2.0.0
   and ear (1.0.0) depends on attrs (>=17.4,<18.0), ear (<1.0.1 || >1.0.1,<1.1.0 || >1.1.0,<1.1.1 || >1.1.1,<1.1.2 || >1.1.2,<1.2.0 || >1.2.0,<2.0.0 || >2.0.0) requires attrs (>=17.4,<18.0).
  And because ear (1.0.1) depends on attrs (>=17.4,<18.0)
   and ear (1.1.0) depends on attrs (>=17.4,<18.0), ear (<1.1.1 || >1.1.1,<1.1.2 || >1.1.2,<1.2.0 || >1.2.0,<2.0.0 || >2.0.0) requires attrs (>=17.4,<18.0).
  And because ear (1.1.1) depends on attrs (>=17.4,<18.0)
   and ear (1.1.2) depends on attrs (>=17.4,<18.0), ear (<1.2.0 || >1.2.0,<2.0.0 || >2.0.0) requires attrs (>=17.4,<18.0).
  And because ear (1.2.0) depends on attrs (>=17.4,<18.0)
   and ear (2.0.0) depends on attrs (>=17.4,<18.0), every version of ear requires attrs (>=17.4,<18.0).
  And because wavinfo (1.6.3) depends on ear (*)
   and pytest (6.2.0) depends on attrs (>=19.2.0), wavinfo (1.6.3) is incompatible with pytest (6.2.0).
  So, because MyProj depends on both pytest (6.2.0) and wavinfo (1.6.3), version solving failed.

I understand that it's mainly because ear has not been updated for a while.
But can we do something here?

Thanks again!

struct.error: unpack requires a buffer of 4 bytes

Hi.

Maybe this is a bug ?
I have some files that throw this error.

Traceback (most recent call last):
  File "C:/Users/Tommi/PycharmProjects/netmix/FileScanner/filescanner.py", line 19, in <module>
    info = WavInfoReader("lyde\DRCD-04_02_01.wav", bext_encoding="latin1")
  File "C:\Python37\lib\site-packages\wavinfo\wave_reader.py", line 50, in __init__
    chunks = parse_chunk(f)
  File "C:\Python37\lib\site-packages\wavinfo\riff_parser.py", line 63, in parse_chunk
    return parse_list_chunk(stream=stream, length=size, rf64_context=rf64_context)
  File "C:\Python37\lib\site-packages\wavinfo\riff_parser.py", line 35, in parse_list_chunk
    child_chunk = parse_chunk(stream, rf64_context= rf64_context)
  File "C:\Python37\lib\site-packages\wavinfo\riff_parser.py", line 49, in parse_chunk
    size = struct.unpack('<I',sizeb)[0]
struct.error: unpack requires a buffer of 4 bytes

For now I have added an ugly hack to riff_parser.py at line 49 :)

size = struct.unpack('<I',sizeb)[0] if sizeb else 0

Here is the WAV file, zipped (too big to post on GitHub):
https://drive.google.com/file/d/13GCkLJHvux7LBG0Z_fk_4FRcbRgm_LgC/view?usp=sharing

Thanks, Tommi
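
One plausible cause of this error is a file that ends before a complete chunk header can be read, for example when a list chunk declares a length that runs past the end of the file. As a sketch of an alternative to the one-line hack, a chunk walker can stop cleanly on a short read. `iter_chunk_headers` is a hypothetical function, not the code in riff_parser.py.

```python
import io
import struct


def iter_chunk_headers(stream, end):
    """Yield (fourcc, size, data_offset) for sibling chunks until `end`.

    Stops cleanly, instead of raising struct.error, when the file ends
    before a complete 8-byte chunk header can be read.
    """
    while stream.tell() < end:
        header = stream.read(8)
        if len(header) < 8:
            # Truncated file or trailing junk: nothing more to parse.
            break
        fourcc, size = struct.unpack("<4sI", header)
        data_offset = stream.tell()
        yield fourcc, size, data_offset
        # Chunk data is padded to an even length.
        stream.seek(data_offset + size + (size & 1))


# One complete chunk followed by a partial header of only 6 bytes.
body = b"fmt " + struct.pack("<I", 2) + b"\x01\x00" + b"data\x10\x00"
chunks = list(iter_chunk_headers(io.BytesIO(body), end=len(body)))
print(chunks)
# [(b'fmt ', 2, 8)]
```

Checking the length of the whole header, rather than substituting a size of 0, also covers the case where only the four-character code is truncated.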

(feature request) : `cue ` chunks

Apparently 'cue ' chunks have been defined since the first 1991 specs from IBM & MS: https://www.aelius.com/njh/wavemetatools/doc/riffmci.pdf . Some recorders (at least some Tascam and Zoom handhelds) use this feature for markers: during recording, one can press a button that creates a mark at the current time.

It would be nice for wavinfo to be able to parse this chunk type, and ideally iterate over a list of cue points (would be useful for e.g. splitting a file, which is my usecase).
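
As a rough sketch of what such parsing involves, the chunk body can be read with `struct`. The field layout below (a 32-bit count followed by 24-byte records) is my reading of the RIFF specification and should be verified against it; `CuePoint` and `parse_cue_chunk` are hypothetical names, not wavinfo API.

```python
import struct
from typing import List, NamedTuple


class CuePoint(NamedTuple):
    ident: int          # dwName: unique cue point identifier
    position: int       # dwPosition: position in play order
    chunk_id: bytes     # fccChunk: usually b"data"
    chunk_start: int    # dwChunkStart
    block_start: int    # dwBlockStart
    sample_offset: int  # dwSampleOffset: offset in sample frames for PCM data


def parse_cue_chunk(payload: bytes) -> List[CuePoint]:
    """Parse the body of a `cue ` chunk (without the 8-byte chunk header)."""
    (count,) = struct.unpack_from("<I", payload, 0)
    record = struct.Struct("<II4sIII")  # 24 bytes per cue point
    if len(payload) < 4 + count * record.size:
        raise ValueError("cue chunk is shorter than its cue point count implies")
    return [
        CuePoint(*record.unpack_from(payload, 4 + i * record.size))
        for i in range(count)
    ]


# Two illustrative markers at sample frames 48000 and 96000.
payload = struct.pack("<I", 2)
payload += struct.pack("<II4sIII", 1, 48000, b"data", 0, 0, 48000)
payload += struct.pack("<II4sIII", 2, 96000, b"data", 0, 0, 96000)

for cue in parse_cue_chunk(payload):
    print(cue.ident, cue.sample_offset)
```

For the file-splitting use case, the sorted `sample_offset` values would give the frame positions at which to cut.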

AttributeError: 'WavBextReader' object has no attribute 'originator_date'

Hi, I am trying to get the recorded date and time of my audio file (the original date and time). I installed wavinfo with pip install wavinfo==1.0 in a conda virtual environment. I am trying the following code:

print(info.bext.description)
print("----------")
print("Originator:", info.bext.originator)
print("Originator Ref:", info.bext.originator_ref)
print("Originator Date:", info.bext.originator_date)
print("Originator Time:", info.bext.originator_time)
print("Time Reference:", info.bext.time_reference)
print(info.bext.coding_history)

my error is as follows:
AttributeError Traceback (most recent call last)
in
----> 1 print(info.bext.description)
2 print("----------")
3 print("Originator:", info.bext.originator)
4 print("Originator Ref:", info.bext.originator_ref)
5 print("Originator Date:", info.bext.originator_date)

AttributeError: 'WavBextReader' object has no attribute 'description'

Please let me know if I am missing something or else, how to fix this attribute error.

Thanks a lot!

Strong growth of the number of dependencies

The new release shows a strong growth of the number of dependencies in requirements.txt. For our use case this is a potential problem, as we prefer to keep the number of packages installed on our microphone processing software limited. We like to install with the minimal possible set needed for running and testing.

Also, requirements.txt pins every package to one exact version, rather than specifying a minimum version.

Is it possible to limit the number of dependencies and make them more flexible?

Failed to decode non-ASCII text

Repro

In Reaper, import audio files that embed UTF-8 file paths containing non-ASCII characters
Render files with bext enabled in Render dialog: This embeds the path to the Reaper project (/path/to/路径/我的.rpp) into bext
Use wavinfo to create a reader:

reader = wavinfo.WavInfoReader('/path/to/my.wav')

Observed

This generates errors

Unhandled exception:
Traceback (most recent call last):
  File "F:\my.py", line 44, in main
    self._extract_rpp_path()
  File "F:\my.py", line 55, in _extract_rpp_path
    reader = wavinfo.WavInfoReader(self.args.path)
  File "F:\.venv\lib\site-packages\wavinfo\wave_reader.py", line 59, in __init__
    self.bext = self._get_bext(f, encoding=bext_encoding)
  File "F:\.venv\lib\site-packages\wavinfo\wave_reader.py", line 117, in _get_bext
    return WavBextReader(bext_data, encoding) if bext_data else None
  File "F:\.venv\lib\site-packages\wavinfo\wave_bext_reader.py", line 26, in __init__
    self.description = sanitize_bytes(unpacked[0])
  File "F:\.venv\lib\site-packages\wavinfo\wave_bext_reader.py", line 22, in sanitize_bytes
    decoded = trimmed.decode(encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 35: ordinal not in range(128)

Expected

wavinfo should support arbitrary text codecs, or at least UTF-8.
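
The tracebacks on this page show a `bext_encoding` argument being passed through `WavInfoReader`; whether the installed version accepts `'utf-8'` there is worth checking first. Independently of that, tolerant decoding of a fixed-width field can be sketched as below, where `decode_fixed_field` is a hypothetical helper rather than wavinfo's `sanitize_bytes`.

```python
def decode_fixed_field(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a fixed-width, NUL-padded text field.

    Hypothetical helper; undecodable bytes become U+FFFD instead of raising.
    """
    trimmed = raw.split(b"\x00", 1)[0]
    return trimmed.decode(encoding, errors="replace")


# A 256-byte description field holding a UTF-8 project path, zero padded.
field = "/path/to/路径/我的.rpp".encode("utf-8").ljust(256, b"\x00")

print(decode_fixed_field(field))

try:
    field.split(b"\x00", 1)[0].decode("ascii")
except UnicodeDecodeError as exc:
    # Reproduces the failure mode reported above.
    print("ascii fails:", exc.reason)
```

Using `errors="replace"` trades exactness for robustness: a wrong codec guess yields replacement characters instead of an exception, which may or may not be acceptable for a given workflow.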

Raspbian: libxslt.so.1: cannot open shared object file: No such file or directory

from wavinfo import WavInfoReader

I received the following error on a clean install of Raspbian.

uname -a
Linux xxxxx 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pi/.local/lib/python3.7/site-packages/wavinfo/__init__.py", line 7, in <module>
    from .wave_reader import WavInfoReader
  File "/home/pi/.local/lib/python3.7/site-packages/wavinfo/wave_reader.py", line 13, in <module>
    from .wave_ixml_reader import WavIXMLFormat
  File "/home/pi/.local/lib/python3.7/site-packages/wavinfo/wave_ixml_reader.py", line 2, in <module>
    from lxml import etree as ET
ImportError: libxslt.so.1: cannot open shared object file: No such file or directory
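
One way a package can avoid a hard dependency on lxml's native libraries is to fall back to the standard library's ElementTree when the import fails. This is only a sketch of that pattern, not what wavinfo does, and it assumes the calling code sticks to the API subset the two modules share.

```python
try:
    # Preferred when the native libxml2/libxslt libraries are present.
    from lxml import etree as ET
except ImportError:
    # Pure standard-library fallback; covers basic parsing and find().
    import xml.etree.ElementTree as ET

root = ET.fromstring(b"<BWFXML><PROJECT>Demo</PROJECT></BWFXML>")
print(root.find("PROJECT").text)
# Demo
```

Features specific to lxml, such as full XPath or XSLT, would not be available in the fallback branch.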