
crowsetta's Introduction



A core package for acoustic communication research in Python

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.


There are many great software tools for researchers studying acoustic communication in animals¹. But our research groups work with a wide range of data formats: for audio, for array data, for annotations. This means we write a lot of low-level code to deal with those formats, and our code for analyses becomes tightly coupled to those formats. In turn, this makes it hard for other groups to read our code, and it takes a real investment to understand our analyses, workflows and pipelines. It also means significant work is required to translate an analysis worked out by a scientist-coder in a Jupyter notebook into a generalized, robust service provided by an application.

In particular, acoustic communication researchers working with the Python programming language face these problems. How can our scripts and libraries talk to each other? Luckily, Python is a great glue language! Let's use it to solve these problems.

The goals of VocalPy are to:

  • make it easy to work with a wide array of data formats: audio, array (spectrograms, features), annotation
  • provide classes that represent commonly-used data types: audio, spectrograms, features, annotations
  • provide classes that represent common processes and steps in pipelines: segmenting audio, computing spectrograms, extracting features
  • make it easier for scientist-coders to flexibly and iteratively build datasets, without needing to deal directly with a database if they don't want to
  • make it possible to re-use code you have already written for your own research group
  • and finally:
    • make code easier to share and read across research groups, by providing these classes, and idiomatic ways of coding with them; think of VocalPy as an interoperability layer and a common language
    • facilitate collaboration between scientist-coders writing imperative analysis scripts and research software engineers developing libraries and applications

A paper introducing VocalPy and its design has been accepted at Forum Acusticum 2023 as part of the session "Open-source software and cutting-edge applications in bio-acoustics", and will be published in the proceedings.

Features

Data types for acoustic communication data: audio, spectrogram, annotations, features

The vocalpy.Sound data type

  • Works with a wide array of audio formats, thanks to soundfile.
  • Also works with the cbin audio format saved by the LabVIEW app EvTAF used by many neuroscience labs studying birdsong, thanks to evfuncs.
>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/')
>>> wav_paths = voc.paths.from_dir(data_dir, 'wav')
>>> audios = [voc.Sound.read(wav_path) for wav_path in wav_paths]
>>> print(audios[0])
vocalpy.Sound(data=array([3.0517...66210938e-04]), samplerate=32000, channels=1),
path=tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/0.wav)

The vocalpy.Spectrogram data type

  • Save expensive-to-compute spectrograms to array files, so you don't regenerate them over and over again (a sketch of saving follows the example below)
>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/generated/spect_npz/')
>>> spect_paths = voc.paths.from_dir(data_dir, 'wav.npz')
>>> spects = [voc.Spectrogram.read(spect_path) for spect_path in spect_paths]
>>> print(spects[0])
vocalpy.Spectrogram(data=array([[3.463...7970774e-14]]), frequencies=array([    0....7.5, 16000. ]), times=array([0.008,...7.648, 7.65 ]), 
path=PosixPath('tests/data-for-tests/generated/spect_npz/0.wav.npz'), audio_path=None)
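For the saving half of that workflow, here is a minimal sketch. It assumes your own spectrogram code already produced the data, frequencies, and times arrays shown in the repr above, and it assumes a write method on Spectrogram; that method name is an assumption here, not something this README confirms, so check the vocalpy docs for the exact call.

>>> import vocalpy as voc
>>> # data, frequencies, times: arrays from your own spectrogram code (assumed to exist)
>>> spect = voc.Spectrogram(data=data, frequencies=frequencies, times=times)
>>> spect.write('tests/data-for-tests/generated/spect_npz/0.wav.npz')  # assumed method; verify in the docs
>>> # later, re-use it without recomputing
>>> spect = voc.Spectrogram.read('tests/data-for-tests/generated/spect_npz/0.wav.npz')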

The vocalpy.Annotation data type

  • Load many different annotation formats using the pyOpenSci package crowsetta
>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/')
>>> notmat_paths = voc.paths.from_dir(data_dir, '.not.mat')
>>> annots = [voc.Annotation.read(notmat_path, format='notmat') for notmat_path in notmat_paths]
>>> print(annots[1])
Annotation(data=Annotation(annot_path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin.not.mat'), 
notated_path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin'), 
seq=<Sequence with 57 segments>), path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin.not.mat'))

Classes for common steps in your pipelines and workflows

A Segmenter for segmentation into sequences of units

>>> import evfuncs
>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/')
>>> cbin_paths = voc.paths.from_dir(data_dir, 'cbin')
>>> audios = [voc.Sound.read(cbin_path) for cbin_path in cbin_paths]
>>> segment_params = {'threshold': 1500, 'min_syl_dur': 0.01, 'min_silent_dur': 0.006}
>>> segmenter = voc.Segmenter(callback=evfuncs.segment_song, segment_params=segment_params)
>>> seqs = segmenter.segment(audios, parallelize=True)
[  ########################################] | 100% Completed | 122.91 ms
>>> print(seqs[1])
Sequence(units=[Unit(onset=2.19075, offset=2.20428125, label='-', audio=None, spectrogram=None),
                Unit(onset=2.35478125, offset=2.38815625, label='-', audio=None, spectrogram=None),
                Unit(onset=2.8410625, offset=2.86715625, label='-', audio=None, spectrogram=None),
                Unit(onset=3.48234375, offset=3.49371875, label='-', audio=None, spectrogram=None),
                Unit(onset=3.57021875, offset=3.60296875, label='-', audio=None, spectrogram=None),
                Unit(onset=3.64403125, offset=3.67721875, label='-', audio=None, spectrogram=None),
                Unit(onset=3.72228125, offset=3.74478125, label='-', audio=None, spectrogram=None),
                Unit(onset=3.8036875, offset=3.8158125, label='-', audio=None, spectrogram=None),
                Unit(onset=3.82328125, offset=3.83646875, label='-', audio=None, spectrogram=None),
                Unit(onset=4.13759375, offset=4.16346875, label='-', audio=None, spectrogram=None),
                Unit(onset=4.80278125, offset=4.814, label='-', audio=None, spectrogram=None),
                Unit(onset=4.908125, offset=4.922875, label='-', audio=None, spectrogram=None),
                Unit(onset=4.9643125, offset=4.992625, label='-', audio=None, spectrogram=None),
                Unit(onset=5.039625, offset=5.0506875, label='-', audio=None, spectrogram=None),
                Unit(onset=5.10165625, offset=5.1385, label='-', audio=None, spectrogram=None),
                Unit(onset=5.146875, offset=5.16203125, label='-', audio=None, spectrogram=None),
                Unit(onset=5.46390625, offset=5.49409375, label='-', audio=None, spectrogram=None),
                Unit(onset=6.14503125, offset=6.1565625, label='-', audio=None, spectrogram=None),
                Unit(onset=6.31003125, offset=6.346125, label='-', audio=None, spectrogram=None),
                Unit(onset=6.38996875, offset=6.4018125, label='-', audio=None, spectrogram=None),
                Unit(onset=6.46053125, offset=6.4796875, label='-', audio=None, spectrogram=None),
                Unit(onset=6.83525, offset=6.8643125, label='-', audio=None, spectrogram=None)], method='segment_song',
         segment_params={'threshold': 1500, 'min_syl_dur': 0.01, 'min_silent_dur': 0.006},
         audio=vocalpy.Sound(data=None, samplerate=None, channels=None), path=tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin), spectrogram=None)

A SpectrogramMaker for computing spectrograms

>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/')
>>> wav_paths = voc.paths.from_dir(data_dir, 'wav')
>>> audios = [voc.Sound.read(wav_path) for wav_path in wav_paths]
>>> spect_params = {'fft_size': 512, 'step_size': 64}
>>> spect_maker = voc.SpectrogramMaker(spect_params=spect_params)
>>> spects = spect_maker.make(audios, parallelize=True)

Datasets you flexibly build from pipelines and convert to databases

  • The vocalpy.dataset module contains classes that represent common types of datasets
  • You make these classes with the outputs of your pipelines, e.g. a list of vocalpy.Sequence or vocalpy.Spectrogram instances
  • Because of the design of vocalpy, these datasets capture key metadata from your pipeline:
    • parameters and data provenance details; e.g., what parameters did you use to segment? What audio file did this sequence come from?
  • Then you can save the dataset, along with its metadata, to a database, and later load it back from the database

A SequenceDataset for common analyses of sequences of units

>>> import evfuncs
>>> import vocalpy as voc
>>> data_dir = 'tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/'
>>> cbin_paths = voc.paths.from_dir(data_dir, 'cbin')
>>> audios = [voc.Sound.read(cbin_path) for cbin_path in cbin_paths]
>>> segment_params = {
  'threshold': 1500,
  'min_syl_dur': 0.01,
  'min_silent_dur': 0.006,
}
>>> segmenter = voc.Segmenter(
  callback=evfuncs.segment_song,
  segment_params=segment_params
)
>>> seqs = segmenter.segment(audios)
>>> seq_dataset = voc.dataset.SequenceDataset(sequences=seqs)
>>> seq_dataset.to_sqlite(db_name='gy6or6-032312.db', replace=True)
>>> print(seq_dataset)
SequenceDataset(sequences=[Sequence(units=[Unit(onset=2.18934375, offset=2.21, label='-', audio=None, spectrogram=None),
                                           Unit(onset=2.346125, offset=2.373125, label='-', audio=None, spectrogram=None),
                                           Unit(onset=2.50471875, offset=2.51546875, label='-', audio=None, spectrogram=None),
                                           Unit(onset=2.81909375, offset=2.84740625, label='-', audio=None, spectrogram=None),
                                           ...
>>> # test that we can load the dataset
>>> seq_dataset_loaded = voc.dataset.SequenceDataset.from_sqlite(db_name='gy6or6-032312.db')
>>> seq_dataset_loaded == seq_dataset
True

Installation

With pip

$ conda create -n vocalpy python=3.10
$ conda activate vocalpy
$ pip install vocalpy

With conda

$ conda create -n vocalpy python=3.10
$ conda activate vocalpy    
$ conda install vocalpy -c conda-forge

For more detail see Getting Started - Installation

Support

To report a bug or request a feature (such as a new annotation format), please use the issue tracker on GitHub:
https://github.com/vocalpy/vocalpy/issues

To ask a question about vocalpy, discuss its development, or share how you are using it, please start a new topic on the VocalPy forum with the vocalpy tag:
https://forum.vocalpy.org/

Contribute

Code of conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Contributing Guidelines

Below we provide some quick links, but you can learn more about how you can help and give feedback
by reading our Contributing Guide.

To ask a question about vocalpy, discuss its development, or share how you are using it, please start a new "Q&A" topic on the VocalPy forum with the vocalpy tag:
https://forum.vocalpy.org/

To report a bug, or to request a feature, please use the issue tracker on GitHub:
https://github.com/vocalpy/vocalpy/issues

CHANGELOG

You can see project history and work in progress in the CHANGELOG

License

The project is licensed under the BSD license.

Citation

If you use vocalpy, please cite the DOI:

Contributors ✨

Thanks goes to these wonderful people (emoji key):

  • Ralph Emilio Peterson: 🤔 📓 📖 🐛 💻
  • Tetsuo Koyama: 📖

This project follows the all-contributors specification. Contributions of any kind welcome!

Footnotes

  1. For a curated collection, see https://github.com/rhine3/bioacoustics-software.


crowsetta's Issues

add `format2seq_func` parameter to `seq2csv`

so that users can avoid writing their own format2csv function

The argument will be a function such as notmat2seq; if it is not None, then seq2csv will take the seq argument and run it through this format2seq_func, like so:

def seq2csv(seq, ..., format2seq_func=None):
    if format2seq_func is not None:
        seq = format2seq_func(seq)

add `header_segment_map` parameter to `csv2seq` function

so that if a user has a csv whose header differs from the Segment fields, they can just provide a mapping (i.e., a dict) that specifies which header fields (csv columns) correspond to Segment attributes

so with this header
Onsets, Offsets, Filename, SegmentLabel
you'd use

header_segment_map = {
    'Onsets': 'onsets_s',
    'Offsets': 'offsets_s',
    'Filename': 'file',
    'SegmentLabel': 'label'
    }
crowsetta.csv.csv2seq(csv_filename='my.csv', header_segment_map=header_segment_map)

add `from_excel` function / module

mainly as an easier way to get data out of SAP (Sound Analysis Pro)?
Would be a convenience wrapper around csv2seq that knows to use the Excel dialect and look for SAP field names; see the sketch below.
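A rough sketch of what such a wrapper could look like, reusing the header_segment_map idea from the issue above. The default mapping here just copies that example; the SAP column names are placeholders, not verified field names, and it assumes csv2seq grows the header_segment_map parameter proposed above.

import crowsetta

def from_excel(csv_filename, header_segment_map=None):
    """convenience wrapper around csv2seq for spreadsheets exported from SAP (sketch only)"""
    if header_segment_map is None:
        # placeholder mapping; the actual column names SAP writes would need to be checked
        header_segment_map = {
            'Onsets': 'onsets_s',
            'Offsets': 'offsets_s',
            'Filename': 'file',
            'SegmentLabel': 'label',
        }
    return crowsetta.csv.csv2seq(csv_filename=csv_filename,
                                 header_segment_map=header_segment_map)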

make Annotation class

that can have a Stack attribute or a Sequence attribute

mainly because it feels weird and counterintuitive to write

annot : crowsetta.Sequence

in docstrings. No-one will get why an annotation is a Sequence.

should have a mandatory `annot_file` attribute
and optional `audio_file` and `spect_file` attributes
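A minimal sketch of what that class could look like, written with attrs (which crowsetta already uses, as the validator errors elsewhere in this tracker show). The defaults and the seq attribute are assumptions based on the issue text, not a settled API.

import attr

@attr.s
class Annotation:
    """ties an annotation to the file(s) it annotates (proposed; sketch only)"""
    annot_file = attr.ib()               # mandatory: path to the annotation file itself
    seq = attr.ib(default=None)          # a crowsetta.Sequence (or, later, a Stack)
    audio_file = attr.ib(default=None)   # optional: audio file that the annotation annotates
    spect_file = attr.ib(default=None)   # optional: spectrogram file, if annotating arrays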

have Sequence.segments return a "pretty printed" version?

Seems like __repr__ should be something like

Sequence(segments=15)

and then a pretty_print method would give something like

Sequence with 15 segments:
    Segment 1: label='a', onset_Hz=16000, offset_Hz=17500, onset_s=None, offset_s=None, file='0.wav'
    Segment 2: label='b', onset_Hz=18000, offset_Hz=19500, onset_s=None, offset_s=None, file='0.wav'
   ...
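A sketch of that idea; the method name pretty_print and the max_segments cutoff are illustrative, not a settled API, and the Segment attribute names are taken from the example output above.

class Sequence:
    def __init__(self, segments):
        self.segments = segments

    def __repr__(self):
        return f"Sequence(segments={len(self.segments)})"

    def pretty_print(self, max_segments=3):
        """return a human-readable, multi-line summary of this Sequence"""
        lines = [f"Sequence with {len(self.segments)} segments:"]
        for num, seg in enumerate(self.segments[:max_segments], start=1):
            lines.append(
                f"    Segment {num}: label={seg.label!r}, onset_Hz={seg.onset_Hz}, "
                f"offset_Hz={seg.offset_Hz}, onset_s={seg.onset_s}, "
                f"offset_s={seg.offset_s}, file={seg.file!r}"
            )
        if len(self.segments) > max_segments:
            lines.append("    ...")
        return "\n".join(lines)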

rename `Annotation` to `Vocalization`

because it's not really an "annotation"

it's the high-level abstract object that lets us associate an annotation file with the sequence of annotated segments within that file, and the file that the annotation annotates, e.g. an audio file

so it should be something like:
Vocalization, with attributes `sequence`, `annot_path`, and (optionally) `source_path`

allow for user-defined `tiers` for a Segment, like Praat?

Praat allows for multiple user-defined tiers per segment, e.g. "phoneme", "syllable", "word", "sentence".

http://www.fon.hum.uva.nl/praat/manual/Intro_7__Annotation.html

Not sure if that would be easy to add for Crowsetta.
I was thinking it would require the ability to dynamically add attributes to the Segment class, but I guess there could be an optional tiers attribute that's a dict mapping an annotation to each tier for any instance of a Segment.
But even then seq2csv would have to be able to handle mapping these extra tiers. I guess that's not too painful though if we're iterating over Segments anyway. Just would have to make sure all Segments have the same tiers.
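A rough sketch of that optional tiers attribute; the attributes other than label are simplified here, and the dict default is an assumption.

import attr

@attr.s
class Segment:
    label = attr.ib()
    onset_s = attr.ib(default=None)
    offset_s = attr.ib(default=None)
    # Praat-style user-defined tiers, e.g. {'phoneme': 'a', 'word': 'cat'}
    tiers = attr.ib(factory=dict)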

add logo

  • "crowsetta stone" image?
    • in doc/index.rst & README.md
  • maybe also image showing GUI with labeling | Filenames | Sequence objects | csv output

add Stack class

programmatically instantiated attrs class where each attribute is a Sequence.
A Stack is made up of 2 or more Sequences
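One way that could work, sketched with attrs' make_class; the factory-function name and the keyword-based tiers are illustrative, not a settled API.

import attr

def make_stack(**name_to_sequence):
    """make a Stack whose attributes are Sequences, one per named tier (sketch only)"""
    if len(name_to_sequence) < 2:
        raise ValueError("a Stack is made up of 2 or more Sequences")
    Stack = attr.make_class("Stack", list(name_to_sequence.keys()))
    return Stack(**name_to_sequence)

# e.g. stack = make_stack(syllables=syllable_seq, phrases=phrase_seq)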

change default value for `koumura2annot.Wave` parameter

causing vak to crash because the default is written relative to the current working directory, ./Wave

This only works if the user is in the right place.

Instead, the default should be written relative to the Annotation.xml path, which will always be in the parent directory of the Wave directory, unless someone is actually using the same format somewhere outside this dataset, in which case they could specify the correct location with a non-default Wave argument.
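A minimal sketch of that fix; the helper name is made up here, and koumura2annot's actual parameter handling may differ.

from pathlib import Path

def default_wave_dir(annot_file):
    """resolve the Wave directory relative to Annotation.xml instead of the current working directory"""
    return Path(annot_file).parent / 'Wave'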

Add utils module

with utility functions for labels and annotations:
  • for labels, just steal from vak
  • annotation utilities would be, e.g., the duration of all annotations (see the sketch below)
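A sketch of one such annotation utility; it assumes Segments carry onset_s / offset_s in seconds and that an annotation exposes its segments as annot.seq.segments, both of which are assumptions for illustration.

import numpy as np

def duration(annot):
    """total duration in seconds spanned by an annotation's segments (sketch only)"""
    onsets = np.asarray([seg.onset_s for seg in annot.seq.segments])
    offsets = np.asarray([seg.offset_s for seg in annot.seq.segments])
    return float(offsets.max() - onsets.min())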

add formats module, with info about each format in that format's module docstring?

In the spirit of DRY, instead of having a separate dict in the data module,
the top-level docstring for each format's module should have this metadata,
and there should be a formats module that knows how to parse it.

Better if this could be linked with the internal config.ini somehow.

Maybe a Makefile that generates the config.ini?

Or ... each formats module has its own config_dict at the top, and then that gets used through an entry point maybe?

change annot_file / audio_file attributes of annotation to be Path objects

to not get

TypeError: ("'annot_file' must be <class 'str'> (got PosixPath('/home/ildefonso/Documents/repos/coding/birdsong/tweetynet/tests/test_data/mat/llb3_annot_subset.mat') that is a <class 'pathlib.PosixPath'>).", Attribute(name='annot_file', default=NOTHING, validator=<instance_of validator for type <class 'str'>>, repr=True, eq=True, order=True, hash=None, init=True, metadata=mappingproxy({}), type=None, converter=None, kw_only=False), <class 'str'>, PosixPath('/home/ildefonso/Documents/repos/coding/birdsong/tweetynet/tests/test_data/mat/llb3_annot_subset.mat'))
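One possible fix, sketched with an attrs converter so that either a str or a Path is accepted and stored as a Path; this is an illustration of the idea, not necessarily the patch crowsetta ended up with.

import attr
from pathlib import Path

@attr.s
class Annotation:
    annot_file = attr.ib(converter=Path)  # str or Path in, Path stored
    audio_file = attr.ib(converter=attr.converters.optional(Path), default=None)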

fix circular import bug in .formats

  • should import within function show() when called (see the sketch after this list)
  • have a similar function load() that does this and then show() calls load() if formats not loaded
  • and then build these function calls into Transcriber
  • how to test?
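A sketch of the lazy-import idea described in the bullets above; the FORMATS dict and the exact modules imported are assumptions for illustration.

FORMATS = {}

def load():
    """import format modules only on first use, avoiding the circular import"""
    if not FORMATS:
        from crowsetta import koumura, notmat  # imported here, not at module level
        FORMATS.update({'koumura': koumura, 'notmat': notmat})
    return FORMATS

def show():
    """print the names of formats that crowsetta can work with"""
    for name in load():
        print(name)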

add `seqID` attribute to `Segment`

This will get used when one annotation file contains multiple sequences, and/or each sequence does not correspond to one audio file.
E.g., in the Koumura data set there are multiple sequences per audio file.
Similarly, canary song can be annotated by phrase and the user might want to preserve this annotation.

add `unique_labels` function?

def unique_labels(seqs):
    all_labels = [a_seq.labels.tolist() for a_seq in seqs]
    all_labels = [label for label_list in all_labels for label in label_list]
    return set(all_labels)

`koumura2annot` throws an error when annot_file is a Path not a str

Traceback (most recent call last):
  File "/home/art/anaconda3/envs/vak-dev/bin/vak", line 11, in <module>
    load_entry_point('vak', 'console_scripts', 'vak')()
  File "/home/art/Documents/repos/coding/birdsong/vak/src/vak/__main__.py", line 43, in main
    config_file=args.configfile)
  File "/home/art/Documents/repos/coding/birdsong/vak/src/vak/cli/cli.py", line 18, in cli
    prep(toml_path=config_file)
  File "/home/art/Documents/repos/coding/birdsong/vak/src/vak/cli/prep.py", line 162, in prep
    logger=logger)
  File "/home/art/Documents/repos/coding/birdsong/vak/src/vak/io/dataframe.py", line 124, in from_files
    annot_list = scribe.from_file(annot_file=annot_file)
  File "/home/art/anaconda3/envs/vak-dev/lib/python3.6/site-packages/crowsetta/koumura.py", line 53, in koumura2annot
    if not annot_file.endswith('.xml'):
AttributeError: 'PosixPath' object has no attribute 'endswith'
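One straightforward fix, shown as a sketch (the actual patch may differ): coerce annot_file to a Path at the top of koumura2annot and use Path methods instead of str.endswith.

from pathlib import Path

def koumura2annot(annot_file='Annotation.xml'):  # other parameters elided in this sketch
    annot_file = Path(annot_file)  # accept either a str or a Path
    if annot_file.suffix != '.xml':
        raise ValueError(f'annot_file should be an .xml file, but was: {annot_file}')
    # ... parse the xml as before ...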

add csv as a format?

so user of e.g. vak can specify 'csv' as format

this would be a place where `to_annot` would have to return a list of annotations, though (see #54)

have to_annot functions only return single annot

  • less testing required for different returned types
  • clearer what the expected return type is for downstream users -- they won't have to test their code for both an Annot and a list of Annots

users will be able to, e.g., write a list comprehension, so it's not actually that useful to include this extra functionality
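For example, with from_file as it appears in the traceback in the issue above (the annot_files list here is hypothetical):

annots = [scribe.from_file(annot_file=annot_file) for annot_file in annot_files]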

add "why" and "how" at top of docs

  • why:

    • club project for people studying vocalizations
    • tool for munging datasets of vocalizations that have annotated segments
      • so that when working with the dataset, there is no need to be aware of where different files are,
        e.g., the annotation file or files, the audio files, etc.
    • assumes you care about the "segments" part
      • need to include illustration of annotated segments right at top of docs
  • how:

    • Python classes that facilitate representing these datasets
      • a Vocalization that consists of its annotation and the files associated with it
    • end product: a .csv / DataFrame where each row is an annotated segment

make `user_config` less fragile

module crashes with a relative path like ./mymodule.py

`to_csv` and `to_format` have to be `'None'` (if not using them), not `None`, which is annoying to type
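A sketch of one way to handle the relative-path crash; the function and parameter names here are assumptions, not crowsetta's actual user_config API.

from pathlib import Path

def resolve_module_path(module):
    """turn a user-supplied path like './mymodule.py' into an absolute path, failing clearly if it does not exist"""
    module_path = Path(module).expanduser().resolve()
    if not module_path.exists():
        raise FileNotFoundError(f'could not find module specified in config: {module}')
    return module_path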
