joeweiss / birdnetlib
A python api for BirdNET-Lite and BirdNET-Analyzer
Home Page: https://joeweiss.github.io/birdnetlib/
License: Apache License 2.0
It is currently possible to pass a custom species file path to Analyzer, similarly to how BirdNET-Analyzer handles species lists.
This feature would make it easier to pass this dynamically via Python, or via an API like AudioSpotter.
Currently, only DirectoryMultiProcessingAnalyzer uses BirdNET-Analyzer by default. All the other helpers use the older (deprecated) BirdNET-Lite as their default.
I am trying to use birdnetlib as a library to identify birds instead of using the scripts included in BirdNET. I am having trouble pointing birdnetlib to the models and labels included in BirdNET.
I am on Ubuntu 22.04.
sudo apt-get update
sudo apt-get upgrade
<Python 3.10 was already installed>
pip3 install tensorflow
pip3 install librosa resampy
pip3 install birdnetlib
sudo apt-get install ffmpeg
git clone https://github.com/kahst/BirdNET-Analyzer.git
I tried following BirdNET's recommendation of installing tflite-runtime but got:
>>> from birdnetlib.analyzer import Analyzer
Traceback (most recent call last):
File "/home/<user>/.local/lib/python3.10/site-packages/birdnetlib/analyzer.py", line 8, in <module>
import tflite_runtime.interpreter as tflite
ModuleNotFoundError: No module named 'tflite_runtime'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/<user>/.local/lib/python3.10/site-packages/birdnetlib/analyzer.py", line 10, in <module>
from tensorflow import lite as tflite
ModuleNotFoundError: No module named 'tensorflow'
So this is why I just installed the other option, tensorflow.
from birdnetlib import Recording
from birdnetlib.analyzer import Analyzer
from datetime import datetime
model="<path>/BirdNET-Analyzer/checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Model_FP32.tflite"
labels="<path>/BirdNET-Analyzer/checkpoints/V2.4/BirdNET_GLOBAL_6K_V2.4_Labels.txt"
analyzer = Analyzer(
    classifier_labels_path=labels,
    classifier_model_path=model,
)
recording = Recording(
    analyzer,
    "2023-07-03 19_27.wav",
    lat=42,
    lon=71,
    date=datetime(year=2023, month=7, day=4),
    min_conf=0,
)
But the analysis fails:
>>> recording.analyze()
read_audio_data
read_audio_data: complete, read 16 chunks.
analyze_recording 2023-07-03 19_27.wav
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/<user>/.local/lib/python3.10/site-packages/birdnetlib/main.py", line 61, in analyze
self.analyzer.analyze_recording(self)
File "/home/<user>/.local/lib/python3.10/site-packages/birdnetlib/analyzer.py", line 206, in analyze_recording
pred = self.predict_with_custom_classifier(c)[0]
File "/home/<user>/.local/lib/python3.10/site-packages/birdnetlib/analyzer.py", line 316, in predict_with_custom_classifier
C_INTERPRETER.invoke()
File "/home/<user>/.local/lib/python3.10/site-packages/tensorflow/lite/python/interpreter.py", line 917, in invoke
self._interpreter.Invoke()
RuntimeError: tensorflow/lite/kernels/concatenation.cc:159 t->dims->data[d] != t0->dims->data[d] (1 != 0)Node number 92 (CONCATENATION) failed to prepare.
This worked fine with the built-in model and labels in birdnetlib using just analyzer=Analyzer(), but those are apparently not correct for my area as the results were not accurate.
How do I use birdnetlib with BirdNET's models and labels? Have I done something wrong here?
Thanks for any help! Excited to try and get this system going.
I'd like to be able to change the BirdNET overlap parameter in addition to changing parameters like min_conf. I am using the DirectoryAnalyzer class defined in batch.py but overlap doesn't appear as an option? Is there a way of doing this and can you provide an example please?
Thanks!
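For context on what an overlap setting changes, here is a minimal sketch (not birdnetlib code) of how a recording can be split into analysis windows when consecutive windows overlap. The 3-second window length matches the detection spans shown elsewhere on this page; the function name `window_bounds` and its arguments are hypothetical, and the library may pad or trim the final window differently.

```python
def window_bounds(duration, window=3.0, overlap=0.0):
    """Return (start, end) times in seconds for analysis windows.

    Hypothetical helper for illustration only; not part of birdnetlib.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than the window length")
    step = window - overlap  # a larger overlap means a smaller step between windows
    bounds = []
    index = 0
    while index * step < duration:
        start = index * step
        bounds.append((start, min(start + window, duration)))
        index += 1
    return bounds
```

With no overlap a 9-second clip yields three windows; with 1.5 seconds of overlap the same clip yields more, shorter-stepped windows, which is why analysis time grows as overlap increases.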
Some of the code uses terse naming and is not commented. The readability needs to improve a bit.
2.4 is out. kahst/BirdNET-Analyzer@b32cdc5
Also, I'm considering an eventual separation of the library from the models. Perhaps something like birdnet-analyzer-model==2.5.0 would be better as a dependency of birdnetlib rather than including the model files in the library itself.
That would allow users to easily pin their model during a long-term analysis project rather than pinning the library itself. I've had questions offline about that already, and I've pointed users to the new "custom classifier" options.
I would have to mirror the models (under BirdNET's current NC license) in a new package.
Any thoughts?
I am encountering a runtime error when attempting to use a custom model trained with birdnet-analyzer and performing analysis with birdnetlib. The specific error message I am receiving is:
RuntimeError: tensorflow/lite/kernels/concatenation.cc:162 t->dims->data[d] != t0->dims->data[d] (1 != 0)Node number 92 (CONCATENATION) failed to prepare.
from birdnetlib.analyzer import Analyzer
from birdnetlib import Recording
model_path = '/path_to_file/test_2.tflite'
labels_path = '/path_to_file/test_2_Labels.txt'
analyzer = Analyzer(classifier_labels_path=labels_path, classifier_model_path=model_path)
recording = Recording(
    analyzer,
    'path_to_audio_file',
    min_conf=0.1
)
recording.analyze()
Operating System: Ubuntu 23.10
Python Version: 3.11.5
Birdnetlib Version: 0.12.3
Thank you for your attention to this matter. Let me know if you need further details or clarification.
The most obvious name is still available on PyPI:
Mostly due to this documented limitation with Docker:
https://docs.docker.com/desktop/mac/apple-silicon/
In addition, filesystem change notification APIs (inotify) do not work under qemu emulation. Even when the containers do run correctly under emulation, they will be slower and use more memory than the native equivalent.
As part of this, implement a lint checker with Black in GitHub Actions. It should run on all pull requests.
Hello,
I've been using BirdNET to run classifiers for months now as I work through building some custom models. My computer restarted, and now when I try to open the GUI (using python3 gui.py like usual), it shows errors in the code, including
Traceback (most recent call last):
File "/Users/me/BirdNET-Analyzer/gui.py", line 969, in <module>
build_single_analysis_tab()
File "/Users/me/BirdNET-Analyzer/gui.py", line 717, in build_single_analysis_tab
) = species_lists(False)
File "/Users/me/BirdNET-Analyzer/gui.py", line 655, in species_lists
species_file_input = gr.File(file_types=[".txt"], info="Path to species list file or folder.", visible=False)
File "/Users/me/anaconda3/envs/birdnet-analyzer/lib/python3.10/site-packages/gradio/component_meta.py", line 157, in wrapper
return fn(self, **kwargs)
TypeError: File.__init__() got an unexpected keyword argument 'info'
Have there been updates? Why would this all of a sudden have issues?
Thanks in advance.
The BirdNET-Analyzer 2.4 model is 13MB larger than the 2.3 model file, which pushed birdnetlib over PyPI's 100MB total limit.
I've requested that PyPI raise the limit to 200MB.
pypi/support#2912
The 0.7.0 release is blocked until this is resolved.
For the moment, you can install 0.7.0 with:
pip install https://github.com/joeweiss/birdnetlib/archive/main.zip
... or by including the following in your requirements.txt file:
birdnetlib @ https://github.com/joeweiss/birdnetlib/archive/main.zip
I love the idea of DirectoryAnalyzer and DirectoryWatcher. But as long as they only run on one CPU core, it's not usable for me.
I helped myself with multiprocessing, but that should only be a temporary solution as long as it isn't built directly into birdnetlib.
import os
from multiprocessing import Pool
from birdnetlib import Recording
from birdnetlib.analyzer import Analyzer
from datetime import datetime


def analyze_file(file_path):
    # A new Analyzer is created for every file handled by a worker.
    try:
        analyzer = Analyzer()
        recording = Recording(
            analyzer,
            file_path,
            lat=35.4244,
            lon=-120.7463,
            date=datetime(year=2022, month=5, day=10),
            min_conf=0.25,
        )
        recording.analyze()
        print(f"Processed {file_path}")
        return recording
    except Exception as e:
        print(f"Error processing {file_path}: {e}")
        return []


def analyze_directory(dir_path):
    pool = Pool()
    # Holds AsyncResult handles; call .get() on each to retrieve the return value.
    recordings = []
    for subdir, dirs, files in os.walk(dir_path):
        for file in files:
            if file.lower().endswith(".wav"):
                file_path = os.path.join(subdir, file)
                recordings.append(pool.apply_async(analyze_file, args=(file_path,)))
    # Stop accepting new tasks, then wait for all workers to finish.
    pool.close()
    pool.join()
    return recordings


if __name__ == "__main__":
    dir_path = "./workdir/"
    recordings = analyze_directory(dir_path)
    print(f"Finished analyzing {len(recordings)} files")
I'd like an example of a python script that looks at a directory or series of directories and runs and saves the results.
This issue does not include a command-line interface.
A fully functional CLI is outside the scope of this project.
Currently (as of 0.12.3), segmenting large audio files (> 1h) into more manageable segments is left to the user. If a user were to attempt to process a large audio file, the entire file would be pulled into memory before analyzing. This can lead to OOM killer events or process crashes.
The library needs to have a method for processing very large audio files.
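Until such a method exists, one workaround is to read the file piece by piece. The sketch below only computes the (offset, duration) pairs for each piece; the name `segment_offsets` and the 600-second default are assumptions for illustration, not part of birdnetlib.

```python
def segment_offsets(total_seconds, segment_seconds=600.0):
    """Yield (offset, duration) pairs that cover a long recording piece by piece.

    Hypothetical helper for illustration only; not part of birdnetlib.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    index = 0
    while index * segment_seconds < total_seconds:
        offset = index * segment_seconds
        # The final segment is shortened so it does not run past the end.
        yield offset, min(segment_seconds, total_seconds - offset)
        index += 1
```

Each pair could be handed to a loader that supports partial reads, such as `librosa.load(path, offset=..., duration=...)`, so that only one segment is in memory at a time. Detection times reported for a segment would then need the segment's offset added back.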
Related to #72, I'm proposing to relicense this repo and project as Apache 2.0. This would encourage the inclusion and usage of birdnetlib in other projects without imposing the copyleft conditions of the GPL-3.0 license.
Note: This would not apply to the models that are included in this repo, as they are included for redistribution under BirdNET-Analyzer and BirdNET-Lite's CC BY-NC-SA 4.0 license.
The plan is to change the license in the 0.9.0 release in August. I'll leave this issue open for discussion until then.
Action should run tests before publishing.
from datetime import datetime
from birdnetlib.species import SpeciesList

species = SpeciesList()
species_list = species.return_list(
    lon=-120.7463, lat=35.4244, date=datetime(year=2022, month=5, day=10)
)
print(species_list)  # [('Haemorhous mexicanus', 'House Finch'), ('Aphelocoma californica', 'California Scrub-Jay') ...]
I have a number of scripts that would need to be able to process in memory wavs instead of wavs saved to disk so it would be nice to add an argument flag for passing the audio vector itself.
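As a rough illustration of what handling an in-memory WAV involves, the sketch below decodes 16-bit PCM WAV bytes with only the standard library. The function name `read_wav_bytes` is hypothetical and this is not birdnetlib's API; it only shows that the audio vector can be obtained without writing a file.

```python
import io
import struct
import wave


def read_wav_bytes(data):
    """Decode 16-bit PCM WAV bytes into (sample_rate, samples) without touching disk.

    Hypothetical helper for illustration only; not part of birdnetlib.
    Samples are scaled to the range [-1.0, 1.0) and stay interleaved
    if the source has more than one channel.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
        count = len(frames) // 2
        samples = struct.unpack(f"<{count}h", frames)
        return wav.getframerate(), [s / 32768.0 for s in samples]
```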
I would like to merge this repository with the main BirdNET repository for easier maintenance of both parts. However, the GPL v3 license does not allow that without relicensing BirdNET-Analyzer under GPL v3, too. Therefore, please relicense this code under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
https://github.com/joeweiss/birdnetlib/tree/main/src/birdnetlib/models
The README should appear at the above directory.
The BirdNET models here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.
I'm trying to pass just a custom classifier label path to the Analyzer constructor but not a custom model and it was failing. I found this in the Analyzer init function in the analyzer.py file
self.classifier_model_path = classifier_model_path
self.classifier_labels_path = classifier_labels_path
self.use_custom_classifier = (
    self.classifier_labels_path and self.classifier_labels_path
)
classifier_labels_path is checked both times instead of self.classifier_model_path and self.classifier_labels_path
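The intended check can be sketched as a standalone function. The name `uses_custom_classifier` is hypothetical, and the actual fix inside Analyzer may be written differently; the point is only that both paths must be tested.

```python
def uses_custom_classifier(classifier_model_path=None, classifier_labels_path=None):
    """Return True only when both a model path and a labels path are given.

    Hypothetical helper for illustration only; not part of birdnetlib.
    """
    return bool(classifier_model_path and classifier_labels_path)
```

With this check, passing only a labels path leaves the custom classifier disabled instead of enabling it with a missing model.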
Now that training is documented as part of Birdnet-Analyzer, birdnetlib should have an option to use a custom classifier.
Thank you for this great project :)
Sadly, I can't clone this repo under Windows, because Windows does not allow colons in file names.
error: invalid path 'tests/test_files/2022-08-15-birdnet-21:05:51.wav'
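One way to avoid this class of problem is to keep reserved characters out of file names. The sketch below replaces the characters that Windows does not allow in file names; the function name `windows_safe_name` and the choice of "-" as the replacement are assumptions for illustration.

```python
import re

# Characters that Windows does not allow in file names.
_WINDOWS_RESERVED = re.compile(r'[<>:"/\\|?*]')


def windows_safe_name(filename, replacement="-"):
    """Replace characters that are invalid in Windows file names.

    Hypothetical helper for illustration only; not part of birdnetlib.
    Expects a bare file name, not a full path.
    """
    return _WINDOWS_RESERVED.sub(replacement, filename)
```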
DirectoryWatcher should process new files only when they close.
When using DirectoryMultiProcessingAnalyzer, it's not clear why it internally uses the MultiProcessRecording class. This seems to just return a subset of methods/attrs of a passed-in Recording instance (config, detections, path etc), meaning that access to other Recording methods is lost, e.g. extract_detections_as_audio.
Rather than iterating over the list of Recording instances, generating a list of MultiProcessRecording instances, would it not be better to just return the initial list, as happens in DirectoryAnalyzer?
I'm happy to work out a PR for this, if you think the above suggestion is a good idea.
Python 3.7 will reach end of life on 2023-06-27.
The plan is to drop support for 3.7 on the next release after that date.
I have a working test case for the custom classifier functions, but I don't have a custom classifier itself for testing.
If you have a classifier, a suitable .wav or .flac file, and you're willing for your work to be used in birdnetlib's test cases, please let me know.
As per the project license, any shared files will be released as part of this project under the open source GPL-3 license.
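from birdnetlib.watcher import DirectoryWatcher
from birdnetlib.analyzer_lite import LiteAnalyzer


def on_analyze_complete(recording):
    # Called by the watcher after each new file has been analyzed.
    print(recording.path)
    print(recording.detections)


analyzer = LiteAnalyzer()
dir = "."
watcher = DirectoryWatcher(
    dir, analyzer=analyzer, lon=-120.7463, lat=35.4244, week=18, min_conf=0.4
)
watcher.on_analyze_complete = on_analyze_complete
watcher.watch()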
recording.detections are intended to be saved in databases and used for template renders. It makes sense to convert those values to floats rather than leave that task to the user. Django's JSONField in particular can't save the detections as-is without pre-conversion of the numpy float32s to floats.
Note, this is a fix, so wouldn't be a breaking change.
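The conversion users currently have to do themselves can be sketched as follows. This is a hypothetical helper, not the library's implementation; it relies on NumPy registering its floating-point scalar types with `numbers.Real`, and the demo uses `Fraction` as a stand-in so the sketch has no third-party dependency.

```python
import json
import numbers
from fractions import Fraction


def to_plain_floats(detections):
    """Return copies of the detection dicts with non-builtin real numbers cast to float.

    Hypothetical helper for illustration only; not part of birdnetlib.
    """
    cleaned = []
    for detection in detections:
        row = {}
        for key, value in detection.items():
            # Leave bools and ints alone; cast every other real number to float.
            is_real = isinstance(value, numbers.Real) and not isinstance(value, (bool, int))
            row[key] = float(value) if is_real else value
        cleaned.append(row)
    return cleaned


# Fraction stands in for a NumPy float32 here.
sample = [{"common_name": "House Finch", "confidence": Fraction(1, 2), "start_time": 9.0}]
serialized = json.dumps(to_plain_floats(sample))
```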
Confirmed on Ubuntu and macOS. Spectrogram extraction gets slower as Python iterates through the detection list.
Watchers can and will be used in long-running processes. Therefore, we need a method for using the current date with each new recording.
Hi. Thanks for the work on this library - it's fantastic.
I have a raspberry pi zero out in the garden sending wav data back to a pi 4 running the library. Rather than copying files around, I made a few updates to the library so it can handle in memory buffers of WAV data. The changes let me run it in a simple TCP server that receives a constant stream of WAVs.
There is an example of how it works here:
https://github.com/elementechemlyn/birdnetlib/blob/wavbuffer/examples/simple_tcp_server.py
The changes are pretty rudimentary but are non-breaking and have been working well for me over the last few months. If you think they might be useful for others and would like a pull request then let me know.
Cheers,
E.
This is partially implemented in the Analyzer.return_predicted_species_list method here.
The current implementation only uses the uncommon week_48 date format. Also, the current implementation initializes the entire Analyzer model, which has some unnecessary startup processes as related to species list generation.
This needs to be added to the test and release actions.
This would help mostly with reproducibility, but would also aid in benchmarking the various models' performance for specific species.
analyzer22 = Analyzer(version=2.2)
analyzer23 = Analyzer(version=2.3)
# continue with processing audio ...
This is a work-in-progress. Expect a PR soon.
When providing the lat/long and datetime, Analyzer filters results by a BirdNET-determined expected species list. This filters out all non-bird sounds like engine, coyote, frogs, etc.
It would be helpful to have a detection return with an "in_species_list" boolean or something similar, rather than filter out non-avian sounds entirely.
Perhaps it should also include the occurrence frequency value that's output by BirdNET's species.py function.
[{'common_name': 'House Finch',
 'confidence': 0.5744,
 'end_time': 12.0,
 'in_species_list': True,
 'in_species_list_freq': 0.10,
 'scientific_name': 'Haemorhous mexicanus',
 'start_time': 9.0,
 'label': 'Haemorhous mexicanus_House Finch'},
 {'common_name': 'Coyote',
 'confidence': 0.4496,
 'end_time': 9.0,
 'in_species_list': False,
 'in_species_list_freq': None,
 'scientific_name': 'Canis latrans',
 'start_time': 6.0,
 'label': 'Coyote_Coyote'},
 {'common_name': 'House Finch',
 'confidence': 0.4496,
 'end_time': 15.0,
 'in_species_list': True,
 'in_species_list_freq': 0.40,
 'scientific_name': 'Haemorhous mexicanus',
 'start_time': 12.0,
 'label': 'Haemorhous mexicanus_House Finch'}]
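The proposed behavior, flagging detections instead of dropping them, could be sketched like this. The function name `annotate_detections` and the `species_frequencies` mapping (label to occurrence frequency) are assumptions for illustration; how the library would obtain those frequencies is not shown.

```python
def annotate_detections(detections, species_frequencies):
    """Flag each detection rather than dropping those missing from the species list.

    Hypothetical helper for illustration only; not part of birdnetlib.
    species_frequencies maps a label to its occurrence frequency.
    """
    annotated = []
    for detection in detections:
        freq = species_frequencies.get(detection["label"])
        annotated.append(
            {
                **detection,
                "in_species_list": freq is not None,
                "in_species_list_freq": freq,
            }
        )
    return annotated
```

A caller that wants the current behavior back can still filter on the `in_species_list` key afterwards.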
There is concern that the handler should be an example rather than part of the library. It's probably more important to provide working code as an example rather than prescribe a preferred schema for the database.
I would like the ability to watch a folder and automatically have more than one analyzer run the files and return results. Watcher should provide a method for returning recording detections from each analyzer, and for returning when all analyzers are complete.
Context: The BirdNET analyzers use a week value that's ranged 1-48.
Week of the year when the recording was made. Values in [1, 48] (4 weeks per month). Set -1 for year-round species list.
ref
Some users find this ambiguous (is there a preferred python function for converting a date to week-48 format?).
Also, since it's a non-standard interpretation of "weeks", I'd like to standardize with an eye towards the API staying the same for future-proofing. I'd hate to lock in a 48 week standard that has to be reversed if another analyzer ever becomes available.
Options are:
date argument, and use that to convert to BirdNET's required 48-week based value in the background
week_48 parameter as well to provide consistency to BirdNET users
I'm leaning on option 2 with option 3 implemented as a fallback for users that adhere to the legacy BirdNET usage.
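One possible reading of the "4 weeks per month" definition quoted above can be sketched as a date conversion. This is an assumption about how the mapping could work, not necessarily the formula BirdNET or birdnetlib uses; the name `week_48_from_date` is hypothetical.

```python
from datetime import date


def week_48_from_date(d):
    """Map a date to a 1-48 week index, assuming four 'weeks' per calendar month.

    Hypothetical helper for illustration only; not part of birdnetlib.
    """
    # Days 1-7 are week 1, 8-14 week 2, 15-21 week 3, and 22 onward week 4.
    week_in_month = min((d.day - 1) // 7 + 1, 4)
    return (d.month - 1) * 4 + week_in_month
```

Because `datetime` is a subclass of `date`, the same function accepts either type.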
After installing birdnetlib with tensorflow, librosa and all other needed dependencies, I've run into the error attached below. This happens when I try to import birdnetlib into my Python file. I'm using tensorflow version 2.15.0. Does anyone know a workaround for this? Am I using the wrong version? When I look at the tf documentation I can't even find a module called distribute.load_context:
ModuleNotFoundError: No module named tensorflow.python.distribute.load_context