orcasound / aifororcas-livesystem
Real-time AI-assisted killer whale notification system (model and moderator portal) :star:
Home Page: http://orcahello.ai4orcas.net/
License: MIT License
As of Sept 2022, the data schematic in the aifororcas-livesystem README indicates the inference model is deployed via Azure Container Instance service. It currently looks like this:
To be more accurate, the current image should be modified to instead indicate that the inference system is deployed via Azure Kubernetes Service (AKS).
I'm not sure where this graphic lives (in the Microsoft hackathon Teams workspace?) or what software it was created in. Ideally it would be stored within the repo in a format that can be edited easily in the future, rather than only as an image file.
After updating the file, the README should be proof-read to ensure deployment descriptions are consistent with the new version of the schematic.
Check out the Azure Cosmos DB Python SDK.
Prakruti asked @scottveirs, @dbaing17, and others to work out how many times SRKWs were missed by the live inference system over the first year of deployment and beta-testing.
Tasks:
Questions:
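One way to frame the missed-detection count, assuming candidate timestamps can be exported from Cosmos DB and we have a list of known SRKW event intervals (the function name, data shapes, and 5-minute window are all illustrative, not existing code):

```python
from datetime import datetime, timedelta

def count_missed_events(known_events, candidate_times, window=timedelta(minutes=5)):
    """Count known SRKW events with no OrcaHello candidate near them.

    known_events: list of (start, end) datetimes for confirmed SRKW events.
    candidate_times: datetimes of candidates the live system produced.
    An event counts as "missed" if no candidate falls within `window`
    of the event interval.
    """
    missed = 0
    for start, end in known_events:
        hit = any(start - window <= t <= end + window for t in candidate_times)
        if not hit:
            missed += 1
    return missed
```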
Historically, troubleshooting for inference system/notification system failures involved manual steps to identify failures. Past hackathon focused on utilizing Azure Dashboards to surface some metrics from Log Analytics. However, Azure Dashboards is difficult for non-technical observers to use.
I'd like to look into setting up something separate from Azure for monitoring purposes. It can either be a self-developed application or an existing monitoring solution (prometheus?). It should show at minimum:
The system normally processes one minute at a time. However, sometimes data are missing, so that only 40-50 seconds are received. When this happens, the bars indicating when detections occurred are misaligned. This is a rare problem and has little impact on the function of the system, so is a low priority from my perspective, but could be important for retraining runs.
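A minimal sketch of the fix, assuming the UI positions detection bars as a fraction of clip length (the function name, pixel width, and hard-coded 60-second baseline are assumptions about the current behavior):

```python
def bar_offset_px(detection_time_s, clip_duration_s, display_width_px=600):
    # Scale by the clip's actual duration (e.g. 40-50 s when segments are
    # missing) instead of assuming a fixed 60 s, so bars stay aligned.
    clip_duration_s = max(clip_duration_s, 1e-6)  # avoid division by zero
    return min(detection_time_s, clip_duration_s) / clip_duration_s * display_width_px
```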
Due to issues that I am working on resolving with Microsoft support, I am attempting to redeploy using https://github.com/orcasound/aifororcas-livesystem/blob/main/inference_system/deploy-aci.yaml. I noticed that the config file is out of date (two containers are specified in the file, but three are running in the current ACI instance).
Can you update deploy-aci.yaml?
Add functionality to notify moderators and subscribers via SMS. Maybe use azure communication services?
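A sketch of what the SMS path might look like with the Azure Communication Services Python SDK (`azure-communication-sms`). The connection string, phone numbers, and message format are assumptions; only the message formatting is shown runnable here:

```python
def format_detection_sms(location, timestamp, confidence):
    # Hypothetical message body for moderator/subscriber SMS alerts.
    return (f"OrcaHello: possible SRKW detection at {location} "
            f"({timestamp}), avg confidence {confidence:.0%}")

# Sending would then look roughly like this (untested sketch; requires an
# ACS resource and a provisioned phone number):
# from azure.communication.sms import SmsClient
# client = SmsClient.from_connection_string(ACS_CONNECTION_STRING)
# client.send(from_=ACS_PHONE_NUMBER, to=[moderator_number],
#             message=format_detection_sms("Orcasound Lab", "2021-11-02 14:05", 0.72))
```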
In at least one case during first year of deployment, the metadata for a candidate stated four detections were contained in the candidate. Yet, upon examination of the model predictions via the "Details" button, there appears to be only one detection (overlaid white bounding lines) near the end of the 60-second clip.
I've noticed this only once (see candidate in question), but tagged it as related to this potential "bug" within the Cosmos DB annotations.
Put in unit tests that will run for each PR.
Put in a build pipeline that will run for each PR.
Automated deployment of moderator portal after PR completes.
The inference system currently runs in Azure Container Instances. It restarts every once in a while, which is expected. However, logs are printed to stdout and lost when the container restarts.
We would like to save logs externally so they are available beyond container restarts.
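One low-effort approach, assuming the inference loop uses Python's `logging` module: attach a second handler that appends to a mounted volume (or, with an App Insights exporter, ships to Azure Log Analytics). A minimal file-based sketch; the logger name and path are placeholders:

```python
import logging

def configure_persistent_logging(log_path):
    # Keep the existing stdout logs, but also append to a file on a volume
    # that outlives container restarts (the path is an assumption).
    logger = logging.getLogger("inference")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.StreamHandler())        # stdout, as today
    logger.addHandler(logging.FileHandler(log_path))  # survives restarts
    return logger
```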
Since approximately the last Microsoft hackathon in October there has been a degradation in performance of the inference system. Over this period (Nov through Feb) it feels like the model has been generating more false positives that don't seem to have any sort of signal reminiscent of a killer whale call. I have been wondering if the model has become more sensitive to low-frequencies, possibly due to being re-trained with data that included humpback non-song vocalizations.
However, there are cases where I can't hear (or see in the spectrogram) any tonal, whale-like signals at either high or low frequencies. These false positives are disconcerting and are difficult for moderators to process in large numbers (because they are pretty boring to review! ;). Most often, these candidates have pretty low average confidence (just above the 50% threshold), but here are a couple of recent examples where the average confidence was above 70%!
My impression is that this issue is by far most prevalent for the Orcasound Lab node. In fact, there have been remarkably few candidates for review from Port Townsend and Bush Point in comparison over the last few months. I'm not sure if the differences are related to this performance issue, but here are the tallies for Jan 01 - Feb 28, 2022:
It took a while to settle in on this apparent change in performance of the inference system, in part because after the hackathon we migrated to a new Azure subscription and experimented with different modes of deploying the code. Additionally, the winter storms caused a suite of physical changes in the hydrophones and deployment conditions at the Orcasound node during this period. A confounding factor is that SRKW transits of the Orcasound hydrophones are decreasing during the same period (which is the normal wintertime pattern), so we have few opportunities to observe the model performance when SRKW calls are present. Nevertheless, these false positives associated only with typical background noise are a significant issue and may indicate a bug and/or a step backwards in model performance, at least for the Orcasound Lab node.
For the last week or so, I've added the `fail` tag to candidates which only seem to contain "typical" noise for a particular hydrophone location. These can be explored via the OrcaHello Dashboard by searching the tag cloud for `fail`.
...
On my iPhone in Chrome (& Safari and Firefox) I can log into the moderator site, but playback on the default or detailed view of a candidate makes no sound. This makes it impossible to review candidates when I'm away from my laptop, other than visually inspecting the spectrograms.
However, the play button icon does toggle to pause, and the vertical white lines indicating predictions in the detail view are rendered. See screen shot:
This is not expected behavior for WAV format playback on iOS, at least in Chrome: https://en.m.wikipedia.org/wiki/HTML5_audio#Supported_audio_coding_formats
Get @paulcretu to teach us (at least @scottveirs ) how to archive merged or abandoned feature branches as he has done in the orcasite
repo, e.g. https://github.com/orcasound/orcasite/tags
The output of this function should be graphs embedded in a markdown file, as described in the readme.md.
A lot of this likely exists, you'd have to look into https://github.com/orcasound/aifororcas-livesystem/blob/main/ModelEvaluation/readme.md and https://github.com/orcasound/aifororcas-livesystem/blob/main/ModelEvaluation/score.py
This "scoring" procedure will be used to create the body of the PR that we will eventually check in.
You'll likely need to check in with Aayush for the forward inference part of this.
On some days there are many candidates containing only false positives (e.g. 20-40 candidates containing only Pigeon Guillemot bird calls). In such cases -- when the moderator is confident that no SRKW were present at the time and location of the false positives -- it would be helpful to be able to "Select all" within a page view or a time interval, and then annotate that temporal batch of candidates with the same labels and comment.
To ensure high-quality annotation of acoustic bouts containing SRKW, each candidate in a bout should be processed separately, rather than in a batch. This would increase odds that every true positive is confirmed, false negatives are flagged, and that call types and "special" signals like whistles or buzzes will be annotated accurately.
For this event, I received no notification as a moderator, nor did the OrcaHello system seem to create any candidates that are visible within the moderator portal. Having reviewed much of the continuous recording, I believe there are many SRKW calls that would have been detected by the system as it was performing in late 2020 and early 2021.
Which hydrophone?
Orcasound Lab (Haro Strait)
When did the KW event start?
11/2/2021 | 14:05:20
When did the KW event end?
11/2/2021 | 16:15
What was the nature of the event?
Greater than 2 hours of SRKW signals at intermediate to high SNR. Signals included an unusually wide variety of SRKW calls; Monika Wieland estimated hearing about ~2/3 of the SRKW repertoire, whereas we typically hear <1/4. There were
Was the container running? (If you have this info?)
TBD
I will provide a link to a blog post presenting the continuous recording (HLS segments transcoded to .ogg and .mp3). For now, the start/stop date-times are listed in the shared Orcasound annotation candidate spreadsheet.
Audio like chain clinking, boat noise, or high-frequency sounds can sometimes confuse our system.
This feature request is for a moderator to launch an automated re-training of a model when we find new instances of false positives.
Scoped to only inanimate false positives for now though the actual mechanism would be independent of what data is considered false positive.
Ideally, this feature enables the creation of a PR that updates the model.
Port Townsend inference system container (orcaconservancycr.azurecr.io/live-inference-system:11-15-20.FastAI.R1-12.PortTownsend.v0) crashes with the below message:
```
Listening to location https://s3-us-west-2.amazonaws.com/streaming-orcasound-net/rpi_port_townsend
Traceback (most recent call last):
  File "./src/LiveInferenceOrchestrator.py", line 158, in <module>
    clip_path, start_timestamp, current_clip_end_time = hls_stream.get_next_clip(current_clip_end_time)
  File "/usr/src/app/src/hls_utils/HLSStream.py", line 94, in get_next_clip
    num_segments_in_wav_duration = math.ceil(self.polling_interval/stream_obj.target_duration)
TypeError: unsupported operand type(s) for /: 'int' and 'NoneType'
```
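The `TypeError` means `stream_obj.target_duration` came back as `None`, i.e. the m3u8 playlist had no usable `EXT-X-TARGETDURATION`. A hedged guard for that division; the 10-second fallback is an assumption, not the repo's actual fix:

```python
import math

def segments_for_polling_interval(polling_interval, target_duration,
                                  default_target_duration=10):
    # Playlists missing EXT-X-TARGETDURATION parse to target_duration=None;
    # fall back to a default instead of raising TypeError on division.
    if not target_duration:
        target_duration = default_target_duration
    return math.ceil(polling_interval / target_duration)
```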
There is a class to do this, see DateRangeHLS
https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem/src/hls_utils
See https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/src/PrepareDataForPredictionExplorer.py for usage.
Add support + error handling to make sure only date ranges < a certain time period are allowed.
Add support for randomly sampling audio segments within the date range.
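Both asks can be sketched together; the 7-day cap, clip length, and function name are placeholders, and the real implementation would live alongside `DateRangeHLS` in `hls_utils`:

```python
import random
from datetime import datetime, timedelta

MAX_RANGE = timedelta(days=7)  # placeholder cap on allowed date ranges

def sample_clip_starts(start, end, clip_seconds=60, n=10, seed=None):
    # Validate the range, then draw n random clip-start times within it.
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > MAX_RANGE:
        raise ValueError(f"date range must not exceed {MAX_RANGE}")
    span = int((end - start).total_seconds()) - clip_seconds
    if span <= 0:
        raise ValueError("range shorter than one clip")
    rng = random.Random(seed)
    return sorted(start + timedelta(seconds=rng.randrange(span + 1))
                  for _ in range(n))
```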
It seems the delay is slowly increasing between a push of the "Submit" button and the refreshed moderator UI view (next candidate ready for annotation). As of today (8/19/21) it is about 9 seconds. This makes bulk annotation much less efficient than it could be...
FWIW: this delay is based on a MacBook Pro (OSX 10.15.5) running the Chrome browser -- Version 92.0.4515.159 (Official Build) (x86_64).
Add dashboard for Azure Functions execution count for SendModeratorEmail and SendSubscriberEmail. This will allow us to have a "single pane of glass" to identify failures in the pipeline.
Call with Michelle on Sat Sep 3 regarding late 2021 deployment via Kubernetes.
Add unit testing to Moderator Frontend projects and integrate into the GitHub Actions workflows (or create new workflows as needed).
Also, take the opportunity to refactor https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem/src/hls_utils
Convert the code from a notebook to a script. The ask is a function that can be invoked.
When sorting in ascending or descending order, the year is not included in the sort key; e.g., December 2020 is treated as coming after January 2021.
Due to the large number of detections accumulated over the last year, it would be helpful to be able to jump to a specific date and time rather than starting at the beginning or end and working toward the middle.
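The symptom suggests the current key sorts on month/day only. A sketch of a full-date key; the `MM/DD/YYYY HH:MM:SS` string format is an assumption about how timestamps are stored:

```python
from datetime import datetime

def candidate_sort_key(timestamp_str):
    # Parse the whole timestamp, year included, so December 2020
    # sorts before January 2021.
    return datetime.strptime(timestamp_str, "%m/%d/%Y %H:%M:%S")

def sort_candidates(timestamps, descending=False):
    return sorted(timestamps, key=candidate_sort_key, reverse=descending)
```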
Do not attempt this till some other dependencies are resolved.
Potentially we can use the GitHub API or FastAI's ghapi.
The other thing to figure out is how to authenticate so we can submit a PR without checking in keys.
Pipeline for notification currently builds. We also want it to automatically deploy to Azure.
The page could live in this repo as a markdown README.md file, or on a website, hopefully somewhere here:
https://ai4orcas.net/portfolio/orcahello-live-inference-system/
Unit tests that run for new PRs
Build pipelines for new PRs
Automated deployment of a new container once a PR is complete, using GitHub Actions.
SendGrid presents us with a lovely dashboard showing email metrics:
However, the dashboard only lives inside SendGrid, which makes accessing the dashboard more difficult. We would like to replicate the dashboard in Azure.
This should be possible by using Event Webhooks.
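SendGrid's Event Webhook POSTs a JSON array of event objects, each carrying an `"event"` field (delivered, open, bounce, and so on). A minimal sketch of the receiving side's aggregation step, which an Azure Function could then push to a custom metric; the function name is illustrative:

```python
import json
from collections import Counter

def summarize_sendgrid_events(payload):
    # Tally webhook events by type so delivery/bounce counts can be
    # charted in an Azure dashboard.
    events = json.loads(payload)
    return Counter(e.get("event", "unknown") for e in events)
```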
Convert the notebook to a script containing at least two functions, blenddataset() and finetune().
Also, include code to "blend" data from old dataset + new as a separate function.
Please add comments liberally in the script, in addition to the reasoning written in the notebook.
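A rough shape for `blenddataset()`; the blend ratio and shuffling are assumptions, and `finetune()` would wrap the existing FastAI training call:

```python
import random

def blenddataset(old_samples, new_samples, new_fraction=0.3, seed=0):
    # Mix a capped fraction of new samples (e.g. freshly labeled false
    # positives) into the old training set, then shuffle deterministically.
    rng = random.Random(seed)
    n_new = min(len(new_samples), int(len(old_samples) * new_fraction))
    blended = list(old_samples) + rng.sample(list(new_samples), n_new)
    rng.shuffle(blended)
    return blended
```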
MIT?
The email notifications are working only intermittently. They failed on 9/15, resumed on 9/16, and failed again on 9/17, after seeming to work flawlessly for almost a year.
Change base image in Dockerfile https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/Dockerfile
Test that everything works
For this brief (<10 min-long) event, I received no notification as a moderator, nor did the OrcaHello system seem to create any candidates that are visible within the moderator portal.
Having reviewed much of the continuous recording, I believe there are many SRKW calls that would have been detected by the system as it was performing in late 2020 and early 2021. The signal to noise ratio was intermediate and many calls were 100% recognizable as those from SRKWs.
Which hydrophone?
Orcasound Lab (Haro Strait)
When did the KW event start?
11/9/2021 | 15:48:00
When did the KW event end?
11/9/2021 | 15:57:00
What was the nature of the event?
About 8 minutes of SRKW signals at intermediate to high SNR. Signals included an unusually wide variety of SRKW calls. An Orcasound listener annotated the live audio data at 15:58:16. I reviewed continuous recording and heard lots of calls, nearly continuous sometimes and often overlapping. I noted a surprisingly high % of S10 excitement calls, some S01s, S16s, S17s?, and 3 S04s at the end.
Was the container running? (If you have this info?)
TBD
Here is a link to the continuous recording (HLS segments transcoded to .ogg and .mp3). There is also a set of preliminary annotations there in Audacity label track format.
We would like to reduce the number of notifications subscribers receive.
Some state tracking might be necessary across function runs.
Moderator emails should still go out immediately for every potential detected orca clip.
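One way to sketch the state tracking, assuming the function can persist a last-notified timestamp (e.g. in Cosmos DB or a blob); the cooldown value and names are placeholders:

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=1)  # placeholder throttle window

def should_notify_subscribers(now, last_notified):
    # Throttle subscriber email only; moderator email is sent unconditionally
    # for every potential detection, per the requirement above.
    return last_notified is None or now - last_notified >= COOLDOWN
```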
A good question was raised on a call with Canadian open source collaborators today (HALLO project, #ai4orcas-hallo in Orcasound Slack), some of whom have been experimenting with different window durations in developing a binary classifier for SRKW+Bigg's+NRKW+offshore ecotypes of killer whales in the NE Pacific (with habitat in BC, Canada, coastal environments):
Why did Pod.Cast and OrcaHello elect to use a 2.45 second window?
It would be ideal to recall the rationale and add it to the README.MD file.
On the call, I said I thought it was due to the statistics of SRKW call duration, but I'm not seeing the 2.45 second (or 2450 millisecond) value in Orcasound's shared spreadsheet of SRKW.
As a bioacoustician end-user (OrcaHello moderator persona),
In the default moderator view of candidates and the Detail view of candidates with overlaid detections,
I would like to see the axes of the spectrogram, specifically frequency (in Hz) and time (in seconds) with ticks and labels.
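For reference, the tick values follow directly from the STFT parameters; a sketch of what the labeled axes would show (parameter names and values are illustrative, not the portal's actual settings):

```python
def spectrogram_axes(sample_rate, n_fft, hop_length, n_frames):
    # Frequency-bin centers in Hz and frame times in seconds,
    # suitable for axis ticks and labels on the spectrogram.
    freqs_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    times_s = [i * hop_length / sample_rate for i in range(n_frames)]
    return freqs_hz, times_s
```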
For example, if the container images in https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/deploy-aci.yaml are changed, we could use a GitHub Action to automatically re-deploy.
We moved the inference system from ACI to AKS after several long and unexplained failures of the former. However, the current documentation for inference system deployment still refers to ACI.
From logs, the inference system is attempting to pull audio from the Bush Point hydrophone from https://s3-us-west-2.amazonaws.com/streaming-orcasound-net/rpi_bush_point but fails.
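For troubleshooting, the playlist URL the system polls can be reconstructed and fetched by hand. The path layout below (`hls/<unix-timestamp>/live.m3u8`, with the timestamp read from the bucket's `latest.txt` marker) is my understanding of the Orcasound bucket structure and should be verified:

```python
def hls_playlist_url(bucket_base, node, stream_unix_ts):
    # e.g. node = "rpi_bush_point"; the layout is an assumption --
    # verify against the actual bucket before relying on it.
    return f"{bucket_base}/{node}/hls/{stream_unix_ts}/live.m3u8"
```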