orcasound / aifororcas-livesystem
Real-time AI-assisted killer whale notification system (model and moderator portal) :star:
Home Page: http://orcahello.ai4orcas.net/
License: MIT License
As of Sept 2022, the data schematic in the aifororcas-livesystem README indicates the inference model is deployed via Azure Container Instance service. It currently looks like this:
To be more accurate, the current image should be modified to instead indicate that the inference system is deployed via Azure Kubernetes Service (AKS).
I'm not sure where this graphic lives (in the Microsoft hackathon Teams workspace?) or what software it was created in. Ideally it would be stored within the repo in a format that can be edited easily in the future, rather than only as an image file.
After updating the file, the README should be proof-read to ensure deployment descriptions are consistent with the new version of the schematic.
Check out the Azure Cosmos DB Python SDK.
Prakruti asked @scottveirs, @dbaing17, and others to work out how many times SRKWs were missed by the live inference system over the first year of deployment and beta-testing.
Tasks:
Questions:
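One way to frame the missed-detection count, assuming candidate timestamps can be exported from Cosmos DB and we have a list of known SRKW event intervals (the function name, data shapes, and 5-minute window are all illustrative, not existing code):

```python
from datetime import datetime, timedelta

def count_missed_events(known_events, candidate_times, window=timedelta(minutes=5)):
    """Count known SRKW events with no OrcaHello candidate near them.

    known_events: list of (start, end) datetimes for confirmed SRKW events.
    candidate_times: datetimes of candidates the live system produced.
    An event counts as "missed" if no candidate falls within `window`
    of the event interval.
    """
    missed = 0
    for start, end in known_events:
        hit = any(start - window <= t <= end + window for t in candidate_times)
        if not hit:
            missed += 1
    return missed
```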
Historically, troubleshooting for inference system/notification system failures involved manual steps to identify failures. Past hackathon focused on utilizing Azure Dashboards to surface some metrics from Log Analytics. However, Azure Dashboards is difficult for non-technical observers to use.
I'd like to look into setting up something separate from Azure for monitoring purposes. It can either be a self-developed application or an existing monitoring solution (prometheus?). It should show at minimum:
The system normally processes one minute at a time. However, sometimes data are missing, so that only 40-50 seconds are received. When this happens, the bars indicating when detections occurred are misaligned. This is a rare problem and has little impact on the function of the system, so is a low priority from my perspective, but could be important for retraining runs.
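A minimal sketch of the fix, assuming the UI positions detection bars as a fraction of clip length (the function name, pixel width, and hard-coded 60-second baseline are assumptions about the current behavior):

```python
def bar_offset_px(detection_time_s, clip_duration_s, display_width_px=600):
    # Scale by the clip's actual duration (e.g. 40-50 s when segments are
    # missing) instead of assuming a fixed 60 s, so bars stay aligned.
    clip_duration_s = max(clip_duration_s, 1e-6)  # avoid division by zero
    return min(detection_time_s, clip_duration_s) / clip_duration_s * display_width_px
```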
Due to issues that I am working on resolving with Microsoft support, I am attempting to redeploy using https://github.com/orcasound/aifororcas-livesystem/blob/main/inference_system/deploy-aci.yaml. I noticed that the config file is out of date (two containers are specified in the file, but three are running in the current ACI instance).
Can you update deploy-aci.yaml?
Add functionality to notify moderators and subscribers via SMS. Maybe use azure communication services?
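A sketch of what the SMS path might look like with the Azure Communication Services Python SDK (`azure-communication-sms`). The connection string, phone numbers, and message format are assumptions; only the message formatting is shown runnable here:

```python
def format_detection_sms(location, timestamp, confidence):
    # Hypothetical message body for moderator/subscriber SMS alerts.
    return (f"OrcaHello: possible SRKW detection at {location} "
            f"({timestamp}), avg confidence {confidence:.0%}")

# Sending would then look roughly like this (untested sketch; requires an
# ACS resource and a provisioned phone number):
# from azure.communication.sms import SmsClient
# client = SmsClient.from_connection_string(ACS_CONNECTION_STRING)
# client.send(from_=ACS_PHONE_NUMBER, to=[moderator_number],
#             message=format_detection_sms("Orcasound Lab", "2021-11-02 14:05", 0.72))
```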
In at least one case during first year of deployment, the metadata for a candidate stated four detections were contained in the candidate. Yet, upon examination of the model predictions via the "Details" button, there appears to be only one detection (overlaid white bounding lines) near the end of the 60-second clip.
I've noticed this only once (see candidate in question), but tagged it as related to this potential "bug" within the Cosmos DB annotations.
Put in unit tests that will run for each PR.
Put in a build pipeline that will run for each PR.
Automated deployment of moderator portal after PR completes.
The inference system currently runs in Azure Container Instances. It restarts every once in a while, which is expected. However, logs are printed to stdout and lost when the container restarts.
We would like to save logs externally so they are available beyond container restarts.
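One low-effort approach, assuming the inference loop uses Python's `logging` module: attach a second handler that appends to a mounted volume (or, with an App Insights exporter, ships to Azure Log Analytics). A minimal file-based sketch; the logger name and path are placeholders:

```python
import logging

def configure_persistent_logging(log_path):
    # Keep the existing stdout logs, but also append to a file on a volume
    # that outlives container restarts (the path is an assumption).
    logger = logging.getLogger("inference")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.StreamHandler())        # stdout, as today
    logger.addHandler(logging.FileHandler(log_path))  # survives restarts
    return logger
```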
Since approximately the last Microsoft hackathon in October there has been a degradation in performance of the inference system. Over this period (Nov through Feb) it feels like the model has been generating more false positives that don't seem to have any sort of signal reminiscent of a killer whale call. I have been wondering if the model has become more sensitive to low-frequencies, possibly due to being re-trained with data that included humpback non-song vocalizations.
However, there are cases where I can't hear (or see in the spectrogram) any tonal, whale-like signals at either high or low frequencies. These false positives are disconcerting and are difficult for moderators to process in large numbers (because they are pretty boring to review! ;). Most often, these candidates have pretty low average confidence (just above the 50% threshold), but here are a couple of recent examples where the average confidence was above 70%!
My impression is that this issue is by far most prevalent for the Orcasound Lab node. In fact, there have been remarkably few candidates for review from Port Townsend and Bush Point in comparison over the last few months. I'm not sure if the differences are related to this performance issue, but here are the tallies for Jan 01 - Feb 28, 2022:
It took a while to settle in on this apparent change in performance of the inference system, in part because after the hackathon we migrated to a new Azure subscription and experimented with different modes of deploying the code. Additionally, the winter storms caused a suite of physical changes in the hydrophones and deployment conditions at the Orcasound node during this period. A confounding factor is that SRKW transits of the Orcasound hydrophones are decreasing during the same period (which is the normal wintertime pattern), so we have few opportunities to observe the model performance when SRKW calls are present. Nevertheless, these false positives associated only with typical background noise are a significant issue and may indicate a bug and/or a step backwards in model performance, at least for the Orcasound Lab node.
For the last week or so, I've added the `fail` tag to candidates which only seem to contain "typical" noise for a particular hydrophone location. These can be explored via the OrcaHello Dashboard by searching the tag cloud for `fail`.
...
On my iPhone in Chrome (& Safari and Firefox) I can log into the moderator site, but playback on the default or detailed view of a candidate makes no sound. This makes it impossible to review candidates when I'm away from my laptop, other than visually inspecting the spectrograms.
However, the play button icon does toggle to pause, and the vertical white lines indicating predictions in the detail view are rendered. See screen shot:
This is not expected behavior for WAV format playback on iOS, at least in Chrome: https://en.m.wikipedia.org/wiki/HTML5_audio#Supported_audio_coding_formats
Get @paulcretu to teach us (at least @scottveirs ) how to archive merged or abandoned feature branches as he has done in the orcasite
repo, e.g. https://github.com/orcasound/orcasite/tags
The output of this function should be graphs embedded in a markdown file, as described in the readme.md.
A lot of this likely exists, you'd have to look into https://github.com/orcasound/aifororcas-livesystem/blob/main/ModelEvaluation/readme.md and https://github.com/orcasound/aifororcas-livesystem/blob/main/ModelEvaluation/score.py
This "scoring" procedure will be used to create the body of the PR that we will eventually check in.
You'll likely need to check in with Aayush for the forward inference part of this.
On some days there are many candidates containing only false positives (e.g. 20-40 candidates containing only Pigeon Guillemot bird calls). In such cases -- when the moderator is confident that no SRKW were present at the time and location of the false positives -- it would be helpful to be able to "Select all" within a page view or a time interval, and then annotate that temporal batch of candidates with the same labels and comment.
To ensure high-quality annotation of acoustic bouts containing SRKW, each candidate in a bout should be processed separately, rather than in a batch. This would increase odds that every true positive is confirmed, false negatives are flagged, and that call types and "special" signals like whistles or buzzes will be annotated accurately.
For this event, I received no notification as a moderator, nor did the OrcaHello system seem to create any candidates that are visible within the moderator portal. Having reviewed much of the continuous recording, I believe there are many SRKW calls that would have been detected by the system as it was performing in late 2020 and early 2021.
Which hydrophone?
Orcasound Lab (Haro Strait)
When did the KW event start?
11/2/2021 | 14:05:20
When did the KW event end?
11/2/2021 | 16:15
What was the nature of the event?
Greater than 2 hours of SRKW signals at intermediate to high SNR. Signals included an unusually wide variety of SRKW calls; Monika Wieland estimated hearing about ~2/3 of the SRKW repertoire, whereas we typically hear <1/4. There were
Was the container running? (If you have this info?)
TBD
I will provide a link to a blog post presenting the continuous recording (HLS segments transcoded to .ogg and .mp3). For now, the start/stop date-times are listed in the shared Orcasound annotation candidate spreadsheet.
Audio like chain clinking, boat noise, or high-frequency sounds can sometimes confuse our system.
This feature request is for a moderator to launch an automated re-training of a model when we find new instances of false positives.
Scoped to only inanimate false positives for now though the actual mechanism would be independent of what data is considered false positive.
Ideally, this feature enables the creation of a PR that updates the model.
Port Townsend inference system container (orcaconservancycr.azurecr.io/live-inference-system:11-15-20.FastAI.R1-12.PortTownsend.v0) crashes with the below message:
```
Listening to location https://s3-us-west-2.amazonaws.com/streaming-orcasound-net/rpi_port_townsend
Traceback (most recent call last):
  File "./src/LiveInferenceOrchestrator.py", line 158, in <module>
    clip_path, start_timestamp, current_clip_end_time = hls_stream.get_next_clip(current_clip_end_time)
  File "/usr/src/app/src/hls_utils/HLSStream.py", line 94, in get_next_clip
    num_segments_in_wav_duration = math.ceil(self.polling_interval/stream_obj.target_duration)
TypeError: unsupported operand type(s) for /: 'int' and 'NoneType'
```
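The `TypeError` means `stream_obj.target_duration` came back as `None`, i.e. the m3u8 playlist had no usable `EXT-X-TARGETDURATION`. A hedged guard for that division; the 10-second fallback is an assumption, not the repo's actual fix:

```python
import math

def segments_for_polling_interval(polling_interval, target_duration,
                                  default_target_duration=10):
    # Playlists missing EXT-X-TARGETDURATION parse to target_duration=None;
    # fall back to a default instead of raising TypeError on division.
    if not target_duration:
        target_duration = default_target_duration
    return math.ceil(polling_interval / target_duration)
```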
There is a class to do this, see DateRangeHLS
https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem/src/hls_utils
See https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/src/PrepareDataForPredictionExplorer.py for usage.
Add support + error handling to make sure only date ranges < a certain time period are allowed.
Add support for randomly sampling audio segments within the date range.
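Both asks can be sketched together; the 7-day cap, clip length, and function name are placeholders, and the real implementation would live alongside `DateRangeHLS` in `hls_utils`:

```python
import random
from datetime import datetime, timedelta

MAX_RANGE = timedelta(days=7)  # placeholder cap on allowed date ranges

def sample_clip_starts(start, end, clip_seconds=60, n=10, seed=None):
    # Validate the range, then draw n random clip-start times within it.
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > MAX_RANGE:
        raise ValueError(f"date range must not exceed {MAX_RANGE}")
    span = int((end - start).total_seconds()) - clip_seconds
    if span <= 0:
        raise ValueError("range shorter than one clip")
    rng = random.Random(seed)
    return sorted(start + timedelta(seconds=rng.randrange(span + 1))
                  for _ in range(n))
```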
It seems the delay is slowly increasing between a push of the "Submit" button and the refreshed moderator UI view (next candidate ready for annotation). As of today (8/19/21) it is about 9 seconds. This makes bulk annotation much less efficient than it could be...
FWIW: this delay is based on a MacBook Pro (OSX 10.15.5) running the Chrome browser -- Version 92.0.4515.159 (Official Build) (x86_64).
Add dashboard for Azure Functions execution count for SendModeratorEmail and SendSubscriberEmail. This will allow us to have a "single pane of glass" to identify failures in the pipeline.
Call with Michelle on Sat Sep 3 regarding late 2021 deployment via Kubernetes.
Add unit testing to Moderator Frontend projects and integrate into the GitHub Actions workflows (or create new workflows as needed).
Also, take the opportunity to refactor https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem/src/hls_utils
Convert the code from a notebook to a script. The ask is a function that can be invoked.
When sorting in ascending or descending order, the year is not included in the sort key; e.g., December 2020 is treated as coming after January 2021.
Due to the large number of detections accumulated over the last year, it would be helpful to be able to jump to a specific date and time rather than starting at the beginning or end and working toward the middle.
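The symptom suggests the current key sorts on month/day only. A sketch of a full-date key; the `MM/DD/YYYY HH:MM:SS` string format is an assumption about how timestamps are stored:

```python
from datetime import datetime

def candidate_sort_key(timestamp_str):
    # Parse the whole timestamp, year included, so December 2020
    # sorts before January 2021.
    return datetime.strptime(timestamp_str, "%m/%d/%Y %H:%M:%S")

def sort_candidates(timestamps, descending=False):
    return sorted(timestamps, key=candidate_sort_key, reverse=descending)
```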
Do not attempt this till some other dependencies are resolved.
Potentially we can use the GitHub API or FastAI's ghapi.
The other thing to figure out is how to authenticate so we can submit a PR without checking in keys.
Pipeline for notification currently builds. We also want it to automatically deploy to Azure.
The page could live in this repo as a markdown README.md file, or on a website, hopefully somewhere here:
https://ai4orcas.net/portfolio/orcahello-live-inference-system/
Unit tests that run for new PRs
Build pipelines for new PRs
Automated deployment of a new container once a PR is complete, using GitHub Actions.
SendGrid presents us with a lovely dashboard showing email metrics:
However, the dashboard only lives inside SendGrid, which makes accessing the dashboard more difficult. We would like to replicate the dashboard in Azure.
This should be possible by using Event Webhooks.
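SendGrid's Event Webhook POSTs a JSON array of event objects, each carrying an `"event"` field (delivered, open, bounce, and so on). A minimal sketch of the receiving side's aggregation step, which an Azure Function could then push to a custom metric; the function name is illustrative:

```python
import json
from collections import Counter

def summarize_sendgrid_events(payload):
    # Tally webhook events by type so delivery/bounce counts can be
    # charted in an Azure dashboard.
    events = json.loads(payload)
    return Counter(e.get("event", "unknown") for e in events)
```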
Convert the notebook to a script containing at least two functions, blenddataset() and finetune().
Also, include code to "blend" data from old dataset + new as a separate function.
Please add comments liberally in the script, in addition to the reasoning written in the notebook.
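A rough shape for `blenddataset()`; the blend ratio and shuffling are assumptions, and `finetune()` would wrap the existing FastAI training call:

```python
import random

def blenddataset(old_samples, new_samples, new_fraction=0.3, seed=0):
    # Mix a capped fraction of new samples (e.g. freshly labeled false
    # positives) into the old training set, then shuffle deterministically.
    rng = random.Random(seed)
    n_new = min(len(new_samples), int(len(old_samples) * new_fraction))
    blended = list(old_samples) + rng.sample(list(new_samples), n_new)
    rng.shuffle(blended)
    return blended
```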
MIT?
The email notifications are working only intermittently. They failed on 9/15, resumed on 9/16, and failed again on 9/17, after seeming to work flawlessly for almost a year.
Change base image in Dockerfile https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/Dockerfile
Test that everything works
For this brief (<10 min-long) event, I received no notification as a moderator, nor did the OrcaHello system seem to create any candidates that are visible within the moderator portal.
Having reviewed much of the continuous recording, I believe there are many SRKW calls that would have been detected by the system as it was performing in late 2020 and early 2021. The signal to noise ratio was intermediate and many calls were 100% recognizable as those from SRKWs.
Which hydrophone?
Orcasound Lab (Haro Strait)
When did the KW event start?
11/9/2021 | 15:48:00
When did the KW event end?
11/9/2021 | 15:57:00
What was the nature of the event?
About 8 minutes of SRKW signals at intermediate to high SNR. Signals included an unusually wide variety of SRKW calls. An Orcasound listener annotated the live audio data at 15:58:16. I reviewed continuous recording and heard lots of calls, nearly continuous sometimes and often overlapping. I noted a surprisingly high % of S10 excitement calls, some S01s, S16s, S17s?, and 3 S04s at the end.
Was the container running? (If you have this info?)
TBD
Here is a link to the continuous recording (HLS segments transcoded to .ogg and .mp3). There is also a set of preliminary annotations there in Audacity label track format.
We would like to reduce the number of notifications subscribers receive.
Some state tracking might be necessary across function runs.
Moderator emails should still go out immediately for every potential detected orca clip.
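One way to sketch the state tracking, assuming the function can persist a last-notified timestamp (e.g. in Cosmos DB or a blob); the cooldown value and names are placeholders:

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=1)  # placeholder throttle window

def should_notify_subscribers(now, last_notified):
    # Throttle subscriber email only; moderator email is sent unconditionally
    # for every potential detection, per the requirement above.
    return last_notified is None or now - last_notified >= COOLDOWN
```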
A good question was raised on a call with Canadian open source collaborators today (HALLO project, #ai4orcas-hallo in Orcasound Slack), some of whom have been experimenting with different window durations in developing a binary classifier for SRKW+Bigg's+NRKW+offshore ecotypes of killer whales in the NE Pacific (with habitat in BC, Canada, coastal environments):
Why did Pod.Cast and OrcaHello elect to use a 2.45 second window?
It would be ideal to recall the rationale and add it to the README.MD file.
On the call, I said I thought it was due to the statistics of SRKW call duration, but I'm not seeing the 2.45 second (or 2450 millisecond) value in Orcasound's shared spreadsheet of SRKW.
As a bioacoustician end-user (OrcaHello moderator persona),
In the default moderator view of candidates and the Detail view of candidates with overlaid detections,
I would like to see the axes of the spectrogram, specifically frequency (in Hz) and time (in seconds) with ticks and labels.
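For reference, the tick values follow directly from the STFT parameters; a sketch of what the labeled axes would show (parameter names and values are illustrative, not the portal's actual settings):

```python
def spectrogram_axes(sample_rate, n_fft, hop_length, n_frames):
    # Frequency-bin centers in Hz and frame times in seconds,
    # suitable for axis ticks and labels on the spectrogram.
    freqs_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    times_s = [i * hop_length / sample_rate for i in range(n_frames)]
    return freqs_hz, times_s
```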
For example, if the container images in https://github.com/orcasound/aifororcas-livesystem/blob/main/InferenceSystem/deploy-aci.yaml are changed, we could use a GitHub Action to automatically re-deploy.
We moved the inference system from ACI to AKS after several long and unexplained failures of the former. However, the current documentation for inference system deployment still refers to ACI.
From logs, the inference system is attempting to pull audio from the Bush Point hydrophone from https://s3-us-west-2.amazonaws.com/streaming-orcasound-net/rpi_bush_point but fails.
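For troubleshooting, the playlist URL the system polls can be reconstructed and fetched by hand. The path layout below (`hls/<unix-timestamp>/live.m3u8`, with the timestamp read from the bucket's `latest.txt` marker) is my understanding of the Orcasound bucket structure and should be verified:

```python
def hls_playlist_url(bucket_base, node, stream_unix_ts):
    # e.g. node = "rpi_bush_point"; the layout is an assumption --
    # verify against the actual bucket before relying on it.
    return f"{bucket_base}/{node}/hls/{stream_unix_ts}/live.m3u8"
```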