orcasound / orcanode Goto Github PK

Software for live-streaming and recording lossy (AAC) or lossless compressed audio (HLS, DASH, FLAC) via AWS S3 buckets. :star:

License: GNU Affero General Public License v3.0

Shell 25.10% Dockerfile 7.44% C 35.65% Python 31.81%

audio-streaming audio-recorder hls hls-stream hls-live-streaming hls-server dash mpeg-dash aws-s3 python boto mseed

orcanode's Introduction

Orcasound's orcanode code for live-streaming audio data

The orcanode software repository contains audio tools and scripts for capturing, reformatting, transcoding and uploading audio data at each node of a network. Orcanode live-streaming should work on Intel (amd64) or Raspberry Pi (arm32v7) platforms using any soundcard. The most common hardware used by Orcasound is the Pisound HAT on a Raspberry Pi (3B+ or 4) single-board computer.

There is a base set of tools and a couple of specific projects in the node and mseed directories. The node directory is for new locations streaming within the Orcasound listening network (primary nodes).

The mseed directory has code for converting audio data in the mseed format to the live-streaming audio format used by primary nodes. This conversion code is mainly used for audio data collected by the Ocean Observatories Initiative or OOI network. See the README in the mseed directory for more info. Transcoding from other audio formats should likely go in new directories by encoding scheme, similar to the mseed directory...

You can also gain some bioacoustic context for the project in the orcanode wiki.

Background & motivation

This code was developed for live-streaming from source nodes in the Orcasound hydrophone network (WA, USA). Thus, the repository names begin with "orca"! Our primary motivation is to make it easy for community scientists to listen for whales via the Orcasound web app using their favorite device/OS/browser.

We also aspire to use open source software as much as possible. We rely heavily on FFmpeg. One of our long-term goals is to stream lossless FLAC-encoded audio within DASH segments to a player that works optimally on as many listening devices as possible. For now (2018-2023) we have found the best end-to-end performance across the broadest range of web browsers is acheived by streaming AAC-encoded audio within HLS segments.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See the deployment section (below) for notes on how to deploy the project on a live system like live.orcasound.net.

If you want to set up your hardware to host a hydrophone within the Orcasound network, take a look at how to join Orcasound and our prototype built from a Raspberry Pi with the Pisound ADC HAT.

The general scheme is to acquire audio data from a sound card within a Docker container via ALSA or Jack and FFmpeg, and then stream the audio data with minimal latency to cloud-based storage (as of Oct 2021, we use AWS S3 buckets). Errors/etc are logged to LogDNA via a separate Docker container.

Prerequisites

An ARM or X86 device with a sound card (or other audio input devices) connected to the Internet (via wireless network or ethernet cable) that has Docker-compose installed and an AWS account with some S3 buckets set up.

Installing

Create a base docker image for your architecture by running the script in /base/rpi or /base/amd64 as appropriate. You will need to create a .env file as appropriate for your projects. Common to to all projects are the need for AWS keys

AWSACCESSKEYID=YourAWSaccessKey
AWSSECRETACCESSKEY=YourAWSsecretAccessKey
 
SYSLOG_URL=syslog+tls://syslog-a.logdna.com:YourLogDNAPort
SYSLOG_STRUCTURED_DATA='logdna@YourLogDNAnumber key="YourLogDNAKey" tag="docker"

(You can request keys via the #hydrophone-nodes channel in the Orcasound Slack. As of October, 2023, we are continuing to use AWS S3 for storage and LogDNA for live-logging and troubleshooting.)

Here are explanations of some of the .env fields:

NODE_NAME should indicate your device and it's location, ideally in the form device_location (e.g. we call our Raspberry Pi staging device in Seattle rpi_seattle.
NODE_TYPE determines what audio data formats will be generated and transferred to their respective AWS buckets.
AUDIO_HW_ID is the card, device providing the audio data. Note: you can find your sound device by using the command "arecord -l". For Raspberry Pi hardware with pisound just use AUDIO_HW_ID=pisound
CHANNELS indicates the number of audio channels to expect (1 or 2).
FLAC_DURATION is the amount of seconds you want in each archived lossless file.
SEGMENT_DURATION is the amount of seconds you want in each streamed lossy segment.

Supported combinations

NODE ARCHITECTURE	node	mseed
arm32v7	Y	N
amd64	Y	Y

NODE ARCHITECTURE	hls-only	research	dev-virt
arm32v7	Y	Y	N
amd64	Y	N	Y

NODE Hardware	hls-only	research
RPI4	Y	Y
RPI3 B-	Y	N

Running local tests

In the repository directory (where you also put your .env file) first copy the compose file you want to docker-compose.yml. For example if you are raspberry pi and you want to use the prebuilt image then copy docker-compose.rpi-pull.yml to docker-compose. Then run docker-compose up -d. Watch what happens using htop. If you want to verify files are being written to /tmp or /mnt directories, get the name of your streaming service using docker-compose ps (in this case orcanode_streaming_1) and then do docker exec -it orcanode_streaming_1 /bin/bash to get a bash shell within the running container.

Running an end-to-end test

Once you've verified files are making it to your S3 bucket (with public read access), you can test the stream using a browser-based reference player. For example, with Bitmovin HLS/MPEG/DASH player you can use select HLS and then paste the URL for your current S3-based manifest (.m3u8 file) to listen to the stream (and observe buffer levels and bitrate in real-time).

Your URL should look something like this:

https://s3-us-west-2.amazonaws.com/dev-streaming-orcasound-net/rpi_seattle/hls/1526661120/live.m3u8

For end-to-end tests of Orcasound nodes, this schematic describes how sources map to the dev, beta, and live subdomains of orcasound.net --

(Google draw source and archived schematics) -- and you can monitor your development stream via the web-app using this URL structure:

dev.orcasound.net/dynamic/node_name

For example, with node_name = rpi_orcasound_lab the test URL would be dev.orcasound.net/dynamic/rpi_orcasound_lab.

Deployment

If you would like to add a node to the Orcasound hydrophone network, read through our Administrative Handbook and then contact [email protected] if you have any questions.

Built With

FFmpeg - Uses ALSA to acquire audio data, then generates lossy streams and/or lossless archive files
rsync - Transfers files locally from /tmp to /mnt directories
s3fs - Used to transfer audio data from local device to S3 bucket(s)

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests.

Authors

Steve Hicks - Raspberry Pi expert - Steve on Github
Paul Cretu - Lead developer - Paul on Github
Scott Veirs - Project manager - Scott on Github
Val Veirs - Hydrophone expert - Val on Github

See also the list of orcanode contributors who have helped this project and the Orcasound Hacker Hall of Fame who have advanced both Orcasound open source code and the hydrophone network in the habitat of the endangered Southern Resident killer whales.

License

This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE.md file for details

Acknowledgments

Thanks to the backers of the 2017 Kickstarter that funded the development of this open source code.
Thanks to the makers of the Raspberry Pi, the Pisound HAT (Blokas in Lithuania), and the manufacturers who supply us with long-lasting, cost-effective hydrophones.
Thanks to the many friends and backers who helped improve maintain nodes and improve the Orcasound app.

orcanode's People

Contributors

Stargazers

Watchers

Forkers

meressep dfo-mpo bangpradyumna kunakl07 aanyap valentina-s pramitsahoo evanjscallan karan2704 paramsrinivas vedek whitewolf47

orcanode's Issues

Support hydrophone nodes with intermittent network access

This is basically getting the s3_upload branch to work well enough to move it to master. The use case is a node is offline, and the node stores locally the files, and then when reconnected uploads them in the background at a lower priority than streaming the audio

Support pulling from docker hub in addition to building localling for amd64 and arm32v7 nodes

There are several ways to do this, I'm taking the approach of instead of calling docker-compose build and docker-compose up using the -f option to select the correct compose file to have the desired outcome, like docker-compose -f docker-compose-rpi-build.yml and having seperate compose files for eatch option

Cloud-based deployment of OOI hydrophone streaming

2022 GSoC project by @karan2704 developed code to convert audio data in mseed format from the public Ocean Observatories Initiative(OOI) hydrophones into a format suitable to stream from the Orcasound website. This project aims to deploy the streaming on the cloud for continuous listening, test the robustness of the system, and integrate with the Orcasound website.

Test deployment can be run on AWS instance, however, as part of the project it would be helpful to also investigate how to minimize the costs by using other services.

Expected outcomes: Increase access to NSF-funded audio data from hydrophones in killer whale habitat on the outer coast, opening the door to wintertime acoustic detections of endangered orcas.

Required skills: Python, Docker, Cloud Computing

Bonus skills: Audio Processing, javascript, GIthub Actions

Mentors: Valentina, Karan

Difficulty level: Medium

Project Size: 175 or 350 h

Resources:

Orcanode repo
Orcanode wiki
Karan's blog post
Karan's final report

Getting Started:
Run the docker setup locally.
Run the github workflows on your fork.

Points to consider in the proposal:
What cloud tools you can use to continuously pull the data?
How do you optimize the performance? Minimize cost?
How do you set up the cloud infrastructure so that it is scriptable/cloud-agnostic?

Add support for decentralized storage

In yesterday's standup, motivated by expiration of AWS and Azure credits this year, we had a lively discussion of potentially shifting to decentralized storage of Orcasound audio data. This could initially happen for just compressed, lossy bouts for human listening -- aka a playlist of Greatest Hits of Orcasound. But it might some day also be a cost-effective and secure solution for the full streaming archive, currently about 3.5 TB or 250MB/node/year for only HLS data. (Here is a calculator for estimating data storage costs within AWS S3 for Orcasound).

Here are some related follow-up links that @prafulfillment offered in the Orcasound Slack:

https://opendata.stackexchange.com/questions/4080/what-are-some-opendata-torrents-to-seed
difference between IPFS & Torrents. TL;DR — IPFS can share a massive dataset with peers accessing only small parts whereas BitTorrent needs the entire library downloaded to swarm (at least from the comments)
#208 Comparison of IPFS and BitTorrent for Archives
how to from Wikimedia Foundation: https://github.com/ipfs/distributed-wikipedia-mirror

mseed restart time is long

There doesn't seem to be a reliable way to make ffmpeg loop forever, so we need to have a cron job to restart mseed occasionally. The mseed start time is really long, part of this is due to the five minute file size time but it could probably been improved from where it is

Make sampling rate for audio configurable by environment variable for stream.sh

The pisound board needs the sampling rate (-ar in ffmpeg) set to 192000. This will break x86 so we need to make is configurable by environment variable

Broken segments

While working on Orcasound data I've encountered a few issues, all examples are taken from the rpi_bush_point/hls/1624818619:

some segments have silence in them
live1049.ts
some segments are super short, less than one-tenth of a second
The first one is live221.ts, 0.085333s long but there are more later

some segments seem to have broken headers?
live808.ts seems to be broken, can't play it at all.
And here's ffmpeg error I'm getting on it when trying to convert to .wav

ERROR:root:
ERROR:root:ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
[mpeg @ 0x555c91ce6000] Format mpeg detected only with low score of 25, misdetection possible!
[mp2 @ 0x555c91d0c500] Header missing
    Last message repeated 1 times
[mpeg @ 0x555c91ce6000] decoding for stream 0 failed
[mpeg @ 0x555c91ce6000] Could not find codec parameters for stream 0 (Audio: mp2, 0 channels, s16p): unspecified frame size
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0, mpeg, from 'bush_point/1624818619/live808.ts':
  Duration: 00:00:03.57, start: 8081.641067, bitrate: 3 kb/s
    Stream #0:0[0x1c0]: Audio: mp2, 0 channels, s16p
Stream mapping:
  Stream #0:0 -> #0:0 (mp2 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
[mp2 @ 0x555c91d0ca00] Header missing
Error while decoding stream #0:0: Invalid data found when processing input
[mp2 @ 0x555c91d0ca00] Header missing
Error while decoding stream #0:0: Invalid data found when processing input
Finishing stream 0:0 without any data written to it.
[abuffer @ 0x555c91cc4280] Value inf for parameter 'time_base' out of range [0 - 2.14748e+09]
    Last message repeated 3 times
[abuffer @ 0x555c91cc4280] Error setting option time_base to value 1/0.
[graph_0_in_0_0 @ 0x555c91cdeb40] Error applying options to the filter.
Error configuring filter graph
Conversion failed!

test-engine-live error in fsunlink in packager.js after 21 hours streaming to S3

We get an intermittent error in test engine live where it tries to unlink a null reference to a file. Here is the log file:

The packager has processed: /tmp/dash_output_dir/audio_1319.ts.mp4. Last run was 9370ms ago. Average time between runs: 14997.639878695985ms.
packager.js:25
TypeError: path must be a string or Buffer
fs.js:1107
at TypeError (native)
at Object.fs.unlink (fs.js:1107:11)
at /root/test-engine-live-tools/lib/packager.js:30:10
at Array.forEach (native)
at /root/test-engine-live-tools/lib/packager.js:29:96
at ChildProcess. (/root/test-engine-live-tools/lib/packager.js:57:80)
at emitTwo (events.js:106:13)
at ChildProcess.emit (events.js:191:7)
at Process.ChildProcess._handle.onexit (internal/child_process.js:219:12)

This issue actually has happened on both Udoo and Raspberry Pi test nodes, and in two different locations.

Reproduction:

On Raspberry Pi using armv32 branch commit commit 9428511
with appropriate .env file do docker-compose up to start streaming to AWS
After approximately 21 hours you will get this error.

Improve CONTRIBUTE and make consistent with other Orcasound repos

Could give guidance in the contribute doc to join Slack or send email to dev list, but longer-term solution likely is to point to organization-wide guidance, e.g. at Github.com/orcasound.

Playback pauses due to buffer level going to zero

Describe the bug
As a live listening app user monitoring Bush Point hydrophone after troubleshooting a power loss, I experienced pauses or silent gaps in the playback on my Macbook within the Chrome browser.

To Reproduce
Steps to reproduce the behavior:

Go to Bitmovin player aimed at Bush Point data from today
Click on 'Play' (and you may need to unmute the player)
Listen for a few minutes and monitor the buffer level plot
Notice that the audio playback pauses for a few seconds whenever the buffer level decreases to zero.

Expected behavior
Continuous playback (assuming that the audio data is continuous!)

Screenshots
Here's a shot of the buffer level time series:

And here are two shots showing that when I'm experiencing the pauses within the live app (as oppose to the Bitmovin test player), the HLS segments seem to be available in S3 and downloaded with less than 1 min latency by my browser.

Desktop (please complete the following information):

Hardware: MacBook Pro (16-inch, 2019)
OS: OSX v.12.6
Browser Chrome Version 107.0.5304.87 (Official Build) (x86_64)

Additional context
I think this has been going on for a while (6 months off and on?), and maybe mostly for Bush Point -- but why that would be I do not know! It may be related to a lurking, hard-to-reproduce behavior of the the live app UI... possible playback performance differences based on which feed you load first, or perhaps the sequence of feeds you visit, i.e. if you load Orcasound Lab first sometimes, it seems the playback button does not function.

Add /tmp/flac and /tmp/hls to stream.sh

ffmpeg cannot create these directories by default, so we should always make them. They need to go here.

Set up temporary directories and symbolic links

mkdir -p /tmp/dash_segment_input_dir
mkdir -p /tmp/flac
mkdir -p /tmp/hls

Occasional extra (empty) datetime-named folder in the streaming bucket

Since starting to use rsync+s3fs to transfer files written by ffmpeg up to the S3, we occasionally get an extra (empty) datetime-named folder in the streaming bucket. Based on an analysis of ~20 successive folder names in mid-May I think that when the cronjob reboots the Rpi there is a short (~87sec) period when the container is started and the folder is created, but the container restarts before other processes get going, so that folder ends up being empty. The container re-start leads to a second run of the .sh script and therefore a second folder gets created and ffmpeg+rsync somehow get up in time to start filling it without the container restarting.

Interestingly, this doesn't happen every reboot. It's usually ~every other restart. Once I saw three restarts in a row without any intervening empty folders...

Maybe this would go away if we set the Docker container policy to not restart?

Or maybe it will go away when we get resin.io to handle (daily?) container restarts and no longer need to use a cron job on the Rpi...

Start FLAC files on the nearest minute?

Rather than starting data collection as soon as a node has booted up and started its container, maybe the script should wait until the next round minute?

Handle adding .ts files to existing m3u8 stream for delayed OOI Streaming with timestamp prefix

We need to cover the case of append new data to an older m3u8 file. i.e you convert 8 hours of delayed data, and then 8 hours later you do it again. There is a option for "hls_flags" and one flag is "append_list". I have not played with it, but it seems like a good place to start. So in terms of the timestamp it could be it might some like
OO-HYVM2--YDH-2022-06-27T10:00:00.000000_001
OO-HYVM2--YDH-2022-06-27T10:00:00.000000_002
and then 8 hours later
OO-HYVM2--YDH-2022-06-27T18:00:00.000000_001
OO-HYVM2--YDH-2022-06-27T18:00:00.000000_002

Restart docker every hour to improve reliability

We intermittently get some different errors after running more than 24 hours. In order to allow longer term testing we should restart docker once an hour while these are being investigated.

FLAC filenames or metadata should provide start time precision greater than seconds

For both localization of underwater sound and accurate annotation of the audio data, it would be ideal to have more than 1-second precision when indicating the start time of a FLAC file. Can ffmpeg add precision (say up to milliseconds, or ideally microseconds) when generating the filename?

Measure power draw against benchmark

In considering incorporation of Orcasound software in audio data acquisition systems that do not have access to shore power, measurements of the power demand of Orcasound hardware solutions would be valuable. Based on on-going benchmark studies like this one on PID Ramble, Raspberry Pi zeros can draw as little as 0.5W while a Raspberry Pi 4 with all 4 processors at work can draw a bit more than 5W.

LogDNA broken

Not sure why but the logging containers became disconnected from app.logdna.com Orcasound account sometime in late 2019...

Instructions to set up environment to run mseed docker on Lightsail

@karan2704 I believe you had to set up a few things: share some notes on the process.

Add support for hls loopback

It would be nice test hls audio without having to connect to S3 and have orcasound S3 AWS keys. One way to do this is to replace
s3_upload with something like

sleep 20
ffplay -nodisp /tmp/$NODE_NAME/hls/$timestamp/live.m3u8

keyed off a .env variable, perhaps the existing one NODE_LOOPBACK could have some additional value like HLS_LOOPBACK to support switching this on and off. This would be for all nodes types, dev/research/virtual/mseed

s3_upload branch (bug?): Occasionally Unix new datetime fails to upload

This has only happened ~twice during experimental deployment of boto-based Python s3_upload branch, but player fails to play (no response on pressing play button at live.orcasound.net/ ) because Unix datetime stamped directory is created on Raspberry Pi4 but not transferred to S3 streaming bucket. Thus player finds the latest.txt file, but throws and error when trying to access the latest .m3u8 file.

Possible bug may be in the inotify logic and/or rare issues with boto failing?

Intermittent failure of s3fs to mount /mnt directories because they are non-empty

First noticed this today (5/22/18) when Orcasound player failed to load latest HLS stream directory. Though the container restarts with ffmpeg and rsync running, s3fs fails to start as expected, so no files get transferred to S3 bucket.

LogDNA suggests it has happened intermittently since 5/17/22:

May 17 11:31:19 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-archive-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 17 16:30:56 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-archive-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 17 17:30:56 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-archive-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 18 02:30:58 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-streaming-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 18 06:31:47 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-streaming-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 19 09:30:57 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-streaming-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.
May 19 22:31:44 rpi_seattle orcanode_streaming_1 err s3fs: MOUNTPOINT directory /mnt/dev-archive-orcasound-net/ is not empty. if you are sure this is safe, can use the 'nonempty' mount option.```

Not sure why the error wasn't logged this morning. The latest datetime stamped directory in the S3 streaming bucket was written around 9:30. Previous directories formed and were populated successfully at 8:30, 7:30, 6:30... and there doesn't look to be anything suspicious in the m3u8 or .ts files within the previous (8:30) directory.

mseed transcription dies during internet outages

The https request times out and there is no code to handle the exception gracefuly.

Replace figure with Orcasound Software Evolution Model

The figure goes right to left and warps my brain to read it 🤪. I created a version left to right here in .png and here in diagram format if one wants to use the opportunity to update it. I can submit PR to update it but the image is pulled from Orcasound website so best to update it there.

Move towards human-readable timestamps in audio filenames and/or directory names

In the long run, it would be valuable to stream and archive the Orcasound acoustic data with a NIST-synchronized timebase encoded in both the FLAC files and possibly also the HLS/DASH stream manifest and/or segments. If adjacent hydrophones (within earshot of each other) are synchronized with millisecond to microsecond precision, then we will be able to localize sounds with an accuracy that will help us learn more about biology: e.g. direction a soniferous animal is moving, location of a sound source, or identity of a signaler.

To this end, the shell script might be adapted (along with changes to how the player stays current) from its current syntax --

timestamp=$(date +%s)

-- to syntax such as:

timestamp=$(date +\%Y-\%m-\%d)

code source and snippet:

$ rsync -avz --delete --backup --backup-dir="backup_$(date +%Y-%m-%d)" /source/path/ /dest/path
By using $(date +%Y-%m-%d) I’m telling it to use today’s date in the folder name.

What is nature of time stamp for a .ts segment?

Is the creation time for an HLS segment shown in the S3 streaming bucket the time the object was written to S3 or is it the time the segment was initiated by ffmpeg on the node computer?

The answer would be important for anyone trying to use the lossy data for localization (which is a worse choice than the FLAC), or for an accurate assessment of audio data gaps.

Support logical names for soundcards

We see sometimes on reboots the hw ID for sound cards move, i.e. instead of 0,1 it becomes 1,1 and ffmpeg can't open it. We should be able to use logical names like pisound from .env

Orcanode setup on windows

While trying to build the mseed Dockerfile after building the base(amd64) I get the following error which apparently has something to do with obspy and its dependencies:

OS: Windows 10 pro

Log and monitor computer engineering metrics

In ordring more Raspberry Pi and Pisound boards today, and thinking about how to mount Orcasound single-board computers inside off-the-shelf weatherproof enclosures, we thought of a general feature request for the orcanode software: log and possible monitor metrics related to the performance of the hardware and the environmental conditions they experience.

For example, if we put a box in a sunny location, does the ambient temperature cause performance problems as the Raspberry Pi temperature rises? Or do hot (or damp!?) conditions cause components to fail earlier than under cooler (dryer) conditions?

Specifically, it might be valuable to log the following:

Temperature of the Raspberry Pi
Humidity (maybe with a USB humidity sensor?)
SD card performance and conditions (how much free space available after a power outage slows/stops upload bandwidth)
Threading of cores on the Rpi 4?

Occasional ALSA buffer xruns and file fragments in S3 buckets

Noticed a few errors via logDNA while testing arm32v7 branch with lagged m3u8 file. Possible fixes:

To reduce the xruns, add back in to the ffmpeg call this flag: -thread_queue_size 1024

To avoid copying file fragments (from rsync tmp writes, I think) via s3fs to the buckets, add to the relevant rsync calls these flags: --exclude='*.flac.*' and --exclude='*.ts.*'

Log hydrophone changes, map to L/R channels

As a system administrator, bioacoustic scientist, or developer
I want to be able to map the left and/or right channel of an audio data file (streamed HLS/DASH and/or archived FLAC) back to the make and model of the hydrophone(s), and ideally a unique identifier like a serial number.

Possible solutions:

Add .env variables for channel 1 vs 2 source description, ideally in a standard format like Make_Model_SN (e.g. Labcore_40_0023 or CRT_SQ26-08_SN003).
upload the .env file (minus any credentials) along with the latest.txt file OR log any metadata changes and/or field notes in some other mechanism, like a database with API, so it can be associated with the raw audio data and/or utilized dynamically in apps like the Orcasound player (context for hydrophone feed in v3 drawer).
An alternative to (2) above might be to include such variables within a swarm management tool and log any changes to them in a way that can be associated with the raw audio data...

Improve handling gaps in mseed data

In addition to being delayed is appears there can be big gaps in mseed OOI data. From hackathon testing we noticed this pretty large gap in mseed data that put the conversion code in very long wait on startup.

pull_1 | filename: OO-HYVM2--YDH-2021-05-07T14:40:00.000000.mseed file delay: 1 day, 7:46:38.620951
pull_1 | filename: OO-HYVM2--YDH-2021-05-07T14:45:00.000000.mseed file delay: 1 day, 7:41:38.620951
pull_1 | filename: OO-HYVM2--YDH-2021-05-08T00:16:40.205688.mseed file delay: 22:09:58.415263
pull_1 | filename: OO-HYVM2--YDH-2021-05-08T00:20:00.000000.mseed file delay: 22:06:38.620951
pull_1 | filename: OO-HYVM2--YDH-2021-05-08T00:25:00.000000.mseed file delay: 22:01:38.620951

Add support for OOI fixed start time

For testing (manual as well as CI type testing) it would be nice to be able to have a switch (environment variable perhaps) that would provide a fixed time instead of the current time as a starting point. This would allow reproducible testing.

mseed HLS generation has PES packet size mismatch

The mseed transcoding for mpegts results in occasional PES packet size mismatch and downstream errors. This seems to be an issue in that even though we are only using an audio stream there are only certain allowed packet sizes and the fixed file size means there is some partial packet potentially at the of the file. It seems like it might also be a function of the frame rate.

[mpegts @ 0x55c5fb8f4f40] PES packet size mismatch
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Sample rate index in program config element does not match the sample rate index configured by the container.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err decode_pce: Input buffer exhausted before END element found
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Error while decoding stream #0:0: Operation not permitted
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Sample rate index in program config element does not match the sample rate index configured by the container.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err decode_pce: Input buffer exhausted before END element found
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Error while decoding stream #0:0: Operation not permitted
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err size=N/A time=03:49:54.56 bitrate=N/A speed= 1x
[aac @ 0x55c5fb6a9d00] Queue input is backward in time
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557920, current: 1241543520; changing to 1241557921. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557921, current: 1241544960; changing to 1241557922. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557922, current: 1241546400; changing to 1241557923. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557923, current: 1241547840; changing to 1241557924. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557924, current: 1241549280; changing to 1241557925. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557925, current: 1241550720; changing to 1241557926. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557926, current: 1241552160; changing to 1241557927. This may result in incorrect timestamps in the output file.
Apr 29 21:10:36 rpi_steve_test mseed_stream_1 err Non-monotonous DTS in output stream 0:0; previous: 1241557927, current: 1241553600; changing to 1241557928. This may result in incorrect timestamps in the output file.

Feature request: set input channel # to reduce CPU and upload bandwidth (and have Orcasite make stereo from mono)

Streaming/archiving only a single channel could be a good idea, e.g.

for nodes at which stereo signals aren't a priority (single hydrophone; or no localization, only redundancy from dual hydrophones)
for nodes at which one of the two signals fails (e.g. due to Neptune/gremlins)
for nodes that are limited by upload bandwidth
to halve storage costs and space requirements

If the Orcasite player could generate a stereo stream from a mono stream, then it could do so if it

detects a mono stream, or
learns that a stream is mono (e.g via an API call to Resin about the node?).

mseed doesn't parse OOI url correctly out of .env file

Currenty the url is hard coded in the script

Add support to stream.sh to enable/disable flac archive based on environment variable

We want all nodes to stream hls and dash, but for high performance networks we would like to also create and archive flac. We need different ffmpeg and S3 setup options for each

add portaudio support to docker base image

This was requested since portaudio is used for edge computing on rpi and it's hard to add to docker

Allow different architectures to be built from same branch

Rather that having a single branch per architecture, we propose two step build where we create a base image per architecture (amd64, rpi, and arm64), and a second build for the features (hls node, flac node, mseed node). The base images should be available from dockerhub

Number of channels is hard-coded in stream.sh for node type = research

Noticed 2 channels in .mp3 file from Sunset Bay (concatenated .ts segments, transcoded), when there is only one hydrophone deployed and I'd set CHANNELS=1 in the .env file. Looked at current master branch in node/stream.sh and saw that $CHANNELS is used within the ffmpeg call for node type hls-only but for node type research it is still hardcoded as -ac 2 ...

## Streaming HLS with FLAC archive 
	nice -n -10 ffmpeg -f jack -i ffjack \
       -f segment -segment_time "00:00:$FLAC_DURATION.00" -strftime 1 "/tmp/$NODE_NAME/flac/%Y-%m-%d_%H-%M-%S_$NODE_NAME-$SAMPLE_RATE-$CHANNELS.flac" \
       -f segment -segment_list "/tmp/$NODE_NAME/hls/$timestamp/live.m3u8" -segment_list_flags +live -segment_time $SEGMENT_DURATION -segment_format \
       mpegts -ar $STREAM_RATE -ac 2 -acodec aac "/tmp/$NODE_NAME/hls/$timestamp/live%03d.ts

the -ac 2 part should be replaced with -ac $CHANNELS and then some tests should be run to ensure that the streamed audio sounds ok (single channel .ts segments play as stereo) and that we're being efficient with S3 storage (i.e. FLAC file has only one channel).

On Rpi 3B+ sometimes upon re/boot sound card,device doesn't equal 1,0

If .env file has specified hardware device as 1,0 this behavior means there are no data generated even though ffmpeg and s3fs appear to be running (via htop).

Intermittently getting write errors in test-engine-live

Intermittently we get write error in test engine live stream, usually after 24 hours. To reproduce the problem you just stream a long time. It happens both inside and outside the streaming docker container, but doesn't happen writing locally (i.e. no S3 enabled).

Here is a partial log and screenshot of debugger showing where it happens:

segment @ 0x275d4e0] Opening '/tmp/dash_segment_input_dir/audio_7757.ts' for writing
[mpegts @ 0x275cd00] frame size not set
Packaging single segment: audio_7755.ts
Available segments for DASHing: audio_7756.ts,audio_7757.ts
Packaging single segment: audio_7756.ts
Processing segments: /tmp/dash_output_dir/audio_7756.ts.mp4
Creating/updating DASH context. Thu Jan 11 2018 14:30:15 GMT+0000 (UTC)
Available segments for DASHing: audio_7757.ts,audio_7756.ts
[DASH] Generating segments and MPD 116223 seconds too late
DASH-ing file: 15.02s segments 15.00s fragments no sidx used
Splitting segments at GOP boundaries
DASHing file /tmp/dash_output_dir/audio_7756.ts.mp4
[segment @ 0x275d4e0] Opening '/tmp/dash_segment_input_dir/audio_7758.ts' for writing
[mpegts @ 0x275cd00] frame size not set
Packaging single segment: audio_7756.ts
Available segments for DASHing: audio_7757.ts,audio_7758.ts
Packaging single segment: audio_7757.ts
Processing segments: /tmp/dash_output_dir/audio_7757.ts.mp4
Creating/updating DASH context. Thu Jan 11 2018 14:30:29 GMT+0000 (UTC)
Available segments for DASHing: audio_7758.ts,audio_7757.ts
[DASH] Generating segments and MPD 116237 seconds too late
[DASH] Generating MPD at time 2018-01-11T14:30:30.253Z
[DASH] Current Period Duration: 116114speed= 1x
DASH-ing file: 15.02s segments 15.00s fragments no sidx used
Splitting segments at GOP boundaries
Results in: /tmp/dash_output_dir/audio_7756.ts.mp4
The packager has processed: /tmp/dash_output_dir/audio_7756.ts.mp4. Last run was 17897ms ago. Average time between runs: 15000.390020629131ms.
DASHing file /tmp/dash_output_dir/audio_7757.ts.mp4
[DASH] Generating MPD at time 2018-01-11T14:30:36.849Z
[DASH] Current Period Duration: 116114
mv: cannot stat '/tmp/dash_output_dir/live.mpd.tmp': No such file or directory
[DASH] Error moving file /tmp/dash_output_dir/live.mpd.tmp to /tmp/dash_output_dir/live.mpd: I/O Error
Error DASHing file: I/O Errorrate=N/A speed= 1x
Error while processing segments: 1
Waiting for the debugger to disconnect...
[segment @ 0x275d4e0] Opening '/tmp/dash_segment_input_dir/audio_7759.ts' for writing
[mpegts @ 0x275cd00] frame size not set
[segment @ 0x275d4e0] Opening '/tmp/dash_segment_input_dir/audio_7760.ts' for writing

Add mseed streaming support for OOI Network

OOI Network has mseed format hydrophone data that we would like to be able to add as a node for Orcasound

Handle lag/slow data case on OOI network

Even though we read the data with a delay sometimes data can come in later. So if data come in later (say within 24 hours) we would still like to have that data put into the delayed livestream so it can be listened to or possibly analyzed. This would mean keeping track of the gaps for one day say and if new data comes in then transcoding it and inserting it into the m3u8 file, or as a minimum just posting the .ts files and maybe logging. Here are some definitions we can use to cover this case.

not delayed: now - file-timestamp < delay
gap: file-timestamp - (prev-file-timestamp + prev-file-duration) > 1 minute
microgap: 1 minute > file-timestamp - (prev-file-timestamp + prev-file-duration) > 0
lag: file with timestamp in range of a gap some time greater than delay. Maybe 24 hours
recovered: file with timestamp after some time greater than 24 hours

use the mseed branch for this

header docstrings
doctrings to individual functions

Error using emdem/raspi-logspout

When regresssion testing some other changes I got this error

logspout_1 | !! x509: certificate signed by unknown authority

Not sure what the fix is but the workaround is to comment out this container in the docker-compose.yml file