Comments (23)
Is the local timezone preferred? It is only one but it may not be obvious to a non-local person.
Right now we use Unix time so I'd prefer to stay with UTC. Not specifying time zone implies local time so fully compliant ISO 8601 UTC time without colons would look like `20210812T205209+0000`, `20210812T205209+00`, or `20210812T205209Z`.
I personally don't care that much about strict standard adherence in this case and would prefer something more readable but still in UTC.
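For reference, a minimal Python sketch of how those three colon-free UTC variants could be produced. The fixed instant is just the example from above, and the variable names are illustrative only, not anything orcanode currently uses.

```python
from datetime import datetime, timezone

# Example instant from the comment above, explicitly tagged as UTC.
t = datetime(2021, 8, 12, 20, 52, 9, tzinfo=timezone.utc)

basic = t.strftime("%Y%m%dT%H%M%S")      # date and time without separators
with_offset = basic + t.strftime("%z")   # numeric offset in +hhmm form
short_offset = with_offset[:-2]          # +hh form; only safe for whole-hour offsets
zulu = basic + "Z"                       # "Z" is only valid because t is in UTC

print(with_offset, short_offset, zulu)
```

Note that `%z` only yields an offset when the datetime is timezone-aware; for a naive datetime it expands to an empty string.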
from orcanode.
As a comment to @scottveirs suggestion regarding filename convention and time synchronization: A change in filename convention is usually a small step, from a coding perspective. Synchronizing recording periods gave our coding team some headaches because we also wanted to be sure that all files have a predictable length (those with different length were re-named so that a search algorithm could filter them). However, in our experience the benefits outweigh the costs. a) It is a virtual requirement to x-correlate and localize transient signals. b) any match between a timestamp and a corresponding audio file can be made instantaneously.
from orcanode.
It would be even better, as Paul pointed out on Slack recently, to get rid of the datetime-stamped S3 objects (akin to directories) and just store all data under a nodename with each data filename incorporating a NIST-synchronized timestamp.
We could get HLS segments to match the filename format of the FLAC files, which in the archive-orcasound-net bucket currently look something like:
2020-12-09_23-22-16_rpi_orcasound_lab--2.flac
Or we could align with ONC or OOI filename formats:
OOI: OO-HYVM2--YDH-2017-08-21T00_02_42.437000.mseed
ONC: ICLISTENHF1293_20171226T145827.651Z.wav
from orcanode.
I have this working now based on Paul's suggestion of using "-strftime 1" and modifying stream.sh (for research) to use this filename: "/tmp/$NODE_NAME/hls/$timestamp/%Y-%m-%d_%H-%M-%S.ts".
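As a rough illustration of what downstream tools could do with such names, here is a Python sketch that reads the start time back out of a segment filename. The function name and the example path are hypothetical, and treating the time as UTC is an assumption, since the filename itself carries no time zone.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Same pattern as the segment filename above; ".ts" is matched literally.
SEGMENT_PATTERN = "%Y-%m-%d_%H-%M-%S.ts"


def segment_start(path: str) -> datetime:
    """Return the start time encoded in a segment filename (assumed UTC)."""
    name = PurePosixPath(path).name
    naive = datetime.strptime(name, SEGMENT_PATTERN)
    # Assumption: the node clock runs in UTC; the filename does not say so.
    return naive.replace(tzinfo=timezone.utc)


# Hypothetical path; the node name and directory timestamp are made up.
example = "/tmp/rpi_example/hls/1628801529/2021-08-12_20-52-09.ts"
print(segment_start(example).isoformat())
```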
from orcanode.
I think the more standard format is %Y-%m-%dT%H:%M:%S.ts,
i.e. colons in the time, and T instead of the _ (space is also used but bad for filenames). Also, what about milliseconds?
I agree timezone indication will be good since I am never sure whether it is Greenwich time or local time.
| Format | Example |
|---|---|
| ISO 8601 | 2021-08-11T18:01:50+00:00 |
| UTC | 2021-08-11T18:01:50Z |
@Molkree you want to add your comments on the format?
from orcanode.
I tried %Y-%m-%dTH:%M:%S.ts instead of %Y-%m-%d_%H-%M-%S.ts and I could not get the player to work. Not sure if it's unhappy with the : or the TH (probably the :), but ffmpeg does write the files fine. I can look into milliseconds, but I think the rpi's time is probably only accurate to maybe 10 ms? It uses NTP to sync time.
from orcanode.
ISO 8601 is a good idea, the full thing with timezone is `%Y-%m-%dT%H:%M:%S%z`. The problem is colons won't work on some filesystems (Windows), not sure if that has anything to do with it not working for you @mcshicks. I would propose something like `%Y-%m-%d_%H-%M-%S_%Z` (`2021-08-12_20-52-09_UTC`). It's readable, portable, and easy-ish to translate into ISO 8601.
The timezone could be easier to translate with `%z` (e.g. `2021-08-12_20-52-09+0000`) since you wouldn't have to look up the abbreviation (like PDT in `2021-08-12_20-52-09_PDT`). But there might be some cases where the `+` is a problem, and with negative offsets, it's a bit confusing to have the `-` (`2021-08-12_20-52-09-0700`). It would be nicest to get `2021-08-12_20-52-09Z` for UTC and `+0000` offset notation for other timezones but that doesn't seem to be an option with `strftime`.
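Since a single `strftime` pattern can't switch between `Z` and a numeric offset, a small wrapper could do it. This is only a sketch in Python: `filename_stamp` is a hypothetical helper, not part of orcanode, and it rejects naive datetimes as an assumption about what we'd want.

```python
from datetime import datetime, timedelta, timezone


def filename_stamp(t: datetime) -> str:
    """Format t as YYYY-MM-DD_HH-MM-SS plus 'Z' (UTC) or a +hhmm/-hhmm offset."""
    offset = t.utcoffset()
    if offset is None:
        # A naive datetime would silently produce an ambiguous name.
        raise ValueError("datetime must be timezone-aware")
    base = t.strftime("%Y-%m-%d_%H-%M-%S")
    if offset == timedelta(0):
        return base + "Z"
    return base + t.strftime("%z")


utc_time = datetime(2021, 8, 12, 20, 52, 9, tzinfo=timezone.utc)
pdt_time = utc_time.astimezone(timezone(timedelta(hours=-7)))

print(filename_stamp(utc_time))
print(filename_stamp(pdt_time))
```

The second call shows the case described above where the offset's minus sign runs into the dashes used as separators.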
from orcanode.
The problem is colons won't work on some filesystems (Windows)
Haha, actually you can't even upload such files using actions/upload-artifact#35 in GitHub workflows. I used colons at first but then changed it to this: `%Y-%m-%dT%H-%M-%S-%f`
Haven't thought about timezone, I just used UTC everywhere I believe. If you do add it to the filename I'd also prefer `+0000`. If extra `-` at the end looks confusing, can always add delimiter like `TZ` or something (`2021-08-12_20-52-09TZ-0100`).
from orcanode.
I did not think about the colons. The OOI Archive has them but I guess this causes issues for some users.
The format without dashes and colons `%Y%m%dT%H%M%S%Z` is also supported by ISO 8601. I wonder if that can be run by the player? It may be less human readable but is also machine readable. I am more biased toward using something standard. The fractions are expected to be delimited with dots (or commas) to distinguish 01.05 (1 h 3 min) vs 01:05 (1 h 5 min). If there are no dashes before, maybe then the -/+ timezone will be more obvious. Is the local timezone preferred? It is only one but it may not be obvious to a non-local person.
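On the machine-readable side, here is a quick Python sketch of round-tripping the compact form. It assumes the numeric `%z` offset rather than the `%Z` abbreviation, and the fractional variant with a dot is just one possible layout, not an agreed format.

```python
from datetime import datetime, timezone

COMPACT = "%Y%m%dT%H%M%S%z"          # e.g. 20210812T205209+0000
COMPACT_FRAC = "%Y%m%dT%H%M%S.%f%z"  # same, with a dot-delimited fraction

# Parse a compact stamp back into a timezone-aware datetime.
parsed = datetime.strptime("20210812T205209+0000", COMPACT)

# Parse a stamp that carries milliseconds after the dot.
parsed_frac = datetime.strptime("20210812T205209.651+0000", COMPACT_FRAC)

# Formatting with the same pattern reproduces the original string.
round_trip = parsed.strftime(COMPACT)

print(parsed.isoformat(), parsed_frac.isoformat(), round_trip)
```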
from orcanode.
@tsuize @veirs this is the HLS timestamp issue I was seeking on today's call. I think we should tackle this formatting decision this winter, adjust the `orcanode` code accordingly, and then fix everything that we're going to break, including at least:
- The `orcasite` player code
- The ingestion of live HLS data by `aifororcas-livesystem` (within Azure)
- Scripts and packages that retrieve HLS data for particular time ranges
- Likely the mseed transcoding tools built by @karan2704 and @mcshicks?
from orcanode.
After looking at MBARI's Pacific Sound open data registry a bit, they seem to be using something like this:
2017-06-13T16:00:00
and John Ryan confirms via Slack that this is relying on the convention of scientific timestamps being assumed to be in the UTC time zone.
Personally, I find the ambiguity unnerving enough that I think it's worth resolving with the extra 3 characters `+00`...
So, I'd propose one of the following options:
- `20170613T160000+00`
- `20170613-160000+00`, which I find just barely human-readable enough
- `2017-06-13T16-00-00+00`
- `2017-06-13_16-00-00+00`, which I feel is the most human-readable while avoiding colons
Or just use Modified Julian Date (MJD) for the filenames and utilize existing packages to decode into human-readable formats if/when necessary.
Opinions?
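For the MJD option, the conversion itself is small. A Python sketch using only the standard library is below; MJD counts days since 1858-11-17T00:00:00 UTC, and the function names are illustrative rather than taken from any existing package.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to midnight UTC on 17 November 1858.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def to_mjd(t: datetime) -> float:
    """Convert a timezone-aware datetime to a Modified Julian Date."""
    return (t - MJD_EPOCH) / timedelta(days=1)


def from_mjd(mjd: float) -> datetime:
    """Convert a Modified Julian Date back to a UTC datetime."""
    return MJD_EPOCH + timedelta(days=mjd)


start = datetime(2017, 6, 13, 16, 0, 0, tzinfo=timezone.utc)
mjd = to_mjd(start)
print(mjd, from_mjd(mjd).isoformat())
```

The trade-off is that a name like this is unambiguous about time zone but needs decoding before a human can read it.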
from orcanode.
Also, we should test whether we can ensure `ffmpeg` can write a file with data starting at YYMMDD-HHMMSS precisely (to the nearest 10 or 100 microseconds). Otherwise we may need or want to add precision within the filename, i.e. precision high enough for any future localization efforts (e.g. 10 or 100 microseconds?).
from orcanode.
@ben-hendricks shared on a call today that the BC Hydrophone Network uses a custom driver to generate timestamps from their icListen hydrophones in this format:
ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac
Where `1281` is the instrument ID (serial number?) and the `.000` suffix is precision in seconds.
The archived format for processed calibrated noise level files assumes the user knows the timestamp is in UTC time zone, so ends up as (or close to?):
1281_20190704T085500.wav
from orcanode.
Also, we should test whether we can ensure `ffmpeg` can write a file with data starting at YYMMDD-HHMMSS precisely (to the nearest 10 or 100 microseconds). Otherwise we may need or want to add precision within the filename, i.e. precision high enough for any future localization efforts (e.g. 10 or 100 microseconds?).
Related to this @ben-hendricks also made a good point that -- if possible -- it's ideal to have different nodes start their recordings on the minute (or they use a 5-minute interval) so that file names and time intervals end up being consistent across the network. This allows a direct request for a matching file, rather than a search through ~20k files for the desired matching time period from another location (e.g. for localization).
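To make the "direct request" idea concrete, here is a Python sketch that floors a timestamp to the recording interval and builds the filename another node would be expected to have. The 5-minute interval, the node name, and the filename layout are all placeholders, since no format has been agreed yet.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_to_interval(t: datetime, minutes: int = 5) -> datetime:
    """Round a timezone-aware datetime down to the start of its interval."""
    interval = timedelta(minutes=minutes)
    return t - (t - UNIX_EPOCH) % interval


def expected_filename(node: str, t: datetime, minutes: int = 5) -> str:
    """Hypothetical naming scheme: <node>_<interval start in UTC>Z.flac."""
    start = floor_to_interval(t.astimezone(timezone.utc), minutes)
    return f"{node}_{start:%Y%m%dT%H%M%S}Z.flac"


# Time of a detection at one node; which file should another node have?
detection = datetime(2019, 7, 4, 8, 57, 31, tzinfo=timezone.utc)
print(expected_filename("node_a", detection))
```

With synchronized starts the lookup is a string construction; without them it becomes a scan over every file's start time.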
from orcanode.
Great advice @ben-hendricks . Thanks for sharing insights from the BC Hydrophone Network!
I've created two `orcanode` issues based on your input:
from orcanode.
@ben-hendricks shared on a call today that the BC Hydrophone Network uses a custom driver to generate timestamps from their icListen hydrophones in this format:
ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac
Where `1281` is the instrument ID (serial number?) and the `.000` suffix is precision in seconds.
The archived format for processed calibrated noise level files assumes the user knows the timestamp is in UTC time zone, so ends up as (or close to?):
1281_20190704T085500.wav
These details ^^^ from Ben may be of interest @valentina-s @savageGrant @CaseCal @mitchhaldeman
from orcanode.
@ben-hendricks Can you confirm/deny that the `.000` part of the ICLISTEN file name is precision in seconds (rather than an indication of zero hours offset from UTC (Z) time)?
from orcanode.
Thanks @scottveirs and @ben-hendricks, this is helpful and timely as we're just developing our file naming and access tool.
I notice in that example that the .flac file contains a start and end time, while the wav file has just a start time. Is there any standard or preference to including only start time, start time and end time, or start time and duration? Especially as we gear towards efficient storage in our own project, we may not have conveniently sized archive file durations.
My thought is that having start time and end time makes it the easiest to scan files for a specific timestamp or period, but it also starts to become somewhat verbose.
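As a sketch of what that scan could look like in Python when both times are in the name: the regex below is written against the single BC Hydrophone Network example in this thread and is an assumption, not a documented specification of their format.

```python
import re
from datetime import datetime, timezone

# Two UTC stamps with millisecond precision, e.g. _20190704T085500.000Z_..._
SPAN_RE = re.compile(r"_(\d{8}T\d{6}\.\d{3})Z_(\d{8}T\d{6}\.\d{3})Z\.")


def parse_span(filename: str) -> tuple[datetime, datetime]:
    """Extract (start, end) as UTC datetimes from a start/end filename."""
    match = SPAN_RE.search(filename)
    if match is None:
        raise ValueError(f"no start/end timestamps in {filename!r}")
    start, end = (
        datetime.strptime(s, "%Y%m%dT%H%M%S.%f").replace(tzinfo=timezone.utc)
        for s in match.groups()
    )
    return start, end


def covers(filename: str, t: datetime) -> bool:
    """True if t falls inside the file's half-open [start, end) interval."""
    start, end = parse_span(filename)
    return start <= t < end


name = "ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac"
print(covers(name, datetime(2019, 7, 4, 8, 57, 31, tzinfo=timezone.utc)))
```

With only a start time in the name, the same check needs the file duration from somewhere else, such as a fixed convention or the audio header.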
from orcanode.
@ben-hendricks Can you confirm/deny that the `.000` part of the ICLISTEN file name is precision in seconds (rather than an indication of zero hours offset from UTC (Z) time)?
Thanks to facilitation by @ben-hendricks , Tom Dakin confirms via email:
Yes the .000 are milliseconds.
from orcanode.
Noting that MANTA (Matlab-based noise analysis software) says this about datetime formats:
The preferred time/date format in the filename is yyyymmdd_HHMMSS (HHMMSS.FFF is also acceptable).
The date/time information can be located at any position within the filename. To aid users in renaming their acoustic data files to be compatible with MANTA software, a file renaming tool (Sox-o-matic) is available from The Cornell Lab of Ornithology Center for Conservation Bioacoustics:
Sox-o-matic Wiki: https://bitbucket.org/CLO-BRP/sox-o-matic/wiki/Home
Sox-o-matic Software download: https://www.birds.cornell.edu/ccb/sox-o-matic/
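To check whether a given naming scheme would fit that description, here is a rough Python sketch that looks for a yyyymmdd_HHMMSS (optionally .FFF) stamp anywhere in a filename. This is not MANTA's own parser; the regex and function name are assumptions made for illustration, and the result is left naive because the format states no time zone.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# yyyymmdd_HHMMSS with an optional .FFF millisecond part, not embedded in longer digit runs.
STAMP_RE = re.compile(r"(?<!\d)(\d{8}_\d{6})(?:\.(\d{3}))?(?!\d)")


def find_stamp(filename: str) -> Optional[datetime]:
    """Return the first yyyymmdd_HHMMSS[.FFF] stamp in filename, or None."""
    match = STAMP_RE.search(filename)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    if match.group(2) is not None:
        stamp += timedelta(milliseconds=int(match.group(2)))
    return stamp


print(find_stamp("rpi_orcasound_lab_20210812_205209.flac"))
print(find_stamp("20210812_205209.123_node.wav"))
print(find_stamp("2020-12-09_23-22-16_rpi_orcasound_lab--2.flac"))
```

The last call uses the current archive naming from earlier in the thread, which this pattern does not match.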
from orcanode.
Comparing readability of these two options, for fun:
- `20190704T085500.000Z` (BCHN format)
- `20190704_092314.000Z` (Proposed Orcasound format)
And noting that OOI added a lot of precision beyond MBARI, but neither added a `Z` or `+00`...
- `2017-06-13T16:00:00` (MBARI format, relying on convention of scientific timestamps defaulting to UTC time zone)
- `2021-08-04T00:20:00.000015` (OOI)
from orcanode.