Comments (23)

Molkree commented on May 24, 2024

Is the local timezone preferred? It is only one but it may not be obvious to a non-local person.

Right now we use Unix time so I'd prefer to stay with UTC. Not specifying time zone implies local time so fully compliant ISO 8601 UTC time without colons would look like 20210812T205209+0000, 20210812T205209+00 or 20210812T205209Z.
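
For illustration only, here is a minimal Python sketch of how a Unix time could be rendered in the three colon-free UTC spellings listed above. The function name and dictionary keys are made up for this example and are not part of orcanode.

```python
from datetime import datetime, timezone


def iso_basic_utc_variants(unix_seconds: int) -> dict:
    """Return three colon-free ISO 8601 style UTC spellings for a Unix time."""
    # Build a timezone-aware datetime so that %z has an offset to print.
    dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    stem = dt.strftime("%Y%m%dT%H%M%S")
    return {
        "offset_hhmm": stem + dt.strftime("%z"),  # "+0000" for aware UTC
        "offset_hh": stem + "+00",                # shortened offset, added by hand
        "zulu": stem + "Z",                       # "Z" designator for UTC
    }
```

Note that `strftime` has no directive that prints `+00` or `Z`, so in this sketch those two suffixes are appended as literal text.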

I personally don't care that much about strict standard adherence in this case and would prefer something more readable but still in UTC.

from orcanode.

ben-hendricks commented on May 24, 2024

As a comment on @scottveirs's suggestion regarding filename convention and time synchronization: a change in filename convention is usually a small step from a coding perspective. Synchronizing recording periods gave our coding team some headaches because we also wanted to be sure that all files have a predictable length (those with a different length were renamed so that a search algorithm could filter them). However, in our experience the benefits outweigh the costs: a) it is a virtual requirement for cross-correlating and localizing transient signals; b) any match between a timestamp and the corresponding audio file can be made instantaneously.

scottveirs commented on May 24, 2024

It would be even better, as Paul pointed out on Slack recently, to get rid of the datetime-stamped S3 objects (akin to directories) and just store all data under a nodename with each data filename incorporating a NIST-synchronized timestamp.

We could get HLS segments to match the filename format of the FLAC files, which in the archive-orcasound-net bucket currently look something like:
2020-12-09_23-22-16_rpi_orcasound_lab--2.flac

Or we could align with ONC or OOI filename formats:

OOI: OO-HYVM2--YDH-2017-08-21T00_02_42.437000.mseed
ONC: ICLISTENHF1293_20171226T145827.651Z.wav
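
To compare the three conventions above, here is a rough Python sketch that pulls the timestamp out of each example filename. The function names and regular expressions are my own assumptions based only on these three examples, not on any published spec. Only the ONC name carries a Z, so only that result is marked as UTC; the other two are returned as naive datetimes.

```python
import re
from datetime import datetime, timezone


def parse_orcasound_flac(name: str) -> datetime:
    """e.g. 2020-12-09_23-22-16_rpi_orcasound_lab--2.flac (no zone in the name)."""
    return datetime.strptime(name[:19], "%Y-%m-%d_%H-%M-%S")


def parse_ooi(name: str) -> datetime:
    """e.g. OO-HYVM2--YDH-2017-08-21T00_02_42.437000.mseed (no zone in the name)."""
    m = re.search(r"\d{4}-\d{2}-\d{2}T\d{2}_\d{2}_\d{2}\.\d{6}", name)
    if m is None:
        raise ValueError(f"no OOI-style timestamp in {name!r}")
    return datetime.strptime(m.group(0), "%Y-%m-%dT%H_%M_%S.%f")


def parse_onc(name: str) -> datetime:
    """e.g. ICLISTENHF1293_20171226T145827.651Z.wav (Z means UTC)."""
    m = re.search(r"(\d{8}T\d{6}\.\d{3})Z", name)
    if m is None:
        raise ValueError(f"no ONC-style timestamp in {name!r}")
    dt = datetime.strptime(m.group(1), "%Y%m%dT%H%M%S.%f")
    return dt.replace(tzinfo=timezone.utc)
```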

mcshicks commented on May 24, 2024

I have this working now based on Paul's suggestion of using "-strftime 1" and modifying stream.sh (for research) to use this filename: "/tmp/$NODE_NAME/hls/$timestamp/%Y-%m-%d_%H-%M-%S.ts".

valentina-s commented on May 24, 2024

I think the more standard format is %Y-%m-%dTH:%M:%S.ts
i.e. colons for the hours, and T instead of the _ (a space is also used, but that is bad for filenames). Also, what about milliseconds?
I agree a timezone indication would be good since I am never sure whether it is Greenwich time or local time.

ISO-8601 2021-08-11T18:01:50+00:00
UTC 2021-08-11T18:01:50Z

@Molkree you want to add your comments on the format?

mcshicks commented on May 24, 2024

I tried %Y-%m-%dTH:%M:%S.ts instead of %Y-%m-%d_%H-%M-%S.ts and I could not get the player to work. Not sure if it's unhappy with the : or the TH (probably the :), but ffmpeg does write the files fine. I can look into milliseconds, but I think the rpi's time is probably only accurate to maybe 10 ms? It uses NTP to sync time.

paulcretu commented on May 24, 2024

ISO 8601 is a good idea, the full thing with timezone is %Y-%m-%dT%H:%M:%S%z. The problem is colons won't work on some filesystems (Windows), not sure if that has anything to do with it not working for you @mcshicks. I would propose something like %Y-%m-%d_%H-%M-%S_%Z (2021-08-12_20-52-09_UTC). It's readable, portable, and easy-ish to translate into ISO 8601.

The timezone could be easier to translate with %z (e.g. 2021-08-12_20-52-09+0000) since you wouldn't have to look up the abbreviation (like PDT in 2021-08-12_20-52-09_PDT). But there might be some cases where the + is a problem, and with negative offsets, it's a bit confusing to have the - (2021-08-12_20-52-09-0700). It would be nicest to get 2021-08-12_20-52-09Z for UTC and +0000 offset notation for other timezones but that doesn't seem to be an option with strftime.
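
Since a single strftime pattern can't produce the "Z for UTC, numeric offset otherwise" behavior, one way to get it is to pick the suffix in code. This is a small Python sketch under that assumption. `filename_stamp` is a hypothetical helper, not something that exists in orcanode or ffmpeg.

```python
from datetime import datetime, timedelta, timezone


def filename_stamp(dt: datetime) -> str:
    """Format dt as YYYY-MM-DD_HH-MM-SS plus 'Z' (UTC) or a +hhmm/-hhmm offset."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        # Refuse naive datetimes: the whole point is to make the zone explicit.
        raise ValueError("datetime must be timezone-aware")
    stem = dt.strftime("%Y-%m-%d_%H-%M-%S")
    if dt.utcoffset() == timedelta(0):
        return stem + "Z"
    return stem + dt.strftime("%z")  # e.g. "-0700"
```

The negative-offset case still ends up with the confusing trailing `-` (2021-08-12_13-52-09-0700), so this only addresses the UTC case.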

Molkree commented on May 24, 2024

The problem is colons won't work on some filesystems (Windows)

Haha, actually you can't even upload such files using actions/upload-artifact#35 in GitHub workflows. I used colons at first but then changed it to this %Y-%m-%dT%H-%M-%S-%f

Haven't thought about timezone, I just used UTC everywhere I believe. If you do add it to the filename I'd also prefer +0000. If extra - at the end looks confusing, can always add delimiter like TZ or something (2021-08-12_20-52-09TZ-0100).

valentina-s commented on May 24, 2024

I did not think about the colons. The OOI Archive has them but I guess this causes issues for some users.
The format without dashes and colons %Y%m%dT%H%M%S%Z is also supported by ISO 8601. I wonder if that can be run by the player? It may be less human readable but is also machine readable. I am more biased toward using something standard. The fractions are expected to be delimited with dots (or commas) to distinguish 01.05 (1 h 3 min) vs 01:05 (1h 5min). If there are no dashes before, maybe then the -/+ timezone will be more obvious. Is the local timezone preferred? It is only one but it may not be obvious to a non-local person.

scottveirs commented on May 24, 2024

@tsuize @veirs this is the HLS timestamp issue I was seeking on today's call. I think we should tackle this formatting decision this winter, adjust the orcanode code accordingly, and then fix everything that we're going to break, including at least:

  • The orcasite player code
  • The ingestion of live HLS data by aifororcas-livesystem (within Azure)
  • Scripts and packages that retrieve HLS data for particular time ranges
  • Likely the mseed transcoding tools built by @karan2704 and @mcshicks?

scottveirs commented on May 24, 2024

After looking at MBARI's Pacific Sound open data registry a bit, they seem to be using something like this:

2017-06-13T16:00:00

and John Ryan confirms via Slack that this is relying on the convention of scientific timestamps being assumed to be in the UTC time zone.

Personally, I find the ambiguity unnerving enough that I think it's worth resolving with the extra 3 characters +00...

So, I'd propose one of the following options:

  1. 20170613T160000+00
  2. 20170613-160000+00 which I find just barely human-readable enough
  3. 2017-06-13T16-00-00+00
  4. 2017-06-13_16-00-00+00 which I feel is the most human-readable while avoiding colons :

Or just use Modified Julian Date (MJD) for the filenames and utilize existing packages to decode into human-readable formats if/when necessary.

Opinions?
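
To make the comparison concrete, here is a Python sketch that renders one UTC datetime in each of the four proposed forms and also as a Modified Julian Date. The function names are invented for this example. The `+00` suffix is appended as literal text because strftime cannot print a two-digit offset, and the MJD is computed from its epoch of 1858-11-17 00:00 UTC.

```python
from datetime import datetime, timedelta, timezone

# MJD counts days from 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def proposed_options(dt: datetime) -> list:
    """Render an aware datetime in the four proposed filename forms (UTC)."""
    dt = dt.astimezone(timezone.utc)
    patterns = [
        "%Y%m%dT%H%M%S",        # option 1
        "%Y%m%d-%H%M%S",        # option 2
        "%Y-%m-%dT%H-%M-%S",    # option 3
        "%Y-%m-%d_%H-%M-%S",    # option 4
    ]
    return [dt.strftime(p) + "+00" for p in patterns]


def modified_julian_date(dt: datetime) -> float:
    """Return the MJD (fractional days) for an aware datetime."""
    return (dt - MJD_EPOCH) / timedelta(days=1)
```

For 2017-06-13 16:00:00 UTC the MJD comes out near 57917.67, which shows the trade-off: compact and sortable, but not readable without a decoder.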

scottveirs commented on May 24, 2024

Also, we should test whether we can ensure ffmpeg can write a file with data starting at YYMMDD-HHMMSS precisely (to the nearest 10 or 100 microseconds). Otherwise we may need or want to add precision within the filename, i.e. precision high enough for any future localization efforts (e.g. 10 or 100 microseconds?).

scottveirs commented on May 24, 2024

@ben-hendricks shared on a call today that the BC Hydrophone Network uses a custom driver to generate timestamps from their icListen hydrophones in this format:

ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac

Where 1281 is the instrument ID (serial number?) and the .000 suffix is precision in seconds.

The archived format for processed calibrated noise level files assumes the user knows the timestamp is in UTC time zone, so ends up as (or close to?):

1281_20190704T085500.wav

scottveirs commented on May 24, 2024

Also, we should test whether we can ensure ffmpeg can write a file with data starting at YYMMDD-HHMMSS precisely (to the nearest 10 or 100 microseconds). Otherwise we may need or want to add precision within the filename, i.e. precision high enough for any future localization efforts (e.g. 10 or 100 microseconds?).

Related to this @ben-hendricks also made a good point that -- if possible -- it's ideal to have different nodes start their recordings on the minute (or they use a 5-minute interval) so that file names and time intervals end up being consistent across the network. This allows a direct request for a matching file, rather than a search through ~20k files for the desired matching time period from another location (e.g. for localization).
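
A sketch of what that "direct request" could look like in Python, assuming every node starts files on shared 5-minute boundaries. The helper names and the node_timestamp.ext layout are hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime, timezone


def segment_start(dt: datetime, interval_s: int = 300) -> datetime:
    """Floor an aware datetime to the start of its fixed-length recording interval."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    epoch = int(dt.timestamp())
    # Unix time starts at midnight UTC, so multiples of 300 s land on :00, :05, ...
    return datetime.fromtimestamp(epoch - epoch % interval_s, tz=timezone.utc)


def segment_name(node: str, dt: datetime, interval_s: int = 300, ext: str = "flac") -> str:
    """Build the (hypothetical) filename that would contain time dt on a given node."""
    start = segment_start(dt, interval_s)
    return f"{node}_{start.strftime('%Y%m%dT%H%M%SZ')}.{ext}"
```

With aligned starts, the same event time maps to the same timestamp string on every node, so the matching file at another location can be computed instead of searched for.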

scottveirs commented on May 24, 2024

Great advice @ben-hendricks . Thanks for sharing insights from the BC Hydrophone Network!

I've created two orcanode issues based on your input:

scottveirs commented on May 24, 2024

@ben-hendricks shared on a call today that the BC Hydrophone Network uses a custom driver to generate timestamps from their icListen hydrophones in this format:

ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac

Where 1281 is the instrument ID (serial number?) and the .000 suffix is precision in seconds.

The archived format for processed calibrated noise level files assumes the user knows the timestamp is in UTC time zone, so ends up as (or close to?):

1281_20190704T085500.wav

These details ^^^ from Ben may be of interest @valentina-s @savageGrant @CaseCal @mitchhaldeman

scottveirs commented on May 24, 2024

@ben-hendricks Can you confirm/deny that the .000 part of the ICLISTEN file name is precision in seconds (rather than an indication of zero hours offset from UTC (Z) time)?

CaseCal commented on May 24, 2024

Thanks @scottveirs and @ben-hendricks, this is helpful and timely as we're just developing our file naming and access tool.

I notice in that example that the .flac file contains a start and end time, while the wav file has just a start time. Is there any standard or preference to including only start time, start time and end time, or start time and duration? Especially as we gear towards efficient storage in our own project, we may not have conveniently sized archive file durations.

My thought is that having start time and end time makes it the easiest to scan files for a specific timestamp or period, but it also starts to become somewhat verbose.
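
As a rough illustration of the start-plus-end approach, this Python sketch reads both timestamps from a name shaped like the .flac example and checks whether a given instant falls inside the file. The regex and function names are assumptions made for this example, and it treats the .000 part as fractional seconds and Z as UTC.

```python
import re
from datetime import datetime, timezone

# Matches "_<start>Z_<end>Z." as in ICLISTENHF1281_20190704T085500.000Z_20190704T090000.000Z.flac
_INTERVAL = re.compile(r"_(\d{8}T\d{6}\.\d{3})Z_(\d{8}T\d{6}\.\d{3})Z\.")


def file_interval(name: str) -> tuple:
    """Return (start, end) as aware UTC datetimes parsed from the filename."""
    m = _INTERVAL.search(name)
    if m is None:
        raise ValueError(f"no start/end timestamps in {name!r}")
    start, end = (
        datetime.strptime(s, "%Y%m%dT%H%M%S.%f").replace(tzinfo=timezone.utc)
        for s in m.groups()
    )
    return start, end


def covers(name: str, t: datetime) -> bool:
    """True if instant t lies in the half-open interval [start, end) of the file."""
    start, end = file_interval(name)
    return start <= t < end
```

With only a start time in the name, the same check would need the file duration from somewhere else (a fixed convention or the audio header).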

ben-hendricks commented on May 24, 2024

scottveirs commented on May 24, 2024

@ben-hendricks Can you confirm/deny that the .000 part of the ICLISTEN file name is precision in seconds (rather than an indication of zero hours offset from UTC (Z) time)?

Thanks to facilitation by @ben-hendricks , Tom Dakin confirms via email:

Yes the .000 are milliseconds.

scottveirs commented on May 24, 2024

Noting that MANTA (Matlab-based noise analysis software) says this about datetime formats:

The preferred time/date format in the filename is yyyymmdd_HHMMSS (HHMMSS.FFF is also acceptable).

The date/time information can be located at any position within the filename. To aid users in renaming their acoustic data files to be compatible with MANTA software, a file renaming tool (Sox-o-matic) is available from The Cornell Lab of Ornithology Center for Conservation Bioacoustics:

Sox-o-matic Wiki: https://bitbucket.org/CLO-BRP/sox-o-matic/wiki/Home

Sox-o-matic Software download: https://www.birds.cornell.edu/ccb/sox-o-matic/
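
I don't know how MANTA itself locates the timestamp, but as an illustration of "anywhere in the filename", a Python regex search for yyyymmdd_HHMMSS with an optional .FFF could look like this. The pattern and function name are my own guesses, not MANTA code.

```python
import re
from datetime import datetime
from typing import Optional

# 8 digits, underscore, 6 digits, optional ".FFF"; not embedded in a longer digit run.
_STAMP = re.compile(r"(?<!\d)(\d{8}_\d{6})(\.\d{3})?(?!\d)")


def find_manta_stamp(name: str) -> Optional[datetime]:
    """Return the first yyyymmdd_HHMMSS[.FFF] timestamp found in name, or None."""
    m = _STAMP.search(name)
    if m is None:
        return None
    fmt = "%Y%m%d_%H%M%S" + (".%f" if m.group(2) else "")
    # strptime raises ValueError if the digits are not a real date/time.
    return datetime.strptime(m.group(0), fmt)
```

Under this pattern the proposed Orcasound style (20190704_092314.000Z) would be found, while the BCHN style with a T separator would not.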

scottveirs commented on May 24, 2024

Also, we should test whether we can ensure ffmpeg can write a file with data starting at YYMMDD-HHMMSS precisely (to the nearest 10 or 100 microseconds). Otherwise we may need or want to add precision within the filename, i.e. precision high enough for any future localization efforts (e.g. 10 or 100 microseconds?).

See Steve's thoughts in this other orcanode issue for more info about achieving high precision with ffmpeg...

scottveirs commented on May 24, 2024

Comparing readability of these two options, for fun:

20190704T085500.000Z (BCHN format)
20190704_092314.000Z (Proposed Orcasound format)

And noting that OOI added a lot of precision beyond MBARI, but neither added a Z or +00...

2017-06-13T16:00:00 (MBARI format, relying on convention of scientific timestamps defaulting to UTC time zone)
2021-08-04T00:20:00.000015 (OOI)
