openmetrics's People

Contributors

andysim3d, baloo, beorn7, brian-brazil, caniszczyk, cw9, hacklschorsch, ide-rea, joewrightss, kevinschweikert, leecalcote, lucperkins, lukmdo, mattbostock, millermatt, mrueg, mtwo, mxinden, pauldix, pcgeek86, plant99, richih, robskillington, sinkingpoint, sphinxknight, sumeer, superq, theneva, tomwilkie, vpranckaitis

openmetrics's Issues

Be explicit about scraping, streaming and push

In #11 we mention scraping but I can imagine that someone might want to use the exposition format for streaming (or pushing) metrics.

We should consider such use cases, their implications and whether the specification should allow for them.

Start timestamp for points in a cumulative timeseries

Continuing the discussion from this week's meeting. This is not restricted to COUNTER, since histograms can also be cumulative (which is another discussion -- there is a difference between "type" (int, histogram etc) and "kind" (gauge, cumulative) of a metric).

In this model a cumulative point is a tuple (t_start, t, v) where t is the time when value v was sampled and t_start is when the timeseries had value 0.
The motivation for the start timestamp is that it completely describes the cumulative point and does not assume anything about when the last point was collected. This guards against various inaccuracies caused by the monitoring backend losing points or the source being monitored restarting (or clearing its metric state to save memory), and allows for pushing metrics to the monitoring backend.

For example, consider a source that generates counter points at timestamps 0min, 10min and 20min with values 5, 10 and 15, then crashes, restarts at 57min, and reports a point with value 17 at 60min. We don't want to incorrectly assume that the counter has increased from 15 to 17 over the interval [20min, 60min]. With a perfect failure detector (which of course doesn't exist) a backend could narrow that interval to [50min, 60min], but that is still less accurate than the real t_start=57min, which is easy for the source to report.
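A minimal sketch of how a backend might use the start timestamp, assuming points are modelled as (t_start, t, v) tuples with times in minutes. The names `CumulativePoint` and `attributed_increase` are hypothetical, and the start time of the first process (minute -5) is an assumption the example above does not state.

```python
from typing import NamedTuple, Tuple


class CumulativePoint(NamedTuple):
    t_start: float  # minute at which the series had value 0
    t: float        # minute at which the value was sampled
    value: float


def attributed_increase(
    prev: CumulativePoint, cur: CumulativePoint
) -> Tuple[float, float, float]:
    """Return (interval_start, interval_end, increase) for the newer point."""
    if cur.t_start == prev.t_start:
        # Same lifetime: the increase happened between the two samples.
        return (prev.t, cur.t, cur.value - prev.value)
    # The series was reset: everything in cur accrued since cur.t_start.
    return (cur.t_start, cur.t, cur.value)


# Assumption: the first process started at minute -5 (not given in the text).
before_crash = CumulativePoint(t_start=-5, t=20, value=15)
after_restart = CumulativePoint(t_start=57, t=60, value=17)

print(attributed_increase(before_crash, after_restart))  # prints (57, 60, 17)
```

With the start timestamp the increase of 17 is attributed to [57min, 60min], rather than an increase of 2 being attributed to [20min, 60min].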

Add considerations about metric evolution?

@sumeer said in https://github.com/RichiH/OpenMetrics/issues/2#issuecomment-316855917

"There is still the issue that different versions of the same library may have incompatible metric types if the developer is not careful, but that should be less common, especially if the standard provides guidelines on how users should evolve their metric (such guidelines are beneficial even in a backend that doesn't enforce the type at data ingestion time, for queries to work sanely across library versions)."

Specify metadata extendability

Elsewhere we're talking about enum and bool metadata, and there'll probably be others.

We should probably specify if additional metadata fields are permitted, what to do if you come across them, what names are reserved etc.

Specify suffixes convention

Presently there is a Prometheus convention that all Counter types should end in _total.
This should probably be a SHOULD, with verbiage around enforcement.

We should also mention that _sum, _count and _bucket are reserved suffixes for their respective types and should not be used elsewhere.
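As a rough illustration of what enforcement could look like, here is a sketch of a name linter. The function name `lint_metric_name`, the type strings and the warning texts are all hypothetical; nothing here is taken from an existing client library.

```python
from typing import List

# Suffixes the text proposes to reserve for histograms and summaries.
RESERVED_SUFFIXES = ("_sum", "_count", "_bucket")


def lint_metric_name(name: str, metric_type: str) -> List[str]:
    """Return a list of warnings for a metric name of the given type."""
    warnings = []
    if metric_type == "counter" and not name.endswith("_total"):
        warnings.append("counter name SHOULD end in _total")
    if metric_type not in ("histogram", "summary"):
        for suffix in RESERVED_SUFFIXES:
            if name.endswith(suffix):
                warnings.append(f"suffix {suffix} is reserved")
    return warnings


print(lint_metric_name("http_requests_total", "counter"))  # prints []
print(lint_metric_name("queue_count", "gauge"))
```

Returning warnings instead of raising matches the SHOULD wording: a violation is reported, not rejected.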

Create a Glossary

Since different platforms and users have various terminology we need a glossary for OpenMetrics that delineates exactly what each term means.

Clarify what metrics are vs. event logging and other systems

Even in our own discussions we're getting confused about what is and is not in scope for a metrics-based transport format. I expect this to continue to be an issue (it still is within the Prometheus ecosystem, even after a few years of being quite clear about it).

I think we should have an opening section covering the idea that metrics are regular snapshots of state, not statsd style samples, profiling or other event loggingy stuff.

Support for geo-time series metrics?

Paul briefly touched on it, but there is a huge amount of interest in geo-time series info, i.e. numeric values with coordinates. We should be forward-thinking and account for this. Probably a first-level data type à la histograms would be fine.

Specify grouping/ordering of timeseries in metrics

Currently Prometheus only specifies that histogram buckets must be increasing.

I think we should further specify that a given metric (i.e. same labels) must be together, as that makes things easier for those who support first-class metrics.
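One way to read the grouping requirement is that all samples of a metric appear contiguously in the exposition. The sketch below checks that property over a sequence of metric names in exposition order; `ungrouped_metrics` is a hypothetical helper, and treating the metric name as the grouping key is an assumption.

```python
from typing import Iterable, List


def ungrouped_metrics(names: Iterable[str]) -> List[str]:
    """Return metric names that reappear after a different metric intervened."""
    seen = set()
    offenders: List[str] = []
    previous = None
    for name in names:
        if name != previous:
            # A new run starts; it is a violation if this name already had a run.
            if name in seen and name not in offenders:
                offenders.append(name)
            seen.add(name)
        previous = name
    return offenders


print(ungrouped_metrics(["a", "a", "b", "a"]))  # prints ['a']
```

A parser that can rely on this property can finish building one metric before starting the next, without buffering the whole exposition.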

New types

Add type support for values?

  • UINT64 seems like a given
  • bool can be handled by longer numbers, Prometheus is good at compressing down
    • This is implementation-specific for Prometheus, so we might want to have the type nonetheless
    • bools can be cast as ENUM if people want to see true/false on the UI
  • Strings? Will be linked to new issue by Sumeer
  • ENUMs: https://github.com/RichiH/OpenMetrics/issues/3

Document that NaN does not mean "missing"

This has been a confusion a few times in Prometheus, as we're one of the very few systems that support non-real floating point values. NaN should only be used where it makes mathematical sense.

Specify how to handle missing time series in metrics

We know Summaries will permit missing quantiles.

We should also specify what happens if a Summary is missing _count/_sum or if a Histogram is missing _sum as that can happen with some other monitoring systems. I believe we should not expose them. It should also be made clear that NaN should only be used where it makes mathematical sense, not as a general "missing" signal.
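A small sketch of the "do not expose them" option, assuming an exporter that represents an unknown _sum or _count as `None`. The function `render_summary` and its parameters are hypothetical, and label escaping is left out for brevity.

```python
from typing import Dict, List, Optional


def render_summary(
    name: str,
    quantiles: Dict[str, float],
    total: Optional[float] = None,
    count: Optional[int] = None,
) -> List[str]:
    """Render a Summary, omitting series whose value is unknown."""
    lines = [f'{name}{{quantile="{q}"}} {v}' for q, v in quantiles.items()]
    # Missing data is expressed by leaving the line out, never by writing NaN.
    if total is not None:
        lines.append(f"{name}_sum {total}")
    if count is not None:
        lines.append(f"{name}_count {count}")
    return lines


print(render_summary("rpc_seconds", {"0.5": 0.2}))
# prints ['rpc_seconds{quantile="0.5"} 0.2']
```

Keeping "unknown" (`None`) separate from NaN in the data model means NaN remains available for results that are mathematically undefined.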

Namespacing

Do we need namespacing? If yes, what form should it take?

  • Schemes
    • Java scheme - works, but long
    • "Play nice" - Won't work
    • Current Prometheus style, i.e. prefix your library name (grpc_, snmp_, etc)
  • Possible formats (Java as example)
    • metric{__namespace__="io.prometheus.foo.bar..."} 1
    • io.prometheus.foo.bar.metric{} 1

Exposition format: charset in content-type

text/plain; version=0.0.4 should probably be extended by the charset: text/plain; version=0.0.4; charset=utf-8. Also, we should clarify that the charset is always UTF-8, even if the text format is used as a fallback for an incomplete or unknown content-type.

prometheus/docs#557
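A consumer-side sketch of the proposed rule, assuming the charset defaults to UTF-8 whenever the header does not carry one. The helper names `parse_content_type` and `charset_of` are hypothetical, and the parsing is deliberately simplistic (it does not handle quoted parameter values containing semicolons).

```python
from typing import Dict, Tuple


def parse_content_type(header: str) -> Tuple[str, Dict[str, str]]:
    """Split a Content-Type header into media type and parameters."""
    media_type, *params = [part.strip() for part in header.split(";")]
    parsed: Dict[str, str] = {}
    for param in params:
        if "=" in param:
            key, value = param.split("=", 1)
            parsed[key.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), parsed


def charset_of(header: str) -> str:
    """Return the declared charset, falling back to UTF-8 as proposed above."""
    _, params = parse_content_type(header)
    return params.get("charset", "utf-8")


print(charset_of("text/plain; version=0.0.4"))  # prints utf-8
```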

Allow leading/trailing whitespace in label values?

It's arguably never a good idea to do this. On the other hand, it can have implications for #14 and possibly others.

To be precise and reproducible, I am somewhat leaning towards "SHOULD NOT expose leading/trailing whitespace" and "ingestor MUST NOT drop leading/trailing whitespace". But having this strong wording leads to "what about storage/UI/etc. which cannot properly handle leading/trailing whitespace".

It might be best to prohibit leading/trailing whitespace, along with the implications for #14.

Unify escaping

Currently there are slightly different escaping rules for label values and help strings. It's not unlikely that implementors will get this wrong, so we should unify on just one approach.

prometheus/docs#550
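For illustration, one candidate for a unified rule is to escape backslash, double quote and newline everywhere. This is only a sketch of that option; the issue does not say which rule should be adopted, and the function names `escape` and `unescape` are hypothetical.

```python
def escape(text: str) -> str:
    """Escape backslash, double quote and newline, in that order."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def unescape(escaped: str) -> str:
    """Invert escape(); unknown escape sequences are kept verbatim."""
    mapping = {"\\": "\\", '"': '"', "n": "\n"}
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\" and i + 1 < len(escaped):
            nxt = escaped[i + 1]
            out.append(mapping.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = 'path "C:\\tmp"\nnext line'
assert unescape(escape(sample)) == sample
```

With a single pair of functions, the same code path serves label values and help strings, so the two cannot drift apart.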

Extend CONTRIBUTORS.md

If you're reading this, there's a non-zero chance that you should add yourself so people know who's who.

Specify inconsistent labels

Within a given metric, different children might have different labels. While this is wrong for direct instrumentation, it can be valid once target labels are involved. Thus this must be allowed for in the format, but some words of warning would be good.

Specify label ordering

Currently there's no specification around ordering of labels for exposition. However, in practice client libraries have a consistent ordering due to unit tests.

Optimisations in Prometheus 2.0 require a consistent ordering, so we should encourage that to be the case.
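A sketch of one way to get a consistent ordering: sort labels by name before rendering. Lexicographic order is an assumption made here for illustration, since the text only asks for consistency. `format_sample` is a hypothetical helper, and label value escaping is omitted.

```python
from typing import Dict, Union


def format_sample(name: str, labels: Dict[str, str], value: Union[int, float]) -> str:
    """Render one sample with labels in sorted order."""
    if not labels:
        return f"{name} {value}"
    body = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{body}}} {value}"


print(format_sample("http_requests_total", {"method": "get", "code": "200"}, 3))
# prints http_requests_total{code="200",method="get"} 3
```

Because the output no longer depends on dictionary insertion order, two scrapes of the same series always yield the same label sequence.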

ENUMs

Do we want ENUMs, mapped against the numeric value of the time series?

If yes, we need to make this globally unique to avoid clashes (might be solved by namespacing as per https://github.com/RichiH/OpenMetrics/issues/2)

  • metric_enum{enum="1:foo,2:bar,3:baz"} 2
  • metric_enum{enum="foo"} 1, metric_enum{enum="bar"} 0 ….
    • Have a "# ENUM enum" metadata to hint this to storage
  • metric_enum{enum_1="foo",enum_2="bar",enum_3="baz"} 1
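To make the second option concrete, the sketch below emits one series per state, with 1 for the current state and 0 for all others. The helper `enum_samples` is hypothetical, and the issue does not settle on any of the three encodings.

```python
from typing import List, Sequence


def enum_samples(name: str, label: str, states: Sequence[str], current: str) -> List[str]:
    """Render one series per state: 1 for the current state, 0 otherwise."""
    if current not in states:
        raise ValueError(f"unknown state: {current!r}")
    return [
        f'{name}{{{label}="{state}"}} {1 if state == current else 0}'
        for state in states
    ]


for line in enum_samples("metric_enum", "enum", ["foo", "bar", "baz"], "foo"):
    print(line)
```

This encoding needs one series per possible state, but every value stays numeric and the state names live in ordinary labels.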

Decide how much to specify about units

There is a Prometheus convention that we use base units like seconds and bytes, rather than a mish-mash of units that are signalled via metadata/metric names.

I think we should formalise this in the spec as a recommended practice.

Specify Summary quantile semantics

Prometheus Summary semantics aren't currently formally specified; however, there is by now a de facto standard.

The quantile is over the last 5ish minutes. If there was no data in that time, then a NaN should be reported. This is consistent with a _count of 0, which would produce NaN when you compute _sum / _count.
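The _sum / _count analogy can be sketched as follows. Python raises ZeroDivisionError for 0.0 / 0.0 where IEEE 754 arithmetic yields NaN, so the hypothetical helper `mean` returns NaN explicitly for an empty window.

```python
import math


def mean(total: float, count: int) -> float:
    """Return total / count, or NaN when nothing was observed."""
    if count == 0:
        # No observations: the average is mathematically undefined.
        return math.nan
    return total / count


print(mean(6.0, 3))              # prints 2.0
print(math.isnan(mean(0.0, 0)))  # prints True
```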

Add considerations about cardinality?

This is implementation-specific, but a few numbers might be useful as a frame of reference.

#22 comes to mind; having arbitrary full event text as a label value would be a "metric" in name, but not in practice.
