
Comments (27)

tobiashenkel avatar tobiashenkel commented on June 11, 2024 1

What about something like this?

mappings:
- match: test.timing.*.*.*
  match_metric_type: counter|gauge|timer
  name: "my_timer"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"

from statsd_exporter.

gaizeror avatar gaizeror commented on June 11, 2024 1

I don't know if it's worth creating a new issue, so I'll try here first.
We are using the Datadog statsd Python client, and we can't handle conflicts well today.
I am writing this comment because we found out some metrics weren't being sent to statsd_exporter for a week.

  1. There are no logs AFAIK when a conflict occurs. It would be great to have an option to enable these logs, so we can fail early.
  2. There is no way to "ignore" conflicts and duplicate the metric when labels are different.
  3. There is no way to unit test it.

Any suggestion on how to handle such cases?

from statsd_exporter.

amahomet avatar amahomet commented on June 11, 2024

guys, any news here?

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

What behavior would you expect here? We can't register both conflicting metrics and export them. Just logging the error and continuing would result in silently ignoring all conflicting metrics.

from statsd_exporter.

acdha avatar acdha commented on June 11, 2024

I hit this problem as a report that “Prometheus is down” when some developers were rolling out a new build with different help text. Exiting just meant that it was in a loop with Upstart restarting the service so I'm not sure that was substantially better than an ERROR log message.

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

@acdha So you would prefer writing out log lines and ignoring all metrics with the new signature until someone manually restarts the statsd_exporter in that case?

from statsd_exporter.

acdha avatar acdha commented on June 11, 2024

I guess it's a judgement call whether you think it's better to be disruptive, forcing people to actually notice the problem, or to allow other applications to continue sending stats without interruption.

from statsd_exporter.

acdha avatar acdha commented on June 11, 2024

In my case running an instance shared by several teams, it would have been preferable if only the project which changed their stats experienced a gap but there is an argument that it'd also be acceptable to simply say “monitor process flapping better”.

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

I guess I wouldn't share a statsd exporter between teams for such reasons. We generally tend to prefer failing hard and early, as everything else usually makes debugging very difficult. That's why I'm a bit hesitant to implement a solution which will silently ignore metrics.

In general, I'd recommend using direct instrumentation with our client libraries instead of relying on the statsd_exporter so much.

from statsd_exporter.

acdha avatar acdha commented on June 11, 2024

Yeah, in this case it was a shared instance among developers working on the same project - the person who was working on an update was trying to figure out why he was getting error messages from the statsd client when the exporter terminated prematurely.

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

Given we ignore a lot of metrics already in statsd_exporter, I'd personally be fine accepting such a pull request. Changing the behavior would require changing the signature of the *Container.Get() methods, for example:

func (c *CounterContainer) Get(metricName string, labels prometheus.Labels) prometheus.Counter {
    hash := hashNameAndLabels(metricName, labels)
    counter, ok := c.Elements[hash]
    if !ok {
        counter = prometheus.NewCounter(prometheus.CounterOpts{
            Name:        metricName,
            Help:        defaultHelp,
            ConstLabels: labels,
        })
        c.Elements[hash] = counter
        if err := prometheus.Register(counter); err != nil {
            log.Fatalf(regErrF, metricName, err)
        }
    }
    return counter
}

Instead of logging the error directly, it should be returned to the caller; the caller should then only call log.Errorf instead of log.Fatalf and continue with the next metric.
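
A rough sketch of that change (assuming the same hashNameAndLabels helper, Elements map, defaultHelp and regErrF as above; this is not the actual patch):

func (c *CounterContainer) Get(metricName string, labels prometheus.Labels) (prometheus.Counter, error) {
    hash := hashNameAndLabels(metricName, labels)
    counter, ok := c.Elements[hash]
    if !ok {
        counter = prometheus.NewCounter(prometheus.CounterOpts{
            Name:        metricName,
            Help:        defaultHelp,
            ConstLabels: labels,
        })
        // Only cache the counter once registration succeeds, so a
        // conflicting event doesn't poison the map.
        if err := prometheus.Register(counter); err != nil {
            return nil, err
        }
        c.Elements[hash] = counter
    }
    return counter, nil
}

The caller (sketched here with placeholder names for the surrounding event loop) would then log and skip the offending event instead of exiting:

for _, event := range events {
    counter, err := counters.Get(event.MetricName(), labels)
    if err != nil {
        log.Errorf(regErrF, event.MetricName(), err)
        continue // skip this event, keep processing the rest
    }
    counter.Add(event.Value())
}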

from statsd_exporter.

SuperQ avatar SuperQ commented on June 11, 2024

Also, it would be good to have an exporter internal error counter so you can monitor for the problem.
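
A minimal sketch of such a counter, assuming the standard Go client library (the metric name statsd_exporter_events_conflict_total and the helper below are made up for illustration):

package exporter

import (
    "log"

    "github.com/prometheus/client_golang/prometheus"
)

// conflictingEvents counts events dropped because their metric could not be
// registered (e.g. same name with a different label set).
var conflictingEvents = prometheus.NewCounter(prometheus.CounterOpts{
    Name: "statsd_exporter_events_conflict_total",
    Help: "Number of events dropped due to metric registration conflicts.",
})

func init() {
    prometheus.MustRegister(conflictingEvents)
}

// registerOrCount registers a collector, counting (and logging) conflicts
// instead of terminating the exporter.
func registerOrCount(name string, c prometheus.Collector) bool {
    if err := prometheus.Register(c); err != nil {
        conflictingEvents.Inc()
        log.Printf("conflicting registration for %q: %v", name, err)
        return false
    }
    return true
}

An alert on rate(statsd_exporter_events_conflict_total[5m]) > 0 would then surface the problem without taking the exporter down.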

from statsd_exporter.

avplab avatar avplab commented on June 11, 2024

Guys, my question is probably not related to statsd_exporter, but once we have aligned the metric labels so that statsd_exporter no longer crashes, how do we restart Prometheus to recalculate the metrics?

from statsd_exporter.

SuperQ avatar SuperQ commented on June 11, 2024

@avplab Please take your question to our community.

from statsd_exporter.

jacksontj avatar jacksontj commented on June 11, 2024

I just opened a PR to fix this (#72). Although this fixes the immediate issue of "exporter dies" it doesn't solve the longer-term issue of "you have to restart the exporter".

The most common case where this would be an issue is the following: an app exists and is emitting metrics. A new release of the app goes out which adds/removes some tags -- at this point those metrics are "broken" until the exporter is restarted. In this situation the "old" metrics are no longer being emitted, and as such we could remove them (given some TTL).

Because of this I was thinking of adding a feature to basically TTL out metrics that haven't been emitted for a while when a new metric is being emitted. Alternatively there could be some API call to "unregister" a metric -- but that seems fairly clunky (and not very "statsd-esque").

I figured I'd float the idea here first -- as it is related to this larger issue. If one of those (or some other option) is wanted I'll open a separate PR for the feature -- so we can get the fix for the crashing merged quicker. A rough sketch of the TTL idea is below.
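
Here is that sketch (hypothetical types and names, not the merged code): track when each metric was last written and periodically unregister the ones that have gone quiet, so a re-labeled metric can take over the name:

package exporter

import (
    "sync"
    "time"

    "github.com/prometheus/client_golang/prometheus"
)

type trackedMetric struct {
    collector prometheus.Collector
    lastSeen  time.Time
}

type ttlRegistry struct {
    mu      sync.Mutex
    ttl     time.Duration
    metrics map[string]*trackedMetric
}

// touch records that a metric was just updated.
func (r *ttlRegistry) touch(name string, c prometheus.Collector) {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.metrics[name] = &trackedMetric{collector: c, lastSeen: time.Now()}
}

// expire unregisters metrics that have not been updated within the TTL,
// freeing their names for re-registration with a new label set.
func (r *ttlRegistry) expire() {
    r.mu.Lock()
    defer r.mu.Unlock()
    for name, m := range r.metrics {
        if time.Since(m.lastSeen) > r.ttl {
            prometheus.Unregister(m.collector)
            delete(r.metrics, name)
        }
    }
}

expire() could run on a ticker, or lazily whenever a registration conflict is hit.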

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

Thanks a lot @jacksontj. I merged your contribution.

For the discussion of how to handle such conflicting metrics in general, I'd recommend writing to prometheus-developers@.

from statsd_exporter.

jacksontj avatar jacksontj commented on June 11, 2024

Just for linkage (if anyone is interested) here is the thread on the developer list -- https://groups.google.com/forum/#!topic/prometheus-developers/Q2pRR-UlHI0

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024

I closed #74 because it had gone stale, but I am going to reopen this issue to track the underlying reasoning.

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024

x-ref: more discussion in #120

from statsd_exporter.

tobiashenkel avatar tobiashenkel commented on June 11, 2024

The StatsD -> graphite pipeline supports multiple metric types on the same name nicely (thanks to type namespacing within the graphite backend). Further, there are tools out there that send e.g. counters and timers with the same name.

e.g.:

time="2018-02-23T06:21:34Z" level=info msg="lineToEvents:  zuul.geard.packet.GRAB_JOB_UNIQ:0|ms" source="exporter.go:428"
time="2018-02-23T06:21:34Z" level=info msg="lineToEvents:  zuul.geard.packet.GRAB_JOB_UNIQ:1|c" source="exporter.go:428"

So what do you think about the idea of extending the mapper with a type match, so that we can map differently typed metrics with the same name to different Prometheus metric names?
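
To make that concrete, a sketch of such a mapping for the zuul metric above, reusing the match_metric_type field proposed earlier in this thread (the resulting metric names are illustrative):

mappings:
- match: zuul.geard.packet.*
  match_metric_type: counter
  name: "zuul_geard_packets_total"
  labels:
    packet: "$1"
- match: zuul.geard.packet.*
  match_metric_type: timer
  name: "zuul_geard_packet_duration_seconds"
  labels:
    packet: "$1"

The same statsd name would then land in two differently named Prometheus metrics, depending on the type of the incoming event.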

from statsd_exporter.

TimoL avatar TimoL commented on June 11, 2024

@tobiashenkel that would be the ideal solution, I guess.

@matthiasr would you like that solution? Any hints how to implement it?

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024

from statsd_exporter.

grobie avatar grobie commented on June 11, 2024

from statsd_exporter.

tobiashenkel avatar tobiashenkel commented on June 11, 2024

Both would work, one is nearer to the actual metric line, the other more comprehensive in the config file.

I'd be fine with either way.

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024

Ideally this should be three new issues 😂

  1. I agree, if you can figure out a way to log this, please send a PR! Sometimes this isn't so easy because the conflicts only arise at scrape time in the Prometheus client.

  2. When labels are different, there should not be a conflict. Can you (in a new issue) detail how the conflict arises? Are there two metrics conflicting with each other, or one conflicting with a built-in metric (like in prometheus/influxdb_exporter#37)?

from statsd_exporter.

matthiasr avatar matthiasr commented on June 11, 2024
  3. Again, in a new issue, could you detail what exactly you would like to unit test, and ideally how? Since your app and the exporter communicate over the network, I'm not sure what exactly you mean to unit test.

from statsd_exporter.
