Comments (10)

jkschneider commented on August 18, 2024

The CKMS implementation that seems widespread (literally copied from place to place, including Prometheus and Netflix) updates the quantiles after each batch of 500 samples. So how quickly it registers a change largely depends on how many samples arrive within a particular time window.

But its flat line in this graph just shows that CKMS is more stable and accurate, at the expense of computational complexity. I imagine that if you want to alert on one of these quantiles, you would set the threshold high enough that the perturbation in a successive approximation approach like Frugal is probably irrelevant.

At any rate, feel free to play with the sample from which this graph was generated.

from micrometer.

jkschneider commented on August 18, 2024

Prometheus uses CKMS, but we can evaluate the instrumentation cost of a range of quantile algorithms against this. Netflix also uses Frugal2U for load balancing. Great illustration of the effect over time here.

jkschneider commented on August 18, 2024

Frugal is substantially faster but, as a successive approximation algorithm, it converges faster when given good initial estimates. There is a legitimate question about how to arrive at those initial estimates.

Benchmark                              Mode     Cnt        Score      Error   Units
QuantilesBenchmark.ckmsQuantiles       avgt      30     1233.518 ±  105.064   ns/op
QuantilesBenchmark.frugal2uQuantiles   avgt      30       82.686 ±    2.720   ns/op
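
To make the "successive approximation" point concrete, here is a minimal sketch of Frugal-1U, the simpler one-unit-of-memory sibling of Frugal2U. It is not the benchmarked implementation; the function name `frugal_1u` and its parameters are hypothetical, and the fixed step of 1 assumes values on a roughly integer scale.

```python
import random


def frugal_1u(stream, q, initial=0, rng=None):
    """Estimate the q-quantile of a stream using a single stored value.

    Sketch of Frugal-1U (not Frugal2U): the estimate moves by at most
    one unit per observation, so a poor initial estimate costs roughly
    one observation per unit of distance before it converges.
    """
    rng = rng or random.Random()
    m = initial
    for x in stream:
        r = rng.random()
        if x > m and r > 1 - q:
            m += 1  # nudge up with probability q
        elif x < m and r > q:
            m -= 1  # nudge down with probability 1 - q
    return m


if __name__ == "__main__":
    data_rng = random.Random(42)
    data = [data_rng.randint(0, 1000) for _ in range(20000)]
    # Starting far from the true median, the estimate needs a warm-up period.
    print(frugal_1u(data[:200], 0.5, initial=0))
    print(frugal_1u(data, 0.5, initial=0))
```

Each update is a comparison and an increment, which is consistent with the low ns/op figure above. The cost is the warm-up: after 200 observations an estimate that started at 0 cannot be higher than 200, however large the true quantile is.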

checketts commented on August 18, 2024

I'm not familiar with CKMS/Frugal. Are those algorithms? If the output is pretty much identical, is it something worth pulling upstream?

I know a common caveat about using Summaries in Prometheus is the potential 'cost' of the client-side calculations.

Very cool to see the substantial difference.

jkschneider commented on August 18, 2024

The average-time benchmark does hide one somewhat pernicious characteristic of CKMS: the effect of the algorithm's "batch observations and calculate every so often" approach on worst-case performance. The implementation in Netflix Ocelli and Prometheus batches 500 observations before computing a result. The p=0.999 samples show the effect of this:

QuantilesBenchmark.ckmsQuantiles:p0.999      sample          1435648.000         ns/op
QuantilesBenchmark.frugal2uQuantiles:p0.999  sample             6693.648         ns/op

Yikes.
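
The shape of that worst case can be sketched without CKMS itself. The class below is a hypothetical stand-in (`BatchedQuantiles` is not an Ocelli or Prometheus class) that buffers observations and only does the expensive merge when the buffer fills. It uses an exact sorted list instead of the CKMS summary, so only the batching behaviour is illustrated, not the algorithm.

```python
import bisect


class BatchedQuantiles:
    """Buffer observations and merge them in batches (illustrative only)."""

    def __init__(self, batch_size=500):
        self.batch_size = batch_size
        self._buffer = []   # cheap append target
        self._sorted = []   # stand-in for the CKMS summary
        self.flushes = 0

    def observe(self, value):
        self._buffer.append(value)          # fast path: 499 of every 500 calls
        if len(self._buffer) >= self.batch_size:
            self._flush()                   # slow path: the unlucky caller pays

    def _flush(self):
        for v in self._buffer:
            bisect.insort(self._sorted, v)
        self._buffer.clear()
        self.flushes += 1

    def quantile(self, q):
        # Assumption: only flushed samples are visible to readers.
        if not self._sorted:
            return None
        idx = min(len(self._sorted) - 1, int(q * len(self._sorted)))
        return self._sorted[idx]


if __name__ == "__main__":
    bq = BatchedQuantiles()
    for i in range(1200):
        bq.observe(i)
    print(bq.flushes, bq.quantile(0.5))
```

Most calls to `observe` only append to a list, which keeps the average low, while every 500th call absorbs the whole batch. That is the pattern an average-time benchmark smooths over and a high-percentile sample exposes.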

jkschneider commented on August 18, 2024

An example of both algorithms at work simultaneously on two summaries recording the same values.

image

checketts commented on August 18, 2024

The flat nature of CKMS seems to imply it lags.

If you don't mind, I would be interested to see this: if the numbers doubled at some point, how long would it take until CKMS actually registered it?

jkschneider commented on August 18, 2024

Ultimately, I built in four different quantile algorithms and selected the GK-based sliding window algorithm as the underlying implementation for @Timed because it requires no tuning per quantile (so it is simplest for annotation use) and is otherwise most similar to the native Prometheus implementation, which is CKMS wrapped in a sliding window.

image
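
For readers unfamiliar with the "wrapped in a sliding window" idea, here is a rough sketch of one way to do it with a ring of rotating buckets. Everything here is an assumption for illustration: `SlidingWindowQuantiles`, its parameters, and the injected `clock` are hypothetical names, and an exact sorted list stands in for the GK or CKMS summary each bucket would really hold.

```python
import math


class SlidingWindowQuantiles:
    """Ring of buckets; the oldest bucket is reset on each rotation."""

    def __init__(self, max_age_seconds, age_buckets, clock):
        self._clock = clock
        self._buckets = [[] for _ in range(age_buckets)]
        self._current = 0
        self._rotate_every = max_age_seconds / age_buckets
        self._last_rotate = clock()

    def _rotate(self):
        now = self._clock()
        while now - self._last_rotate >= self._rotate_every:
            self._buckets[self._current] = []  # drop the oldest data
            self._current = (self._current + 1) % len(self._buckets)
            self._last_rotate += self._rotate_every

    def observe(self, value):
        self._rotate()
        for bucket in self._buckets:  # every bucket sees every value
            bucket.append(value)

    def quantile(self, q):
        self._rotate()
        data = sorted(self._buckets[self._current])
        if not data:
            return math.nan
        return data[min(len(data) - 1, int(q * len(data)))]


if __name__ == "__main__":
    now = [0.0]
    window = SlidingWindowQuantiles(10, 5, lambda: now[0])
    window.observe(1)
    now[0] = 5.0
    window.observe(10)
    print(window.quantile(0.0))  # the old value is still inside the window
    now[0] = 11.0
    print(window.quantile(0.0))  # the old value has aged out
```

Reads always come from the bucket that has been collecting longest, so old observations fall out of the result once they are older than the window, and nothing here has to be tuned per quantile.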

sbilello commented on August 18, 2024

If you want to know the number of requests in a certain period of time, how can you achieve that with the given _sum and _count metrics?
I was looking to apply the increase function correctly: https://www.innoq.com/en/blog/prometheus-counters/

checketts commented on August 18, 2024

Please ask this on Stack Overflow.
