Giter VIP home page Giter VIP logo

Comments (19)

arkgil avatar arkgil commented on May 26, 2024 3

Here you can check out my attempt at writing Telemetry.Metrics/StatsD integration: https://github.com/arkgil/telemetry_metrics_statsd. The code is not something I'd ship to production, but I hope it shows how one can use Telemetry.Metrics metric specs, attach to events and report metrics to external system based on that.

@deadtrickster I think that writing integration with Prometheus would be even easier, because the whole infrastructure is already in place. I.e. we would only need to create proper Prometheus metric and bump it from the event handler.

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

Hi @binaryseed! We are still working on the full experience but telemetry_metrics is meant to play a big role in that. We are hoping that new Phoenix applications will ship with something like this:

defmodule MyApp.Telemetry do
  use Supervisor

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    children = [
      {Telemetry.StatsD, metrics: metrics()}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

  defp metrics do
    [
      counter(
        "http.request",
        metadata: [:controller, :action], tags: [:controller, :action]
      ),
      sum(
        "my_app.repo.query",
       metadata: [...]
      )
    ]
  end
end

Likely we won't ship with StatsD per se but the idea is exactly that you can replace the StatsD call by anything else (or even have multiple of them):

    children = [
      {Telemetry.StatsD, metrics: metrics()}
      {NewRelic.Telemetry, metrics: metrics()}
    ]

Or even different metrics per system. Then what NewRelic.Telemetry will do is to use the information in the metrics to attach to the event behind the scenes and publish it to new relic in the most efficient possible. In other words, Telemetry doesn't do any of the wiring together, it is up to each implementation.

In other words, having an Ecto package would be fine for now but in practice you will expect users to tell you exactly what to get out of Ecto.

/cc @tsloughter @deadtrickster

from telemetry.

deadtrickster avatar deadtrickster commented on May 26, 2024

What children processes are supposed to do? Get the list of metrics in the init, what else?

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

Get the list of metrics and then call :telemetry.attach on each of them and install the handlers that will aggregate the data in the way that makes most sense for the child process. For example, StatsD may have a pool of sockets and write to those sockets directly on each call. Prometheus may want to write this information to ets or another process (you would know better :D), etc. The important thing is exactly that we don't prescribe how the data is processes or aggregate in any way.

from telemetry.

deadtrickster avatar deadtrickster commented on May 26, 2024

Yeah, I think I mostly understand the idea (till I start writing exporter :-)), but what to do in this supervised child after calling all attaches? Just empty loop? Perhaps I think simple callback instead of child process will be enough...

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

@deadtrickster that's a very good question. I dunno. You could just have a process doing nothing forever or have a task with type temporary that exits after setting everything up. I guess you could even just run init and return :ignore.

That is perhaps a question for telemetry to answer because we still don't have a good mechanism to work with detached handlers and processes may be part of it. We will get there. :)

from telemetry.

tsloughter avatar tsloughter commented on May 26, 2024

Since it is something that should only be configured at the highest level (so usually a release) I think it'd be good to support a list of handlers to attach in sys.config:

{telemetry, [{handlers, [...]}]}

Need to think more on how that would look and will try it out with OpenCensus.

Actually, now that I write that I'm thinking maybe it would be OpenCensus that is configured to attach to telemetry events.. So a user who includes OpenCensus can enable attaching to telemetry events through its config. That may be better.

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

from telemetry.

tsloughter avatar tsloughter commented on May 26, 2024

@josevalim yea, I was starting on another reply to say start order sucks, so would definitely want to go with the latter.

I'm still not sure about getting the telemetry defined metrics to opencensus metrics though.

from telemetry.

tsloughter avatar tsloughter commented on May 26, 2024

Oh, and I don't think it'd look like the supervision example. I would expect the application (in this case statsd) to start and be able to configure itself based on application env and be updated by calls to it.

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

from telemetry.

binaryseed avatar binaryseed commented on May 26, 2024

For a New Relic exporter, I think it'd be ideal for the exporter to be responsible for attaching to the events, be it from application env config or via config passed to a process.

However regarding the telemetry-metrics part of the picture, the thing that seems a little tricky to me is taking a metric spec and turning it into the right Telemetry event... Say I get a metric spec name like my_app.repo.query. I could just assume that the Telemetry event is [:my_app, :repo, :query] and attach to that.

But it seems a little brittle - when I handle the event, the only thing I have to rely on is that query is the last segment of the event. The repo module could have a deeper nesting so it could have an arbitrary length, or it might not actually have Repo in the name. It's not possible to pattern match that last query segment which means my handle_event function can't really use the event to determine what to do.

I do want access to the Repo name because I'd like to include it in the metrics sent to New Relic (otherwise everything gets flattened if there are more than one Repo).

Perhaps that simply means that there shouldn't be app-specific things in the metric name / telemetry event. Maybe instead of my_app.repo.query it should always be ecto.query and the repo should be passed into the metadata?

Or perhaps it just means that the New Relic exporter should be passed higher-level config, like repos: [MyApp.Repo] and it should be responsible for deciding which metrics to deal with.

If there is a better place to have these broader design conversations, I'd be happy to shift them out of Github Issues...

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

from telemetry.

binaryseed avatar binaryseed commented on May 26, 2024

Great, I'll assume matching on metadata can give us everything we need.. I'll open an issue over in ecto for that particular case

I'll also assume for now that we'll start a child process that is passed a list of metrics and attaches to events based on them

from telemetry.

arkgil avatar arkgil commented on May 26, 2024

I agree that users should not be concerned with what handlers are attached to what events. After all, they're not interested in having a handler attached, they want to get a metric pushed to StatsD, NewRelic or scraped by Prometheus (possibly via opencensus).

Regarding child processes, in the small StatsD exporter PoC I made some time ago, there was no process at all, only something like:

defmodule MyApp.Telemetry do
  def start do
    TelemetryMetrics.StatsD.start(metrics(), host: "...", port: ...)
  end

  defp metrics do
    [
      counter(
        "http.request",
        metadata: [:controller, :action], tags: [:controller, :action]
      ),
      sum(
        "my_app.repo.query",
       metadata: [...]
      )
    ]
  end
end

So having a process is not really always necessary, at least until we discover we need them to handle handler errors. My point is, it's not something we should be really concerned about right now.

@binaryseed Regarding your concerns about metric/event names, metric specs provided by Telemetry.Metrics already include the information about expected metric name and the source event name. If I create a metric spec like this:

Telemetry.Metrics.counter("my_app.repo.query", name: "my_app.db_query")

I expect the [:my_app, :repo, :query] to be the source event name, my_app.db_query to be the metric name in the system we export to (unless the system follows different naming conventions, like separating name segments with hyphens or something like that).

The struct returned by Telemetry.Metric.counter/2 call above contains event_name field with [:my_app, :repo, :query] value, and metric_name field with [:my_app, :db_query] value, so you can use them directly when attaching handlers and assembling the "external" metric name.

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

@arkgil I would recommend keeping a process even if all it does is to start the handlers on init, and then trap exits and remove the handlers on terminate. I think it is important to have the handlers attached to a process that respects a supervision tree shutdown. Otherwise, if handlers are not removed on shutdown and they depend on an application, process or service, we can have failures when certain events are emitted during shutdown and those may be hard to reproduce.

from telemetry.

arkgil avatar arkgil commented on May 26, 2024

@josevalim right, that's a good point, I only thought about handling handler failures, but you're right that we need to support supervision tree shutdown.

from telemetry.

deadtrickster avatar deadtrickster commented on May 26, 2024

Oh, thanks for statsd lib, It should be helpful. I'll try to write prometheus integration soonish.

from telemetry.

josevalim avatar josevalim commented on May 26, 2024

There is also a ConsoleReporter shipping as part of Telemetry.Metrics as an example (as soon as this PR is merged) and @arkgil also wrote a "Writing Reporters" documentation for Telemetry.Metrics. I think we are all covered on this front now. :)

from telemetry.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.