Comments (19)
Here you can check out my attempt at writing a Telemetry.Metrics/StatsD integration: https://github.com/arkgil/telemetry_metrics_statsd. The code is not something I'd ship to production, but I hope it shows how one can use Telemetry.Metrics metric specs, attach to events, and report metrics to an external system based on that.
@deadtrickster I think that writing integration with Prometheus would be even easier, because the whole infrastructure is already in place. I.e. we would only need to create proper Prometheus metric and bump it from the event handler.
from telemetry.
Hi @binaryseed! We are still working on the full experience but telemetry_metrics is meant to play a big role in that. We are hoping that new Phoenix applications will ship with something like this:
```elixir
defmodule MyApp.Telemetry do
  use Supervisor

  # counter/2 and sum/2 come from Telemetry.Metrics
  import Telemetry.Metrics

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    children = [
      {Telemetry.StatsD, metrics: metrics()}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

  defp metrics do
    [
      counter(
        "http.request",
        metadata: [:controller, :action],
        tags: [:controller, :action]
      ),
      sum(
        "my_app.repo.query",
        metadata: [...]
      )
    ]
  end
end
```
Likely we won't ship with StatsD per se but the idea is exactly that you can replace the StatsD call by anything else (or even have multiple of them):
```elixir
children = [
  {Telemetry.StatsD, metrics: metrics()},
  {NewRelic.Telemetry, metrics: metrics()}
]
```
Or even different metrics per system. Then what `NewRelic.Telemetry` will do is use the information in the metrics to attach to the events behind the scenes and publish them to New Relic in the most efficient way possible. In other words, Telemetry doesn't do any of the wiring together; it is up to each implementation.
In other words, having an Ecto package would be fine for now but in practice you will expect users to tell you exactly what to get out of Ecto.
/cc @tsloughter @deadtrickster
What are the child processes supposed to do? Get the list of metrics in `init`, what else?
Get the list of metrics, then call `:telemetry.attach` on each of them and install the handlers that will aggregate the data in the way that makes most sense for the child process. For example, StatsD may have a pool of sockets and write to those sockets directly on each call. Prometheus may want to write this information to ets or another process (you would know better :D), etc. The important thing is exactly that we don't prescribe how the data is processed or aggregated in any way.
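To make that loop concrete, here is a minimal sketch of a reporter attaching one handler per metric spec. The module name, handler-id scheme, and handler body are illustrative assumptions; only `:telemetry.attach/4` and the `event_name` field come from this thread.

```elixir
# Hypothetical reporter sketch: walk the metric specs and attach a
# handler per source event. Names here are illustrative, not a real API.
defmodule MyStatsDReporter do
  def attach_handlers(metrics) do
    for metric <- metrics do
      :telemetry.attach(
        # any unique term works as a handler id
        {__MODULE__, metric.event_name},
        metric.event_name,
        &__MODULE__.handle_event/4,
        metric
      )
    end
  end

  def handle_event(_event_name, value, metadata, metric) do
    # Aggregate however suits the backend: write to a UDP socket for
    # StatsD, bump an ETS counter for Prometheus, etc.
    IO.inspect({metric.event_name, value, metadata})
  end
end
```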
Yeah, I think I mostly understand the idea (until I start writing the exporter :-)), but what should this supervised child do after calling all the attaches? Just an empty loop? Perhaps a simple callback instead of a child process will be enough...
@deadtrickster that's a very good question. I dunno. You could just have a process doing nothing forever or have a task with type temporary that exits after setting everything up. I guess you could even just run `init` and return `:ignore`.
That is perhaps a question for telemetry to answer because we still don't have a good mechanism to work with detached handlers and processes may be part of it. We will get there. :)
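For completeness, the "set everything up and return `:ignore`" route could look like the sketch below (the `child_spec/1` shape is standard; module and field names are illustrative). The trade-off is that no process remains to detach the handlers on shutdown.

```elixir
defmodule MyApp.TelemetrySetup do
  # Runs under a supervisor, attaches everything, then returns :ignore,
  # so the supervisor keeps no process around afterwards.
  def child_spec(metrics) do
    %{id: __MODULE__, start: {__MODULE__, :start_link, [metrics]}}
  end

  def start_link(metrics) do
    for metric <- metrics do
      :telemetry.attach(
        {__MODULE__, metric.event_name},
        metric.event_name,
        &handle_event/4,
        metric
      )
    end

    :ignore
  end

  def handle_event(_event, _value, _metadata, _metric), do: :ok
end
```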
Since it is something that should only be configured at the highest level (so usually a release), I think it'd be good to support a list of handlers to attach in `sys.config`:

```erlang
{telemetry, [{handlers, [...]}]}
```
Need to think more on how that would look and will try it out with OpenCensus.
Actually, now that I write that, I'm thinking maybe it would be OpenCensus that is configured to attach to telemetry events... So a user who includes OpenCensus can enable attaching to telemetry events through its config. That may be better.
@josevalim yea, I was starting on another reply to say start order sucks, so would definitely want to go with the latter.
I'm still not sure about getting the telemetry defined metrics to opencensus metrics though.
Oh, and I don't think it'd look like the supervision example. I would expect the application (in this case statsd) to start and be able to configure itself based on the application env, and to be updated by calls to it.
For a New Relic exporter, I think it'd be ideal for the exporter to be responsible for attaching to the events, be it from application env config or via config passed to a process.
However, regarding the `telemetry-metrics` part of the picture, the thing that seems a little tricky to me is taking a metric spec and turning it into the right Telemetry event... Say I get a metric spec name like `my_app.repo.query`. I could just assume that the Telemetry event is `[:my_app, :repo, :query]` and attach to that.

But it seems a little brittle - when I handle the event, the only thing I have to rely on is that `query` is the last segment of the event. The repo module could have deeper nesting so the event could have arbitrary length, or it might not actually have `Repo` in the name. It's not possible to pattern match that last `query` segment, which means my `handle_event` function can't really use the `event` to determine what to do.
I do want access to the Repo name because I'd like to include it in the metrics sent to New Relic (otherwise everything gets flattened if there is more than one Repo).
Perhaps that simply means that there shouldn't be app-specific things in the metric name / telemetry event. Maybe instead of `my_app.repo.query` it should always be `ecto.query`, and the repo should be passed into the `metadata`?
Or perhaps it just means that the New Relic exporter should be passed higher-level config, like `repos: [MyApp.Repo]`, and it should be responsible for deciding which metrics to deal with.
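To make the metadata idea concrete, a handler along these lines would keep the repo name. Note that the fixed `[:ecto, :query]` event and the `:repo` metadata key are assumptions for illustration, not Ecto's actual contract, and `report_to_new_relic/3` is a hypothetical helper.

```elixir
# Assumed event shape: Ecto emits [:ecto, :query] with the repo module
# in metadata. This is an illustrative assumption, not Ecto's contract.
def handle_event([:ecto, :query], value, %{repo: repo} = metadata, _config) do
  # Keeping the repo module as a tag means MyApp.Repo and
  # MyApp.OtherRepo are not flattened into one series.
  report_to_new_relic(inspect(repo), value, metadata)
end
```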
If there is a better place to have these broader design conversations, I'd be happy to shift them out of Github Issues...
Great, I'll assume matching on metadata can give us everything we need... I'll open an issue over in `ecto` for that particular case.

I'll also assume for now that we'll start a child process that is passed a list of metrics and attaches to events based on them.
I agree that users should not be concerned with what handlers are attached to what events. After all, they're not interested in having a handler attached, they want to get a metric pushed to StatsD, NewRelic or scraped by Prometheus (possibly via opencensus).
Regarding child processes, in the small StatsD exporter PoC I made some time ago, there was no process at all, only something like:
```elixir
defmodule MyApp.Telemetry do
  import Telemetry.Metrics

  def start do
    TelemetryMetrics.StatsD.start(metrics(), host: "...", port: ...)
  end

  defp metrics do
    [
      counter(
        "http.request",
        metadata: [:controller, :action],
        tags: [:controller, :action]
      ),
      sum(
        "my_app.repo.query",
        metadata: [...]
      )
    ]
  end
end
```
So having a process is not always really necessary, at least until we discover we need one to handle handler errors. My point is, it's not something we should be really concerned about right now.
@binaryseed Regarding your concerns about metric/event names, metric specs provided by `Telemetry.Metrics` already include the information about the expected metric name and the source event name. If I create a metric spec like this:

```elixir
Telemetry.Metrics.counter("my_app.repo.query", name: "my_app.db_query")
```

I expect `[:my_app, :repo, :query]` to be the source event name and `my_app.db_query` to be the metric name in the system we export to (unless the system follows different naming conventions, like separating name segments with hyphens or something like that).

The struct returned by the `Telemetry.Metrics.counter/2` call above contains an `event_name` field with the value `[:my_app, :repo, :query]` and a `metric_name` field with the value `[:my_app, :db_query]`, so you can use them directly when attaching handlers and assembling the "external" metric name.
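Putting those two fields together, a reporter could do something like the following sketch. The handler id, the dot-joining of `metric_name`, and `send_to_backend/2` are illustrative assumptions about the target system, not part of the library.

```elixir
metric = Telemetry.Metrics.counter("my_app.repo.query", name: "my_app.db_query")

:telemetry.attach(
  {:my_reporter, metric.metric_name},  # illustrative handler id
  metric.event_name,                   # [:my_app, :repo, :query]
  fn _event, value, _metadata, spec ->
    # Join segments for a dot-separated backend, e.g. "my_app.db_query"
    external_name = Enum.join(spec.metric_name, ".")
    send_to_backend(external_name, value)  # hypothetical helper
  end,
  metric
)
```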
@arkgil I would recommend keeping a process even if all it does is to start the handlers on init, and then trap exits and remove the handlers on terminate. I think it is important to have the handlers attached to a process that respects a supervision tree shutdown. Otherwise, if handlers are not removed on shutdown and they depend on an application, process or service, we can have failures when certain events are emitted during shutdown and those may be hard to reproduce.
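A minimal GenServer following that advice might look like this sketch (handler ids and struct fields are illustrative):

```elixir
defmodule MyReporter do
  use GenServer

  def start_link(metrics), do: GenServer.start_link(__MODULE__, metrics)

  @impl true
  def init(metrics) do
    # Trap exits so terminate/2 runs on supervision tree shutdown.
    Process.flag(:trap_exit, true)

    handler_ids =
      for metric <- metrics do
        id = {__MODULE__, self(), metric.event_name}
        :telemetry.attach(id, metric.event_name, &__MODULE__.handle_event/4, metric)
        id
      end

    {:ok, handler_ids}
  end

  @impl true
  def terminate(_reason, handler_ids) do
    # Detach on the way down so no handler fires against a stopped service.
    Enum.each(handler_ids, &:telemetry.detach/1)
  end

  def handle_event(_event, _value, _metadata, _metric), do: :ok
end
```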
@josevalim right, that's a good point, I only thought about handling handler failures, but you're right that we need to support supervision tree shutdown.
Oh, thanks for the statsd lib, it should be helpful. I'll try to write a Prometheus integration soonish.
There is also a ConsoleReporter shipping as part of Telemetry.Metrics as an example (as soon as this PR is merged) and @arkgil also wrote a "Writing Reporters" documentation for Telemetry.Metrics. I think we are all covered on this front now. :)