
mongodbatlas_exporter's Introduction

Mongodbatlas exporter for Prometheus

Archived: NO LONGER MAINTAINED

Atlas released a Prometheus integration that provides much more reliable metrics without having to circumvent API rate limits.

See Introducing MongoDB’s Prometheus Monitoring Integration.

Limitations

  • The exporter supports up to 30 processes (mongod and mongos)

Process count calculation:
1 sharded cluster with 3 shards = 3x3 shard mongod processes + 3x3 mongos processes + 3x1 config mongod processes = 21
1 non-sharded cluster (replica set) = 3x1 mongod processes = 3

  • The minimum scrape interval should be 1m

Configuration

mongodbatlas_exporter doesn't require any configuration file; the available flags are listed below:

usage: mongodbatlas_exporter [<flags>]

Flags:
  --help                    Show context-sensitive help (also try --help-long and --help-man).
  --listen-address=":9905"  The address to listen on for HTTP requests.
  --atlas.public-key=ATLAS.PUBLIC-KEY
                            Atlas API public key
  --atlas.private-key=ATLAS.PRIVATE-KEY
                            Atlas API private key
  --atlas.project-id=ATLAS.PROJECT-ID
                            Atlas project id (group id) to scrape metrics from
  --atlas.cluster=ATLAS.CLUSTER ...
                            Atlas cluster name to scrape metrics from. Can be defined multiple times. If not defined all clusters in the project will be scraped
  --log-level=debug         Printed logs level.
  --version                 Show application version.

mongodbatlas_exporter's People

Contributors

freyert, melifaro


mongodbatlas_exporter's Issues

Sometimes, some metrics are missing

We sometimes rely on mongodbatlas_processes_stats_connections to list clusters in our dashboards, but we've noticed that this metric occasionally does not exist for processes.

List of metrics?

Can you add a list of the metrics currently exported by this exporter?

Connection Reset by Peer

We recently observed "connection reset by peer" errors when connecting to a MongoDB Atlas instance. The exporter runs for some time and then fails with the error below. We need the exporter to recover from connection failures automatically instead of requiring a restart.

{"err":"Get "https://cloud.mongodb.com/api/atlas/v1.0/groups//processes": read tcp :47368->:443: read: connection reset by peer","level":"error","msg":"failed to list processes of the project","project":","timestamp":"2021-12-02T02:10:40.60864571Z"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7c785e]

goroutine 22 [running]:
mongodbatlas_exporter/mongodbatlas.(*AtlasClient).ListProcesses(0xc0001a9080)
/go/mongodbatlas_exporter/mongodbatlas/mongodbatlas.go:69 +0x31e
mongodbatlas_exporter/registerer.(*ProcessRegisterer).registerAtlasProcesses(0xc000260990)
/go/mongodbatlas_exporter/registerer/process_registerer.go:51 +0x42
mongodbatlas_exporter/registerer.(*ProcessRegisterer).Observe(0xc000260990)
/go/mongodbatlas_exporter/registerer/process_registerer.go:43 +0x1e
created by main.main
/go/mongodbatlas_exporter/main.go:56 +0x434

**When we try to restart the pod, it fails to start again and goes down with the above error.**

Please have a look at it and let us know if you need any additional information.

Instrument the Atlas HTTP Client RoundTripper

I've added a special Error type, HTTPError, in #14. This was outside the scope of the PR, so I didn't focus on doing it correctly. Sometimes doing things inefficiently helps us learn to do them more efficiently.

Instead of handling/counting each error code wherever it occurs in the code, it is much better practice to wrap the RoundTripper of an HTTP client. The wrapper acts as a sort of middleware from which we can instrument all HTTP calls at a single point.

Specifically, we need to wrap the round tripper with InstrumentRoundTripperCounter, which partitions the CounterVec by method and code. This is exactly what we want.

The Prometheus team has provided an example in their tests:
https://github.com/prometheus/client_golang/blob/0400fc44d42dd0bca7fb16e87ea0313bb2eb8c53/prometheus/promhttp/instrument_client_test.go#L204-L210

Checked vs. Unchecked Exporter

Summary

  1. Register a collector per instance/disk.
  2. Add a goroutine that listens for new instances/disks to register.
  3. Add methods to unregister and re-register instances/disks as their labels change?

Exploration

Due to the dynamic nature of this exporter, we may want to investigate using an unchecked exporter: one that returns no description of its metrics. We need a better understanding of the downsides of an unchecked exporter, or a better method for managing collectors that allows them to remain checked.

The github.com/prometheus/client_golang/prometheus documentation indicates that our use case is expected and describes the pitfalls of checked vs. unchecked.

There is a more involved use case, too: If you already have metrics available, created outside of the Prometheus context, you don't need the interface of the various Metric types. You essentially want to mirror the existing numbers into Prometheus Metrics during collection.

Creation of the Metric instance happens in the Collect method. The Describe method has to return separate Desc instances, representative of the “throw-away” metrics to be created later. NewDesc comes in handy to create those Desc instances. Alternatively, you could return no Desc at all, which will mark the Collector “unchecked”. No checks are performed at registration time, but metric consistency will still be ensured at scrape time, i.e. any inconsistencies will lead to scrape errors. Thus, with unchecked Collectors, the responsibility to not collect metrics that lead to inconsistencies in the total scrape result lies with the implementer of the Collector. While this is not a desirable state, it is sometimes necessary. The typical use case is a situation where the exact metrics to be returned by a Collector cannot be predicted at registration time, but the implementer has sufficient knowledge of the whole system to guarantee metric consistency.

The big question is what they mean by "metric consistency". I think I have seen an example of this inconsistency with the stackdriver exporter: it reported duplicate versions of a metric, which caused a panic and crash.

I think, though, that the stackdriver exporter sets a good example for how we can keep our collectors checked. Currently we register only 2 collectors at startup: one for disks and one for processes.

The stackdriver exporter creates a Collector per project that you ask it to scrape metrics from: https://github.com/prometheus-community/stackdriver_exporter/blob/16401d6cce781e5d99615e9518f220dbf56d6f0b/stackdriver_exporter.go#L124-L131

So what we want is to create a collector per instance/disk, register it with the metrics found when the collector starts, and success!

If an instance restarts, many of its labels and metrics will change. So we should unregister the collector and register a new one if the labels change.
