marius - a friendly prometheus rule test templater

Marius takes your Prometheus rules and creates unit test files for all of them.

why

Ever broken your monitoring with an invalid alerting rule? Or tried to check whether a rule works by shutting down various services to get a signal from your Prometheus? Or do you want a proper CI/CD pipeline for your Prometheus setup?

Prometheus alert rules can be tested (see 1). What is quite uncool is that there is no way to bootstrap a test file for your many rules to start with. It's tedious to translate the myriad of your rules into proper test files. Marius can help you, at least a little bit.

how does it work

At the moment you have to build the binary yourself:

make build

Afterwards you can run it:

./bin/marius

how to check created test files

At the moment there is a Makefile target that runs the Prometheus unit tests for your created test files:

make prom
(...)

Unit Testing:  test-example-rules.yaml
  FAILED:
    alertname:ScrapeTargetDown, time:5m0s,
        exp:"[Labels:{alertname=\"ScrapeTargetDown\", instance=\"unittest\", layer=\"monitoring\", page=\"true\", runbook=\"https://example.example.de/confluence/display/Runbooks\", severity=\"warning\"} Annotations:{description=\"The instance has been not scraped for metrics more than 5 minutes. This indicates either connectivity issues or a problem on the instance.\", summary=\"Sracping from Instance {{ $labels.instance }} not possible\"}]",
        got:"[Labels:{alertname=\"ScrapeTargetDown\", instance=\"unittest\", layer=\"monitoring\", page=\"true\", runbook=\"https://example.example.de/confluence/display/Runbooks\", severity=\"warning\"} Annotations:{description=\"The instance has been not scraped for metrics more than 5 minutes. This indicates either connectivity issues or a problem on the instance.\", summary=\"Sracping from Instance unittest not possible\"}]"

make: *** [prom] Error 1

Unit Testing:  test-example-rules.yaml
  SUCCESS
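
The output above is what promtool prints for rule unit tests. If you want to run the same check without the Makefile, the sketch below may help. It assumes that `make prom` wraps `promtool test rules` (the Makefile itself is not shown here) and that `promtool` is on your PATH. The helper names are hypothetical and not part of marius.

```python
import shutil
import subprocess


def promtool_command(test_files):
    """Build the argument list for `promtool test rules <files...>`."""
    return ["promtool", "test", "rules", *test_files]


def run_unit_tests(test_files):
    """Run promtool on the given test files.

    Returns promtool's exit code, or None if promtool is not installed.
    """
    if shutil.which("promtool") is None:
        return None
    return subprocess.run(promtool_command(test_files)).returncode
```

Calling `run_unit_tests(["data/test-example-rules.yaml"])` would then return a non-zero exit code for a FAILED run. The path is an assumption based on the file name in the output above and the hard-coded data/ directory.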

what should be created

Given a simple alerting rule, marius should create something like this:

Input:

    - alert: ScrapeTargetDown
      expr: 'up == 0'
      for: '5m'
      labels:
        severity: 'warning'
        layer: 'monitoring'
        runbook: 'https://docu.example.de/confluence/display/Runbooks'
        page: 'true'
      annotations:
        summary: 'Sracping from Instance {{ $labels.instance }} not possible'
        description: 'The instance has been not scraped for metrics more than 5 minutes. This indicates either connectivity issues or a problem on the instance.'

Output:

tests:
- interval: 1m
  input_series:
    # replace this time series with time series matching the series you want to have

    - series: up{instance="unittest"}
      values: '0 0 0 0 0 0 0 0 0 0 0'

  alert_rule_test:
      - eval_time: 5m
        alertname: ScrapeTargetDown
        exp_alerts:
            - exp_labels:
                layer: monitoring
                page: true
                runbook: https://docu.example.de/confluence/display/Runbooks
                severity: warning
                instance: unittest
              exp_annotations:
                  summary: "Sracping from Instance unittest not possible"
                  description: "The instance has been not scraped for metrics more than 5 minutes. This indicates either connectivity issues or a problem on the instance."
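
The translation from rule to test stanza can be sketched roughly as follows. This is not marius's actual implementation (marius is written in Go), only an illustration in Python under stated assumptions. The function name `render_test`, the naive metric extraction (first token of `expr`), and the default of 11 zero samples are all hypothetical choices made for this example. Label values are always quoted, so `page: "true"` stays a string. Annotations are copied verbatim, so any Go template inside them still has to be resolved by hand.

```python
import json


def render_test(rule, series_labels, interval="1m", samples=11):
    """Render a promtool unit test stanza for one alerting rule.

    rule          -- dict with the keys alert, expr, for, labels, annotations
    series_labels -- labels of the placeholder input series
    """
    metric = rule["expr"].split()[0]  # naive: first token of the expression
    selector = ",".join(f'{k}="{v}"' for k, v in series_labels.items())
    values = " ".join(["0"] * samples)  # placeholder data, all zeros
    exp_labels = {**rule.get("labels", {}), **series_labels}

    lines = [
        "tests:",
        f"- interval: {interval}",
        "  input_series:",
        "    # replace this series with one matching your rule",
        f"    - series: '{metric}{{{selector}}}'",
        f"      values: '{values}'",
        "  alert_rule_test:",
        f"    - eval_time: {rule['for']}",
        f"      alertname: {rule['alert']}",
        "      exp_alerts:",
        "        - exp_labels:",
    ]
    for key, value in sorted(exp_labels.items()):
        lines.append(f"            {key}: {json.dumps(value)}")
    lines.append("          exp_annotations:")
    for key, value in rule.get("annotations", {}).items():
        lines.append(f"            {key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


example_rule = {
    "alert": "ScrapeTargetDown",
    "expr": "up == 0",
    "for": "5m",
    "labels": {"severity": "warning", "page": "true"},
    "annotations": {"summary": "Sracping from Instance {{ $labels.instance }} not possible"},
}
print(render_test(example_rule, {"instance": "unittest"}))
```

A file that promtool accepts also needs a top-level `rule_files:` entry pointing at the rules under test. It is left out here to mirror the output shown above.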

not working right now

Marius is pretty much alpha. Be kind if it does not work :P

  • marius should create the test files under a configurable path, but right now it is hard-coded to write into data/
  • marius does not yet replace Go templates embedded in label or annotation values.
  • marius does not know what values a metric provides, so it just inserts 0 values as the test data.
  • the docker build is broken: a Dockerfile is missing.
  • there are no test cases for the Go code.
  • no automatic builds.
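
Until template replacement is supported, the simplest case from the list above can be handled with a small substitution step. The sketch below is an assumption about how this could be done, not marius code. It only handles plain `{{ $labels.<name> }}` references and leaves everything else (for example `{{ $value }}` or template functions) untouched. The names `LABEL_REF` and `resolve_label_templates` are hypothetical.

```python
import re

# Matches {{ $labels.<name> }} with optional whitespace inside the braces.
LABEL_REF = re.compile(r"\{\{\s*\$labels\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_label_templates(text, labels):
    """Replace {{ $labels.<name> }} with the matching label value.

    References to labels that are not in `labels` are left as they are.
    """
    def substitute(match):
        return labels.get(match.group(1), match.group(0))

    return LABEL_REF.sub(substitute, text)
```

Applied to the summary annotation of the example rule with `{"instance": "unittest"}`, this yields the text that promtool reports in the `got:` line of a test run.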

appendix

(1) https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/

Contributors

la3mmchen