Collection of preselected Prometheus exporters to be installed on target nodes

License: MIT License



sa-prometheus-exporters


A reusable collection of exporters commonly used with Prometheus.

Bundled exporters:

| Exporter | Description | Port | Grafana dashboard | Example | GitHub |
|---|---|---|---|---|---|
| node | Server metrics | 9100 | 1860 | http://192.168.2.66:9100/metrics | prometheus/node_exporter |
| mysqld | Prometheus MySQL exporter | 9104 | 7362 | http://192.168.2.66:9104/metrics | prometheus/mysqld_exporter |
| elasticsearch | Elasticsearch exporter | 9108 | 2322 | http://192.168.2.66:9108/metrics | justwatchcom/elasticsearch_exporter |
| nginx | Nginx exporter | 9113 | 11199 | http://192.168.2.66:9113/metrics | nginxinc/nginx-prometheus-exporter |
| blackbox | External services | 9115 | | http://192.168.2.66:9115/metrics | prometheus/blackbox_exporter |
| apache | Apache webserver | 9117 | | http://192.168.2.66:9117/metrics | Lusitaniae/apache_exporter |
| redis | Redis exporter | 9121 | 763 | http://192.168.2.66:9121/metrics | oliver006/redis_exporter |
| proxysql | ProxySQL exporter | 42004 | | http://192.168.2.66:42004/metrics | percona/proxysql_exporter |
| memcached | memcached health | 9150 | 37 | http://192.168.2.66:9150/metrics | prometheus/memcached_exporter |
| postgres | Postgres exporter | 9187 | | http://192.168.2.66:9187/metrics | wrouesnel/postgres_exporter |
| mongodb | Percona's MongoDB exporter | 9216 | | http://192.168.2.66:9216/metrics | percona/mongodb_exporter |
| ssl | SSL exporter | 9219 | | http://192.168.2.66:9219/metrics | ribbybibby/ssl_exporter |
| ecs | AWS ECS exporter | 9222 | | http://192.168.2.66:9222/metrics | slok/ecs-exporter |
| sql | Custom SQL exporter | 9237 | | http://192.168.2.66:9237/metrics | justwatchcom/sql_exporter |
| phpfpm | PHP-FPM exporter via socket | 9253 | 5579 | http://192.168.2.66:9253/metrics | Lusitaniae/phpfpm_exporter |
| cadvisor | Google's cAdvisor exporter | 9280 (configurable) | | http://192.168.2.66:9280/metrics | google/cadvisor |
| monit | Monit exporter | 9388 (configurable) | | http://192.168.2.66:9388/metrics | commercetools/monit_exporter |
| cloudwatch | CloudWatch exporter | 9400 (configurable) | | http://192.168.2.66:9400/metrics | ivx/yet-another-cloudwatch-exporter |

Example of usage:

Simple

     - {
         role: "sa-prometheus-exporters"
       }

Each exporter is configured via OPTIONS specified in an environment file located at /etc/prometheus/exporters/exportername_exporter. Additionally, you can influence the systemd service file itself; see the advanced section.

OPTIONS=" --some-exporter-params --some-exporter-params --some-exporter-params --some-exporter-params"
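For illustration, the environment file above is a plain list of KEY="value" lines. The parser below is only a sketch of how such a file could be read, not code shipped with the role:

```python
def read_env_file(path):
    """Parse KEY="value" lines, as used in /etc/prometheus/exporters/<name>_exporter."""
    opts = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            opts[key] = value.strip().strip('"').strip("'")
    return opts
```

systemd later injects these variables into the exporter's environment, so OPTIONS ends up on the exporter's command line.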

Advanced

box_prometheus_exporters:
  - {
      name: some_exporter,
      exporter_configuration_template: "/path/to/template/with/console/params/for/exporter",
      exporter_user: prometheus,
      exporter_group: prometheus,
      startup: {
        env:
          - ENV_VAR1: "ENV_VAR1 value",
          - ENV_VAR2: "ENV_VAR2 value"

        execstop: "/some/script/to/execute/on/stop",
        pidfile: "/if/you/need/pidfile",
        extralines:
          - "IF YOU WANT"
          - "SOME REALLY CUSTOM STUFF"
          - "in Service section of the systemd file"
          - "(see templates folder)"
      }

    }
  - {
      name: apache
    }
  - {
      name: blackbox
    }
  - {
      name: cadvisor
    }
  - {
      name: ecs,
      parameters: "--aws.region='us-east-1'"
    }
  - {
      name: elasticsearch
    }
  - {
      name: memcached
    }
  - {
      name: mongodb
    }
  - {
      name: monit,
      monit_user: "admin",
      monit_password: "see_box_example_for_example_)",
      listen_address: "0.0.0.0:9388"
    }
  - {
      name: mysqld,
      parameters: "-config.my-cnf=/etc/prometheus/.mycnf -collect.binlog_size=true -collect.info_schema.processlist=true"
    }
  - {
      name: node
    }
  - {
      name: nginx,
      parameters: "-nginx.scrape-uri http://<nginx>:8080/stub_status"
    }
  - {
      name: proxysql
    }
  - {
      name: phpfpm,
      exporter_user: "www-data",
      parameters: "--phpfpm.socket-paths /var/run/php/php7.0-fpm.sock"
    }
  - {
      name: postgres
    }
  - {
      name: redis
    }
  - {
      name: sql
    }
  - {
      name: ssl
    }


roles:

     - {
         role: "sa-prometheus-exporters",
         prometheus_exporters: "{{box_prometheus_exporters}}"
       }

More exporters to consider: https://prometheus.io/docs/instrumenting/exporters/

If you want additional exporters included in the role defaults, open an issue or (preferably) a PR.

mysqld exporter configuration

Best served with the Grafana dashboards from Percona: https://github.com/percona/grafana-dashboards

For those, you need to add the parameters -collect.binlog_size=true -collect.info_schema.processlist=true

Additionally, create a dedicated MySQL user for the exporter:

CREATE USER 'prometheus_exporter'@'localhost' IDENTIFIED BY 'XXX' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'prometheus_exporter'@'localhost';

Then either set the environment variable in the startup script (see the configuration example in the advanced section):

export DATA_SOURCE_NAME='prometheus_exporter:XXX@(localhost:3306)/'
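As an aside, the DSN above follows the Go MySQL driver format user:password@(host:port)/. A tiny hypothetical helper (not part of the role) that assembles it:

```python
def mysqld_exporter_dsn(user, password, host="localhost", port=3306):
    """Build a DATA_SOURCE_NAME in the Go MySQL driver format used by mysqld_exporter."""
    return f"{user}:{password}@({host}:{port})/"

# e.g. mysqld_exporter_dsn("prometheus_exporter", "XXX")
```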

or set it via the exporter environment file /etc/prometheus/exporters/mysqld_exporter:

OPTIONS=""
DATA_SOURCE_NAME='prometheus_exporter:devroot@(localhost:3306)/'

or

create the credentials file /etc/prometheus/.mycnf and point the exporter at it with -config.my-cnf=/etc/prometheus/.mycnf -collect.binlog_size=true -collect.info_schema.processlist=true

[client]
user=prometheus_exporter
password=XXXXXX

For presentation, consider Grafana dashboards such as:

https://github.com/percona/grafana-dashboards

For troubleshooting, a must-have Grafana dashboard is https://grafana.com/dashboards/456

apache exporter configuration

apache Server params 9117 http://192.168.2.66:9117/metrics Lusitaniae/apache_exporter

prometheus.yml

  - job_name: "apache"
    ec2_sd_configs:
      - region: us-east-1
        port: 9117
    relabel_configs:
        # Only monitor instances with a non-no Prometheus tag
      - source_labels: [__meta_ec2_tag_Prometheus]
        regex: no
        action: drop
      - source_labels: [__meta_ec2_instance_state]
        regex: stopped
        action: drop
        # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
      - source_labels: [__meta_ec2_tag_Environment]
        target_label: env
      - source_labels: [__meta_ec2_tag_Role]
        target_label: role
      - source_labels: [__meta_ec2_public_ip]
        regex: (.*)
        replacement: ${1}:9117
        action: replace
        target_label: __address__

A Grafana dashboard compatible with this exporter is, for example, https://grafana.com/dashboards/3894

blackbox exporter configuration

A blackbox exporter running on a monitoring host can probe services on other machines.

Example: monitoring ssh connectivity for external boxes

prometheus.yml

  - job_name: 'blackbox_ssh'
    metrics_path: /probe
    params:
      module: [ssh_banner]
    ec2_sd_configs:
      - region: us-east-1
        port: 9100
    relabel_configs:
        # Only monitor instances with a non-no Prometheus tag
      - source_labels: [__meta_ec2_tag_Prometheus]
        regex: no
        action: drop
      - source_labels: [__meta_ec2_instance_state]
        regex: stopped
        action: drop
        # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
      - source_labels: [__meta_ec2_tag_Environment]
        target_label: env
      - source_labels: [__meta_ec2_tag_Role]
        target_label: role
      - source_labels: [__meta_ec2_private_ip]
        regex: (.*)(:.*)?
        replacement: ${1}:22
        target_label: __param_target
      # Actually talk to the blackbox exporter though
      - target_label: __address__
        replacement: 127.0.0.1:9115

As a result, you can configure Prometheus alerting (with Alertmanager) to notify you when SSH connectivity breaks.

/etc/prometheus/rules/ssh_alerts.yml

groups:
- name: ssh_alerts.rules
  rules:
  - alert: SSHDown
    expr: probe_success{job="blackbox_ssh"} == 0
    annotations:
      description: '[{{ $labels.env }}]{{ $labels.instance }} {{ $labels.name }} of job {{ $labels.job }} is not responding on ssh.'
      summary: 'Instance {{ $labels.instance }} has possible ssh issue'
    for: 10m

memcached exporter configuration

memcached memcached health 9150 http://192.168.2.66:9150/metrics prometheus/memcached_exporter

prometheus.yml

  - job_name: "memcached"
    ec2_sd_configs:
      - region: us-east-1
        port: 9150
    relabel_configs:
        # Only monitor instances with a non-no Prometheus tag
      - source_labels: [__meta_ec2_tag_Prometheus]
        regex: no
        action: drop
      - source_labels: [__meta_ec2_instance_state]
        regex: stopped
        action: drop
        # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
      - source_labels: [__meta_ec2_tag_Environment]
        target_label: env
      - source_labels: [__meta_ec2_tag_Role]
        target_label: role
      - source_labels: [__meta_ec2_public_ip]
        regex: (.*)
        replacement: ${1}:9150
        action: replace
        target_label: __address__

Compatible Grafana dashboard: https://grafana.com/dashboards/37

node exporter configuration

node Server params 9100 http://192.168.2.66:9100/metrics prometheus/node_exporter

Auto-discovery in the AWS cloud, with filtering:

prometheus.yml

  - job_name: "node"
    ec2_sd_configs:
      - region: us-east-1
        port: 9100
    relabel_configs:
        # Only monitor instances with a non-no Prometheus tag
      - source_labels: [__meta_ec2_tag_Prometheus]
        regex: no
        action: drop
      - source_labels: [__meta_ec2_instance_state]
        regex: stopped
        action: drop
        # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
      - source_labels: [__meta_ec2_tag_Environment]
        target_label: env
      - source_labels: [__meta_ec2_tag_Role]
        target_label: role
      - source_labels: [__meta_ec2_public_ip]
        regex: (.*)
        replacement: ${1}:9100
        action: replace
        target_label: __address__

A reasonable set of related Prometheus alerts to consider:

/etc/prometheus/rules/node_alerts.yml

groups:
- name: node_alerts.rules
  rules:
  - alert: LowDiskSpace
    expr: node_filesystem_avail{fstype=~"(ext.|xfs)",job="node"} / node_filesystem_size{fstype=~"(ext.|xfs)",job="node"}* 100 <= 10
    for: 15m
    labels:
      severity: warn
    annotations:
      title: 'Less than 10% disk space left'
      description: |
        Consider sshing into the instance and removing old logs, clean
        temp files, or remove old apt packages with `apt-get autoremove`
      runbook: troubleshooting/filesystem_alerts.md
      value: '{{ $value | humanize }}%'
      device: '{{ $labels.device }}'
      mount_point: '{{ $labels.mountpoint }}'
  - alert: NoDiskSpace
    expr: node_filesystem_avail{fstype=~"(ext.|xfs)",job="node"} / node_filesystem_size{fstype=~"(ext.|xfs)",job="node"}* 100 <= 1
    for: 15m
    labels:
      pager: pagerduty
      severity: critical
    annotations:
      title: '1% disk space left'
      description: "There's only 1% disk space left"
      runbook: troubleshooting/filesystem_alerts.md
      value: '{{ $value | humanize }}%'
      device: '{{ $labels.device }}'
      mount_point: '{{ $labels.mountpoint }}'

  - alert: HighInodeUsage
    expr: node_filesystem_files_free{fstype=~"(ext.|xfs)",job="node"} / node_filesystem_files{fstype=~"(ext.|xfs)",job="node"}* 100 <= 20
    for: 15m
    labels:
      severity: critical
    annotations:
      title: "High number of inode usage"
      description: |
        "Consider ssh'ing into the instance and removing files or clean
        temp files"
      runbook: troubleshooting/filesystem_alerts_inodes.md
      value: '{{ $value | printf "%.2f" }}%'
      device: '{{ $labels.device }}'
      mount_point: '{{ $labels.mountpoint }}'

  - alert: ExtremelyHighCPU
    expr: instance:node_cpu_in_use:ratio > 0.95
    for: 2h
    labels:
      pager: pagerduty
      severity: critical
    annotations:
      description: CPU use percent is extremely high on {{ if $labels.fqdn }}{{ $labels.fqdn
        }}{{ else }}{{ $labels.instance }}{{ end }} for the past 2 hours.
      runbook: troubleshooting
      title: CPU use percent is extremely high on {{ if $labels.fqdn }}{{ $labels.fqdn
        }}{{ else }}{{ $labels.instance }}{{ end }} for the past 2 hours.

  - alert: HighCPU
    expr: instance:node_cpu_in_use:ratio > 0.8
    for: 2h
    labels:
      severity: critical
    annotations:
      description: CPU use percent is high on {{ if $labels.fqdn }}{{ $labels.fqdn
        }}{{ else }}{{ $labels.instance }}{{ end }} for the past 2 hours.
      runbook: troubleshooting
      title: CPU use percent is high on {{ if $labels.fqdn }}{{ $labels.fqdn }}{{
        else }}{{ $labels.instance }}{{ end }} for the past 2 hours.
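The filesystem alert expressions above boil down to simple percentage arithmetic. A minimal sketch, with illustrative values (not part of the role):

```python
def percent_available(avail_bytes, size_bytes):
    """Same arithmetic as node_filesystem_avail / node_filesystem_size * 100."""
    return avail_bytes / size_bytes * 100

# 8 GB free on a 100 GB filesystem is 8%, which trips the <= 10 LowDiskSpace rule
low_disk = percent_available(8e9, 100e9) <= 10
```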

A matching Grafana dashboard is https://grafana.com/dashboards/22 (uses label_values(node_exporter_build_info, instance)).

Consider: https://grafana.com/dashboards/6014 or https://grafana.com/dashboards/718

nginx exporter configuration

Additional parameters that might be passed to nginx-prometheus-exporter:

- `-nginx.plus`: start the exporter for NGINX Plus. By default, the exporter is started for NGINX. The default value can be overwritten by the NGINX_PLUS environment variable.
- `-nginx.retries int`: number of retries the exporter will make on start to connect to the NGINX stub_status page / NGINX Plus API before exiting with an error. The default value can be overwritten by the NGINX_RETRIES environment variable.
- `-nginx.retry-interval duration`: interval between retries to connect to the NGINX stub_status page / NGINX Plus API on start. The default value can be overwritten by the NGINX_RETRY_INTERVAL environment variable. (default 5s)
- `-nginx.scrape-uri string`: URI for scraping NGINX or NGINX Plus metrics. For NGINX, the stub_status page must be available through the URI; for NGINX Plus, the API. The default value can be overwritten by the SCRAPE_URI environment variable. (default "http://127.0.0.1:8080/stub_status")
- `-nginx.ssl-verify`: perform SSL certificate verification. The default value can be overwritten by the SSL_VERIFY environment variable. (default true)
- `-nginx.timeout duration`: timeout for scraping metrics from NGINX or NGINX Plus. The default value can be overwritten by the TIMEOUT environment variable. (default 5s)
- `-web.listen-address string`: address to listen on for the web interface and telemetry. The default value can be overwritten by the LISTEN_ADDRESS environment variable. (default ":9113")
- `-web.telemetry-path string`: path under which to expose metrics. The default value can be overwritten by the TELEMETRY_PATH environment variable. (default "/metrics")

Exported metrics

Common metrics:

- nginxexporter_build_info: shows the exporter build information.

For NGINX, the following metrics are exported:

- All stub_status metrics.
- nginx_up: shows the status of the last metric scrape: 1 for a successful scrape and 0 for a failed one.

For NGINX Plus, the following metrics are exported:

- Connections.
- HTTP.
- SSL.
- HTTP Server Zones.
- Stream Server Zones.
- HTTP Upstreams. Note: for the state metric, the string values are converted to float64 using the following rule: "up" -> 1.0, "draining" -> 2.0, "down" -> 3.0, "unavail" -> 4.0, "checking" -> 5.0, "unhealthy" -> 6.0.
- Stream Upstreams. Note: for the state metric, the string values are converted to float64 using the following rule: "up" -> 1.0, "down" -> 3.0, "unavail" -> 4.0, "checking" -> 5.0, "unhealthy" -> 6.0.
- Stream Zone Sync.
- nginxplus_up: shows the status of the last metric scrape: 1 for a successful scrape and 0 for a failed one.
- Location Zones.
- Resolver.

Connect to the /metrics page of the running exporter to see the complete list of metrics along with their descriptions. Note: to see server-zone related metrics you must configure status zones, and to see upstream related metrics you must configure upstreams with a shared memory zone.
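The state-to-number conversion described above is just a lookup table. A sketch (the table name is illustrative; the values come from the text):

```python
# NGINX Plus upstream server state, as exported in the state metric.
# "draining" applies to HTTP upstreams only.
STATE_VALUE = {
    "up": 1.0,
    "draining": 2.0,
    "down": 3.0,
    "unavail": 4.0,
    "checking": 5.0,
    "unhealthy": 6.0,
}
```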

You will also need to enable the stub_status location in nginx:

location /stub_status {
    stub_status;
    allow 127.0.0.1;   # only allow requests from localhost
    deny all;          # deny all other hosts
}

phpfpm exporter configuration

phpfpm php fpm exporter via sock 9253 http://192.168.2.66:9253/metrics Lusitaniae/phpfpm_exporter

Pay attention to the configuration:

First, the exporter should run under a user that can read the php-fpm socket (for example www-data, if you used the sa-php-fpm role). Second, you should provide the path to the socket, e.g. --phpfpm.socket-paths /var/run/php/php7.0-fpm.sock:

      exporter_user: "www-data",
      parameters: "--phpfpm.socket-paths /var/run/php/php7.0-fpm.sock"

Third, the FPM status page must be enabled in every pool you'd like to monitor by defining pm.status_path = /status

Auto-discovery in the AWS cloud, with filtering:

prometheus.yml

  - job_name: "phpfpm"
    ec2_sd_configs:
      - region: us-east-1
        port: 9253
    relabel_configs:
        # Only monitor instances with a non-no Prometheus tag
      - source_labels: [__meta_ec2_tag_Prometheus]
        regex: no
        action: drop
      - source_labels: [__meta_ec2_instance_state]
        regex: stopped
        action: drop
        # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
      - source_labels: [__meta_ec2_tag_Environment]
        target_label: env
      - source_labels: [__meta_ec2_tag_Role]
        target_label: role
      - source_labels: [__meta_ec2_public_ip]
        regex: (.*)
        replacement: ${1}:9253
        action: replace
        target_label: __address__

Linked Grafana dashboards:

https://grafana.com/dashboards/5579 (single pool)

https://grafana.com/dashboards/5714 (multiple pools)

postgres exporter configuration

The recommended approach is to use a dedicated user for stats collection.

Create the postgres_exporter user:

CREATE USER postgres_exporter PASSWORD 'XXXX';
ALTER USER postgres_exporter SET SEARCH_PATH TO postgres_exporter,pg_catalog;

Since you are running as a non-postgres user, you will need to create views to read statistics information:

-- If deploying as non-superuser (for example in AWS RDS)
-- GRANT postgres_exporter TO :MASTER_USER;
CREATE SCHEMA postgres_exporter AUTHORIZATION postgres_exporter;

CREATE VIEW postgres_exporter.pg_stat_activity
AS
  SELECT * from pg_catalog.pg_stat_activity;

GRANT SELECT ON postgres_exporter.pg_stat_activity TO postgres_exporter;

CREATE VIEW postgres_exporter.pg_stat_replication AS
  SELECT * from pg_catalog.pg_stat_replication;

GRANT SELECT ON postgres_exporter.pg_stat_replication TO postgres_exporter;

Then, in /etc/prometheus/exporters/postgres_exporter:

OPTIONS="some parameters"
DATA_SOURCE_NAME="login:password@(hostname:port)/"

Assuming you provided the above, the exporter is ready to operate.

For presentation, consider grafana dashboards , like

https://grafana.com/dashboards/3300

https://grafana.com/dashboards/3742

Blackbox exporter notes

Startup options should look like:

OPTIONS="--config.file=/opt/prometheus/exporters/blackbox_exporter/blackbox.yml"

where blackbox.yml is

modules:
  http_2xx:
    prober: http
    http:
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp

cloudwatch exporter configuration

You will need to give appropriate rights to your monitoring instance

resource "aws_iam_role" "monitoring_iam_role" {
  name = "monitoring_iam_role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ec2.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "ec2-read-only-policy-attachment" {
  role = "${aws_iam_role.monitoring_iam_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess"
}

resource "aws_iam_role_policy_attachment" "CloudWatchFullAccess" {
  role = "${aws_iam_role.monitoring_iam_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/CloudWatchFullAccess"
}



resource "aws_iam_role_policy" "iam_role_policy_prometheuscloudwatch" {
  name = "iam_role_policy_prometheuscloudwatch"
  role = "${aws_iam_role.monitoring_iam_role.id}"
  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "prometheuscloudwatchexporter",
            "Effect": "Allow",
            "Action": [
                "tag:GetResources",
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:ListMetrics"
            ],
            "Resource": "*"
        }
    ]
}
EOF
}

The default config.yml is scripted to give you an idea of the possibilities. Amend it: you should adjust its structure to your environment.

discovery:
  exportedTagsOnMetrics:
    vpn:
      - Project
      - Name
    elb:
      - Project
      - Name
    rds:
      - Project
      - Name
    ec:
      - Project
      - Name
    lambda:
      - Project
      - Name
    alb:
      - Project
      - Name

  jobs:
  - region: {{ ec2_region }}
    type: "vpn"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: TunnelState
        statistics:
        - 'Maximum'
        period: 60
        length: 300
      - name: TunnelDataOut
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: TunnelDataIn
        statistics:
        - 'Sum'
        period: 60
        length: 300
  - region: {{ ec2_region }}
    type: "lambda"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: Throttles
        statistics:
        - 'Average'
        period: 60
        length: 300
      - name: Invocations
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: Errors
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: Duration
        statistics:
        - 'Average'
        period: 60
        length: 300
      - name: Duration
        statistics:
        - 'Minimum'
        period: 60
        length: 300
      - name: Duration
        statistics:
        - 'Maximum'
        period: 60
        length: 300
  - region: {{ ec2_region }}
    type: "ec"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: NewConnections
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: GetTypeCmds
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: SetTypeCmds
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: CacheMisses
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: CacheHits
        statistics:
        - 'Sum'
        period: 60
        length: 300
  - region: {{ ec2_region }}
    type: "rds"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: DatabaseConnections
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: ReadIOPS
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: WriteIOPS
        statistics:
        - 'Sum'
        period: 60
        length: 300
  - region: {{ ec2_region }}
    type: "elb"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: RequestCount
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: HealthyHostCount
        statistics:
        - 'Minimum'
        period: 60
        length: 300
      - name: UnHealthyHostCount
        statistics:
        - 'Maximum'
        period: 60
        length: 300
      - name: HTTPCode_Backend_2XX
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: HTTPCode_Backend_5XX
        statistics:
        - 'Sum'
        period: 60
        length: 300
      - name: HTTPCode_Backend_4XX
        statistics:
        - 'Sum'
        period: 60
        length: 300
  - region: {{ ec2_region }}
    type: "alb"
    searchTags:
      - Key: {{ prometheus_yace_tag_filter }}
        Value: {{ prometheus_yace_tag_value }}
    metrics:
      - name: RequestCount
        statistics:
        - 'Sum'
        period: 600
        length: 600
      - name: HealthyHostCount
        statistics:
        - 'Minimum'
        period: 600
        length: 600
      - name: UnHealthyHostCount
        statistics:
        - 'Maximum'
        period: 600
        length: 600
      - name: HTTPCode_Target_2XX
        statistics:
        - 'Sum'
        period: 600
        length: 600
      - name: HTTPCode_Target_5XX
        statistics:
        - 'Sum'
        period: 600
        length: 600
      - name: HTTPCode_Target_4XX
        statistics:
        - 'Sum'
        period: 60
        length: 60

ssl exporter configuration

Just like with the blackbox_exporter, you should pass the targets to a single instance of the exporter in a scrape config with a clever bit of relabelling. This allows you to leverage service discovery and keeps configuration centralised to your Prometheus config.

scrape_configs:
  - job_name: 'ssl'
    metrics_path: /probe
    static_configs:
      - targets:
        - example.com:443
        - prometheus.io:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9219  # SSL exporter.
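The relabelling above can be modelled in a few lines. This is a simplified sketch of what Prometheus does with the ssl job's labels, not Prometheus internals:

```python
def relabel_ssl_target(labels):
    """Mimic the ssl job's relabel_configs on one discovered target."""
    labels = dict(labels)
    labels["__param_target"] = labels["__address__"]  # target becomes the ?target= probe parameter
    labels["instance"] = labels["__param_target"]     # probed host is kept as the instance label
    labels["__address__"] = "127.0.0.1:9219"          # Prometheus scrapes the ssl_exporter itself
    return labels

labels = relabel_ssl_target({"__address__": "example.com:443"})
```

The same pattern underlies the blackbox_ssh job earlier: the real target moves into __param_target while __address__ is pointed at the exporter.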

Example queries

Certificates that expire within 7 days, with the Subject Common Name and Subject Alternative Names joined on:

((ssl_cert_not_after - time() < 86400 * 7) * on (instance,issuer_cn,serial_no) group_left (dnsnames) ssl_cert_subject_alternative_dnsnames) * on (instance,issuer_cn,serial_no) group_left (subject_cn) ssl_cert_subject_common_name

Only return wildcard certificates that are expiring:

((ssl_cert_not_after - time() < 86400 * 7) * on (instance,issuer_cn,serial_no) group_left (subject_cn) ssl_cert_subject_common_name{subject_cn=~"\\*.*"})

Number of certificates in the chain:

count(ssl_cert_subject_common_name) by (instance)

Identify instances that have failed to create a valid SSL connection:

ssl_tls_connect_success == 0

Usage with ansible galaxy workflow

If you installed the sa-prometheus-exporters role using the command

ansible-galaxy install softasap.sa-prometheus-exporters

the role will be available in the folder library/softasap.sa-prometheus-exporters. Please adjust the path accordingly.

     - {
         role: "softasap.sa-prometheus-exporters"
       }

Ideas for more alerts: https://awesome-prometheus-alerts.grep.to/

Copyright and license

Code is dual licensed under the [BSD 3 clause](https://opensource.org/licenses/BSD-3-Clause) and the [MIT License](http://opensource.org/licenses/MIT). Choose the one that suits you best.

Reach us:

Subscribe for role updates on [FB](https://www.facebook.com/SoftAsap/)

Join gitter discussion channel at Gitter

Discover other roles at http://www.softasap.com/roles/registry_generated.html

Visit our blog at http://www.softasap.com/blog/archive.html

Contributors

adriagalin, honarkhah, ph4r5h4d, voronenko

