🚨 CFDuty

Cloudflare Worker to use PagerDuty with Cloudflare Health Checks on a Pro plan.

This covers the simple use case where you have a few servers, each running a few services, and you want to be notified when a server or service goes down. It is for setups where you don't want to pay PagerDuty a large sum of money, and where you don't want multiple redundant alerts, one per service, every time a server goes down (or is rebooted).

🚧 Deploy the Worker

Edit wrangler.toml and put in your Cloudflare account ID. If you want to use a workers.dev domain, change that setting as well.

Deploy the Worker as normal. The Worker needs several environment variables, which you can create as secrets with Wrangler (details below).

🪝 Create a webhook

Create a webhook notification destination in the Cloudflare dashboard. The URL should be whatever route you've deployed the Worker on, with the path /alert. So if you've put the Worker on alerts.example.com, the URL would be https://alerts.example.com/alert.

The shared secret can be any value you choose. It is used to verify that the webhook is coming from your Cloudflare setup.
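As a rough illustration of that check, here is a minimal sketch of comparing the secret sent with a request against the configured WEBHOOK_SECRET. The function name secretsMatch is hypothetical and is not taken from this project's code. It also assumes Cloudflare delivers the secret in the cf-webhook-auth request header, which you should confirm against Cloudflare's notification documentation.

```typescript
// Hypothetical helper, not the project's actual code.
// Compares the secret received with a webhook request against the
// configured WEBHOOK_SECRET without stopping at the first mismatch.
function secretsMatch(received: string | null, expected: string): boolean {
  // A missing header or an unset secret never authorizes a request.
  if (received === null || expected.length === 0) {
    return false;
  }
  // Start with the length difference, then fold in every character pair.
  let diff = received.length ^ expected.length;
  for (let i = 0; i < expected.length; i++) {
    const r = i < received.length ? received.charCodeAt(i) : 0;
    diff |= r ^ expected.charCodeAt(i);
  }
  return diff === 0;
}
```

Inside a Worker, this would be called with something like request.headers.get("cf-webhook-auth") and env.WEBHOOK_SECRET, rejecting the request (for example with a 401) when it returns false.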

Note

You must have a Cloudflare Pro plan (or higher) to use webhooks. This will not work on a free Cloudflare plan.

📣 Create health checks

Create a health check in the Cloudflare dashboard and set it to use your webhook. See below for what to name your health checks. Set it to trigger both when the monitored service becomes unhealthy and when it becomes healthy again.

🤫 Secrets

Create secrets for the Worker as follows:

  • WEBHOOK_SECRET - The shared secret you added when you created the webhook.
  • CF_API_TOKEN - A Cloudflare API token with read access to health checks.
  • CF_ZONE_ID - The ID of the Cloudflare zone that contains the health checks.
  • KEY_foo - The PagerDuty routing key for a service, where foo is the name of your service. See below.

📟 PagerDuty services

This script uses naming conventions to group and deduplicate alerts.

For a server foo, create a PagerDuty service foo (the name actually doesn't need to match).

If foo is running several services, say, web, SMTP, and IMAP, create three health checks named foo-web, foo-smtp, and foo-imap, one for each service. The first name component is used to deduplicate alerts, so multiple alerts on foo-* health checks will create just one PagerDuty incident and thus one page.
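To make the naming convention concrete, here is a minimal sketch of how a health check name could be turned into a PagerDuty Events API V2 trigger event. The helper names groupOf and buildTriggerEvent are hypothetical, as are the summary text, the "critical" severity, and the choice to use the bare group name as the dedup_key. The project's actual code may differ on all of these.

```typescript
type Severity = "critical" | "error" | "warning" | "info";

interface PagerDutyEvent {
  routing_key: string;
  event_action: "trigger" | "resolve";
  dedup_key: string;
  payload: { summary: string; source: string; severity: Severity };
}

// "foo-web" -> "foo". A name without a hyphen is treated as its own group.
function groupOf(checkName: string): string {
  const dash = checkName.indexOf("-");
  return dash === -1 ? checkName : checkName.slice(0, dash);
}

// Builds a trigger event whose dedup_key is shared by every foo-* check,
// so PagerDuty folds repeated triggers into one open incident.
function buildTriggerEvent(checkName: string, routingKey: string): PagerDutyEvent {
  const group = groupOf(checkName);
  return {
    routing_key: routingKey,
    event_action: "trigger",
    dedup_key: group, // assumption: the group name alone is the key
    payload: {
      summary: `Health check ${checkName} is unhealthy`, // illustrative text
      source: group,
      severity: "critical", // assumption
    },
  };
}
```

With this sketch, events built for foo-web and foo-imap carry the same dedup_key, which is what lets PagerDuty treat them as one incident.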

In PagerDuty, for each service, add an Integration and choose "Events API V2". This will create an integration key (routing key) for the service.

For each service (foo-*), create an environment variable (or secret) for your Worker called KEY_foo that holds this PagerDuty service routing key.
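The lookup this implies can be sketched as follows. The Bindings type and the routingKeyFor function are hypothetical names used for illustration, and the sketch assumes the Worker's environment can be read as a plain string-keyed object.

```typescript
// Hypothetical view of the Worker's environment: each binding is a string.
type Bindings = Record<string, string | undefined>;

// Maps a health check name such as "foo-web" to the value of KEY_foo.
// Returns undefined when no routing key is configured for that server.
function routingKeyFor(checkName: string, env: Bindings): string | undefined {
  const server = checkName.split("-")[0];
  return env[`KEY_${server}`];
}
```

A caller would typically skip the alert, or log it, when routingKeyFor returns undefined, since there is no PagerDuty service to send it to.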

Note

This works with a free PagerDuty account.

😴 Auto-resolving incidents

If CF_API_TOKEN and CF_ZONE_ID are set, a "healthy" alert from a check (sent when a check changes from unhealthy to healthy) causes the Worker to query Cloudflare's API for all health checks. If all foo-* health checks (everything on the same service/server) are now healthy, the message is sent to PagerDuty with a resolve action, which automatically marks the open incident as resolved. If some foo-* checks are still unhealthy, it is sent as a low-severity message instead, so the incident remains open. (Any further pages are determined by your PagerDuty escalation policy.)

Thus, if you reboot a server, the incident will close automatically when the server comes back up.
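The decision described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the project's code: the names HealthCheck, Decision, and decideOnRecovery are hypothetical, each check returned by Cloudflare's API is assumed to expose a name and a status that equals "healthy" when passing, and "low-severity" is assumed to mean a trigger event with severity "info".

```typescript
// Assumed subset of a health check as returned by Cloudflare's API.
interface HealthCheck {
  name: string;
  status: string; // assumed to be "healthy" when the check passes
}

type Decision =
  | { event_action: "resolve" }
  | { event_action: "trigger"; severity: "info" };

// Called when a check reports "healthy". Resolves the incident only if
// every check on the same server (same first name component) is healthy.
function decideOnRecovery(recoveredCheck: string, allChecks: HealthCheck[]): Decision {
  const server = recoveredCheck.split("-")[0];
  const siblings = allChecks.filter(
    (c) => c.name === server || c.name.startsWith(`${server}-`)
  );
  const allHealthy = siblings.every((c) => c.status === "healthy");
  return allHealthy
    ? { event_action: "resolve" }
    : { event_action: "trigger", severity: "info" };
}
```

Matching on the server name followed by a hyphen keeps a check such as food-web from being counted as part of foo.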

Important

If the call to the PagerDuty API fails, this script does not queue and retry the call. This should be added at some point (PRs welcome).

Contributors

i40west
