glados's Introduction

GLADOS

Glados (Generic Load Auditing of Servers) is a wrapper for Prometheus that allows simple Python functions to be fed into Prometheus / Grafana. The aim is to let users write Turrets, which monitor specific metrics and feed them into Prometheus. This allows simple, uncommon metrics to be pulled from a Turret, and the source of those metrics can be as simple as a JSON file (with time).

Goals

Several goals on the horizon for this project include:

  • Ability to write Turret plugins and "point" glados to those plugins to allow for non-native Turrets.
  • Create a Turret from a JSON file, thus allowing simple on-server JSON input.
  • Ability to write a looping script which emits JSON output every N seconds, and to point glados to a directory of such scripts so that a Turret is created from each one.
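The last goal could look roughly like the following. This is a minimal sketch, not glados code: the record layout (`time`, `metrics`), the `read_metrics` helper, and the one-JSON-object-per-line convention are all assumptions made for illustration.

```python
import json
import time


def read_metrics():
    """Hypothetical metric source; replace with a real measurement."""
    return {"load_1m": 0.42}


def emit(stream, iterations=None, interval=5.0, sleep=time.sleep):
    """Write one JSON object per line to `stream` every `interval` seconds.

    `iterations=None` loops forever; a finite count is handy for testing.
    """
    count = 0
    while iterations is None or count < iterations:
        record = {"time": time.time(), "metrics": read_metrics()}
        stream.write(json.dumps(record) + "\n")
        stream.flush()  # make each record visible to the reader immediately
        count += 1
        # Sleep between records, but not after the final one.
        if iterations is None or count < iterations:
            sleep(interval)
```

Calling `emit(sys.stdout)` would run until interrupted. Including the time in every record lets a reader tell how stale the last value is.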

License

Glados is released under the MIT License.

glados's People

Contributors

drjrm3

Stargazers

Dennis R Kennetz

glados's Issues

GitHub Actions should check version

version.py seems to go out of date unless the local UTs are re-run. GHA should check that the version is up to date with the local tags (at least up through the bugfix version).
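One way such a check could work is to compare the version string against the latest tag up through the bugfix component. The sketch below assumes tags look like `v1.2.3` (or `git describe --tags` output such as `v1.2.3-4-gabc123`) and that version.py holds a `major.minor.bugfix` string; neither is confirmed by the repository, and both function names are hypothetical.

```python
import re


def release_triple(text):
    """Return (major, minor, bugfix) from strings like 'v1.2.3' or '1.2.3-4-gabc123'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"no major.minor.bugfix version in {text!r}")
    return tuple(int(part) for part in match.groups())


def version_matches_tag(file_version, tag):
    """True when both strings agree up through the bugfix component."""
    return release_triple(file_version) == release_triple(tag)
```

A CI step could call `version_matches_tag` with the string read from version.py and the latest tag, and fail the build when it returns `False`.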

JSON / CSV turret

Add the ability to auto-build a turret from an input JSON (CSV?) file. This may require a rigid file structure and limit the flexibility of this method, but it will be a nice-to-have for the 'grab data on disk' method.

This should also include some template data for:

  • Nvidia data. GPU: {temperature, memory usage, utilization, etc.}
  • LSF (generic scheduler) data. Queue: {Jobs running, pending, exited, finished}
  • Govee sensor (generic temperature / humidity) monitoring. Location: {humidity, temperature}
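The template data above could be laid out as one JSON object per file, with a top-level time and one group per data source. This layout, the field names, and the `flatten` helper are assumptions for illustration only; the values are made-up placeholders, not measurements.

```python
import json

# Hypothetical template: one group per source, numeric fields only.
TEMPLATE = """
{
  "time": 1700000000,
  "GPU": {"temperature": 61, "memory_usage": 2048, "utilization": 37},
  "Queue": {"running": 12, "pending": 3, "exited": 1, "finished": 40},
  "Location": {"humidity": 45.5, "temperature": 21.0}
}
"""


def flatten(document):
    """Return (timestamp, {metric_name: value}) from the layout sketched above."""
    data = dict(document)  # copy so the caller's dict is not modified
    timestamp = data.pop("time", None)
    metrics = {}
    for group, fields in data.items():
        for field, value in fields.items():
            metrics[f"{group.lower()}_{field}"] = float(value)
    return timestamp, metrics


timestamp, metrics = flatten(json.loads(TEMPLATE))
```

A rigid structure like this keeps the parser trivial, at the cost of the flexibility noted above.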

NvidiaGpuTurret to gracefully accept more cases

NvidiaGpuTurret needs to handle more conditions gracefully, as it currently crashes on some of the following:

  • nvidia-smi output where a fan speed is N/A.
  • nvidia-smi output where a GPU is in a bad state (ERR! statuses in multiple places).
  • Add coverage report to readme.
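A tolerant field parser is one way to avoid crashing on N/A and ERR! values. The sketch below is not the NvidiaGpuTurret implementation; `parse_reading` is a hypothetical helper, and the exact strings nvidia-smi prints depend on its version and output mode, so treat the handled cases as assumptions.

```python
import re

# Leading number, optionally preceded by "[" and followed by a unit such as "%" or "C".
_NUMBER = re.compile(r"^\s*\[?\s*(-?\d+(?:\.\d+)?)")


def parse_reading(raw):
    """Convert one nvidia-smi field to float, or None when no number is present."""
    match = _NUMBER.match(raw)
    return float(match.group(1)) if match else None
```

Returning `None` instead of raising lets the caller skip a single metric for a GPU in a bad state while still reporting the others.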

Add getter / setter methods for some Turret variables

Some Turret variables could use getter / setter methods.

Prime example:

self._fileName is checked for existence in Turret, but not when it is assigned in derived classes. We should instead add a self.fileName getter / setter and perform this check inside the setter.
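A Python property is one way to do this. The classes below are minimal stand-ins rather than the real Turret code; `JsonTurret` and the constructor signatures are hypothetical.

```python
import os


class Turret:
    """Minimal stand-in for the real Turret base class (hypothetical)."""

    def __init__(self, fileName=None):
        self._fileName = None
        if fileName is not None:
            self.fileName = fileName  # goes through the setter below

    @property
    def fileName(self):
        return self._fileName

    @fileName.setter
    def fileName(self, path):
        # The existence check now runs on every assignment, including
        # assignments made in derived classes.
        if not os.path.exists(path):
            raise FileNotFoundError(f"Turret input file not found: {path}")
        self._fileName = path


class JsonTurret(Turret):
    """Hypothetical derived class; its assignment is validated automatically."""

    def __init__(self, path):
        super().__init__()
        self.fileName = path
```

Derived classes only get the check if they assign `self.fileName`; writing to `self._fileName` directly would still bypass it.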

LIM turret

Add the ability to check for output from all processes running under the same user as glados with names ending in .turret, get that output, and feed it into glados as a turret.

This will be useful for Turrets running on the same server as glados but may not be useful for turrets running on other servers. That said, other servers could run the JSON / CSV turret, and the LIM turret could simply collect those files and feed them to glados, along with a timestamp of when each was last updated as a heartbeat.

TurretGauge to support timestamps

GaugeMetricFamily's add_metric function allows for timestamps. This should be leveraged in the code to handle offline input files.
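As a sketch of what this could look like with `prometheus_client`, the snippet below passes a timestamp through `add_metric`. The `gauge_from_record` helper, the metric name, the `source` label, and the values are illustrative assumptions, not part of glados.

```python
from prometheus_client.core import GaugeMetricFamily


def gauge_from_record(name, documentation, value, timestamp):
    """Build a gauge family holding one sample stamped with the record's own time."""
    gauge = GaugeMetricFamily(name, documentation, labels=["source"])
    # `timestamp` is Unix time in seconds, taken from the input file
    # rather than from the moment of the scrape.
    gauge.add_metric(["offline_file"], value, timestamp=timestamp)
    return gauge


gauge = gauge_from_record(
    "room_temperature_celsius",
    "Temperature read from an offline input file",
    21.0,
    1700000000.0,
)
```

A custom collector could yield such a family from its `collect()` method, so that the stored sample carries the time the file was written.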
