Comments (7)

anmolbabu commented on August 30, 2024

Attached to this comment is an architecture proposal. Please review it and share your suggestions.
MonitoringArchitectureProposal.pdf

from documentation.

anmolbabu commented on August 30, 2024

This has been picked up as https://tendrl.atlassian.net/browse/TEN-49.

jcsp commented on August 30, 2024

How is HA handled for the "Monitoring Application" block? Presumably things like the "Alerting" block need to be running on exactly one node; what orchestrates that?

brainfunked commented on August 30, 2024

@anmolbabu I'm looking at the monitoring stack as an add-on to the tendrl core stack. This means that the tendrl core stack would not depend on the monitoring stack and would be able to carry out its responsibilities and operations with or without it.

I currently expect the monitoring stack to provide the following functionality:

  • Performance monitoring
  • Time-series data
  • Administrative alerts

In the future, it may be extended to also support different external monitoring and alerting systems.

Tendrl core itself would need to have some built-in monitoring capabilities. These could vary from system to system, but at the very least, health monitoring should be part of tendrl core. This means that for ceph and gluster as our current targets, we'll need to draw up a list of all the data points and then categorise them into tendrl core (and in core, whether bridge or node agent) and the monitoring stack.

Also, wrt the architecture itself: as a separate stack, the monitoring application, its database etc. would have to be completely separated from the tendrl application. The integration between the two stacks could be either via the tendrl apis or directly via etcd. Keep in mind that tendrl core does not call any external system, which means the monitoring stack would likely inject the relevant data into etcd so that tendrl core can make use of it.

I would suggest the following action items on this:

  • Document the data points and categorise them.
  • Evaluate the integration between the monitoring and the core stacks in terms of the etcd protocol, api requirements etc.
  • Evaluate the HA concerns around the monitoring stack deployment.

anmolbabu commented on August 30, 2024

TendrlMonitoring.pdf
Attached to this comment is an updated architecture proposal, in line with the comments and discussions received. Please review it and share your suggestions.

anmolbabu commented on August 30, 2024

Minutes of conversations with @nthomas-redhat @brainfunked @shtripat @anupnivargi :

Alerts:

  1. node agent is responsible for transporting the alerts to etcd
  2. system state related alerts can be gathered from systemd, storaged etc.
  3. cluster state related alerts can be generated by the bridges
  4. performance and threshold monitoring alerts are generated by collectd
    All of these need to end with the node agent, because that is the single channel that takes alerts to etcd.

Alert meta-data:

  1. every component should state who generated the alert
  2. the node agent can add the host id
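
A minimal sketch of the two metadata points above, assuming a hypothetical `Alert` record. The field names (`source`, `message`, `host_id`) and the helper `stamp_host_id` are illustrative only and are not taken from tendrl:

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; all field names are assumptions."""
    source: str                    # component that generated the alert, e.g. "collectd"
    message: str
    host_id: Optional[str] = None  # left empty by the generator, filled in by the node agent


def stamp_host_id(alert: Alert, host_id: str) -> Alert:
    """Node-agent step: attach the host id without touching the source."""
    return replace(alert, host_id=host_id)


raw = Alert(source="collectd", message="cpu usage above threshold")
stamped = stamp_host_id(raw, host_id="node-1")
```

Keeping the record immutable means the generating component's `source` cannot be overwritten on the way to etcd; the node agent only ever adds to it.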

Data Paths:

  1. one socket, which collectd connects to for writes, the node agent for read-only access and the bridge
    for read-write access
  2. any alert on that socket is always taken to etcd by the node agent
  3. bridge can read the alerts and act on only the ones it can act on, ignore the rest
  4. the bridge can also generate its own alerts and put them on the socket for the node agent to transport to etcd
  5. The alerting application will be responsible for:
    • watching alerts in etcd's /alerts directory and sending out mails, sms etc.
    • invoking the tendrl api callback to notify the tendrl api of a new alert (the tendrl api will then
      notify the ui).
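
The data paths above could be sketched roughly as follows. This is only an illustration under stated assumptions: a plain list stands in for the shared socket, a dict stands in for etcd's /alerts directory, and the function names and the `/alerts/<host_id>/<index>` key layout are hypothetical rather than anything agreed in the discussion:

```python
from typing import Callable, Dict, List

# In-memory stand-ins (assumptions): a list plays the shared socket,
# a dict plays etcd's /alerts directory.
Alert = Dict[str, str]


def bridge_handle(alerts: List[Alert], can_act: Callable[[Alert], bool]) -> List[Alert]:
    """Bridge: pick out only the alerts it can act on, ignore the rest."""
    return [a for a in alerts if can_act(a)]


def node_agent_forward(alerts: List[Alert], etcd: Dict[str, Alert], host_id: str) -> None:
    """Node agent: take every alert on the socket to etcd, adding the host id."""
    for i, alert in enumerate(alerts):
        # Hypothetical key layout under /alerts
        etcd[f"/alerts/{host_id}/{i}"] = {**alert, "host_id": host_id}


socket_alerts = [
    {"source": "collectd", "message": "disk latency above threshold"},
    {"source": "bridge", "message": "cluster health degraded"},
]
etcd_store: Dict[str, Alert] = {}

acted_on = bridge_handle(socket_alerts, lambda a: a["source"] == "bridge")
node_agent_forward(socket_alerts, etcd_store, host_id="node-1")
```

Note that the bridge's filtering does not affect what reaches etcd: the node agent forwards both alerts regardless of which ones the bridge chose to act on.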

Packaging:

  1. All sds-specific plugins and corresponding templates will be maintained as part of sds-bridge, but they will be separate packages (sds-bridge and sds-monitoring).
  2. All physical resource plugins and corresponding templates will be maintained as part of node-agent, and they will again be separate packages (node-agent and node-monitoring).
  3. Anything generic and related to monitoring will be part of bridge common, and the packaging will be separated into bridge-common and common-monitoring.
  4. The alerting module will be maintained in a separate repository and packaged separately, but installed along with the tendrl core stack.
  5. The monitoring aggregator and the monitoring api layer will be maintained in a separate repository and packaged separately.

anmolbabu commented on August 30, 2024

A PR with the agreed-upon architecture has been raised at #51.
