container-node_exporter-textfiles

NOTICE: The repository and image were renamed on 18.02.2022. Details follow.
The container image name has been changed from `galexrt/node-exporter-smartmon` to **`galexrt/node-exporter-textfiles`**.
E.g., the image on quay.io is now available at `quay.io/galexrt/node-exporter-textfiles`.

The repository name has been changed from `container-node_exporter-smartmon` to **`container-node_exporter-textfiles`**.

This has been done as the image in this repository is not just for running the `smartmon` textfile collector script(s) anymore.

Container image for easily running the textfile collector scripts from the Prometheus Community, whose output is then collected by prometheus/node_exporter.

Container Image available from:

Container Image Tags:

  • main - Latest build of the main branch.
  • vYYYYmmdd-HHMMSS-NNN - Latest build of the application with date of the build.

Credits

This docker image contains the prometheus-community/node-exporter-textfile-collector-scripts repository, so that any script can be easily used with / from this Docker image.

Usage

This Docker image needs to run in privileged mode for most of the scripts in the prometheus-community/node-exporter-textfile-collector-scripts repository. For example, smartmon.sh needs privileged mode to be able to collect the SMART values.

The entrypoint script writes the output to the directory /var/lib/node_exporter. By default, the output file is named after the script being run.
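The behaviour described above can be pictured roughly as the following loop. This is a minimal sketch and not the image's actual entrypoint: the `OUTPUT_DIR` variable, the `run_once` / `run_loop` function names, the write-then-rename step and treating `INTERVAL` as seconds are all assumptions made for illustration. `SCRIPT`, `OUTPUT_FILENAME` and `INTERVAL` are the variables from the table below.

```shell
#!/bin/sh
# Hypothetical sketch of a collector loop; NOT the image's real entrypoint.
SCRIPT="${SCRIPT:-smartmon.sh}"
OUTPUT_DIR="${OUTPUT_DIR:-/var/lib/node_exporter}"  # assumed variable name
OUTPUT_FILENAME="${OUTPUT_FILENAME:-smartmon}"
INTERVAL="${INTERVAL:-300}"                         # assumed to be seconds

# Run the script once and publish its output as <OUTPUT_FILENAME>.prom.
# The output goes to a temporary file first and is then renamed, so a
# reader never sees a half-written file (a design choice of this sketch).
run_once() {
    tmp="$OUTPUT_DIR/.$OUTPUT_FILENAME.prom.tmp"
    if "$SCRIPT" "$@" > "$tmp"; then
        mv "$tmp" "$OUTPUT_DIR/$OUTPUT_FILENAME.prom"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Repeat forever, sleeping INTERVAL between runs.
run_loop() {
    while true; do
        run_once "$@" || echo "run of $SCRIPT failed" >&2
        sleep "$INTERVAL"
    done
}
```

Calling `run_loop "$@"` at the end of such a script would start the loop and forward the container arguments to the script.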

Variables

| Name | Default | Description |
| --- | --- | --- |
| `SCRIPT` | `smartmon.sh` | Which of the node_exporter textfile collector scripts to run. |
| `OUTPUT_FILENAME` | `smartmon` | Name of the output file (do not add the `.prom` file ending). |
| `INTERVAL` | `300` | Interval at which to run the script. |

Any flags / args given to the container are passed to the SCRIPT that will be executed.
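Putting the pieces above together, a container start might look roughly like this. It is a sketch under assumptions: the `main` tag on quay.io, the container name, and the host directory `/var/lib/node_exporter` being shared with node_exporter are illustrative choices, and `run_textfiles_container` and the `DOCKER` override are hypothetical helpers.

```shell
# Sketch only: image tag, container name and volume layout are assumptions.
IMAGE="${IMAGE:-quay.io/galexrt/node-exporter-textfiles:main}"

# Start the container in privileged mode with the documented variables set
# to their defaults. Extra arguments are passed on to the container.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
run_textfiles_container() {
    "${DOCKER:-docker}" run -d \
        --name node-exporter-textfiles \
        --privileged \
        -e SCRIPT=smartmon.sh \
        -e OUTPUT_FILENAME=smartmon \
        -e INTERVAL=300 \
        -v /var/lib/node_exporter:/var/lib/node_exporter \
        "$IMAGE" "$@"
}
```

Calling `DOCKER=echo run_textfiles_container` prints the command instead of running it. Any extra arguments are forwarded to the container and therefore to `SCRIPT`.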

Grafana Dashboard

A Grafana dashboard for viewing smartmon metrics can be found in the grafana/ directory and on Grafana.com as "Smartmon Textfile Dashboard" by Galexrt.

Required node_exporter Configuration

See node_exporter Configuration Prerequisites.
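In short, node_exporter's textfile collector has to read the same directory this image writes to. A minimal sketch, assuming node_exporter runs directly on the host with the binary on `PATH` (the `start_node_exporter` helper and the `NODE_EXPORTER` / `TEXTFILE_DIR` variables are hypothetical names); the linked document remains the authoritative reference:

```shell
# Assumption: this must match the directory the container writes to.
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/lib/node_exporter}"

# Start node_exporter with the textfile collector pointed at TEXTFILE_DIR.
# NODE_EXPORTER can be overridden (e.g. NODE_EXPORTER=echo) to preview the
# command instead of running it.
start_node_exporter() {
    "${NODE_EXPORTER:-node_exporter}" \
        --collector.textfile.directory="$TEXTFILE_DIR" "$@"
}
```

Once node_exporter is running, the `smartmon_*` series should show up on its metrics endpoint (port 9100 by default).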

Kubernetes

The following deployment / installation methods are available:

container-node_exporter-textfiles's People

Contributors

galexrt, mje-nz, renovate[bot], runningman84, thisisqasim


container-node_exporter-textfiles's Issues

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

dockerfile
Dockerfile
github-actions
.github/workflows/build.yml
  • actions/checkout v4
  • docker/login-action v3.1.0
  • docker/login-action v3.1.0
.github/workflows/helm-publish.yml
  • actions/checkout v4
  • azure/setup-helm v4
  • helm/chart-releaser-action v1.6.0
.github/workflows/weekly-release.yml
  • actions/checkout v4
  • docker/login-action v3.1.0
  • docker/login-action v3.1.0
helm-values
charts/node-exporter-textfiles/values.yaml

Link in README.md is broken ("Required node_exporter Configuration")

Link in README.md is broken

Required node_exporter Configuration

See [node_exporter Configuration Prerequisites](https://github.com/galexrt/container-node_exporter-textfiles/blob/main/node_exporter.md).

I am new to node-exporter-textfiles and am looking into retrieving smartmon data. I am wondering why the textfile is passed through node-exporter instead of just running a slim web server that serves the file contents. Since I am running several Prometheus-related containers (node-exporter, cadvisor) in Docker Swarm anyway, Prometheus could just scrape it directly.

Error: failed to create containerd task

I'm running into this error while deploying node_exporter-textfiles:

Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:75: mounting "/var/snap/microk8s/common/var/lib/kubelet/pods/66d75cbb-0865-4bd3-a995-5a53ceba006b/containers/smartmon/ff83452c" to rootfs at "/dev/termination-log" caused: open /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/smartmon/rootfs/dev/termination-log: read-only file system: unknown

It's a 3-node physical cluster running MicroK8s. The node OS is Ubuntu LTS and smarttools is installed.

My values.yaml:

textfiles:
  # List of textfiles script that should be run in the daemonset as separate containers
  - name: smartmon
    script: smartmon.py
    # If unset, will default to the `name`
    outputFilename: "smartmon"
    # Defaults to `300`
    interval: 900
    additionalEnv: []
    additionalVolumeMounts:
      - mountPath: /dev
        name: host-dev
        readOnly: true
    resources: {}

Container Image Missing Python

After pulling the repo and installing via...

microk8s helm3 install -f ./values.yaml node-exporter-textfiles .

The resultant image does not work. Containers init, but that's it. Logs show the following.

Starting smartmon.sh loop ...
/usr/bin/env: 'python3': No such file or directory

I've traced this to an issue with the Docker image build - pulling python3 isn't working.

root@nuc0:~/node-exporter-textfiles/charts/node-exporter-textfiles# microk8s kubectl exec --stdin --tty node-exporter-textfiles-5hc5q -- /bin/bash
root@node-exporter-textfiles-5hc5q:/# which python3
root@node-exporter-textfiles-5hc5q:/# apt-get install python3
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package python3
root@node-exporter-textfiles-5hc5q:/# apt -qq update
36 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@node-exporter-textfiles-5hc5q:/# apt-get install python3
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
bzip2 file libmagic-mgc libmagic1 libmpdec2 libpython3-stdlib libpython3.7-minimal libpython3.7-stdlib libreadline7 libsqlite3-0 mime-support python3-minimal python3.7
python3.7-minimal readline-common xz-utils
Suggested packages:
bzip2-doc python3-doc python3-tk python3-venv python3.7-venv python3.7-doc binutils binfmt-support readline-doc
The following NEW packages will be installed:
bzip2 file libmagic-mgc libmagic1 libmpdec2 libpython3-stdlib libpython3.7-minimal libpython3.7-stdlib libreadline7 libsqlite3-0 mime-support python3 python3-minimal python3.7
python3.7-minimal readline-common xz-utils
0 upgraded, 17 newly installed, 0 to remove and 36 not upgraded.
Need to get 6153 kB of archives.
After this operation, 31.6 MB of additional disk space will be used.
Do you want to continue? [Y/n]

does not work for nvme disk

The exporter seems to generate metrics for sata disks but fails for my nvme hosts:

root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# cat smartmon.prom  
# HELP smartmon_smartctl_version SMART metric smartctl_version
# TYPE smartmon_smartctl_version gauge
smartmon_smartctl_version{version="6.6"} 1
# HELP smartmon_device_active SMART metric device_active
# TYPE smartmon_device_active gauge
smartmon_device_active{disk="/dev/sda",type="scsi"} 1
smartmon_device_active{disk="/dev/sdb",type="scsi"} 1
smartmon_device_active{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_device_info SMART metric device_info
# TYPE smartmon_device_info gauge
smartmon_device_info{disk="/dev/sda",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000001000000010x60000000000000000e00000000010001",model_family="",device_model="",serial_number="",firmware_version=""} 1
smartmon_device_info{disk="/dev/sdb",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000002000000010x60000000000000000e00000000020001",model_family="",device_model="",serial_number="",firmware_version=""} 1
smartmon_device_info{disk="/dev/sdc",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000003000000010x60000000000000000e00000000030001",model_family="",device_model="",serial_number="",firmware_version=""} 1
# HELP smartmon_device_smart_available SMART metric device_smart_available
# TYPE smartmon_device_smart_available gauge
smartmon_device_smart_available{disk="/dev/sda",type="scsi"} 1
smartmon_device_smart_available{disk="/dev/sdb",type="scsi"} 1
smartmon_device_smart_available{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_device_smart_enabled SMART metric device_smart_enabled
# TYPE smartmon_device_smart_enabled gauge
smartmon_device_smart_enabled{disk="/dev/sda",type="scsi"} 0
smartmon_device_smart_enabled{disk="/dev/sdb",type="scsi"} 0
smartmon_device_smart_enabled{disk="/dev/sdc",type="scsi"} 0
# HELP smartmon_device_smart_healthy SMART metric device_smart_healthy
# TYPE smartmon_device_smart_healthy gauge
smartmon_device_smart_healthy{disk="/dev/sda",type="scsi"} 1
smartmon_device_smart_healthy{disk="/dev/sdb",type="scsi"} 1
smartmon_device_smart_healthy{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_smartctl_run SMART metric smartctl_run
# TYPE smartmon_smartctl_run gauge
smartmon_smartctl_run{disk="/dev/sda",type="scsi"} 1644098182
smartmon_smartctl_run{disk="/dev/sdb",type="scsi"} 1644098190
smartmon_smartctl_run{disk="/dev/sdc",type="scsi"} 1644098198
root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# smartctl -a /dev/nvme0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.4.0-97-generic] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 980 500GB
Serial Number:                      S64DNF0R967004T
Firmware Version:                   1B4QFXO7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 500,107,862,016 [500 GB]
Unallocated NVM Capacity:           0
Controller ID:                      5
Number of Namespaces:               1
Namespace 1 Size/Capacity:          500,107,862,016 [500 GB]
Namespace 1 Utilization:            125,461,372,928 [125 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 d911b194f9
Local Time is:                      Sat Feb  5 22:01:21 2022 UTC
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0055):     Comp DS_Mngmt Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius
Namespace 1 Features (0x10):        *Other*

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.24W       -        -    0  0  0  0        0       0
 1 +     4.49W       -        -    1  1  1  1        0       0
 2 +     2.19W       -        -    2  2  2  2        0     500
 3 -   0.0500W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     1000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        53 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    2%
Data Units Read:                    2,343,146 [1.19 TB]
Data Units Written:                 9,288,707 [4.75 TB]
Host Read Commands:                 55,157,318
Host Write Commands:                351,969,224
Controller Busy Time:               891
Power Cycles:                       3
Power On Hours:                     2,382
Unsafe Shutdowns:                   0
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    2
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               53 Celsius
Temperature Sensor 2:               59 Celsius
Thermal Temp. 2 Transition Count:   670
Thermal Temp. 2 Total Time:         124

Error Information (NVMe Log 0x01, max 64 entries)
No Errors Logged

I also have some virtual disks created by Longhorn in k8s. They should not appear in the exporter output because they have no SMART values anyway:

root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# smartctl /dev/sda  
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.4.0-97-generic] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org


Integrate with Prometheus AlertManager

I would like to integrate this container with Prometheus AlertManager to receive notifications when a hard drive is unhealthy.

What are the steps besides the basic configs to get this working?

@galexrt do you have an example of how to get this running? Maybe you are running this yourself.

If you need any more info, please let me know!

Thanks in advance.
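One possible starting point, offered as a sketch and not as the maintainer's setup: a Prometheus alerting rule on the `smartmon_device_smart_healthy` metric that the smartmon script emits, which Alertmanager can then route to a notification channel. The alert name, the `15m` duration, the severity label and the `write_smartmon_rules` helper are assumptions.

```shell
# Writes a hypothetical Prometheus alerting rule file to the path given as
# the first argument. The heredoc delimiter is quoted so that the template
# expression {{ $labels.disk }} is written literally.
write_smartmon_rules() {
    cat > "$1" <<'EOF'
groups:
  - name: smartmon
    rules:
      - alert: SmartDeviceUnhealthy
        # Fires when a disk reports a failed SMART health check.
        expr: smartmon_device_smart_healthy == 0
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "SMART health check failed for {{ $labels.disk }}"
EOF
}
```

The resulting file would be listed under `rule_files` in the Prometheus configuration, with the Alertmanager address configured under `alerting`.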

no matches for kind "PodSecurityPolicy" in version "extensions/v1beta1"

Installing from Helm on Kubernetes v1.22 gives an error:

Release "node-exporter-textfiles" does not exist. Installing it now.
Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "node-exporter-textfiles" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "extensions/v1beta1"
ensure CRDs are installed first

These are the api-resources I have:

$ kubectl api-resources | grep -i PodSecurityPolicy
podsecuritypolicies               psp                 policy/v1beta1                         false        PodSecurityPolicy

So I think you need to change extensions/v1beta1 to policy/v1beta1 here

Unable to deploy image since end of October

On my personal server, I’m unable to deploy any version of this image newer than v20201022-124227-096.
Any attempt to deploy a newer version than that (tested just now with v20201102-203151-584; for some reason I can’t pull the Docker images from Oct 25 and Oct 29 right now) ends up with the container failing to start with the error message `Script smartmon.sh doesn't exist. Exiting 1.`

I have not changed any configuration on the server related to how the image is deployed, so I wonder if something changed in the image itself? Is there any configuration change necessary on my side that I somehow missed?

Thank you!

Grafana: changed query field

The temp fields reported by smartctl do not match the grafana query.

Smartctl reports smartmon_temperature_celsius_raw_value
The graph is looking for smartmon_airflow_temperature_cel_value

smartctl version 6.4 2014-10-07 r4002 [x86_64-linux-4.19.0-0.bpo.2-amd64]
image digest: 7c54c3be8aa5

Happy to make the PR myself, just not sure if it's a global issue or not
