galexrt / container-node_exporter-textfiles
Container Image for easily running textfile exporter scripts to be collected by the prometheus/node_exporter.
License: MIT License
It would be great to get a hint on how to use this exporter with the Prometheus Operator Helm chart.
The temp fields reported by smartctl do not match the grafana query.
Smartctl reports smartmon_temperature_celsius_raw_value
The graph is looking for smartmon_airflow_temperature_cel_value
smartctl version 6.4 2014-10-07 r4002 [x86_64-linux-4.19.0-0.bpo.2-amd64]
image digest: 7c54c3be8aa5
Happy to make the PR myself, just not sure if it's a global issue or not
I'm running into this error while deploying node_exporter-textfiles:
Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:75: mounting "/var/snap/microk8s/common/var/lib/kubelet/pods/66d75cbb-0865-4bd3-a995-5a53ceba006b/containers/smartmon/ff83452c" to rootfs at "/dev/termination-log" caused: open /var/snap/microk8s/common/run/containerd/io.containerd.runtime.v2.task/k8s.io/smartmon/rootfs/dev/termination-log: read-only file system: unknown
It's a 3-node physical cluster running MicroK8s. The node OS is Ubuntu LTS and smarttools is installed.
My values.yaml:
textfiles:
  # List of textfiles script that should be run in the daemonset as separate containers
  - name: smartmon
    script: smartmon.py
    # If unset, will default to the `name`
    outputFilename: "smartmon"
    # Defaults to `300`
    interval: 900
    additionalEnv: []
    additionalVolumeMounts:
      - mountPath: /dev
        name: host-dev
        readOnly: true
    resources: {}
After pulling the repo and installing via...
microk8s helm3 install -f ./values.yaml node-exporter-textfiles .
The resultant image does not work. Containers init, but that's it. Logs show the following.
Starting smartmon.sh loop ...
/usr/bin/env: 'python3': No such file or directory
I've traced this to an issue with the Docker image build - pulling python3 isn't working.
root@nuc0:~/node-exporter-textfiles/charts/node-exporter-textfiles# microk8s kubectl exec --stdin --tty node-exporter-textfiles-5hc5q -- /bin/bash
root@node-exporter-textfiles-5hc5q:/# which python3
root@node-exporter-textfiles-5hc5q:/# apt-get install python3
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package python3
root@node-exporter-textfiles-5hc5q:/# apt -qq update
36 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@node-exporter-textfiles-5hc5q:/# apt-get install python3
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
bzip2 file libmagic-mgc libmagic1 libmpdec2 libpython3-stdlib libpython3.7-minimal libpython3.7-stdlib libreadline7 libsqlite3-0 mime-support python3-minimal python3.7
python3.7-minimal readline-common xz-utils
Suggested packages:
bzip2-doc python3-doc python3-tk python3-venv python3.7-venv python3.7-doc binutils binfmt-support readline-doc
The following NEW packages will be installed:
bzip2 file libmagic-mgc libmagic1 libmpdec2 libpython3-stdlib libpython3.7-minimal libpython3.7-stdlib libreadline7 libsqlite3-0 mime-support python3 python3-minimal python3.7
python3.7-minimal readline-common xz-utils
0 upgraded, 17 newly installed, 0 to remove and 36 not upgraded.
Need to get 6153 kB of archives.
After this operation, 31.6 MB of additional disk space will be used.
Do you want to continue? [Y/n]
On my personal server, I'm unable to deploy any version of this image newer than v20201022-124227-096.
Any attempt to deploy a newer version than that (tested just now with v20201102-203151-584; for some reason I can't pull the Docker images from Oct 25 and Oct 29 right now) ends with the container failing to start with the error message Script smartmon.sh doesn't exist. Exiting 1
I have not changed any configuration on the server related to how the image is deployed, so I wonder if something changed in the image itself? Is there any configuration change necessary on my side that I somehow missed?
Thank you!
Hello,
Would you please consider creating a Helm chart repo with GitHub Pages? Then it would be possible to easily install the Helm chart and follow updates.
https://helm.sh/docs/howto/chart_releaser_action/
I can help you, but it's quite straightforward.
The exporter seems to generate metrics for SATA disks but fails for my NVMe hosts:
root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# cat smartmon.prom
# HELP smartmon_smartctl_version SMART metric smartctl_version
# TYPE smartmon_smartctl_version gauge
smartmon_smartctl_version{version="6.6"} 1
# HELP smartmon_device_active SMART metric device_active
# TYPE smartmon_device_active gauge
smartmon_device_active{disk="/dev/sda",type="scsi"} 1
smartmon_device_active{disk="/dev/sdb",type="scsi"} 1
smartmon_device_active{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_device_info SMART metric device_info
# TYPE smartmon_device_info gauge
smartmon_device_info{disk="/dev/sda",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000001000000010x60000000000000000e00000000010001",model_family="",device_model="",serial_number="",firmware_version=""} 1
smartmon_device_info{disk="/dev/sdb",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000002000000010x60000000000000000e00000000020001",model_family="",device_model="",serial_number="",firmware_version=""} 1
smartmon_device_info{disk="/dev/sdc",type="scsi",vendor="IET",product="VIRTUAL-DISK",revision="0001",lun_id="0x30000003000000010x60000000000000000e00000000030001",model_family="",device_model="",serial_number="",firmware_version=""} 1
# HELP smartmon_device_smart_available SMART metric device_smart_available
# TYPE smartmon_device_smart_available gauge
smartmon_device_smart_available{disk="/dev/sda",type="scsi"} 1
smartmon_device_smart_available{disk="/dev/sdb",type="scsi"} 1
smartmon_device_smart_available{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_device_smart_enabled SMART metric device_smart_enabled
# TYPE smartmon_device_smart_enabled gauge
smartmon_device_smart_enabled{disk="/dev/sda",type="scsi"} 0
smartmon_device_smart_enabled{disk="/dev/sdb",type="scsi"} 0
smartmon_device_smart_enabled{disk="/dev/sdc",type="scsi"} 0
# HELP smartmon_device_smart_healthy SMART metric device_smart_healthy
# TYPE smartmon_device_smart_healthy gauge
smartmon_device_smart_healthy{disk="/dev/sda",type="scsi"} 1
smartmon_device_smart_healthy{disk="/dev/sdb",type="scsi"} 1
smartmon_device_smart_healthy{disk="/dev/sdc",type="scsi"} 1
# HELP smartmon_smartctl_run SMART metric smartctl_run
# TYPE smartmon_smartctl_run gauge
smartmon_smartctl_run{disk="/dev/sda",type="scsi"} 1644098182
smartmon_smartctl_run{disk="/dev/sdb",type="scsi"} 1644098190
smartmon_smartctl_run{disk="/dev/sdc",type="scsi"} 1644098198
root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# smartctl -a /dev/nvme0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.4.0-97-generic] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 980 500GB
Serial Number: S64DNF0R967004T
Firmware Version: 1B4QFXO7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 500,107,862,016 [500 GB]
Unallocated NVM Capacity: 0
Controller ID: 5
Number of Namespaces: 1
Namespace 1 Size/Capacity: 500,107,862,016 [500 GB]
Namespace 1 Utilization: 125,461,372,928 [125 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 d911b194f9
Local Time is: Sat Feb 5 22:01:21 2022 UTC
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0055): Comp DS_Mngmt Sav/Sel_Feat Timestmp
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x10): *Other*
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 5.24W - - 0 0 0 0 0 0
1 + 4.49W - - 1 1 1 1 0 0
2 + 2.19W - - 2 2 2 2 0 500
3 - 0.0500W - - 3 3 3 3 210 1200
4 - 0.0050W - - 4 4 4 4 1000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 53 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 2%
Data Units Read: 2,343,146 [1.19 TB]
Data Units Written: 9,288,707 [4.75 TB]
Host Read Commands: 55,157,318
Host Write Commands: 351,969,224
Controller Busy Time: 891
Power Cycles: 3
Power On Hours: 2,382
Unsafe Shutdowns: 0
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 2
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 53 Celsius
Temperature Sensor 2: 59 Celsius
Thermal Temp. 2 Transition Count: 670
Thermal Temp. 2 Total Time: 124
Error Information (NVMe Log 0x01, max 64 entries)
No Errors Logged
I also have some virtual disks created by Longhorn in k8s. They should not appear in the exporter output, because they have no SMART values anyway:
root@node-exporter-smartmon-2k9qk:/var/lib/node_exporter# smartctl /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.4.0-97-generic] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
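To illustrate what NVMe support could involve, here is a minimal sketch that pulls a few numeric fields out of `smartctl -a` output like the listing above and renders them as textfile lines. It is not the repository's smartmon script: the field mapping `NVME_FIELDS`, the `smartmon_nvme_` metric prefix, and both function names are assumptions made for this example.

```python
import re

# Hypothetical mapping from smartctl NVMe field labels to metric suffixes.
NVME_FIELDS = {
    "Temperature": "temperature_celsius",
    "Percentage Used": "percentage_used",
    "Power On Hours": "power_on_hours",
    "Media and Data Integrity Errors": "media_errors",
}


def parse_nvme_health(smartctl_output: str) -> dict:
    """Extract the fields listed in NVME_FIELDS from `smartctl -a` NVMe output."""
    values = {}
    for line in smartctl_output.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Label: value" line
        name = NVME_FIELDS.get(label.strip())
        if name is None:
            continue  # a field this sketch does not export
        # Take the leading number and drop thousands separators ("2,382").
        match = re.match(r"\s*([\d,]+)", rest)
        if match:
            values[name] = int(match.group(1).replace(",", ""))
    return values


def to_prom_lines(disk: str, values: dict) -> list:
    """Render values as textfile lines (the metric names are illustrative only)."""
    return [
        f'smartmon_nvme_{name}{{disk="{disk}",type="nvme"}} {value}'
        for name, value in sorted(values.items())
    ]


# A few lines taken from the smartctl output quoted in this issue.
sample = """\
Temperature:                        53 Celsius
Percentage Used:                    2%
Power On Hours:                     2,382
Media and Data Integrity Errors:    0
Temperature Sensor 1:               53 Celsius
"""

for prom_line in to_prom_lines("/dev/nvme0", parse_nvme_health(sample)):
    print(prom_line)
```

The same lookup-by-label approach would also let a script skip devices that report none of the mapped fields, such as the Longhorn virtual disks mentioned above.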
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
This repository currently has no open or pending branches.
Dockerfile
.github/workflows/build.yml
actions/checkout v4
docker/login-action v3.1.0
docker/login-action v3.1.0
.github/workflows/helm-publish.yml
actions/checkout v4
azure/setup-helm v4
helm/chart-releaser-action v1.6.0
.github/workflows/weekly-release.yml
actions/checkout v4
docker/login-action v3.1.0
docker/login-action v3.1.0
charts/node-exporter-textfiles/values.yaml
Installing from Helm on Kubernetes v1.22 gives error:
Release "node-exporter-textfiles" does not exist. Installing it now.
Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "node-exporter-textfiles" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "extensions/v1beta1"
ensure CRDs are installed first
These are the api-resources I have:
$ kubectl api-resources | grep -i PodSecurityPolicy
podsecuritypolicies psp policy/v1beta1 false PodSecurityPolicy
So I think you need to change extensions/v1beta1 to policy/v1beta1 here
Link in README.md is broken
Required node_exporter Configuration
See [node_exporter Configuration Prerequisites](https://github.com/galexrt/container-node_exporter-textfiles/blob/main/node_exporter.md).
I am new to node-exporter-textfiles and am looking into retrieving smartmon data. I am wondering why the textfile is passed through node-exporter instead of just running a slim web server that provides the file contents. Since I am running several Prometheus-related containers (node-exporter, cadvisor) in Docker Swarm anyway, Prometheus could just scrape it directly.
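For comparison, the "slim web server" idea from this question could look roughly like the sketch below, which serves a single `.prom` file at `/metrics` using only the Python standard library. This is not something the image provides: `make_handler`, `serve`, and port 9101 are assumptions for this example, and the file path in the usage note is the one shown in the NVMe issue.

```python
import http.server
import pathlib


def make_handler(prom_file: str):
    """Build a request handler that serves one textfile at /metrics."""

    class MetricsHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            try:
                body = pathlib.Path(prom_file).read_bytes()
            except OSError:
                # The exporter script may not have written the file yet.
                self.send_error(503, "textfile not available yet")
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; a real server would log requests

    return MetricsHandler


def serve(prom_file: str, port: int = 9101) -> None:
    """Serve prom_file until interrupted (the port number is an arbitrary choice)."""
    server = http.server.ThreadingHTTPServer(("", port), make_handler(prom_file))
    server.serve_forever()
```

Calling `serve("/var/lib/node_exporter/smartmon.prom")` would expose the file for direct scraping. The trade-off is one more scrape target per node, whereas node_exporter's textfile collector folds the same file into an endpoint Prometheus is usually scraping already.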
I would like to integrate this container with Prometheus AlertManager to receive notifications when a hard drive is unhealthy.
What are the steps besides the basic configs to get this working?
@galexrt do you have an example of how to get this running? Maybe you are running this yourself.
If you need any more info, please let me know!
Thanks in advance.
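As a rough illustration of the condition such an alert would check, the sketch below scans textfile output for `smartmon_device_smart_healthy` samples with a value of 0. In Prometheus itself this would presumably be an alerting rule with an expression along the lines of `smartmon_device_smart_healthy == 0`, routed through Alertmanager. The function name `unhealthy_disks` and the sample data are assumptions for this example; only the metric name comes from the exporter output quoted in the NVMe issue.

```python
import re

# Matches lines such as:
#   smartmon_device_smart_healthy{disk="/dev/sda",type="scsi"} 1
SAMPLE_RE = re.compile(
    r'^smartmon_device_smart_healthy\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def unhealthy_disks(prom_text: str) -> list:
    """Return the `disk` label of every sample whose health value is 0."""
    disks = []
    for line in prom_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments (# HELP / # TYPE) and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if float(match.group("value")) == 0:
            disks.append(labels.get("disk", "unknown"))
    return disks


# Made-up sample in the same format as the exporter output.
example = """\
# HELP smartmon_device_smart_healthy SMART metric device_smart_healthy
# TYPE smartmon_device_smart_healthy gauge
smartmon_device_smart_healthy{disk="/dev/sda",type="scsi"} 1
smartmon_device_smart_healthy{disk="/dev/sdb",type="scsi"} 0
"""

print(unhealthy_disks(example))
```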