cloudandheat / prometheus_smart_exporter
Configurable S.M.A.R.T. metric exporter for Prometheus
License: GNU General Public License v3.0
Could you please add support for NVMe devices?
NVMe devices are not listed under "/sys/bus/scsi/devices/".
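One possible direction, sketched under assumptions: on Linux, /sys/block lists both SCSI/SATA disks (sda, ...) and NVMe namespaces (nvme0n1, ...), so discovery could start there. The function names and the name pattern below are hypothetical and are not taken from the exporter. smartctl also prints NVMe health data in a different layout from the ATA attribute table, so parsing would need separate handling.

```python
import os
import re

# Assumption: whole disks are named like "sda" or "nvme0n1".
# Everything else (loop0, sr0, ...) is ignored by this sketch.
DISK_NAME = re.compile(r"^(sd[a-z]+|nvme\d+n\d+)$")


def filter_disks(names):
    """Return the sorted subset of names that look like whole disks."""
    return sorted(name for name in names if DISK_NAME.match(name))


def list_disks(sys_block="/sys/block"):
    """List whole-disk device names found under sys_block."""
    try:
        return filter_disks(os.listdir(sys_block))
    except FileNotFoundError:
        # No sysfs at this path (for example on a non-Linux system).
        return []
```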
When I attempt to run the exporter on my machine, the helper fails with:
ERROR:smart_exporter_helper:while handling client
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/smart_exporter_helper/__init__.py", line 249, in main
handle_client(client_sock)
File "/usr/local/lib/python3.6/dist-packages/smart_exporter_helper/__init__.py", line 142, in handle_client
info = read_drive_info("/dev/"+device)
File "/usr/local/lib/python3.6/dist-packages/smart_exporter_helper/__init__.py", line 112, in read_drive_info
"Raw": int(fields[9]),
ValueError: invalid literal for int() with base 10: '0/0'
Looking at the code, I believe the issue is that one of my drives reports a RAW_VALUE of 0/0 for attribute 234 Thermal_Throttle, which smart_exporter_helper does not seem to handle properly.
# smartctl -iA /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.16.0-2-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Intel 730 and DC S35x0/3610/3700 Series SSDs
Device Model: INTEL SSDSC2BB800G4
Serial Number: (redacted)
LU WWN Device Id: (redacted)
Firmware Version: D2010370
User Capacity: 800,166,076,416 bytes [800 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Jul 24 12:29:46 2018 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 5254
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 6
170 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
174 Unsafe_Shutdown_Count 0x0032 100 100 000 Old_age Always - 5
175 Power_Loss_Cap_Test 0x0033 100 100 010 Pre-fail Always - 670 (29 9035)
183 SATA_Downshift_Count 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
190 Temperature_Case 0x0022 066 064 000 Old_age Always - 34 (Min/Max 23/37)
192 Unsafe_Shutdown_Count 0x0032 100 100 000 Old_age Always - 5
194 Temperature_Internal 0x0022 100 100 000 Old_age Always - 45
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
199 CRC_Error_Count 0x003e 100 100 000 Old_age Always - 26
225 Host_Writes_32MiB 0x0032 100 100 000 Old_age Always - 471567
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age Always - 880
227 Workld_Host_Reads_Perc 0x0032 100 100 000 Old_age Always - 34
228 Workload_Minutes 0x0032 100 100 000 Old_age Always - 315196
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 0
234 Thermal_Throttle 0x0032 100 100 000 Old_age Always - 0/0
241 Host_Writes_32MiB 0x0032 100 100 000 Old_age Always - 471567
242 Host_Reads_32MiB 0x0032 100 100 000 Old_age Always - 249231
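One way to tolerate a value such as 0/0 is to keep only the leading integer of the RAW_VALUE token. This is a minimal sketch, not the helper's actual code: parse_raw_value is a name made up here, and discarding the part after the slash loses information, which may or may not be acceptable for a given attribute.

```python
import re

# Matches an optional sign followed by digits at the start of the token.
_LEADING_INT = re.compile(r"\s*(-?\d+)")


def parse_raw_value(token):
    """Return the leading integer of a RAW_VALUE token, or None if absent."""
    match = _LEADING_INT.match(token)
    return int(match.group(1)) if match else None
```

With the output above, the tokens "0/0", "670" and "34" would parse to 0, 670 and 34, while a token with no leading digits yields None instead of raising ValueError.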
OpenMetrics requires exporters to name all counters with a _total suffix. This suffix is missing on the reallocated sectors metric.
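The naming rule could be applied with a small helper like the one below. This is only an illustration: counter_name and the metric name used in the usage note are hypothetical, and the exporter's real metric names are not shown in this report.

```python
def counter_name(name):
    """Return name with exactly one trailing "_total" suffix."""
    suffix = "_total"
    return name if name.endswith(suffix) else name + suffix
```

For example, a hypothetical counter called smart_reallocated_sectors would be exported as smart_reallocated_sectors_total, and a name that already ends in _total is left unchanged.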
Hello,
Commit 78728f8 in the check_smart_attributes repository refactored check_smartdb.json into proper JSON. As a result, following the command recommended in the README no longer works: sudo curl -o /etc/prometheus_smart_exporter/devices.json https://raw.githubusercontent.com/thomas-krenn/check_smart_attributes/master/check_smartdb.json
The last working version of the database is at https://raw.githubusercontent.com/thomas-krenn/check_smart_attributes/c83683e1b82f7e173049d2b9a1727432c39e8f86/check_smartdb.json
The README should be updated to point to that version for now, with support for the new format planned for later.
Trying to run the exporter fails:
# /usr/local/bin/prometheus_smart_exporter --device-db /etc/prometheus/exporters/smart/devices.json -a 0.0.0.0 -p 9257 -vv /var/run/prometheus_smart_helper/ipc
INFO:prometheus_smart_exporter:device db loaded with 4 devices
INFO:prometheus_smart_exporter:attribute_mapping loaded with 16 generic rules, 0 per device rules for 0 devices
Traceback (most recent call last):
File "/usr/local/bin/prometheus_smart_exporter", line 10, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/prometheus_smart_exporter/__init__.py", line 402, in main
logger.getChild("collector")
File "/usr/local/lib/python3.6/dist-packages/prometheus_client/registry.py", line 24, in register
names = self._get_names(collector)
File "/usr/local/lib/python3.6/dist-packages/prometheus_client/registry.py", line 64, in _get_names
for metric in desc_func():
File "/usr/local/lib/python3.6/dist-packages/prometheus_smart_exporter/__init__.py", line 105, in collect
data = self._recv_smart_info(sock)
File "/usr/local/lib/python3.6/dist-packages/prometheus_smart_exporter/__init__.py", line 82, in _recv_smart_info
ver, length = Header.unpack(hdr)
struct.error: unpack requires a buffer of 9 bytes
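The struct.error means the exporter received fewer than 9 header bytes, which happens for instance when the peer closes the connection early. A reader that reports this explicitly might look like the sketch below. The header layout ("!BQ", a 1-byte version plus an 8-byte length) is an assumption chosen only because it is 9 bytes long; the real Header format is not shown in this report, and recv_exact and recv_message are hypothetical names.

```python
import socket
import struct

# Assumption: 1-byte version + 8-byte payload length = 9 bytes.
HEADER = struct.Struct("!BQ")


def recv_exact(sock, size):
    """Read exactly size bytes, or raise ConnectionError on early close."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError(
                "peer closed after %d of %d bytes" % (size - remaining, size)
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one header and its payload; return (version, payload)."""
    version, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return version, recv_exact(sock, length)
```

With this shape, a helper that drops the connection would surface as a ConnectionError naming the byte counts rather than as a struct.error.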
/usr/bin/python3 -V
Python 3.6.8
uname -a
Linux backup 4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
ls -la /var/run/prometheus_smart_helper/ipc
srw------- 1 prometheus prometheus 0 Sep 24 11:01 /var/run/prometheus_smart_helper/ipc
smart_exporter_helper fails to start on Python versions earlier than 3.5 because it calls socket.listen() without arguments. Before Python 3.5, socket.listen() requires a backlog parameter.
File "/usr/lib/python3/dist-packages/smart_exporter_helper/__init__.py", line 228, in main
sock.listen()
TypeError: listen() takes exactly one argument (0 given)
# python3 --version
Python 3.4.3
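Passing the backlog explicitly works on both older and newer Python 3 versions. A minimal sketch, assuming a Unix domain socket as in the helper's IPC path; make_listener and the backlog value of 5 are choices made here, not taken from the helper:

```python
import socket


def make_listener(path, backlog=5):
    """Bind a Unix stream socket at path and start listening."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # An explicit backlog is required before Python 3.5 and still accepted later.
    sock.listen(backlog)
    return sock
```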
Hi, thanks for your great work.
I tried to read S.M.A.R.T. data from disks connected through a MegaRAID controller, but had no luck.
I know that with smartctl I need to pass an extra argument in this case: "-d megaraid,X".
I assumed I could do that with this option:
--smartctl-arg SMARTCTL_ARG
Pass an additional argument to the smartctl command.
Can be specified multiple times.
But the helper reports:
smart_exporter_helper: error: unrecognized arguments: --smartctl-arg=-d megaraid,8-11
Any ideas? Can you help?
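For reference, a repeatable option of this kind is usually declared with argparse's "append" action, and each smartctl token is then passed as its own --smartctl-arg=... value. The parser below is a hypothetical stand-in, not the helper's real parser. The error text suggests that the installed helper's parser does not define the option at all, though that cannot be confirmed from this report.

```python
import argparse


def build_parser():
    """Hypothetical parser with a repeatable --smartctl-arg option."""
    parser = argparse.ArgumentParser(prog="smart_exporter_helper")
    parser.add_argument(
        "--smartctl-arg",
        dest="smartctl_args",
        action="append",
        metavar="SMARTCTL_ARG",
        help="Pass an additional argument to the smartctl command. "
             "Can be specified multiple times.",
    )
    return parser


def parse_smartctl_args(argv):
    """Return the collected smartctl arguments as a list."""
    args = build_parser().parse_args(argv)
    return args.smartctl_args or []
```

With such a parser, the invocation would be --smartctl-arg=-d --smartctl-arg=megaraid,8. The "=" form matters because a value that starts with a dash would otherwise be read as another option.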
Thanks for a great exporter.
As far as I can see, the only reason this app is split into two components is to avoid running the whole thing as root. It seems the only place that actually needs root is the smartctl subprocess.
Don't you think the exporter could be dramatically simplified with an option to skip smart_exporter_helper and run sudo smartctl directly instead? sudo is flexible enough to allow only smartctl for a designated user.
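A rough sketch of what that option could look like, under the assumption that a sudoers rule grants the exporter's user passwordless access to smartctl. The function names and the use_sudo switch are hypothetical; "sudo -n" makes sudo fail instead of prompting for a password.

```python
import subprocess


def smartctl_command(device, use_sudo=True, extra_args=()):
    """Build the argument list for reading info and attributes of device."""
    cmd = ["sudo", "-n"] if use_sudo else []
    cmd += ["smartctl", "-iA"]
    cmd += list(extra_args)
    cmd.append(device)
    return cmd


def read_smartctl(device, **kwargs):
    """Run smartctl and return (exit status, stdout text)."""
    # smartctl's exit status is a bitmask, so a non-zero value does not
    # necessarily mean the output is unusable; the caller decides.
    proc = subprocess.run(
        smartctl_command(device, **kwargs),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout
```

Passing the command as a list avoids shell interpretation of device names and extra arguments.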
/usr/bin/smart_exporter_helper[796]: ERROR:smart_exporter_helper:while handling client
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/smart_exporter_helper/__init__.py", line 249, in main
handle_client(client_sock)
File "/usr/lib/python3/dist-packages/smart_exporter_helper/__init__.py", line 142, in handle_client
info = read_drive_info("/dev/"+device)
File "/usr/lib/python3/dist-packages/smart_exporter_helper/__init__.py", line 111, in read_drive_info
"Thresh": int(fields[5]),
ValueError: invalid literal for int() with base 10: '---'
sudo smartctl -a /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-34-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SanDisk SD8SB8U512G1122
Serial Number: 170225423404
LU WWN Device Id: 5 001b44 4a6ab4778
Firmware Version: X4140000
User Capacity: 512,110,190,592 bytes [512 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Oct 26 08:19:47 2018 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 --- Old_age Always - 0
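Here the THRESH column holds the placeholder --- instead of a number. A tolerant row parser could map such placeholders to None, as in the sketch below. The function names and the dictionary keys other than "Thresh" are assumptions; only the column order comes from the smartctl header line above.

```python
def parse_optional_int(token):
    """Return int(token), or None for placeholders such as '---'."""
    try:
        return int(token)
    except ValueError:
        return None


def parse_row(line):
    """Parse the numeric columns of one attribute row into a dict."""
    fields = line.split()
    return {
        "ID": int(fields[0]),
        "Name": fields[1],
        "Value": parse_optional_int(fields[3]),
        "Worst": parse_optional_int(fields[4]),
        "Thresh": parse_optional_int(fields[5]),
    }
```

A metric for a missing threshold can then be skipped rather than crashing the whole request.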