Same thing today.
from collectd.
With --enable-debug this time:
[2012-04-13 16:53:45] kstat chain has been updated
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.059; interval = 30.000; host = cairo.our.org; plugin = memory; plugin_instance = ; type = memory; type_instance = kernel;
[2012-04-13 16:53:45] uc_update: cairo.our.org/swap/swap-used: ds[0] = 911417344.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/memory/memory-kernel: ds[0] = 512491520.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/cpu-0/cpu-user: ds[0] = 0.566302
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/nfs-v2client/nfs_procedure-null: ds[0] = 0.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 452/1428 (31.7 %) "cairo_our_org.collectd.zfs_arc.cache_eviction-eligible.value 0 1334350425 "
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.060; interval = 30.000; host = cairo.our.org; plugin = zfs_arc; plugin_instance = ; type = cache_eviction; type_instance = ineligible;
[2012-04-13 16:53:45] uc_update: cairo.our.org/zfs_arc/cache_eviction-ineligible: ds[0] = 0.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 529/1428 (37.0 %) "cairo_our_org.collectd.nfs-v2client.nfs_procedure-null.value 0 1334350425 "
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.061; interval = 30.000; host = cairo.our.org; plugin = nfs; plugin_instance = v2client; type = nfs_procedure; type_instance = getattr;
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 608/1428 (42.6 %) "cairo_our_org.collectd.zfs_arc.cache_eviction-ineligible.value 0 1334350425 "
[2012-04-13 16:53:45] uc_update: cairo.our.org/nfs-v2client/nfs_procedure-getattr: ds[0] = 0.000000
Assertion failed: ksp != NULL, file common.c, line 640
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
Abort (core dumped)
from collectd.
More data. Note the NaNs:
[2012-09-05 22:59:03] plugin_dispatch_values: time = 1346900343.984; interval = 30.000; host = silmaril.our.org; plugin = zfs_arc; plugin_instance = ; type = cache_size; type_instance = L2;
[2012-09-05 22:59:03] uc_update: silmaril.our.org/zfs_arc/cache_size-L2: ds[0] = 0.000000
[2012-09-05 22:59:03] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-09-05 22:59:03] write_graphite plugin: wg_flush_nolock: timeout = 0.000; send_buf_fill = 1374;
[2012-09-05 22:59:03] write_graphite plugin: [rcf-metrics]:2003 buf 77/1428 (5.4 %) "silmaril_our_org.collectd.zfs_arc.cache_size-L2.value 0.000000 1346900343
"
[2012-09-05 22:59:03] zfs_arc plugin: Reading kstat value "allocated" failed.
...
[2012-09-05 22:59:03] plugin_dispatch_values: time = 1346900343.990; interval = 30.000; host = silmaril.our.org; plugin = zfs_arc; plugin_instance = ; type = cache_ratio; type_instance = L2;
[2012-09-05 22:59:03] uc_update: silmaril.our.org/zfs_arc/cache_ratio-L2: ds[0] = NaN
[2012-09-05 22:59:03] kstat chain has been updated
...
[2012-09-05 22:59:04] plugin_read_thread: Effective interval of the memory plugin is 30.000000000.
[2012-09-05 22:59:04] plugin_read_thread: Next read of the memory plugin at 1346900373.989394022.
[2012-09-05 22:59:03] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-09-05 22:59:04] write_graphite plugin: [rcf-metrics]:2003 buf 158/1428 (11.1 %) "silmaril_our_org.collectd.zfs_arc.cache_ratio-L2.value NaN 1346900343
"
Assertion failed: ksp != NULL, file common.c, line 640
Abort (core dumped)
from collectd.
Hi @jblaine,
could you test the changes in PR #126?
Thanks and best regards,
—octo
from collectd.
For convenience, I'll copy the information you gave on the pull request here:
[2012-09-07 13:10:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 13:10:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 13:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 15:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
[2012-09-07 21:03:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 21:03:54] get_kstat_value ("stolen"): ksp is NULL.
[...]
[2012-09-07 21:03:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 21:03:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-08 04:01:24] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("l2_write_bytes"): ksp is NULL.
Interestingly, the errors start with the value "deleted" every time the problem comes up. This corresponds to the following lines in the ZFS ARC plugin:
/* Operations */
za_read_derive (ksp, "allocated", "cache_operation", "allocated");
za_read_derive (ksp, "deleted", "cache_operation", "deleted");
za_read_derive (ksp, "stolen", "cache_operation", "stolen");
So reading the field "allocated" works, while reading "deleted" fails. But why doesn't the ZFS ARC plugin print an error message? When printing that message, get_kstat_value() should return an error indicator, which should result in a second error being printed by the ZFS ARC plugin.
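The two-level error reporting described here can be sketched as follows. This is a simplified stand-in and not collectd's actual code: `fake_kstat_t`, `lookup_value` and `read_derive` are hypothetical names, and the real get_kstat_value() operates on a libkstat `kstat_t`. In this sketch a NULL handle produces both messages, while an unknown field name produces only the caller's message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kstat: a short list of named counters. */
typedef struct {
  const char *names[4];
  long long values[4];
  int count;
} fake_kstat_t;

/* Returns the counter value, or -1 as the error indicator (NULL handle
 * or unknown name) instead of aborting through assert(). */
static long long lookup_value(const fake_kstat_t *ksp, const char *name) {
  if (ksp == NULL) {
    fprintf(stderr, "lookup_value (\"%s\"): ksp is NULL.\n", name);
    return -1;
  }
  for (int i = 0; i < ksp->count; i++)
    if (strcmp(ksp->names[i], name) == 0)
      return ksp->values[i];
  return -1;
}

/* Caller: prints the second, plugin-level error and skips the value. */
static int read_derive(const fake_kstat_t *ksp, const char *name,
                       long long *out) {
  long long v = lookup_value(ksp, name);
  if (v == -1) {
    fprintf(stderr, "plugin: Reading value \"%s\" failed.\n", name);
    return -1;
  }
  *out = v;
  return 0;
}
```

Using -1 as the error indicator means a counter that legitimately holds -1 cannot be distinguished from a failure; that is acceptable here only because the counters in question are non-negative.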
Can you try the following on the command line?
kstat -m zfs -i 0 -n arcstats
Best regards,
—octo
from collectd.
Yes, the errors are being reported from zfs_arc -- I had omitted them due to size.
Here are 2 full blocks from that time:
[2012-09-07 13:10:54] kstat chain has been updated
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-07 13:10:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "deleted" failed.
[2012-09-07 13:10:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-07 13:10:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "mutex_miss" failed.
[2012-09-07 13:10:54] get_kstat_value ("hash_collisions"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "hash_collisions" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_cached"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_cached" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_eligible"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_eligible" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_ineligible"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_ineligible" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_data_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_data_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_metadata_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_metadata_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_data_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_data_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_metadata_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_data_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_data_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_metadata_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_metadata_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_data_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_data_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_metadata_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("hits"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("misses"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_hits"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_misses"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
...
[2012-09-07 13:35:54] kstat chain has been updated
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-07 13:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "deleted" failed.
[2012-09-07 13:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-07 13:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "mutex_miss" failed.
[2012-09-07 13:35:54] get_kstat_value ("hash_collisions"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "hash_collisions" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_cached"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_cached" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_eligible"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_eligible" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_ineligible"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_ineligible" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_data_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_data_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_metadata_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_metadata_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_data_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_data_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_metadata_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_data_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_data_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_metadata_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_metadata_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_data_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_data_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_metadata_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("hits"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("misses"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_hits"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_misses"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
adm:silmaril>
from collectd.
kstat -m zfs -i 0 -n arcstats
runs fine.
module: zfs instance: 0
name: arcstats class: misc
c 528136002
c_max 1578153984
c_min 197269248
crtime 66.794009685
data_size 350626816
deleted 6274390
demand_data_hits 9378713
demand_data_misses 1138984
demand_metadata_hits 41489545
demand_metadata_misses 4188808
evict_l2_cached 0
evict_l2_eligible 330758987264
evict_l2_ineligible 69180891136
evict_skip 110616641
hash_chain_max 16
hash_chains 9690
hash_collisions 22375272
hash_elements 35613
hash_elements_max 127411
hdr_size 6646104
hits 71039707
l2_abort_lowmem 0
l2_cksum_bad 0
l2_evict_lock_retry 0
l2_evict_reading 0
l2_feeds 0
l2_free_on_write 0
l2_hdr_size 0
l2_hits 0
l2_io_error 0
l2_misses 0
l2_read_bytes 0
l2_rw_clash 0
l2_size 0
l2_write_bytes 0
l2_writes_done 0
l2_writes_error 0
l2_writes_hdr_miss 0
l2_writes_sent 0
memory_throttle_count 1874
mfu_ghost_hits 2007968
mfu_hits 32130850
misses 10350139
mru_ghost_hits 4061865
mru_hits 20174786
mutex_miss 2
other_size 118803456
p 334877357
prefetch_data_hits 1106754
prefetch_data_misses 1294671
prefetch_metadata_hits 19064695
prefetch_metadata_misses 3727676
recycle_miss 3809672
size 476076376
snaptime 3815140.9903571
from collectd.
Okay, then it's not ZFS ARC as such that's broken; rather, the kstat handling needs to be corrected.
[2012-09-07 13:10:54] kstat chain has been updated
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "allocated" failed.
One of the threads checks the kstat chain periodically and, when needed, updates it (and calls the init functions). This isn't handled gracefully in the ZFS ARC plugin and may result in this problem. I'll write a patch now that I know what's going on.
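One way to handle a chain update gracefully is to re-resolve any cached handle whenever the chain has changed, rather than keep using a pointer obtained earlier. The sketch below simulates that idea without libkstat; it is not the patch referred to above. `fake_chain_t`, `cached_handle_t`, `chain_lookup` and `refresh_handle` are hypothetical names, and the generation counter stands in for the change signalled by the real kstat_chain_update(), after which handles would be looked up again with kstat_lookup().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real types are kstat_ctl_t and kstat_t. */
typedef struct {
  const char *name;
  long long value;
} fake_kstat_t;

typedef struct {
  fake_kstat_t *entries;
  size_t count;
  unsigned generation; /* bumped whenever the chain is rebuilt */
} fake_chain_t;

/* A plugin's cached handle plus the chain generation it came from. */
typedef struct {
  fake_kstat_t *ksp;
  unsigned generation;
} cached_handle_t;

static fake_kstat_t *chain_lookup(fake_chain_t *kc, const char *name) {
  for (size_t i = 0; i < kc->count; i++)
    if (strcmp(kc->entries[i].name, name) == 0)
      return &kc->entries[i];
  return NULL;
}

/* Re-resolve the handle if it is missing or the chain changed since it
 * was cached. Returns 0 on success, -1 if the entry cannot be found. */
static int refresh_handle(cached_handle_t *h, fake_chain_t *kc,
                          const char *name) {
  if (h->ksp == NULL || h->generation != kc->generation) {
    h->ksp = chain_lookup(kc, name);
    h->generation = kc->generation;
  }
  return (h->ksp != NULL) ? 0 : -1;
}
```

A read callback would call `refresh_handle()` first and skip the read cycle on -1, so a stale or NULL handle never reaches the value lookup.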
from collectd.
I'm not sure if it matters, but note too that the kstat command output above does not even show an "allocated" stat. There is also no "stolen" stat shown, yet there are many errors like: "[2012-09-11 10:00:54] zfs_arc plugin: Reading kstat value "stolen" failed."
from collectd.
Good point. It seems to work most of the time though, right? Or are you getting error messages about "allocated" all the time?
from collectd.
Looking now, I see that those have been appearing all the time. I started the current test collectd on Sep 7. Every Interval seconds the following appears:
[2012-09-12 11:27:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-12 11:27:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-12 11:27:54] plugin_dispatch_values: Dataset not found: mutex_operation (from "silmaril.our.org/zfs_arc/mutex_operation-miss"), check your types.db!
from collectd.
Hello,
In src/types.db (and in /usr/share/collectd/types.db):
mutex_operations value:DERIVE:0:U
In src/zfs_arc.c:
za_read_derive (ksp, "mutex_miss", "mutex_operation", "miss");
Notice the final "s" in mutex_operations in src/types.db.
Short fix: update your /usr/share/collectd/types.db.
Real fix: same thing in src/types.db?
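The mismatch comes down to an exact string comparison between the type name the plugin submits and the names defined in types.db. A minimal sketch, assuming a hypothetical in-memory list `known_types` and helper `type_is_defined` (collectd's real lookup is not shown in this thread):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical excerpt of type names as they would be defined in types.db. */
static const char *const known_types[] = {
  "cache_operation",
  "cache_size",
  "mutex_operations",
};

/* Returns 1 only on an exact match; singular and plural forms differ. */
static int type_is_defined(const char *type) {
  size_t n = sizeof(known_types) / sizeof(known_types[0]);
  for (size_t i = 0; i < n; i++)
    if (strcmp(known_types[i], type) == 0)
      return 1;
  return 0;
}
```

With this list, "mutex_operations" is found while "mutex_operation" is not, which matches the "Dataset not found" message in the log.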
Regards,
Yves
from collectd.
@ymettier Looks like the bug is in the code since 4f5234d. I've fixed this in 4d99b79.
Best regards,
—octo
from collectd.
Good eye, Yves.
from collectd.