Giter VIP home page Giter VIP logo

Comments (14)

jblaine avatar jblaine commented on May 22, 2024

Same thing today.

from collectd.

jblaine avatar jblaine commented on May 22, 2024

With --enable-debug this time:

[2012-04-13 16:53:45] kstat chain has been updated
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.059; interval = 30.000; host = cairo.our.org; plugin = memory; plugin_instance = ; type = memory; type_instance = kernel;
[2012-04-13 16:53:45] uc_update: cairo.our.org/swap/swap-used: ds[0] = 911417344.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/memory/memory-kernel: ds[0] = 512491520.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/cpu-0/cpu-user: ds[0] = 0.566302
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] uc_update: cairo.our.org/nfs-v2client/nfs_procedure-null: ds[0] = 0.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 452/1428 (31.7 %) "cairo_our_org.collectd.zfs_arc.cache_eviction-eligible.value 0 1334350425
"
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.060; interval = 30.000; host = cairo.our.org; plugin = zfs_arc; plugin_instance = ; type = cache_eviction; type_instance = ineligible;
[2012-04-13 16:53:45] uc_update: cairo.our.org/zfs_arc/cache_eviction-ineligible: ds[0] = 0.000000
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 529/1428 (37.0 %) "cairo_our_org.collectd.nfs-v2client.nfs_procedure-null.value 0 1334350425
"
[2012-04-13 16:53:45] plugin_dispatch_values: time = 1334350425.061; interval = 30.000; host = cairo.our.org; plugin = nfs; plugin_instance = v2client; type = nfs_procedure; type_instance = getattr;
[2012-04-13 16:53:45] write_graphite plugin: [rcf-metrics]:2003 buf 608/1428 (42.6 %) "cairo_our_org.collectd.zfs_arc.cache_eviction-ineligible.value 0 1334350425
"
[2012-04-13 16:53:45] uc_update: cairo.our.org/nfs-v2client/nfs_procedure-getattr: ds[0] = 0.000000
Assertion failed: ksp != NULL, file common.c, line 640
[2012-04-13 16:53:45] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
Abort (core dumped)

from collectd.

jblaine avatar jblaine commented on May 22, 2024

More data. Note the NaNs:

[2012-09-05 22:59:03] plugin_dispatch_values: time = 1346900343.984; interval = 30.000; host = silmaril.our.
org; plugin = zfs_arc; plugin_instance = ; type = cache_size; type_instance = L2;
[2012-09-05 22:59:03] uc_update: silmaril.our.org/zfs_arc/cache_size-L2: ds[0] = 0.000000
[2012-09-05 22:59:03] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-09-05 22:59:03] write_graphite plugin: wg_flush_nolock: timeout = 0.000; send_buf_fill = 1374;
[2012-09-05 22:59:03] write_graphite plugin: [rcf-metrics]:2003 buf 77/1428 (5.4 %) "silmaril_our_org.collec
td.zfs_arc.cache_size-L2.value 0.000000 1346900343
"
[2012-09-05 22:59:03] zfs_arc plugin: Reading kstat value "allocated" failed.
...
[2012-09-05 22:59:03] plugin_dispatch_values: time = 1346900343.990; interval = 30.000; host = silmaril.our.
org; plugin = zfs_arc; plugin_instance = ; type = cache_ratio; type_instance = L2;
[2012-09-05 22:59:03] uc_update: silmaril.our.org/zfs_arc/cache_ratio-L2: ds[0] = NaN
[2012-09-05 22:59:03] kstat chain has been updated
...
[2012-09-05 22:59:04] plugin_read_thread: Effective interval of the memory plugin is 30.000000000.
[2012-09-05 22:59:04] plugin_read_thread: Next read of the memory plugin at 1346900373.989394022.
[2012-09-05 22:59:03] plugin: plugin_write: Writing values via write_graphite/rcf-metrics/2003.
[2012-09-05 22:59:04] write_graphite plugin: [rcf-metrics]:2003 buf 158/1428 (11.1 %) "silmaril_our_org.collectd.zfs_arc.cache_ratio-L2.value NaN 1346900343
"
Assertion failed: ksp != NULL, file common.c, line 640
Abort (core dumped)

from collectd.

octo avatar octo commented on May 22, 2024

Hi @jblaine,

could you test the changes in PR #126?

Thanks and best regards,
—octo

from collectd.

octo avatar octo commented on May 22, 2024

For convenience, I'll copy the information you gave on the pull request to here:

[2012-09-07 13:10:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 13:10:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.

[2012-09-07 13:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 13:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.

[2012-09-07 15:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-07 15:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 15:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.

[2012-09-07 21:03:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 21:03:54] get_kstat_value ("stolen"): ksp is NULL.
[...]
[2012-09-07 21:03:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 21:03:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.

[2012-09-08 04:01:24] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("mutex_miss"): ksp is NULL.
[...]
[2012-09-08 04:01:24] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-08 04:01:24] get_kstat_value ("l2_write_bytes"): ksp is NULL.

Interestingly, this starts with the value deleted every time the problem comes up. This is from the following line in the ZFS-ARC plugin:

        /* Operations */
        za_read_derive (ksp, "allocated","cache_operation", "allocated");
        za_read_derive (ksp, "deleted",  "cache_operation", "deleted");
        za_read_derive (ksp, "stolen",   "cache_operation", "stolen");

So reading the field "allocated" works, while reading "deleted" fails. But why doesn't the ZFS ARC plugin print an error message? When printing that message, get_kstat_value() should return an error indicator, which should result in a second error being printed by the ZFS ARC plugin.

Can you try the following in the command line?:

kstat -m zfs -i 0 -n arcstats

Best regards,
—octo

from collectd.

jblaine avatar jblaine commented on May 22, 2024

Yes, the errors are being reported from zfs_arc -- I had omitted them due to size.

Here are 2 full blocks from that time:

[2012-09-07 13:10:54] kstat chain has been updated
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-07 13:10:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "deleted" failed.
[2012-09-07 13:10:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-07 13:10:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "mutex_miss" failed.
[2012-09-07 13:10:54] get_kstat_value ("hash_collisions"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "hash_collisions" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_cached"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_cached" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_eligible"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_eligible" failed.
[2012-09-07 13:10:54] get_kstat_value ("evict_l2_ineligible"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "evict_l2_ineligible" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_data_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_data_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_metadata_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_metadata_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_data_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_data_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_metadata_hits"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_hits" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_data_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_data_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("demand_metadata_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "demand_metadata_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_data_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_data_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("prefetch_metadata_misses"): ksp is NULL.
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_misses" failed.
[2012-09-07 13:10:54] get_kstat_value ("hits"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("misses"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_hits"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_misses"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:10:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.

...

[2012-09-07 13:35:54] kstat chain has been updated
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-07 13:35:54] get_kstat_value ("deleted"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "deleted" failed.
[2012-09-07 13:35:54] get_kstat_value ("stolen"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-07 13:35:54] get_kstat_value ("mutex_miss"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "mutex_miss" failed.
[2012-09-07 13:35:54] get_kstat_value ("hash_collisions"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "hash_collisions" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_cached"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_cached" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_eligible"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_eligible" failed.
[2012-09-07 13:35:54] get_kstat_value ("evict_l2_ineligible"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "evict_l2_ineligible" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_data_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_data_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_metadata_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_metadata_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_data_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_data_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_metadata_hits"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_hits" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_data_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_data_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("demand_metadata_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "demand_metadata_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_data_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_data_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("prefetch_metadata_misses"): ksp is NULL.
[2012-09-07 13:35:54] zfs_arc plugin: Reading kstat value "prefetch_metadata_misses" failed.
[2012-09-07 13:35:54] get_kstat_value ("hits"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("misses"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_hits"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_misses"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_read_bytes"): ksp is NULL.
[2012-09-07 13:35:54] get_kstat_value ("l2_write_bytes"): ksp is NULL.
adm:silmaril>

from collectd.

jblaine avatar jblaine commented on May 22, 2024

kstat -m zfs -i 0 -n arcstats runs fine.


module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               528136002
        c_max                           1578153984
        c_min                           197269248
        crtime                          66.794009685
        data_size                       350626816
        deleted                         6274390
        demand_data_hits                9378713
        demand_data_misses              1138984
        demand_metadata_hits            41489545
        demand_metadata_misses          4188808
        evict_l2_cached                 0
        evict_l2_eligible               330758987264
        evict_l2_ineligible             69180891136
        evict_skip                      110616641
        hash_chain_max                  16
        hash_chains                     9690
        hash_collisions                 22375272
        hash_elements                   35613
        hash_elements_max               127411
        hdr_size                        6646104
        hits                            71039707
        l2_abort_lowmem                 0
        l2_cksum_bad                    0
        l2_evict_lock_retry             0
        l2_evict_reading                0
        l2_feeds                        0
        l2_free_on_write                0
        l2_hdr_size                     0
        l2_hits                         0
        l2_io_error                     0
        l2_misses                       0
        l2_read_bytes                   0
        l2_rw_clash                     0
        l2_size                         0
        l2_write_bytes                  0
        l2_writes_done                  0
        l2_writes_error                 0
        l2_writes_hdr_miss              0
        l2_writes_sent                  0
        memory_throttle_count           1874
        mfu_ghost_hits                  2007968
        mfu_hits                        32130850
        misses                          10350139
        mru_ghost_hits                  4061865
        mru_hits                        20174786
        mutex_miss                      2
        other_size                      118803456
        p                               334877357
        prefetch_data_hits              1106754
        prefetch_data_misses            1294671
        prefetch_metadata_hits          19064695
        prefetch_metadata_misses        3727676
        recycle_miss                    3809672
        size                            476076376
        snaptime                        3815140.9903571

from collectd.

octo avatar octo commented on May 22, 2024

Okay, then it's not ZFS ARC as such that's broken, it's more that the kstat handling needs to be corrected.

[2012-09-07 13:10:54] kstat chain has been updated
[2012-09-07 13:10:54] zfs_arc plugin: Reading kstat value "allocated" failed.

One of the threads is checking the kstat chain periodically and updates it (and is calling init functions) when needed. This isn't handled gracefully in the ZFS ARC plugin and may result in this problem. I'll write a patch now that I know what's going on.

from collectd.

jblaine avatar jblaine commented on May 22, 2024

I'm not sure if it matters, but note too that the kstat command output above does not even show an "allocated" stat. There is also no 'stolen' shown, but many errors: "[2012-09-11 10:00:54] zfs_arc plugin: Reading kstat value "stolen" failed."

from collectd.

octo avatar octo commented on May 22, 2024

Good point. It seems to work most of the time though, right? Or are you getting error messages about "allocated" all the time?

from collectd.

jblaine avatar jblaine commented on May 22, 2024

Looking now I see that those have been all the time. I started the current test collectd on Sep 7. Every Interval seconds the following appears:

[2012-09-12 11:27:54] zfs_arc plugin: Reading kstat value "allocated" failed.
[2012-09-12 11:27:54] zfs_arc plugin: Reading kstat value "stolen" failed.
[2012-09-12 11:27:54] plugin_dispatch_values: Dataset not found: mutex_operation (from "silmaril.our.org/zfs_arc/mutex_operation-miss"), check your types.db!

from collectd.

ymettier avatar ymettier commented on May 22, 2024

Hello,

In src/types.db (and in /usr/share/collectd/types.db) :

mutex_operations        value:DERIVE:0:U

in src/zfs_arc.c

za_read_derive (ksp, "mutex_miss", "mutex_operation", "miss");

Notice the final "s" in mutex_operations in src/types.db.

Short fix : update your /usr/share/collectd/types.db

Real fix : same thing in src/types.db?

Regards,
Yves

from collectd.

octo avatar octo commented on May 22, 2024

@ymettier Looks like the bug is in the code since 4f5234d. I've fixed this in 4d99b79.

Best regards,
—octo

from collectd.

jblaine avatar jblaine commented on May 22, 2024

Good eye, Yves.

from collectd.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.