zenpacks.daviswr.zfs's People

Contributors

daviswr

Forkers

sempervictus

zenpacks.daviswr.zfs's Issues

Failure to model a single pool in a system containing 3

An OpenZFS 2.1 host with 3 zpools hangs indefinitely during modeling. I had to disable the ZFS plugin to get the rest of the model committed to the DB. With the plugin enabled, the modeler hangs even after the zpools are discovered.
There is an error message in Zenoss:

stderr:
interval cannot be zero
usage:
	status [-c [script1,script2,...]] [-igLpPstvxD]  [-T d|u] [pool] ...
	    [interval [count]]

The pool which is failing to model is a raidz2 of 6 drives. There's another raidz2 in there with more disks, and both of them have faults showing. The one which works has one UNAVAIL and one FAULTED - both show up as events in Zenoss. The failing pool has a single FAULTED disk in it.

Some Pools Return "Code: 2 - Msg: Misuse of shell builtins"

I'm seeing a host with 4 pools (a single-vdev rpool, a raidz1 pool, another raidz1, and a 5-wide span of 2-disk mirrors with a SLOG mirror and a 2-disk L2ARC span) return data for only two of them (rpool and one raidz1). The other two pools show a warning of "Code: 2 - Msg: Misuse of shell builtins" and no pool state whatsoever. The VDEVs of the "stateless" pools are not accounted for, nor are their constituent storage devices.
The host systems run Arch Linux (so tip bash), commands run as root without sudo (isolated env), and the ZFS revision is 2.0.0.

Re-usable parsers for various ZFS tool outputs

Currently both modelers and all ZenCommand parsers implement parsing of tool output individually. Shared parsing functions for each output type would reduce complexity and perhaps make modeling a little more fault-tolerant.

Should be considered a prerequisite to #6.
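
As a rough illustration of what a shared parser could look like, here is a minimal sketch that turns `zpool status` text into a per-pool dictionary of header fields. The function name, the field list, and the return shape are assumptions made for this example, not the ZenPack's existing code.

```python
import re

# Header fields that `zpool status` prints as "name: value" lines.
FIELD_RE = re.compile(
    r'^\s*(pool|state|status|action|see|scan|config|errors):\s*(.*)$'
)
# Fields whose text may wrap onto following indented lines.
MULTILINE_FIELDS = ('status', 'action', 'scan')


def parse_zpool_status(output):
    """Return {pool_name: {field: value}} for every pool in the output."""
    pools = {}
    current = None
    last_field = None
    for line in output.splitlines():
        match = FIELD_RE.match(line)
        if match:
            field, value = match.groups()
            if field == 'pool':
                # A "pool:" line starts a new pool section.
                current = pools.setdefault(value.strip(), {})
                last_field = None
            elif current is not None:
                current[field] = value.strip()
                last_field = field
        elif current is not None and last_field in MULTILINE_FIELDS and line.strip():
            # Wrapped continuation of the previous field.
            current[last_field] += ' ' + line.strip()
    return pools
```

Both the modeler and the ZenCommand parsers could then call one function like this instead of each carrying its own regexes. The rows under `config:` are deliberately not parsed here.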

Implement support for pending native crypto

@tcaputi has pretty much completed work on the native crypto implementation for OpenZFS (openzfs/zfs#4329). This work adds some complexity to how information is stored and presented, as well as to the CLI. Given that the ZenPack works off zdb output, and that dataset-level attributes remain cleartext, I'm assuming that we should be able to see all relevant attributes whether we have a key loaded or not (i.e., modeling should still work while the dataset is encrypted). We would however want to output information regarding the crypto config (on/off, keysource, cipher, and pbkdfiters) to be logged by Zenoss.
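
A minimal sketch of collecting the crypto configuration per dataset, assuming it is read through `zfs get` rather than zdb. The property names used here (`encryption`, `keyformat`, `keylocation`, `pbkdf2iters`) are the ones in released OpenZFS and differ from the draft names listed above; the function name and the sample values are made up for illustration.

```python
# Assumed command (tab-separated, no header because of -H):
#   zfs get -H -o name,property,value \
#       encryption,keyformat,keylocation,pbkdf2iters


def parse_zfs_get(output):
    """Return {dataset: {property: value}} from `zfs get -H` output."""
    datasets = {}
    for line in output.splitlines():
        parts = line.split('\t')
        if len(parts) != 3:
            # Skip blank or malformed lines instead of raising.
            continue
        name, prop, value = parts
        datasets.setdefault(name, {})[prop] = value
    return datasets


# Illustrative sample only; real values depend on the host.
sample = (
    "tank/secure\tencryption\taes-256-gcm\n"
    "tank/secure\tkeyformat\tpassphrase\n"
    "tank/plain\tencryption\toff\n"
)
crypto = parse_zfs_get(sample)
```

These properties are readable whether or not the key is loaded, which matches the assumption above that modeling should keep working on locked datasets.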

@daviswr: Could I ask you to take a look at implementing this? Every time I start working on this ZenPack I get bogged down by the idiosyncratic differences between Python and my 3rd gen language of choice (Ruby) with respect to string parsing, indentation, and set manipulation. I should have some cycles in January, but I'm massively behind on Metasploit work, so I'm filing this as an issue instead of a PR, presuming you have the cycles to tackle it. Thanks as always.

Generate events based on zpool status errors

The zpool.status parser should generate events based on messages in the status and errors fields for the pool.

Additionally, it should generate vdev and device events if other components in the output have error messages.
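
A minimal sketch of the idea, assuming the pool's `status` and `errors` fields have already been extracted into a dictionary. The function name and event keys are hypothetical. The severity numbers follow the usual Zenoss scale (0 clear, 3 warning, 4 error). In a real parser, the resulting dictionaries would be appended to the parser's result events rather than returned.

```python
# Message zpool prints in the "errors" field when nothing is wrong.
NO_ERRORS = 'No known data errors'

SEVERITY_CLEAR = 0
SEVERITY_WARNING = 3
SEVERITY_ERROR = 4


def build_pool_events(pool, fields):
    """Build event dicts from a pool's 'status' and 'errors' fields."""
    status = fields.get('status', '')
    errors = fields.get('errors', NO_ERRORS)
    has_errors = errors != NO_ERRORS
    return [
        {
            # A status message is only present when something needs attention.
            'component': pool,
            'eventKey': 'zpool-status',  # hypothetical key
            'severity': SEVERITY_WARNING if status else SEVERITY_CLEAR,
            'summary': status or 'No status message reported',
        },
        {
            'component': pool,
            'eventKey': 'zpool-errors',  # hypothetical key
            'severity': SEVERITY_ERROR if has_errors else SEVERITY_CLEAR,
            'summary': errors,
        },
    ]
```

Emitting a clear event when the field is healthy lets an earlier warning close itself once the pool recovers.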

SUSPENDED state not detected correctly

We had a failure go unnoticed this morning - the pool shows up as ONLINE in Zenoss (with no IO in the graphs) but went SUSPENDED on the host hours earlier.
Zenoss 6.3 with zenpack built off of 8a17ca6
Thanks as always
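
The cause is not identified in this report. One way a state can slip through is a lookup that only knows some health values. Here is a minimal sketch of a state lookup that accepts any state word and treats unknown ones as a problem rather than as healthy. The names and the severity mapping are assumptions, not the ZenPack's code.

```python
import re

# Capture whatever word follows "state:" instead of whitelisting values.
STATE_RE = re.compile(r'^\s*state:\s*(\S+)', re.MULTILINE)

# Assumed mapping onto Zenoss severities (0 clear ... 5 critical).
HEALTH_SEVERITY = {
    'ONLINE': 0,
    'DEGRADED': 3,
    'OFFLINE': 3,
    'REMOVED': 4,
    'UNAVAIL': 5,
    'FAULTED': 5,
    'SUSPENDED': 5,
}
UNKNOWN_SEVERITY = 4


def pool_health(status_output):
    """Return (state, severity); unknown or missing states are never 'clear'."""
    match = STATE_RE.search(status_output)
    if match is None:
        return None, UNKNOWN_SEVERITY
    state = match.group(1).upper()
    return state, HEALTH_SEVERITY.get(state, UNKNOWN_SEVERITY)
```

With this shape, a state the mapping has never seen still raises an event instead of leaving the last known ONLINE value in place.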

REMOVED disk not correctly detected by zpool component

We use /dev/disk/by-id/ paths referencing the ata- or scsi-/sas- symlinks pointing to our devices in our zpool configurations. I just noticed that a pool lost a drive, which the OS removed altogether when it failed, and the zenpack is having some issues with this.
The disk is still showing as online, but there is a warning message generated saying:

Component:  raidz1-0
Event Class:    /Cmd/Fail
Status:     New
Message:    Traceback (most recent call last):
  File "/opt/zenoss/Products/ZenRRD/zencommand.py", line 819, in _processDatasourceResults
    parser.processResults(datasource, results)
  File "/opt/zenoss/packs/ZenPacks.daviswr.ZFS/ZenPacks/daviswr/ZFS/parsers/zpool/status.py", line 68, in processResults
    health = pool_match.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

I'm assuming a problem in the zpool status output parser.

As a result, the disk itself is not marked as being offline in Zenoss, but the VDEV does show yellow (warning state) due to the parsing problem resulting in the reference to a 'NoneType' object.
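
A minimal sketch of guarding the match so that a row the regex does not recognise yields `None` instead of an `AttributeError`. The pattern below is hypothetical. The real regex in status.py is not shown in this report, so this only illustrates the guard.

```python
import re

# Hypothetical pattern for a config row: NAME STATE READ WRITE CKSUM.
ROW_RE = re.compile(r'^\s*(?P<name>\S+)\s+(?P<health>[A-Z]+)\s+\d')


def extract_health(component, config_lines):
    """Return the STATE column for `component`, or None if no row matches."""
    for line in config_lines:
        match = ROW_RE.match(line)
        # Check the match object before touching its groups.
        if match is not None and match.group('name') == component:
            return match.group('health')
    return None


rows = [
    "\tNAME            STATE     READ WRITE CKSUM",
    "\ttank            DEGRADED     0     0     0",
    "\t  raidz1-0      DEGRADED     0     0     0",
    "\t    ata-DISK_A  REMOVED      0     0     0",
]
health = extract_health('ata-DISK_A', rows)
```

The caller can then decide what a `None` result means (for example, raising a warning event that says the component was not found) rather than failing the whole datasource.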

Integrate ZFS modeler into ZPool modeler

There is no defined order in which modelers are executed, so it's possible for the ZFS modeler to run prior to the ZPool modeler the first time a system is modeled, thus missing the datasets because the ZPool components have not yet been created.

Probably can't stub-out Pools in case ZFS runs after ZPool, which would replace the previously-made components (I think...)

Subsequent models are normally fine.

Parse error for 2.2.3

I'll track down the cause, but I'm noting this here for record-keeping: I'm hitting this error against a 2.2.3 build, packaged and installed on Ubuntu 22.04. It may happen on other distributions; I'll check Arch shortly:

2024-03-26 23:38:57,653 ERROR zen.ZenModeler: Traceback (most recent call last):
  File "/opt/zenoss/Products/DataCollector/zenmodeler.py", line 669, in processClient
    datamaps = plugin.process(device, results, self.log)
  File "/opt/zenoss/ZenPacks/ZenPacks.daviswr.ZFS-0.8.0-py2.7.egg/ZenPacks/daviswr/ZFS/modeler/plugins/daviswr/cmd/ZFS.py", line 200, in process
    comp[key] = int(datasets[ds][key])
ValueError: invalid literal for int() with base 10: 'none'
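
Which property returned 'none' is not identified above. As a stopgap, a tolerant conversion helper along these lines would keep one non-numeric value from aborting the whole model. The helper name is hypothetical.

```python
def to_int(value, default=None):
    """Convert a property value to int, returning `default` for values
    such as 'none' or '-' that are not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


# Illustrative use: only store values that converted cleanly.
raw = {'used': '1024', 'quota': 'none'}
comp = {}
for key, value in raw.items():
    number = to_int(value)
    if number is not None:
        comp[key] = number
```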

Cache VDEV Enumeration and Small Suggestions

Thank you for this zenpack - it's a lifesaver in our environment. At the latest version (0.7.0), cache drive vdev enumeration fails, and I've had to comment it out (https://github.com/daviswr/ZenPacks.daviswr.ZFS/blob/master/ZenPacks/daviswr/ZFS/modeler/plugins/daviswr/cmd/ZPool.py#L153). I'll spin up a lab system to replicate the error, but it was along the lines of "NoneType has no member named 'dev'".

Separately, I've added local thresholds for pool capacity notification - these may be useful to have in the zenpack. A pool at 90% is something to be concerned about (especially with automated snapshots or heavy use). It would also be very useful to have a configuration option to disable enumeration of snapshots. Some of our systems have thousands of snapshots across datasets, and it gets painful pretty quickly (we are only monitoring pools for now anyway, but DS usage and ZVOL IO would be nice).
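
A minimal sketch of both suggestions. The function names and the `include_snapshots` switch are hypothetical (no such option exists in the ZenPack as far as this report shows). The filter relies only on the fact that ZFS snapshot names contain `@`.

```python
def filter_datasets(names, include_snapshots=True):
    """Drop snapshot entries (names containing '@') when disabled."""
    if include_snapshots:
        return list(names)
    return [name for name in names if '@' not in name]


def capacity_exceeded(used_pct, threshold_pct=90):
    """True when pool capacity is at or above the threshold percentage."""
    return used_pct >= threshold_pct


names = ['tank', 'tank/data', 'tank/data@auto-2024-01-01', 'tank/vol']
modeled = filter_datasets(names, include_snapshots=False)
```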
