netapp-cdot-nagios's People

Contributors

1tft, 48kram, aleex42, badnetmask, chartya, derjd, kvonkleinsorgen, log1-c, mkayontour, mmslkr, schurzi, sirsteff, tobiasgiese, tontonitch

netapp-cdot-nagios's Issues

check_cdot_snapmirror.pl: Use of uninitialized value $current_transfer

Command and output "anonymized". ONTAP 8.3.2P6, snapmirror status quiesced

./check_cdot_snapmirror.pl --hostname --username --password

Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
Use of uninitialized value $current_transfer in string ne at ./check_cdot_snapmirror.pl line 73.
CRITICAL: 9 snapmirror(s) failed - 11 snapmirror(s) ok
Name Healthy Delay
SVM_iSCSI_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
SVM_CIFS_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
SVM_NFS_XXX_mirror false 184707s
SVM_iSCSI_XXX_mirror false 184707s
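
The warning means that `$current_transfer` is undefined when it reaches the string comparison on line 73, which seems plausible for a quiesced relationship with no transfer in flight. A minimal sketch of a guard, assuming the comparison only needs to know whether a transfer type was reported; `transfer_in_progress` is a hypothetical helper and not a function from the plugin:

```perl
use strict;
use warnings;

# Hypothetical helper: treat a missing transfer type as "no transfer running".
sub transfer_in_progress {
    my ($current_transfer) = @_;
    # A quiesced relationship may not report a transfer type at all, so the
    # value can be undef; check definedness before the string comparison.
    return 0 unless defined $current_transfer;
    return $current_transfer ne '' ? 1 : 0;
}

for my $value (undef, '', 'update') {
    my $shown = defined $value ? "'$value'" : 'undef';
    printf "%-8s -> %d\n", $shown, transfer_in_progress($value);
}
```

With a guard like this the comparison never sees an undefined value, so the warning lines would no longer precede the plugin output.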

[check_cdot_wrong_aggregate.pl] WT...?

I've just looked into the source code of the script since it was giving me an error regarding the DP volumes and I couldn't understand why and...
Why is the check looking into the aggregate name for the string "ata"? Isn't it probably better to check what type the aggregate is and filter it via parameter?
If it is better, let me work on it.

BR,

Giorgio

check_cdot_diff_snapmirror.pl| call method "children_get"

Hi,
I am trying to monitor the number of volumes without snapmirrors using the script "check_cdot_diff_snapmirror.pl", but it works for some clusters and fails for others. For example:
For Cluster1 I get the error below on execution:
./check_cdot_diff_snapmirror.pl --hostname Cluster1 --username USRTNAME --password PASWORD
Can't call method "children_get" on an undefined value at ./check_cdot_diff_snapmirror.pl line 64.

But the same command works fine for Cluster2:
./check_cdot_diff_snapmirror.pl --hostname Cluster2 --username USRTNAME --password PASWORD
25 volume(s) without snapmirror:
----details-----

How to fix it ?
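
The message says that a lookup returned undef and `children_get` was then called on it, which would happen if the cluster's reply does not contain the expected child element. A minimal sketch of a guard, assuming that is the cause; `MockElement` is a stand-in so the example runs without the NetApp SDK, and `children_or_empty` and the element name `attributes-list` are hypothetical here:

```perl
use strict;
use warnings;

# Minimal stand-in for an SDK element so the sketch runs without the NetApp
# SDK; only the two methods used below are mimicked (hypothetical mock).
package MockElement {
    sub new          { my ($class, %kids) = @_; return bless { kids => {%kids} }, $class }
    sub child_get    { my ($self, $name) = @_; return $self->{kids}{$name} }
    sub children_get { my ($self) = @_; return values %{ $self->{kids} } }
}

# Return the children of the named child element, or an empty list if that
# child is absent, instead of calling a method on undef.
sub children_or_empty {
    my ($element, $name) = @_;
    my $child = $element->child_get($name);
    return defined $child ? $child->children_get() : ();
}

my $reply_without_records = MockElement->new();
my @records = children_or_empty($reply_without_records, 'attributes-list');
print scalar(@records), " record(s) found\n";
```

A cluster that returns no matching records would then be reported as an empty result rather than ending the script with a fatal error.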

Pull Request - check_cdot_aggr.pl

Hello,
regarding check_cdot_aggr.pl: I have fixed the output of $perfmsg and added a regex console parameter for this check (feature copied from check_cdot_volume.pl), among other changes.
Could you please give me push access to the repo so that I can create a new branch and make a pull request, or simply insert the following file:
MODIFIED_check_cdot_aggr.zip

<title>400 Bad Request</title>

./check_netapp_aggr.pl -H 192.168.200.152 -u nagios -p Monitor_246 -w 40 -c 50 --aggr data04_SAS_SHELVES
UNKNOWN: NaServer::parse_xml - Error in parsing xml:
syntax error at line 1, column 49, byte 49:

================================================^

<title>400 Bad Request</title> at /usr/lib64/perl5/vendor_perl/XML/Parser.pm line 187

This happens after upgrading ONTAP from 9.2 to 9.3P4.
Tried netapp-manageability-sdk-9.3, no luck.

Can't call method "children_get" on an undefined value at ./check_cdot_diff_snapmirror.pl line 121

Hi!
First, thanks very much for this splendid check set.

This may be similar to #47, but on one of our 3 clusters we get:
Can't call method "children_get" on an undefined value at ./check_cdot_diff_snapmirror.pl line 121.

We're using the latest ONTAP SDK, so I'm not sure what might be happening here. Unfortunately I know nothing of Perl so I'm fairly useless without some pointers on collecting debug for you.

Thanks!

Add authentication with certificate

Security policies prevent using username and password in some companies. API is able to authenticate with certificate. We should enable this option here, too. It could be a minimal change or proper implementation.
I was able to do a minimum change by adding parameter --cert, which then uses the same --username and --password parameters for the public certificate filepath and the private key filepath. This is not a complete implementation but is a quick and simple fix that works for us. I can code something better too, but please suggest how it should be implemented to comply best to standards in this project.

check_cdot_quota.pl has an issue

Hi,
if a volume has a user quota but no disk limit, I get this error:
Argument "-" isn't numeric in multiplication (*) at ./check_cdot_quota.pl line 101.

I don't know how to fix it
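
The warning indicates that the disk limit arrives as the string "-" when no limit is set and is then used in a multiplication. A minimal sketch of a guard, assuming the limit is reported in KB and converted to bytes; `limit_in_bytes` and the factor 1024 are hypothetical and not taken from the plugin:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: convert a reported limit (assumed to be in KB) to bytes,
# or return undef when the quota has no numeric limit (reported as "-").
sub limit_in_bytes {
    my ($disk_limit_kb) = @_;
    return unless defined $disk_limit_kb && looks_like_number($disk_limit_kb);
    return $disk_limit_kb * 1024;
}

for my $raw ('-', '1048576') {
    my $bytes = limit_in_bytes($raw);
    print(defined($bytes) ? "$raw KB -> $bytes bytes\n" : "'$raw' -> no limit set\n");
}
```

Quotas without a limit could then be skipped or reported separately instead of being pushed through the percentage calculation.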

check_cdot_multipath - Can't call method "child_get" on an undefined value

Hi,
thank you for your scripts! We run them on ONTAP 9.6 with netapp-manageability-sdk-9.4, and on the master branch (2ac4070) we get the following error:

# check_cdot_multipath.pl --hostname 123.123.123.123 -u nagios -p **********
Can't call method "child_get" on an undefined value at /usr/local/bin/nagios/netapp/netapp-cdot-nagios/check_cdot_multipath.pl line 51.

If I can provide some debug output, please, let me know how.

Perfdata output of check_cdot_volume contains script name

Hi,

I'd like to ask why the performance data output contains the script name, surrounded by "::".
Example:
Vol_vSphere/vol_vSphere_nfs_01::check_cdot_volume_usage::space_used
Vol_block/block_ROOT::check_cdot_volume_usage::space_used

This makes creating graphs from the performance data a bit awkward.
Graphite seems to escape this nicely:
[image]

But pnp4nagios has problems with it. I haven't dug into why, but there are no pnp graphs for this check.

Best regards and thanks for the great plugins!

Too many same RRDs files at check_cdot_lun.pl

Hello

I have a problem with plugin "check_cdot_lun.pl".

In March, I was asked to implement Icinga1 checks for NetApp monitoring.
I downloaded the latest plugins ("netapp-cdot-nagios-master") and the latest NetApp SDK and enabled monitoring.

Running the command manually works:

./check_cdot_lun.pl --hostname xx.xx.yy.zz --username icingaadmin --password xxxxxx --size - warning 98 --size - critical 99 --perf

The Netapp host hosts over 200 volumes.

And here I get problem with "check_cdot_lun.pl".

After some time I found that the performance data folder for this NetApp host contains a very large number of RRD files and the file system fills up quickly.
Currently there are over 25500 files for this NetApp host, taking up 9 GB of disk space.

After a short analysis I found that for one volume alone there are already 259 RRD files within one week, and with every new day even more files are added.

Example.
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 21 13:39 Netapp_lun ._ vol_server_db1_server_db01_ (Usage__427956.87890625_512071.875_MB; 83.574%) | vol_server_db1_server_db01.rrd
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 21 13:39 Netapp_lun . vol_server_db1_server_db01 (Usage__427986.7421875_512071.875_MB; 83.579%) | vol_server_db1_server_db01.rrd
..
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 23 00:29 Netapp_lun . vol_server_db1_server_db01 (Usage__428173.83203125_512071.875_MB; 83.616%) | vol_server_db1_server_db01.rrd
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 23 01:56 Netapp_lun . vol_server_db1_server_db01 (Usage__428173.91015625_512071.875_MB; 83.616%) | vol_server_db1_server_db01.rrd
..
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 25 11:49 am Netapp_lun . vol_server_db1_server_db01 (Usage__429678.859375_512071.875_MB; 83.910%) | vol_server_db1_server_db01.rrd
-rw-rw-r-- 1 icingaadmin icingaadmin 384952 May 25 11:20 Netapp_lun . vol_server_db1_server_db01 (Usage__429678.890625_512071.875_MB; _83.910%) | _vol_server_db1_server_db01.rrd

In total (in this case) there are 259 files for one volume, from 21.05 to 25.05, and they differ only in the "Usage__xxxx" part.
It looks like this for every volume (some have fewer, others even more RRD files).

Is it possible to switch off the "Performance Data" function here?

Thx
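
The file names suggest that the measured usage ends up in the name from which the RRD file is derived, so every new value produces a new file. Whether that name comes from the plugin's perfdata label or from the service definition is not clear from the listing. A minimal sketch, assuming it is the perfdata label: the label is built from the LUN name only and the value follows the "=" sign, in the format 'label'=value[UOM];warn;crit. `perfdata_entry` is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: format one perfdata entry as 'label'=value[UOM];warn;crit.
# The label is built from the LUN name only, never from the measured value.
sub perfdata_entry {
    my ($label, $value, $uom, $warn, $crit) = @_;
    return sprintf "'%s'=%s%s;%s;%s", $label, $value, $uom, $warn, $crit;
}

# Two consecutive measurements of the same LUN share one label.
print perfdata_entry('vol_server_db1_server_db01', '83.574', '%', 98, 99), "\n";
print perfdata_entry('vol_server_db1_server_db01', '83.579', '%', 98, 99), "\n";
```

With a stable label, a grapher that names its files after the label would keep writing to a single RRD file per LUN.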

Use of uninitialized value

Hello,

I get the following plugin output but I don't know why. This only happens on one cdot-node

Use of uninitialized value $state in string ne at /usr/lib/nagios/plugins/netapp-cdot-nagios/check_cdot_interfaces.pl line 126.
Use of uninitialized value $state in string ne at /usr/lib/nagios/plugins/netapp-cdot-nagios/check_cdot_interfaces.pl line 126.
OK: All IFGRP fully active

I don't know why this happens.
cdot Version is 8.3.2P1. All other clusters have a newer release. Maybe that's the reason

Maybe you can help me with that output.
Thanks
Greetings

Dumper Undefined

I ran check_cdot_clusterlinks.pl and received the following error:

user@server1:/usr/lib/nagios/plugins$ ./check_cdot_clusterlinks.pl --hostname netapp --username monitoruser --password pw
Undefined subroutine &main::Dumper called at ./check_cdot_clusterlinks.pl line 117.

after a little digging and a hint from http://www.perlmonks.org/bare/?node_id=430132 I added

use Data::Dumper;

after the other use statements, and it corrected the issue:

user@server1:/usr/lib/nagios/plugins$ ./check_cdot_clusterlinks.pl --hostname netapp --username monitoruser --password pw
OK: all clusterlinks up

Should the Data::Dumper be included there, or is there something wrong somewhere else in my install?

check_cdot_efficiency.pl --help broken

Getting this:

./check_cdot_efficiency.pl  --help
Got a 0-length file from ./check_cdot_efficiency.pl via Pod::Perldoc::ToTerm!?

 at /usr/bin/perldoc line 13.

UNKNOWN in Zapi invoke cannot connect to socket

Hi,

I am experiencing the following plugin output with all check_cdot commands, using NetApp-manageability-SDK 9.7.x-9.8.x.

UNKNOWN: in Zapi::invoke, cannot connect to socket

This is presented when running as Icinga, but when executed from command line, it works.

Any ideas?

transport type is HTTPS

embedded perl interpreter

The script(s) didn't work out of the box on an Ubuntu 14.04 machine.
check_cdot_aggregates.pl worked when I ran it in bash but not when Nagios ran it with its embedded Perl interpreter (epn).
I don't know how to fix the code, or whether this comes from code in the NetApp manageability SDK.
What helped was adding the line "# nagios: -epn" to the top of the script.
I haven't used GitHub so far. Should I patch the scripts with this line and create a pull request?

Used NetApp SDK / API 5.4

What type of user is needed?

I am wondering what type of user is needed on my NetApp cDot and 7-Mode systems to use these plugins and what permissions they need to run them.
Thanks!

How to access NetApp

Hi,

I have the following question: how do the plugins access the information from the NetApp?
Do you use a normal SNMP query?

Fix Performance-data for Icinga2

Hi,
the volume check can send performance data as well, but the function doesn't work correctly. Only one pipe symbol "|" is allowed, but with "--perf" there are plenty of them. Maybe you can have a look at it.
Thanks and regards,
pgress
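
As an illustration of the single-pipe layout described above, here is a minimal sketch that joins all perfdata entries with spaces behind one "|". `plugin_output` and the labels are hypothetical and not taken from the volume check:

```perl
use strict;
use warnings;

# Hypothetical helper: build the final plugin line with a single "|" separator.
# All perfdata entries are joined with spaces after that one separator.
sub plugin_output {
    my ($status_text, @perfdata) = @_;
    return $status_text unless @perfdata;
    return $status_text . ' | ' . join(' ', @perfdata);
}

print plugin_output(
    'OK: 2 volumes checked',
    "'vol1_space_used'=10B;80;90",
    "'vol2_space_used'=20B;80;90",
), "\n";
```

Collecting the entries in a list and emitting them once at the end keeps the number of pipe symbols at one regardless of how many volumes are checked.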

Failed test query: NaServer::parse_xml - Error in parsing xml:

I am using ONTAP 9.3P2. Earlier versions such as ONTAP 9.2 work fine, but I get the error below after adding a system running Data ONTAP 9.3P2.

Failed test query: NaServer::parse_xml - Error in parsing xml:
syntax error at line 1, column 49, byte 49:

================================================^

<title>400 Bad Request</title> at /usr/lib64/perl5/XML/Parser.pm line 187

Feature request - show reconstruction-percentage

Maybe others like this too and maybe I'm overlooking some issues that this could cause.
My feature request was:
Add rebuild percentage to check_cdot_rebuild to be able to track progress.
I did this quick & possibly dirty hack in patch file format:
--- check_cdot_rebuild.pl 2022-02-28 11:17:27.142714000 +0100
+++ check_cdot_rebuild_percent.pl 2022-02-28 13:09:22.043997000 +0100
@@ -78,10 +78,12 @@
foreach my $rg (@rgs) {

             my $rg_reconstruct = $rg->child_get_string( "is-reconstructing" );
  •            my $rg_reconstruct_percent = $rg->child_get_string( "reconstruction-percentage" );
    
               if ($rg_reconstruct eq "true") {
    
  •                unless (grep(/$aggr_name/, @failed_aggrs)) {
    
  •                    push( @failed_aggrs, $aggr_name );
    
  •                    push( @failed_aggrs, $aggr_name.":".$rg_reconstruct_percent );
                   }
               }
           }
    

check_cdot_disk.pl: Write the owner-node-name to the failed disks.

Hi,

We tried out your scripts to check our NetApp-Metro-Cluster. First, thanks for the great work.
During our first tests we noticed that the output is not very clear when one or more disks have failed.

This is an output in Icinga:
image

In this case we don't know which disk in which NetApp of our cluster has the problem. I saw that at line 114 you get the ownership of the disk with
my $owner = $disk->child_get( "disk-ownership-info" );
but you don't put this information in the output.

It would be very nice if you could add this, e.g. in line 117 with the following (or a better solution):
push @disk_list, $disk->child_get_string( "disk-name" )." (".$owner->{children}->[5]->{content}.")";
(I hope this is always element 6 in the array.)

My colleagues, the NetApp admins, explained to me that it is better to write the "owner-node-name" instead of the home-node-name.

thank you!

Use of uninitialized value $ok_msg(..) in check_cdot_aggr.pl

When warning/critical thresholds for check_cdot_aggr.pl are exceeded:

[root@netapp_cluster ~]# /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname netapp_cluster --username nagios --password $PASS --warning 80 --critical 90 -aggr of_n1_aggr001_rP
CRITICAL: of_n1_aggr001_rP (95%)

Use of uninitialized value $ok_msg in concatenation (.) or string at /usr/local/nagios/libexec/check_cdot_aggr.pl line 167.
OK:
[root@nagios ~]# echo $?
2
[root@nagios ~]#

This strange behavior occurs for WARNING states too.

Everything works fine when results are within thresholds:

[root@nagios ~]# /usr/local/nagios/libexec/check_cdot_aggr.pl --hostname netapp_cluster --username nagios --password $PASS --warning 96 --critical 97 -aggr of_n1_aggr001_rP
OK: of_n1_aggr001_rP (95%)
[root@nagios ~]# echo $?
0
[root@nagios ~]#
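
The output above shows two status lines, with the empty "OK:" line triggering the warning because `$ok_msg` was never set. A minimal sketch of one way to avoid both, assuming the messages are collected per state and only the worst state is printed; `summarize` is a hypothetical helper, not code from check_cdot_aggr.pl:

```perl
use strict;
use warnings;

# Hypothetical helper: pick exactly one status line and the matching exit code.
# Each argument is an array reference of per-aggregate messages.
sub summarize {
    my ($critical, $warning, $ok) = @_;
    return (2, 'CRITICAL: ' . join(', ', @$critical)) if @$critical;
    return (1, 'WARNING: '  . join(', ', @$warning))  if @$warning;
    # join() on an empty list yields '', so no uninitialized-value warning here.
    return (0, 'OK: ' . join(', ', @$ok));
}

my ($code, $line) = summarize(['of_n1_aggr001_rP (95%)'], [], []);
print "$line\n";
# A real plugin would end with: exit $code;
```

Returning early for the worst state means the OK line is only built when there is nothing critical or warning to report.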

Add DIMM status check

Hello,
It would be great to add DIMM status check via the check_cdot_global.pl plugin:
Output example:

>system controller memory dimm show
			               DIMM    UECC      CECC      CPU                 Slot
			Node           Name    Count     Count     Socket     Channel  Number  Status
			-------------- ------- --------  --------  ---------  -------  ------  ------
			node1          DIMM-1         0         0          0       0        0  ok
			node1          DIMM-NV1       0         0          0       1        1  ok
			node2          DIMM-1         1         0          0       0        0  ok
			node3          DIMM-NV1       0         0          0       1        1  ok
			4 entries were displayed.

BR,
Yannick
