NRPE (Nagios Remote Plugin Executor) allows you to execute Nagios plugins remotely on other Linux/Unix machines. This allows you to monitor remote machine metrics (disk usage, CPU load, etc.). NRPE can also communicate with some of the Windows agent addons, so you can execute scripts and check metrics on remote Windows machines as well.
This HOWTO will show you how to install and configure NRPE on CentOS.
NOTE: For this tutorial you will need EPEL, a high-quality set of additional packages for Enterprise Linux. Please see their wiki for the steps to install it on your CentOS box.
First of all you need to install NRPE and a few Perl modules on the remote machine.
yum -y install perl-Sys-Statistics-Linux nrpe
This will create the /etc/nrpe.d directory. We need to create a config file for nrpe in this directory called nrpe.cfg with the following content:
log_facility=local1
pid_file=/var/run/nrpe.pid
server_port=7000
#server_address=127.0.0.1 #Comment out to bind to all addresses
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=x.x.x.x
dont_blame_nrpe=0
# command_prefix=/usr/bin/sudo
debug=0
command_timeout=60
connection_timeout=300
#allow_weak_random_seed=1
#include=
#include_dir=
# Command definitions
command[check_linux_cpu]=/usr/lib64/nagios/plugins/check_linux_stats.pl -C -w 80 -c 100 -s 5
command[check_root_partition_space]=/usr/lib64/nagios/plugins/check_linux_stats.pl -D -w 20 -c 10 -s 5 -u % -p /
command[check_boot_partition_space]=/usr/lib64/nagios/plugins/check_linux_stats.pl -D -w 20 -c 10 -s 5 -u % -p /boot
command[check_linux_memory]=/usr/lib64/nagios/plugins/check_linux_stats.pl -M -w 80,50 -c 100,70
command[check_linux_uptime]=/usr/lib64/nagios/plugins/check_linux_stats.pl -U -w 1440 #Uptime in minutes (1440 is 24 hours)
command[check_linux_load]=/usr/lib64/nagios/plugins/check_linux_stats.pl -L -w 10,8,5 -c 20,18,15
The server_port config option tells the nrpe daemon which port to listen on for incoming connections from your Nagios server. You can set this to any free port, as long as the check_nrpe command on the Nagios server uses the same one. For extra security, you can restrict connections to specific hosts by setting the allowed_hosts option to the IP address of your Nagios server. The dont_blame_nrpe option should in most cases be set to 0. This tells nrpe to reject command arguments sent by the remote client; accepting them can be a huge security risk. For safety's sake, keep this set to 0.
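As a quick sanity check on the options discussed above, you can read the values back from the config file with a small shell helper. This is only a sketch: the function names nrpe_cfg_get and nrpe_cfg_audit are made up for illustration and are not part of NRPE, and the helper assumes the simple key=value layout used in this HOWTO.

```shell
#!/bin/sh
# nrpe_cfg_get FILE KEY
# Print the value of the last uncommented "KEY=value" line in FILE.
# Hypothetical helper for illustration; not shipped with NRPE.
nrpe_cfg_get() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                               # skip comment lines
        $1 == key  { val = substr($0, length(key) + 2) }  # text after the first "="
        END        { if (val != "") print val }
    ' "$1"
}

# nrpe_cfg_audit FILE
# Complain if remote command arguments are enabled, otherwise report the port.
nrpe_cfg_audit() {
    if [ "$(nrpe_cfg_get "$1" dont_blame_nrpe)" != "0" ]; then
        echo "WARNING: dont_blame_nrpe is not 0"
        return 1
    fi
    echo "OK: dont_blame_nrpe=0, port $(nrpe_cfg_get "$1" server_port)"
}
```

With the config file from this HOWTO you would run it as: nrpe_cfg_audit /etc/nrpe.d/nrpe.cfg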
NOTE: There are two helper files required for the nrpe checks to work correctly: utils.pm (a Perl module) and utils.sh (a shell script). Both can be found in the plugins directory on the machine running Nagios. On my server they are located under /usr/lib64/nagios/plugins. On machines that don’t run a 64-bit operating system, you should copy these files to the correct directory, which is normally /usr/lib/nagios/plugins. You should also edit check_linux_stats.pl and change the use lib line to point to the correct location.
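The directory choice and the use lib edit described in the note above can be scripted. The following is a minimal sketch under a few assumptions: the function names are hypothetical, only x86_64 is treated as 64-bit, and the script is assumed to contain a line that starts with "use lib" (check your copy of check_linux_stats.pl before running anything like this against it).

```shell
#!/bin/sh
# plugin_dir_for_arch ARCH
# Map a machine architecture (as printed by "uname -m") to the plugin directory.
# Assumption: only x86_64 uses lib64; everything else falls back to lib.
plugin_dir_for_arch() {
    case "$1" in
        x86_64) echo /usr/lib64/nagios/plugins ;;
        *)      echo /usr/lib/nagios/plugins ;;
    esac
}

# set_use_lib FILE DIR
# Rewrite every line in FILE that starts with "use lib" so it points at DIR.
# The result is written back with cat so the file keeps its permissions.
set_use_lib() {
    sed "s|^use lib .*|use lib \"$2\";|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

On the remote machine this could be used as follows:
dir=$(plugin_dir_for_arch "$(uname -m)")
set_use_lib "$dir/check_linux_stats.pl" "$dir"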
For all the remote checks in this example I will be using the check_linux_stats.pl Perl script, which is a Nagios plugin to check Linux system performance (CPU, memory, load, disk usage, disk I/O, network usage, open files and processes). You should follow the link to read more about all the different options for this great plugin.
Once you have created the config file, you will need to start the nrpe service on the remote machine (the one you just installed nrpe on):
chkconfig nrpe on ; service nrpe start
Now onto the Nagios part of the configuration. In this part I will show you how to define the services that will be interacting with your remote CentOS box running nrpe.
In the nrpe.cfg file you will notice that I defined a few check commands. The name of each command is the part inside the [] brackets. For this example I will use check_linux_cpu, which checks the CPU usage of the machine. You will also notice that I defined the -w and -c options (the thresholds at which this service enters a warning or critical state) on the machine running nrpe. This is because we set the dont_blame_nrpe config option to 0, which prevents the Nagios server from sending these as arguments when querying the CentOS box running nrpe.
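To see at a glance which command names a given nrpe.cfg exposes, you can pull the bracketed names out of the file. This is a small sketch; list_nrpe_commands is a hypothetical helper name, and it only looks at uncommented lines that start with command[.

```shell
#!/bin/sh
# list_nrpe_commands FILE
# Print the name inside the [] brackets of every uncommented command[...] line.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}
```

Running list_nrpe_commands /etc/nrpe.d/nrpe.cfg against the config from this HOWTO should print one name per line, starting with check_linux_cpu. These are the names you pass to check_nrpe with the -c option.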
On my Nagios server I then define the remote cpu check in /etc/nagios/objects/commands.cfg as follows:
define command{
    command_name check_linux_remote_cpu
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 7000 -c check_linux_cpu
}
And in the config file for the remote host like this:
define service {
    use generic-service
    host_name my.host.co.za
    service_description CPU
    check_command check_linux_remote_cpu
}
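Before relying on Nagios to schedule the check, it can help to run check_nrpe by hand from the Nagios server and look at its exit status. Nagios plugins report their state through the exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The helper below is a sketch with a made-up name (nagios_state_name) that simply translates the code into the state name.

```shell
#!/bin/sh
# nagios_state_name CODE
# Translate a plugin exit code into the Nagios state name.
# Anything outside 0-2 is reported as UNKNOWN (3 is the official UNKNOWN code).
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}
```

Assuming check_nrpe lives in the same plugins directory as on my server, a manual test would look like this:
/usr/lib64/nagios/plugins/check_nrpe -H my.host.co.za -p 7000 -c check_linux_cpu ; nagios_state_name $?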
After you reload your Nagios config, it will know how to query the server and execute the remote check_linux_stats.pl script. You can use the other command definitions in this HOWTO in the same way to monitor disk usage, memory, load averages and more.
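If you want a matching Nagios command definition for every check in nrpe.cfg, the definitions can be generated rather than typed by hand. The sketch below is one way to do it; gen_nagios_commands is a hypothetical helper, and the remote_ prefix it puts on each command_name is only an illustrative naming convention (it differs from the check_linux_remote_cpu name used above), so adjust it to taste.

```shell
#!/bin/sh
# gen_nagios_commands FILE PORT
# For every uncommented command[NAME]=... line in FILE (a copy of the remote
# nrpe.cfg), print a Nagios "define command" block that calls check_nrpe on
# PORT with -c NAME. The "remote_" prefix is an illustrative choice.
gen_nagios_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1" |
    while read -r name; do
        printf 'define command{\n'
        printf '    command_name remote_%s\n' "$name"
        printf '    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p %s -c %s\n' "$2" "$name"
        printf '}\n'
    done
}
```

For example, gen_nagios_commands nrpe.cfg 7000 prints one block per check, which you can review and then paste into /etc/nagios/objects/commands.cfg. You still need to add a service definition for each check you want Nagios to run.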
For more information, refer to the links in this HOWTO.