Good Practices: How to Develop a Nagios Monitoring Plugin

As you know, Centreon provides a number of plugins for certain pollers. However, you may need to monitor specific equipment on your platform that the Centreon team does not know about. Searching the Internet may turn up a suitable plugin, but sometimes it will not meet all your needs, and you will have to develop your own. No worries: you can do it easily, even if you are not a professional developer. The aim of this article is to give you best practices for developing a monitoring plugin.

Return codes

A plugin must send a return code. The monitoring engine interprets this code as the result of the plugin execution; this result is called the “status”. The two tables below summarize the return codes for hosts and services:

Hosts:

Plugin return code    Host status
0                     UP
1                     DOWN
Other                 Maintains last known state

Services:

Plugin return code    Service status
0                     OK
1                     WARNING
2                     CRITICAL
3                     UNKNOWN
Other                 CRITICAL: unknown return code
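
As a minimal sketch of how a plugin might turn a measured value into one of the service statuses above, the hypothetical helper status_for below compares a number against two thresholds. The helper name, the local %ERRORS hash and the "higher is worse" rule are assumptions made for illustration; they are not part of any Nagios or Centreon API.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Service return codes, as listed in the table above.
my %ERRORS = (OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3);

# Hypothetical helper: map a measured value to a status name.
# Assumes "higher is worse" and plain numeric thresholds.
sub status_for {
    my ($value, $warning, $critical) = @_;
    return 'UNKNOWN' unless defined $value && $value =~ /^-?\d+(?:\.\d+)?$/;
    return 'CRITICAL' if $value >= $critical;
    return 'WARNING'  if $value >= $warning;
    return 'OK';
}

my $status = status_for(92, 90, 95);
print "$status (return code $ERRORS{$status})\n";
# A real plugin would finish with: exit $ERRORS{$status};
```

With a value of 92 and thresholds of 90 and 95, the sketch prints "WARNING (return code 1)".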

 

Output message of the plugin

The output message helps the user understand the result. A plugin has two types of output:

  • OUTPUT: displayed on the real-time monitoring screens for hosts and services. Its size is limited to 255 characters.
  • LONGOUTPUT: displayed on the details page of a host or service. Its size is limited to 8192 characters.

Performance data are optional. However, if you want a graph showing how the result evolves over time, the plugin must generate performance data.
Performance data are written after the “|” (pipe) character. On a French (AZERTY) keyboard, this character is typed with AltGr+6.
The performance data should be formatted as:
'label'=value[UOM];[warn];[crit];[min];[max]

  • UOM: unit of measurement (bytes, bits/s, volts, …)
  • warn: WARNING threshold
  • crit: CRITICAL threshold
  • min: minimum value the metric can take
  • max: maximum value the metric can take
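
The format above can be sketched as a small formatting routine. The helper name perfdata and its named parameters are hypothetical. The sketch quotes the label only when it contains spaces and drops trailing empty fields, which is how the Nagios plugin guidelines describe the format as far as we know; check them before relying on these details.

```perl
use strict;
use warnings;

# Hypothetical helper: format one metric as
# 'label'=value[UOM];[warn];[crit];[min];[max]
sub perfdata {
    my (%m) = @_;
    my $label = $m{label};
    # Quote the label only when it contains spaces.
    $label = "'$label'" if $label =~ /\s/;
    my $uom = defined $m{uom} ? $m{uom} : '';
    # Missing thresholds and limits become empty fields.
    my @fields = map { defined $m{$_} ? $m{$_} : '' } qw(warn crit min max);
    my $metric = join ';', "$label=$m{value}$uom", @fields;
    $metric =~ s/;+$//;    # drop trailing empty fields
    return $metric;
}

my $cpu = perfdata(label => 'cpu', value => 98, uom => '%',
                   warn => 80, crit => 95, min => 0, max => 100);
print "CPU Usage 98%|$cpu\n";
```

This prints "CPU Usage 98%|cpu=98%;80;95;0;100".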

Example: GPING OK - rtt min/avg/max/mdev = 0.021/0.541/0.598/0.541 ms|time=0.541344ms pl=0%

This produces two curves on the graph:

  • “time”, expressed in milliseconds (ms)
  • “pl”, the percentage of lost packets (%)

These performance data values are called metrics.
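
To make the split between the message and the metrics concrete, here is a simplified sketch that extracts the metrics from the example output. The function name parse_metrics is hypothetical, and the parser assumes unquoted labels without spaces; it is not how Centreon or Nagios parse performance data internally.

```perl
use strict;
use warnings;

my $output = 'GPING OK - rtt min/avg/max/mdev = 0.021/0.541/0.598/0.541 ms'
           . '|time=0.541344ms pl=0%';

# Simplified parser: assumes unquoted labels without spaces.
sub parse_metrics {
    my ($line) = @_;
    # Everything after the first pipe is performance data.
    my (undef, $perfdata) = split /\|/, $line, 2;
    my %metrics;
    return %metrics unless defined $perfdata;
    for my $item (split ' ', $perfdata) {
        # label=value followed by an optional unit, up to the first ';'
        next unless $item =~ /^([^=]+)=(-?[\d.]+)([^;]*)/;
        $metrics{$1} = { value => $2, uom => $3 };
    }
    return %metrics;
}

my %metrics = parse_metrics($output);
for my $name (sort keys %metrics) {
    printf "%s: value=%s unit=%s\n",
        $name, $metrics{$name}{value}, $metrics{$name}{uom};
}
```

The sketch prints one line per metric: "pl: value=0 unit=%" followed by "time: value=0.541344 unit=ms".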

New possibilities are available if you use Centreon Broker in a version later than 2.3. Here is an example of output:

CPU Usage 98%|c[cpu]=98%;80;95;0;100

In this example, the c[...] prefix around the label specifies the type of data source (DS) directly in the plugin. We encourage you to read the RRDtool documentation about data sources and how they work (http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html).

Language used

In this article, the plugins are written in Perl. Why use this language to develop a monitoring plugin?
Some answers:

  • an interpreted language that is easy to learn
  • generally known by administrators
  • many free libraries available on the web
  • often installed by default on Unix and Linux systems
  • makes it easy to run system commands and retrieve their results
  • advanced character string processing
  • relatively efficient
  • Perl interpreter embedded in Nagios: optimized performance
  • Perl Connector available with Centreon 2.4 and Centreon Engine 1.3: even better performance.

Syntax of a Perl plugin

#!/usr/bin/perl -w
use strict;
# this is a comment
printf "%s\n", "Hello World";

The first line of the script (#!/usr/bin/perl) is called the shebang or hashbang. The -w option on the shebang line enables warnings, and “use strict;” requires variables to be declared and rejects several unsafe constructs. Both help with debugging.
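
As a small illustration of the kind of mistake “use strict;” catches, the sketch below compiles a snippet containing a misspelled variable name. The snippet and variable names are made up for the example; the string eval is only there so that the compile error can be shown without stopping the script.

```perl
use strict;
use warnings;

# The snippet misspells $total as $totl on purpose.
my $snippet = 'use strict; my $total = 1; $totl = 2; 1;';

# A string eval compiles the snippet without aborting this script.
my $compiled = eval $snippet;
my $error    = $@;

print "strict rejected the snippet: $error" unless $compiled;
```

Without “use strict;”, the misspelled variable would silently become a new global variable and the bug would go unnoticed.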

Plugin Options

#!/usr/bin/perl -w
use strict;
# Getopt to retrieve the plugin parameters
use Getopt::Long;
# Use the Nagios API
use lib "/usr/lib/nagios/plugins";
use utils qw($TIMEOUT %ERRORS &print_revision &support);
# Variable declarations
my %OPTION = ('help' => undef, 'warning' => 90, 'critical' => 95,
              'host' => undef);
# Retrieve the values of the parameters
Getopt::Long::Configure('bundling');
GetOptions(
    "h"   => \$OPTION{'help'},     "help"       => \$OPTION{'help'},
    "H=s" => \$OPTION{'host'},     "hostname=s" => \$OPTION{'host'},
    "w=s" => \$OPTION{'warning'},  "warning=s"  => \$OPTION{'warning'},
    "c=s" => \$OPTION{'critical'}, "critical=s" => \$OPTION{'critical'});
if (defined($OPTION{'help'})) {
    # Display the help
    exit $ERRORS{'UNKNOWN'};
}
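
The same option handling can be sketched in a form that is easy to try out without Nagios installed. The helper names parse_args and usage are hypothetical. The sketch relies on GetOptionsFromArray and on the "short|long" alias syntax of Getopt::Long, and it leaves out utils.pm on purpose.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a list of arguments instead of @ARGV,
# which makes the option handling easy to try out.
sub parse_args {
    my @args = @_;
    my %option = (help => undef, warning => 90, critical => 95, host => undef);
    # With bundling, -h and -H are distinct single-letter options.
    Getopt::Long::Configure('bundling');
    my $ok = GetOptionsFromArray(\@args,
        'h|help'       => \$option{help},
        'H|hostname=s' => \$option{host},
        'w|warning=s'  => \$option{warning},
        'c|critical=s' => \$option{critical},
    );
    return $ok ? \%option : undef;
}

# Hypothetical usage text for the -h/--help branch.
sub usage {
    return "Usage: $0 -H <host> [-w <warning>] [-c <critical>]\n";
}

my $option = parse_args('-H', '192.0.2.10', '-w', '80');
printf "host=%s warning=%s critical=%s\n",
    @{$option}{qw(host warning critical)};
```

Here the critical threshold keeps its default, so the sketch prints "host=192.0.2.10 warning=80 critical=95".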

The utils module also gives you access to the following:

$TIMEOUT
&print_revision("check_machin.pl", "1.0");
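
A common use of $TIMEOUT is to stop a check that hangs. The following is a minimal sketch of that idea, assuming a hypothetical helper run_with_timeout; $TIMEOUT is set locally here so the example runs without utils.pm, whereas a real plugin would take it from that module.

```perl
use strict;
use warnings;

# Assumption: in a real plugin $TIMEOUT comes from utils.pm; it is set
# locally here so the sketch runs without Nagios installed.
my $TIMEOUT = 15;

# Hypothetical helper: run a check and give up after $seconds seconds.
sub run_with_timeout {
    my ($seconds, $check) = @_;
    my $value = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        my $result = $check->();
        alarm(0);
        $result;
    };
    my $error = $@;
    alarm(0);    # make sure no alarm is left pending
    return ('UNKNOWN', "no response after $seconds seconds")
        if $error eq "timeout\n";
    return ('UNKNOWN', "check failed: $error") if $error;
    return ('OK', $value);
}

my ($status, $message) =
    run_with_timeout($TIMEOUT, sub { return 'device answered' });
print "$status: $message\n";
```

A check that takes longer than the limit comes back as UNKNOWN instead of blocking the monitoring engine.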

For the plugin return code, you should use:

exit($ERRORS{'OK'});
exit($ERRORS{'WARNING'});
exit($ERRORS{'CRITICAL'});
exit($ERRORS{'UNKNOWN'});

These are clearer than exit 0, exit 1, exit 2, etc.
If you want to query a remote device via the SNMP protocol, you should use the Net::SNMP library, which is widely used in Nagios plugins. It saves you from running a system command in your plugin, which would make the plugin less efficient.

Example of a GET using the Perl library Net::SNMP:

#!/usr/bin/perl -w
use strict;
use Net::SNMP;
use lib "/usr/lib/nagios/plugins";
use utils qw($TIMEOUT %ERRORS &print_revision &support);
# 'host' is expected to be filled in from the command line, as in the options example
my %OPTION = ('help' => undef, 'warning' => 90, 'critical' => 95,
              'host' => undef, 'snmpcommunity' => 'public', 'snmpversion' => 1);
my ($session, $error) = Net::SNMP->session(
    -hostname  => $OPTION{'host'},
    -community => $OPTION{'snmpcommunity'},
    -version   => $OPTION{'snmpversion'});
if (!defined($session)) {
    print("UNKNOWN: $error\n");
    exit($ERRORS{'UNKNOWN'});
}
my $snmp_oid = "1.3.6.1.2.1.1.5.0";
# Example of a get
my $resultOID = $session->get_request(-varbindlist => [$snmp_oid]);
if (!defined($resultOID)) {
    printf("UNKNOWN: %s.\n", $session->error);
    $session->close;
    exit($ERRORS{'UNKNOWN'});
}
my $result = $resultOID->{$snmp_oid};
printf("OID : %s, DESC : %s\n", $snmp_oid, $result);
$session->close;

Example of a WALK using the Perl library Net::SNMP:

#!/usr/bin/perl -w
use strict;
use Net::SNMP;
use lib "/usr/lib/nagios/plugins";
use utils qw($TIMEOUT %ERRORS &print_revision &support);
# 'host' is expected to be filled in from the command line, as in the options example
my %OPTION = ('help' => undef, 'warning' => 90, 'critical' => 95,
              'host' => undef, 'snmpcommunity' => 'public', 'snmpversion' => 1);
my ($session, $error) = Net::SNMP->session(
    -hostname  => $OPTION{'host'},
    -community => $OPTION{'snmpcommunity'},
    -version   => $OPTION{'snmpversion'});
if (!defined($session)) {
    print("UNKNOWN: $error\n");
    exit($ERRORS{'UNKNOWN'});
}
my $snmp_oid = "1.3.6.1.2.1";
# Example of a walk: -baseoid takes the OID as a string
my $resultOID = $session->get_table(-baseoid => $snmp_oid);
if (!defined($resultOID)) {
    printf("UNKNOWN: %s.\n", $session->error);
    $session->close;
    exit($ERRORS{'UNKNOWN'});
}
foreach my $key (keys %$resultOID) {
    printf("OID : %s, Desc : %s\n", $key, $$resultOID{$key});
}
$session->close;
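
Because the walk returns a hash, keys hands back the OIDs in no particular order. The sketch below shows one way to print them in numeric OID order. The sample hash and its values are made up and only mimic the shape of a walk result, and oid_cmp is a hypothetical helper. The Net::SNMP documentation also describes an oid_lex_sort function for this purpose, which is left out here so the example runs without the module.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the hash reference returned by a walk;
# the values are made up for illustration.
my $resultOID = {
    '1.3.6.1.2.1.2.2.1.2.10' => 'eth1',
    '1.3.6.1.2.1.2.2.1.2.2'  => 'eth0',
    '1.3.6.1.2.1.2.2.1.2.1'  => 'lo',
};

# Compare two OIDs component by component, numerically.
sub oid_cmp {
    my ($left, $right) = @_;
    # Ignore the empty element produced by an optional leading dot.
    my @l = grep { $_ ne '' } split /\./, $left;
    my @r = grep { $_ ne '' } split /\./, $right;
    while (@l && @r) {
        my $diff = shift(@l) <=> shift(@r);
        return $diff if $diff;
    }
    return @l <=> @r;    # the shorter OID sorts first
}

foreach my $key (sort { oid_cmp($a, $b) } keys %$resultOID) {
    printf("OID : %s, Desc : %s\n", $key, $resultOID->{$key});
}
```

A plain string sort would place the OID ending in .10 before the one ending in .2; the numeric comparison keeps the indexes in the order 1, 2, 10.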

As another example, if you want to connect to an Oracle(r) database, you can use the Perl library DBD-Oracle instead of sqlplus in silent mode. Whenever you develop a plugin that connects to a remote device or server, whatever the protocol, first check on CPAN (Comprehensive Perl Archive Network) whether a library exists that makes your job easier and your plugin more efficient.

Some advice about options

1. Use standard options (-h/--help, -H/--host, -w/--warning, -c/--critical, -C/--community, -p/--port, -u/--user, -P/--password, …)
2. Display detailed help when the plugin is called with -h/--help:
   printf "-H|--host Host name or IP address\n";
3. Check that all options are correctly defined, for example:
   if (!defined($opt_H)) { help(); exit $ERRORS{'UNKNOWN'}; }
4. Use the Nagios API (utils.pm)
5. Use the Net::SNMP API to check equipment or a server via the SNMP protocol
6. Avoid using external programs:
   • grep/sed/awk/cut/… are FORBIDDEN! Perl can do it
   • snmpget/snmpwalk are FORBIDDEN! Perl can do it
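
To illustrate the last point, here is a short sketch of Perl built-ins doing the work of grep, awk/cut and sed. The sample lines are made up; in a plugin they would come from a file or from a library call.

```perl
use strict;
use warnings;

# Hypothetical sample lines, e.g. read from a file.
my @lines = ('eth0 up 1000', 'eth1 down 0', 'lo up 0');

# Instead of grep: keep the lines whose state is "up".
my @up = grep { /\sup\s/ } @lines;

# Instead of awk '{print $1}' or cut: take the first field.
my @names = map { (split ' ', $_)[0] } @up;

# Instead of sed 's/^eth/net/': substitute on a copy of each name.
my @renamed = map { (my $copy = $_) =~ s/^eth/net/; $copy } @names;

print "up: @names\n";
print "renamed: @renamed\n";
```

The sketch prints "up: eth0 lo" and "renamed: net0 lo" without starting a single external process.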

Developing a plugin is not that easy, and before putting it into production you are strongly advised to test it. To start, you can check the syntax of your plugin like this:
perl -c masonde.pl

Once you have checked the plugin syntax, test it on the command line as the Nagios user. Running the test from the Nagios user account matters because some plugins write temporary files: if you run them as “root”, those files belong to root, and once in production the plugin stops working because it is executed by the “nagios” user, which cannot modify them. The plugin then returns an error message because the nagios user cannot write to the buffer file.
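
One way to catch this mistake early is to have the plugin report which account it runs under. The sketch below is only an illustration: the helper names current_user and user_warning are hypothetical, and the account name 'nagios' is an assumption that should be replaced by the user your monitoring engine actually runs as.

```perl
use strict;
use warnings;

# Name of the account the script is currently running as.
sub current_user {
    my $name = getpwuid($>);    # $> is the effective user ID
    return defined $name ? $name : 'uid ' . $>;
}

# Hypothetical guard: return a warning when the script is not run as
# the expected user, or an empty string when it is.
sub user_warning {
    my ($expected) = @_;
    my $user = current_user();
    return '' if $user eq $expected;
    return "running as '$user', not '$expected': "
         . "temporary files may get the wrong owner\n";
}

# Assumption: the monitoring engine runs plugins as 'nagios'.
print STDERR user_warning('nagios');
```

When the test is run from the expected account, nothing is printed.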
There are also guidelines for Nagios plugin development, with documentation available here: http://nagiosplug.sourceforge.net/developer-guidelines.html
