I have been using Nagios, an enterprise-class server and network monitoring system, for almost 10 years now and have yet to find another free and open source monitoring system that can beat it.
This article walks you through setting up a basic Nagios installation on a Solaris 10 system. For this example, I’m using Solaris 10 Update 6 (released October 2008) running in 32-bit mode in a VMware virtual machine. The hostname is “sol10vm”, but it will be different in your configuration. Other versions of Solaris and the Apache Web Server should work fine; I have run Nagios on everything from Red Hat 7.3 to Mac OS X.
Nagios installation prerequisites
This tutorial assumes that you have installed the GNU Compiler Collection (gcc) and GNU make from the Solaris 10 installation disc, and that the compiler is working correctly. In most cases, it’s just a matter of adding /usr/sfw/bin to your PATH environment variable. If you run “gcc” and “gmake” and get the following output, you are probably good to go.
sol10vm:/> gcc
gcc: no input files
sol10vm:/> gmake
gmake: *** No targets specified and no makefile found. Stop.
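A minimal shell sketch of that check, assuming a POSIX-style shell such as bash or ksh. The helper names path_prepend and check_tools are made up for this illustration. It adds /usr/sfw/bin to PATH only if it is not already there, then reports whether gcc and gmake can be found.

```shell
# Prepend directory $1 to PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Report whether each named tool can be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return $missing
}

path_prepend /usr/sfw/bin
check_tools gcc gmake || echo "fix your PATH before continuing"
```

To make the change permanent, the path_prepend call would typically go in your shell's startup file.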
For this demonstration, I am using the Apache Web Server packages provided by Steve Christensen’s SunFreeware project, specifically Apache 2.0.59 and its dependencies. These packages install under /usr/local, so make sure that /usr/local/bin is in your PATH and that /usr/local/ssl/lib can be found in your system library search path (use “crle”; see “man crle” for details).
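As a rough sketch of the crle step: the function name add_lib_dir is invented for this example, and the flags are used as the crle man page describes them (-l names a default library directory, -u updates the existing configuration instead of replacing it). Verify both against “man crle” on your own system before running this as root.

```shell
# Add directory $1 to the runtime linker's default library search path.
# On systems without crle, only print the command that would be run.
add_lib_dir() {
    if command -v crle >/dev/null 2>&1; then
        crle -u -l "$1"
    else
        echo "crle not found; would run: crle -u -l $1"
    fi
}

# Usage (as root):
#   add_lib_dir /usr/local/ssl/lib
# Running crle with no arguments afterwards prints the current configuration,
# which is a quick way to confirm that the directory was added.
```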
Once you have edited the file /usr/local/apache2/conf/httpd.conf and started the web server with /usr/local/apache2/bin/apachectl start, use your web browser to go to http://yourhostname. It should look something like this:
Downloading, compiling and installing Nagios
The first step in installing Nagios is to create a “nagios” user and group. The following commands show how to do this on a freshly installed Solaris system. In your case, the user ID might not be 100, but I recommend making the “nagios” group ID the same as the “nagios” user ID.
sol10vm:/> useradd -c "nagios user" -d /usr/local/nagios nagios
sol10vm:/> grep nagios /etc/passwd
nagios:x:100:1::/home/nagios:/bin/sh
sol10vm:/> groupadd -g 100 nagios
sol10vm:/> grep nagios /etc/group
nagios::100:
sol10vm:/> usermod -g nagios nagios
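If you want to double-check the result, the following sketch compares the numeric IDs directly. The function names id_field and ids_match and the PASSWD_FILE and GROUP_FILE override variables are assumptions made for this example, not part of Solaris or Nagios.

```shell
# Print field 3 (the numeric ID) of entry $1 in a passwd- or group-format file $2.
id_field() {
    grep "^$1:" "$2" | cut -d: -f3
}

# Succeed only if user $1 and group $1 both exist and share the same numeric ID.
ids_match() {
    uid=$(id_field "$1" "${PASSWD_FILE:-/etc/passwd}")
    gid=$(id_field "$1" "${GROUP_FILE:-/etc/group}")
    [ -n "$uid" ] && [ "$uid" = "$gid" ]
}

if ids_match nagios; then
    echo "nagios UID and GID match"
else
    echo "nagios UID and GID differ, or the user or group is missing"
fi
```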
As of January 2009, the latest version of Nagios is 3.0.6 and the Nagios plug-ins are at version 1.4.13. You can get both from the Nagios downloads page. Download and extract both archives to a location of your choice; the commands in this article are run from directories under /usr/local/src.
I prefer to keep my Nagios installation in its own directory, so we’ll pass the --prefix argument to the configure script, telling it where to install everything:
sol10vm:/usr/local/src/nagios-3.0.6> ./configure --prefix=/usr/local/nagios
Once the configure script completes without errors, type “gmake all” to compile the basic Nagios software and web CGIs. Then type “gmake install” to install everything. Once the installation is complete, run “gmake install-init” so that Nagios can start at system startup, then “gmake install-config” to install the sample configuration files.
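The build steps above can be chained so that a failure stops the sequence instead of scrolling past unnoticed. This is only a sketch: run_targets is a hypothetical helper name, and the MAKE variable is an assumed override that defaults to gmake.

```shell
# Run each make target in order, stopping at the first one that fails.
run_targets() {
    for target in "$@"; do
        "${MAKE:-gmake}" "$target" || {
            echo "target failed: $target" >&2
            return 1
        }
    done
}

# Usage, from the extracted Nagios source directory after ./configure:
#   run_targets all install install-init install-config
```

The same helper works for the plug-ins; there the targets would be the default build and install.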
Once Nagios itself has been compiled and installed, the next step is to repeat the process with the Nagios plugins, which provide enhanced system and service checks. After unzipping the source code archive, the configuration step is the same:
sol10vm:/usr/local/src/nagios-plugins-1.4.13> ./configure --prefix=/usr/local/nagios
Once the configuration script is complete, run “gmake” and “gmake install” to install the plug-ins in the directory that was created when you installed the base Nagios package. In addition, you must add
/usr/local/nagios/lib to your system library search path using the “crle” command as you did with
/usr/local/lib. If this step is omitted, it will cause errors with some of the plugins.
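One way to spot the library errors mentioned above is to ask ldd which shared libraries each plug-in fails to resolve. This is a sketch under assumptions: missing_libs is a made-up helper, and the libexec directory is where plug-ins built with this prefix usually land, which the steps above do not state explicitly.

```shell
# List shared libraries that the runtime linker cannot resolve for binary $1.
# Prints nothing and returns non-zero when every library is found.
missing_libs() {
    ldd "$1" 2>/dev/null | grep "not found"
}

# Assumed plug-in location; override PLUGIN_DIR if yours differs.
PLUGIN_DIR=${PLUGIN_DIR:-/usr/local/nagios/libexec}

for plugin in "$PLUGIN_DIR"/check_*; do
    [ -f "$plugin" ] || continue            # skip if the directory is empty or absent
    if missing_libs "$plugin" >/dev/null; then
        echo "unresolved libraries: $plugin"
    fi
done
```

Plug-ins written as scripts are simply skipped by this check, since ldd has nothing to report for them.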
Configuring Apache for Nagios
For this example we will not configure Nagios for HTTP user authentication. This makes the tutorial easier, but it should not be used in a production environment. Once you have gone through this tutorial and understood how things are set up, read the official Nagios documentation and change your configuration to implement user authentication.
To configure Apache for use with Nagios, add the following directives to the Apache configuration file you edited earlier:
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
Alias /nagios /usr/local/nagios/share

<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

<Directory "/usr/local/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
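Before restarting, it can help to have Apache syntax-check the edited file so that a typo does not take the web server down. A small sketch, assuming the apachectl path used in this article; reload_apache is a hypothetical helper name and APACHECTL is an assumed override variable.

```shell
# Path taken from this article; override APACHECTL for a different install.
APACHECTL=${APACHECTL:-/usr/local/apache2/bin/apachectl}

# Syntax-check the configuration first, and only reload Apache if the check passes.
reload_apache() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" graceful
    else
        echo "configuration has errors; Apache was not restarted" >&2
        return 1
    fi
}

# Usage (as root):
#   reload_apache
```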
Once Apache is configured for Nagios, restart the web server with /usr/local/apache2/bin/apachectl graceful, or by running /usr/local/apache2/bin/apachectl stop and then starting the server again.
Even though Nagios is not yet fully configured and started, you should be able to navigate to http://yourhostname/nagios in a web browser and see a screen like this:
Nagios has a number of configuration files. The ones edited in this tutorial are located in /usr/local/nagios/etc and /usr/local/nagios/etc/objects.
The first file we need to edit is
/usr/local/nagios/etc/cgi.cfg. In this file, change the value of “use_authentication” to 0. For production use, you will want to re-enable it after reading the documentation on HTTP user authentication.
The second file to edit is /usr/local/nagios/etc/nagios.cfg. In this file, change both “check_external_commands” and “use_syslog” to 0. The first change prevents someone from running external commands on your Nagios installation while user authentication is not in effect; the second keeps Nagios from spamming your syslog.
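These edits are easy to make by hand, but they can also be scripted. The sketch below assumes each setting appears as an uncommented key=value line, as in the sample files; set_cfg is a made-up helper name. Back up the files before trying it.

```shell
# Set "key=value" in a Nagios-style config file, replacing the existing line for that key.
# Writes through a temporary file because older sed versions lack in-place editing,
# and copies the result back with cat so the file keeps its owner and permissions.
set_cfg() {
    key=$1; value=$2; file=$3
    grep "^$key=" "$file" >/dev/null || {
        echo "$key not found in $file" >&2
        return 1
    }
    sed "s|^$key=.*|$key=$value|" "$file" > "$file.new" &&
        cat "$file.new" > "$file" &&
        rm -f "$file.new"
}

# Usage (as root):
#   set_cfg use_authentication 0 /usr/local/nagios/etc/cgi.cfg
#   set_cfg check_external_commands 0 /usr/local/nagios/etc/nagios.cfg
#   set_cfg use_syslog 0 /usr/local/nagios/etc/nagios.cfg
```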
The default “contact group” configuration for Nagios is correct for this basic example. Edit /usr/local/nagios/etc/objects/contacts.cfg and change the e-mail address in the “nagiosadmin” contact definition to your own. For e-mail alerts to work, you need a working mail server or mail relay on your Solaris system (this configuration is beyond the scope of this article).
You will see in contacts.cfg that the contact definition says to use the generic contact template. This template is defined in /usr/local/nagios/etc/objects/templates.cfg, and also refers to time periods defined in /usr/local/nagios/etc/objects/timeperiods.cfg. In most cases, you’ll want to leave these definitions alone, but they’re highly customizable and allow for multiple contacts across multiple shifts, or for contacting different people depending on what time of day a problem arises.
If you ran the command gmake install-config earlier, after compiling and installing Nagios, there is already a localhost.cfg file in place to check various services on the local machine where Nagios is running. You can safely ignore the “linux-server” references in this file; the author assumes that Nagios will run on a Linux system. We want to reduce the checks to network connectivity, the web server, and the SSH daemon. Comment out the entries in this file for the Root_Partition, Current Users, Total Processes, Current Load, and Swap Usage services. This leaves only the service definitions for “check_http”, “check_ssh” and “PING” uncommented. The commands used for service checks are defined in the commands.cfg file. You can add your own by editing that file, and then use them in your service definitions.
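To give a feel for what a service definition looks like, here is a small generator that prints one. Treat it as an illustration only: service_def is a hypothetical helper, and the “local-service” template and “check_ssh” command names are assumed to match the sample localhost.cfg and commands.cfg, so compare the output with your own files before pasting it anywhere.

```shell
# Print a Nagios service definition for host $1, description $2, check command $3.
# The "local-service" template name is an assumption based on the sample config.
service_def() {
    cat <<EOF
define service{
        use                     local-service
        host_name               $1
        service_description     $2
        check_command           $3
        }
EOF
}

# Example: print a definition for an SSH check on the local machine.
service_def localhost SSH check_ssh
```

Any command you add to commands.cfg can be passed as the third argument in the same way.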
The printer.cfg, switch.cfg, and windows.cfg files contain more examples of how to monitor printers, switches, and Windows systems using some of Nagios’ advanced plug-ins. We won’t be using these files in this tutorial, but they’re worth reading to get a feel for how the different pieces of the Nagios puzzle fit together.
Once the configuration files have been changed to your satisfaction, it’s time to run Nagios to check your configuration files and make sure nothing has been overlooked. To do this, run
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg. If everything checks out, the output will look like this:
If the Nagios configuration check fails, it will tell you what problems it found. Go back and check your configuration files, then run the check again until it says “Things look okay.”
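The check can also act as a gate, so that Nagios is only started when the configuration is valid. The sketch below assumes that nagios -v exits with a non-zero status when the check fails and that the init script from “gmake install-init” lives at /etc/init.d/nagios. The function name start_if_valid and the three override variables are made up for this example.

```shell
# Paths as used in this article; each can be overridden from the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

# Start Nagios only if the configuration check succeeds.
start_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        "$NAGIOS_INIT" start
    else
        echo "configuration check failed; run $NAGIOS_BIN -v $NAGIOS_CFG for details" >&2
        return 1
    fi
}

# Usage (as root):
#   start_if_valid
```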
Make Nagios work
Once you’ve got your configuration right, it’s time to launch Nagios. If you ran “gmake install-init” earlier, a script has already been created in /etc/init.d that will start everything correctly for you. Run /etc/init.d/nagios start to start the process. Once it’s running, you should be able to navigate to http://yourhostname/nagios with a web browser and click Tactical Overview to see an overall status. In this screenshot, you can see that the only monitored host is OK, as are all three services on that host. Notifications for two of these services are disabled.
By clicking on Service Detail, you will get a detailed status report on all the individual services monitored, along with the result of their last check:
Host Detail does exactly that – it shows a detailed status display with one line for each monitored host:
The links to Host Group Overview and Host Group Summary will show similar status views for each host group (as defined in the configuration files). Since we only have one host (and one host group) in this quick tutorial, there is no need to view screenshots.
By default, Nagios will check each host and service every five minutes. If something goes down, the web view of that host or service will change from green to red and an e-mail notification will be sent to the contact groups (and by extension, the contacts) defined in the host template in templates.cfg. Once the host or service resumes normal operation, a recovery e-mail will be sent to the same contacts.
Further reading on Nagios
This tutorial barely scratches the surface of the features of the Nagios enterprise monitoring system and shows only the most basic features. The Nagios online documentation goes into more detail, and a number of good books have been published on the subject. I recommend these titles:
Hopefully, this basic tutorial will get you started using Nagios for all of your network and server monitoring needs.
ABOUT THE AUTHOR: Bill Bradford is the creator and maintainer of SunHELP and lives in Houston, Texas with his wife Amy.