What is Nagios and why does any network administrator need it?
Nagios is a powerful, enterprise-class host, service, application, and network monitoring program, designed to be fast, flexible, and rock-solid stable. Nagios runs on *NIX hosts (like Ubuntu Linux) and can monitor Windows, Linux/Unix/BSD, Netware, and network devices. And best of all, it's free open source software!
Well, it may be free software, but it does cost something - time and effort! Installation is the easy part; configuration is much harder. At least, it used to be, since check_mk makes it so much easier. But let's discuss installing it first.
I'm going to describe how to install Nagios on Ubuntu 8.04.3 LTS. I'll use a fresh installation of the Ubuntu Server Edition JeOS in a virtual machine to get a clean start. While details may differ between other versions of Ubuntu or Linux, most of the guide will still apply.
While Ubuntu does include a packaged version of Nagios in its official repositories, it's probably not the latest version, so I recommend you download, compile and install it yourself.
Before we do that, we have to install some prerequisites first, though. Here's how to install all required dependencies for the Nagios core:
sudo apt-get install apache2 build-essential libapache2-mod-php5 libgd2-xpm-dev traceroute wget
This installs the Apache webserver with PHP and its graphics library as well as the build environment necessary to compile software, traceroute (optional) and the wget downloader (it's not included with JeOS).
Next we prepare the system by creating a user and group for Nagios and a group for running Nagios commands from the web interface:
sudo adduser --system --home /usr/local/nagios --no-create-home --group --disabled-login nagios
sudo addgroup --system nagcmd
sudo adduser nagios nagcmd
sudo adduser www-data nagcmd
As a security measure, the newly created system user is disabled so normal login isn't possible. It's sufficient for running Nagios as a service, though. The web server user www-data is added to the nagcmd group so that commands can be issued from the web interface.
Now we can download and extract the Nagios Core (current version is 3.2.0 as of writing):
tar xzf nagios-3.2.0.tar.gz
Let's build it:
It's important to specify the command group so that the binaries will get the proper permissions - otherwise Nagios can't be controlled from the web interface. Compiling the software takes some time, so be patient.
When it's done, install it:
sudo make fullinstall
fullinstall combines install, install-init, install-commandmode, and install-webconf. It doesn't include install-config, though, so we execute that manually:
sudo make install-config
Now that all the binaries and config files have been installed, we're going to restrict access to the web interface by setting an administrator password:
sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Enter a password and confirm it. You'll need the username (nagiosadmin) and password later when logging into the web interface.
sudo invoke-rc.d apache2 reload
Now the Nagios Core installation is done. Leave its directory:
Before we can use Nagios to monitor something, we need to install the monitoring plugins. The plugins have dependencies of their own, so we have to install their prerequisites first:
sudo apt-get -y install libmysqlclient15-dev libssl-dev mailx libldap2-dev libnet-snmp-perl libpq-dev libradius1-dev smbclient snmp fping qstat
You probably don't need all of them, but to compile as many monitoring plugins as possible, they should be installed. Only libmysqlclient15-dev, libssl-dev and mailx are really required - while fping and qstat are entirely optional.
Now download and extract the Nagios Plugins package (as of writing, it's version 1.4.14):
tar xzf nagios-plugins-1.4.14.tar.gz
Build and install it:
./configure --with-nagios-user=nagios --with-nagios-group=nagios
sudo make install
When it's done, you can leave the directory:
Next we need to fix Nagios' configuration - Ubuntu's mail command is located in /usr/bin instead of /bin:
sudo sed -i~ 's| /bin/mail | /usr/bin/mail |' /usr/local/nagios/etc/objects/commands.cfg
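If you'd like to see what this substitution does before touching the real file, here's a minimal sketch that runs the same sed expression on a throwaway sample. The sample line is an assumption modeled on a typical notification command, not copied from commands.cfg:

```shell
# Hypothetical sample line; the real commands.cfg content may differ.
sample=$(mktemp)
printf '%s\n' 'command_line /usr/bin/printf "%b" "Host alert" | /bin/mail -s "Host alert" $CONTACTEMAIL$' > "$sample"

# Same expression as above, but without -i, so the result is only printed.
fixed=$(sed 's| /bin/mail | /usr/bin/mail |' "$sample")
printf '%s\n' "$fixed"

# Running the expression on the already fixed line changes nothing,
# because " /bin/mail " (with the leading space) no longer occurs.
again=$(printf '%s\n' "$fixed" | sed 's| /bin/mail | /usr/bin/mail |')
[ "$again" = "$fixed" ] && echo "second run changes nothing"

rm -f "$sample"
```

The spaces around /bin/mail in the pattern are what keep /usr/bin/printf and an already corrected /usr/bin/mail from being rewritten.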
Before we start Nagios, it's a good idea to verify that the configuration is OK:
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
The pre-flight check should confirm that everything is all right. Run this command after modifying any Nagios configuration file to make sure the system will continue to work.
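Since the check is meant to be run after every config change, it can be wrapped in a small helper. This is only a sketch: the function name reload_if_valid and the NAGIOS_BIN and NAGIOS_CFG variables are my own invention, defaulting to the paths used in this guide:

```shell
# Hypothetical helper: run the pre-flight check and only execute the
# given command (e.g. a reload) when the check passes.
reload_if_valid() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        "$@"    # the command passed by the caller
    else
        echo "Configuration check failed, not reloading." >&2
        return 1
    fi
}

# Usage, from a root shell (sudo can't call shell functions directly):
#   reload_if_valid invoke-rc.d nagios reload
```

The helper relies on the exit status of nagios -v, so a broken configuration never reaches the running service.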
Now it's time to start up the Nagios service:
sudo invoke-rc.d nagios start
If it started (as it should), add it to the system startup sequence:
sudo update-rc.d nagios defaults 30 18
From then on, Nagios gets started (and shut down) with the system.
Well done! If you followed along this far, you now have a working Nagios up and running. It's already monitoring itself (localhost) and can be accessed with a web browser:
Username: nagiosadmin, Password: The one you specified earlier!
Normally, one would now install some or all of the official Nagios Addons: NRPE (which lets you remotely monitor other Linux/Unix or Windows hosts), NSCA (to integrate passive alerts from remote machines), or NDOUtils (an experimental database connector that used to be required for interesting third-party addons like NagVis). But all (or at least most) of that is no longer necessary thanks to an amazing new extension called check_mk!
I can't stress enough how important this new plugin is! Many, many thanks to Mathias Kettner (he's the mk in check_mk) for such a wonderful addon!
So what does it do? It could be described as "a new general purpose Nagios plugin for retrieving data" - but that description hardly does it justice! It replaces NRPE, NSClient++ and check_snmp. It can also be used in place of NDOUtils (and a database) for addons like NagVis. It also makes configuration much easier so config tools like NConf are also no longer needed. In fact, I set up a Nagios system today that monitors dozens of hosts and hundreds of services, in just a few hours!
So let's take a look at it - after installing Python support for Apache so its multiadmin interface (an optional feature) will be available:
sudo apt-get -y install libapache2-mod-python
Again, we download and install the software. No need to compile it, though, since it's a Python program:
tar xzf check_mk-1.1.0.tar.gz
We just have to run its setup script. If you omit the "--yes", it will ask a lot of questions, but answering yes to all of them is just fine (at least with our current JeOS setup):
sudo ./setup.sh --yes
To make its multiadmin feature readily available to Nagios, we'll add it to the Nagios navigation bar (the list of links in the left frame pane):
sudo sed -i~check_mk '/Configuration/a\<li><a href="/check_mk/filter.py" target="<?php echo $link_target;?>">Check_MK Multiadmin</a></li>' /usr/local/nagios/share/side.php
By default, check_mk is prepared for PNP (an addon we'll install later) in its stable version 0.4.x - here we prepare it for the latest PNP4Nagios version 0.6.x:
sudo sed -i~ "s|/nagios/pnp/index.php?host=\$HOSTNAME\$&srv=\$SERVICEDESC\\$|/pnp4nagios/graph?host=\$HOSTNAME\$\&srv=\$SERVICEDESC\$' class='tips' rel='/pnp4nagios/popup?host=\$HOSTNAME\$\&srv=\$SERVICEDESC\$|;s|/nagios/pnp/index.php?host=\$HOSTNAME\\$|/pnp4nagios/graph?host=\$HOSTNAME\$\&srv=_HOST_' class='tips' rel='/pnp4nagios/popup?host=\$HOSTNAME\$\&srv=_HOST_|" /usr/share/doc/check_mk/check_mk_templates.cfg
This ensures that the action links take us to the PNP graphs for the hosts and services - and even better, the graphs will be shown as popups when hovering over the action icons! (You'll soon see what I mean - and how cool this is!)
Reload Apache and Nagios and you're done:
sudo invoke-rc.d apache2 reload
sudo invoke-rc.d nagios reload
You may leave the check_mk directory:
Now that check_mk is installed, we'll enable monitoring its own host by installing the check_mk agent. While the agent can be queried through various means, the regular way is by making it accessible through xinetd, so we install that first:
sudo apt-get install xinetd
Then we only need to copy the agent script check_mk_agent.linux to /usr/bin/check_mk_agent and the xinetd configuration file xinetd.conf to /etc/xinetd.d/check_mk:
sudo cp -p /usr/share/check_mk/agents/check_mk_agent.linux /usr/bin/check_mk_agent
sudo cp -p /usr/share/check_mk/agents/xinetd.conf /etc/xinetd.d/check_mk
Optionally, for security reasons, you may want to edit /etc/xinetd.d/check_mk and specify which IP addresses may query your agent. Uncomment the option only_from and edit the addresses listed there.
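If you'd rather script that edit, a sed one-liner can do it. The following is only a sketch on a throwaway sample: the excerpt is my assumption of what the file roughly looks like (the shipped file may differ), and 192.0.2.10 is a placeholder for your monitoring server's address:

```shell
# Hypothetical excerpt of /etc/xinetd.d/check_mk; the shipped file may differ.
sample=$(mktemp)
cat > "$sample" <<'EOF'
service check_mk
{
        port           = 6556
        server         = /usr/bin/check_mk_agent
        #only_from      = 127.0.0.1 10.0.20.1 10.0.20.2
        disable        = no
}
EOF

# Uncomment only_from, keep the indentation, and set the allowed addresses.
restricted=$(sed 's|^\([[:space:]]*\)#[[:space:]]*only_from.*|\1only_from      = 127.0.0.1 192.0.2.10|' "$sample")
printf '%s\n' "$restricted"

rm -f "$sample"
```

To apply it for real, you'd run the same expression with sudo sed -i~ against /etc/xinetd.d/check_mk, after checking that the file actually contains a commented only_from line.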
Reload xinetd to activate the new configuration:
sudo invoke-rc.d xinetd reload
That's it! Easy, huh? Remember, you can install the agent on other Linux systems just as easily - just copy the check_mk_agent script there and make it available through xinetd (or another means of access, like SSH, which is somewhat more advanced).
To make Nagios monitor your hosts, and to configure check_mk to your liking, edit check_mk's main configuration file:
List all hosts you want monitored as a comma-separated list of strings in the configuration variable all_hosts. Right now we'll only monitor localhost:
all_hosts = [ 'localhost' ]
Since localhost is already specified in the original Nagios configuration, we have to disable the original entry, otherwise we'd get a conflict and our new configuration wouldn't be accepted:
sudo sed -i~check_mk 's/cfg_file=.*localhost.cfg/#&/' /usr/local/nagios/etc/nagios.cfg
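To illustrate what the #& replacement does, here's the same expression applied to a small sample. The two cfg_file lines are an assumed excerpt in the style of nagios.cfg:

```shell
# Assumed excerpt of nagios.cfg, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
EOF

# "&" stands for the matched text, so "#&" puts a "#" in front of the match.
commented=$(sed 's/cfg_file=.*localhost.cfg/#&/' "$sample")
printf '%s\n' "$commented"

rm -f "$sample"
```

Only the line mentioning localhost.cfg is touched. Note that running the command a second time would match the commented line again and add another #, which is harmless but untidy.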
This comments out the localhost.cfg. Now we can scan our hosts to auto-discover available services - this is one of the most helpful features of check_mk:
sudo check_mk -I alltcp
After the scan, you can automatically add the newly discovered services to Nagios by running the following command:
sudo check_mk -R
And it's done! Look at the Nagios web interface again - you'll see the new localhost. If you also added the agent to other hosts as briefly mentioned above, and listed their hostnames in main.mk, you could already be monitoring a whole lot of remote systems!
Before you continue, I highly recommend another config tweak that keeps the nagios.log file size down - since by default Nagios logs check_mk activity which can quickly take up a lot of space:
sudo sed -i 's/log_external_commands=1/log_external_commands=0/;s/log_passive_checks=1/log_passive_checks=0/' /usr/local/nagios/etc/nagios.cfg
Restart Nagios for the change to take effect - from now on, you can always do that with:
sudo check_mk -R
To monitor Windows hosts, you simply copy /usr/share/check_mk/agents/windows/check_mk_agent.exe there and run it like this:
To enable the autostart of the agent:
net start check_mk_agent
Then add the hostname to main.mk, scan with sudo check_mk -I alltcp and recreate the config with sudo check_mk -R.
To monitor switches or other devices which are accessible through SNMP, specify the hostname in main.mk with the tag snmp like this: 'HOSTNAME|snmp' - scan it with sudo check_mk -I snmp_info HOSTNAME (or another snmp scan type, check out check_mk -L | grep snmp for a list) and recreate the config with sudo check_mk -R.
There's much more to it - check_mk lets you easily and quickly set up a complete monitoring solution, even for very large and complex environments! Make sure to read about all of its other useful features in the Online Documentation!
Next is PNP which is an output addon that creates and displays beautiful and informative charts out of the data Nagios collects. While Nagios itself mainly shows on/off states (service is running, or it isn't), PNP lets you see how a service performs. In a way, it's like Munin, but perfectly integrated into Nagios.
It depends on rrdtool, so install that and additional prerequisites first:
sudo apt-get -y install librrds-perl php5-gd rrdtool
Download and extract PNP's latest version, currently 0.6.2:
tar xzf pnp4nagios-0.6.2.tar.gz
Compile and install:
sudo make fullinstall
fullinstall combines install, install-webconf, install-init, and install-config.
There are various modes of operation and quite complicated installation instructions posted on its official website, but a single command can set it up:
sudo sed -i~pnp4nagios '/process_performance_data/s/0/1/;$a\broker_module=/usr/local/pnp4nagios/bin/npcdmod.o config_file=/usr/local/pnp4nagios/etc/npcd.cfg' /usr/local/nagios/etc/nagios.cfg
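That one-liner does two things at once, so here's a sketch that shows its effect on a tiny sample file. The sample content is assumed, and the one-line form of the a command relies on GNU sed, which is what Ubuntu ships:

```shell
# Assumed two-line stand-in for nagios.cfg, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
process_performance_data=0
log_file=/usr/local/nagios/var/nagios.log
EOF

# 1) On the process_performance_data line, replace the first 0 with 1.
# 2) After the last line ($), append the broker_module line.
patched=$(sed '/process_performance_data/s/0/1/;$a\broker_module=/usr/local/pnp4nagios/bin/npcdmod.o config_file=/usr/local/pnp4nagios/etc/npcd.cfg' "$sample")
printf '%s\n' "$patched"

rm -f "$sample"
```

Because the append part always fires, running the real command twice would add the broker_module line twice, so check with grep broker_module /usr/local/nagios/etc/nagios.cfg before repeating it.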
This uses the npcdmod.o module which makes additional, manual nagios.cfg changes unnecessary! Why isn't this properly documented in its official manual?
A special service to process performance data is required for this mode, so copy its configuration to the proper place:
sudo cp -p /usr/local/pnp4nagios/etc/npcd.cfg-sample /usr/local/pnp4nagios/etc/npcd.cfg
Start it up and add it to the system's autostart:
sudo invoke-rc.d npcd start
sudo update-rc.d npcd defaults 20
PNP doesn't like Ubuntu's default magic_quotes_gpc setting, so we change it. We also have to enable mod_rewrite:
sudo sed -i~ '/magic_quotes_gpc/s/On/Off/' /etc/php5/apache2/php.ini # /etc/php5/cli/php.ini
sudo a2enmod rewrite
Reload Apache for the changes to take effect:
sudo invoke-rc.d apache2 reload
Visit http://localhost/pnp4nagios/ to ensure everything is set up correctly. Then delete the install.php to be able to use PNP:
sudo rm -f /usr/local/pnp4nagios/share/install.php
To perfectly integrate PNP into Nagios and enable mouse-over popups of graphs, put status-header.ssi into Nagios and set up its permissions:
sudo cp -p contrib/ssi/status-header.ssi /usr/local/nagios/share/ssi
sudo chown nagios:nagios /usr/local/nagios/share/ssi/status-header.ssi
sudo chmod 644 /usr/local/nagios/share/ssi/status-header.ssi
Reload Nagios and you're done:
sudo invoke-rc.d nagios reload
Now you can leave PNP's directory:
Hover your mouse cursor over the action symbol to see a preview of the graphs as a floating popup image. Click it to go directly to the graphs. Once you get used to this feature, you won't want to miss it!
The final addon I'm going to introduce today is NagVis. It's a visualization engine for Nagios that has to be seen to be believed. It's that cool! Check out the screenshots!
Install its prerequisites:
sudo apt-get -y install graphviz php5-cli php5-gd php5-mysql
Then download and extract the current version 1.4.5:
tar xzf nagvis-1.4.5.tar.gz
This version is the first to support mklivestatus. mklivestatus is another great feature of check_mk which lets other addons access Nagios stats without requiring a database and a connector.
Install it like this:
sudo ./install.sh -i mklivestatus -q
Although we chose mklivestatus, the default is MySQL access, so we change that:
sudo sed -i~ 's/;backend="ndomy_1"/backend="live_1"/' /usr/local/nagios/share/nagvis/etc/nagvis.ini.php
Then we have to set PHP's timezone - otherwise we'd get a lot of error messages when trying to open the NagVis pages:
sudo sed -i~ "s|;date.timezone =|date.timezone = `cat /etc/timezone`|" /etc/php5/apache2/php.ini # /etc/php5/cli/php.ini
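Here's a small sketch of what that substitution produces, using a fixed example timezone instead of reading /etc/timezone. The value Europe/Berlin and the one-line sample file are assumptions for illustration:

```shell
# Example value; on the real system this would come from /etc/timezone.
tz="Europe/Berlin"

# Stand-in for the commented-out setting in php.ini.
sample=$(mktemp)
printf '%s\n' ';date.timezone =' > "$sample"

# "|" is used as the sed delimiter because timezone names contain "/".
result=$(sed "s|;date.timezone =|date.timezone = $tz|" "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

In the real command, the backticks around cat /etc/timezone do the same job as tz here. The modern spelling would be $(cat /etc/timezone).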
Now reload Apache:
sudo /etc/init.d/apache2 reload
Leave NagVis' directory:
Now you can access it here:
Setting up NagVis is beyond the scope of this guide, so refer back to its Documentation.
Congratulations! You've successfully completed your Nagios, check_mk, PNP and NagVis installation! But this isn't the end, it's just the beginning. Take a snapshot of your virtual machine (if you used one like I did) and then continue to set up your monitoring solution - check_mk's main.mk is the key...
Hi Stefan, I really appreciate your tutorial. What I'd like to suggest is to modify the templates.cfg with a further entry:
Now, after every host appears an additional Icon with a direct link to the statistics.
Hi Philipp, thanks for your comment.
Check_MK by default already includes this functionality as an "action_url". When both Check_MK and PNP4Nagios are installed, it works out-of-the-box.
Thank you for your answer. I just used the PNP part in my first try. But now I tried the whole tutorial. I had some difficulties in doing it. I had to modify /usr/local/nagios/etc/check_mk.d/check_mk_templates.cfg,
changing the action_url from /nagios/pnp to /pnp4nagios. Further, I had to create the /var/log/nagios/rw folder with ownership nagios:nagcmd. Now everything is working properly.
Are you sure you followed every step properly? The instructions were written for Check_MK v1.1.0 which only included support for PNP v0.4.x. That's why there's a (pretty long) sed command to rewrite the check_mk_templates.cfg, pretty much the same as you did manually.
By the way, Check_MK is now at version 1.1.3. I didn't have time to update the whole HOWTO yet, but since it's mostly just a matter of downloading the latest versions instead of the ones I used here, it shouldn't be much of a problem.
With the latest Check_MK updates, this whole guide is becoming obsolete very quickly since the amazing Mathias Kettner is now including an install script which does all of this and more automatically! The only drawback right now is that it only supports Ubuntu 9.10 (among a few other select Linux distros) - but I'm hopeful he'll update it for 10.04 LTS once it's released. I'll write some more about it another time, it's truly awesome!
Stefan, I'm a newbie to Nagios after only a week or so, but you have made my WEEK! Your tutorial is brilliant!
Now I just need to dig a bit after my 15 minute install/testflight and see what ports I have to open for check_mk over the 'net. Literally, with your instructions, 15 minutes from bare Nagios to monitoring localhost and a Windows box that's separate. You are awesome.
Regards from a long-time Windows Sysadmin who is rapidly bailing wherever possible while still trying to maintain a decent standard of living...
I'm glad I could help!
On hosts where you installed the Linux agent to have them monitored, it suffices to open port 6556, TCP. If you're using Ubuntu's ufw firewall interface, you can use this command:
sudo ufw allow 6556/tcp
It's a good idea to restrict which IP addresses are allowed to connect in /etc/xinetd.d/check_mk ("only_from"). Restart xinetd afterwards.
For even more security, use SSH instead of xinetd. Check out the "Datasource programs" section of the check_mk documentation for detailed instructions.
Thank you so much! This is the only straightforward way I found to install pnp. I had to modify the process a bit because I installed nagios via the ubuntu package already, but it worked. Also, I would have preferred "edit this file, changing this line" rather than using sed, but, regardless, it did the trick! Once again, I thank you greatly!
Or much more simple...
1. Download latest check_mk Tarball.
3. cd into the dir
4. sudo ./install_nagios.sh
5. wait 10min (depending on internet speed)
Finally you've got
Nagios 3.2.1, Nagvis 1.4.7, pnp4nagios 0.6, check_mk,
and rrd incl. rrd-cache daemon up and running.
Exactly! However, unfortunately the install_nagios.sh script is currently only compatible with Ubuntu 9.10.
I've been holding off posting more about this until it's compatible with 8.04 LTS and 10.04 LTS. (Personally, I'd much prefer 8.04 LTS support since all of my stable servers at work will stay at that version until 10.04's stability and compatibility have improved - but that's another topic.)
Funny: By now, Nagios is more of an addon for check_mk than the other way round. Amazing. :)
Okay, I've just tested it on Ubuntu 10.04 LTS.
Have a look "cat /etc/lsb-release"
Within the install_nagios.sh change these lines from:
elif grep -qi "DISTRIB_DESCRIPTION=\"Ubuntu 9.10\"" /etc/lsb-release 2>&1 >/dev/null
elif grep -qi "DISTRIB_DESCRIPTION=\"Ubuntu 10.04 LTS\"" /etc/lsb-release 2>&1 >/dev/null
DISTRONAME="Ubuntu 10.04 LTS"
Then it's as simple as three commands:
sed -i~ '/Ubuntu/s/9.10/10.04 LTS/;s/9.10/10.04/' install_nagios-1.1.4.sh
sudo bash install_nagios-1.1.4.sh
Now that's an easy tutorial... :)
Thanks for the great tutorial. It really is pretty easy with such a great explanation. I am just stuck on one thing and I would really appreciate some help if you don't mind.
I can see all my hosts but the status is always "DOWN" and the status information always reads "(Return code of 127 is out of bounds - plugin may be missing)". I have searched the web for days trying to find the solution and a lot of forums point to permissions on a "folder" but none of them say which folder or what permissions to give it.
Have you seen this before and do you have a solution for my dilemma? Because I am pulling my hair out.
Great info. I already have a great nagios setup and was looking at adding in PNP. I followed your steps and it looks like almost everything works, except that when I click the action icon a new page is opened instead of opening inside the same frame. The hover-over doesn't work either. Do you need to have check_mk for the functionality to work? Also, will installing check_mk damage any of my current nagios configs? Luckily it's running in a hyper-v virtual session so it will be easy to back up. Thanks!
Thanks a lot for this post. Nagios, check_mk and other stuff You mentioned are great fun to me and I just cannot wait till I set up complete monitoring system in my enterprise environment.
Thank you so much for the detailed help. Everything went as expected except one thing. I have check_mk version 1.1.12p7.
When I run:
check_mk -I alltcp
Cannot resolve alltcp into IP address.
Cannot get information from host 'alltcp': Cannot contact agent: host 'alltcp' has no IP address.
Likewise, when I run:
check_mk -I tcp
Cannot resolve tcp into IP address.
Cannot get information from host 'tcp': Cannot contact agent: host 'tcp' has no IP address.
In both cases tcp and alltcp are being interpreted as a hostname argument. I ran check_mk --help and do not see the options "tcp" or "alltcp" anywhere. Was this taken out in this version of check_mk perhaps?
Following your instructions I installed this on Ubuntu 8.04.4 LTS. Is this still the best distro or would I be better off starting clean with Ubuntu 9.10?
"alltcp" is no longer necessary, just leave it off.
Currently the best long-term stable base is probably Ubuntu Server 10.04 ("lucid"). The new LTS release, 12.04, unfortunately disappoints because of various incompatibilities, so I'd stick to the older version for now.
Most importantly however, stop following this outdated guide and choose the easy way: Just use OMD, the Open Monitoring Distribution, do check it out here: http://omdistro.org/
Thanks Stefan! I wish I had seen this first! :-) I'm starting over with a clean install of Ubuntu 10.04 and OMD!
I have a question if you can help. With the version I was installing (now scratching) I was not successful at monitoring ESXi hosts. The agent returned no information and I could not get xinet installed. Is there any help with how to monitor VMWARE hosts?
I have omd up and running. Thank you for the lead on that! Can you help me find information on how to configure host icons in this distro? There doesn't seem to be much documentation and it is just different enough from the other nagios installs that I cannot seem to find out where to put "icon_image" in any config and have it work. Thanks!
With OMD, you can still configure and use Nagios (or Icinga, or even another monitoring core) like you would in a manual installation - by editing the Nagios configuration files in etc/nagios (within the site directory, not /etc).
Instead of that, you could also use check_mk which is much more powerful and at the same time easier to use. You'd edit etc/check_mk/main.mk for that.
Or, totally new and amazingly cool, you could use WATO to manage your complete Nagios install with the web interface and without having to edit config files manually.
So which approach are you using?
I think I may have found what I am looking for here:
Thank you for your help!
I have a live nagios system up and running. I am monitoring Windows, printers and Linux, but am looking to monitor more things. I came across this tutorial and figured, well, I'll try it and see how it goes.
I came across the part of "sudo sed -i~ "s|/nagios/pnp/index.php?host=\$HOSTNAME\$&srv=\$SERVICEDESC\\$|/pnp4nagios/graph?host=\$HOSTNAME\$\&srv=\$SERVICEDESC\$' class='tips' rel='/pnp4nagios/popup?host=\$HOSTNAME\$\&srv=\$SERVICEDESC\$|;s|/nagios/pnp/index.php?host=\$HOSTNAME\\$|/pnp4nagios/graph?host=\$HOSTNAME\$\&srv=_HOST_' class='tips' rel='/pnp4nagios/popup?host=\$HOSTNAME\$\&srv=_HOST_|" /usr/share/doc/check_mk/check_mk_templates.cfg",
I received the error "sed: can't read /usr/share/doc/check_mk/check_mk_templates.cfg: No such file or directory".
After checking the directory, there are no templates in it.
I decided to run the install again, this time without the --yes option, and now I see 2 entries of MK listed in the left frame and it still doesn't find the templates. I am using 1.2.0p1 with nagios 3.4.1 on Ubuntu 12.04.
Thanks for any assistance
I am running through your tutorial, great help, btw. I ran into an issue with pnp4nagios. It says framework error, any ideas?
Thanks for the tutorial, works perfectly!
This tutorial is excellent :) Helped me a lot with just one exception. For some reason unknown to me I am unable to create any graph. It shows nothing but a blank page /pnp4nagios/graph?host=MY_HOST probably something to do with ownership but I have no clue where it might be :<