At work, we run a Nagios server to monitor the health of our web infrastructure. “What’s Nagios?” I hear you ask. Well, Nagios is the 800lb gorilla in the world of open source monitoring apps. It has a rich and useful feature set for monitoring the health of pretty much every aspect of your infrastructure, from the servers delivering your web apps (and the actual apps running on them) all the way out across your network to your gateway routers. And, for an open source project, the documentation is incredibly rich and detailed, covering every aspect of running the software. It pretty much has the ability to monitor everything you run!
I’ve had several emails and no small number of searches on this site looking for information about ColdFusion and Nagios, so I’ve decided to place a short article here on how to monitor the health of your ColdFusion boxes and services using Nagios. Initially, this will only cover installs of CF on Windows servers.
First off, I’m assuming you have a working Nagios installation in place. There are extensive guides on setting up Nagios already out on the web. Read them, get Nagios running and then get familiar with Nagios before trying what’s detailed here.
Install NRPE_NT
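First, you’ll need to install the NRPE_NT daemon on each of the Windows servers you have running CF. Follow the instructions within the zip to install. It just works. Note that the check commands later in this article use the check_nt plugin on port 1248, so whichever agent you install must be answering check_nt requests on that port.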
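Before moving on, it can be worth confirming that the Nagios server can actually reach the agent on a CF box. The following is a minimal sketch, not a definitive procedure. It assumes the plugins live in /usr/local/nagios/libexec (a common default; adjust for your install). probe_agent is a hypothetical helper name, CLIENTVERSION is a standard check_nt variable, and 1248 is the port used by the check commands in this article.

```shell
#!/bin/sh
# Hypothetical helper: ask the Windows agent for its version via check_nt.
# PLUGIN_DIR is an assumption; point it at wherever your plugins are installed.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"

probe_agent() {
    host="$1"
    port="${2:-1248}"   # port used by the check commands in this article
    "$PLUGIN_DIR/check_nt" -H "$host" -p "$port" -v CLIENTVERSION
}

# Example (uncomment to run against a real server):
# probe_agent 172.16.3.129
```

If this prints a version string, the agent is reachable and the service checks should be able to talk to it. A timeout or connection error usually points at a firewall or at the agent not listening on that port.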
Add CF hosts to hosts.cfg
If your CF servers aren’t already among the hosts you’re monitoring, add them to hosts.cfg. A typical host definition looks similar to the one below.
define host {
use generic-host
host_name cfserver1
alias main coldfusion server
address 172.16.3.129
parents Internet_Zone
check_command check-host-alive
max_check_attempts 3
notification_interval 60
notification_period 24x7
notification_options d,u,r
}
Add host groups for your CF servers
Assuming all your CF servers are running the same CF version, you can make a Nagios host group for those servers. This is much less work than adding individual servers to the service check (detailed below). If you have servers running disparate CF versions, set up a host group for each. In our setup, we have a cf5servers group and a cfmxservers group. Add your host group definition to hostgroups.cfg. A typical host group definition looks similar to the one below.
define hostgroup {
hostgroup_name cfmxservers
alias MCT ColdFusion MX Servers
contact_groups webdev,online
members cfserver1, cfserver2, cfserver3
}
Add CF commands to checkcommands.cfg
In a default Nagios installation, checkcommands.cfg contains all the definitions of the commands Nagios uses to inspect services running on the hosts it monitors. The following list of command definitions should cover ColdFusion 5 and CFMX 7 (and will likely also cover CFMX 6, but I don’t remember…). Add the following lines to checkcommands.cfg.
define command {
command_name check_coldfusion5
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l "Cold Fusion Application Server"
}
define command {
command_name check_coldfusionjrun
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l "Macromedia JRun CFusion Server"
}
define command {
command_name check_coldfusionmx
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l "ColdFusion MX Application Server"
}
define command {
command_name check_coldfusionmx_process
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v COUNTER -l "\\Process(jrun)\\Private Bytes","JRun is using %.f bytes" -w 891289600 -c 1073741824
# comment Warning 850 MB, Critical 1024 MB (1 GB) YMMV
}
define command {
command_name check_coldfusionmx_threads
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v COUNTER -l "\\Process(jrun)\\Thread Count","JRun is using %.f Threads" -w 90 -c 110
# comment Warning 90 threads, Critical 110 threads YMMV
}
The last two commands in the listing above check the actual CF processes running on your server. Note the comments: the values we use may well be very different from what your environment needs. The easiest way to choose thresholds is to look at what your servers normally do and make an educated guess from there. There’s no harm in starting with these values, but you may end up getting copious notifications from Nagios that you don’t need (or, conversely, no notifications at all). Test and tweak.
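To make the threshold arithmetic concrete, here is a small sketch of how the byte values in check_coldfusionmx_process are derived and how a warning/critical pair splits readings into states. The function names are hypothetical, 1 MB is taken as 1024 × 1024 bytes, and the thresholds are assumed to be inclusive. The return codes 0, 1 and 2 follow the standard Nagios plugin convention for OK, WARNING and CRITICAL.

```shell
#!/bin/sh
# Convert megabytes to bytes (1 MB taken as 1024 * 1024 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper mirroring how -w and -c are applied to a counter value:
# prints the state and returns the matching Nagios plugin exit code.
# Assumes a reading equal to a threshold already trips it.
classify() {
    value="$1"; warn="$2"; crit="$3"
    if [ "$value" -ge "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

mb_to_bytes 850              # value passed to -w in check_coldfusionmx_process
mb_to_bytes 1024             # value passed to -c in check_coldfusionmx_process
classify 95 90 110 || true   # a reading of 95 threads against -w 90 -c 110
```

Run as is, this prints 891289600, 1073741824 and WARNING. Plugging your own servers' typical readings into classify is a quick way to see whether a proposed pair of thresholds would be noisy before you put it into checkcommands.cfg.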
Adding CF-related services to services.cfg
Nagios’ monitoring model is centered on the concept of services, so this step is perhaps one of the most important. You need to add service definitions to services.cfg for each of the command definitions you built earlier. A couple of sample service definitions are shown below.
define service {
use NM-HTTP
hostgroup_name cfmxservers
service_description Check ColdFusion MX
contact_groups online,webdev
check_period 24x7
notification_interval 60
notification_options w,u,c,r
notification_period 24x7
check_command check_coldfusionmx
max_check_attempts 3
normal_check_interval 1
retry_check_interval 1
# comment Check if ColdFusion MX Server is responsive
}
define service {
use NM-HTTP
hostgroup_name cfmxservers
service_description Check ColdFusion MX Process Threads
contact_groups webdev,online
check_period 24x7
notification_interval 60
notification_options w,u,c,r
notification_period 24x7
check_command check_coldfusionmx_threads
max_check_attempts 3
normal_check_interval 1
retry_check_interval 1
}
Restart Nagios
At this point, Nagios should be ready, willing and able to monitor CF on your Windows-based CF servers. Restart Nagios by whichever method you use. This can be done directly on the command line, or using a tool such as NagMin (we use it, it’s excellent and makes the business of configuring Nagios significantly easier).
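If you restart from the command line, it can help to validate the configuration first, since a typo in any of the files edited above will stop Nagios from starting. The following is a sketch only: the -v flag is Nagios’ built-in configuration check, verify_then_restart is a hypothetical helper, and the binary, config and init script paths in the usage comment are assumptions about your install.

```shell
#!/bin/sh
# Hypothetical helper: run the restart command only if the config check passes.
# Usage: verify_then_restart <nagios binary> <main config file> <restart command...>
verify_then_restart() {
    nagios_bin="$1"
    nagios_cfg="$2"
    shift 2
    # "nagios -v <config>" parses the configuration and exits non-zero on errors.
    if "$nagios_bin" -v "$nagios_cfg"; then
        "$@"
    else
        echo "Config check failed; not restarting" >&2
        return 1
    fi
}

# Example with assumed paths (adjust for your install):
# verify_then_restart /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg /etc/init.d/nagios restart
```

The point of the design is simply that a failed check leaves the running Nagios instance alone, so a mistake in a new service definition doesn’t take monitoring down.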
After restarting, if you log into the Nagios web interface, you should be able to see the CF services you set up being monitored.
I’d be happy to answer questions or elaborate on any of the detail here. Just post a comment or email me offline.




7 comments
Thanks for writing this up! I’ve been considering setting up Nagios for awhile. Now that I know I can also monitor my CF boxes – maybe I’ll finally break down and do it!
Jim
Do you have any experience monitoring CF on Linux with Nagios? How do you read the “\\Process(jrun)\\Private Bytes” and “\\Process(jrun)\\Thread Count” for example under RedHat Linux?
Erki, I don’t have a Linux-based CF box running at the moment, so I can’t take a look at how this would be done. Well, not exactly. It should involve setting up the Linux NRPE daemon on the server and determining the processes being run which equate to the Windows processes on a Windows server. Have your NRPE daemon monitor those processes and have your Nagios server contact the remote NRPE daemon for information.
That’s a really brief summary, and probably lacking in detail, but broadly right. I’d be happy to give you a hand when I get back from the course I’m on. Email me or call me on Skype when I get back (Saturday Australian time).
Awesome! This is great!
I might have found another way to monitor your ColdFusion apps with Nagios. You can use the check_http command, but instead of pointing it at one of your .html pages, point it at one of your .cfm pages.
Here is what I did. For my check command, I use:
check_http!http://bookexchange.byu.edu/graphs/index.cfm
(I use a .cfm page instead of a .html page.)
If ColdFusion isn’t working, the check fails, so this seems to work. (Note: this is just a work-around that seems to work for me… I am NO expert in Nagios OR ColdFusion… I just try to solve problems).
Thanks to the author and Nathan for these useful hints. I think it’ll save me a lot of time.
This CF monitoring will add a new quality to our service monitoring.
Q: Do you have any experience monitoring CF on Linux with Nagios? “\\Process(jrun)\\Thread Count”?
A: We use the “pstree” command in a bash script:
/usr/local/bin/coldfusion-threads-snmp.sh
which prints a CFChildren variable like:
CFChildren=`/usr/bin/pstree | grep -e cfmx7.*cfmx7.*cfmx | sed -e 's/\*\[cfmx7\]$//' -e 's/ *..cfmx7.*.cfmx7...//'`
and the results of the script can be exported within snmp; an entry in snmpd.conf like:
exec .1.3.6.1.4.1.2021.500 coldfusion-threads /usr/local/bin/coldfusion-threads-snmp.sh
will do it. I am confident that you can do it better than me, in a different way.
Thanks,
Steve.
Thanks for putting this up. Just a quick question for you: would it be possible for you to post your template service definition for NM-HTTP?
I’m curious how you’ve implemented it and where it might differ from the generic one.