I seem to have a habit of using overly complicated tools to do simple things. I’m using HylaFAX to deliver the fax for a single number to a single email address (a topic for another time) and now I’m using Nagios to basically ping a single Windows server. Well, it’s not about the destination but rather the journey, right?
To make matters worse, I’m terrible at reading the full instructions. So, after haphazardly blasting through a Nagios install I began to encounter several problems. One problem prevented me from starting the service because it kept choking on the nt.cfg file. I was never able to get the service to start, even though I knew the nt.cfg file was okay. So, I did a complete uninstall and began again. This time I mostly followed the phenomenal “Monitoring Windows with Nagios” article on the awaseroot blog:
This got me 90% of the way to a working install, but I kept getting errors and struggled to find a solution. I politely asked Google every way I could think of, but to no avail. So, if you’re here then you’re probably doing the same thing. For what it’s worth, here are the basics of the environment:
- Linux Mint 14
- Nagios 3.4.1
- Windows 2003 R2 Server with NSClient++ 0.4.1.73 x64
After installing and configuring Nagios I saw the following errors:
- C:\ Drive Space – Status UNKNOWN – Free disk space : Invalid drive
- Service Explorer – Status UNKNOWN – No handler for command: checkprocstate
- Service W3SVC – Status UNKNOWN – No handler for command: checkservicestate
- Memory Usage – Status CRITICAL – (Return code of 139 is out of bounds)
In my situation, the culprit turned out to be the “nsclient.ini” file on the Windows server. It contains a series of configuration items that are set to “0” (disabled) by default, so NSClient++ has nothing loaded to answer the checks Nagios sends. To fix the errors noted above, open “nsclient.ini” from C:\Program Files\NSClient++ (for the x64 version) on the Windows server.
- To fix the drive error (Free disk space : Invalid drive) change CheckDisk from 0 to 1
- To fix the other errors (No handler… and Return code of 139) change CheckSystem from 0 to 1
- Now, restart the “NSClient++ (x64)” service on the Windows server
You should see the errors start to disappear (assuming everything else is groovy).
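For reference, here’s roughly what the relevant part of “nsclient.ini” looks like after the change. This is a sketch rather than a copy of my file: I’m assuming the two settings sit under a [/modules] section, as in the default 0.4.x layout, and the comments are only there for illustration. If your file is organized differently, search for the CheckDisk and CheckSystem lines instead.

```ini
; Assumed section name; check your own nsclient.ini
[/modules]
; was 0: fixes "Free disk space : Invalid drive"
CheckDisk = 1
; was 0: fixes "No handler for command" and "Return code of 139"
CheckSystem = 1
```

Leave the other entries in the file alone, and remember that nothing changes until the service has been restarted.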
You should probably refer to the documentation here: http://www.nsclient.org/nscp/wiki/CheckCommands
It has tons of good information about the different CheckCommands. If you’re some kind of Nagios Jedi, please take satisfaction in knowing you’re better than me.