Configuring Nagios and Nolio for Automated Remediation
IT operations teams that use the Nagios monitoring system commonly set thresholds for many monitored components, such as devices, hosts, and services. When a preconfigured threshold is breached, an alert notifies the relevant people in the data center, NOC, or helpdesk. Problem resolution, troubleshooting, remediation, and recovery are normally performed manually. This takes time, often enough to breach SLAs, and in some cases too many people are involved, which leads to faults caused by human error (for example, someone forgets to restart a service on a certain host or to set a certain value in an XML file).
The Nolio-Nagios integration allows IT operations to automate any recovery or remediation process and to automatically trigger problem resolution workflows, such as collecting logs from multiple servers or gathering values from configuration files and databases.
This page explains how to configure Nolio for remediation and recovery workflows triggered by Nagios alerts.
How to Configure Nolio-Nagios Integration in 3 Simple Steps
Step 1: Design the remediation or recovery process in Nolio
Use the Nolio workflow designer to create any remediation or recovery workflow and to define the data center environment in which this workflow should be activated in case of an alert.
Step 2: Install the Nolio CLI Component
On the Nagios Core server, install Nolio CLI.
Follow the instructions in the installation wizard.
Configure the properties of Nolio ASAP.
Step 3: Configure the Nagios Commands File
Open the Nagios commands file (/nagios/etc/objects/command.cfg) and add the following command:
define command{
    command_name notify-nolio-iis-failure
    command_line /opt/Nolio/NolioCLI/ExecutionRelay.sh -a <Application_Name> -e <Environment_Name> -f <Process_Name> -u <User_Name> -p <Password>
}
All arguments listed below are mandatory, except -s (servers) and -r (parameters), which are optional:
- -u <User_Name>
- -p <Password>
- -a <Application_Name>
- -e <Environment_Name>
- -f <Process_Name>
- -s {ServerName1,ServerName2} (optional)
- -r {ServerName/ParameterName,ParameterValue} (optional; the parameter path can sometimes be ServerName/FolderName/ParameterName)
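The argument rules above can be illustrated with a small shell helper that assembles the ExecutionRelay.sh call and rejects a call that lacks a mandatory value. This is only a sketch: the function name build_relay_cmd and the sample values in the usage note are hypothetical, and the exact value syntax for -s and -r (for example, whether the braces are literal) is not confirmed here, so the helper passes those values through unchanged.

```shell
#!/bin/sh
# Sketch: assemble the ExecutionRelay.sh command line described above.
# build_relay_cmd is a hypothetical helper, not part of Nolio or Nagios.
# Usage: build_relay_cmd USER PASSWORD APP ENV PROCESS [SERVERS] [PARAMS]
build_relay_cmd() {
    user="${1:-}"
    password="${2:-}"
    app="${3:-}"
    env_name="${4:-}"
    process="${5:-}"
    servers="${6:-}"
    params="${7:-}"

    # The five mandatory arguments must all be non-empty.
    if [ -z "$user" ] || [ -z "$password" ] || [ -z "$app" ] \
        || [ -z "$env_name" ] || [ -z "$process" ]; then
        echo "error: -u, -p, -a, -e and -f are mandatory" >&2
        return 1
    fi

    cmd="/opt/Nolio/NolioCLI/ExecutionRelay.sh -a $app -e $env_name -f $process -u $user -p $password"

    # The optional arguments are appended only when a value was supplied.
    if [ -n "$servers" ]; then
        cmd="$cmd -s $servers"
    fi
    if [ -n "$params" ]; then
        cmd="$cmd -r $params"
    fi

    # Print the command for review; values containing spaces would need quoting.
    echo "$cmd"
}
```

For instance, build_relay_cmd nagios secret Online_Service_Application NY Restart_IIS web01,web02 prints a command line ending in -s web01,web02, where every value is a made-up placeholder. Printing the command instead of running it makes it easy to review the line before pasting it into command_line.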
Example
This example describes a scenario where Nagios is monitoring the IISADMIN service on certain servers in data center NY, which are part of the Online_Service_Application. Once Nagios triggers an alert, the Nolio workflows are activated and the recovery from the IISADMIN failure is performed automatically.
Step 1: Nagios alerts that the IISADMIN service has stopped
Step 2: Nolio workflow is triggered automatically
Step 3: Nagios indicates that the problem has been resolved
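One way the alert-to-recovery flow in this example could be wired is through a small wrapper script that calls the Nolio CLI only once Nagios has confirmed the failure. The sketch below is an assumption, not the documented integration: the function names, the NOLIO_* variables, the environment name NY, and the process name Recover_IISADMIN are all hypothetical. It expects the service state and state type as its first two arguments, which Nagios can supply through the $SERVICESTATE$ and $SERVICESTATETYPE$ macros.

```shell
#!/bin/sh
# Sketch of a wrapper that triggers the Nolio workflow only on a confirmed failure.
# All names and default values below are hypothetical placeholders.
RELAY="${RELAY:-/opt/Nolio/NolioCLI/ExecutionRelay.sh}"
NOLIO_APP="${NOLIO_APP:-Online_Service_Application}"
NOLIO_ENV="${NOLIO_ENV:-NY}"
NOLIO_PROCESS="${NOLIO_PROCESS:-Recover_IISADMIN}"
NOLIO_USER="${NOLIO_USER:-}"
NOLIO_PASSWORD="${NOLIO_PASSWORD:-}"

# Succeeds only for a CRITICAL state that Nagios has confirmed (HARD).
# $1 = service state, $2 = state type.
should_remediate() {
    case "$1:$2" in
        CRITICAL:HARD) return 0 ;;
        *) return 1 ;;
    esac
}

# Runs the Nolio CLI when remediation is warranted; otherwise does nothing.
handle_event() {
    if should_remediate "${1:-}" "${2:-}"; then
        "$RELAY" -a "$NOLIO_APP" -e "$NOLIO_ENV" -f "$NOLIO_PROCESS" \
            -u "$NOLIO_USER" -p "$NOLIO_PASSWORD"
    fi
}

# With no arguments (or a non-critical state) this call is a no-op.
handle_event "$@"
```

In Nagios, a script like this would typically be referenced from a command definition whose command_line passes $SERVICESTATE$ and $SERVICESTATETYPE$, and that command would be attached to the monitored service with the event_handler directive. Reacting only to HARD states avoids starting a recovery workflow while Nagios is still rechecking a transient failure.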

