Configuring Nagios and Nolio for Automated Remediation

IT operations teams that use the Nagios monitoring system commonly set thresholds for numerous monitored components such as devices, hosts, and services. When a preconfigured threshold is breached, an alert notifies the relevant people in the data center, NOC, or helpdesk. Problem resolution, troubleshooting, remediation, and recovery are normally performed manually. This takes time, often enough to breach SLAs, and in some cases involves so many people that human error introduces further faults (for example, someone forgets to restart a service on a certain host or to set a certain value in an XML file).

The Nolio-Nagios integration allows IT operations to automate any recovery or remediation process and to automatically trigger problem resolution workflows, such as collecting logs from multiple servers or gathering values from configuration files and databases.

This page explains how to configure Nolio for remediation and recovery workflows triggered by Nagios alerts.

How to Configure Nolio-Nagios Integration in 3 Simple Steps

Step 1: Design the remediation or recovery process in Nolio

Use the Nolio workflow designer to create any remediation or recovery workflow and to define the data center environment in which this workflow should be activated in case of an alert.

Nolio Screenshot: Nolio remediation workflow

Step 2: Install the Nolio CLI Component

On the Nagios Core server, install Nolio CLI.

Nolio Screenshot: Installing Nolio CLI

Follow the instructions in the installation wizard.

Nolio Screenshot: Installing Nolio CLI

Configure the properties of Nolio ASAP.

Nolio Screenshot: Installing Nolio CLI

Step 3: Configure Nagios Commands File

Open the Nagios commands file (/nagios/etc/objects/command.cfg) and add the following command:

define command{
    command_name    notify-nolio-iis-failure
    command_line    /opt/Nolio/NolioCLI/ExecutionRelay.sh -a <Application_Name> -e <Environment_Name> -f <Process_Name> -u <User_Name> -p <Password>
}

Command line parameters
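With the exception of the optional -s (servers) and -r (parameters) arguments, all arguments listed below are mandatory:

-a <Application_Name>: application name
-e <Environment_Name>: environment name
-f <Process_Name>: process name
-u <User_Name>: user name
-p <Password>: password
-s: servers (optional)
-r: parameters (optional)

Note: the -r argument takes the form {ServerName/ParameterName,ParameterValue}. In some cases the path is ServerName/FolderName/ParameterName.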

Nagios Commands file with Nolio remediation workflow configured
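
The notify-nolio-iis-failure command only takes effect once something else in the Nagios configuration refers to it. One possible way to wire it up, sketched here as an assumption rather than as the documented Nolio procedure, is to attach it as an event handler on the monitored service. The host name iis-web-01 and the check command check_iisadmin are hypothetical placeholders, and generic-service is assumed to be an existing service template. event_handler and event_handler_enabled are standard Nagios Core service directives.

```
define service{
    use                     generic-service           ; assumed existing service template
    host_name               iis-web-01                ; hypothetical host name
    service_description     IISADMIN
    check_command           check_iisadmin            ; hypothetical check command
    event_handler           notify-nolio-iis-failure  ; command defined in the commands file
    event_handler_enabled   1
}
```

Nagios runs an event handler on service state changes, including soft states and recoveries, so a production setup would typically filter on the service state before launching the workflow.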

Example

This example describes a scenario in which Nagios is monitoring the IISADMIN service on certain servers in data center NY, which are part of the Online_Service_Application. Once Nagios triggers an alert, the Nolio workflows are activated and the recovery from the IISADMIN failure is performed automatically.
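
For this scenario, the placeholders in the command definition might be filled in roughly as follows. This is a sketch only: the environment name NY and the process name Recover_IISADMIN are hypothetical, since the actual names are whatever was defined in the Nolio workflow designer. Reading the credentials from $USER3$ and $USER4$ assumes those macros have been defined in the Nagios resource file, which keeps the password out of the commands file.

```
define command{
    command_name    notify-nolio-iis-failure
    # Application name taken from the scenario; environment and process names are hypothetical.
    # $USER3$ and $USER4$ are assumed to hold the Nolio user name and password (resource file).
    command_line    /opt/Nolio/NolioCLI/ExecutionRelay.sh -a Online_Service_Application -e NY -f Recover_IISADMIN -u $USER3$ -p $USER4$
}
```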

Step 1: Nagios Alert that IISADMIN Service has stopped

Nagios Alert that IISADMIN Service has stopped (time stamp:  2:59:45)

Step 2: Nolio workflow is triggered automatically

Nolio workflow is triggered automatically

Step 3: Nagios indicates that problem has been resolved

Nagios indicates that problem has been resolved
