Documentation for Server & Application Monitor
Monitoring your applications and environment is a key capability of Hybrid Cloud Observability and is also available in a standalone module, Server & Application Monitor (SAM). Hybrid Cloud Observability and SAM are built on the self-hosted SolarWinds Platform.

Nagios Script Monitor

This SAM component monitor uses SSH to upload a Nagios script to a Linux/Unix server, runs the Nagios script on the server and then processes the script's exit code and text output. This monitor can return multiple values.

To learn about creating custom scripts with this monitor, see these topics in the SAM Custom Application Monitor Template Guide:

Sample scripts are provided in this default folder on the SolarWinds Platform server: C:\Program Files (x86)\SolarWinds\Orion\APM\SampleScriptMonitors

Statistic

The statistic for this component monitor is the value returned by the script. This component monitor can return multiple results and process the Nagios output. For reference, see Nagios Plugin API (© 2021 Nagios Enterprises, available at assets.nagios.com, obtained on June 22, 2021).

A maximum of 10 output pairs can be returned. If you exceed the maximum allowed, remove the excess output pairs or they will be ignored. For details, see Write scripts for SAM script monitors.

Prerequisites for SolarWinds Platform agent for Linux

To use the SolarWinds Platform agent for Linux, include credentials with permission to run scripts on target systems. Agentless monitoring does not require credentials.

Review Configure Linux/Unix systems for the SolarWinds Platform agent for Linux and set up target systems, which involves:

Troubleshooting tip

Like the Linux/Unix Script Monitor, the Nagios Script Monitor uses SSH to connect to target systems. If key exchange algorithms cause high CPU usage due to the number of active Job Engine Workers processes, you may notice gaps in polling data and slow responses from the SolarWinds Platform Web Console. If that occurs, try adjusting algorithms to reduce CPU usage, as described in Troubleshoot high CPU usage.

Changing key exchange algorithms can have security implications in some environments.

Field descriptions

Description

A default description of the monitor. To override the default description, add to or replace existing text. Changes are automatically saved. The variable to access this field is ${UserDescription}.

Customize descriptions to specify what will be monitored so related alerts and notifications are more meaningful later.

Component Type

Describes the type of monitor you are using.

Enable Component

Determines if the component is enabled. Disabling this component leaves it in the application as deactivated and does not influence application availability or status.

Authentication Type

Choose either User name and Password or User name and Private Key. The second option allows you to use certificates for authentication.

Credential for Monitoring

Select a Windows credential that belongs to a user who can log on to the SolarWinds Platform server and who has sufficient rights on the target node to do whatever the script needs to do. For example, if the script does something with WMI, the credentials also need WMI rights on the target node.

Click a credential in the list, or use the <Inherit credential from node> option. If the credential you need is not in the credentials list, add it in the SAM Credentials Library.

Port Number

Specify the port number used for the SSH connection. The default value is 22.

Script Working Directory

Specify the working directory of the script process.

Check Type

Set the check type to Service or Host. To prevent false positives, Nagios allows you to define how many times a service or host should be (re)checked before alerting for an issue. Depending on the selection, the monitor checks using the value configured on the server for the max_check_attempts option in the host or service definition.

Count Statistic as Difference

Changes the statistic to be the difference in query values between polling cycles.
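As a rough illustration of what "difference between polling cycles" means, the following Python sketch turns a series of raw polled values into per-cycle differences. The function name and sample values are hypothetical. How SAM treats the first poll, which has no previous value, is not described in this topic, so the sketch simply produces no statistic for it.

```python
from typing import List


def differences_between_polls(raw_values: List[float]) -> List[float]:
    """Return the change in value from each polling cycle to the next."""
    return [
        current - previous
        for previous, current in zip(raw_values, raw_values[1:])
    ]


# Hypothetical counter values returned by a script on four consecutive polls.
polled = [100, 130, 130, 175]
print(differences_between_polls(polled))  # prints [30, 0, 45]
```

This option is most natural for scripts that return an ever-increasing counter, where the change per cycle is more informative than the running total.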

Command Line

Specify the script to run on the target node, followed by the arguments. To enter your script, click the Edit button to open the script editing field.

For Solaris systems, this field is limited to 266 characters, minus the length of the ${SCRIPT} variable after it is resolved to a file name such as APM_937467589.pl. File names are typically 16 characters long, so the user input in the Command Line field cannot exceed 250 characters (266 - 16), not including the 9 characters of the “${SCRIPT}” variable itself. To pass a longer command line to the target node, create a shell script on the target node (for example, myscript.sh) that contains the long command line, and place the call to that script in the Command Line field, for example: /opt/sw/myscript.sh
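The Solaris length arithmetic can be checked before saving the component. The Python sketch below assumes the resolved file name is exactly 16 characters (the text says only "typically") and that ${SCRIPT} appears in the command line at most once. The function names are hypothetical and are not part of SAM.

```python
SOLARIS_LIMIT = 266            # total limit stated for Solaris targets
SCRIPT_VARIABLE = "${SCRIPT}"  # 9 characters before it is resolved
SAMPLE_FILE_NAME = "APM_937467589.pl"  # 16-character example from this topic


def user_input_budget(resolved_name: str = SAMPLE_FILE_NAME) -> int:
    """Characters left for user input once the file name is accounted for."""
    return SOLARIS_LIMIT - len(resolved_name)


def fits_solaris_limit(command_line: str,
                       resolved_name: str = SAMPLE_FILE_NAME) -> bool:
    """Check the length after ${SCRIPT} is replaced by the file name."""
    resolved = command_line.replace(SCRIPT_VARIABLE, resolved_name)
    return len(resolved) <= SOLARIS_LIMIT


print(user_input_budget())                               # prints 250
print(fits_solaris_limit("perl ${SCRIPT} -w 80 -c 90"))  # prints True
```

If the check fails, moving the long command into a shell script on the target node, as described above, keeps the Command Line field short.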

Script Body

Type or paste your script. You can test the script to receive output definitions, and then save the definitions to the component monitor for further configuration. Every saved definition is listed as Script Output with an assigned number and name. See Test script output for details.

Status Roll-Up

Specify how you want the monitor to report based on the output provided by the script. The default selection is “Show worst status.”

User Notes

Add notes for easy reference. You can access this field by using the variable, ${UserNotes}.

Troubleshoot high CPU usage

This component monitor uses SSH to connect to target systems. If key exchange algorithms cause high CPU usage by Job Engine Workers processes, you can reorder or remove algorithms to resolve issues. Contact Technical Support for assistance, if necessary.

Changing algorithms can have security implications in some environments.

  1. On the SolarWinds Platform server, navigate to this default folder: C:\Program Files (x86)\SolarWinds\Orion\APM.
  2. Create a backup copy of the SolarWinds.APM.Probes.dll.config file.
  3. In a text editor, open SolarWinds.APM.Probes.dll.config and locate the following section:
    <LinuxScriptSettings PromptWait="2" ColumnCount="200" TemporaryScriptFileNamePrefix="APM_" />
  4. Modify algorithms in that section, as necessary.
    • Reorder example:
      <LinuxScriptSettings PromptWait="2" ColumnCount="200" TemporaryScriptFileNamePrefix="APM_" KeyExchangeList="diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha256" />

    • Removal example:
      <LinuxScriptSettings PromptWait="2" ColumnCount="200" TemporaryScriptFileNamePrefix="APM_" KeyExchangeList="diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1" />

  5. Save your changes.
  6. Use the SolarWinds Platform Service Manager to restart all services.
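The attribute change in step 4 can be sketched in Python as follows. This is an illustration of the resulting element text only: it works on a copy of the element as a string and does not touch the file on disk, so the backup and manual edit in the steps above still apply. The function name is hypothetical, and the comma-separated format without spaces is taken from the reorder and removal examples.

```python
import xml.etree.ElementTree as ET
from typing import List

# The unmodified element as shown in step 3.
DEFAULT_ELEMENT = (
    '<LinuxScriptSettings PromptWait="2" ColumnCount="200" '
    'TemporaryScriptFileNamePrefix="APM_" />'
)


def with_key_exchange_list(element_xml: str, algorithms: List[str]) -> str:
    """Return the element text with KeyExchangeList set to the given order."""
    element = ET.fromstring(element_xml)
    if element.tag != "LinuxScriptSettings":
        raise ValueError("expected a LinuxScriptSettings element")
    # Existing attributes are kept; only KeyExchangeList is added or replaced.
    element.set("KeyExchangeList", ",".join(algorithms))
    return ET.tostring(element, encoding="unicode")


# Removal example from step 4: keep only two algorithms.
print(with_key_exchange_list(
    DEFAULT_ELEMENT,
    ["diffie-hellman-group-exchange-sha1", "diffie-hellman-group1-sha1"],
))
```

Which algorithms to keep, and in what order, depends on what the target systems support and on your security requirements.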

 

Scripts must report status through return codes

Nagios determines the status of a host or service by evaluating the return code. The following table shows a list of valid return codes, along with their corresponding service or host states.

To create this component monitor correctly, the script must first return an exit code that results in an Up (0), Warning (2), or Critical (3) status. When one of these exit codes is received, the appropriate dynamic evidence table structure is created and all further exit codes are handled correctly. If the component returns only Down (1) or Unknown (4) on first use, the dynamic evidence table structure is not created correctly.

Return Code    Service State    Host State
0              OK               Up
1              Warning          Up or Down/Unreachable*
2              Critical         Down/Unreachable
3              Unknown          Down/Unreachable

* If the Use Aggressive Host Checking option is enabled, return codes of 1 will result in a host state of Down; otherwise, return codes of 1 will result in a host state of Up.
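The return code table and its note on aggressive host checking can be expressed as a small lookup. The Python sketch below is only a restatement of the table. The function and parameter names are hypothetical, and it does not describe how SAM maps these states to its own component statuses.

```python
from typing import Tuple

SERVICE_STATES = {0: "OK", 1: "Warning", 2: "Critical", 3: "Unknown"}


def nagios_states(return_code: int,
                  aggressive_host_checking: bool = False) -> Tuple[str, str]:
    """Map a Nagios return code to (service state, host state)."""
    if return_code not in SERVICE_STATES:
        raise ValueError(f"not a valid Nagios return code: {return_code}")
    if return_code == 0:
        host_state = "Up"
    elif return_code == 1:
        # Return code 1 is the only case that depends on the option.
        host_state = "Down" if aggressive_host_checking else "Up"
    else:
        host_state = "Down/Unreachable"
    return SERVICE_STATES[return_code], host_state


print(nagios_states(1))                                 # prints ('Warning', 'Up')
print(nagios_states(1, aggressive_host_checking=True))  # prints ('Warning', 'Down')
```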

Nagios scripts must exit with a valid return code and a line of text output. The exit code determines the status of the component. If the exit code is 0 (OK), the component status may be further modified by thresholds from the optional statistics. To return up to ten optional statistics, separate the statistics from the status message with the pipe (|) symbol using the following syntax:

statusMessage [|'statisticName'=value]

Below is an example of valid output with a status message and two statistics:

The script ran. | 'CPU%'=75.2 'MemoryRemainingInKB'=600784
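A script that produces this kind of output might look like the following Python sketch. It assumes a Python interpreter is available on the target node, and the function names are hypothetical. Raising an error when more than ten statistics are supplied is a choice made for this sketch; per this topic, the monitor itself ignores excess output pairs.

```python
import sys
from typing import Dict

MAX_STATISTICS = 10  # maximum number of output pairs stated in this topic


def format_output(status_message: str, statistics: Dict[str, float]) -> str:
    """Build one output line: statusMessage [|'statisticName'=value]."""
    if len(statistics) > MAX_STATISTICS:
        raise ValueError("at most 10 statistics can be returned")
    if not statistics:
        return status_message
    pairs = " ".join(f"'{name}'={value}" for name, value in statistics.items())
    return f"{status_message} | {pairs}"


def report_and_exit(return_code: int, status_message: str,
                    statistics: Dict[str, float]) -> None:
    """Print the output line, then exit with the Nagios return code."""
    print(format_output(status_message, statistics))
    sys.exit(return_code)


# Reproduces the example output line above. A deployed script would call
# report_and_exit(0, ...) instead, so that the return code is set as well.
print(format_output("The script ran.",
                    {"CPU%": 75.2, "MemoryRemainingInKB": 600784}))
```

In a deployed script, the statistics would come from real measurements on the target node, and the return code would reflect the result of the check.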

Test script output

Test the script output while editing the script, before testing the script in the template or application component pages. If output values are missing or incorrectly formatted, you may receive the error "Script output values are not defined or improperly defined." This error appears when the named fields cannot be located in the script output.
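Before testing in SAM, it can help to check an output line locally. The Python sketch below is a hypothetical pre-check, not SAM's own parser. It accepts only the simple 'statisticName'=value form shown in this topic, with numeric values, and rejects anything else after the pipe symbol.

```python
import re
from typing import Dict, Tuple

# Matches one 'name'=value pair; the value is any run of non-space characters.
PAIR_PATTERN = re.compile(r"'([^']+)'=(\S+)")
MAX_STATISTICS = 10


def parse_script_output(line: str) -> Tuple[str, Dict[str, float]]:
    """Split an output line into its status message and statistics."""
    message, _, stats_part = line.partition("|")
    statistics: Dict[str, float] = {}
    for name, raw_value in PAIR_PATTERN.findall(stats_part):
        statistics[name] = float(raw_value)  # raises ValueError if not numeric
    # Anything left after removing well-formed pairs is malformed text.
    leftover = PAIR_PATTERN.sub("", stats_part).strip()
    if leftover:
        raise ValueError(f"unrecognized statistic text: {leftover!r}")
    if len(statistics) > MAX_STATISTICS:
        raise ValueError("more than 10 statistics in the output")
    return message.strip(), statistics


message, stats = parse_script_output(
    "The script ran. | 'CPU%'=75.2 'MemoryRemainingInKB'=600784"
)
print(message)
print(stats)
```

A line that passes this check can still be rejected by SAM for other reasons, so treat it as a first filter before using Get Script Output.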

To test the script and save output definitions:

  1. Open the template or application monitor that uses the Nagios Script Monitor. Click Settings > All Settings > SAM Settings > Manage Application Monitors, and then locate and edit the application monitor or template.
  2. Locate and expand the component using the Nagios Script Monitor type in the template.
  3. Locate the Script Body field and click Edit Script.
  4. In the Edit Script dialog box, click Get Script Output. You may be prompted to specify a test node and credentials.
  5. Wait for the Output Result. The results should populate with values returned by the script. Review the results to ensure all formatting is correct and fields properly load.

    For reference, see Nagios Plugin API.

  6. You can store the output definitions returned by the script test as Script Output in the component monitor. Click Save to add the output definitions. The component monitor will display the definitions with a unique ID, display name, and additional configuration options.
  7. To save changes to the template or application monitor, click Submit.