Documentation for Server & Application Monitor
Monitoring your applications and environment is a key capability of Hybrid Cloud Observability and is also available in a standalone module, Server & Application Monitor (SAM). Hybrid Cloud Observability and SAM are built on the self-hosted SolarWinds Platform.

Unix CPU Monitoring Perl

This SAM application monitor template uses Perl scripts to assess the CPU performance of computers running AIX 5.3 or 6.1; Solaris 8, 9, or 10; or HP-UX 11.0.

Prerequisites

SSH and Perl are installed on the target server.

If Perl is installed in a location other than /usr/bin/perl, edit the path in the first line of the "script body" field (#!/usr/bin/perl) for each component monitor, or create a symbolic link to Perl with the ln command.

To determine where Perl is installed, use the following command: which perl
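As a minimal sketch of the two options above, the following shell function prints the symbolic-link command that would make a non-default Perl installation reachable at /usr/bin/perl. The function name suggest_perl_link and the example path /usr/local/bin/perl are illustrative assumptions and are not part of the template.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAM): given the path reported by
# `which perl`, print what to do so that scripts starting with
# #!/usr/bin/perl can run.
suggest_perl_link() {
    perl_path=$1
    default_path=/usr/bin/perl
    if [ "$perl_path" = "$default_path" ]; then
        echo "no change needed"
    else
        # Alternative: change the first line of the script body to "#!$perl_path".
        echo "ln -s $perl_path $default_path"
    fi
}

# Example: Perl lives in /usr/local/bin (an assumed location).
suggest_perl_link /usr/local/bin/perl   # prints: ln -s /usr/local/bin/perl /usr/bin/perl
```

Creating the link typically requires root privileges on the target server. Editing the first line of each script body avoids changing the server, but has to be repeated for every component monitor.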

Credentials

Root credentials on the target server.

Some Unix implementations such as Solaris have a character limit in the input buffers of SSH sessions that prevents SAM from copying scripts. If a monitored component fails to return a result or returns error code 255, manually copy the script to the target machine and then update the path and filename in the Command Line field. For example, if you manually copy a script as /usr/script.pl, change the Command Line to: perl /usr/script.pl.
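The manual workaround might look like the sketch below. The host name target-host, the local file name script.pl, and the helper command_line_for are placeholders for illustration. scp is one common way to copy the file, and any other transfer method works equally well.

```shell
#!/bin/sh
# Hypothetical helper: build the value for the Command Line field
# from the location of the manually copied script.
command_line_for() {
    echo "perl $1"
}

# 1. Copy the script to the target machine (run from your workstation;
#    "target-host" and "script.pl" are placeholder names):
#      scp script.pl root@target-host:/usr/script.pl
# 2. Enter the following in the Command Line field of the component monitor:
command_line_for /usr/script.pl   # prints: perl /usr/script.pl
```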

Component monitors

Components without predetermined threshold values have guidance such as "Use the lowest threshold possible" or "Use the highest threshold possible" to help you find a threshold appropriate for your application. To learn more, see Manage thresholds in SAM.

CPU User Time (%)

Percent of CPU time spent running non-kernel code (user time). The value depends on the programs that users run on the server.

Use the lowest threshold possible.

CPU System Time (%)

Percent of CPU time spent running system kernel code (system time).

Use the lowest threshold possible.

CPU Idle Time (%)

Percent of CPU time spent idle.

Use the highest threshold possible at all times.

Interrupts/sec

The number of interrupts per second.

The threshold depends on the processor. For modern CPUs, a threshold of 1500 interrupts/sec is a good starting point. A dramatic increase in this counter value without a corresponding increase in system activity indicates a hardware problem.

System calls/sec

The number of system calls per second. This is a measure of how busy the system is taking care of applications and services. A high System calls/sec value indicates high utilization caused by software.

Set the thresholds appropriately for your environment.

Context switches/sec

The number of context switches per second. High activity rates can result from inefficient hardware or poorly designed applications. The normal Context switches/sec value depends on your servers and applications.

To set the threshold, baseline the server. The threshold for Context switches/sec is cumulative across all processors, so allow a minimum of 14000 per processor (single = 14000, dual = 28000, quad = 56000, and so forth).
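The per-processor arithmetic above can be sketched as a small shell function. The function name ctx_switch_threshold is an assumption for illustration, and the result is only the minimum suggested by the guidance, to be refined against your own baseline.

```shell
#!/bin/sh
# Minimum Context switches/sec threshold, cumulative across processors.
# 14000 per processor is the figure given in the guidance above.
ctx_switch_threshold() {
    cpus=$1
    echo $((cpus * 14000))
}

ctx_switch_threshold 1   # prints: 14000
ctx_switch_threshold 4   # prints: 56000
```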

Kernel threads in run queue

AIX and Solaris: Average number of runnable kernel threads over the sampling interval. Runnable threads include those that are ready but waiting to run and those that are already running.

HP-UX: Rename this counter to “Processes in run queue”. On HP-UX it reports the average number of runnable processes over the sampling interval.

This should be as low as possible. If the run queue is constantly growing, it may indicate the need for a more powerful CPU or more CPUs.

Set the thresholds appropriately for your environment.

Kernel threads blocked waiting resources

AIX and Solaris: Average number of kernel threads placed in the VMM wait queue (awaiting resource, awaiting input/output) over the sampling interval.

HP-UX: Rename this counter to “Processes blocked waiting resources”. On HP-UX it reports the average number of processes blocked for resources (I/O, paging, and so on) over the sampling interval.

This should be as low as possible.

Set the thresholds appropriately for your environment.

Total amount of system calls after boot

The total number of system calls after boot.

Total amount of device interrupts after boot

The total number of device interrupts after boot.

Total amount of CPU context switches after boot

The total number of CPU context switches after boot.
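Cumulative totals such as these can be turned into a per-second rate by sampling the counter twice and dividing the difference by the elapsed time. The sketch below illustrates that general arithmetic only. It is an assumption about how such values relate, not a description of how the template's Perl scripts compute them, and the function name rate_per_sec and the sample numbers are invented.

```shell
#!/bin/sh
# Hypothetical helper: average rate per second between two samples of a
# cumulative counter taken interval seconds apart (integer arithmetic).
rate_per_sec() {
    first=$1
    second=$2
    interval=$3
    if [ "$interval" -le 0 ]; then
        echo "interval must be positive" >&2
        return 1
    fi
    echo $(( (second - first) / interval ))
}

# Two invented samples of a context-switch total, taken 10 seconds apart.
rate_per_sec 120000 135000 10   # prints: 1500
```

Because the totals restart from zero at boot, a second sample that is smaller than the first usually means the server was restarted between samples.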