Documentation for Server & Application Monitor
Monitoring your applications and environment is a key capability of Hybrid Cloud Observability and is also available in a standalone module, Server & Application Monitor (SAM). Hybrid Cloud Observability and SAM are built on the self-hosted SolarWinds Platform.

HP-UX

This SAM application monitor template assesses the performance of the HP-UX operating system installed on the target server. It uses Perl scripts to monitor the performance of queries.

The template supports all versions of HP-UX. SolarWinds recommends using the NET-SNMP agent for monitoring. Download and install the NET-SNMP agent on the HP-UX server.

This template was tested on HP-UX B.11.23.

Prerequisites

SSH and Perl installed on the target server.

Credentials

Root credentials on the target server.

Port

Use port 1161 for the template.

Component monitors

Set thresholds for counters according to your environment. Monitor the counters for a period of time to learn their typical value ranges, and then set the thresholds accordingly.
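One way to turn a baseline period into threshold values is sketched below. The rule used here (mean plus a multiple of the standard deviation) is an assumption chosen for illustration, not a SolarWinds recommendation, and the function name and sample readings are hypothetical.

```python
import statistics


def suggest_threshold(samples, k=2.0):
    """Return mean + k * sample standard deviation of the observed values.

    This rule is an illustrative assumption; pick whatever rule suits
    your environment.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    return statistics.mean(samples) + k * statistics.stdev(samples)


# Hypothetical interrupts/sec readings collected during a baseline period.
baseline = [900, 1000, 1100, 1000, 1000]

warning = suggest_threshold(baseline, k=2.0)
critical = suggest_threshold(baseline, k=3.0)
print(f"warning: {warning:.0f}, critical: {critical:.0f}")
```

A longer baseline that covers busy and quiet periods gives a more representative spread than a handful of readings.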

If the terminal characteristics on the target server are set incorrectly, you receive errors that resemble the following:

Can't modify constant item in scalar assignment at /tmp/APM_1933032963.pl line 1, near ");"
syntax error at /tmp/APM_1933032963.pl line 4, near "++) "
syntax error at /tmp/APM_1933032963.pl line 7, near "++) "
syntax error at /tmp/APM_1933032963.pl line 10, near "}"

Execution of /tmp/APM_1933032963.pl aborted due to compilation errors.

To resolve these errors, add the following line to /etc/profile:

stty erase "^H" kill "^U" intr "^C" eof "^D"

CPU statistic (%)

Returns the percentage of CPU time used. The returned values are as follows:

  • User – The percentage of CPU time spent running non-kernel code (user time). This statistic depends on the programs that the user is running. It is recommended to use the lowest threshold possible.
  • System – The percentage of CPU time spent running the system kernel code (system time). It is recommended to use the lowest threshold possible.
  • Idle – The percentage of CPU time spent idle. It is recommended to use the highest threshold possible at all times.

System faults statistic/sec

Returns the rate of system faults, per second. The returned values are as follows:

  • Interrupts – The number of interrupts per second. The threshold for this component depends on the processor. For modern CPUs, a threshold of 1,500 interrupts/sec is acceptable. A dramatic increase in this value, without a corresponding increase in system activity, indicates a hardware problem.
  • System_Calls – The number of system calls per second. This is a measure of how busy the system is handling applications and services. High System Calls/sec indicates high utilization caused by software. With today's faster CPUs, 20,000 would represent a reasonable threshold.
  • Context_Switches – The number of context switches per second. High activity rates can result from inefficient hardware or poorly designed applications. The normal amount of Context Switches/Sec depends on your servers and applications. The threshold for Context Switches/sec is cumulative for all processors, so you need a minimum of 14,000 per processor (single=14,000, dual=28,000, quad=56,000, and so forth).
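The per-processor scaling described for Context_Switches can be sketched as a small calculation. The function and constant names are hypothetical; only the 14,000 figure comes from the text above.

```python
# Per-processor figure quoted in the Context_Switches description.
PER_CPU_CONTEXT_SWITCHES = 14_000


def context_switch_threshold(cpu_count):
    """Return the cumulative Context Switches/sec threshold for a server."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return PER_CPU_CONTEXT_SWITCHES * cpu_count


for cpus in (1, 2, 4):
    print(cpus, context_switch_threshold(cpus))
```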

Processes statistic

Returns the number of processes in different states. The returned values are as follows:

  • In_Run_Queue – The average number of runnable kernel threads over the sampling interval. This should be as low as possible. If the run queue is constantly growing, it may indicate the need for a more powerful CPU or more CPUs. Set the thresholds appropriately for your environment.
  • Waiting_For_resources – The average number of kernel threads placed in the VMM wait queue (awaiting resource, awaiting input/output) over the sampling interval. This should be as low as possible. Set the thresholds appropriately for your environment.
  • Swapped – The number of runnable or short sleeper (< 20 secs) but swapped processes.

Memory and Swap statistic (MB)

Returns the memory and swap statistics in MB. The returned values are as follows:

  • Free_Memory – The amount of available memory in MB. Use the highest threshold possible at all times. Set the thresholds appropriately for your environment.
  • Used_Memory – The amount of used memory in MB. Use the lowest threshold possible.
  • Free_Swap – The amount of available swap in MB. Use the highest threshold possible at all times. Set the thresholds appropriately for your environment.
  • Used_Swap – The amount of used swap in MB. Use the lowest threshold possible.

Paging statistic/sec

Returns the different paging statistics. The returned values are as follows:

  • Address_Translation_Faults – The number of page address translation faults per second (valid page not in memory). Use the lowest threshold possible.
  • Paged_In – The rate of pages "paged in" from paging space in kB, per second. The operation of reading one inactive page or a cluster of inactive memory pages from the disk is called a "page in." Use the lowest threshold possible.
  • Paged_Out – The rate of pages "paged out" to paging space in kB, per second. The operation of writing one inactive page or a cluster of inactive memory pages to the disk is called a "page out." Use the lowest threshold possible. Values above roughly 20 pages (80 kB) per second indicate a significant performance problem. In this situation, install more memory.
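The "20 pages (80 kB)" figure implies a page size of 4 kB. The sketch below makes that conversion explicit so the limit can be recomputed if the server uses a different page size. The 4 kB default is an assumption derived from that figure, and the function names are hypothetical.

```python
# Assumption: 4 kB pages, as implied by "20 pages (80 kB)".
PAGE_SIZE_KB = 4


def pages_to_kb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to kB for the given page size."""
    return pages * page_size_kb


def paged_out_exceeds(rate_kb_per_sec, limit_pages=20, page_size_kb=PAGE_SIZE_KB):
    """Return True when a Paged_Out rate in kB/sec is above the page limit."""
    return rate_kb_per_sec > pages_to_kb(limit_pages, page_size_kb)


print(pages_to_kb(20))          # limit expressed in kB
print(paged_out_exceeds(100))   # a rate of 100 kB/sec is above the limit
```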

Processes in different states

Returns the number of processes in each state. The returned values are as follows:

  • Sleeping – The number of processes that are waiting for an event to complete.
  • Runnable – The number of processes that are on the run queue.
  • Zombie – The number of processes that are terminated and where the parent is not waiting. This should always be zero. If it is not zero, you should manually kill zombie processes. Use the following command to see these zombie processes: ps -el | grep " Z "
  • Stopped – The number of processes that are stopped, either by a job control signal or because they are being traced.
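As a rough illustration of how these counts relate to the ps -el output mentioned above, the sketch below tallies the state column of a captured listing. It is not the template's own script. The sample listing is made up, and the mapping of the state codes S, R, Z, and T to the four values is an assumption based on common ps conventions, so check it against the ps man page on your server.

```python
from collections import Counter

# Assumed mapping from the ps "S" column to the values listed above.
STATE_NAMES = {"S": "Sleeping", "R": "Runnable", "Z": "Zombie", "T": "Stopped"}

# Hypothetical ps -el output, shortened for illustration.
SAMPLE_PS_OUTPUT = "\n".join([
    "  F S   UID   PID  PPID  C PRI NI   ADDR  SZ  WCHAN TTY    TIME COMD",
    "  3 S     0     1     0  0 168 20  a0b1c  96  d1e2f ?      0:01 init",
    "  1 S     0   512     1  0 154 20  a0b2d  48  d1e30 ?      0:00 syslogd",
    "  1 R   101  2040  2001  1 178 20  a0b3e 120      - pts/0  0:00 ps",
    "  1 Z   101  2033  2001  0 178 20      -   0      - pts/0  0:00 <defunct>",
])


def count_states(ps_output):
    """Count processes per state from the second column of ps -el output."""
    counts = Counter({name: 0 for name in STATE_NAMES.values()})
    # Skip the header line, then read the state code from each row.
    for line in ps_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        name = STATE_NAMES.get(fields[1])
        if name is not None:
            counts[name] += 1
    return dict(counts)


print(count_states(SAMPLE_PS_OUTPUT))
```

State codes that are not in the mapping are ignored, so the totals may be lower than the number of rows in the listing.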

Space on root (/) partition (MB)

Returns the available and used space of the root (/) partition in MB. The returned values are as follows:

  • Available_Space – The available space on the root (/) partition in MB. Use the highest threshold possible at all times.
  • Used_Space – The used space on the root (/) partition in MB.

Using percentage of active system devices

Returns the name of the active system device and the percentage of time the device was busy servicing a transfer request.

Disk operations/sec of active system devices

Returns the name of the active system device and its read/write transfers to or from the device.

Top 10 active processes

Returns the top 10 active processes and each process's share of CPU usage, as a percentage.