Documentation for AppOptics

System Metrics

This integration was updated to the V2 plugin framework with the release of SolarWinds Snap Agent 3.0.0, but the old way of configuring it is still supported for backwards compatibility. If you still use deprecated configuration files, please refer to the previous Linux or Windows documentation.

CloudLinux limits the number of system users that can read special files from /proc. As a result, some metrics may not be gathered. Please refer to the CloudLinux OS documentation for instructions on how to grant access to the solarwinds group.

Overview

By default, the SolarWinds Snap Agent collects basic metrics from the host, such as CPU, memory, disk, and network utilization. There should be no need to modify the default configuration.

This integration was updated to the V2 plugin framework. If you are still using the deprecated way of configuring it, please refer to the Linux or Windows documentation.

Setup

System monitoring is performed by the system plugin. If you need to modify its default settings, follow the instructions below.

Configuration

The system task configuration file is located at:

On Windows:

> C:\ProgramData\SolarWinds\Snap\tasks-autoload.d\task-aosystem.yaml

On the Windows Server family, metrics such as /system/io/ work only after you enable them by running the command diskperf -y in a command prompt.

For more information, please refer to the psutil disk I/O counters documentation.

On Linux:

$ /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-aosystem.yaml

To make changes to the default setup:

  1. Edit the task file with settings specific to your use case:

    ---
    version: 2

    schedule:
      type: cron
      interval: "0 * * * * *"

    plugins:
      - plugin_name: aosystem

        config:
          ## mount_points allows filtering the mount points that will be monitored.
          ## The default behavior is to monitor only physical devices (i.e. hard disks, USB, etc.).
          ## To enable monitoring of all devices, use '*'.
          # mount_points:
          #   - /dev
          #   - /run
          #   - C:

          ## exclude_disks defines which mount points should not be monitored.
          ## Basic globbing patterns are supported.
          # exclude_disks:
          #   - /dev/loop*
          #   - /dev/sdb1
          #   - D:

          ## Timeout for collecting system metrics. Default value is 20s.
          # system_query_timeout: "20s"

        metrics:
          - /system/cpu/guest
          - /system/cpu/idle
          - /system/cpu/interrupt
          - /system/cpu/iowait
          - /system/cpu/steal
          - /system/cpu/system
          - /system/cpu/user
          - /system/cpu/utilization
          - /system/cpu/per_cpu/[cpu]/guest
          - /system/cpu/per_cpu/[cpu]/idle
          - /system/cpu/per_cpu/[cpu]/interrupt
          - /system/cpu/per_cpu/[cpu]/iowait
          - /system/cpu/per_cpu/[cpu]/steal
          - /system/cpu/per_cpu/[cpu]/system
          - /system/cpu/per_cpu/[cpu]/user
          - /system/cpu/per_cpu/[cpu]/utilization
          - /system/disk/[mount_point]/bytes/free
          - /system/disk/[mount_point]/bytes/total
          - /system/disk/[mount_point]/bytes/used
          - /system/disk/[mount_point]/percent/free
          - /system/disk/[mount_point]/percent/used
          - /system/io/[mount_point]/bytes/read
          - /system/io/[mount_point]/bytes/write
          - /system/io/[mount_point]/io_time
          - /system/io/[mount_point]/io_weighted_time
          - /system/io/[mount_point]/ops/read
          - /system/io/[mount_point]/ops/write
          - /system/io/[mount_point]/time/read
          - /system/io/[mount_point]/time/write
          - /system/load/15
          - /system/load/15_rel
          - /system/load/1
          - /system/load/1_rel
          - /system/load/5
          - /system/load/5_rel
          - /system/load/procs_blocked
          - /system/load/procs_running
          - /system/mem/buffered
          - /system/mem/cached
          - /system/mem/free
          - /system/mem/inactive
          - /system/mem/nonpaged
          - /system/mem/paged
          - /system/mem/percent/free
          - /system/mem/percent/used
          - /system/mem/total
          - /system/mem/used
          - /system/mem/wired
          - /system/net/all/bytes/rx
          - /system/net/all/bytes/tx
          - /system/net/all/drop/rx
          - /system/net/all/drop/tx
          - /system/net/all/errors/rx
          - /system/net/all/errors/tx
          - /system/net/all/packets/rx
          - /system/net/all/packets/tx
          - /system/swap/ins
          - /system/swap/outs
          - /system/swap/page/fault
          - /system/swap/page/ins
          - /system/swap/page/outs
          - /system/swap/percent/free
          - /system/swap/percent/used
          - /system/swap/total
          ## For backwards compatibility
          - /system/net/bytes/rx
          - /system/net/bytes/tx
          - /system/net/drop/rx
          - /system/net/drop/tx
          - /system/net/errors/rx
          - /system/net/errors/tx
          - /system/net/packets/rx
          - /system/net/packets/tx
          ## optional metrics
          # - /system/net/interface/[interface]/bytes/rx
          # - /system/net/interface/[interface]/bytes/tx
          # - /system/net/interface/[interface]/drop/rx
          # - /system/net/interface/[interface]/drop/tx
          # - /system/net/interface/[interface]/errors/rx
          # - /system/net/interface/[interface]/errors/tx
          # - /system/net/interface/[interface]/packets/rx
          # - /system/net/interface/[interface]/packets/tx
          # - /system/io/[mount_point]/io_merged/read
          # - /system/io/[mount_point]/io_merged/write

        publish:
          - plugin_name: publisher-appoptics
  2. Restart the agent:

    On Windows command line:

    > net stop swisnapd
    > net start swisnapd

    On Linux command line:

    $ sudo service swisnapd restart

Testing Integration

To check whether metrics can be collected with a given configuration, and which ones, run the system plugin in debug mode:

On Windows command line:

> "C:\Program Files\SolarWinds\Snap\bin\snap-plugin-collector-aosystem.exe" --debug-mode --plugin-config "{}"

On Linux command line:

$ /opt/SolarWinds/Snap/bin/snap-plugin-collector-aosystem --debug-mode --plugin-config "{}"

Metrics and Tags

The tables below list the default set, and optional extended set, of system metrics collected by the SolarWinds Snap Agent.

CPU Metrics

Metric Description
system.cpu.guest Time spent in guest mode by all CPUs (only for Linux)
system.cpu.idle Time spent in the idle task by all CPUs. This value should be USER_HZ times the second entry in the /proc/uptime pseudo-file
system.cpu.interrupt Time spent servicing interrupts by all CPUs
system.cpu.iowait Time spent waiting for I/O to complete by all CPUs (only for Linux)
system.cpu.steal Stolen time, which is the time spent in other operating systems when running in a virtualized environment, by all CPUs (only for Linux)
system.cpu.system Time spent in system mode by all CPUs
system.cpu.user Time spent in user mode by all CPUs
system.cpu.utilization Total CPU utilization
system.per_cpu.guest Time spent in guest mode
system.per_cpu.idle Time spent in the idle task. This value should be USER_HZ times the second entry in the /proc/uptime pseudo-file
system.per_cpu.interrupt Time spent servicing interrupts
system.per_cpu.iowait Time spent waiting for I/O to complete
system.per_cpu.steal Stolen time, which is the time spent in other operating systems when running in a virtualized environment
system.per_cpu.system Time spent in system mode
system.per_cpu.user Time spent in user mode
system.per_cpu.utilization Total CPU utilization

CPU Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias
cpu Number of the core for per_cpu metrics

Disk Metrics

Metric Description
system.disk.bytes.free Free space available to the user in the mount point
system.disk.bytes.total Total space available to root in the mount point
system.disk.bytes.used Used user space in the mount point
system.disk.percent.free Free space as a percentage of the total amount of space the user can use in the mount point
system.disk.percent.used Used space as a percentage of the total amount of space the user can use in the mount point
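
As a rough worked example of the percentage metrics above, the sketch below assumes that "the total amount of space the user can use" means used bytes plus free bytes available to the user, which is how psutil reports disk usage. Whether the plugin computes its values exactly this way is an assumption, and the helper name user_percentages and the sample sizes are hypothetical.

```python
def user_percentages(used_bytes, free_bytes):
    """Return (percent_used, percent_free) relative to user-usable space.

    Hypothetical helper: the denominator is used + free, so space reserved
    for root is not counted (an assumption about how the plugin behaves).
    """
    usable = used_bytes + free_bytes
    if usable == 0:
        return 0.0, 0.0
    percent_used = 100.0 * used_bytes / usable
    return percent_used, 100.0 - percent_used


GIB = 1024 ** 3

# Sample values for illustration only: 30 GiB used, 10 GiB free for the user.
percent_used, percent_free = user_percentages(30 * GIB, 10 * GIB)
print(percent_used, percent_free)  # 75.0 25.0
```

Under this assumption the two percentages always add up to 100, while bytes.total can be larger than used plus free because it also counts space reserved for root.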

If you want to control which disks metrics are gathered for, use the mount_points and exclude_disks settings in task-aosystem.yaml.
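
To get a feel for the basic globbing patterns that exclude_disks accepts, the sketch below applies the example patterns from the task file to a made-up list of device names using Python's fnmatch module. It only illustrates glob matching in general. The plugin's own matching rules may differ in detail, and the helper is_excluded and the candidate names are hypothetical.

```python
from fnmatch import fnmatchcase


def is_excluded(name, exclude_patterns):
    """Return True if name matches any glob pattern (hypothetical helper)."""
    return any(fnmatchcase(name, pattern) for pattern in exclude_patterns)


# Patterns taken from the commented-out exclude_disks example in the task file.
exclude_disks = ["/dev/loop*", "/dev/sdb1", "D:"]

# Made-up names, for illustration only.
candidates = ["/dev/loop0", "/dev/loop12", "/dev/sda1", "/dev/sdb1", "C:", "D:"]

kept = [name for name in candidates if not is_excluded(name, exclude_disks)]
print(kept)  # ['/dev/sda1', 'C:']
```

Here /dev/loop* removes every loop device, while /dev/sdb1 and D: match only those exact names.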

Disk Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias
device Device name
mount_point Mount point

IO Metrics

Metric Description
system.io.bytes.read Bytes in read operations on given device
system.io.bytes.write Bytes in write operations on given device
system.io.io_time Time spent on IO (ms/s)
system.io.io_weighted_time Time spent on IO times the IO queue
system.io.ops.read Number of read operations on given device
system.io.ops.write Number of write operations on given device
system.io.time.read Cumulative duration of read operations on given device
system.io.time.write Cumulative duration of write operations on given device

IO Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias
device Device name

Load Metrics

Metric Description
system.load.load1 Load average over the last 1 minute
system.load.load15 Load average over the last 15 minutes
system.load.load5 Load average over the last 5 minutes
system.load.load1_rel Load average over the last 1 minute, normalized to number of cores
system.load.load15_rel Load average over the last 15 minutes, normalized to number of cores
system.load.load5_rel Load average over the last 5 minutes, normalized to number of cores
system.load.procs_blocked The number of processes currently blocked, waiting for I/O to complete
system.load.procs_running The number of processes currently running on CPUs
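
The "_rel" metrics above are described as load averages normalized to the number of cores. A minimal sketch of that arithmetic follows, assuming normalization is a plain division by the logical CPU count. The helper name relative_load is hypothetical and the plugin's exact calculation is not confirmed here.

```python
import os


def relative_load(load_average, cpu_count):
    """Divide a load average by the number of cores (hypothetical helper)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


# Worked example: a 1-minute load of 6.0 on a 4-core host.
example = relative_load(6.0, 4)
print(example)  # 1.5

# Optional live reading; os.getloadavg() is not available on Windows.
if hasattr(os, "getloadavg"):
    try:
        cores = os.cpu_count() or 1
        for window, load in zip((1, 5, 15), os.getloadavg()):
            print(f"{window}-minute relative load: {relative_load(load, cores):.2f}")
    except OSError:
        print("load average is not obtainable on this host")
```

A relative value above 1.0 suggests that, on average, more processes were runnable than there were cores, which makes the normalized figures easier to compare across hosts of different sizes.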

Load Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias

Memory Metrics

Metric Description
system.mem.buffered Cache for things like file system metadata (bytes)
system.mem.cached Cache for various things (bytes)
system.mem.free Memory not being used at all (zeroed) that is readily available (bytes); note that this doesn't reflect the actual memory available (use system.mem.available instead).
system.mem.inactive Memory that is marked as not used (bytes)
system.mem.total Total physical memory available (bytes)
system.mem.used Memory used, calculated differently depending on the platform and designed for informational purposes only (bytes)
system.mem.wired Memory that is marked to always stay in RAM (bytes). It is never moved to disk
system.mem.paged Memory that is used for objects that can be written to disk when they are not being used (only for Windows)
system.mem.nonpaged Memory that is used for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated (only for Windows)
system.mem.percent.free Percentage of memory that is available
system.mem.percent.used Percentage of memory that is not available

Memory Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias

Network Metrics

Metric Description
system.net.all.bytes.rx Number of bytes received
system.net.all.bytes.tx Number of bytes sent
system.net.all.packets.rx Number of packets received
system.net.all.packets.tx Number of packets sent
system.net.all.drop.rx Number of packets dropped instead of being received
system.net.all.drop.tx Number of packets dropped instead of being sent
system.net.all.errors.rx Number of packets errored while receiving
system.net.all.errors.tx Number of packets errored while sending
system.net.bytes.rx Number of bytes received on given interface
system.net.bytes.tx Number of bytes sent on given interface
system.net.packets.rx Number of packets received on given interface
system.net.packets.tx Number of packets sent on given interface
system.net.drop.rx Number of packets dropped instead of being received on given interface
system.net.drop.tx Number of packets dropped instead of being sent on given interface
system.net.errors.rx Number of packets errored while receiving on given interface
system.net.errors.tx Number of packets errored while sending on given interface

The interface metrics below allow filtering by interface name. If you would like to use them, please disable the per-interface metrics.

Metric Description
system.net.interface.bytes.rx Number of bytes received on given interface
system.net.interface.bytes.tx Number of bytes sent on given interface
system.net.interface.packets.rx Number of packets received on given interface
system.net.interface.packets.tx Number of packets sent on given interface
system.net.interface.drop.rx Number of packets dropped instead of being received on given interface
system.net.interface.drop.tx Number of packets dropped instead of being sent on given interface
system.net.interface.errors.rx Number of packets errored while receiving on given interface
system.net.interface.errors.tx Number of packets errored while sending on given interface

Network Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias
interface Interface [1]
hardware_addr Hardware address [1]
mtu Maximum transmission unit [1]

[1] Only on system.net.bytes.* and system.net.packets.* metrics.

Swap Metrics

Metric Description
system.swap.total Total amount of swap available (bytes)
system.swap.percent.free Percentage of swap available
system.swap.percent.used Percentage of swap used
system.swap.ins Number of kilobytes the system has swapped in from disk per second (only for Linux)
system.swap.outs Number of kilobytes the system has swapped out to disk per second (only for Linux)
system.swap.page.fault On Linux, the number of page faults reported in the virtual memory statistics. On Windows, the average number of pages faulted per second
system.swap.page.ins On Linux, total number of kilobytes the system paged in from disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes). On Windows, the rate at which pages are read from disk to resolve hard page faults
system.swap.page.outs On Linux, total number of kilobytes the system paged out to disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes). On Windows, the rate at which pages are written to disk to free up space in physical memory

Swap Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias

Optional Metrics

Optional metrics can be activated by editing the task YAML file. For more information, please read the SolarWinds Snap Agent Task File article.

Metric Description
system.cpu.guest_nice Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel)
system.cpu.nice Time spent in user mode with low priority (nice)
system.cpu.softirq Time spent servicing softirqs
system.cpu.stolen CPU cycles that are reclaimed by a virtual machine's hypervisor because it reached maximum processing capacity performing other tasks.
system.mem.active Memory currently in use or very recently used, and so it is in RAM
system.mem.available On Linux and Windows, the actual amount of available memory, in bytes, that can be given instantly to processes that request more memory. It is calculated by summing different memory values depending on the platform (e.g. free + buffers + cached on Linux) and is intended for monitoring actual memory usage in a cross-platform fashion

Optional Metric Tags

Tag Name Description
hostname Name of the host. Instead of using this tag we recommend using the @host alias
cpu (only on system.cpu.* metrics) CPU core number or total

Troubleshooting

Timeout For Querying System Statistics

If you encounter timeouts while collecting system metrics (such as disk metrics), you can adjust the system_query_timeout setting in task-aosystem.yaml. By default it is set to 20s.
