Metrics
The tables below list the default set and the optional extended set of system metrics collected by the SolarWinds Snap Agent.
Default Metrics
CPU Metrics
Metric |
Description |
system.cpu.guest |
Time spent in guest mode |
system.cpu.idle |
Time spent in the idle task. This value should be USER_HZ times the second entry in the /proc/uptime pseudo-file |
system.cpu.interrupt |
Time servicing interrupts |
system.cpu.iowait |
Time waiting for I/O to complete |
system.cpu.steal |
Stolen time, which is the time spent in other operating systems when running in a virtualized environment |
system.cpu.system |
Time spent in system mode |
system.cpu.user |
Time spent in user mode |
system.cpu.utilization |
Total CPU utilization |
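The relationship stated for system.cpu.idle can be checked by hand. The sketch below is only an illustration, not the agent's implementation: the helper name idle_ticks_from_uptime is hypothetical, and it assumes a Linux /proc/uptime line of the form "<uptime seconds> <idle seconds>".

```python
def idle_ticks_from_uptime(uptime_text: str, user_hz: int) -> float:
    """Convert the second field of /proc/uptime (idle seconds) to clock ticks."""
    idle_seconds = float(uptime_text.split()[1])
    return idle_seconds * user_hz


# Sample /proc/uptime content; on a real Linux host, read the file instead
# and obtain USER_HZ with os.sysconf("SC_CLK_TCK").
sample = "100.0 50.5\n"
print(idle_ticks_from_uptime(sample, 100))  # 5050.0
```

On multi-core hosts the idle entry in /proc/uptime is summed over all cores, so it can exceed the uptime itself.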
CPU Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
cpu |
Number of the core or total |
Disk Metrics
Metric |
Description |
system.disk.bytes.free |
Free space available to the user on the mount point |
system.disk.bytes.total |
Total space available to root on the mount point |
system.disk.bytes.used |
User space already in use on the mount point |
system.disk.percent.free |
Percentage of free space relative to the total amount of space the user can use on the mount point |
system.disk.percent.used |
Percentage of used space relative to the total amount of space the user can use on the mount point |
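The percentages above are relative to the space a user can use, not to the total visible to root. The sketch below shows one way such values could be derived. It is an assumption, not the agent's code: the function names are hypothetical and the statvfs field mapping follows common Unix practice.

```python
import os
from typing import Dict, Tuple


def disk_percentages(used_bytes: int, free_user_bytes: int) -> Tuple[float, float]:
    """Return (percent_used, percent_free) relative to user-usable space."""
    usable = used_bytes + free_user_bytes
    if usable == 0:
        return 0.0, 0.0
    percent_used = 100.0 * used_bytes / usable
    return percent_used, 100.0 - percent_used


def usage_for(path: str) -> Dict[str, float]:
    """Collect byte counts and percentages for one mount point (Unix only)."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize                # includes blocks reserved for root
    free_user = st.f_bavail * st.f_frsize            # free blocks for unprivileged users
    used = (st.f_blocks - st.f_bfree) * st.f_frsize  # blocks currently in use
    percent_used, percent_free = disk_percentages(used, free_user)
    return {"total": total, "free": free_user, "used": used,
            "percent_used": percent_used, "percent_free": percent_free}


print(disk_percentages(75, 25))  # (75.0, 25.0)
```

Because root-reserved blocks are excluded from the denominator, used and free bytes need not add up to the total.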
To control which disks metrics are gathered for, use the mount_points and exclude_disks settings. Add an aosystem collector section with your custom settings to the main agent configuration file, config.yaml:
...
collector:
  aosystem:
    all:
      mount_points: "*"
      exclude_disks: "/dev/loop*|/dev/sdb1"
...
Alternatively, place an additional config section in task-aosystem.yaml:
...
workflow:
  collect:
    config:
      /system:
        mount_points: "*"
        exclude_disks: "/dev/loop*|/dev/sdb1"
    metrics:
      /system/cpu/guest: {}
      /system/cpu/idle: {}
      /system/cpu/interrup: {}
...
- The mount_points setting filters the mount points that will be monitored. By default, only physical devices (e.g. hard disks, USB drives) are monitored. To monitor all devices, use "*". You can also define a list of devices to monitor by entering multiple paths separated with "|", e.g. "/|/dev|/run".
- The exclude_disks setting defines which mount points should not be monitored. Multiple exclude patterns can be separated with "|". Basic globbing patterns are also supported.
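To make the pattern syntax concrete, the sketch below applies a "|"-separated list of glob patterns to a few device names. It is illustrative only: matches_any is a hypothetical helper, and Python's fnmatch stands in for whatever matching rules the agent actually applies, which may differ in detail.

```python
from fnmatch import fnmatchcase


def matches_any(name: str, patterns: str) -> bool:
    """True if name matches at least one '|'-separated glob pattern."""
    return any(fnmatchcase(name, p) for p in patterns.split("|") if p)


devices = ["/dev/sda1", "/dev/sdb1", "/dev/loop0", "/dev/loop12"]

# Same value as the exclude_disks example above.
monitored = [d for d in devices if not matches_any(d, "/dev/loop*|/dev/sdb1")]
print(monitored)  # ['/dev/sda1']
```

The same helper can model a mount_points value: matches_any("/run", "/|/dev|/run") is true, and the pattern "*" matches every path.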
Disk Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
device |
Device name |
mount_point |
Mount point |
IO Metrics
Metric |
Description |
system.io.bytes.read |
Bytes in read operations on given device |
system.io.bytes.write |
Bytes in write operations on given device |
system.io.io_time |
Time spent on I/O (ms/s) |
system.io.io_weighted_time |
Time spent on I/O multiplied by the I/O queue size |
system.io.ops.read |
Number of read operations on given device |
system.io.ops.write |
Number of write operations on given device |
system.io.time.read |
Cumulative duration of read operations on given device |
system.io.time.write |
Cumulative duration of write operations on given device |
IO Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
device |
Device name |
Load Metrics
Metric |
Description |
system.load.load1 |
Load average over the last 1 minute |
system.load.load15 |
Load average over the last 15 minutes |
system.load.load5 |
Load average over the last 5 minutes |
system.load.load1_rel |
Load average over the last 1 minute, normalized to number of cores |
system.load.load15_rel |
Load average over the last 15 minutes, normalized to number of cores |
system.load.load5_rel |
Load average over the last 5 minutes, normalized to number of cores |
system.load.procs_blocked |
The number of processes currently blocked, waiting for I/O to complete |
system.load.procs_running |
The number of processes currently running on CPUs |
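The *_rel metrics are described as load averages normalized to the number of cores. The sketch below assumes this means a plain division by the core count. The helper name relative_load is hypothetical and the agent's exact computation is not documented here.

```python
def relative_load(load: float, cores: int) -> float:
    """Load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores


# On a Unix host the inputs could come from os.getloadavg() and os.cpu_count().
print(relative_load(2.0, 4))  # 0.5
```

Under this reading, a value near 1.0 means that on average one runnable process per core, which makes hosts with different core counts comparable.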
Load Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
Memory Metrics
Metric |
Description |
system.mem.buffered |
Cache for things like file system metadata (bytes) |
system.mem.cached |
Cache for various things (bytes) |
system.mem.free |
Memory not being used at all (zeroed) that is readily available (bytes); note that this doesn't reflect the actual memory available (use system.mem.available instead). |
system.mem.inactive |
Memory that is marked as not used (bytes) |
system.mem.total |
Total physical memory available (bytes) |
system.mem.used |
Memory used, calculated differently depending on the platform and designed for informational purposes only (bytes) |
system.mem.wired |
Memory that is marked to always stay in RAM (bytes). It is never moved to disk |
system.mem.percent.free |
Percentage of memory that is available |
system.mem.percent.used |
Percentage of memory that is not available |
Memory Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
Network Metrics
Metric |
Description |
system.net.all.bytes.rx |
Number of bytes received |
system.net.all.bytes.tx |
Number of bytes sent |
system.net.all.packets.rx |
Number of packets received |
system.net.all.packets.tx |
Number of packets sent |
system.net.bytes.rx |
Number of bytes received on given interface |
system.net.bytes.tx |
Number of bytes sent on given interface |
system.net.packets.rx |
Number of packets received on given interface |
system.net.packets.tx |
Number of packets sent on given interface |
system.net.drop.rx |
Number of packets dropped instead of being received on given interface |
system.net.drop.tx |
Number of packets dropped instead of being sent on given interface |
system.net.errors.rx |
Number of packets errored while receiving on given interface |
system.net.errors.tx |
Number of packets errored while sending on given interface |
Network Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
interface |
Interface [1] |
hardware_addr |
Hardware address [1] |
mtu |
Maximum transmission unit [1] |
[1] Only on system.net.bytes.* and system.net.packets.* metrics.
Swap Metrics
Metric |
Description |
system.swap.total |
Total amount of swap available (bytes) |
system.swap.percent.free |
Percentage of swap available |
system.swap.percent.used |
Percentage of swap used |
system.swap.ins |
Number of kilobytes the system has swapped in from disk per second. |
system.swap.outs |
Number of kilobytes the system has swapped out to disk per second. |
system.swap.page.fault |
Number of page faults reported in the virtual memory statistics. |
system.swap.page.ins |
Total number of kilobytes the system paged in from disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes). |
system.swap.page.outs |
Total number of kilobytes the system paged out to disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes). |
Swap Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
Optional Metrics
Optional metrics can be activated by editing the task YAML file. For more information, see the SolarWinds Snap Agent configuration article.
Metric |
Description |
system.cpu.guest_nice |
Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel) |
system.cpu.nice |
Time spent in user mode with low priority (nice) |
system.cpu.softirq |
Time spent servicing softirqs |
system.cpu.stolen |
CPU cycles that are reclaimed by a virtual machine's hypervisor because it reached maximum processing capacity performing other tasks. |
system.mem.active |
Memory currently in use or very recently used, and so it is in RAM |
system.mem.available |
The actual amount of memory, in bytes, that can be given instantly to processes that request more memory. It is calculated by summing different memory values depending on the platform (e.g. free + buffers + cached on Linux) and is intended for monitoring actual memory usage in a cross-platform fashion |
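The Linux formula mentioned for system.mem.available (free + buffers + cached) can be sketched as follows. This is only an illustration under stated assumptions: available_from_meminfo is a hypothetical helper, the input is text in /proc/meminfo format, and the agent's own calculation may use additional fields.

```python
def available_from_meminfo(meminfo_text: str) -> int:
    """Estimate available memory in bytes as MemFree + Buffers + Cached."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        # Most entries carry a "kB" unit; convert those to bytes.
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[key.strip()] = value
    return fields["MemFree"] + fields["Buffers"] + fields["Cached"]


sample = """MemTotal:       16384000 kB
MemFree:            1000 kB
Buffers:             200 kB
Cached:              300 kB
"""
print(available_from_meminfo(sample))  # 1536000
```

On a live Linux host the text would come from reading /proc/meminfo.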
Optional Metric Tags
Tag Name |
Description |
hostname |
Name of the host. Instead of using this tag we recommend using the @host alias |
cpu |
(only on system.cpu.* metrics) CPU core number or total |
Timeout For Querying System Statistics
To tune the timeout used when collecting system statistics (e.g. disk metrics), use the system_query_timeout setting. The default is 20s. Add an aosystem collector section with your custom settings to the main agent configuration file, config.yaml:
...
collector:
  aosystem:
    all:
      system_query_timeout: "20s"
...
Alternatively, place an additional config section in task-aosystem.yaml:
...
workflow:
  collect:
    config:
      /system:
        system_query_timeout: "20s"
    metrics:
      /system/cpu/guest: {}
      /system/cpu/idle: {}
      /system/cpu/interrup: {}
...
When the APM Integrated Experience is enabled, AppOptics shares a common navigation and enhanced feature set with the other integrated experiences' products. How you navigate AppOptics and access its features may vary from these instructions. For more information, go to the APM Integrated Experience documentation.