Documentation for SolarWinds Observability SaaS

Telegraf metrics

Many of the collected metrics from the IIS integration are displayed as widgets in SolarWinds Observability custom and pre-built dashboards; additional metrics may be collected and available in the Metrics Explorer.

Fluentd

For a comprehensive list of metrics, see Fluentd Input Plugin at GitHub.

Metric | Units | Description
sw.metrics.healthscore | Percent (%) | Health score. Health score is calculated based on entity anomalies, status, and alerts. Higher severity alerts may have a greater impact on health score. The score may fall into one of four categories: Good (70-100), Moderate (40-69), Bad (0-39), or NULL (displayed as '-'). See Entity health scores for more information.
fluentd_buffer_available_buffer_space_ratios | Percent (%) | Available Buffer Space. The percentage of remaining available buffer space.
fluentd_buffer_queue_byte_size | Bytes (B) | Buffer Queue Bytes. The current size of queued buffer chunks (in bytes).
fluentd_buffer_queue_length | | Buffer Queue Length. The length of the buffer queue.
fluentd_buffer_stage_byte_size | Bytes (B) | Buffer Stage Bytes. The current size of staged buffer chunks (in bytes).
fluentd_buffer_stage_length | | Buffer Stage Length. The length of staged buffer chunks.
fluentd_buffer_total_queued_size | Bytes (B) | Buffer Queue Size. The size of the buffer queue.
fluentd_emit_count | {emits} | Total Record Emit Count. The total number of emit calls.
fluentd_emit_records | {records} | Total Emit Records. The total number of emitted records.
fluentd_emit_size | Bytes (B) | Total Emit Size. The total size of emit events.
fluentd_retry_count | {retries} | Retry Count. The number of retry attempts.
fluentd_rollback_count | {count} | Total Rollback Count. The total number of rollbacks. Rollbacks happen when write/try_write fails.
fluentd_slow_flush_count | {count} | Total Slow Flush Count. The total number of slow flushes. This count is incremented when a buffer flush takes longer than slow_flush_log_threshold.
fluentd_write_count | {count} | The total number of writes.
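
The health score categories described for sw.metrics.healthscore can be expressed as a small lookup. The sketch below is illustrative only: the function name health_category is hypothetical and is not part of any SolarWinds API. It encodes the ranges stated in the table and assumes that a missing score corresponds to NULL and that fractional scores fall into the category of the range they sit in or just above.

```python
from typing import Optional


def health_category(score: Optional[float]) -> str:
    """Map a sw.metrics.healthscore value to the category named in the table.

    Hypothetical helper for illustration; not a SolarWinds API.
    """
    if score is None:
        return "-"  # NULL scores are displayed as '-'
    if not 0 <= score <= 100:
        raise ValueError(f"health score must be between 0 and 100, got {score}")
    if score >= 70:
        return "Good"      # 70-100
    if score >= 40:
        return "Moderate"  # 40-69
    return "Bad"           # 0-39


print(health_category(85))    # Good
print(health_category(None))  # -
```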

HAProxy

For a comprehensive list of metrics, see HAProxy Input Plugin at GitHub and HAProxy documentation at docs.haproxy.org.

SolarWinds Observability SaaS expects that metrics return a number. Some HAProxy metrics, such as status, return strings, and thus are not supported.
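
As a rough illustration of the numeric-only expectation, the sketch below filters one row of HAProxy statistics down to the fields that carry numbers. It does not describe how SolarWinds Observability SaaS or Telegraf implements this; the function name numeric_fields and the sample values are made up for the example.

```python
from numbers import Real


def numeric_fields(fields: dict) -> dict:
    """Keep only fields whose values are real numbers (booleans excluded)."""
    return {
        name: value
        for name, value in fields.items()
        if isinstance(value, Real) and not isinstance(value, bool)
    }


# Made-up sample resembling one HAProxy stats row.
sample = {"scur": 12, "slim": 2000, "rtime": 3, "status": "UP"}

# The string-valued "status" field is dropped; the numeric fields remain.
print(numeric_fields(sample))
```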

Metric | Units | Description
sw.metrics.healthscore | Percent (%) | Health score. Health score is calculated based on entity anomalies, status, and alerts. Higher severity alerts may have a greater impact on health score. The score may fall into one of four categories: Good (70-100), Moderate (40-69), Bad (0-39), or NULL (displayed as '-'). See Entity health scores for more information.
haproxy_active_servers | {servers} | Active Servers. The number of currently active servers.
haproxy_backup_servers | {servers} | Backup Servers. The number of available backup servers.
haproxy_bin | bytes | Total In and Out Traffic. The cumulative total of incoming traffic.
haproxy_bout | bytes | Total In and Out Traffic. The cumulative total of outgoing traffic.
haproxy_dreq | {requests} | Total Denied Requests. The cumulative number of requests denied because of security concerns.
haproxy_dcon | {requests} | Total Denied Requests. The cumulative number of requests denied by the 'tcp-request connection' rules.
haproxy_dses | {requests} | Total Denied Requests. The cumulative number of requests denied by the 'tcp-request session' rules.
haproxy_dresp | {responses} | Total Denied Responses. The cumulative number of responses denied because of security concerns. For HTTP, the responses are denied because of a matched http-request rule, or 'option checkcache'.
haproxy_eresp | {responses} | Total Denied Responses. The cumulative number of response errors, such as srv_abrt, write errors on the client socket, or failure applying filters to the response.
haproxy_ereq | {errors} | Total Request Errors. The cumulative number of request errors, such as early termination from the client, read error, client timeout, or client closed connection.
haproxy_econ | {errors} | Total Request Errors. The cumulative number of request errors encountered when trying to connect to a backend server. The backend stat is the sum of the stat for all servers of that backend, plus any connection errors not associated with a particular server (such as the backend having no active servers).
haproxy_scur | {sessions} | Current Sessions. The number of current sessions per proxy.
haproxy_slim | {sessions} | Session Limit. The currently configured session limit.
haproxy_stot | {sessions} | Total Sessions. The cumulative number of sessions.
haproxy_req_rate | requests per second | Request Rate. HTTP requests per second over the last elapsed second.
haproxy_rtime | Milliseconds (ms) | Response Time. The average response time over the last 1024 requests (0 for TCP).
haproxy_req_tot | {requests} | Total Requests. The total number of received HTTP requests.
haproxy_ctime | Milliseconds (ms) | Connection Time. The average connect time over the last 1024 responses.
haproxy_qtime | Milliseconds (ms) | Queue Time. The average queue time over the last 1024 responses.
haproxy_ttime | Milliseconds (ms) | Session Time. The average session time over the last 1024 responses.
haproxy_http_response.2xx | {responses} | Total Responses 2xx. The total number of HTTP responses with the 2xx code.
haproxy_http_response.3xx | {responses} | Total Responses 3xx. The total number of HTTP responses with the 3xx code.
haproxy_http_response.4xx | {responses} | Total Responses 4xx. The total number of HTTP responses with the 4xx code.
haproxy_http_response.5xx | {responses} | Total Responses 5xx. The total number of HTTP responses with the 5xx code.
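
Several of the HAProxy metrics above, such as haproxy_stot and haproxy_req_tot, are cumulative totals, so a per-second rate has to be derived from two successive samples. The following is a minimal sketch of that arithmetic, not a description of how dashboards compute rates. The function name counter_rate is hypothetical, and treating a decreasing value as a counter reset (for example after an HAProxy restart) is an assumption of this example.

```python
from typing import Optional


def counter_rate(prev_value: float, curr_value: float,
                 interval_seconds: float) -> Optional[float]:
    """Return the per-second rate between two samples of a cumulative counter.

    Returns None when the counter went backwards, which this sketch
    interprets as a reset (an assumption, not documented behavior).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        return None
    return delta / interval_seconds


# Two made-up haproxy_req_tot samples taken 60 seconds apart.
print(counter_rate(1000, 1600, 60))  # 10.0 requests per second
```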

NGINX Plus API

For a more comprehensive list of metrics, see Nginx Virtual Host Traffic (VTS) Input Plugin and Nginx Plus API Input Plugin at GitHub.

Metric | Units | Description
sw.metrics.healthscore | Percent (%) | Health score. Health score is calculated based on entity anomalies, status, and alerts. Higher severity alerts may have a greater impact on health score. The score may fall into one of four categories: Good (70-100), Moderate (40-69), Bad (0-39), or NULL (displayed as '-'). See Entity health scores for more information.
nginx_vts_connections | {connections} | The number of connections of individual types: active, reading, writing, waiting, accepted, handled, requests.
nginx_vts_server, nginx_vts_filter | |
nginx_vts_upstream | |
nginx_vts_cache | |

NTPQ

For a comprehensive list of metrics, see NTPQ Input Plugin at GitHub.

Metric | Units | Description
sw.metrics.healthscore | Percent (%) | Health score. Health score is calculated based on entity anomalies, status, and alerts. Higher severity alerts may have a greater impact on health score. The score may fall into one of four categories: Good (70-100), Moderate (40-69), Bad (0-39), or NULL (displayed as '-'). See Entity health scores for more information.
ntpq_delay | Milliseconds (ms) | Round Trip Delay. Round trip communication delay to the remote peer or server.
ntpq_jitter | Milliseconds (ms) | Jitter. Mean deviation (jitter) in the time reported for the remote peer or server (RMS or difference of multiple time samples).
ntpq_offset | Milliseconds (ms) | Time Offsets. Mean offset (phase) in the times reported between this local host and the remote peer or server (RMS).
ntpq_poll | Minutes (min) | Polling Frequency. RFC 5905 suggests that this ranges in NTPv4 from 4 (16 s) to 17 (36 h) (log2 seconds); however, observation suggests the actual displayed value is in seconds, for a much smaller range of 64 (2^6) to 1024 (2^10) seconds.
ntpq_reach | Octal numbers | Reach. An 8-bit left-shift register value recording polls (bit set = successful, bit reset = fail), displayed in octal by default. The type can be changed to decimal/count/ratio by configuring it in the ntpq input section inside telegraf.conf.
ntpq_when | Minutes (min) | Last Poll. The time since the last poll.
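
The ntpq_reach and ntpq_poll descriptions involve some bit and exponent arithmetic that is easier to follow in code. The sketch below decodes an octal reach value into its 8 poll bits and converts a log2 poll exponent into seconds. The function names decode_reach and poll_interval_seconds are hypothetical, and defining the ratio as successful polls divided by 8 is an assumption of this example rather than a statement about the Telegraf plugin.

```python
def decode_reach(reach_octal: str) -> dict:
    """Decode an octal reach value into its 8 poll bits.

    A set bit marks a successful poll; a cleared bit marks a failed one.
    """
    value = int(reach_octal, 8)
    if not 0 <= value <= 0xFF:
        raise ValueError("reach must fit in an 8-bit register")
    bits = format(value, "08b")
    successes = bits.count("1")
    return {
        "decimal": value,
        "bits": bits,
        "count": successes,
        "ratio": successes / 8,  # assumption: successes out of 8 polls
    }


def poll_interval_seconds(exponent: int) -> int:
    """Convert a log2-seconds poll exponent to seconds (6 -> 64, 10 -> 1024)."""
    return 2 ** exponent


print(decode_reach("377"))        # all 8 recent polls succeeded
print(poll_interval_seconds(6))   # 64
print(poll_interval_seconds(10))  # 1024
```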

StatsD

The StatsD integration does not include any default metrics. It supports all native StatsD metric types for custom metric submission. See StatsD Input Plugin at GitHub.
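
For orientation, custom metrics are submitted as plain-text StatsD lines of the form name:value|type. The sketch below formats such lines for counters, gauges, timers, and sets, and includes a helper that sends one line over UDP. The metric names, the helper names, and the 127.0.0.1:8125 address are assumptions for illustration; use the address your StatsD listener is actually configured with.

```python
import socket

# StatsD type suffixes for the common metric types.
STATSD_TYPES = {"counter": "c", "gauge": "g", "timer": "ms", "set": "s"}


def format_statsd(name: str, value, metric_type: str) -> str:
    """Build one StatsD line, for example 'checkout.requests:1|c'."""
    return f"{name}:{value}|{STATSD_TYPES[metric_type]}"


def send_statsd(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one StatsD line over UDP (host and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names, formatted but not sent.
print(format_statsd("checkout.requests", 1, "counter"))  # checkout.requests:1|c
print(format_statsd("queue.depth", 42, "gauge"))         # queue.depth:42|g
print(format_statsd("api.latency", 12.5, "timer"))       # api.latency:12.5|ms
```

Calling send_statsd(format_statsd("checkout.requests", 1, "counter")) would emit a single counter increment to the assumed listener address.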

Varnish

For a comprehensive list of metrics, see Varnish Input Plugin at GitHub.

Metric | Units | Description
sw.metrics.healthscore | Percent (%) | Health score. Health score is calculated based on entity anomalies, status, and alerts. Higher severity alerts may have a greater impact on health score. The score may fall into one of four categories: Good (70-100), Moderate (40-69), Bad (0-39), or NULL (displayed as '-'). See Entity health scores for more information.
varnish_client_req | {requests} | Total Client Requests. The number of good client requests.
varnish_s_req_bodybytes | bytes | Total Bytes. Total bytes for requests and responses.
varnish_s_req_hdrbytes | bytes | Total Bytes. Total bytes for requests and responses.
varnish_s_resp_bodybytes | bytes | Total Bytes. Total bytes for requests and responses.
varnish_s_resp_hdrbytes | bytes | Total Bytes. Total bytes for requests and responses.
varnish_sess_dropped | {sessions} | Total Failed and Dropped Sessions. The number of sessions dropped for thread. The number of times an HTTP/1 session was dropped because the queue was too long already. See thread_queue_limit.
varnish_sess_fail | {sessions} | Total Failed and Dropped Sessions. The number of session accept failures. The number of failures to accept a TCP connection. This counter is the sum of the sess_fail_* counters, which give more detailed information.
varnish_sess_closed | {operations} | Total Session Operations. The number of closed sessions.
varnish_sess_herd | {operations} | Total Session Operations. The number of times the timeout_linger triggered.
varnish_sess_readahead | {operations} | Total Session Operations. The number of read ahead sessions.
varnish_sess_closed_err | {operations} | Total Session Operations. The number of sessions closed with errors.
varnish_s_sess | {sessions} | Total Sessions. The total number of sessions that occurred.
varnish_n_expired | {objects} | Total Number of Objects. The number of objects expired because of old age.
varnish_n_lru_moved | {objects} | Total Number of Objects. The number of moved LRU objects (move operations done on the LRU list).
varnish_n_lru_nuked | {objects} | Total Number of Objects. The number of objects that have been forcefully evicted from the storage to make room for a new object (LRU nuked objects).
varnish_cache_miss | {count} | Total Cache Hits and Misses. The number of cache misses. A cache miss indicates that the object was fetched from the backend before delivering it to the client.
varnish_cache_hit | {count} | Total Cache Hits and Misses. The number of cache hits. A cache hit indicates that the object was delivered to a client without fetching it from a backend server.
varnish_backend_busy | {connections} | Total Backend Connections. The number of times Varnish encountered a situation where it considered the backend to be too busy to handle additional connections.
varnish_backend_conn | {connections} | Total Backend Connections. The number of successful backend connections.
varnish_backend_fail | {connections} | Total Backend Connections. The number of failed backend connections.
varnish_backend_recycle | {connections} | Total Backend Connections. The number of recycled backend connections.
varnish_backend_retry | {connections} | Total Backend Connections. The number of retried backend connections.
varnish_backend_reuse | {connections} | Total Backend Connections. The number of reused backend connections.
varnish_backend_unhealthy | {connections} | Total Backend Connections. The number of unhealthy backend connections.
varnish_fetch_length, varnish_fetch_bad, varnish_fetch_eof, varnish_fetch_failed, varnish_fetch_head, varnish_fetch_chunked, varnish_fetch_1xx, varnish_fetch_204, varnish_fetch_304, varnish_fetch_none, varnish_fetch_no_thread | {fetches} | Total HTTP Request Fetches. The number of all request fetches by type.
varnish_shm_cont | {operations} | Total Shared Memory Operations. The number of contention operations (when multiple threads compete for access to SHM resources).
varnish_shm_cycles | {operations} | Total Shared Memory Operations. The number of times data cycles through the shared memory.
varnish_shm_flushes | {operations} | Total Shared Memory Operations. The number of flush operations.
varnish_shm_records | {operations} | Total Shared Memory Operations. The number of record operations.
varnish_shm_writes | {operations} | Total Shared Memory Operations. The number of write operations.
varnish_thread_queue_len | {count} | Total Session Queue Length. The length of the session queue waiting for threads.
varnish_threads | {workers} | Total Workers. The number of threads in all pools.
varnish_sess_queued | {sessions} | Total Queued Sessions. Sessions queued for thread. The number of times a session was queued waiting for a thread.
varnish_threads_created | {threads} | Total Worker Threads. The total number of threads created in all pools.
varnish_threads_destroyed | {threads} | Total Worker Threads. The total number of threads destroyed in all pools.
varnish_threads_failed | {threads} | Total Worker Threads. The number of times creating a thread failed.
varnish_threads_limited | {threads} | Total Worker Threads. The number of times more threads were needed but the limit was reached in a thread pool.
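
The varnish_cache_hit and varnish_cache_miss counters are often combined into a hit ratio. The sketch below shows that arithmetic only; it is not a metric provided by the integration. The function name cache_hit_ratio is hypothetical, the sample numbers are made up, and counting only hits and misses as lookups is a simplifying assumption.

```python
from typing import Optional


def cache_hit_ratio(hits: int, misses: int) -> Optional[float]:
    """Return the cache hit ratio as a percentage, or None with no lookups."""
    total = hits + misses
    if total == 0:
        return None  # no lookups yet, so the ratio is undefined
    return 100.0 * hits / total


# Made-up values for varnish_cache_hit and varnish_cache_miss.
print(cache_hit_ratio(900, 100))  # 90.0
```

Because both counters are totals, a ratio for a recent time window would use the difference between two samples of each counter rather than the raw values.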