SolarWinds Platform server 2020.2.6 - Additional Polling Engine
This SAM application monitor template assesses the status of Windows services running on Additional Polling Engines (APEs) in SolarWinds Platform 2020.2.6 or later. To learn more about APEs in general, see Scalability Engine Guidelines.
Prerequisites
WMI access to target servers.
Credentials
Windows Administrator on target servers
Component monitors
SolarWinds Administration Service
Returns the CPU and memory usage of the SolarWinds Administration Service that:
- Manages installed SolarWinds Platform products and upgrades.
- Supports the installation of Additional Polling Engines (APEs), Additional Web Servers, and High Availability (HA) backups.
- Controls the SolarWinds Platform Service Manager used to stop, start, and restart SolarWinds Platform services and websites, including dependencies.
SolarWinds Alerting Service V2
Returns the CPU and memory usage of the SolarWinds Alerting Service V2 that:
- Evaluates alert conditions,
- Triggers alerts, and
- Runs alert actions.
By default, this monitor is disabled.
SolarWinds Collector Service
Returns the CPU and memory usage of the SolarWinds Collector service that handles data synchronization between the polling engine and the SolarWinds Platform database.
SolarWinds Cortex
Returns the CPU and memory usage of the SolarWinds Cortex service that supports polling for PerfStack and other SolarWinds Platform features that collect data.
SolarWinds High Availability
Returns the CPU and memory usage of the SolarWinds High Availability (HA) service that monitors SolarWinds Platform health and mediates switchover of responsibilities between active and backup SolarWinds Platform instances inside an HA pool.
SolarWinds Information Service
Returns the CPU and memory usage of the SolarWinds Information service, used by websites to talk to the database. This service also controls how polling engines communicate with each other.
By default, this monitor is disabled.
SolarWinds Information Service V3
Returns the CPU and memory usage of the SolarWinds Information Service V3, used by websites to talk to the database. This service also controls how polling engines communicate with each other.
SolarWinds JMX Bridge
Returns the CPU and memory usage of the SolarWinds JMX Bridge service that supports the monitoring of Java application servers such as WebSphere, WebLogic, or Apache Tomcat via JMX.
JMX monitoring is disabled if FIPS mode is enabled on the SolarWinds Platform server.
By default, this monitor is disabled.
SolarWinds Job Engine v2
Returns the CPU and memory usage of the SolarWinds Job Engine v2 service that performs recurring work. This service creates various Job Engine Worker processes for scalability and robustness. The job engine writes information about each job to its database.
SolarWinds Log Analyzer for SolarWinds Platform Polling Service
Returns the CPU and memory usage of the SolarWinds Log Manager for SolarWinds Platform Polling Service, which is responsible for logging events in log files.
SolarWinds Log Analyzer for SolarWinds Platform Syslog Service
Returns the CPU and memory usage of the SolarWinds Log Manager for SolarWinds Platform Syslog Service, which is responsible for logging events in log files.
SolarWinds Platform Module Engine
Returns the CPU and memory usage of the SolarWinds Platform Module Engine service that supports communication with the SolarWinds Platform database.
SolarWinds Recommendations
Virtualization Manager (VMAN) recommendations focus on the optimization of resource allocation based on performance metrics and storage capacity. Recommendations calculate trends and risks based on enabled strategies, providing plans of action to consider and apply to resolve immediate issues or preemptively prevent issues from occurring.
Job Engine v2: Jobs Lost
Returns the number of lost jobs. This value should be zero at all times.
Job Engine v2: Jobs Queued
Returns the number of jobs waiting for execution due to insufficient resources. This value should be zero at all times.
Job Engine v2: Jobs Running
Returns the number of jobs currently running.
Job Engine v2: Worker Processes
Returns the number of worker processes used. A value of 10 or lower is acceptable. If the returned value is 100 or greater, there may be problems with jobs hanging.
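The guidance above can be sketched as a simple check. This is an illustrative Python sketch, not part of the SAM template: the function name and the status labels are hypothetical, and because the text gives no guidance for values between 11 and 99, the sketch reports that range as unspecified rather than guessing.

```python
def classify_worker_count(count: int) -> str:
    """Map a Job Engine worker process count to a hypothetical status label."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count <= 10:
        # 10 or lower is described as acceptable.
        return "acceptable"
    if count >= 100:
        # 100 or greater may indicate hanging jobs.
        return "possible hung jobs"
    # The documentation does not classify 11-99; this label is an assumption.
    return "unspecified"
```

For example, `classify_worker_count(8)` returns `"acceptable"` and `classify_worker_count(120)` returns `"possible hung jobs"`.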
Job Scheduler v2: Average Execution Delay
Returns the average delay, in milliseconds, between the time a job is scheduled to run and the time it actually runs. This value should be less than 100,000.
Job Scheduler v2: Results Notified Error
Returns the number of errors that occurred when sending the results back. This value should be zero at all times.
MSMQ Folder Size
Returns the MSMQ folder size. The returned value should be less than 800 MB. The maximum size is 1 GB. If the 1 GB limit is reached, polling stops working correctly.
Message Queuing (MSMQ) technology was deprecated in SolarWinds Platform 2020.2.6.
To increase the MSMQ size, open Computer Management > Features > Message Queuing. Right-click and change MSMQ Messaging 1 GB Limit to 1.5 GB. For additional information, see this SolarWinds Success Center article: Microsoft Message Queue Fills Directory with Orphaned Files.
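The size guidance above can be sketched as a threshold check. This Python sketch is illustrative only: the function name and status labels are hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes and that the limit is still at its default of 1 GB. Only the 800 MB and 1 GB values come from the text.

```python
MB = 1024 * 1024  # assumption: binary megabytes

RECOMMENDED_MAX_BYTES = 800 * MB  # folder should stay below this size
DEFAULT_LIMIT_BYTES = 1024 * MB   # default 1 GB MSMQ limit


def classify_msmq_folder_size(size_bytes: int) -> str:
    """Map an MSMQ folder size in bytes to a hypothetical status label."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes >= DEFAULT_LIMIT_BYTES:
        # At the limit, polling stops working correctly.
        return "critical"
    if size_bytes >= RECOMMENDED_MAX_BYTES:
        return "warning"
    return "ok"
```

If the limit has been raised to 1.5 GB as described above, `DEFAULT_LIMIT_BYTES` would need to be adjusted to match.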
MSMQ Messages in Queue
Returns the total number of Message Queuing messages that currently reside in the selected queue. When the Data Processor receives more results into MSMQ than it can process and pass to the Standard Poller, the queue keeps growing. The queue size should be near 0 most of the time. Some spikes may appear, but the Data Processor must be able to clear the queue quickly; otherwise it may not be able to handle database blackouts or maintenance. (Standard Poller performance is significantly affected by database performance.)
Before using this counter, set the correct instance, which begins with: <HOSTNAME>\private$\solarwinds\collector\processingqueue
where <HOSTNAME> is the hostname (without < >) of the target server. For example: APMhost
By default, the instance is set to: <HOSTNAME>\private$\solarwinds\collector\processingqueue\solarwinds.node.hardwarehealth.wmi
To find all available instances, run the PerfMon utility and search for the “Messages in Queue” counter in the “MSMQ Queue” category.
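As an illustration of the instance format described above, the following Python sketch builds the instance prefix for a given hostname. The function name is hypothetical and the sketch is not part of the SAM template; the queue path is copied from the text, and the exact instance names available on a server should still be confirmed in PerfMon.

```python
# Queue path taken from the documentation; a raw string keeps the backslashes literal.
QUEUE_PATH = r"private$\solarwinds\collector\processingqueue"


def msmq_instance_prefix(hostname: str) -> str:
    """Return '<hostname>\\<queue path>' for the given target server."""
    hostname = hostname.strip()
    if not hostname:
        raise ValueError("hostname must not be empty")
    if "<" in hostname or ">" in hostname:
        # The placeholder brackets must not be part of the real hostname.
        raise ValueError("hostname must be given without < >")
    return hostname + "\\" + QUEUE_PATH


print(msmq_instance_prefix("APMhost"))
```

With the example hostname from the text, this prints `APMhost\private$\solarwinds\collector\processingqueue`.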
This monitor is disabled by default. Enable through the component monitor settings.
Perfmon DPPL Avg. Time to Process Item
Returns the time, in seconds, needed to process one item. A value of 1 means one item is processed per second; a value of 0.01 means 100 items are processed per second. The returned value should be as low as possible.
Note: This monitor is disabled by default.
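The relationship between the average processing time and throughput described for this monitor is a simple reciprocal. A minimal Python sketch follows, assuming the counter value is expressed in seconds per item; the function name is hypothetical.

```python
def items_per_second(avg_seconds_per_item: float) -> float:
    """Convert an average per-item processing time into items per second."""
    if avg_seconds_per_item <= 0:
        raise ValueError("average time must be greater than zero")
    return 1.0 / avg_seconds_per_item
```

A lower counter value therefore corresponds to a higher processing rate, which is why the value should be as low as possible.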
Perfmon DPPL Waiting Items
Returns the number of items that have been pulled from the message queue but are waiting for other results to be processed. This value should be less than 40. If it holds at or above 40, this may indicate slow database response times, performance issues, or many down elements.
Note: This monitor is disabled by default.
Process Monitor - SWJobEngineWorker2.exe
Returns the number of Job Engine worker processes and their CPU and memory usage. A value of 10 or lower is acceptable. If the returned value is 100 or greater, there may be problems with jobs hanging.
Process Monitor - SWJobEngineWorker2x64
Returns the number of Job Engine worker processes and their CPU and memory usage. A value of 10 or lower is acceptable. If the returned value is 100 or greater, there may be problems with jobs hanging.
RabbitMQ Folder Size
Returns the SolarWinds Platform RabbitMQ folder size. If the folder is growing, either RabbitMQ is writing undelivered messages to disk or the machine is under memory pressure.
Note: This monitor is disabled by default.
SWIS PubSub Messages Queued
Returns the total number of messages that currently reside in the SWIS publish-subscribe (PubSub) queue. If more messages are sent than subscribers can process, or if messages cannot be delivered, the RabbitMQ queue keeps growing. The size of the queue should be near 0 almost all of the time. Some spikes may appear, but SWIS needs to be able to clear the message queues quickly.
Note: This monitor is disabled by default.
TCP Port Usage Count
Returns the number of TCP ports in use.