Balance load on Additional polling engines in an HA pool
Engine load balancing automatically redistributes monitored nodes between polling engines in a high availability (HA) pool to prevent any single engine from becoming overloaded. It uses a health score and node reassignment limits to keep engines within a healthy operating range and avoid sudden, disruptive changes. The health score is calculated from engine metrics and is influenced by configurable settings, such as metric weights, thresholds, and reassignment limits.
Use engine load balancing to maintain consistent engine health, improve performance, and prevent uneven load across your environment without the ongoing administrative overhead of manually rebalancing polling engines.
Requirements
To use engine load balancing, the following requirements must be met.
-
Additional polling engines must be members of the same HA pool.
-
The HA pool must include at least two active servers and one standby server.
-
Engine load balancing must be enabled for the HA pool on the Pools tab.
-
Additional polling engines must be monitored as nodes, so that CPU, memory, and packet loss metrics are available. If some metrics are unavailable, engine load balancing uses the available metrics. An engine is skipped only if no valid metrics are available and a health score cannot be calculated.
-
Engines in the same HA pool must be able to reach the same monitored nodes.
Engine load balancing does not require a virtual IP (VIP) or virtual hostname. You only need to create an HA pool with at least one standby server.
Current limitations
-
Agent‑polled, WMI/WinRM‑polled, and Remote Collector‑polled nodes are never moved. Nodes with installed agents, orchestrators and their edge nodes, and nodes with automatic dependencies are also excluded from load balancing. These nodes remain on their original polling engine.
-
Advanced dependency‑aware balancing (keeping complex application dependencies on the same engine) is not available in this phase.
How engine load balancing works
Engine load balancing periodically evaluates polling engines in each eligible HA pool and moves a limited percentage of nodes away from overloaded engines toward healthier ones.
Engine health score and metrics
Each engine gets a health score from 0% to 100% (higher is healthier), based on the following metrics:
-
Polling completion (higher is better)
-
CPU load (lower is better)
-
Percent memory used (lower is better)
-
Percent packet loss (lower is better)
If some metrics are missing, engine load balancing uses only the available metrics and rescales weights. If no metrics exist for an engine, its health score is treated as null and the engine is skipped.
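The weighting and rescaling described above can be sketched as follows. This is a minimal illustration, not the product's actual formula: the metric names, the example weights, and the linear inversion of "lower is better" metrics are all assumptions made for the sketch.

```python
from typing import Mapping, Optional

# Hypothetical weights for illustration; the real defaults are not stated here.
EXAMPLE_WEIGHTS = {
    "polling_completion": 0.4,
    "cpu_load": 0.2,
    "memory_used": 0.2,
    "packet_loss": 0.2,
}

# Metrics where a higher percentage means a healthier engine.
HIGHER_IS_BETTER = {"polling_completion"}


def health_score(
    metrics: Mapping[str, Optional[float]],
    weights: Mapping[str, float] = EXAMPLE_WEIGHTS,
) -> Optional[float]:
    """Return a 0-100 score, or None when no valid metric is available."""
    available = {
        name: value
        for name, value in metrics.items()
        if name in weights and value is not None
    }
    if not available:
        return None  # the engine would be skipped

    # Rescale so the weights of the available metrics sum to 1.
    total_weight = sum(weights[name] for name in available)
    if total_weight <= 0:
        return None

    score = 0.0
    for name, value in available.items():
        clamped = min(max(value, 0.0), 100.0)
        # Assumed: "lower is better" metrics are inverted linearly.
        goodness = clamped if name in HIGHER_IS_BETTER else 100.0 - clamped
        score += (weights[name] / total_weight) * goodness
    return score
```

For example, an engine that reports only a CPU load of 20% would score 80 in this sketch, because the single available metric carries the full rescaled weight.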
Health score threshold and safe margin
Two settings control when engines are considered unhealthy and which ones can receive more nodes:
-
Health score threshold: Engines below this value are unhealthy and can shed nodes.
-
Health score safe margin: Engines must be above (threshold + margin) to receive additional nodes.
Example: If the threshold is 80 and the margin is 5, only engines with a health score above 85 can receive new nodes. Engines with scores from 80 through 85 neither shed nor receive nodes.
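The three zones created by the threshold and safe margin can be expressed as a small classifier. This is a sketch only: the function name and the role labels are hypothetical, and the values 80 and 5 are taken from the example above rather than from documented defaults.

```python
from typing import Optional


def classify_engine(
    score: Optional[float],
    threshold: float = 80.0,  # example value, not a documented default
    margin: float = 5.0,      # example value, not a documented default
) -> str:
    """Return the role an engine can play in one load balancing run."""
    if score is None:
        return "skipped"   # no health score could be calculated
    if score < threshold:
        return "shed"      # unhealthy: may give up nodes
    if score > threshold + margin:
        return "receive"   # clearly healthy: may accept nodes
    return "hold"          # in between: neither sheds nor receives


for s in (72.0, 80.0, 85.0, 91.0, None):
    print(s, classify_engine(s))
```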
Engine capacity and node distribution
When nodes are removed from an unhealthy engine, they are distributed across healthy engines in proportion to each engine’s available capacity. Engines with more capacity receive a larger share of the redistributed nodes.
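Proportional distribution can be illustrated with a short sketch. This document does not define how available capacity is measured, so the sketch simply takes a capacity number per engine as input; the function name and the largest-remainder rounding are assumptions chosen so that the shares are whole nodes and add up to the number of nodes being moved.

```python
from typing import Dict, Mapping


def distribute_nodes(node_count: int, capacities: Mapping[str, float]) -> Dict[str, int]:
    """Split node_count across engines in proportion to their capacity."""
    targets = {engine: cap for engine, cap in capacities.items() if cap > 0}
    total = sum(targets.values())
    if node_count <= 0 or total <= 0:
        return {engine: 0 for engine in capacities}

    # Exact (fractional) share per engine, then the whole-node part of it.
    exact = {engine: node_count * cap / total for engine, cap in targets.items()}
    shares = {engine: int(value) for engine, value in exact.items()}

    # Hand out the nodes lost to rounding, largest fractional part first.
    leftover = node_count - sum(shares.values())
    by_remainder = sorted(targets, key=lambda e: exact[e] - shares[e], reverse=True)
    for engine in by_remainder[:leftover]:
        shares[engine] += 1

    return {engine: shares.get(engine, 0) for engine in capacities}


print(distribute_nodes(100, {"engine-a": 300, "engine-b": 100}))
```

With capacities of 300 and 100, the first engine receives three times as many of the redistributed nodes as the second.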
Node reassign percentage
The node reassign percentage limits how many nodes can move from a single engine during one run. It is defined as a percentage of the total number of nodes assigned to that engine, but only eligible nodes are considered for the actual moves.
Example: If the node reassign percentage is 10 and an engine has 1,000 nodes, at most 100 eligible nodes can be moved from that engine in a single run. This reduces churn and lets load stabilize over time.
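The arithmetic behind this limit can be sketched in a few lines. The function name is hypothetical, and rounding down to a whole node is an assumption made for the sketch.

```python
import math


def max_moves_per_run(total_nodes: int, eligible_nodes: int, reassign_percent: float) -> int:
    """Upper bound on nodes moved from one engine in a single run."""
    # The cap is a percentage of ALL nodes on the engine (assumed rounded down).
    cap = math.floor(total_nodes * reassign_percent / 100)
    # Only eligible nodes can actually be moved.
    return min(cap, eligible_nodes)


print(max_moves_per_run(1000, 1000, 10))  # cap of 100, all nodes eligible
print(max_moves_per_run(1000, 40, 10))    # cap of 100, but only 40 eligible
print(max_moves_per_run(5, 5, 10))        # a small engine: the cap rounds down
```

Under this rounding assumption, a low percentage on an engine with few nodes can produce a cap of zero, in which case no nodes move from that engine.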
Enable engine load balancing for an HA pool
Engine load balancing is configured per HA pool.
-
In the SolarWinds Platform Web Console, click Settings > My Deployment.
-
Go to the Pools tab.
-
Select an HA pool that includes at least two active servers and at least one standby server.
-
Click Enable Engine Load Balancing.
When enabled, polling engine load is automatically balanced according to engine load balancing settings.
Configure global engine load balancing settings
The load balancing settings control how frequently load balancing occurs and how sensitive it is to engine health changes.
These settings usually don’t require adjustment. Change them only when technical support specifically recommends it.
-
In the SolarWinds Platform Web Console, click Settings > My Deployment.
-
Go to the Pools tab.
-
Click High Availability Settings.
-
On High Availability Settings, click Engine Load Balancing Settings.
-
Configure the following settings:
-
Task interval (engine load balancing task interval): How often the automatic task runs (in minutes).
-
Metric weights: The relative importance of CPU load, memory usage, polling completion, and packet loss in the health score. All weights must sum to 1.0.
-
Health score threshold: Engines below this score are considered unhealthy and can shed nodes.
-
Health score safe margin: The additional buffer above the threshold required for an engine to receive nodes.
-
Node reassignment percentage: The maximum percentage of nodes that can be moved from one engine per run.
-
Max node reassignments per time window: The maximum number of reassignments allowed within a specified time window.
-
Node reassignment time window: The time period used to calculate how many node reassignments occur.
-
Which pools were enabled/disabled, which nodes were included or excluded.
-
Save your changes.
If engine load balancing is too aggressive, lower the node reassignment percentage or increase the safe margin. If it is too conservative, consider slightly lowering the threshold or increasing the node reassignment percentage.
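Before changing these values, it can help to sanity-check them against the constraints described in this topic. The following sketch is not a product API: the function, parameter names, and messages are hypothetical, and the checks are only those that follow from the descriptions above (weights sum to 1.0, health scores range from 0 to 100, and a target engine must score above threshold + margin).

```python
import math
from typing import List, Mapping


def validate_settings(
    weights: Mapping[str, float],
    threshold: float,
    margin: float,
    reassign_percent: float,
    task_interval_minutes: float,
) -> List[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    if any(w < 0 for w in weights.values()):
        problems.append("metric weights must not be negative")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        problems.append("metric weights must sum to 1.0")
    if not 0 <= threshold <= 100:
        problems.append("threshold must be between 0 and 100")
    if margin < 0:
        problems.append("safe margin must not be negative")
    elif threshold + margin >= 100:
        # Scores top out at 100, so no engine could ever qualify as a target.
        problems.append("threshold + margin must be below 100")
    if not 0 < reassign_percent <= 100:
        problems.append("node reassignment percentage must be in (0, 100]")
    if task_interval_minutes <= 0:
        problems.append("task interval must be greater than zero")
    return problems


weights = {"cpu": 0.25, "memory": 0.25, "polling": 0.25, "packet_loss": 0.25}
print(validate_settings(weights, 80, 5, 10, 60))
```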
Prevent repeated node reassignments
Configure the maximum number of reassignments within a specified time frame to maintain stable load distribution.
-
Go to Engine Load Balancing Settings (Settings > My Deployment > Pools tab > High Availability Settings > Engine Load Balancing Settings).
-
Specify the maximum node reassignments per time window. The default value is 2.
-
Specify the node reassignment time window. The default value is 24 hours.
-
Save your changes.
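The effect of these two settings can be pictured as a sliding-window limit. The sketch below assumes the limit is tracked per node and that the window looks back from the current time; both are assumptions, as are the function and parameter names. The defaults of 2 reassignments and 24 hours come from the steps above.

```python
from datetime import datetime, timedelta
from typing import Iterable


def can_reassign(
    past_reassignments: Iterable[datetime],
    now: datetime,
    max_reassignments: int = 2,
    window: timedelta = timedelta(hours=24),
) -> bool:
    """True if the node has not used up its reassignments within the window."""
    recent = [t for t in past_reassignments if now - window < t <= now]
    return len(recent) < max_reassignments


now = datetime(2024, 1, 2, 12, 0)
moved_twice_today = [now - timedelta(hours=1), now - timedelta(hours=5)]
moved_once_today = [now - timedelta(hours=1), now - timedelta(hours=30)]
print(can_reassign(moved_twice_today, now))  # limit reached inside the window
print(can_reassign(moved_once_today, now))   # the older move has aged out
```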
Monitor engine load balancing activity
Use audit events in the Message Center to verify and troubleshoot engine load balancing.
-
In the SolarWinds Platform Web Console, click Alerts & Activity > Message Center.
-
Ensure Show Audit Events is selected.
-
To see when jobs ran and what they did, filter for the Engine Load Balancing Execution action type and click Apply.
Execution events show:
-
When an engine load balancing job started.
-
Which pools were processed or skipped and why (for example, all engines healthy, no eligible targets, too few engines, no nodes, or errors).
-
A per‑pool summary such as total nodes moved or a reason why no balancing occurred.
-
To see which nodes were reassigned:
-
Filter by the Engine Load Balancing Node Reassigned action type.
-
Narrow the time range to match the relevant execution events.
Each node‑level event shows the node name and IP, the previous polling engine, and the new polling engine.
Together, these events provide a complete, auditable history of engine load balancing actions.
Include or exclude nodes for load balancing
Including a node in load balancing removes only its explicit exclusion from engine load balancing (ELB). It does not override the standard ELB eligibility rules. Some node types (such as agent-managed nodes, orchestrators and their edge nodes, and nodes with automatic dependencies) remain ineligible for reassignment, even when included in ELB.
-
In the SolarWinds Platform Web Console, click Settings > Manage Nodes.
-
Select one or more nodes.
-
Expand More actions and select what you want to do:
-
Include in Load Balancing: The nodes participate in load balancing and may be reassigned to another polling engine when the polling engine load is high.
-
Exclude from Load Balancing: The nodes are excluded from load balancing and will no longer be reassigned to another polling engine.
-
Save your settings.
Troubleshooting
Engine load balancing actions are not visible
-
Confirm the selected pool is a valid HA pool with additional polling engines and at least one standby server.
-
Verify that engine load balancing is enabled for that pool on the Pools tab.
Automatic engine load balancing does not run or has no effect
-
Ensure that the task interval is greater than zero and set to a reasonable value (for example, 60 minutes).
-
Confirm the background plugin is running on the main polling engine.
-
Use SolarWinds Log Adjuster to enable engine load balancing logs, then review EngineLoadBalancing.BusinessLayer.log for task start messages and pool decisions.
Engine load balancing runs but no nodes move
-
Review execution events for reasons such as “no reassignable nodes,” “no nodes are assigned,” or “no eligible target engines available.”
-
Confirm that overloaded engines host eligible nodes.
Non-reassignable nodes include:
-
Nodes polled via agents, via WMI/WinRM, or via a Remote Collector
-
Nodes with installed agents
-
Orchestrators and edge nodes
-
Nodes with automatic dependencies
-
Explicitly excluded nodes
-
Nodes temporarily blocked by reassignment limits
-
Verify that the node reassignment percentage does not round down to zero nodes.
-
Ensure at least one engine in the pool has a health score above (threshold + margin) to act as a target.
-
If no engine in a pool has a health score above (threshold + margin), engine load balancing does not move nodes because there are no safe targets.
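When working through these checks, it can help to think of eligibility as a list of reasons per node. The sketch below mirrors the non-reassignable categories listed above. It is illustrative only: the Node fields and polling method labels are hypothetical and do not correspond to actual SolarWinds Platform property names.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for polling methods that are never moved.
NON_REASSIGNABLE_METHODS = {"Agent", "WMI", "WinRM", "RemoteCollector"}


@dataclass
class Node:
    name: str
    polling_method: str = "SNMP"  # hypothetical label
    has_agent: bool = False
    is_orchestrator_or_edge: bool = False
    has_automatic_dependencies: bool = False
    explicitly_excluded: bool = False
    blocked_by_reassignment_limit: bool = False


def ineligibility_reasons(node: Node) -> List[str]:
    """Return why a node cannot be moved; an empty list means it is eligible."""
    reasons = []
    if node.polling_method in NON_REASSIGNABLE_METHODS:
        reasons.append(f"polled via {node.polling_method}")
    if node.has_agent:
        reasons.append("agent installed")
    if node.is_orchestrator_or_edge:
        reasons.append("orchestrator or edge node")
    if node.has_automatic_dependencies:
        reasons.append("has automatic dependencies")
    if node.explicitly_excluded:
        reasons.append("explicitly excluded from load balancing")
    if node.blocked_by_reassignment_limit:
        reasons.append("temporarily blocked by reassignment limits")
    return reasons


nodes = [Node("router-1"), Node("server-1", polling_method="WMI", has_agent=True)]
for n in nodes:
    print(n.name, ineligibility_reasons(n) or "eligible")
```

If every node on an overloaded engine has at least one reason listed, the run has nothing it can move from that engine.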
Too many nodes moving (churn)
-
Lower the node reassignment percentage.
-
Increase the safe margin so only clearly healthy engines receive more nodes.
-
Increase the task interval if runs are occurring too frequently.
Tuning these settings and reviewing audit events and logs lets you control how quickly engine load balancing reacts and makes it easier to diagnose unexpected behavior.