Calculate node status in the Orion Platform
This topic applies to all Orion Platform products.
By default, node status is detected using ICMP: The Orion Platform sends a ping request. If the Orion Platform does not receive the response, it places the node into the Warning state and fast-polls the device for 120 seconds. If the node still does not respond, the Orion Platform notifies you that the node is Down.
ICMP only tells you that the Orion Platform did not receive a response to the ping request. The device could be down, but there might also be a routing problem, an intermediary device could be down, or something could have blocked the packet on its way to or from the device. For more information, see "Get more details about the node" in the NPM Getting Started Guide.
Status of sub-elements, such as interfaces and volumes, is detected using SNMP. This is more accurate, because the device tells you that the sub-element is Down.
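The ICMP-based detection described above can be pictured as a small state machine. The following is a minimal sketch, not the Orion Platform implementation: the class name `NodeIcmpState`, the method `record_ping`, and the use of plain timestamps in seconds are assumptions made for illustration. Only the 120-second fast-poll window and the Up, Warning, and Down states come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Fast-poll duration stated in the text above.
FAST_POLL_WINDOW_SECONDS = 120


@dataclass
class NodeIcmpState:
    """Hypothetical tracker for the ICMP-derived status of one node."""

    status: str = "Up"
    first_miss_at: Optional[float] = None  # time of the first unanswered ping

    def record_ping(self, now: float, responded: bool) -> str:
        """Update and return the status after one ping attempt at time `now`."""
        if responded:
            # Any response clears the outage and returns the node to Up.
            self.status = "Up"
            self.first_miss_at = None
        elif self.first_miss_at is None:
            # First missed response: enter Warning and start the fast-poll window.
            self.first_miss_at = now
            self.status = "Warning"
        elif now - self.first_miss_at >= FAST_POLL_WINDOW_SECONDS:
            # Still no response after the fast-poll window: report Down.
            self.status = "Down"
        return self.status


state = NodeIcmpState()
print(state.record_ping(0, True))     # Up
print(state.record_ping(10, False))   # Warning
print(state.record_ping(60, False))   # Warning
print(state.record_ping(130, False))  # Down
```

The sketch takes timestamps as arguments instead of reading a clock, so the transitions can be checked without waiting for real time to pass.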
Orion Platform status options
In Orion Platform 2018.4 and earlier, the node status icon consisted of two circles: the large one reflected whether the node was up or down, and the small one provided information about additional metrics.
Starting with Orion Platform 2019.2, you can keep using this "classic" method for calculating node status, or switch to the "enhanced" node status calculation where you can select what "contributors" are reflected in the status. A contributor is a status of a metric or feature that can influence the node status, such as the status of an interface, a hardware health sensor, or even a threshold.
Based on your settings, the following items might be reflected in node status:
- Node thresholds: Both global thresholds and thresholds specified for individual nodes are now reflected in the node status.
- Child objects: The status of interfaces, hardware health, or applications monitored on a node is reflected in the node status.
Where can you see a change when you switch to the enhanced node status calculation?
- Orion Maps: The status of mapped objects reflects the status of components used to calculate the node status.
- Groups: With enhanced status calculation, you can only use nodes to form a group. The status of all child objects associated with those nodes is reflected in the node status.
- Alerts: Thresholds and child objects now influence node status so you no longer need alert definitions for individual metrics or child entities.
To keep you informed about what went wrong, new macros for root cause were added:
If you add these variables to the trigger action message, the notification will include any node thresholds which have been crossed, as well as a list of all child objects in a degraded state associated with the node.
- Node tooltips: If you position your cursor over a node in Critical status in any widget, the child entities causing the problem are listed at the bottom of the tooltip.
The enhanced node status calculation is enabled by default on new installations.
If you upgraded from a previous Orion Platform version, you might need to configure the feature.
Before you enable enhanced status calculation, SolarWinds recommends that you disable alert actions in the Alert Manager.
Click Alerts & Activity > Alerts, and click More > Pause actions of all alerts. After you enable the feature, check the active alerts, adjust any alerts that should not trigger, and re-enable alert actions.
- Enable the feature.
- Specify what should influence your node status. This step is optional, because SolarWinds provides a default combination of enabled contributors.
- Adjust the status rollup mode for individual nodes. This step is optional. The default option is Mixed status.
Starting with Orion Platform 2019.2, the enhanced node status calculation is the default option.
- Click Settings > All Settings > Polling Settings.
- Scroll down to Node Status calculation, and select Enhanced.
- Submit your changes.
Including a group of contributors, such as interfaces, in the node status calculation means that the node status changes to Warning when any of those entities on the node is down.
- Click Settings > All Settings and scroll down to Thresholds & Polling.
- Click Node Child Status Participation.
- Review the list of components that can influence the node status. Available items depend on the Orion Platform products you have installed.
- Enable items to be included in node status calculation.
SolarWinds recommends that you keep the default settings.
Exclude specific entities from node status calculation
Excluding the status of specific entities, such as interfaces or applications, from node status calculation is not supported. If you do not want a child issue to participate in its parent's status, consider the following options:
- Remove the entity from monitoring.
- Unmanage the entity. For details on unmanaging interfaces, see Suspend collecting data for interfaces.
- Change the parent node's rollup mode not to be affected by this child status.
- Remove all entities of this type from participating in the node status.
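Several of the options above work by taking an entity out of the set of contributors. The following is a minimal sketch of that idea, assuming a hypothetical in-memory model: the `Child` class, its fields, and `contributing_children` are illustrative names, not Orion Platform APIs.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Child:
    """Hypothetical child entity monitored on a node."""

    name: str
    kind: str               # for example "interface" or "application"
    status: str
    unmanaged: bool = False


def contributing_children(children: Iterable[Child], enabled_kinds: Set[str]) -> List[Child]:
    """Return the children whose status still counts toward the node status.

    A child is skipped when it is unmanaged or when its entity type is not
    enabled for participation. A child removed from monitoring simply never
    appears in `children`.
    """
    return [c for c in children if c.kind in enabled_kinds and not c.unmanaged]


children = [
    Child("Gi0/1", "interface", "DOWN"),
    Child("Gi0/2", "interface", "DOWN", unmanaged=True),
    Child("IIS", "application", "CRITICAL"),
]
# Only interfaces participate here, and the unmanaged interface is skipped.
print([c.name for c in contributing_children(children, {"interface"})])  # ['Gi0/1']
```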
You have now defined the components that influence the node status calculation. Next, decide how to use them in the status calculation. The Orion Platform uses the status rollup mode to determine how the individual contributors to node status are evaluated.
Worst
The total node status is the worst status among the configured options.
Best
The total node status is the best status among the configured options.
Mixed (default option)
The global node status combines all specified contributors.
Review the following table:
| Final Node Status | Polled Status | Child 1 Status | Child 2 Status |
|---|---|---|---|
| CRITICAL | UP or WARNING | UP | CRITICAL |
| CRITICAL | UP or WARNING | DOWN | CRITICAL |
| WARNING | UP or WARNING | UP | WARNING |
| WARNING | UP or WARNING | UP | DOWN |
| WARNING | UP or WARNING | UP | UNREACHABLE |
| WARNING | UP or WARNING | DOWN | WARNING |
| WARNING | UP or WARNING | DOWN | UNKNOWN |
| WARNING | UP or WARNING | DOWN | DOWN |
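The rows above are consistent with a simple rule: any Critical child makes the node Critical, and otherwise any child in Warning, Down, Unreachable, or Unknown makes the node Warning. The following sketch encodes that rule. It is inferred only from the listed rows and is not the Orion Platform algorithm; the function name `mixed_rollup` is hypothetical, and the behavior for cases the table does not list (a polled status other than UP or WARNING, or no degraded child) is an assumption.

```python
from typing import Sequence

# Child statuses that degrade the node to WARNING in the table rows.
DEGRADED = {"WARNING", "DOWN", "UNREACHABLE", "UNKNOWN"}


def mixed_rollup(polled: str, children: Sequence[str]) -> str:
    """Reproduce the Mixed rollup rows for a polled status and child statuses."""
    if polled not in ("UP", "WARNING"):
        # Assumption: the table only covers UP or WARNING polled status,
        # so any other polled status is returned unchanged.
        return polled
    if "CRITICAL" in children:
        return "CRITICAL"
    if any(child in DEGRADED for child in children):
        return "WARNING"
    # Assumption: with no degraded child, the polled status stands.
    return polled


print(mixed_rollup("UP", ["DOWN", "CRITICAL"]))  # CRITICAL
print(mixed_rollup("UP", ["DOWN", "DOWN"]))      # WARNING
```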
By default, the status rollup mode is set to mixed. To change it, edit it in the settings for individual nodes:
- Go to the node details page, or select the node on Manage Nodes.
- Click Edit Node.
- Select the status rollup mode for the node, and submit your changes.