Documentation for SolarWinds Observability

Examples of common alert definitions

The following sections provide instructions for creating commonly used alert definitions. These are provided as examples, and you can modify them to meet your needs.

Alerts that are created automatically when you configure a website are described under Website out of the box alerts.

Alert me when a SolarWinds Observability entity stops reporting

To be alerted when an entity stops reporting (for example, it is disconnected or down), select any metric value that the entity returns each time it is polled. Then create an alert that is triggered when the count for that metric is 0.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select the type of entity. Then specify which entities you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when the entity has stopped reporting:

    1. Under Metric, select a metric whose value is returned regularly while the entity is being monitored. For example, for a Host entity, select system.cpu.utilization.aggregated.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select lower than. Then enter 1.

    3. Under During last, enter 1 and select hours. As the aggregation method, select count.

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.
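
The logic behind this condition can be sketched in a few lines of Python. This is only an illustration of the "count lower than 1 during the last 1 hour" rule, not SolarWinds Observability code or its API; the function names and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def count_in_window(timestamps, now, window):
    """Count the data points whose timestamp falls within the last `window`."""
    start = now - window
    return sum(1 for t in timestamps if start < t <= now)


def stopped_reporting(timestamps, now, window=timedelta(hours=1)):
    """Mirror the condition: count of the metric is lower than 1 in the window."""
    return count_in_window(timestamps, now, window) < 1


# Hypothetical sample data: one entity reported recently, the other went silent.
now = datetime(2024, 1, 1, 12, 0)
healthy = [now - timedelta(minutes=m) for m in (5, 10, 15)]
silent = [now - timedelta(hours=3)]

print(stopped_reporting(healthy, now))  # False
print(stopped_reporting(silent, now))   # True
```

Because the check counts data points rather than inspecting their values, any metric that the entity returns on every poll works equally well.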

Alert me when disk utilization on a host is 90% or higher

This alert is triggered when the average file system utilization is 90% or higher.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select Host as the type of entity. Then specify which hosts you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when disk utilization is equal to or above 90%:

    1. Under Metric, select system.filesystem.utilization.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select higher than or equal to. Then enter 90.

    3. Under During last, enter 1 and select hours. As the aggregation method, select average.

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.
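
As a rough illustration of how the average aggregation behaves, the following Python sketch compares a short spike with sustained high utilization. It is not SolarWinds Observability code; the function name, the sample values, and the choice to not trigger on an empty window are assumptions made for the example.

```python
from statistics import mean


def average_at_or_above(values, threshold=90.0):
    """Return True when the mean of the window's samples is >= threshold."""
    if not values:
        return False  # assumption: an empty window does not trigger
    return mean(values) >= threshold


# Hypothetical utilization samples (percent) collected during the last hour.
brief_spike = [40, 45, 99, 42]
sustained = [88, 91, 93, 92]

print(average_at_or_above(brief_spike))  # False
print(average_at_or_above(sustained))    # True
```

With average as the aggregation method, a single high sample does not trigger the alert unless the mean over the whole window reaches the threshold.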

Alert me when either memory utilization or CPU utilization on a host is 90% or higher

This alert is triggered when the average memory utilization or the average CPU utilization is 90% or higher.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select Host as the type of entity. Then specify which hosts you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when memory utilization is equal to or above 90%:

    1. Under Metric, select system.memory.utilization.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select higher than or equal to. Then enter 90.

    3. Under During last, enter 1 and select hours. As the aggregation method, select average.

  8. Click Add New Condition, and then click At least one condition is true (OR).

  9. Under Condition 2, define a condition that triggers the alert when CPU utilization is equal to or above 90%:

    1. Under Metric, select system.cpu.utilization.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select higher than or equal to. Then enter 90.

    3. Under During last, enter 1 and select hours. As the aggregation method, select average.

  10. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  11. On the Summary page, review the alert definition, and then click Create.
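
The OR combination of the two conditions can be pictured with the following Python sketch. The Condition class and alert_triggers function are hypothetical names for illustration only and do not correspond to any SolarWinds Observability API; the sample values are invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Condition:
    """One 'average is higher than or equal to threshold' condition (hypothetical)."""
    metric: str
    threshold: float

    def is_met(self, samples):
        values = samples.get(self.metric, [])
        return bool(values) and mean(values) >= self.threshold


def alert_triggers(conditions, samples):
    """At least one condition is true (OR)."""
    return any(c.is_met(samples) for c in conditions)


conditions = [
    Condition("system.memory.utilization", 90),
    Condition("system.cpu.utilization", 90),
]

# Hypothetical samples for the last hour: memory is fine, CPU is high.
samples = {
    "system.memory.utilization": [60, 65, 70],
    "system.cpu.utilization": [95, 92, 97],
}

print(alert_triggers(conditions, samples))  # True
```

Replacing any() with all() in the sketch would model a definition in which every condition must be true.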

Alert me when a host process is down, has high CPU utilization, or has high memory utilization

You can create quick alerts from the Processes tab of a host's details view. You can get alerts when the process is down or when the CPU or memory utilization is high. The alert automatically specifies the host and metric.

  1. In the left pane, click Explore.

  2. Locate the host you want to alert on, and click the host name to open the details view.

  3. Click the Processes tab.

  4. Locate the process you want to alert on.

  5. Hover over the table row, and click the vertical ellipsis in the far-right column. Then click one of the following:

    • Create alert on Process Down
    • Create alert on CPU Utilization
    • Create alert on Memory Usage

    The Create Quick Alert dialog opens with default values for the Name and Severity.

  6. (Optional) Add a description and runbook URL, and change the default values on the Details page. Then click Next.

  7. Specify the trigger condition:

    • For a process down alert, under Trigger when metric is, select equal to. Then enter 0.

    • For a CPU or memory utilization alert, enter the threshold (for example, 90).

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.

Alert me when a network interface's utilization peaks at 100%

This alert is triggered when a network interface's utilization reaches 100% during a 5-minute period.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select Network Interface as the type of entity. Then specify which network interfaces you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when receive utilization peaks at 100%:

    1. Under Metric, select Orion.NPM.InterfaceTraffic.InPercentUtil.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select equal to. Then enter 100.

    3. Under During last, enter 5 and select minutes. As the aggregation method, select maximum.

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.
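
To see why maximum is the aggregation method used here, consider this small Python sketch. It is illustrative only and not SolarWinds Observability code; the function name and the sample values are hypothetical.

```python
def peaked_at(values, target=100.0):
    """Return True when the highest sample in the window equals the target."""
    return bool(values) and max(values) == target


# Hypothetical utilization samples (percent) from the last 5 minutes.
window = [35.0, 100.0, 42.0, 38.0, 40.0]

print(sum(window) / len(window))  # 51.0
print(peaked_at(window))          # True
```

The average of the window stays far below 100, so an average-based condition would miss the peak, while the maximum captures it. In this sketch the equality check matches only when a sample is exactly 100.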

Alert me when a network device reports an anomalous metric value

This alert is triggered when a network device reports a value for CPU utilization, memory utilization, response time, or packet loss that is significantly higher than usual.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Event condition.

  5. Under Select a scope, select Network Device as the type of entity. Then specify which network devices you want to alert on. For more information, see Create an alert definition.

  6. Under Event type, select Anomaly.

  7. Under Metric, select one of the following:

    • For CPU utilization: Orion.CPULoad.AvgLoad
    • For memory utilization: Orion.CPULoad.AvgPercentMemoryUsed
    • For response time: Orion.ResponseTime.AvgResponseTime
    • For packet loss: Orion.ResponseTime.PercentLoss

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.

Alert me when a network device is not responsive

This alert is triggered when a network device's average packet loss is higher than 80% for 5 minutes.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select Network Device as the type of entity. Then specify which network devices you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when a device is not responsive:

    1. Under Metric, select Orion.ResponseTime.PercentLoss.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select higher than. Then enter 80.

    3. Under During last, enter 5 and select minutes. As the aggregation method, select average.

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.

Alert me when a network device's health score is low

This alert is triggered when a network device's health score is less than 90 for 10 minutes.

  1. Open the Active Alerts page (Alerts > Active Alerts) or the Alert Settings page (Alerts > Alert Settings).

  2. In the upper-right corner, click Create Alert.

    The Create Alert wizard opens.

  3. On the Details page, specify the name and severity. Optionally, enter a description and a runbook URL. Then click Next.

  4. Under Condition type, click Metric condition.

  5. Under Alert on, select Entity.

  6. Under Select a scope, select Network Device as the type of entity. Then specify which network devices you want to alert on. For more information, see Create an alert definition.

  7. Under Condition 1, define a condition that triggers the alert when the device's health score is lower than 90:

    1. Under Metric, select sw.metrics.healthscore.

      Enter part of the metric name to filter the list. For metric descriptions, see Metrics for SolarWinds Observability entities.

    2. Under Trigger when metric is, select lower than. Then enter 90.

    3. Under During last, enter 10 and select minutes. As the aggregation method, select average.

  8. Click Next to open the Notifications tab, and define one or more notifications to be sent when this alert is triggered. For more information, see Create an alert definition.

  9. On the Summary page, review the alert definition, and then click Create.