Monitor outages and downtime with Digital Experience
Synthetic monitoring checks help you monitor whether your entity is Up (available) or Down (unavailable). Outages are recorded based on synthetic monitoring test results. Currently, by default, an outage is recorded when your website or URI entity monitored with SolarWinds Observability is down for at least two consecutive synthetic monitoring checks (Availability check, Ping check, or TCP port check) in any region within a selected time period.
When an outage is triggered, an outage event is recorded in SolarWinds Observability. The outage is cleared after one successful synthetic check, and an outage cleared event is recorded.
Prolonged or intermittent outages can be detrimental to your website or URI by causing you to lose potential revenue or by frustrating your customer base. Outage details are available for all monitored entities configured with availability checks on the Availability page (Digital Experience > Availability) or on the Availability tab in the entity details view.
How are outages measured?
Digital Experience records an outage after a monitored entity fails two consecutive availability checks in any given region. For example, let's say that we've configured our website for HTTPS availability checks within the North America region that execute at five-minute intervals. Outages don't take individual probe failures into account, because a region such as North America will have multiple probes. To avoid false-positive outages, the default threshold is two consecutive failures (recorded down statuses) for your website.
In our example, an outage would be recorded for our website once two checks in the North America region record a down status in consecutive five-minute test intervals. That means that if our first HTTPS availability check fails during one five-minute interval, and the next check fails in the immediately following five-minute interval (within a total of ten minutes), an outage would be recorded for our monitored website.
If instead we were testing from several regions, such as North America and Europe, one failure in North America followed by a second consecutive failure in Europe would result in an outage. Any other combination of two consecutive failures across the tested regions would also result in an outage.
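The rule described above (two consecutive failed checks in any region open an outage, and one successful check clears it) can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not how SolarWinds Observability implements it. The `CheckResult` type, the `outage_events` function, and the simplified timeline of one check per interval are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class CheckResult:
    """One synthetic check result (hypothetical shape, for illustration only)."""
    minute: int   # minutes since monitoring started
    region: str   # region that ran the check
    up: bool      # True if the entity was Up, False if Down


def outage_events(results: Iterable[CheckResult],
                  failures_needed: int = 2) -> List[Tuple[int, str]]:
    """Return (minute, event) pairs for recorded and cleared outages."""
    events: List[Tuple[int, str]] = []
    consecutive_failures = 0
    in_outage = False
    # Evaluate checks in time order; the region does not matter for the count.
    for result in sorted(results, key=lambda r: r.minute):
        if result.up:
            consecutive_failures = 0
            if in_outage:
                # One successful check clears the outage.
                events.append((result.minute, "outage cleared"))
                in_outage = False
        else:
            consecutive_failures += 1
            if not in_outage and consecutive_failures >= failures_needed:
                events.append((result.minute, "outage"))
                in_outage = True
    return events


checks = [
    CheckResult(0, "North America", True),
    CheckResult(5, "North America", False),
    CheckResult(10, "Europe", False),   # second consecutive failure
    CheckResult(15, "North America", True),
]
print(outage_events(checks))
# [(10, 'outage'), (15, 'outage cleared')]
```

In this sketch, a single failed check followed by a successful one produces no events, which mirrors how the two-failure default avoids false-positive outages.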
What are the best practices for recording website outages?
Outages are often time-critical and can indicate significant issues with your website or URI. SolarWinds Observability does not charge based on the interval at which availability checks are run. SolarWinds recommends setting your availability check testing interval to one minute so that you are aware of outages and potential issues as quickly as possible.
Remember, two consecutive failures (down statuses) result in an outage. If your testing interval for your monitored website or URI entity is set to a longer interval, such as one hour, SolarWinds Observability would not record an outage until two consecutive checks fail, which in this example could mean up to two hours of downtime.
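To see how the testing interval affects how quickly an outage is recorded, the arithmetic can be sketched as follows. The function name `minutes_until_outage_recorded` is hypothetical, and the sketch assumes a single stream of checks that run at a fixed interval and complete instantly. In the best case the entity goes down just before a check runs, and in the worst case it goes down just after a successful check.

```python
from typing import Tuple


def minutes_until_outage_recorded(interval_minutes: int,
                                  failures_needed: int = 2) -> Tuple[int, int]:
    """Return (best_case, worst_case) minutes of downtime before an outage is recorded.

    Assumes checks run at a fixed interval and finish instantly (illustrative only).
    """
    # Best case: the first failing check runs right as the entity goes down.
    best_case = interval_minutes * (failures_needed - 1)
    # Worst case: the entity goes down right after a successful check.
    worst_case = interval_minutes * failures_needed
    return best_case, worst_case


for interval in (1, 5, 60):
    best, worst = minutes_until_outage_recorded(interval)
    print(f"{interval}-minute interval: outage recorded after {best} to {worst} minutes down")
```

With a one-minute interval, the outage is recorded after roughly one to two minutes of downtime. With a one-hour interval, it can take up to two hours.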
How do I view website or URI outages?
Outages for all entities that have a configured synthetic check can be viewed by going to Digital Experience > Availability. Outages for an individual website or URI entity can be viewed on the Availability tab in the entity details view.
Can I be alerted about an outage?
Yes, you can be alerted if your website or URI entity has an outage with Alert Settings Templates. Outages are recorded for a website or URI after two consecutive test failures in any region.
- Click Alerts > Alert settings, and then click the Templates tab.
- Use the applicable alert template:
  - Click the vertical ellipsis for Critical Entity Metric URI down for URIs.
  - Click the vertical ellipsis for Critical Entity Metric Website down for websites.
- Create an alert from either template, and set it to alert you after two down statuses.
Can I change the outage defaults?
Currently, the defaults for an outage are two consecutive test failures in any region. In the future, you will be able to modify the failing test regions and the number of failures needed to record an outage.