Documentation forSQL Sentry

Performance Analysis AlwaysOn

Applies to: SQL Server and SSAS targets. 

Introduction

Access the SQL Sentry AlwaysOn Management page by right-clicking the All Targets node in the Navigator pane and selecting Open > AlwaysOn Management.

Open AlwaysOn Management in the Navigator pane

Open AlwaysOn Management from the context menu of any Site or Group level node as applicable. The AlwaysOn tab is available within SQL Sentry for any monitored SQL Server instance hosting an availability group replica.

The AlwaysOn Management page is divided into the following three areas:

AlwaysOn Management Page area Description
Overview area Offers several unique actionable views of your AlwaysOn environment, including easily digestible at-a-glance status information.
History area Displays both historical charts and AlwaysOn health events from your environment.
Details Grid view area Displays both high level and detailed level metrics concerning your environment.

Customize visual aspects of the AlwaysOn views around your unique environment, including the default thresholds for Log and Recovery Queues. For more information, see the Customization section.

AlwaysOn Management includes full alerting capability surrounding your AlwaysOn environment. Fully customizable Conditions alert you on both Health and Failover status. For more information, see the AlwaysOn Alerting topic.

Server Visibility

When you monitor the SQL Server instance hosting the Primary Replica of an availability group with SQL Sentry, the entire topology of that availability group displays as part of the Overview area. Unwatched instances hosting replicas display with a gray background.

Note:  To enable complete AlwaysOn monitoring and alerting, watch each replica in the availability group with SQL Sentry. Status information is unavailable for unwatched replicas, including historical metrics, charts, and data pertaining to any associated Send or Recovery Queues.

To monitor a server, right-click the desired node, and then select Watch from the context menu.

AlwaysOn tab right click and select Watch from the context menu

Overview Area

The Overview area contains several layouts for managing your AlwaysOn environment from several perspectives, including the Windows Server Failover Cluster level, the Availability Group level, the SQL Server instance level, and the Replica level.

Change the Layout Style by selecting options from the drop-down list located in the top corner of the Overview area.

AlwaysOn tab Layout Style drop-down list

Note:  Status information that's common to several views is found in the Status Information section.

Status Information

Each of the Overview diagrams contains a unique representation of individual availability replicas/WSFCs that are part of your managed AlwaysOn environment. Regardless of the selected view, these nodes and the pipes connecting them all share a common set of visual status information.

Context Menu Options

Context Menu Option Description Image
Watch  Watch the SQL Server instance hosting the replica with SQL Sentry Performance Analysis. AlwaysOn tab Watch context menu option

Node Specific

Nodes convey status and configuration information about the replica, including Queue Health, Connection mode, and overall Health Status.AlwaysOn tab Diagram

Labels

Label Description
Instance Name On applicable views, the node contains a label with the SQL Server instance name.
Availability Replica Role Each node is labeled indicating the role of the availability replica, as AG Primary, AG Secondary, or FCI Secondary.

Indicator bars

Indicator bar Description
Health Status The health status or availability group state is represented by the left bar of the node. Green represents a healthy availability group state. Pink represents an unhealthy availability group state. The Health Status bar is also be pink if connectivity is lost between the SQL Sentry monitoring service and the SQL Server instance.
Queue Indicator The Log Send Queue / Recovery Queue status is visually represented by the bar on the right. As the respective queue grows, the color changes from a dark green, to light green, to orange. Once the Queue KB Threshold passes, the color changes to red.

Status boxes

Status box Description
Failover Mode The Failover mode of the availability replica is indicated in the left status box.
Automatic Failover mode Represented with a play icon.
Manual mode Doesn't contain an icon.
Client Connection mode The client connection mode of the availability replica is represented in the center status box. If the availability replica is a readable secondary it's represented with a small glyph.
Failover Cluster Instances If the node is a Failover ClusterInstance (FCI), it's represented in the right status box.

Availability mode

The availability mode of the availability replica is represented as follows:

Availability mode Description
Asynchronous-commitmode Represented by two black lines drawn on the pipe between availability replicas.
Synchronous-commit mode Represented by the absence of these two black lines drawn on the pipe.
Additional Information: For information about troubleshooting data movement latency between synchronous-commit AlwaysOn availability groups, see the following MSDN article.

Pipe and Color Specifics

The visual characteristics of the pipes connecting the nodes convey status information about the health relationship between nodes. Pipe saturation is an indicator of the amount of information moving between replicas. Pipe color is dependent on the recovery queue of the secondary.

SQL Sentry AlwaysOn Pipe Saturation Diagram     

Pipe Saturation

The pipe becomes saturated as the amount of data being transferred increases. More notably, the pipe becomes saturated as the KB To Replica/sec value approaches the KB To Replica/sec threshold.

When opening the AlwaysOn tab, the KB To Replica/sec threshold value is dynamic and is the highest global value seen between all availability groups and replicas for the current session by default.

Customize the KB To Replica/sec threshold with the AlwaysOn.AvailabilityReplicaTransferThreshold table in the SQL Sentry database. For more information, see the Customization section.

Pipe Color

The pipe color changes as the recovery queue grows; the color changes from a dark green, to light green, to orange, and finally once the RecoveryQueue KB Threshold is passed the color changes to red.

By default, the Recovery Queue KB Threshold value is one MB or 1024 KB. Customize the Recovery Queue KB Threshold with the AlwaysOn.AvailabilityReplica table in the SQL Sentry database. For more information, see the Customization section.

History Area

The History area contains tabs for Charting, Error Log, and Replica StateChanges. Information in each tab displays based on the selected time range. In Live mode, each tab includes the last 10 minutes of collected data. Updating the time range on the tool bar changes this displayed range. Each of the History tabs is context aware and displays information based on the node or row selected in the Overview or Grid view area. Selecting on the diagram background displays log information for all nodes and replicas.

AlwaysOn tab History area highlighted

AlwaysOn Health Collection

The Error Log and Replica State Changes tabs are initially disabled because AlwaysOn health collection is disabled by default. This prevents innocuous entries associated with Extended Events from being repeatedly logged to the SQL Server Error Log without your knowledge or approval. For more information, see the following update for a link to CU 6 for SQL Server 2012 SP1, that contains a fix for this issue.

Note:  To enable health collection, select Enable on the Error Log or Replica State Changes tab.
Important:  Enabling automated AlwaysOn health (error and state change info) collection causes innocuous entries associated with Extended Events to be repeatedly logged to the SQL Server Error Log, as described in Microsoft Connect item 783036. We are working closely with Microsoft to correct this unfortunate behavior. Until it's corrected,  the entry Using 'dbghelp.dll' version '4.0.5', is sent to the Error Log on all SQL Servers with Availability Groups monitored by SQL Sentry, at the rate specified by the collection interval.
Additional Information: Cumulative Update package 6 for SQL Server 2012 SP1 addresses this issue. Once the update is applied, innocuous entries associated with Extended Events are no longer logged to the SQL Server Error Log.

Adjust the enabled status and interval from the Advanced Global Settings (Navigator > Configuration > Global SettingsAdvanced tab).

SQL Sentry Monitoring Service Settings Advanced tab Enable AlwaysOn Health Collection setting

Charting Area

Various charts related to your managed AlwaysOn environment are available in the expandable History area of the AlwaysOn Management tab. The Charting area is context aware and displays charts based on the selected node or row.

AlwaysOn tab corresponding grids and graphs

For example, from the WSFC Node Group/Matrix layout, select a Primary Availability Replica, to display charts for the KB to Replicas/sec and Log Send Queue KB metrics.

AlwaysOn tab example one

Selecting a Secondary Availability Replica displays charts with the KB from Replicas/sec and Recovery Queue KB metrics.

AlwaysOn tab example two

Selecting a WSFC in the WSFC Members layout displays relevant metrics for each respective Availability Group hosted on the WSFC, including KB To Replicas/sec and KB from Replicas/sec metrics.

AlwaysOn tab WSFC Members layout

Error Log

The Error Log allows you to view any AlwaysOn Health events that are classified as errors in your managed environment. This log is equivalent to viewing AlwaysOn Error Health Events accessed from the native AlwaysOn Dashboard, but offers a number of advantages.

The Error Log is context aware and displays only errors related to the selected node or row. In Live mode, this includes the last 10 minutes of collected data. Updating the time range on the toolbar changes this displayed range, allowing you to view a historical subset of errors for any server.

Replica State Changes

The Replica State Changes tab displays AlwaysOn Health events that are classified as Replica State Changes. This log is equivalent to viewing AlwaysOn availability_replica_state_change health events accessed from the native AlwaysOn Dashboard, but offers several advantages.

The Replica State Changes tab is context aware and displays only state changes related to the selected node or row. In Live mode, this includes the last 10 minutes of collected data. Updating the time range on the toolbar changes this displayed range, allowing you to view a historical subset of state changes for any server.

AlwaysOn Alerting

SQL Sentry AlwaysOn Management includes the ability to configure customizable alerts specific to the status and health of your monitored Availability Groups.

The following Failsafe Conditions are available:

  • SQL Availability Group Failover
  • SQL Availability Replica Healthy
  • SQL Availability Replica Unhealthy

Each of these conditions exposes several Condition Filters that help you customize your desired alerts.

The following Advisory Conditions are available:

  • Availability Replicas Hosted on Same Virtual Host
  • Availability Replicas With Disks in Same Datastore
  • High Database Replica Send or Recovery Queue
  • High Redo Completion Time

See the Advisory Conditions Pack article for details on each.

Configuring Alerts

  1. Select the desired node for the level you'd like to configure the action for in the Navigator pane (View > Navigator), and then select Failsafe Conditions from the drop-down list in the Conditions pane (View > Conditions).
    Select the desired site in the Navigator and Failsafe Conditions in the Conditions pane
  2. Select Add to open the Select Action window, and then expand the applicable object and condition. Add Failsafe Condition in the Conditions pane
  3. Use the check box(es) to select which actions should be taken in response to this condition being met, and then select OK to save your settings. Actions Selector Failsafe Conditions
For more information about actions that can be taken when a condition is met, see the Actions topic.

Customization

Customizing Pipe Saturation

When opening the AlwaysOn tab, the KB To Replica/sec threshold value is dynamic and is the highest global value seen between all availability groups and replicas for the current session by default. Customize the KB To Replica/sec Threshold with AlwaysOn.AvailabilityReplicaTransferThreshold table in the SQL Sentry database.

Note:  The AlwaysOn.AvailabilityReplicaTransferThreshold table acts as a mapping table allowing you to define a specific value/threshold for the maximum allowed bytes (KB to Replica/sec Threshold). By default, the table doesn't have any records, and in this state the threshold is dynamic.

Example One

Set a new KB To Replica/sec threshold value specific for all replicas involved in a WSFC node – WSFC node relationship, by completing the following:

  1. Add a record with the PrimaryNodeName, SecondaryNodeName, and MaxBytesSentToSecondaryPerSec values defined. The WSFC node names are found in the AlwaysOn.ClusterNode and AlwaysOn.AvailabilityReplica tables.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold]
				(PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec)
		VALUES ('MyPrimaryNode','MySecondaryNode', 1024);
ID PrimaryNodeName SecondaryNodeName MaxBytesSentToSecondaryPerSec
1 MyPrimaryNode MySecondaryNode 1024

Success: The KB To Replica/sec threshold value is now 1 KB for the defined replicas.

Example Two

Set a new global KB To Replica/sec Threshold value by completing the following steps:

  1. Add a record with just a MaxBytesSentToSecondaryPerSec value. For example, to set a global value of 5 KB.
  2. Set a global value of 5 KB.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold]
				(MaxBytesSentToSecondaryPerSec)
		VALUES (5120);
ID PrimaryNodeName SecondaryNodeName MaxBytesSentToSecondaryPerSec
1 Null Null 5120

Success: The new global KB To Replica/sec threshold value is now 5 KB.

Example Three

Set a new KB To Replica/sec Threshold value specific to a Primary Replica - Secondary Replica relationship, by completing the following steps:

  1. Add a record with the PrimaryNodeName, SecondaryNodeName, and MaxBytesSentToSecondaryPerSec values defined.
  2. Set a value specific to a Relationship.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold]
				(PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec)
		VALUES ('MyPrimary','MySecondary', 5120);
ID PrimaryNodeName SecondaryNodeName MaxBytesSentToSecondaryPerSec
1 MyPrimary MySecondary 5120

Success: The KB To Replica/sec threshold value for the specific MyPrimary and MySecondary is now 5 KB.The default global value is used for all other replicas.

Example Four

Set a new KB To Replica/sec Threshold value for a specific Primary Replica, to be used with all of its Secondary Replicas, by completing the following steps:

  1. Add a record with just the PrimaryNodeName and MaxBytesSentToSecondaryPerSec values defined.
  2. Set a value specific to a Primary Replica to be used with all of its Secondaries.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold]
				(PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec)
		VALUES ('MyPrimary', Null, 5120);
ID PrimaryNodeName SecondaryNodeName MaxBytesSentToSecondaryPerSec
1 MyPrimary Null 5120

Success: The KB To Replica/sec Threshold value for all Secondary Replicas in a relationship with the MyPrimary replica is now 5 KB.

Note:  Add multiple records to the AlwaysOn.AvailabilityReplicaTransferThreshold table, allowing you to easily define KB To Replica/sec thresholds specific to your environment.

Customizing Pipe and Queue Indicator Colors

By default, there's a global Recovery Queue and Log Send Queue KB Threshold value of 1 MB or 1024 KB. These values can be customized with the AlwaysOn.AvailabilityReplica table in the SQL Sentry database using the LogSendQueueKBMax and RecoveryQueueKBMax columns.

Each Replica in your monitored environment has an entry in the AlwaysOn.AvailabilityReplica table. By default, the LogSendQueueKBMax and RecoveryQueueKBMax columns are null. When this is the case, the default global value (1 MB) is used for each respective queue. Set these columns to your desired value for each of your replicas.

Example One

Update the Log Send Queue KB Threshold value for the replica named MyPrimary to 10 MBs.

Set a new Log Send Queue KB Threshold value for my replica MyPrimary

UPDATE [SentryOne].[AlwaysOn].[AvailabilityReplica]
				SET LogSendQueueKBMax = 10240 
		WHERE ReplicaServerName = 'MyPrimary';

Success: The replica MyPrimary now has a Log Send Queue KB Threshold value of 10,240 KB or 10 MB.

ID ReplicaServerName LogSendQueueKBMax RecoveryQueueKBMax
1 MyPrimary 10240 Null

Distributed Availability Groups

Note:  See the information above in this article for details on icons and the data flow pipeline as the general features for Availability Groups within the AlwaysOn tab remain the same for Distributed Availability Groups.

DAG-01 Example View

The following Distributed Availability Group setup is used in the SQL Sentry AlwaysOn tab examples below:

SQL Sentry Performance Analysis Always On Distributed Availability Group Setup Example

DAG-01 has a primary availability group (AG-02) and a secondary availability group (AG-01). Within those availability groups, there's a primary to secondary replica relationship (e.g. Node 008 primary flowing to Node 007 secondary).

AG-01 View

Looking at AG-01, AG-02 is listed under the Distributed AG column:

SQL Sentry Performance Analysis Always On Distributed Availability Group Data Flow

In this view, AG-02 is labeled with the name of the Distributed AG (DAG-01). The data pipeline animates the data flow between the Distributed AG (Node 008) and the primary replica of AG-01 (Node 005).

Note:  Rows represent availability groups (e.g. AG-01) and columns represent replicas within that availability group.

AG-02 View

When selecting AG-02, the relationship between the two availability groups displays in the opposite direction. The data flow pipeline goes from WSFC Node 008 to the distributed node (005). The links continue to work between the distributed nodes and their associated replicas as described for Availability Groups in this article.

SQL Sentry Performance Analysis Always On Distributed Availability Group Data Flow

Note:  Hover over the pipeline for more information about the connection (e.g. data flow metrics and direction).

SQL Sentry Performance Analysis Always On Distributed Availability Group Data Flow Details