Documentation for Server & Application Monitor
Monitoring your applications and environment is a key capability of SolarWinds Observability Self-Hosted (formerly Hybrid Cloud Observability) and is available in the Essentials edition. Server & Application Monitor (SAM) is also available in a standalone module.

Microsoft Azure Batch Account API poller template

Use this SAM API poller template to monitor Microsoft Azure Batch Account performance and statistics counters, including node, pool, and task statistics. Azure Batch schedules compute-intensive tasks and adjusts resources to optimize performance.

Links and screenshots herein are attributed to © 2021 Microsoft Corp., available at docs.microsoft.com.

Prerequisites

  • Use the following parameters to specify the API endpoint in the request URL:

    • ${SUBSCRIPTION_ID}: Your Azure subscription ID. For example, 6a4208fe-5200-417e-9365-99781c6133c3
    • ${USERGROUP_ID}: A resource group ID
    • ${BATCHACCOUNT_ID}: Your Batch Account ID

    Use the following example to help locate values in Azure:

  • Configure OAuth 2.0 Azure credentials with the following values:

    • Scope: https://management.azure.com/.default
    • Access Token URL: https://login.microsoftonline.com/{TENANTID}/oauth2/v2.0/token

      Although "(optional)" appears next to the Scope field in the UI, this value is required for API pollers based on this template.

  • In the Batch Account's access control settings, assign the Application at least the Reader role.
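
To check the OAuth 2.0 values outside SAM, the token request can be sketched in Python. This is a minimal sketch, not part of the template: it assumes the client-credentials grant and an Azure app registration, and `tenant_id`, `client_id`, and `client_secret` are placeholder names chosen for illustration. The function only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope value from the prerequisites above.
SCOPE = "https://management.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) an OAuth 2.0 token request."""
    # Access Token URL from the prerequisites, with the tenant ID filled in.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",  # assumption: client-credentials grant
        "client_id": client_id,              # placeholder: application (client) ID
        "client_secret": client_secret,      # placeholder: application secret
        "scope": SCOPE,
    }).encode("ascii")
    return Request(url, data=form, method="POST")
```

If the request is sent, for example with urllib.request.urlopen, a successful response should be a JSON body whose access_token field holds the bearer token. A failed request typically means one of the credential values or the scope needs to be checked.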


Notes

  • Default thresholds are not set for this template.
  • For reference, see Microsoft.Batch/batchAccounts.
  • Here is a request example: https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${USERGROUP_ID}/providers/Microsoft.Batch/batchAccounts/${BATCHACCOUNT_ID}/providers/microsoft.insights/metrics?interval=PT1H&metricnames=CoreCount,TotalNodeCount,IdleNodeCount,LeavingPoolNodeCount,OfflineNodeCount,PoolCreateEvent,PoolDeleteCompleteEvent,PoolDeleteStartEvent,PoolResizeCompleteEvent,PoolResizeStartEvent,RebootingNodeCount,ReimagingNodeCount,RunningNodeCount,StartTaskFailedNodeCount,StartingNodeCount,TaskCompleteEvent,TaskFailEvent,TaskStartEvent,UnusableNodeCount,WaitingForStartTaskNodeCount&aggregation=Total&api-version=2018-01-01
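
The request example above can be assembled from the three parameters listed under Prerequisites. The following Python sketch is illustrative only: the function name `build_metrics_url` and its argument names are hypothetical, and the arguments stand in for ${SUBSCRIPTION_ID}, ${USERGROUP_ID}, and ${BATCHACCOUNT_ID}. The metric names, interval, aggregation, and API version are copied from the example.

```python
from urllib.parse import urlencode

# Metric names exactly as they appear in the request example.
METRIC_NAMES = [
    "CoreCount", "TotalNodeCount", "IdleNodeCount", "LeavingPoolNodeCount",
    "OfflineNodeCount", "PoolCreateEvent", "PoolDeleteCompleteEvent",
    "PoolDeleteStartEvent", "PoolResizeCompleteEvent", "PoolResizeStartEvent",
    "RebootingNodeCount", "ReimagingNodeCount", "RunningNodeCount",
    "StartTaskFailedNodeCount", "StartingNodeCount", "TaskCompleteEvent",
    "TaskFailEvent", "TaskStartEvent", "UnusableNodeCount",
    "WaitingForStartTaskNodeCount",
]


def build_metrics_url(subscription_id: str, resource_group: str, batch_account: str) -> str:
    """Return the metrics request URL for one Batch Account."""
    resource = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Batch/batchAccounts/{batch_account}"
    )
    # safe="," keeps the comma-separated metric list readable in the query string.
    query = urlencode({
        "interval": "PT1H",
        "metricnames": ",".join(METRIC_NAMES),
        "aggregation": "Total",
        "api-version": "2018-01-01",
    }, safe=",")
    return f"https://management.azure.com{resource}/providers/microsoft.insights/metrics?{query}"
```

Building the URL this way makes it easier to spot a wrong subscription, resource group, or account value before pasting the result into the poller.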

Available metrics

Dedicated Core Count

The total number of dedicated cores in the Batch account.

Unit: Count

Dedicated Node Count

The total number of dedicated nodes in the Batch account.

Unit: Count

Idle Node Count

The number of idle nodes.

Unit: Count

Leaving Pool Node Count

The number of nodes leaving the pool.

Unit: Count

Offline Node Count

The number of offline nodes.

Unit: Count

Pool Create Events

The total number of pool create events that occurred.

To learn more about this metric and other pool-related metrics, see Batch Analytics.

Unit: Count

Pool Delete Complete Events

The total number of pool delete complete events that occurred.

Unit: Count

Pool Delete Start Events

The total number of pool delete start events that occurred.

Unit: Count

Pool Resize Complete Events

The total number of pool resize complete events that occurred.

Unit: Count

Pool Resize Start Events

The total number of pool resize start events that occurred.

Unit: Count

Rebooting Node Count

The number of nodes that are being rebooted.

Unit: Count

Reimaging Node Count

The number of nodes that are being reimaged.

Unit: Count

Running Node Count

The number of nodes currently running.

Unit: Count

Start Task Failed Node Count

The number of nodes on which the start task failed.

Unit: Count

Starting Node Count

The number of nodes that are starting.

Unit: Count

Task Complete Events

The total number of task complete events, regardless of exit code.

Unit: Count

Task Fail Events

The total number of task fail events with non-zero exit codes.

Unit: Count

Task Start Events

The total number of tasks that have started.

Unit: Count

Unusable Node Count

The number of unusable nodes.

Unit: Count

Waiting For Start Task Node Count

The number of nodes waiting for the Start Task to complete.

Unit: Count
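
To inspect these metrics outside SAM, the latest value of each one can be pulled from the JSON response. The sketch below is a rough illustration under stated assumptions: it assumes the response follows the Azure Monitor metrics layout (a `value` list whose items carry `name.value` and `timeseries[].data[].total`) and that data points arrive in time order. The helper name `latest_totals` and the sample payload are hypothetical and hand-written for this example, not real data.

```python
import json


def latest_totals(payload: dict) -> dict:
    """Map each metric name to its most recent 'total', or None if no data."""
    result = {}
    for metric in payload.get("value", []):
        name = metric["name"]["value"]
        latest = None
        for series in metric.get("timeseries", []):
            for point in series.get("data", []):
                # Assumption: points are in time order, so the last one wins.
                if "total" in point:
                    latest = point["total"]
        result[name] = latest
    return result


# Hand-written sample in the assumed response layout (illustrative values only).
SAMPLE = json.loads("""
{
  "value": [
    {
      "name": {"value": "RunningNodeCount"},
      "unit": "Count",
      "timeseries": [
        {"data": [
          {"timeStamp": "2021-01-01T00:00:00Z", "total": 3.0},
          {"timeStamp": "2021-01-01T01:00:00Z", "total": 5.0}
        ]}
      ]
    },
    {
      "name": {"value": "TaskFailEvent"},
      "unit": "Count",
      "timeseries": []
    }
  ]
}
""")

totals = latest_totals(SAMPLE)
```

With the sample above, `totals` maps RunningNodeCount to 5.0 and TaskFailEvent to None, which distinguishes a metric that reported no data from one that reported zero.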