Monitor cloud instances and VMs
Cloud service platforms provide on-demand computing resources to third-party organizations over the Internet. As organizations migrate systems to the cloud to distribute workloads, deliver applications, and expand resources for growing databases, infrastructure can become difficult to map in sprawling environments, leading to lost resources or hidden instances.
To support hybrid environments, the Orion Platform can retrieve data from the Amazon Web Services (AWS) and Microsoft Azure cloud service platforms to track availability, performance, applications, and more for instances and VMs. Examples of data gathered include status, storage capacity, memory usage, and IP addresses.
- Manage hybrid environment metrics and status through a single console. Displaying on-premises, virtual, and cloud systems together helps you compare performance, locate bottlenecks, and better plan capacity and resource allocation.
- Track end user and business context for performance by using SolarWinds SAM to gather extended metrics that provide visibility into cloud and on-premises systems.
- Dynamically monitor cloud instances and VMs to better handle resource churn during provisioning. Instances and VMs can be added or removed as needed to support expanding environments or performance peaks.
- Determine usage trends and troubleshoot issues. Captured metrics over time provide historical references to track trends for resource consumption (such as CPU spikes and lulls) and help determine when those trends become issues.
- Use cloud monitoring data, Orion alerts, and the Performance Analysis dashboard (PerfStack) to review historical performance and pinpoint when significant usage changes began to trigger issues.
To enhance cloud monitoring, configure cloud instances/VMs as managed nodes in the Orion Platform so that you can:
- Poll specific metrics beyond the basic metrics gathered by cloud service APIs, including OS, memory, and other detailed metrics retrieved by SAM application monitors.
- Use SAM application monitors and templates to poll applications deployed in the cloud.
- Display cloud instance/VM details in AppStack for quick troubleshooting across your environment.
- Develop and deploy custom script monitors for PowerShell, Nagios, Linux/UNIX, and Windows.
- Assign Custom Properties to nodes.
To learn more, see Manage a cloud instance/VM as an Orion Platform node.
For optimal performance, SolarWinds recommends the following limits for cloud monitoring:
- Up to 10 cloud service accounts
- Up to 1,000 instances/VMs to monitor
- Up to 1,000 volumes to monitor
- Up to 1,000 instances/VMs managed as nodes
- Up to 1,000 Orion agents deployed on managed nodes
Before exceeding recommended limits, consider the impact on polling load, costs incurred due to API request overages, and the possible need to expand hardware, CPU resources, memory, etc.
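As a rough illustration, the recommended limits above can be written down as a small check against your own inventory counts. This is a minimal sketch, not an Orion Platform feature: the dictionary keys, the `over_limit` helper, and the inventory numbers are all hypothetical names and values chosen for the example. Only the limits themselves come from the list above.

```python
# Recommended limits copied from the list above.
# The key names are hypothetical labels, not Orion Platform identifiers.
RECOMMENDED_LIMITS = {
    "cloud_accounts": 10,
    "monitored_instances": 1000,
    "monitored_volumes": 1000,
    "instances_managed_as_nodes": 1000,
    "agents_on_managed_nodes": 1000,
}


def over_limit(inventory):
    """Return {name: (actual, limit)} for each count above its recommended limit.

    Counts missing from the inventory are treated as zero.
    """
    return {
        name: (inventory[name], limit)
        for name, limit in RECOMMENDED_LIMITS.items()
        if inventory.get(name, 0) > limit
    }


# Hypothetical inventory counts, for illustration only.
inventory = {
    "cloud_accounts": 4,
    "monitored_instances": 1250,
    "monitored_volumes": 980,
}

print(over_limit(inventory))  # {'monitored_instances': (1250, 1000)}
```

Any entry returned by a check like this is a prompt to review polling load, API request costs, and hardware capacity before adding more.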
See Monitor AWS cloud metrics and Monitor Azure cloud metrics.
The Orion server must be configured to communicate with public services to collect data from cloud service APIs. Use the default setting, public, in community strings for polled devices to allow read access.
After you configure a cloud account and add the first account to the Orion Platform, cloud services start polling for metrics, which appear on the Cloud Summary page in the Orion Web Console. See Explore cloud instances and VMs on the Cloud Summary page.
Cloud service APIs, such as the Amazon CloudWatch API and Azure REST API, capture basic metric data for instances/VMs and volumes so you can allocate resources, such as partial CPU processing and disk space, across multiple instances/VMs as needed. These resources can change through direct interactions and automation. For example, when the Amazon EC2 web service reports data to the Orion Platform, it reports values calculated as the percentage of assigned resources shared between instances.
Cloud metrics differ from OS metrics because of the fluid nature of cloud computing. OS metrics capture values directly from the core system, not the assigned amounts. OS data does not account for shared resources or other users attached to the instances and volumes; it shows actual usage at the polled point in time.
Both cloud metrics and OS metrics provide insight into potential and actual issues with performance and resources. However, the two can report vastly different values for the same instance because they are based on different resource allocations and metric calculations.
CPU steal is an example of how cloud and OS metrics differ. When CPU usage spikes in a cloud environment, multiple processes and instances/VMs may be accessing the same CPU as multiple owners. In OS metrics, such spikes typically look like the effect of noisy neighbors. Cloud metric data better represents the usage as a shared resource, with metrics broken down by owner.
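To make CPU steal concrete from the OS side, the sketch below computes a steal percentage from two samples of the aggregate `cpu` line in `/proc/stat` on a Linux guest. It is an illustration of what an OS-level poller can see, not a description of how the Orion Platform or any cloud API calculates its metrics. The helper names and the sample counter values are hypothetical. The field order (user, nice, system, idle, iowait, irq, softirq, steal) follows the Linux `/proc/stat` format.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into named counters.

    Only the first eight counters are used. The trailing guest fields are
    skipped because the kernel already includes them in user and nice.
    """
    fields = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    values = [int(v) for v in stat_line.split()[1:9]]
    return dict(zip(fields, values))


def steal_percent(before, after):
    """Percentage of elapsed CPU time spent waiting on the hypervisor."""
    delta = {name: after[name] - before[name] for name in before}
    total = sum(delta.values())
    return 100.0 * delta["steal"] / total if total else 0.0


# Hypothetical samples taken a few seconds apart, for illustration only.
sample_1 = "cpu 4000 0 1000 14000 500 0 100 400 0 0"
sample_2 = "cpu 4600 0 1150 14900 550 0 120 680 0 0"

pct = steal_percent(cpu_times(sample_1), cpu_times(sample_2))
print(round(pct, 1))  # 14.0
```

A guest that reports a noticeable steal percentage was ready to run but had to wait for physical CPU time, which is the kind of contention that cloud-side metrics attribute to multiple owners.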
To better define resource usage and alerts, SAM and integrated VMAN display cloud instance/VM metrics throughout all cloud resources in Orion Web Console views, resources, hover-over data, and reports. Cloud metrics, including calculated health status, CPU load, and IOPS data, are used to apply global cloud thresholds that trigger alerts and status changes. For a list of cloud metrics gathered by cloud service APIs, see the table included in the Edit global thresholds for cloud monitoring topic.
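The general idea of warning and critical thresholds driving a status can be sketched as follows. This is a simplified model under stated assumptions, not the Orion Platform's actual status calculation: the metric names, the threshold values, the status labels, and both functions are hypothetical. For the real metrics and defaults, see the Edit global thresholds for cloud monitoring topic.

```python
# Hypothetical global thresholds as (warning, critical) pairs.
# These values are illustrative and are not Orion Platform defaults.
THRESHOLDS = {
    "cpu_load_percent": (80, 90),
    "total_iops": (500, 1000),
}

# Statuses ordered from best to worst.
SEVERITY = ["up", "warning", "critical"]


def metric_status(name, value):
    """Map one metric value to a status using its warning/critical pair."""
    warning, critical = THRESHOLDS[name]
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "up"


def instance_status(metrics):
    """Return the worst status across all polled metrics of an instance."""
    statuses = (metric_status(name, value) for name, value in metrics.items())
    return max(statuses, key=SEVERITY.index, default="up")


# Hypothetical polled values for one instance.
polled = {"cpu_load_percent": 85, "total_iops": 120}
print(instance_status(polled))  # warning
```

In this model a single metric crossing its critical threshold is enough to change the status of the whole instance, which is one common way to roll up per-metric results.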
For instances and VMs managed as nodes, the Orion Platform pulls specific OS data for memory and provides additional data through Orion agent, WMI, and SNMP polling methods.