Monitor hardware health in SAM
You can use SAM to monitor the health of Dell, HP, HPE ProLiant, IBM, Cisco UCS, and Nutanix hardware, including temperature, fan speed, power supply, CPU, memory, disk space, and more. SAM provides instant visibility into the status of each component (up, warning, or critical), lets you set baseline values, and alerts you when readings cross the thresholds you set.
To get started monitoring hardware health for Dell, HP, HPE ProLiant, and IBM devices:
- Review SAM hardware health monitoring requirements.
- Download, install, and configure agent software from third-party vendors so SAM can gather details that are not available natively from server operating systems.
- Run Discovery to detect third-party agent software and hardware health sensors on servers, and automatically enable hardware health monitoring across multiple nodes.
When Discovery enables hardware health monitoring for eligible devices, Asset Inventory data collection is also enabled to track each node's hardware and software daily.
Although Hardware Health and Asset Inventory can both be enabled automatically during Discovery, they can function independently of each other. For example, you can collect Asset Inventory daily for a node without polling for Hardware Health every 10 minutes.
To monitor hardware health for UCS devices, start by adding the parent UCS controller to the Orion Platform. See Monitor Cisco UCS Devices for details.
To monitor hardware health in Nutanix environments, add Hyper-V or VMware nodes for monitoring, add the parent Nutanix cluster, and provide Controller VM (CVM) credentials. See Monitor Hardware Health for Nutanix clusters.
Documentation about Orion Platform features available in multiple modules is stored in the Orion Platform Administrator Guide. Because Hardware Health monitoring for UCS and Nutanix is a shared feature, you'll find related information in that document.
Note the following details about Hardware Health monitoring in SAM:
- To improve polling performance, consider how often you need to poll hardware statistics, and how long you need to archive them. See Update polling settings in the Orion Platform.
- Certificate errors found during polling are ignored by default, but you can change that setting.
- For tips on monitoring HPE ProLiant servers, see this THWACK post.
- For troubleshooting tips, see Troubleshooting Hardware Health.
- Troubleshoot Hardware Health monitoring in SAM (SAM online help)
- Troubleshoot hardware issues in the Orion Platform (Orion Platform Administrator Guide)
- Difference in Hardware Health by manufacturer and polling method for servers (Success Center)
- Hardware monitoring for HPE ProLiant Gen10 (THWACK)
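To get a rough feel for the polling trade-off mentioned in the notes above, the following Python sketch estimates how many hardware health samples accumulate for a given polling interval and retention period. It is only back-of-the-envelope arithmetic, not a SAM or Orion Platform API. The 10-minute interval comes from the example earlier on this page; the function names, the 30-day retention period, and the 20 sensors per node are hypothetical values chosen for illustration and are not SAM defaults.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(interval_minutes: int) -> int:
    """Number of polls in one day for a single sensor at the given interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return MINUTES_PER_DAY // interval_minutes


def stored_samples(interval_minutes: int, retention_days: int, sensors: int) -> int:
    """Approximate samples kept for one node over the retention period.

    Assumes every sensor is polled at the same interval and that no
    roll-up or summarization reduces the stored data (a simplification).
    """
    return samples_per_day(interval_minutes) * retention_days * sensors


if __name__ == "__main__":
    # Hypothetical node: 20 sensors, statistics kept for 30 days.
    for interval in (5, 10, 30):
        total = stored_samples(interval, retention_days=30, sensors=20)
        print(f"{interval:>2}-minute polling: {total} samples per node")
```

Under these assumptions, doubling the polling interval halves the number of stored samples, which is why it is worth deciding how much detail you actually need before changing polling or retention settings.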
Orion Web Console widgets
To learn about widgets shared by several products, see Orion Platform online help. For example, the following Hardware Health widgets are documented in Orion Platform online help.