Documentation for Orion Platform

Troubleshoot container monitoring

This Orion Platform topic applies only to the following products:

SAM, VMAN

This topic provides tips to resolve issues you may encounter when monitoring containers in the Orion Web Console. You can also search the Success Center or THWACK.

SolarWinds documentation describes how to display container data in the Orion Web Console. To learn about manipulating containers directly, refer to third-party documentation provided by the vendor. For example, to learn about swarm mode, see Docker docs (© 2020 Docker, Inc., available at docs.docker.com, obtained on October 26, 2020.)

Before proceeding, review Container monitoring requirements.

Locate logs

Log files related to container monitoring, which utilizes the SolarWinds Cortex service, are stored in the following default directory: C:\ProgramData\SolarWinds\Logs\Cortex. Look for the following keywords: CMAN, ContainerMgmt, and ContainerManagement.
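The keyword search above can be sketched as a small shell helper. This is only an illustration: it assumes the Cortex log directory is reachable from a POSIX shell (for example through WSL or from a copied folder), and the function name find_container_logs is hypothetical.

```shell
# List every file under LOG_DIR that mentions one of the container
# monitoring keywords named above (CMAN, ContainerMgmt, ContainerManagement).
# LOG_DIR is supplied by the caller; how the Windows log folder is exposed
# to this shell is an assumption, not something the product prescribes.
find_container_logs() {
    log_dir=$1
    grep -rlE 'CMAN|ContainerMgmt|ContainerManagement' "$log_dir" | sort
}
```

For example, under WSL the call might look like find_container_logs /mnt/c/ProgramData/SolarWinds/Logs/Cortex, where the /mnt/c mount point is an assumption about the local setup.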

To view logs for the data that is collected from every node in a cluster and sent to the Orion Platform via Cortex, run the following command on the node that hosts the OrionAggregator container:

sudo docker logs -f [container_id]

To determine the container_id for the OrionAggregator container, execute this command: sudo docker ps
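The two commands above can be combined so that the container ID does not have to be copied by hand. The following is a minimal sketch. The helper name container_id_by_name is hypothetical, and the assumption that the container name contains "orion-aggregator" should be checked against the actual sudo docker ps output.

```shell
# Print the ID of every container whose name contains PATTERN.
# Expects "ID NAME" pairs on stdin, one per line, as produced by:
#   sudo docker ps --format '{{.ID}} {{.Names}}'
container_id_by_name() {
    pattern=$1
    # index() does a plain substring match, so PATTERN is not a regex.
    awk -v pat="$pattern" 'index($2, pat) > 0 { print $1 }'
}
```

A possible invocation is sudo docker logs -f "$(sudo docker ps --format '{{.ID}} {{.Names}}' | container_id_by_name orion-aggregator)". If more than one container matches, pick a single ID before passing it to docker logs.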

Resolve orion-aggregator log messages

If messages similar to "Node with IP: {IP_address} is not added to Orion. Data is not polled" appear in the logs, verify that the Linux server that hosts the containers exists as a managed Orion node, with ICMP configured as the Polling Method. To add a node, see Add a single node for monitoring to the Orion Platform.
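To see which hosts are affected, the IP addresses can be pulled out of the log output. The sketch below assumes the message text matches the example above exactly, with an IPv4 address in place of {IP_address}. The function name unmanaged_node_ips is hypothetical.

```shell
# Read log lines on stdin and print each distinct IPv4 address reported in a
# "Node with IP: ... is not added to Orion" message, one address per line.
# Lines that do not contain the message are ignored.
unmanaged_node_ips() {
    sed -n 's/.*Node with IP: \([0-9.]*\) is not added to Orion.*/\1/p' | sort -u
}
```

For example, sudo docker logs [container_id] 2>&1 | unmanaged_node_ips lists the addresses to check against the managed nodes in the Orion Web Console.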

Container data does not appear in the Orion Web Console

If expected data about containers does not appear, review port requirements in the following sections to rule out firewall issues:

Cannot add a container service

Starting in Orion Platform 2020.2, you cannot add a container service if FIPS mode is enabled. If containers were added before FIPS mode was enabled, remove them from nodes and then delete the container service. Otherwise, container polling will continue.

Containers are missing after an upgrade

After upgrading SAM or VMAN, delete the related container service and then add it back to the Orion Platform to refresh YAML files and agent plug-ins. The Orion Monitor and Orion Aggregator containers will detect all containers running on servers during the next polling cycle. See Update an existing container service.

Containers switch to Unknown status

As described in Add a container service, several items are added to a Linux node when you deploy a container service, including an Orion Monitor container to track node status and metrics, and an Orion Aggregator container to collect data for the Orion Platform. A StatusSetUp agent plug-in is also deployed to check the status of the Orion Aggregator container every minute.

  • If the Orion Aggregator container fails to connect to the Orion server for two consecutive five-minute intervals, the container service status changes to Down and the container status switches to Unknown. Check the connection between the node and the Orion server.
  • If the Orion Aggregator container fails to report metrics from Orion Monitor containers for two consecutive five-minute intervals, the container status switches to Unknown.

Container status switches to Unknown if the Orion admin account credentials used to add the related container service change. Containers use those credentials to push data from container environments to the Orion Platform. If the password changes, delete the container service and then add it back again.
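The "two consecutive five-minute intervals" rule in the list above can be illustrated with a small model. This is not the product's implementation, only a sketch of the rule as stated. The "Up" label and the assumption that a single successful interval resets the count are not taken from the documentation, and the function name status_after_polls is hypothetical.

```shell
# Read one result per five-minute interval on stdin ("ok" or "fail"),
# oldest first, and print the status implied after the last interval:
# "Unknown" once the two most recent intervals have both failed.
# Assumption: one "ok" interval resets the failure count.
status_after_polls() {
    awk '
        $1 == "fail" { misses++ }
        $1 == "ok"   { misses = 0 }
        END { print (misses >= 2 ? "Unknown" : "Up") }
    '
}
```

Under this model, a single missed interval does not change the status. Only a second miss immediately after the first does.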

Choose a polling engine for container monitoring

Currently, there is no direct way to change the polling engine for container monitoring in the Orion Web Console. Instead, change the Orion URL property in the script that runs on the host machine when you add a container service. By default, that value is set to the IP address of the Orion server, which acts as the Main Polling Engine. Change the IP address to match an Additional Polling Engine (APE) instead.

The exact property name varies in supported script files:

  • Docker Compose file: ORION_URL
  • Kubernetes YAML file: ORION_URL
  • Apache Mesos file: ORION_CONSOLE_URL
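Editing the property by hand works, but the change can also be scripted. The following is a minimal sketch that assumes the property appears in the file as either PROPERTY=value or PROPERTY: value. The actual layout of the generated files is not shown here, so inspect the file first. The function name set_orion_property and the addresses in the example are hypothetical.

```shell
# Replace the value of PROPERTY in FILE with NEW_VALUE.
# Matches "PROPERTY=value" and "PROPERTY: value", with or without quotes
# around the value; both layouts are assumptions about the file.
# NEW_VALUE must not contain "|" or "&", which are special to sed here.
set_orion_property() {
    file=$1
    property=$2
    new_value=$3
    tmp=$(mktemp) || return 1
    if sed -E "s|(${property}[=:][[:space:]]*[\"']?)[^[:space:]\"']+|\1${new_value}|" \
        "$file" > "$tmp"; then
        # Write back in place so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

For example, set_orion_property docker-compose.yml ORION_URL 10.0.0.9 would point a Docker Compose file at an APE with the address 10.0.0.9. Back up the file before changing it.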

Docker hosts cannot reach the Orion website using an IP address, domain name, or host name

When you add a container service, the Orion Platform generates a script that is copied to the Linux host server via an SSH client. This script contains a link to a configuration file that is downloaded to the host to set up the Orion Agent when the script runs. If the host cannot access the Orion Web Console or resolve the IP address, domain name, or host name provided, the script fails.

Here are possible workarounds: 

  • If the container host cannot reach the Orion Web Console using the domain name or host name in the script, edit the script on the host to change the domain or host name to the IP address of the Orion Web Console. If the opposite is true, and the connection cannot be made using the IP address in the script, edit the script to change the IP address to the domain/host name.
  • Navigate to the URL in the script and download the file manually. Copy the script to the host server, and then run it without the curl command that transfers data automatically.
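Both workarounds start with finding the URL inside the generated script. The helpers below are a sketch under the assumption that the script contains plain http or https URLs. The function names and the example host names are hypothetical.

```shell
# Print each distinct http(s) URL found in SCRIPT, one per line.
script_urls() {
    grep -oE "https?://[^[:space:]\"']+" "$1" | sort -u
}

# Print SCRIPT with OLD_HOST replaced by NEW_HOST wherever it directly
# follows "://". Pass OLD_HOST exactly as it appears in the script.
# The original file is not modified; redirect the output to a new file.
swap_url_host() {
    script=$1
    old_host=$2
    new_host=$3
    # Escape dots so that they match literally in the sed pattern.
    old_re=$(printf '%s' "$old_host" | sed 's/\./\\./g')
    sed "s|://${old_re}|://${new_host}|g" "$script"
}
```

For the first workaround, swap_url_host install.sh orion.example.local 10.0.0.5 > install-ip.sh writes an edited copy of a script named install.sh. For the second, a URL printed by script_urls can be fetched from a machine that can reach the Orion Web Console, for example with curl -fLO followed by the URL.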