Troubleshoot container monitoring

This topic applies only to the following products:

SolarWinds Observability Self-Hosted

SAM — VMAN

This topic provides tips to resolve issues you may encounter when using the Container Monitoring feature. You can also search the Success Center or THWACK.

Before proceeding, note these details:

(Recommended) Review Container monitoring requirements.
SolarWinds documentation describes how to display container data in the SolarWinds Platform Web Console. To manipulate containers directly, refer to third-party vendor documentation. For example, to learn about swarm mode, see Docker docs (© 2021 Docker, Inc., available at docs.docker.com, obtained on June 14, 2021).

Starting in Orion Platform 2020.2.6, use SolarWinds Tokens for container monitoring. Update any container services added in earlier versions. Otherwise, polling stops.

Issues

Cannot add a container service
Resolve orion-aggregator log messages
Container data does not appear in the SolarWinds Platform Web Console. Container service status appears as Unknown and Last Seen time does not update
What happened to my container data?
Containers switch to Unknown status, or inactive containers cause AppStack and PerfStack errors
Containers are missing after an upgrade
Docker hosts cannot reach the Orion website using an IP address, domain name, or host name
AKS container monitoring stops

Cannot add a container service

Containers are not supported in High Availability (HA) or FIPS-enabled environments. If containers were added before HA or FIPS was enabled, remove them from nodes and delete container services. Otherwise, container polling continues.

Resolve orion-aggregator log messages

If messages similar to "Node with IP: {IP_address} is not added to Orion. Data is not polled" appear in logs, verify that the Linux server that hosts the containers exists as a managed Orion node with ICMP configured as the Polling Method. To add a node, see Add a single node for monitoring to the SolarWinds Platform. See also Locate logs.
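To see at a glance which hosts trigger this message, you can extract the IP addresses from saved orion-aggregator log output. The following is a minimal sketch, assuming the log text matches the message wording shown above. The function name unmanaged_node_ips is hypothetical and used only for illustration.

```python
import re

# Assumed message wording, taken from the text above. Real log lines may carry
# extra prefixes (timestamp, level), so the pattern searches anywhere in the line.
NOT_ADDED = re.compile(r"Node with IP:\s*(\S+)\s+is not added to Orion")


def unmanaged_node_ips(log_lines):
    """Return distinct IPs named in 'is not added to Orion' messages, in first-seen order."""
    seen = []
    for line in log_lines:
        match = NOT_ADDED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Each address returned is a candidate host to add as a managed node with ICMP as the Polling Method.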

Container data does not appear in the SolarWinds Platform Web Console. Container service status appears as Unknown and Last Seen time does not update

After an upgrade, polling stops until you edit container services added in earlier versions to use SolarWinds Tokens.

You can also review port requirements to rule out firewall issues.

What happened to my container data?

By default, the SolarWinds Platform clears all data, including images, for containers that report as being deleted for over 7 days.

Containers switch to Unknown status, or inactive containers cause AppStack and PerfStack errors

Verify credentials for the container service and edit them, if necessary. After an upgrade, you might need to edit existing container services to use SolarWinds Tokens.

See Manage container services to learn about the following types of containers that are deployed to container environments:

Orion Monitor containers track status and metrics for each node in a cluster.
Orion Aggregator containers on orchestrator master nodes collect data from Orion Monitor containers in the cluster and report status to the SolarWinds Platform server every five minutes.

Containers are missing after an upgrade

After an upgrade, edit container services added in earlier versions to use SolarWinds Tokens.

In earlier versions, you can delete the related container service and then add it back to the SolarWinds Platform to refresh YAML files. The Orion Monitor and Orion Aggregator containers detect containers during the next polling cycle.

Docker hosts cannot reach the Orion website using an IP address, domain name, or host name

When you add a container service, a configuration file with details about the SolarWinds Platform Web Console is downloaded to the host. If the host cannot access the SolarWinds Platform Web Console or resolve the IP address, domain name, or host name provided, the script fails.

Here are possible workarounds:

If the container host cannot reach the SolarWinds Platform Web Console using the domain name or host name in the script, edit the script on the host to change the domain or host name to the IP address of the SolarWinds Platform Web Console. If the opposite is true, and the connection cannot be made using the IP address in the script, edit the script to change the IP address to the domain/host name.
Navigate to the URL in the script and download the file manually. Copy the script to the host server, and then run it without the curl command that transfers data automatically.
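Before editing the script, it can help to confirm which form of the address the container host can actually use. The sketch below is meant to run on the container host. It checks whether a name resolves and whether a TCP connection to a given port succeeds. The function name check_endpoint is an assumption for illustration; use the address and port that appear in your script.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Report whether `host` resolves and whether a TCP connection to `port` succeeds."""
    result = {"host": host, "resolved": False, "connected": False}
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = True
    except socket.gaierror:
        return result  # name resolution failed, so a connection attempt is pointless
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError:
        pass  # resolved, but the connection was refused, filtered, or timed out
    return result
```

For example, calling check_endpoint with the host name from the script and again with the IP address of the SolarWinds Platform Web Console shows which of the two the host can reach. A result with resolved set to False points to a name resolution problem. A result with resolved set to True but connected set to False points to a firewall or routing problem.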

AKS container monitoring stops

Because Azure Kubernetes Service (AKS) switched from Docker to containerd for node pools, container monitoring on AKS stops and the following messages appear in logs:

Cannot connect to Docker endpoint
Error doing controls for orionaggregator-service.orion.svc.cluster.local

No workaround is currently available. See the latest SAM Release Notes for updates.

Adjust default settings

You can edit the following settings on the Advanced Configuration page, if necessary.

IncludeExceptionDetailsInWcfWeb: Disabled by default. SolarWinds Support may ask you to enable this option to gather stack trace and exception details related to container monitoring.
PollingInterval: By default, container services are polled every five minutes.
WebHttpsServicePort: The SolarWinds Platform server uses port 38012 internally to send data received from Orion Aggregator containers to the SolarWinds Orion API.

Polling interval changes only apply to new container services.

You can edit default data retention settings in the SolarWinds.Orion.ContainerMgmt.BusinessLayer.dll.config file, typically stored in C:\Program Files\SolarWinds\Orion:

CleanupJobInterval: 480 minutes
MaxDeletedContainerAge: 60 minutes

Locate logs

Logs are typically stored here:

C:\ProgramData\Solarwinds\Logs\Orion\ContainerManagement

In earlier versions, logs were located in C:\ProgramData\SolarWinds\Logs\Cortex.

When examining logs, search for the following keywords:

CMAN
ContainerMgmt
ContainerManagement
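One way to apply these keywords across many log files is a small script. The sketch below assumes the logs are plain-text files under a directory that you pass in. The function name find_keyword_lines and the *.log file pattern are assumptions for illustration, not SolarWinds conventions.

```python
from pathlib import Path

# Keywords listed above.
KEYWORDS = ("CMAN", "ContainerMgmt", "ContainerManagement")


def find_keyword_lines(log_dir, keywords=KEYWORDS, pattern="*.log"):
    """Yield (file name, line number, text) for each line that contains any keyword."""
    for path in sorted(Path(log_dir).rglob(pattern)):
        # errors="replace" keeps the scan going if a file has unexpected encoding
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(keyword in line for keyword in keywords):
                    yield path.name, number, line.rstrip("\n")
```

To use it, pass the log folder as a raw string, for example r"C:\ProgramData\Solarwinds\Logs\Orion\ContainerManagement", and loop over the results to print each match.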

For logs about data collected from each node in a cluster, run the following command on the node that hosts the OrionAggregator container:

sudo docker logs -f [container_id]

To determine the container_id for the OrionAggregator container, run: sudo docker ps.

In YAML files, Orion Aggregator containers are referenced as "Scope2Orion".
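The lookup and log steps above can be combined in a short script. This is a sketch under assumptions: it calls docker ps with a --format template and selects containers whose image or name contains a search term. The default term "aggregator" and both function names are guesses for illustration, so check the output of sudo docker ps for the actual image and container names in your environment.

```python
import subprocess

# Go template understood by `docker ps --format`: ID, image, and names, tab-separated.
PS_FORMAT = "{{.ID}}\t{{.Image}}\t{{.Names}}"


def match_containers(ps_output, term="aggregator"):
    """Return IDs of containers whose image or name contains `term` (case-insensitive)."""
    ids = []
    needle = term.lower()
    for row in ps_output.splitlines():
        parts = row.split("\t")
        if len(parts) != 3:
            continue  # skip blank or unexpected rows
        container_id, image, names = parts
        if needle in image.lower() or needle in names.lower():
            ids.append(container_id)
    return ids


def aggregator_container_ids(term="aggregator"):
    """Run `docker ps` and return matching container IDs (needs permission to use Docker)."""
    output = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        check=True, capture_output=True, text=True,
    ).stdout
    return match_containers(output, term)
```

Run the script with the same privileges you would use for sudo docker ps, and then pass each returned ID to sudo docker logs -f.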
