Troubleshooting Linux
If you run into issues installing the SolarWinds Snap Agent or getting metrics reported, try the troubleshooting techniques below.
Restart the agent
The agent supports standard service control commands including status, start, stop and restart. For example, you can restart the agent by running:
sudo service swisnapd restart
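As a minimal sketch, the status and restart commands can be combined so that the agent is restarted only when its status check fails. The function name restart_if_down is hypothetical, and the sketch assumes that the status command exits with a non-zero code when the agent is not running, which is the usual init-script convention but is not confirmed by this page.

```shell
#!/bin/sh
# Hypothetical helper: restart a service only when its status check fails.
# Assumption: `service <name> status` returns non-zero when the service is down.
restart_if_down() {
    svc="${1:-swisnapd}"
    if sudo service "$svc" status >/dev/null 2>&1; then
        echo "$svc is running"
    else
        echo "$svc is not running, restarting"
        sudo service "$svc" restart
    fi
}
```

Calling restart_if_down with no argument targets swisnapd; pass another service name to reuse the check elsewhere.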
View the agent log
The agent log file is located at /var/log/SolarWinds/Snap/swisnapd.log. By default only messages at or above the warning level are reported. To increase logging verbosity:
- Set the log level to debug in the agent config file
- Restart the agent
- Check for new messages in the log file, for example:
tail -f /var/log/SolarWinds/Snap/swisnapd.log
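Once debug logging is enabled the log can become noisy. The following sketch narrows the output to lines that mention a warning or an error. The function name filter_log and the default pattern are illustrative assumptions; the exact layout of swisnapd log lines is not described here, so the match is a plain case-insensitive text search rather than a parser.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines on stdin that match a pattern.
# Assumption: the level name ("warning", "error", ...) appears somewhere in
# each log line; the default pattern is a guess, so pass your own if needed.
filter_log() {
    pattern="${1:-warn|error}"
    grep -iE -- "$pattern"
}
```

For example, `tail -n 200 /var/log/SolarWinds/Snap/swisnapd.log | filter_log` prints only the matching lines from the last 200 entries, and `filter_log 'consul'` narrows the output to a single integration instead.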
Check the loaded plugins
The SolarWinds Snap Agent includes the swisnap command line tool to interact with the snap daemon on which our agent is based. Out of the box, our agent will automatically load two plugins that enable collecting system metrics (aosystem) and publishing to AppOptics (publisher-appoptics). To check that they are loaded, you can run swisnap plugin list and confirm they are listed as loaded or running under the STATUS column:
swisnap plugin list
NAME                 TYPE       API        VERSION  RUNTIME  SIGNED  STATUS   LOADED TIME
publisher-appoptics  publisher  PluginsV1  48.0.0            false   loaded   Mon, 18 May 2020 17:44:47 UTC
aosystem             collector  PluginsV2  58.0.0   local    false   running  Mon, 18 May 2020 17:44:48 UTC
processes            collector  PluginsV2  11.0.0   local    false   running  Mon, 18 May 2020 17:44:48 UTC
publisher-appoptics  publisher  PluginsV2  48.0.0   local    false   running  Mon, 18 May 2020 17:44:48 UTC
publisher-processes  publisher  PluginsV2  48.0.0   local    false   running  Mon, 18 May 2020 17:44:48 UTC
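The check of the STATUS column can be scripted. The sketch below reads plugin list output on standard input and reports, for each plugin name given as an argument, whether a row for it contains loaded or running. The function name check_plugins is hypothetical, and the parsing assumes the whitespace-separated layout shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read `swisnap plugin list` output on stdin and verify
# that every plugin named as an argument has a row whose fields include
# "loaded" or "running". Returns non-zero if any plugin is missing.
check_plugins() {
    listing="$(cat)"
    rc=0
    for name in "$@"; do
        if printf '%s\n' "$listing" | awk -v n="$name" '
            $1 == n {
                for (i = 2; i <= NF; i++)
                    if ($i == "loaded" || $i == "running") found = 1
            }
            END { exit !found }'; then
            echo "ok: $name"
        else
            echo "missing: $name"
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call would be `swisnap plugin list | check_plugins aosystem publisher-appoptics`. Scanning every field for the status word, rather than a fixed column number, keeps the check working for rows where the RUNTIME column is empty.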
Check the state of tasks
Similar to checking loaded plugins, you can use the swisnap command line tool to check the state of tasks, which define the metrics collection and publishing jobs run by the agent. Out of the box, our agent will automatically define a task to report system metrics continuously every minute.
- Use swisnap task list to get a list:
swisnap task list
+----------------------------------------+-----------+-------------+--------------+-----------+--------+------------+
| TASK ID / PLUGIN                       | TYPE      | LAST ACTION | DURATION     | PROCESSED | STATUS | LAST ERROR |
+----------------------------------------+-----------+-------------+--------------+-----------+--------+------------+
| 1a037d34-00-task-aosystem.yaml         |           |             |              |           |        |            |
| 1a037d34-ff0b-423c-a70f-dd2d13e286a2   |           |             |              |           |        |            |
| aosystem                               | collector | load        | 0s           | 0         |        | N/A        |
| publisher-appoptics                    | publisher | load        | 0s           | 0         |        | N/A        |
| bf0d3744-00-task-processes.yaml        |           |             |              |           |        |            |
| bf0d3744-fe49-457c-b7ae-b7283319cc39   |           |             |              |           |        |            |
| processes                              | collector | collect     | 3.005011279s | 28        | +      | N/A        |
| publisher-processes                    | publisher | publish     | 1.453608565s | 28        | +      | N/A        |
| c61f29c7-00-task-aosystem-warmup.yaml  |           |             |              |           |        |            |
| c61f29c7-b096-4033-8f7e-26f78effd171   |           |             |              |           |        |            |
| aosystem                               | collector | collect     | 1.687354ms   | 1         | ++++++ | N/A        |
| publisher-appoptics                    | publisher | publish     | 438.179707ms | 1         | ++++++ | N/A        |
+----------------------------------------+-----------+-------------+--------------+-----------+--------+------------+
- The above output shows a running task. To further confirm that it is the one reporting system metrics, you can either use the swisnap task details <task id> command to print the task details to the console, or use the swisnap task watch <task id> command, which logs to the console the metrics being gathered at each task interval. An example of the watch command:
swisnap task watch 53c0afb1-1e47-471f-b1c0-af69207842eb
[2020-05-18T17:47:51.667573043Z] /processes/snap-plugin-publisher-appoptics/cpu = 0
[2020-05-18T17:47:51.667581169Z] /processes/snap-plugin-publisher-appoptics/memory = 1.0773205757141113
[2020-05-18T17:47:51.66758757Z] /processes/snap-plugin-publisher-appoptics/count = 1
...
(ctl-c to quit)
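Both the details and watch commands need a task ID, which appears in the task list output as a UUID on the line below each task file name. As a rough sketch, the IDs can be pulled out of that output with a pattern match. The function name task_ids is hypothetical, and the sketch assumes a grep that supports the -o option (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper: print the unique UUID-shaped task IDs found on stdin.
# Task file names such as "1a037d34-00-task-aosystem.yaml" do not match the
# UUID pattern, so only the task ID lines are returned.
task_ids() {
    grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' | sort -u
}
```

For example, `swisnap task list | task_ids` prints one ID per line, and each ID can then be passed to swisnap task details or swisnap task watch.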
Run the plugin directly
If you're experiencing issues with a specific integration and the agent logs are not providing much help, you can also run the binary for the plugin independently of the swisnapd service to attempt a collection. This could reveal errors or permission issues that are being obscured.
Plugins compatible with V1 Plugin API
To run a plugin directly, execute the following command:
sudo -u solarwinds /opt/SolarWinds/Snap/bin/<plugin_binary> --config '<config to use in JSON format>'
For example, the following reveals a config issue with the consul plugin. The service is running on port 8500 on the host, but the config is looking at port 80.
sudo -u solarwinds /opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge-consul --config '{"address": "localhost:80"}'
...
Config Policy:
NAMESPACE      KEY                   TYPE    REQUIRED  DEFAULT  MINIMUM  MAXIMUM
bridge.consul  datacentre            string  false
bridge.consul  ssl_key               string  false
bridge.consul  token                 string  false
bridge.consul  password              string  false
bridge.consul  address               string  false
bridge.consul  username              string  false
bridge.consul  ssl_ca                string  false
bridge.consul  scheme                string  false
bridge.consul  ssl_cert              string  false
bridge.consul  insecure_skip_verify  bool    false     false
printConfigPolicy took 3.238904ms
2017/12/13 13:05:49 Bridge.init: configured telegraf input consul
Metric catalog will be updated to include:
Namespace: /consul/*/all
printMetricTypes took 446.017µs
2017/12/13 13:05:49 Error gathering /consul/*/all: Get http://localhost/v1/health/state/any: dial tcp 127.0.0.1:80: getsockopt: connection refused
Metrics that can be collected right now are:
...
Fixing the config resolves the issue, and shows a successful collection.
sudo -u solarwinds /opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge-consul --config '{"address": "localhost:8500"}'
...
Config Policy:
NAMESPACE      KEY                   TYPE    REQUIRED  DEFAULT  MINIMUM  MAXIMUM
bridge.consul  datacentre            string  false
bridge.consul  ssl_key               string  false
bridge.consul  token                 string  false
bridge.consul  password              string  false
bridge.consul  address               string  false
bridge.consul  username              string  false
bridge.consul  ssl_ca                string  false
bridge.consul  scheme                string  false
bridge.consul  ssl_cert              string  false
bridge.consul  insecure_skip_verify  bool    false     false
printConfigPolicy took 2.408139ms
2017/12/13 13:12:31 Bridge.init: configured telegraf input consul
Metric catalog will be updated to include:
Namespace: /consul/*/all
printMetricTypes took 314.801µs
Metrics that can be collected right now are:
Namespace: /consul/consul_health_checks/service_id Type: string Value:
Namespace: /consul/consul_health_checks/status Type: string Value: passing
Namespace: /consul/consul_health_checks/passing Type: int Value: 1
Namespace: /consul/consul_health_checks/critical Type: int Value: 0
Namespace: /consul/consul_health_checks/warning Type: int Value: 0
Namespace: /consul/consul_health_checks/check_name Type: string Value: Serf Health Status
printCollectMetrics took 2.14235ms
...
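The failed and successful runs above differ in whether an "Error gathering" line appears in the output. As a minimal sketch, that check can be automated by piping the plugin output through a small filter. The function name check_v1_output is hypothetical, and the sketch assumes that every collection failure is reported with the same "Error gathering" wording seen in the example; other failure modes may be worded differently.

```shell
#!/bin/sh
# Hypothetical helper: scan V1 plugin output on stdin for "Error gathering"
# lines. Prints the offending lines to stderr and returns 1 if any are found.
# Assumption: collection failures use the wording shown in the example run.
check_v1_output() {
    output="$(cat)"
    if printf '%s\n' "$output" | grep -q 'Error gathering'; then
        printf '%s\n' "$output" | grep 'Error gathering' >&2
        return 1
    fi
    echo "no gathering errors found"
}
```

To use it, append `2>&1 | check_v1_output` to the plugin command. The redirection is there in case the plugin writes its error lines to standard error rather than standard output.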
Plugins compatible with V2 Plugin API
To run a plugin directly, execute the following command:
sudo -u solarwinds /opt/SolarWinds/Snap/bin/<plugin_binary> --debug-mode --plugin-config '<config to use in JSON format>'
As in the V1 Plugin API example, the following reveals a config issue with the consul plugin. The service is running on port 8500 on the host, but the config is looking at port 80.
sudo -u solarwinds /opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge --debug-mode --plugin-config '{"consul": {"address": "localhost:80"}}'
Error occurred during metrics collection in a standalone mode (reason: user-defined Collect method ended with error: error gathering metrics from consul: Unexpected response code: 404 (<html> <head><title>404 Not Found</title></head> <body bgcolor="white"> <center><h1>404 Not Found</h1></center> <hr><center>nginx/1.14.0 (Ubuntu)</center> </body> </html> ))
Fixing the config resolves the issue, and shows a successful collection.
sudo -u solarwinds /opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge --debug-mode --plugin-config '{"consul": {"address": "localhost:8500"}}'
Gathered metrics (length=6):
/consul/consul_health_checks/passing 1 {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
/consul/consul_health_checks/critical 0 {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
/consul/consul_health_checks/warning 0 {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
/consul/consul_health_checks/check_name Serf Health Status {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
/consul/consul_health_checks/service_id {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
/consul/consul_health_checks/status passing {map[check_id:serfHealth collector_plugin:consul node:485a1dc4851c service_name:]}
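The V1 and V2 invocations differ only in their flags: V1 plugins take --config, while V2 plugins take --debug-mode and --plugin-config. The sketch below wraps both forms in one function so the right flags are chosen from a single argument. The function name run_plugin and its argument order are hypothetical; the install path, user name, and flags are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper around the two invocation forms shown in this section.
# Usage: run_plugin <v1|v2> <plugin_binary> '<config to use in JSON format>'
run_plugin() {
    api="$1"
    binary="/opt/SolarWinds/Snap/bin/$2"
    config="$3"
    case "$api" in
        v1) sudo -u solarwinds "$binary" --config "$config" ;;
        v2) sudo -u solarwinds "$binary" --debug-mode --plugin-config "$config" ;;
        *)  echo "usage: run_plugin <v1|v2> <plugin_binary> <json_config>" >&2
            return 2 ;;
    esac
}
```

For example, `run_plugin v2 snap-plugin-collector-bridge '{"consul": {"address": "localhost:8500"}}'` is equivalent to the successful V2 command above.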