Varnish
Overview
Varnish Cache is a web application accelerator, also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents.
This integration is only available for Linux platforms.
Setup
Varnish monitoring is handled by the bridge plugin, which is included with the SolarWinds Snap Agent by default. Follow the directions below to enable it for an agent instance.
The bridge plugin uses the Telegraf Varnish plugin.
Prerequisites
The Varnish daemon must be started beforehand and configured to point to your HTTP service. By default, the Varnish service runs on port 6081, and the configuration file at /etc/varnish/default.vcl must be edited to point to your HTTP service.
Once the configuration file has been updated, reload the service:
service varnish reload
If you updated the service file to change the port Varnish runs on, you will have to reload the service file:
systemctl daemon-reload
Visit the Varnish website for more information.
Configuration
The agent provides an example task file to help you get started quickly, but requires you to provide the correct settings for your Varnish installation. To enable the task:
- Make a copy of the Varnish example task file /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml.example, renaming it to /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml:
  sudo cp -p /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml.example /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml
- Edit the task file to set which metrics to report and, if needed, to ensure varnishstat is executable. If you also want to collect logs for this service, uncomment the last section in the example task file. For more information on collecting logs, see the logs collector docs. For example:
---
version: 2

schedule:
  type: cron
  interval: "0 * * * * *"

plugins:
  - plugin_name: bridge
    config:
      varnish:
        ## If running as a restricted user you can prepend sudo for additional access:
        # use_sudo: false

        ## The default location of the varnishstat binary can be overridden with:
        # binary: "/usr/bin/varnishstat"

        ## By default, telegraf gathers stats for 3 metric points.
        ## Setting stats will override the defaults shown below.
        ## Glob matching can be used, ie, stats: "MAIN.*"
        ## stats may also be set to "*", which will collect all stats
        stats:
          - "MAIN.backend_*"
          - "MAIN.cache_*"
          - "MAIN.client_req"
          - "MAIN.fetch_*"
          - "MAIN.losthdr"
          - "MAIN.n_expired"
          - "MAIN.n_lru_*"
          - "MAIN.s_re*"
          - "MAIN.s_sess"
          - "MAIN.sess_*"
          - "MAIN.shm_*"
          - "MAIN.thread*"
          - "MAIN.uptime"

        ## Optional name for the varnish instance (or working directory) to query
        ## Usually appended after -n in varnish cli
        # instance_name: instanceName

    publish:
      - plugin_name: publisher-appoptics

  ## If you want to gather logs for this integration, uncomment the following section.
  # - plugin_name: log-files
  #   config:
  #     file_paths:
  #       - /var/log/varnish/varnishncsa.log
  #   publish:
  #     - plugin_name: loggly-http-bulk
- Restart the agent:
  sudo service swisnapd restart
- Enable the Varnish integration in AppOptics:
  On the Integrations Page, you will see the Varnish integration available if the previous steps were successful. It may take a couple of minutes before the Varnish integration is identified. Select the Varnish integration to open the configuration menu in the UI, and enable it. If you do not see it, see Troubleshooting Linux.
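The stats entries in the task file are glob patterns. As a rough way to preview which varnishstat field names a pattern list would select, the sketch below uses shell case matching. It assumes the plugin's globbing behaves like shell globbing for simple * patterns such as the ones in the example task file; matches_any is a hypothetical helper name, not part of the agent.

```shell
#!/bin/sh
# matches_any NAME PATTERN...
# Succeeds if NAME matches at least one glob PATTERN.
# Hypothetical helper for previewing the "stats" patterns; it only
# approximates the plugin's own matching.
matches_any() {
    name=$1
    shift
    for pattern in "$@"; do
        # An unquoted $pattern in a case clause is treated as a glob.
        case $name in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Example use against a live instance (not run here):
#   varnishstat -1 | awk '{print $1}' | while read -r field; do
#       matches_any "$field" 'MAIN.backend_*' 'MAIN.s_re*' && echo "$field"
#   done
```

Quote each pattern when calling the function so the shell does not expand it against files in the current directory.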
Testing Integration
To check whether metrics can be collected with a given configuration, and which ones, run the bridge plugin in debug mode:
/opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge --debug-mode --plugin-config "{\"varnish\": {\"stats\": [\"MAIN.backend_*\"]}}"
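Hand-escaping the JSON for --plugin-config gets error-prone once several patterns are involved. The sketch below builds the same JSON shape as the command above from a list of patterns. build_plugin_config is a hypothetical helper name, and it assumes the patterns contain no double quotes or backslashes, since it performs no JSON escaping.

```shell
#!/bin/sh
# build_plugin_config PATTERN...
# Prints the JSON document expected by --plugin-config, with one
# "stats" entry per argument. Hypothetical helper, not part of the agent.
build_plugin_config() {
    stats=""
    for pattern in "$@"; do
        # Separate entries with a comma after the first one.
        if [ -n "$stats" ]; then
            stats="$stats, "
        fi
        stats="$stats\"$pattern\""
    done
    printf '{"varnish": {"stats": [%s]}}\n' "$stats"
}
```

It could then be used as: /opt/SolarWinds/Snap/bin/snap-plugin-collector-bridge --debug-mode --plugin-config "$(build_plugin_config 'MAIN.backend_*' 'MAIN.cache_*')"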
Permission issues with running varnishstat
If you're seeing errors related to running varnishstat, you may need to set additional permissions for the solarwinds user.
- First, try adding the solarwinds user to the varnish group:
  sudo usermod -a -G varnish solarwinds
  Confirm that the user has been updated:
  groups solarwinds
  solarwinds : solarwinds varnish
  Then restart the service:
  sudo service swisnapd restart
- If that doesn't work, try giving the solarwinds user sudo privileges:
  Set use_sudo to true in the /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml file.
  ## If running as a restricted user you can prepend sudo for additional access:
  use_sudo: true
  Then restart the service:
  sudo service swisnapd restart
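The group check in the first step can also be scripted. The sketch below inspects a line in the "user : group1 group2" form that groups prints for a named user on Linux. has_group is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# has_group LINE GROUP
# Succeeds if GROUP is listed in LINE, where LINE is the output of
# `groups USER` (for example "solarwinds : solarwinds varnish").
# Hypothetical helper, not part of the agent.
has_group() {
    # Drop the "user : " prefix, leaving only the group names.
    members=${1#*: }
    for g in $members; do
        if [ "$g" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Example use (not run here):
#   has_group "$(groups solarwinds)" varnish || echo "solarwinds is not in the varnish group"
```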
Metrics and Tags
Default Metrics
The table below outlines the default set of metrics collected by the Varnish integration.
Namespace | Description |
---|---|
varnish.backend_conn | Successful backend connections |
varnish.backend_unhealthy | Backend connections not attempted due to unhealthy backend state |
varnish.backend_busy | Unsuccessful backend connections due to exceeding the maximum number of connections |
varnish.backend_fail | Backend connection failures |
varnish.backend_reuse | Backend connection reuses |
varnish.backend_recycle | Backend connection recycles |
varnish.backend_retry | Backend connection retries |
varnish.cache_hit | Number of requests served by the cache server |
varnish.cache_hitpass | Requests passed to the backend from the cache by choice |
varnish.cache_miss | Number of requests served by the backend server |
varnish.client_req | Good client requests received |
varnish.fetch_head | Backend HEAD requests |
varnish.fetch_length | Backend responses with Content-Length in the body |
varnish.fetch_chunked | Backend responses that were chunked |
varnish.fetch_eof | Backend responses with EOF in the body |
varnish.fetch_bad | Backend responses with a bad body |
varnish.fetch_none | Backend responses with no body |
varnish.fetch_1xx | Backend responses with no body due to 1xx responses |
varnish.fetch_204 | Backend responses with no body due to 204 responses |
varnish.fetch_304 | Backend responses with no body due to 304 responses |
varnish.fetch_failed | Backend response fetches that failed |
varnish.fetch_no_thread | Backend response fetches that failed due to no thread being available |
varnish.losthdr | HTTP header overflows |
varnish.n_expired | Number of expired objects |
varnish.n_lru_nuked | Number of LRU nuked objects |
varnish.n_lru_moved | Number of LRU moved objects |
varnish.s_req | Total requests seen |
varnish.s_req_hdrbytes | Total request header bytes |
varnish.s_req_bodybytes | Total request body bytes |
varnish.s_resp_hdrbytes | Total response header bytes |
varnish.s_resp_bodybytes | Total response body bytes |
varnish.s_sess | Total sessions seen |
varnish.sess_conn | Accepted client connections |
varnish.sess_drop | Dropped client connections |
varnish.sess_fail | Failed client connections |
varnish.shm_records | SHM records |
varnish.shm_writes | SHM writes |
varnish.shm_flushes | SHM flushes due to overflow |
varnish.shm_cont | SHM MTX contention |
varnish.shm_cycles | SHM cycles through buffer |
varnish.threads | Total number of threads |
varnish.threads_limited | Number of threads that hit the thread pool limit |
varnish.threads_created | Number of threads created |
varnish.threads_destroyed | Number of threads destroyed |
varnish.threads_failed | Number of threads that failed to be created |
varnish.thread_queue_len | Length of session queue for threads |
varnish.uptime | How long the child process has been running |
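One common use of the counters above is a cache hit percentage derived from varnish.cache_hit and varnish.cache_miss. The sketch below shows only the arithmetic, hits / (hits + misses), rounded down to a whole percent. hit_ratio is a hypothetical helper name, and the sketch assumes both counts cover the same time window.

```shell
#!/bin/sh
# hit_ratio HITS MISSES
# Prints the share of requests answered from cache as a whole percentage,
# or "n/a" when there were no requests at all.
# Hypothetical helper for illustration only.
hit_ratio() {
    hit=$1
    miss=$2
    total=$((hit + miss))
    if [ "$total" -eq 0 ]; then
        echo "n/a"
        return 0
    fi
    # Integer division: the result is truncated, not rounded.
    echo $((100 * hit / total))
}
```

For example, 900 hits and 100 misses give 90, meaning 90% of those requests were served from the cache.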
Optional Metrics
All stats that varnishstat reports are available in AppOptics. To expose these, edit the stats configuration setting in the /opt/SolarWinds/Snap/etc/tasks-autoload.d/task-bridge-varnish.yaml file. We highly recommend exposing only the metrics you are interested in, because setting stats to "*" causes all metrics (including those reporting no value or zero values) to start reporting to AppOptics at a 60-second resolution.
MAIN Metrics
Namespace | Description |
---|---|
varnish.client_req_400 | Client requests received, subject to 400 errors |
varnish.client_req_417 | Client requests received, subject to 417 errors |
varnish.pools | Number of thread pools |
varnish.busy_sleep | Number of requests sent to sleep because a busy object was found |
varnish.busy_wakeup | Number of requests woken after sleep on busy and rescheduled |
varnish.busy_killed | Number of requests killed after sleep on busy |
varnish.sess_queued | Client connections queued waiting for a thread |
varnish.sess_dropped | Client connections dropped waiting for a thread |
varnish.n_object | Number of object structs made |
varnish.n_vampireobject | Number of unresurrected objects |
varnish.n_objectcore | Number of objectcore structs made |
varnish.n_objecthead | Number of objecthead structs made |
varnish.n_waitinglist | Number of waitinglist structs made |
varnish.n_backend | Number of backends |
varnish.s_pipe | Total pipe sessions seen |
varnish.s_pass | Total passed requests seen |
varnish.s_fetch | Total backend fetches initiated |
varnish.s_synth | Total synthetic responses made |
varnish.s_pipe_hdrbytes | Total pipe request header bytes |
varnish.s_pipe_in | Total piped bytes from client |
varnish.s_pipe_out | Total piped bytes to client |
varnish.sess_closed | Number of client connections closed |
varnish.sess_closed_err | Number of client connections closed with an error |
varnish.sess_readahead | Number of client connections read ahead |
varnish.sess_herd | Number of times the timeout_linger triggered |
varnish.sc_rem_close | Number of client connection closes with REM_CLOSE |
varnish.sc_req_close | Number of client connection closes with REQ_CLOSE |
varnish.sc_req_http10 | Number of client connection closes with error REQ_HTTP10 |
varnish.sc_rx_bad | Number of client connection closes with error RX_BAD |
varnish.sc_rx_body | Number of client connection closes with error RX_BODY |
varnish.sc_rx_junk | Number of client connection closes with error RX_JUNK |
varnish.sc_rx_overflow | Number of client connection closes with error RX_OVERFLOW |
varnish.sc_rx_timeout | Number of client connection closes with error RX_TIMEOUT |
varnish.sc_tx_pipe | Number of client connection closes with TX_PIPE |
varnish.sc_tx_error | Number of client connection closes with error TX_ERROR |
varnish.sc_tx_eof | Number of client connection closes with TX_EOF |
varnish.sc_resp_close | Number of client connection closes with RESP_CLOSE |
varnish.sc_overload | Number of client connection closes with error OVERLOAD |
varnish.sc_pipe_overflow | Number of client connection closes with error PIPE_OVERFLOW |
varnish.sc_range_short | Number of client connection closes with error RANGE_SHORT |
varnish.backend_req | Backend requests made |
varnish.n_vcl | Total number of loaded VCLs |
varnish.n_vcl_avail | Total number of VCLs available |
varnish.n_vcl_discard | Total number of discarded VCLs |
varnish.bans | Count of bans |
varnish.bans_completed | Number of bans marked 'completed' (no longer active) |
varnish.bans_obj | Number of bans using obj.* |
varnish.bans_req | Number of bans using req.* |
varnish.bans_added | Bans added to ban list |
varnish.bans_deleted | Bans deleted from ban list |
varnish.bans_tested | Bans tested against objects during lookup |
varnish.bans_obj_killed | Objects killed by bans during lookup |
varnish.bans_lurker_tested | Bans tested against objects by the ban-lurker |
varnish.bans_tests_tested | Ban tests tested against objects during lookup |
varnish.bans_lurker_tests_tested | Ban tests tested against objects by the ban-lurker |
varnish.bans_lurker_obj_killed | Objects killed by the ban-lurker |
varnish.bans_dups | Bans superseded by bans added later to the ban list |
varnish.bans_lurker_contention | Number of times the ban-lurker waited for lookups |
varnish.bans_persisted_bytes | Bytes used by the persisted ban lists |
varnish.bans_persisted_fragmentation | Extra bytes in persisted ban lists due to fragmentation |
varnish.n_purges | Number of purge operations executed |
varnish.n_obj_purged | Number of purged objects |
varnish.exp_mailed | Number of objects mailed to expiry thread |
varnish.exp_received | Number of objects received by expiry thread |
varnish.hcb_nolock | HCB Lookups without lock |
varnish.hcb_lock | HCB Lookups with lock |
varnish.hcb_insert | HCB Inserts |
varnish.esi_errors | Edge Side Includes (ESI) parse errors |
varnish.esi_warnings | Edge Side Includes (ESI) parse warnings (unlock) |
varnish.vmods | Loaded VMODs |
varnish.n_gzip | Number of Gzip operations |
varnish.n_gunzip | Number of Gunzip operations |
varnish.vsm_free | Free VSM space |
varnish.vsm_used | Used VSM space |
varnish.vsm_cooling | Soon to be freed VSM space |
varnish.vsm_overflow | Data which does not fit in the VSM space |
varnish.vsm_overflowed | Total data which did not fit in the VSM space |
MGT Metrics
Namespace | Description |
---|---|
varnish.child_start | Child processes started |
varnish.child_exit | Child processes that were stopped normally |
varnish.child_stop | Child processes that exited with an unexpected return code |
varnish.child_died | Child processes that died due to signals |
varnish.child_dump | Child processes that produced core dumps |
varnish.child_panic | Child processes that panicked |
MEMPOOL Metrics
Namespace | Description |
---|---|
varnish.busyobj.live | In use |
varnish.busyobj.pool | In Pool |
varnish.busyobj.sz_wanted | Size requested |
varnish.busyobj.sz_actual | Size allocated |
varnish.busyobj.allocs | Allocations |
varnish.busyobj.frees | Frees |
varnish.busyobj.recycle | Recycled from pool |
varnish.busyobj.timeout | Timed out from pool |
varnish.busyobj.toosmall | Too small to recycle |
varnish.busyobj.surplus | Too many for pool |
varnish.busyobj.randry | Pool ran dry |
varnish.req0.live | In use |
varnish.req0.pool | In Pool |
varnish.req0.sz_wanted | Size requested |
varnish.req0.sz_actual | Size allocated |
varnish.req0.allocs | Allocations |
varnish.req0.frees | Frees |
varnish.req0.recycle | Recycled from pool |
varnish.req0.timeout | Timed out from pool |
varnish.req0.toosmall | Too small to recycle |
varnish.req0.surplus | Too many for pool |
varnish.req0.randry | Pool ran dry |
varnish.sess0.live | In use |
varnish.sess0.pool | In Pool |
varnish.sess0.sz_wanted | Size requested |
varnish.sess0.sz_actual | Size allocated |
varnish.sess0.allocs | Allocations |
varnish.sess0.frees | Frees |
varnish.sess0.recycle | Recycled from pool |
varnish.sess0.timeout | Timed out from pool |
varnish.sess0.toosmall | Too small to recycle |
varnish.sess0.surplus | Too many for pool |
varnish.sess0.randry | Pool ran dry |
varnish.req1.live | In use |
varnish.req1.pool | In Pool |
varnish.req1.sz_wanted | Size requested |
varnish.req1.sz_actual | Size allocated |
varnish.req1.allocs | Allocations |
varnish.req1.frees | Frees |
varnish.req1.recycle | Recycled from pool |
varnish.req1.timeout | Timed out from pool |
varnish.req1.toosmall | Too small to recycle |
varnish.req1.surplus | Too many for pool |
varnish.req1.randry | Pool ran dry |
varnish.sess1.live | In use |
varnish.sess1.pool | In Pool |
varnish.sess1.sz_wanted | Size requested |
varnish.sess1.sz_actual | Size allocated |
varnish.sess1.allocs | Allocations |
varnish.sess1.frees | Frees |
varnish.sess1.recycle | Recycled from pool |
varnish.sess1.timeout | Timed out from pool |
varnish.sess1.toosmall | Too small to recycle |
varnish.sess1.surplus | Too many for pool |
varnish.sess1.randry | Pool ran dry |
SMA Metrics
Namespace | Description |
---|---|
varnish.s0.c_req | Allocator requests |
varnish.s0.c_fail | Allocator failures |
varnish.s0.c_bytes | Bytes allocated |
varnish.s0.c_freed | Bytes freed |
varnish.s0.g_alloc | Allocations outstanding |
varnish.s0.g_bytes | Bytes outstanding |
varnish.s0.g_space | Bytes available |
varnish.Transient.c_req | Allocator requests |
varnish.Transient.c_fail | Allocator failures |
varnish.Transient.c_bytes | Bytes allocated |
varnish.Transient.c_freed | Bytes freed |
varnish.Transient.g_alloc | Allocations outstanding |
varnish.Transient.g_bytes | Bytes outstanding |
varnish.Transient.g_space | Bytes available |
VBE Metrics
Namespace | Description |
---|---|
varnish.boot.default.happy | Number of happy health probes |
varnish.boot.default.bereq_hdrbytes | Total request header bytes |
varnish.boot.default.bereq_bodybytes | Total request body bytes |
varnish.boot.default.beresp_hdrbytes | Total response header bytes |
varnish.boot.default.beresp_bodybytes | Total response body bytes |
varnish.boot.default.pipe_hdrbytes | Total pipe request header bytes |
varnish.boot.default.pipe_out | Total piped bytes to backend |
varnish.boot.default.pipe_in | Total piped bytes from backend |
varnish.boot.default.conn | Number of concurrent connections to backend |
varnish.boot.default.req | Number of backend requests sent |
LCK Metrics
Namespace | Description |
---|---|
varnish.backend.creat | Created locks |
varnish.backend.destroy | Destroyed locks |
varnish.backend.locks | Lock Operations |
varnish.backend_tcp.creat | Created locks |
varnish.backend_tcp.destroy | Destroyed locks |
varnish.backend_tcp.locks | Lock Operations |
varnish.ban.creat | Created locks |
varnish.ban.destroy | Destroyed locks |
varnish.ban.locks | Lock Operations |
varnish.busyobj.creat | Created locks |
varnish.busyobj.destroy | Destroyed locks |
varnish.busyobj.locks | Lock Operations |
varnish.cli.creat | Created locks |
varnish.cli.destroy | Destroyed locks |
varnish.cli.locks | Lock Operations |
varnish.exp.creat | Created locks |
varnish.exp.destroy | Destroyed locks |
varnish.exp.locks | Lock Operations |
varnish.hcb.creat | Created locks |
varnish.hcb.destroy | Destroyed locks |
varnish.hcb.locks | Lock Operations |
varnish.lru.creat | Created locks |
varnish.lru.destroy | Destroyed locks |
varnish.lru.locks | Lock Operations |
varnish.mempool.creat | Created locks |
varnish.mempool.destroy | Destroyed locks |
varnish.mempool.locks | Lock Operations |
varnish.objhdr.creat | Created locks |
varnish.objhdr.destroy | Destroyed locks |
varnish.objhdr.locks | Lock Operations |
varnish.pipestat.creat | Created locks |
varnish.pipestat.destroy | Destroyed locks |
varnish.pipestat.locks | Lock Operations |
varnish.sess.creat | Created locks |
varnish.sess.destroy | Destroyed locks |
varnish.sess.locks | Lock Operations |
varnish.smp.creat | Created locks |
varnish.smp.destroy | Destroyed locks |
varnish.smp.locks | Lock Operations |
varnish.vbe.creat | Created locks |
varnish.vbe.destroy | Destroyed locks |
varnish.vbe.locks | Lock Operations |
varnish.vcapace.creat | Created locks |
varnish.vcapace.destroy | Destroyed locks |
varnish.vcapace.locks | Lock Operations |
varnish.vcl.creat | Created locks |
varnish.vcl.destroy | Destroyed locks |
varnish.vcl.locks | Lock Operations |
varnish.vxid.creat | Created locks |
varnish.vxid.destroy | Destroyed locks |
varnish.vxid.locks | Lock Operations |
varnish.waiter.creat | Created locks |
varnish.waiter.destroy | Destroyed locks |
varnish.waiter.locks | Lock Operations |
varnish.wq.creat | Created locks |
varnish.wq.destroy | Destroyed locks |
varnish.wq.locks | Lock Operations |
varnish.wstat.creat | Created locks |
varnish.wstat.destroy | Destroyed locks |
varnish.wstat.locks | Lock Operations |
varnish.sma.creat | Created locks |
varnish.sma.destroy | Destroyed locks |
varnish.sma.locks | Lock Operations |
Tags
The table below outlines the default set of tags provided for each metric.
Tag Name | Description |
---|---|
hostname | Name of the host. Instead of using this tag we recommend using the @host alias. |
section | Prefix of the original varnish stat (one of main, mgt, mempool, sma, vbe, lck) |
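Read together, the metric tables and the section tag suggest how a varnishstat field name is split: the prefix before the first dot becomes the lowercase section tag, and the remainder is reported under the varnish. namespace. The sketch below illustrates that reading. It is inferred from the tables on this page rather than taken from the plugin's source, and map_stat is a hypothetical helper name.

```shell
#!/bin/sh
# map_stat FIELD
# Prints the metric name and section tag that a varnishstat field such as
# MAIN.cache_hit appears to map to, based on the tables on this page.
# Hypothetical helper for illustration only.
map_stat() {
    field=$1
    section=${field%%.*}   # text before the first dot, e.g. MAIN
    name=${field#*.}       # text after the first dot, e.g. cache_hit
    section=$(printf '%s' "$section" | tr '[:upper:]' '[:lower:]')
    printf 'varnish.%s section=%s\n' "$name" "$section"
}
```

For instance, map_stat MAIN.cache_hit prints varnish.cache_hit section=main.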