High Availability in SolarWinds Platform products
This topic applies only to the following products:
SolarWinds Observability Self-Hosted
DPAIM — EOC — IPAM — LA — NAM — NCM — NPM — NTA — SAM — SCM — SRM — UDT — VMAN — VNQM — WPM
SolarWinds Platform High Availability (HA) provides failover protection for your main SolarWinds Platform server, also known as the main polling engine, and for your Additional polling engines, which reduces data loss when a primary server goes down. If your primary server fails, the secondary server takes over all services, such as polling and alerting, with minimal downtime. SolarWinds HA does not protect your databases or your Additional web servers.
SolarWinds supports physical-to-physical, physical-to-virtual, virtual-to-physical, and virtual-to-virtual failover in an IPv4 single subnet (High Availability) or multi-subnet (Disaster Recovery) environment. You can deploy SolarWinds Platform High Availability on both a single subnet and multiple subnets using the same SolarWinds installation.
How does SolarWinds Platform High Availability work?
Single subnet (LAN)
When you configure your environment for SolarWinds Platform High Availability on a single subnet, place your secondary server on the same subnet as the primary server. Configure the secondary server to use the same network and database resources as the primary server. In the SolarWinds Platform Web Console, add both servers to an HA pool. The pool is accessed through a single Virtual IP (VIP) address or virtual hostname, which routes incoming requests and messages to the currently active server.
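Before adding the servers to a pool, it can help to confirm that both addresses really are on the same IPv4 subnet. The following is a minimal sketch using Python's standard ipaddress module. The function name, the example addresses, and the prefix length are illustrative assumptions, not part of the SolarWinds product.

```python
import ipaddress


def on_same_subnet(primary_ip: str, secondary_ip: str, prefix_length: int) -> bool:
    """Return True if both IPv4 addresses fall in the same subnet.

    Hypothetical helper for a pre-deployment check; SolarWinds does not
    ship this function.
    """
    # ip_interface() pairs an address with a prefix; .network gives the subnet.
    primary_net = ipaddress.ip_interface(f"{primary_ip}/{prefix_length}").network
    secondary_net = ipaddress.ip_interface(f"{secondary_ip}/{prefix_length}").network
    return primary_net == secondary_net


if __name__ == "__main__":
    # Example addresses are placeholders; substitute your own servers.
    print(on_same_subnet("10.0.1.10", "10.0.1.20", 24))
    print(on_same_subnet("10.0.1.10", "10.0.2.20", 24))
```

If the check returns False, the servers are on different subnets and the multi-subnet (WAN) deployment applies instead.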
The SolarWinds HA software monitors the health of both servers in the pool, and both servers keep communication channels open over TCP port 5671 to exchange information. When a critical service, such as the SolarWinds Information Service, goes down, the software restarts the service. If the service goes down a second time within an hour, the software initiates a failover to the standby server.
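The restart-then-failover rule described above can be illustrated with a small sketch. This is not the SolarWinds implementation, which is not documented here. The class name, the method name, and the use of a sliding one-hour window are assumptions made only to show the decision logic.

```python
from dataclasses import dataclass, field

# "Within an hour", per the paragraph above.
RESTART_WINDOW_SECONDS = 3600


@dataclass
class ServiceWatch:
    """Tracks failures of one critical service on the active server (hypothetical)."""

    failure_times: list = field(default_factory=list)

    def on_failure(self, now: float) -> str:
        """Return 'restart' for the first failure in the window, else 'failover'."""
        # Forget failures that are an hour old or older.
        self.failure_times = [
            t for t in self.failure_times if now - t < RESTART_WINDOW_SECONDS
        ]
        self.failure_times.append(now)
        return "restart" if len(self.failure_times) == 1 else "failover"


if __name__ == "__main__":
    watch = ServiceWatch()
    print(watch.on_failure(now=0))     # first failure: restart the service
    print(watch.on_failure(now=1800))  # second failure within the hour: fail over
```

Timestamps are passed in as plain seconds so the logic can be exercised without waiting on a real clock.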
After a failover to the secondary server is complete, the secondary server becomes the active server and remains active until another failover event occurs. The secondary server assumes all of the responsibilities of the primary server, including receiving syslogs, SNMP traps, and NetFlow information through the VIP or virtual hostname. You can manually fail over to your primary server to return it to active service.
If you have deployed SolarWinds Platform Agents, agents that report to the primary server are updated with the IP addresses of the HA pool members. When the server fails over, the agents send data to the active HA pool member's IP address.
Multiple subnets (WAN)
When you configure your environment for SolarWinds Platform High Availability over a WAN (Disaster Recovery), place your secondary server in the same DNS zone as your primary server. Configure the secondary server to use the same database resources as the primary server. In the SolarWinds Platform Web Console, add both servers to an HA pool. The pool is accessed through a single virtual hostname, which routes incoming requests and messages to the currently active server. You can have only two servers in a pool.
The SolarWinds HA software monitors the health of both servers in the pool, and both servers keep communication channels open over TCP port 5671 to exchange information. When a critical service, such as the SolarWinds Information Service, goes down, the software restarts the service. If the service goes down a second time within an hour, the software initiates a failover to the standby server and edits the DNS host entry to point to the standby server.
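Because a multi-subnet failover depends on the DNS host entry being updated, you may want to confirm that the virtual hostname resolves to the server you expect to be active. The sketch below uses only Python's standard socket module. The function name, the hostname, and the IP addresses are placeholders chosen for illustration, and the resolver is passed in as a parameter so the check can be tested without a live DNS server.

```python
import socket
from typing import Callable


def virtual_hostname_points_to(
    virtual_hostname: str,
    expected_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if the virtual hostname resolves to the expected IPv4 address.

    Hypothetical helper; it is not part of the SolarWinds product.
    """
    try:
        return resolve(virtual_hostname) == expected_ip
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when resolution fails.
        return False


if __name__ == "__main__":
    # Placeholder values; substitute your pool's virtual hostname and standby IP.
    print(virtual_hostname_points_to("orion-ha.example.com", "10.0.2.20"))
```

A client may keep returning the old address until its cached DNS record expires, so a False result shortly after a failover does not necessarily mean the host entry was not updated.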
After a failover to the secondary server is complete, the secondary server becomes the active server and remains active until another failover event occurs. The secondary server assumes all of the responsibilities of the primary server, including receiving syslogs, SNMP traps, and NetFlow information through the virtual hostname. You can manually fail over to your primary server to return it to active service.
If you have deployed SolarWinds Platform Agents, agents that report to the primary server are updated with the IP addresses of the HA pool members. When the server fails over, the agents send data to the active HA pool member's IP address.