Troubleshoot a network issue caused by a network config change
Changes to configs can range from simple (using a template to create a login banner) to complex (using a script to change VLAN membership). Errors introduced in complex config changes can result in a network outage. By monitoring your network, you can learn about network problems before they affect your business-critical applications. Instant notification of an error is essential to resolving a problem before business is disrupted and support calls are logged.
In the following scenario, a support organization uses NetSuite® as its customer relationship management system. The organization uses NetPath to monitor the path from the NetSuite service to the support organization. One day during business hours, a system administrator receives an alert notification email stating that the NetSuite service is unavailable. The administrator needs to investigate and resolve the problem immediately.
The system administrator begins by reviewing the details of the alert.
Review the Active Alert Details page
An alert is a notification that indicates a problem with a monitored element. You can receive alerts in several ways. See How alerts work in NPM for more information on alerts in Orion Platform products.
In an alert notification email, click the provided link to open the Active Alert Details page. In this scenario, the alert is critical and needs to be addressed immediately. The system administrator clicks the link next to Triggered by to open NetPath and determine whether the problem is on an internal monitored device or is caused by an external provider.
Identify root cause of the problem
Use NetPath to discover and troubleshoot network paths, hop-by-hop, of the networks that you manage and the nodes and links of your providers. NetPath provides performance metrics and device details of the nodes, interfaces, and connectors it finds. Point to objects to see more details using the Object Inspector, or drill down on managed nodes.
Red marks the part of the path where a problem exists.
The system administrator notices that the affected node is:
- Part of the organization's internal network
- Experiencing high latency
- Showing a recent configuration change
To explore the configuration change, the system administrator clicks Config Change. The system compares the current config with the last backed-up config.
The config comparison shows that on line 180, an IP address was added to the current config. This routing change prevents traffic from reaching its destination and creates network performance issues.
The system administrator has now identified the root cause of the problem and knows that one solution is to revert to the last backed-up config.
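The kind of comparison described above can be illustrated outside the product with a small script. The following is a minimal sketch using Python's standard `difflib` module, not the product's own comparison logic. The function name `added_lines` and both sample configs are hypothetical and exist only for illustration.

```python
import difflib


def added_lines(backup_config: str, current_config: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for lines that are new or changed in the current config.

    Line numbers are 1-based and refer to the current config.
    """
    backup = backup_config.splitlines()
    current = current_config.splitlines()
    matcher = difflib.SequenceMatcher(a=backup, b=current, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both mean the current config has text
        # that the backup does not have at this position.
        if tag in ("insert", "replace"):
            for j in range(j1, j2):
                added.append((j + 1, current[j]))
    return added


# Hypothetical last backed-up config (illustrative only).
backup = "\n".join([
    "hostname R9",
    "interface GigabitEthernet0/0",
    " ip address 10.0.0.1 255.255.255.0",
    "router ospf 1",
    " network 10.0.0.0 0.0.0.255 area 0",
])

# Hypothetical current config with one extra routing line.
current = "\n".join([
    "hostname R9",
    "interface GigabitEthernet0/0",
    " ip address 10.0.0.1 255.255.255.0",
    "ip route 0.0.0.0 0.0.0.0 192.0.2.1",
    "router ospf 1",
    " network 10.0.0.0 0.0.0.255 area 0",
])

changes = added_lines(backup, current)
for number, text in changes:
    print(f"line {number}: + {text}")
```

Reporting line numbers relative to the current config mirrors how the scenario pinpoints the change on a specific line of the running config.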
Revert a config
The system administrator clicks the name of the node (in this case, R9) on the right side of the NetPath page to open the Node Details page. This capability is also useful if someone makes an unauthorized or incorrect config change, and you want to revert to a prior version.
To revert the config, the system administrator performs two steps:
1. Click the Configs tab.
2. Select the configuration item to revert to, and then click Upload.
NetPath refreshes a path during each polling interval. In this example, the polling interval is 10 minutes. The revert takes effect immediately and the service is restored, but NetPath does not show the updated path until the next polling interval completes.
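To estimate how long the display can lag behind the fix, the arithmetic can be sketched as follows. This is a minimal illustration that assumes polls run on a fixed schedule counted from a known last poll time. The function name `next_refresh` and the timestamps are hypothetical, and the product's actual scheduling may differ.

```python
from datetime import datetime, timedelta


def next_refresh(last_poll: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first scheduled poll time strictly after `now`.

    Assumes polls occur at last_poll, last_poll + interval, and so on.
    """
    if now < last_poll:
        return last_poll
    # Dividing one timedelta by another with // yields a whole number of intervals.
    completed = (now - last_poll) // interval
    return last_poll + (completed + 1) * interval


# Hypothetical times: last poll at 09:00, config reverted at 09:03.
last_poll = datetime(2024, 1, 1, 9, 0)
reverted_at = datetime(2024, 1, 1, 9, 3)
interval = timedelta(minutes=10)

refresh_at = next_refresh(last_poll, interval, reverted_at)
print("Path display updates at about:", refresh_at.time())
print("Wait after the revert:", refresh_at - reverted_at)
```

Under these assumptions, the wait is never longer than one full polling interval, so a 10-minute interval means the path display can lag the fix by up to 10 minutes.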
For more information on the setup necessary to replicate the troubleshooting and solution steps in this topic, see How was it done? Troubleshoot a network issue caused by a config change.