Documentation for SolarWinds Platform

Group status

When you create a group, you can set its status rollup mode. The rollup mode determines how the group’s status is derived from the status of its members. The choices are Best status, Worst status, and Mixed status. Best status shows the group status as the best status of any group member, disregarding the status of all other members. Worst status shows the group status as the worst status of any member, again disregarding the status of all other members. Mixed status shows the group with a single status when all members share that status, and with Warning when members have differing statuses. Here are some conclusions you can draw according to the status rollup you choose.

  • For Best status rollup:
    • If the group status is Critical, all members are in the Critical state.
    • If the group’s status is anything other than Critical, at least one member is in the displayed state. No members are better than the displayed status. It is also possible that all members are in this state.
  • For Worst status rollup:
    • If the group status is Up, all members are in the Up state.
    • If the group’s status is anything other than Up, at least one member is in the displayed state. No members are worse than the displayed status. It is also possible that all members are in this state.
  • For Mixed status rollup:
    • If all members are in the same status, that status will be the group status.
    • If the group status is Warning, either the group contains items of differing status (most common), or all members have the status of Warning (uncommon).
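The three rollup modes described above can be sketched in a few lines of Python. This is only an illustration, not SolarWinds code: the function name `rollup`, the mode strings, and the three-level severity order (Up, Warning, Critical) are assumptions made for the example, and the real platform recognizes more statuses than these.

```python
# Assumed severity order, best to worst. A simplification for illustration only.
SEVERITY = ["Up", "Warning", "Critical"]


def rollup(mode, member_statuses):
    """Return the group status for a rollup mode: 'best', 'worst', or 'mixed'."""
    if not member_statuses:
        raise ValueError("group has no members")
    if mode == "best":
        # Best: the least severe status of any member.
        return min(member_statuses, key=SEVERITY.index)
    if mode == "worst":
        # Worst: the most severe status of any member.
        return max(member_statuses, key=SEVERITY.index)
    if mode == "mixed":
        # Mixed: the shared status if all members agree, otherwise Warning.
        distinct = set(member_statuses)
        return distinct.pop() if len(distinct) == 1 else "Warning"
    raise ValueError(f"unknown rollup mode: {mode}")


members = ["Up", "Critical", "Up"]
print(rollup("best", members))   # Up
print(rollup("worst", members))  # Critical
print(rollup("mixed", members))  # Warning
```

Note that under Mixed rollup a group whose members are all in Warning also reports Warning, which is why Warning alone cannot tell you whether the members differ.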

While this may seem complicated, the logic for choosing which type of rollup to use is fairly straightforward.

  • For a group where every direct member (not a member of a subgroup) must be in the Up state, choose Worst status rollup. This ensures that if any member has an issue, you will see that reflected in the group status and in any alerts or reports created for that group.

  • For groups with redundant member resources, such as a dual-attached WAN, choose Mixed or Worst status rollup, depending on the criticality of a worst-case, single-member failure.
  • For groups with a high level of redundancy throughout all direct members, choose Best status rollup.

The reason we distinguish direct members from subgroup members is to allow the group status rollup to be additive, from the lowest-level subgroup up to the top-level group. Take the following datacenter (DC1) as an example.

Let’s create one top-level group called DC1 and then create member subgroups for all the like items. The choices are many. We could create groups for each redundant server pair or a group for all redundant server pairs. The non-redundant servers could exist as individual objects or as one or more groups. To plan how to arrange these, we first consider our goal: to manage the data center network. This means that at this time we are not concerned with the processes that are enabled by the network, just that the network is available and performing well. In examining the naming conventions for the data center switches, we find that they have well-planned and consistent device names, as follows:

  • All core switches are named DC1-core-xx, where xx is the core switch number.
  • All service switches are named DC1-service-xx, where xx is the service switch number.
  • All distribution switches are named DC1-dist-xx, where xx is the distribution switch number.

With this in mind, we create three dynamic service groups for the above items. The first is a dynamic service group named DC1 Core, defined by the query “system name contains DC1-core”. Similar groups are made for the service switches and distribution switches.

Now we’ll look at the servers. In speaking with the server owners, we learn that they don’t care whether a server is redundant or not. If a server is having a problem, they want to be able to see that from all levels. With this information, we set the server subgroups to all show the worst member status as the group status. We find we can add the redundant servers using a dynamic query, but unfortunately we are unable to identify any common and unique qualifiers for the non-redundant servers, so these servers will be added statically, as individual members of a DC1-non-redundant-servers subgroup of DC1.

The only items left are the switch ports and links to the corporate network. These have good, consistent port descriptions, which allow us to create port and port type groups. Seeing this, we create dynamic groups for the corporate network ports, the core ports, and the distribution ports.
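To make the idea of a dynamic group concrete, here is a minimal Python sketch of a “system name contains” query evaluated against a node inventory. It does not use any SolarWinds API: the `dynamic_members` helper, the sample node names, and the group names other than DC1 Core are hypothetical and only follow the naming convention described above.

```python
def dynamic_members(node_names, name_contains):
    """Return the node names that contain the given substring."""
    return [name for name in node_names if name_contains in name]


# Hypothetical inventory that follows the DC1 naming convention.
inventory = [
    "DC1-core-01", "DC1-core-02",
    "DC1-service-01",
    "DC1-dist-01", "DC1-dist-02",
    "HQ-core-01",  # a switch in another site, matched by no DC1 query
]

# One "system name contains ..." query per dynamic group.
queries = {
    "DC1 Core": "DC1-core",
    "DC1 Service": "DC1-service",
    "DC1 Distribution": "DC1-dist",
}

groups = {name: dynamic_members(inventory, q) for name, q in queries.items()}
print(groups["DC1 Core"])  # ['DC1-core-01', 'DC1-core-02']

# A newly added switch joins its group the next time the query is evaluated.
inventory.append("DC1-core-03")
print(dynamic_members(inventory, queries["DC1 Core"]))
```

The non-redundant servers have no such common substring, which is why the text adds them statically instead.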

Looking at all the network equipment and its heavy use of connection redundancy, we take the same path as the server teams and set each subgroup’s status to reflect the worst member status. Then we also set the status of the DC1 group to the status of the worst member. Here we have taken the most conservative approach to managing the group status. When any object in any of these groups fails or slows enough to trigger a threshold, we will see that status reflected as the status of DC1. But is this a wise idea? While we do want to quickly find and identify the failed element in DC1, with the group status set to Worst, DC-1 may show Up (green), Warning (yellow), or Down (red), when in all three of these cases DC-1 as a whole may be perfectly operational. A better choice would be to keep the subgroups as Worst status and set the status of DC-1 to Mixed. In so doing, DC-1 will be green when every element of the group is Up and will have a Warning status if there are elements with a status lower than Up. Perhaps the worst choice would be to set the DC-1 group status to Best. If we did this, DC-1 would show an Up status as long as a single member is Up, even if every other member has failed.
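The effect of the top-level choice can be seen in a small Python sketch of nested rollup. It is an illustration under stated assumptions, not the platform’s implementation: the `Group` class, the `dc1` helper, the two sample subgroups, and the Up/Warning/Down severity order are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed severity order, best to worst. A simplification for illustration only.
SEVERITY = ["Up", "Warning", "Down"]


@dataclass
class Group:
    name: str
    mode: str  # "best", "worst", or "mixed"
    members: list = field(default_factory=list)  # status strings or nested Groups

    def status(self):
        # Resolve subgroups first, so status rolls up from the lowest level.
        statuses = [m.status() if isinstance(m, Group) else m for m in self.members]
        if self.mode == "best":
            return min(statuses, key=SEVERITY.index)
        if self.mode == "worst":
            return max(statuses, key=SEVERITY.index)
        # Mixed: the shared status if all members agree, otherwise Warning.
        return statuses[0] if len(set(statuses)) == 1 else "Warning"


def dc1(top_mode, core_statuses):
    """Build a two-subgroup DC-1 with Worst subgroups and the given top-level mode."""
    core = Group("DC1 Core", "worst", core_statuses)
    dist = Group("DC1 Distribution", "worst", ["Up", "Up"])
    return Group("DC-1", top_mode, [core, dist])


# One of two redundant core switches is Down.
for mode in ("worst", "mixed", "best"):
    print(mode, dc1(mode, ["Up", "Down"]).status())
# worst Down
# mixed Warning
# best Up
```

With one redundant core switch down, Worst reports the whole datacenter as Down, Best hides the failure entirely, and Mixed flags it as Warning.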

Here, we have created one top-level group and ten subgroups, all at the second level. You can nest subgroups within other subgroups as deeply as you want. There is no hard limit, but as you nest groups deeper, the logic becomes more complex. Objects can also be members of multiple groups.

We recommend implementing reports that show group membership and carefully examining existing and planned groups.
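As a rough idea of what a group membership report contains, the following Python sketch inverts a group-to-members mapping to show which groups each object belongs to. It does not query the platform: the `membership_report` function, the group names, and the object names are hypothetical sample data.

```python
from collections import defaultdict


def membership_report(groups):
    """Map each object to the list of groups it belongs to."""
    report = defaultdict(list)
    for group_name, members in groups.items():
        for obj in members:
            report[obj].append(group_name)
    return dict(report)


# Hypothetical groups; DC1-core-01 is deliberately a member of two of them.
groups = {
    "DC1 Core": ["DC1-core-01", "DC1-core-02"],
    "Critical Switches": ["DC1-core-01", "HQ-core-01"],
}

report = membership_report(groups)

# Objects that belong to more than one group deserve a closer look.
shared = {obj: names for obj, names in report.items() if len(names) > 1}
print(shared)  # {'DC1-core-01': ['DC1 Core', 'Critical Switches']}
```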

A couple of rules of thumb should be considered when creating subgroups.

  1. Determine if you can accomplish the same goal without using subgroups.
  2. Keep the subgrouping as flat and as simple as possible. The more subgrouping levels, the more difficult it is to understand the logic flow from one level to higher levels. Depending on the complexity of the subgroups, the complexity of the logic can grow as much as n², where n is the number of group layers. The dependencies logic will also become complicated.

Perhaps after setting up this grouping and rolling out the status to maps and user views, you get a complaint from the inventory management department. Their complaint is that it is hard to see in the current grouping whether a problem directly affects the data processing they do in the datacenter. Because they are such a small portion of the data center, they must investigate what caused DC-1’s status to change to see if any of their critical devices are in trouble. This is time-consuming and causes what they are calling false positives.

This department uses two non-redundant application servers, the clustered database, an IP path to the offsite input web portal, and an IP path to a business partner connection. They don’t really care to see that a redundant link or redundant piece of equipment is down. They just want to know whether the inventory management system is working or not. With this in mind, here is what we create. First, we create a DC-1 Inventory-Mgmt group. Then, we add the same groups for the entire redundant network infrastructure as we did in the DC-1 group, but we set the status rollup of each to Best. This is because the department is only interested in knowing that the datacenter network works for what they need. With the high level of redundancy, chances are the Best rollup status for these items will always be Up. We don’t need to add the redundant servers, as the department doesn’t use those. Then, we add the DB cluster and the individual application servers as individual static members of the DC-1 Inventory-Mgmt group. Now, we set the DC-1 Inventory-Mgmt group rollup status. Because the servers are non-redundant, we need to show when there is a problem with them or with the database. Therefore, we set the DC-1 Inventory-Mgmt top-level rollup to Worst. Now if any single non-redundant portion fails, or if there is a complete failure across a redundant portion of the network, the top-level group will indicate a failure in Inventory Management within the datacenter. But what about the partner connection and web interface? If we could add the business partner interface management and portal testing as part of our group, this would give a much more complete picture of the ability of the network to support the Inventory Management business task.

The group function built into Orion core allows you to add objects from Orion modules.

What we do now is create an ICMP echo IP SLA operation in Orion IP SLA Manager from a point in the datacenter to the internal business partner connection port. Intra-datacenter IP traffic normally has round trip times measured in microseconds, so it doesn’t really matter where in the datacenter we place the operation. After creating the operation, we add it to the DC-1 Inventory-Mgmt group. Next, using the Orion Application Monitor, we add a user experience test for logging into the inventory manager web interface. This test is then added as an object to the DC-1 Inventory-Mgmt group. We may also add statistics on the application servers’ volumes to the DC-1 Inventory-Mgmt group. Here is the final grouping.

  • DC-1 Inventory-Mgmt Group. Status Rollup = Worst member status
    • App server #1 as member
    • App server #2 as member
    • All server volumes (#1 and #2) as members
    • All direct connections to server as individual members
    • IP SLA ICMP echo operation as member
    • APM web test as member
    • Core switch group as member. Status rollup = Best member status
    • Core switch ports group as member. Status rollup = Best member status
    • Service switch group as member. Status rollup = Best member status
    • Service switch ports group as member. Status rollup = Best member status
    • Distribution switch ports group as member. Status rollup = Best member status
    • Distribution switch group as member. Status rollup = Best member status
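The behavior of this final grouping can be checked with a short Python sketch. It models only part of the list above, using nested tuples rather than any SolarWinds API: the `status_of` function, the object names, the two-switch subgroups, and the Up/Warning/Down severity order are assumptions made for illustration.

```python
# Assumed severity order, best to worst. A simplification for illustration only.
SEVERITY = ["Up", "Warning", "Down"]


def status_of(member, observed):
    """Resolve a member's status.

    member is an object name (str) or a (name, mode, members) tuple.
    observed maps object names to statuses; unlisted objects count as Up.
    Only the 'best' and 'worst' modes used in this grouping are handled.
    """
    if isinstance(member, str):
        return observed.get(member, "Up")
    _name, mode, members = member
    statuses = [status_of(m, observed) for m in members]
    pick = min if mode == "best" else max
    return pick(statuses, key=SEVERITY.index)


# Hypothetical object names; subgroups use Best, the top level uses Worst.
inventory_mgmt = ("DC-1 Inventory-Mgmt", "worst", [
    "app-server-1", "app-server-2", "ip-sla-echo", "apm-web-test",
    ("Core switch group", "best", ["DC1-core-01", "DC1-core-02"]),
    ("Distribution switch group", "best", ["DC1-dist-01", "DC1-dist-02"]),
])

# One redundant core switch fails: the department sees no change.
print(status_of(inventory_mgmt, {"DC1-core-01": "Down"}))   # Up
# A non-redundant application server fails: the group goes Down.
print(status_of(inventory_mgmt, {"app-server-2": "Down"}))  # Down
# Both core switches fail: the redundancy is exhausted, so the group goes Down.
print(status_of(inventory_mgmt, {"DC1-core-01": "Down", "DC1-core-02": "Down"}))  # Down
```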

You would probably further group the switch ports by function (inter-core connection, core to service, service to distribution, et cetera), but I have not added those as they are not necessary to show the functions of groups and subgroups. So, we have created a datacenter network management group and an inventory management business process management group. Each has its own goal and functions to meet the needs of each party. There are objects that are members of multiple groups, groups with different rollup statuses, and both static and dynamic members of the groups. Using groups is a powerful feature for organizing objects and adding logic to the relationships between objects. It also enables another powerful feature of Orion: dependencies.