Saturday 3 August 2019

BIG-IP HIGH AVAILABILITY PART THREE

In the previous post we went through configuring HA between the two devices. In this post we'll look at how to verify that ConfigSync is working as expected. As noted in the very first post, I won't be covering failover or mirroring in this series.

Verification & Testing

At this point we should have successfully set up an active/standby BIG-IP system, as indicated by both devices showing as Online and In Sync.
We can also confirm this on the CLI:

root@(ltm-1)(cfg-sync In Sync)(Standby)(/Common)(tmos)# show cm sync-status
-------------------------------------------------------------------------------------
CM::Sync Status
-------------------------------------------------------------------------------------
Color     green
Status    In Sync
Mode     high-availability
Summary  All devices in the device group are in sync
Details
         ltm-2.lab.com: connected
         LAB-SYNC-FAILOVER-GRP (In Sync): All devices in the device group are in sync
         device_trust_group (In Sync): All devices in the device group are in sync
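
If you would rather script this check than eyeball it, here is a minimal Python sketch. It assumes you have already captured the text of show cm sync-status (for example over SSH). The function name parse_sync_status is hypothetical, and the parsing relies only on the layout shown above, which may differ between versions.

```python
def parse_sync_status(output: str) -> dict:
    """Extract the top-level fields from `show cm sync-status` text."""
    wanted = ("Color", "Status", "Mode", "Summary")
    fields = {}
    for line in output.splitlines():
        # Split the field name from the rest of the line on whitespace.
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in wanted:
            fields[parts[0]] = parts[1].strip()
    return fields


# Sample text copied from the output above.
SAMPLE = """\
CM::Sync Status
Color     green
Status    In Sync
Mode     high-availability
Summary  All devices in the device group are in sync
"""

status = parse_sync_status(SAMPLE)
print("in sync" if status.get("Status") == "In Sync" else "NOT in sync")
```

Anything other than In Sync in the Status field is treated as a failure here, which is a deliberately strict choice for a health check.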


Now let's make a change on the active node and observe what happens. I'll create a dummy node object. Before I commit the change, let's check the Commit IDs on each device using the run /cm watch-devicegroup-device command:

We can see the Commit IDs all show the same number, which tells us the devices are fully synchronised. The output also tells us which device originated the last Commit ID.

As soon as I create the object, the Commit ID is incremented by 2 on the device where I made the change.

This information is also sent to the peer via the MCP channel:

root@(ltm-1)(cfg-sync Changes Pending)(Standby)(/Common)(tmos)# run /cm sniff-updates
Listening for commit_id_update on -i internal:h port 6699 (^C to exit)
[09:43:05] ltm-2.lab.com (v11.6.0) -> LAB-SYNC-FAILOVER-GRP: UPDATE CID 46.0 (ltm-2.lab.com) at 09:43:05


At this point the system should tell us that changes are pending:

If we click the Changes Pending hyperlink and then perform the sync we can see that the Commit IDs are now back in sync between the devices:

If we run the sniff-updates command again, we see this information passed over the MCP channel once more. The output below shows the new Commit ID (46.0) and the previous one (44.0):

[09:48:34] ltm-2.lab.com (v11.6.0) -> LAB-SYNC-FAILOVER-GRP: UPDATE CID 46.0 (ltm-2.lab.com) at 09:43:05 FORCE_SYNC
[09:48:34] 10.128.2.110:49090 -> LAB-SYNC-FAILOVER-GRP: SYNC_REQ CID 44.0 (ltm-2.lab.com) at 09:37:01
[09:48:35] ltm-1.lab.com (v11.6.0) -> LAB-SYNC-FAILOVER-GRP: UPDATE CID 46.0 (ltm-2.lab.com) at 09:43:05
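
To pull the Commit IDs out of this kind of output programmatically, a rough Python sketch is below. The regular expression and the parse_update helper are my own assumptions, built only from the line layout seen here ([time] sender -> group: TYPE CID n.n (originator) at time, with an optional trailing flag). Other versions may format these lines differently.

```python
import re

# Pattern inferred from the sniff-updates lines in this post (an assumption).
LINE_RE = re.compile(
    r"^\[(?P<seen>\d{2}:\d{2}:\d{2})\]\s*"
    r"(?P<sender>\S+)(?: \((?P<version>[^)]+)\))? -> (?P<group>[^:\s]+): "
    r"(?P<kind>\w+) CID (?P<cid>\d+\.\d+) \((?P<originator>[^)]+)\) "
    r"at (?P<cid_time>\d{2}:\d{2}:\d{2})(?: (?P<flag>\w+))?$"
)


def parse_update(line: str):
    """Return the fields of one sniff-updates line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


SAMPLE = [
    "[09:48:34] ltm-2.lab.com (v11.6.0) -> LAB-SYNC-FAILOVER-GRP: "
    "UPDATE CID 46.0 (ltm-2.lab.com) at 09:43:05 FORCE_SYNC",
    "[09:48:34] 10.128.2.110:49090 -> LAB-SYNC-FAILOVER-GRP: "
    "SYNC_REQ CID 44.0 (ltm-2.lab.com) at 09:37:01",
    "[09:48:35] ltm-1.lab.com (v11.6.0) -> LAB-SYNC-FAILOVER-GRP: "
    "UPDATE CID 46.0 (ltm-2.lab.com) at 09:43:05",
]

events = [parse_update(line) for line in SAMPLE]
for event in events:
    print(event["seen"], event["sender"], event["kind"], event["cid"])
```

Lines that do not fit the pattern come back as None, so a change in the output format shows up as missing events rather than as wrong data.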


If we go to Device Management  ››  Overview and click Show Advanced View in the Devices section, we can see some more useful information:

Here is what some of the less obvious items mean:

  • CID Originator: The Commit ID originator displays the device that performed the last successful sync between the devices.
  • CID Time: The Commit ID Time displays the most recent Commit ID time. This matches the output of the command above from when we first created the dummy node and clicked Create.
  • Last Sync Time: Displays the time we initiated the ConfigSync. You'll see it matches the timestamp in the output just above the screenshot.
  • Last Sync Type: Displays the last type of sync that was performed. If this is the first sync it will show as 'Manual Full Load'. If it is an incremental sync it will display 'Manual Incremental'. Note, it uses the word 'manual' because I have disabled Auto Sync.
  • LSS Originator: This field displays the device that most recently performed a successful sync operation. Typically this will match the CID Originator.
  • LSS Time: This field displays the most recent successful sync. Typically this will match the CID Time.

Lastly on this point, you should see the following log entry, either locally or on your Syslog/SIEM:

Apr 15 09:48:34 ltm-2 notice mcpd[6806]: 0107168c:5: Incremental sync complete: This system is updating the configuration on device group /Common/LAB-SYNC-FAILOVER-GRP device %cmi-mcpd-peer-/Common/ltm-1.lab.com from commit id { 44 6409145281621881942 /Common/ltm-2.lab.com } to commit id { 46 6409146844437164723 /Common/ltm-2.lab.com }.
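
If you forward these logs to a Syslog server or SIEM, you may want to extract the old and new Commit IDs from the message. The Python sketch below is one way to do it. The regular expression and the field names (old_id, new_id and so on) are hypothetical and derived only from the single log line above, so treat it as a starting point rather than a complete parser.

```python
import re

# Pattern inferred from the one mcpd message shown in this post (an assumption).
SYNC_RE = re.compile(
    r"Incremental sync complete: .*?"
    r"device group (?P<group>\S+) .*?"
    r"from commit id \{ (?P<old_id>\d+) (?P<old_stamp>\d+) (?P<old_origin>\S+) \} "
    r"to commit id \{ (?P<new_id>\d+) (?P<new_stamp>\d+) (?P<new_origin>\S+) \}"
)

LOG_LINE = (
    "Apr 15 09:48:34 ltm-2 notice mcpd[6806]: 0107168c:5: Incremental sync "
    "complete: This system is updating the configuration on device group "
    "/Common/LAB-SYNC-FAILOVER-GRP device %cmi-mcpd-peer-/Common/ltm-1.lab.com "
    "from commit id { 44 6409145281621881942 /Common/ltm-2.lab.com } "
    "to commit id { 46 6409146844437164723 /Common/ltm-2.lab.com }."
)

match = SYNC_RE.search(LOG_LINE)
info = match.groupdict() if match else {}
if info:
    print(f"{info['group']}: commit id {info['old_id']} -> {info['new_id']}")
```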

At this point the new configuration should now be on both boxes.

Summary

High availability on a BIG-IP system is clearly a big topic with a number of moving parts. If it is implemented properly and you have a good understanding of the process, troubleshooting any issues should be a little easier.

Thanks for reading.





