I’m writing this down because I always forget the steps, and rebuilding one of these causes me more pain than it should. It has happened 3 times in the 8 years I’ve been dealing with the physical 1600 LTMs. Every one of them failed because of some power problem that wouldn’t let it start up completely, and each time I ended up spending 8 or more hours rebuilding the unit and figuring out what the heck happened to it. Luckily they have always been in a fault-tolerant pair, so I haven’t been down completely, but I have never wanted to push how long one stays down because of how important they are to my company.
Call into Support and open a ticket with the s/n of the failed unit and the error message on the screen.
If you don’t already have enhanced 4-hour replacement, ask to upgrade to it via credit card. Waiting more than 4 hours is very painful and dangerous for us.
Wait 4 hours for the new unit to come in.
Unrack the failed unit, making sure that all of the cables are correctly labeled and ready to be plugged into the new unit.
Download the current version ISO along with any hot fixes to match the current install version. Download your latest backup for the unit and have it all ready and waiting to go on your laptop.
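Since a bad download is the last thing you want to discover when the replacement shows up, here is a minimal sketch for checking an image on the laptop ahead of time. It assumes you also downloaded the matching `.md5` file, that it sits next to the image, and that it uses the usual `<hash>  <filename>` layout that `md5sum -c` understands. The function name `verify_image` and the file name in the usage line are made up for illustration.

```shell
# Verify a downloaded image against the .md5 file sitting next to it.
# Assumption: the .md5 file is named "<image>.md5" and is in md5sum format.
verify_image() {
    image="$1"
    dir=$(dirname "$image")
    base=$(basename "$image")
    # md5sum -c resolves the file name inside the .md5 relative to the
    # current directory, so run the check from the image's directory.
    ( cd "$dir" && md5sum -c "$base.md5" >/dev/null 2>&1 )
}

# Usage (hypothetical file name):
#   verify_image ./downloads/image.iso && echo "checksum OK"
```

The function returns non-zero if the hash does not match or the `.md5` file is missing, so it can gate the rest of your prep.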
On the active unit, make sure to clear out any SSH keys from the failover interface if needed.
Also reset the Device Trust under Device Management/Device Trust on the active unit.
When the new unit finally arrives, rack it and plug in at least the serial cable and the management ethernet cable. If it came with a recovery USB stick that has the version of LTM you need on it, plug that in before powering on. This greatly simplifies the upgrade process and gets the unit to at least the major version you need.
Once the unit has been upgraded to at least the major base version that you need, log in via the serial console with root/default and type config. This will let you set the management IP address for the unit.
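If you would rather paste commands into the serial console than step through the config utility, the sketch below prints the tmsh equivalents so you can review them first. Treat it as an assumption-laden alternative, not a replacement for the step above: the helper name `mgmt_commands` is made up, the addresses are documentation placeholders, and depending on the version you may have to delete the existing management address before creating a new one.

```shell
# Print (not run) the tmsh commands that set the management address.
# $1 = management address with prefix, $2 = default gateway.
mgmt_commands() {
    addr="$1"
    gw="$2"
    case "$addr" in
        */*) ;;  # looks like address/prefix, carry on
        *) echo "address must include a prefix, e.g. 192.0.2.10/24" >&2
           return 1 ;;
    esac
    printf 'tmsh create sys management-ip %s\n' "$addr"
    printf 'tmsh create sys management-route default gateway %s\n' "$gw"
    printf 'tmsh save sys config\n'
}

# Usage (placeholder addresses):
#   mgmt_commands 192.0.2.10/24 192.0.2.1
```

Printing instead of executing keeps the helper usable from the laptop, where tmsh does not exist.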
Once the management address is set, connect to it via the browser with admin/default and start going through the licensing and configuration process.
Upload the hot fixes if necessary to the replacement unit and update to the version needed to restore the backup file. Once the hot fixes are done updating, go ahead and restore the backup to the replacement unit.
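The restore can also be done from the command line. The sketch below refuses to load the backup unless the running version matches the one you expect. It assumes `tmsh show sys version` prints a `Version` line with the number in the second column, and it only compares that field, so check the hot fix level yourself. The `no-license` option keeps the license already on the replacement unit instead of the one inside the backup. The function names and the path in the usage line are hypothetical.

```shell
# Pull the Version field out of `tmsh show sys version` style output on stdin.
parse_version() {
    awk '$1 == "Version" { print $2; exit }'
}

# Load a UCS backup only if the running version matches the expected one.
# $1 = expected version, $2 = path to the UCS file on the unit.
restore_if_version_matches() {
    want="$1"
    ucs="$2"
    have=$(tmsh show sys version | parse_version)
    if [ "$have" != "$want" ]; then
        echo "version mismatch: have $have, want $want" >&2
        return 1
    fi
    # no-license: do not overwrite the replacement unit's own license.
    tmsh load sys ucs "$ucs" no-license
}

# Usage, on the replacement unit (hypothetical version and path):
#   restore_if_version_matches 11.6.0 /var/local/ucs/backup.ucs
```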
Hook up the failover ethernet cable.
Set the HA configuration back up between the units and ensure that you can ssh between the units on their failover interfaces.
Push the configuration from the active unit to the new unit with an override. If it fails or anything goes wrong along the way, run this command on the new unit to see what the issue is:
tmsh show cm sync-status
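Rather than rerunning that by hand, a small polling loop can watch for the pair to settle. This is a sketch that assumes the output contains a `Status` line reading `In Sync` once the units agree. The function names, retry count, and delay are my own choices.

```shell
# Pull the text after "Status" out of `tmsh show cm sync-status` output.
sync_state() {
    awk '$1 == "Status" { $1 = ""; sub(/^ +/, ""); print; exit }'
}

# Poll until the unit reports "In Sync" or we run out of tries.
# $1 = number of tries (default 30), $2 = seconds between tries (default 10).
wait_for_sync() {
    tries="${1:-30}"
    delay="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        state=$(tmsh show cm sync-status | sync_state)
        [ "$state" = "In Sync" ] && return 0
        echo "sync state: $state" >&2
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Usage, on either unit:
#   wait_for_sync 30 10 && echo "pair is in sync"
```

If the state never settles, the override push can be repeated from the active unit with `tmsh run cm config-sync force-full-load-push to-group my-failover-group`, where `my-failover-group` is a placeholder for your own device group name.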
Once it’s all done and happy it should be back in sync and in an active/standby state.
Then plug in the last of the cables for the internal/external interfaces and then you should be done.
Pack the old unit up and ship it out.