In this post, I’ll discuss how to protect your income by using the FEX pre-provisioning capability of NXOS. I discovered the hard way that not pre-provisioning your FEX can have catastrophic side effects. What better story to post on Friday the 13th?
Attaching a FEX to a Nexus switch is relatively simple; a few commands on each of the two switches the FEX connects to and it’s up and running. It’s also possible to pre-provision the FEX modules in the configuration. The documentation doesn’t make it entirely clear why this would be desirable, beyond the rather cryptic:
In some Virtual Port Channel (vPC) topologies, pre-provisioning is required for the configuration synchronization feature. Pre-provisioning allows you to synchronize the configuration for an interface that is online with one peer but offline with another peer.
Got that? In other words, pre-provisioning makes it possible to configure a FEX module that isn’t there yet, or that is powered down, or is only connected to one side of a VPC pair for some inexplicable reason. Maybe I’ve ordered some FEXen (plural of FEX) and want to configure the ports ahead of time? Whatever the rationale for doing so, I’ve never previously needed pre-provisioning for FEX modules, and working this way has never bitten me. Or, I should say, had never bitten me.
Replacing a Nexus Switch
I wrote a post earlier this year called No Hassle Hardware Replacement with DCNM. I stand by that post, but there is one really important issue which I did not take into account.
A few months ago I had to RMA a Nexus switch which had FEXen attached. I followed the same process I described in my DCNM post; I configured a serial number, identified the NXOS version to install and uploaded the configuration from the switch which was being replaced. I powered up the switch and DCNM performed its magic, upgrading the code, and sending the configuration to the switch. I checked the uplinks to the fabric spine, the peering with neighboring switches, and the connectivity to the attached compute stacks and all was fine. Five minutes later, the red alert klaxon was sounding and it was obvious that something had gone very wrong.
Here’s a simplified version of what happens when DCNM performs Power On Auto Provisioning (POAP):
- Loads the desired software image to the Nexus switch
- Sets the boot parameters to load the new software on next reload
- Installs the switch config as a ‘scheduled configuration’ to read after reload
The scheduled configuration is smarter than it might sound. Imagine that the Nexus switch is currently running 4.x, the desired version of code is 5.x, and the configuration contains commands that are only available in 5.x. If the configuration were applied while the switch was still running 4.x, the 5.x-only commands would fail. A scheduled configuration is therefore loaded only after the switch has reloaded and booted from the new software version (5.x), at which point the commands are valid. Clever stuff.
And The Problem Is?
The scheduled configuration on the Nexus switch gets parsed and installed before the attached FEXen have completed loading and are online. As a result, all configurations referring to ports on FEX modules are rejected because they refer to invalid port numbers. That’s not good, but let’s not worry because the other switch in the Nexus VPC pair is still up and running with the full configuration, right?
VPC consistency is an interesting beast. FEX ports have to be configured identically on both switches in order for them to work; if they aren’t configured identically, they, uh, get suspended. The scheduled configuration (which has loaded with all the FEX port configuration rejected) now means my two switches are out of sync, so all of the FEX ports on the second switch go into suspended mode. This is not good, as is probably obvious, because now all the ports on all attached FEXen have gone down, and so have all the servers connected to them.
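To make the mismatch concrete, here is a minimal Python sketch that compares the FEX host-port configuration in two saved configs, one per VPC peer. It is only a text diff, not a reimplementation of the real VPC consistency checks. It assumes FEX host ports are named Ethernet<FEX number>/1/<port> with FEX numbers in the 100s, and the function names are made up for this example.

```python
import re

# Assumption: FEX host ports look like Ethernet101/1/1 (FEX numbers in the 100s).
FEX_PORT = re.compile(r"^interface\s+(Ethernet1\d\d/\d+/\d+)\s*$", re.IGNORECASE)


def fex_port_config(config_text):
    """Map each FEX host interface to the set of commands configured under it."""
    ports = {}
    current = None
    for line in config_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A non-indented line starts a new top-level stanza.
            match = FEX_PORT.match(line)
            current = match.group(1) if match else None
            if current:
                ports[current] = set()
        elif current:
            # Indented lines belong to the FEX interface stanza we are inside.
            ports[current].add(line.strip())
    return ports


def mismatched_fex_ports(config_a, config_b):
    """List FEX interfaces whose configuration differs between the two peers."""
    a, b = fex_port_config(config_a), fex_port_config(config_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Run against the surviving peer's config and the config the replacement switch actually ended up with, every FEX port would show up in the list, because the second config has no FEX port stanzas at all.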
The Solution: Pre-Provisioning
Pre-provision your FEXen! For example, with one provision command under the slot XXX stanza of each of two FEXen:
provision model N2K-C2248T
provision model N2K-C2248T
When the scheduled configuration loads with pre-provisioning commands in it, the Nexus now knows how many ports (and what kind) to pre-allocate on what virtual slot, so the configuration doesn’t get rejected. Problem solved!
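The effect can be illustrated with a toy model in Python. This is a sketch of the behavior described above, not of how NXOS actually parses a configuration: it accepts a FEX interface stanza only if an earlier slot stanza has pre-provisioned that FEX. The slot number 101, the function name, and the port-count table are assumptions made for the illustration.

```python
import re

# Assumption for this sketch: host-port count per FEX model.
HOST_PORTS = {"N2K-C2248T": 48}


def load_config(config_text):
    """Toy model of loading a config before any FEX has come online.

    Returns (accepted, rejected) lists of top-level stanza headers.
    """
    provisioned = {}  # slot (FEX) number -> number of host ports pre-allocated
    accepted, rejected = [], []
    slot = None
    for line in config_text.splitlines():
        text = line.strip()
        if not text:
            continue
        if line[0].isspace():
            # Indented line: only "provision model" under a slot matters here.
            m = re.match(r"provision model (\S+)", text)
            if m and slot is not None:
                provisioned[slot] = HOST_PORTS.get(m.group(1), 0)
            continue
        slot = None
        m = re.match(r"slot (\d+)$", text)
        if m:
            slot = int(m.group(1))
            accepted.append(text)
            continue
        m = re.match(r"interface Ethernet(\d+)/1/(\d+)$", text)
        if m and int(m.group(1)) >= 100:
            # FEX host port: valid only if the slot has been pre-provisioned.
            fex, port = int(m.group(1)), int(m.group(2))
            ok = 1 <= port <= provisioned.get(fex, 0)
            (accepted if ok else rejected).append(text)
        else:
            accepted.append(text)
    return accepted, rejected
```

Feeding it a config with an interface Ethernet101/1/1 stanza and no slot 101 stanza puts that interface in the rejected list; adding the slot and provision lines ahead of it moves it to the accepted list.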
It Is Known
I should note that this is not a bug; this is expected behavior and Cisco notes this in the VPC Operations Guide which I’m sure we’ve all read carefully. The documentation provides a good set of steps to follow, but they are impractical where DCNM is doing POAP. The guide also discusses needing configuration sync enabled. For a variety of reasons I don’t use configuration sync, but the rest of the steps are still relevant.
The Cisco Nexus engineering team says that this behavior with a scheduled configuration is expected and is by design, so there’s nothing to fix in NXOS. I would argue that it wouldn’t be rocket science for the switch to look at the config and say “Oh, wait a moment, these fex-fabric ports suggest that maybe we have a FEX coming online and I should wait to apply anything on this slot until I see something.” Maybe that would make things worse.
The DCNM engineering team also sees nothing to fix on the DCNM side; it is successfully delivering the configuration as requested. Consequently I’ve made a feature request. In DCNM when a configuration is uploaded for POAP, DCNM looks at the configuration and extracts the hostname and the management IP so that those data can be displayed in the POAP status tables. I’ve asked that DCNM go one step further and look for the command switchport mode fex-fabric in the configuration while not also seeing slot XXX followed by provision ... in the same configuration. If that’s the case, it’s evident that a FEX is, or is intended to be, connected to the switch but pre-provisioning has not been configured. While DCNM would not be able to automatically insert pre-provisioning commands for you, would it hurt to pop up an alert which says “You have uploaded a configuration containing FEX interfaces but no pre-provisioning configuration exists. This may take down all FEX ports when deployed!” Maybe it will happen, though I’m sure it’s a low priority feature request.
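In the meantime, the same check is easy enough to run yourself before uploading a configuration. The following is a minimal Python sketch of the idea, not anything DCNM provides: it looks for switchport mode fex-fabric and for a slot stanza containing a provision model command, and returns a warning when the first is present without the second. The function name is hypothetical.

```python
import re

WARNING = (
    "You have uploaded a configuration containing FEX interfaces but no "
    "pre-provisioning configuration exists. This may take down all FEX "
    "ports when deployed!"
)


def missing_fex_preprovisioning(config_text):
    """Return a warning if fex-fabric ports exist but no FEX is pre-provisioned."""
    # Any interface configured as a fabric uplink to a FEX.
    has_fex_fabric = re.search(
        r"^\s+switchport mode fex-fabric\s*$", config_text, re.MULTILINE
    )
    # Any "slot <n>" stanza immediately followed by "provision model <model>".
    has_provision = re.search(
        r"^slot \d+\s*\n\s+provision model \S+", config_text, re.MULTILINE
    )
    if has_fex_fabric and not has_provision:
        return WARNING
    return None
```

This is deliberately coarse: it only checks that some pre-provisioning exists, not that every FEX referenced in the configuration has its own slot stanza.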
In conclusion: pre-provision your FEXen if you’re going to do POAP or otherwise activate a configuration prior to the FEX modules coming online. Read the manuals, perhaps? Either way, lesson learned.