Juniper In-Service Software Downgrade (ISSD)

In which we discover that the “U” in “ISSU” really does mean “Upgrade”.

I was experimenting with ISSU (In-Service Software Upgrade) on the Juniper SRX platform this week. When it works, it’s great: a seamless upgrade of both devices in a cluster, with the SRX firewalls handily sharing state halfway through the process even though the two cluster members are, at that moment, running different versions of Junos.

But what if something breaks after your upgrade and you need to return to the old version? ISSD? I’m afraid not.

Computer Says No

So it turns out that you cannot run ISSU if the version of code you are trying to install is earlier than the version currently running. There is, I’m sure, a good reason for this, but oddly the error messages I got did not indicate that this was the problem.

This testing was performed on an SRX5600 cluster, so the slot numbers are 0–5 (node 0) and 6–11 (node 1).
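To make that numbering concrete, here is a minimal Python sketch (mine, not anything Juniper ships) that splits a cluster-wide FPC number back into a node and a local slot. The helper names and the six-slots-per-node constant are assumptions based on the SRX5600 numbering described above; other chassis will use a different offset.

```python
import re

# Assumption: SRX5600 cluster numbering as described above,
# slots 0-5 on node 0 and slots 6-11 on node 1.
SLOTS_PER_NODE = 6

# Matches names such as ge-9/0/15 (type-fpc/pic/port).
_IFNAME = re.compile(r"^[a-z]+-(\d+)/(\d+)/(\d+)$")


def locate_fpc(ifname, slots_per_node=SLOTS_PER_NODE):
    """Return (node, local_slot) for a cluster-wide interface name."""
    match = _IFNAME.match(ifname)
    if match is None:
        raise ValueError(f"not a type-fpc/pic/port interface name: {ifname!r}")
    fpc = int(match.group(1))
    node, local_slot = divmod(fpc, slots_per_node)
    if node > 1:
        raise ValueError(f"fpc {fpc} is beyond a two-node cluster")
    return node, local_slot


def valid_standalone(ifname, slots_per_node=SLOTS_PER_NODE):
    """True if the name would also be in range on a single, unclustered chassis."""
    node, _ = locate_fpc(ifname, slots_per_node)
    return node == 0


if __name__ == "__main__":
    for name in ("ge-2/0/0", "ge-9/0/0", "ge-9/0/15"):
        print(name, locate_fpc(name), valid_standalone(name))
```

With this numbering, ge-9/0/0 is simply slot 3 on node 1, which is perfectly legal in a cluster but out of range for anything that thinks it is looking at a lone six-slot chassis.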

What I Hoped Would Happen

john@LabFW01> request system software in-service-upgrade <file> reboot
...<downgrade occurs>...
Installation complete.

What Should Happen

john@LabFW01> request system software in-service-upgrade <file> reboot
Error: Cannot use ISSU to install an earlier version of Junos.

What Actually Happened

john@LabFW01> request system software in-service-upgrade <file> reboot
Chassis ISSU Started
Chassis ISSU Started
ISSU: Validating Image
Initiating in-service-upgrade
Initiating in-service-upgrade
Checking compatibility with configuration
Initializing...
Verified manifest signed by PackageProduction_11_4_0
Verified junos-11.4R9.4-domestic signed by PackageProduction_11_4_0
Using /var/tmp/junos-srx5000-11.2R5.4-domestic.tgz
Checking junos requirements on /
Available space: 290116 require: 4796
Saving boot file package in /var/sw/pkg/junos-boot-srx5000-11.2R5.4.tgz
Verified manifest signed by PackageProduction_11_2_0
Hardware Database regeneration succeeded
Validating against /config/juniper.conf.gz
/config/juniper.conf:145:(12) fpc value outside range 0..5 for '9/0/0' in 'ge-9/0/0' at 'ge-9/0/0'
  [edit interfaces]
    'ge-9/0/0 {'
      fpc value outside range 0..5 for '9/0/0' in 'ge-9/0/0'
/config/juniper.conf:150:(5) error recovery ignores input until this point at '}'
  [edit interfaces]
    '}'
      error recovery ignores input until this point
/config/juniper.conf:151:(12) fpc value outside range 0..5 for '9/0/2' in 'ge-9/0/2' at 'ge-9/0/2'
  [edit interfaces]
    'ge-9/0/2 {'
      fpc value outside range 0..5 for '9/0/2' in 'ge-9/0/2'
/config/juniper.conf:156:(5) error recovery ignores input until this point at '}'
  [edit interfaces]
    '}'
      error recovery ignores input until this point
/config/juniper.conf:167:(25) fpc value outside range 0..5 for '9/0/15' in 'ge-9/0/15' at 'ge-9/0/15'
  [edit interfaces fab1 fabric-options member-interfaces]
    'ge-9/0/15;'
      fpc value outside range 0..5 for '9/0/15' in 'ge-9/0/15'
warning: statement must contain additional statements
Validation failed
Validating against /config/rescue.conf.gz
/config/rescue.conf.gz:145:(12) fpc value outside range 0..5 for '9/0/0' in 'ge-9/0/0' at 'ge-9/0/0'
  [edit interfaces]
    'ge-9/0/0 {'
      fpc value outside range 0..5 for '9/0/0' in 'ge-9/0/0'
/config/rescue.conf.gz:150:(5) error recovery ignores input until this point at '}'
  [edit interfaces]
    '}'
      error recovery ignores input until this point
/config/rescue.conf.gz:151:(12) fpc value outside range 0..5 for '9/0/2' in 'ge-9/0/2' at 'ge-9/0/2'
  [edit interfaces]
    'ge-9/0/2 {'
      fpc value outside range 0..5 for '9/0/2' in 'ge-9/0/2'
/config/rescue.conf.gz:156:(5) error recovery ignores input until this point at '}'
  [edit interfaces]
    '}'
      error recovery ignores input until this point
/config/rescue.conf.gz:167:(25) fpc value outside range 0..5 for '9/0/15' in 'ge-9/0/15' at 'ge-9/0/15'
  [edit interfaces fab1 fabric-options member-interfaces]
    'ge-9/0/15;'
      fpc value outside range 0..5 for '9/0/15' in 'ge-9/0/15'
warning: statement must contain additional statements
Validation failed
WARNING: Current configuration not compatible with /var/tmp/junos-srx5000-11.2R5.4-domestic.tgz
Exiting in-service-upgrade window
Exiting in-service-upgrade window
Chassis ISSU Aborted
Chassis ISSU Aborted
Chassis ISSU Aborted
ISSU: IDLE
ISSU aborted; exiting ISSU window.

{primary:node0}
john@LabFW01>

My favorite error message from that lot?

warning: statement must contain additional statements

I’ll get right on that. So anyway, if you didn’t notice, the config validation failed because apparently references to slot 9 aren’t valid in a chassis that only has 6 slots. True enough, unless you’re running in a cluster configuration, in which case, uh, yeah it is. And there was no issue with that same configuration during the previous upgrade process, was there?

Juniper documentation (reference below) confirms that:

“The ISSU process is aborted if the Junos OS version specified for installation is a version earlier than the one currently running on the device.”

Good to know.
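Since the device won’t say so plainly, a pre-flight check of your own is easy enough. The following is a rough Python sketch, assuming standard “R” release strings such as 11.4R9.4 and a package filename that embeds the version the way the one in my output does. The function names are hypothetical, and other release types (service or special releases, for example) are deliberately not handled.

```python
import re

# Assumption: only standard releases of the form major.minorRrelease.build,
# e.g. 11.4R9.4. Other release types are not handled by this sketch.
_VERSION = re.compile(r"(\d+)\.(\d+)R(\d+)(?:\.(\d+))?")


def parse_junos_version(text):
    """Extract (major, minor, release, build) from a version or filename."""
    match = _VERSION.search(text)
    if match is None:
        raise ValueError(f"no Junos R-release version found in {text!r}")
    major, minor, release, build = match.groups()
    return int(major), int(minor), int(release), int(build or 0)


def is_downgrade(running, target):
    """True if the target is an earlier version than the running one."""
    return parse_junos_version(target) < parse_junos_version(running)


if __name__ == "__main__":
    running = "11.4R9.4"
    package = "/var/tmp/junos-srx5000-11.2R5.4-domestic.tgz"
    if is_downgrade(running, package):
        print("Error: Cannot use ISSU to install an earlier version of Junos.")
```

Feed it the running version (from show version, say) and the package path, and you get the error message I was hoping for before ISSU ever starts.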

No-Validate

But John, surely we can just skip the validation that’s claiming to make the upgrade fail, by using the “no-validate” option? Well, you might, but not on the SRX5600, which does not offer that option with ISSU. It seems that this command option is in fact only available on branch SRX for some inexplicable reason (reference below).

How To Downgrade Your SRX Cluster?

A colleague very kindly threw a link in my direction showing Juniper’s recommended way to downgrade the cluster with minimal outage. Sadly it’s not as good as ISSU and your traffic will take a hit, but it’s at least smart enough to avoid a ‘dual active’ situation. That is a real risk when you downgrade a cluster: the two devices, running different code versions, will not talk to each other, and so each assumes it is the boss.

In short, the solution is to use the regular request system software add <file> no-validate reboot command on each side, but with a number of steps added in order to isolate each side during the process before finally bringing everything back up. The process is a huge pain (especially compared to ISSU), but worth following if you’re in that situation and need to take as small a hit as possible.

Why Bother Mentioning It?

I’m sharing this because when I searched the intarwebs for this ISSU validation error, there were precious few references explaining what it all meant. So now it’s here, hopefully it will be indexed, and the next lucky person to see something like this will understand that the configuration is not the problem; it’s that you can’t do “ISSD”. And perhaps that’s why “ISSD” is not an acronym that gets used much.

References
