Layer 2 Routing (sort of) and TRILL

May 16, 2011 John Herbert Networking 11

The post title alone is cause for fighting in some circles (it’s just an invitation for argument and I know it’s more of a marketing thing than a technically accurate description), but work with me here. On one level or another, there is growing interest and marketing around the concept of being able to eliminate Spanning Tree Protocol (STP) in layer 2 networks and enabling multipathing in bridged networks. It’s hard to have missed Cisco’s plugging of their FabricPath technology, and underneath all the marketing, routing frames from A to B is pretty much what it is about.

For the purposes of this post, we’ll look at why STP is such a beast to begin with (and let’s face it, that could be a multi-part post on its own), then I’ll look in a series of posts at three competing options that would allow you to get rid of it:

TRILL (in this post)
Shortest Path Bridging / 802.1aq
Juniper’s QFabric

Let’s dive straight in then.

Why Spanning Tree Sucks

When Radia Perlman designed the Spanning Tree algorithm one fateful night in 1985, she created a protocol that is arguably both a huge networking enabler and the most evil beast in existence. The fact that it came with a free “Algorhyme” pretty much guaranteed its acceptance of course:

Algorhyme
I think that I shall never see
A graph as lovely as a tree.
A tree which must be sure to span.
So packets can reach every LAN.
First the root must be selected.
By ID, it is elected.
Least cost paths from Root are traced.
In the tree these paths are placed.
A mesh is made by folks like me.
Then bridges find a spanning tree.
–Radia Perlman

As a quick overview here’s a very abbreviated pros/cons evaluation of STP as I see it:

The Good

You can join Ethernet segments using Bridges and deliberately introduce loops for resiliency, but without traffic looping around infinitely;
STP is a distributed algorithm that scales in terms of CPU/memory requirements regardless of network size;
STP is plug and pray (within reason);
STP has allowed campus and datacenter networks to scale with resilient links.

The Bad

STP recalculations => potentially long data interruption;
Many ‘enhancements’ and proprietary variations over the years;
Resiliency yes, but hideously inefficient utilization of the available bandwidth – no way to load balance traffic.

There was a great diagram I saw in the slide deck for Cisco Live 2010 session BRKDCT-2079 demonstrating the last point above. To some extent it’s a case of reductio ad absurdum, but the point was made that STP “takes a perfectly good meshed network and reduces it to a tree”. With thanks (and apologies) I’m going to shamelessly borrow the illustration used as it’s a rather handy visualization of the issues with STP. The Cisco Live slide had a diagram something along these lines (assuming 10Gbps links):

The bold lines represent the paths that have not been blocked by STP, i.e. actively forwarding links. In this case, a VLAN that is trunked across all these wonderfully meshed links is ultimately reduced from a potential of 160Gbps capacity at the access layer to 20Gbps through the Core. When you consider what is often referred to as “East-West” switching (i.e. servers talking to other servers on different access switches), a server on the left-most access switch talking to a server on the right-most access switch would see its frames going up to the Core and all the way back down, and it would be bottlenecked by the fact that this ‘tree’ structure has to have one – and only one – root node at the top. Ok, this also assumes that we trunk through our core – and we wouldn’t do that, would we? – but the point is well made. In order to ensure a loop-free topology, Spanning Tree effectively wastes much of the potential bandwidth available to you, and doesn’t necessarily make the best decision in terms of the “best” path between any two given points.
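To put rough numbers on the "mesh reduced to a tree" effect, here is a minimal Python sketch. The topology (two core, four distribution and eight dual-homed access switches) is my own assumption and not necessarily the one on the Cisco slide, and the tree is built with a plain breadth-first search from the root rather than a real STP BPDU exchange. It only counts which links would forward and which would be blocked.

```python
from collections import deque

def spanning_tree(links, root):
    """Return the links kept by a breadth-first tree rooted at `root`."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    visited = {root}
    kept = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors[node]):
            if peer not in visited:
                visited.add(peer)
                kept.add(frozenset((node, peer)))  # this link forwards
                queue.append(peer)
    return kept

# Hypothetical topology (an assumption, not the one from the slide).
cores = ["core1", "core2"]
dists = ["dist1", "dist2", "dist3", "dist4"]
access = [f"acc{i}" for i in range(1, 9)]

links = [("core1", "core2")]
links += [(c, d) for c in cores for d in dists]
for i, acc in enumerate(access):
    # Each access switch is dual-homed to one pair of distribution switches.
    pair = dists[:2] if i < 4 else dists[2:]
    links += [(acc, d) for d in pair]

LINK_GBPS = 10
kept = spanning_tree(links, "core1")
total = len(links)
forwarding = len(kept)
blocked = total - forwarding
print(f"{forwarding} of {total} links forwarding ({blocked} blocked)")
print(f"{forwarding * LINK_GBPS} of {total * LINK_GBPS} Gbps of link capacity usable")
```

Whatever the exact topology, a tree over N switches can only ever keep N-1 links, so every extra link you add for resiliency is a link that sits blocked.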

Over the years, STP was deemed too slow to converge, and for ports connected to end devices like PCs, the whole blocking, listening, learning, forwarding sequence of STP on those ports meant that end stations found themselves unable to complete DHCP requests (for example) because the port was not yet forwarding when they sent the request. Similar issues arise in the wider network, as topology changes cause STP recalculations during which time again traffic is frozen as the network has to re-converge as loop free. As a result, Cisco in particular drove a number of very handy enhancements like PortFast, BackboneFast and UplinkFast. Trunking between switches (ISL and later on 802.1q) saw vendors taking different approaches to STP for multiple VLANs – some chose to run a single STP instance, and Cisco chose to go down the Per-VLAN Spanning Tree (PVST) path – more complex in many ways, but more controllable, and certainly more effective where different VLANs were trunked over different devices to each other and needed a unique tree. Of course that in turn didn’t scale so well when you started talking about thousands of VLANs (if you have a Cisco Catalyst 6500 or 7600 switch with a few hundred VLANs being trunked in and out of a given linecard, you’ve most likely hit this warning):

%PM-SP-4-LIMITS: The number of vlan-port instances on module 1 exceeded the recommended limit of 1800

The solution? Multiple Spanning Tree (MST) – a nice compromise that sits somewhere between PVST (which has an instance per VLAN), and a single-STP solution, allowing you to map multiple VLANs into a single STP instance, but still allowing you to have multiple STP instances and thus multiple trees. In theory this does allow you to create a primitive form of load sharing between VLANs, but in the case of link failures I’m not sure I’d like to try and predict the net effect in a complex network.

So let’s assume then that while functional, STP is fundamentally evil. Even Perlman is on record saying that she didn’t think STP was the right solution, and that she believed that traffic should always have been routed between segments, not bridged. So where do we go if we want to maintain the crazy level of meshing that was shown in the earlier diagram, but also to avoid letting STP have its wicked way with your links?

TRILL / L2MP / FabricPath

First let’s get the definitions out the way:

TRILL = TRansparent Interconnection of Lots of Links (surely vying for a spot at the Most Contrived Acronym awards), an IETF protocol;
L2MP = Layer 2 Multi-Pathing;
FabricPath = what happens when Cisco marketing wants to sell a technology that doesn’t have a cool enough name. I wonder sometimes what would happen to a technology like BGP if it were new today?

What’s the difference between these three? Let’s disambiguate L2MP and FabricPath first; FabricPath appears to be the Cisco marketing name for L2MP. In turn, L2MP is basically TRILL with – inevitably – some extra Cisco proprietary knobs and features which they claim you will want. The bottom line is though that the underlying technology for these is pretty much TRILL, so if you understand TRILL, you’ll have a pretty good idea what’s going on with FabricPath. So let’s do that.

The “Problem and Applicability Statement” for TRILL is published in RFC5556, authored by Joe Touch (ISI) and Radia Perlman (Sun):

Interesting to see Radia Perlman there again, isn’t it? I like to think that she’s doing this in order to make up for the scourge of STP, but whatever her reasons I’m not complaining. If I might be irreverent for a moment, as I looked at Perlman’s picture I did notice an uncanny resemblance to another famous person and wondered if they were in some way related:

Oddly, I feel like a lightning bolt is about to strike in my vicinity (perhaps reaching out through the Ethernet cabling). Anyhow, Radia’s son, Ray Perlner, was kind enough to write Algorhyme V2 in an attempt to once again ensure acceptance of Radia’s baby:

Algorhyme V2
I hope that we shall one day see
A graph more lovely than a tree.
A graph to boost efficiency
While still configuration-free.
A network where RBridges can
Route packets to their target LAN.
The paths they find, to our elation,
Are least cost paths to destination!
With packet hop counts we now see,
The network need not be loop-free!
RBridges work transparently,
Without a common spanning tree.
— Ray Perlner

You really can’t argue against a poem like that, but before you go deploying TRILL with those rhyming couplets dancing around in your head we should probably look at how it works so you know what you’re in for. In terms of headline features, TRILL is:

perhaps most simply described as “layer 2 routing” (controversy alert!);
capable of ECMP (Equal Cost Multi Path) delivery of frames (the protocol allows “unlimited” equal cost paths, although it is unclear whether there may be hardware-dependent limits);
a replacement for the Spanning Tree Protocol;
transparent to end stations;
a “zero configuration” protocol;
compatible with existing bridges (can do incremental deployment);
scalable.

When I say Layer 2 routing, in order to avoid the inevitable backlash, it’s only true up to a point – in theory, frames are routed from end to end across the layer 2 domain based on a knowledge of where a remote MAC address is located. Reality though means that there is still need to flood frames at times, and that’s a behavior you don’t see with true routing protocols, right? Imagine a routing protocol where if you don’t have a route, you simply send the packet everywhere in the hopes that eventually it gets to the right destination? Ok, so not quite truly routing then, but certainly within the layer 2 domain, we’re going to try to route as many frames as possible. Maybe it’s just Smart Bridging?

“Routing” frames means you can make smart decisions and use ECMP to load share traffic, meaning that the network diagram we saw earlier with Core bandwidth limited to 20Gbps (which affected East-West traffic flows), now looks like this:

Hooray, we can trunk VLANs through our Core with impunity! Of course, that East-West traffic will likely never go to the Core now anyway, as all paths from Access to Distribution are now available.

A TRILL capable switch is called an RBridge (Routing Bridge). Each RBridge runs IS-IS in order to communicate with other RBridges and establish a routed topology. IS-IS is chosen for a few reasons:

IS-IS runs at Layer 2, so switches require no Layer 3 configuration to make this work
Compare with OSPF, which requires IP addresses to be configured, and uses IP Multicast to communicate!
IS-IS is extensible (consider for example Integrated IS-IS where IP prefix information is piggy-backed on the underlying IS-IS connectivity).
IS-IS is a Link State Protocol, so it’s pretty fast, efficient and scalable.

What if you don’t know IS-IS, or your engineers only know OSPF? It doesn’t matter! TRILL is intended to be plug and play – you don’t need to configure anything to make it work, you just need to enable it and let it discover its own neighbors and do the hard work automatically. It’s unlikely that you would ever need to even see anything other than who your neighbors are, and I sincerely doubt that there will ever be commands implemented to tweak TRILL IS-IS parameters.

But wait – if I already have IS-IS running on my network and I enable TRILL (which uses IS-IS too) on the same VLAN as one of my IS-IS routers, won’t they find each other and get confused? No; the TRILL team thought of that one and use a different multicast MAC for TRILL’s IS-IS implementation, so they will be like ships in the night. A new range of multicast MAC addresses has been assigned for TRILL (01-80-C2-00-00-40 through 01-80-C2-00-00-4F) with, for example, 01-80-C2-00-00-41 assigned for “All IS-IS RBridges”. There’s a new Network Layer Protocol Identifier (NLPID) defined – 0xC0 – and it looks like the TLV Code Points have been approved. IS-IS is basically being extended to support the unique requirements of TRILL, and to ensure that there is no conflict between regular IS-IS and TRILL’s use of IS-IS. You can read more about TRILL’s use of IS-IS in the IETF Internet-Draft “TRILL Use of IS-IS (draft-ietf-isis-trill-05)”.
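As a small illustration of the "ships in the night" point, the Python sketch below classifies a destination MAC against the TRILL multicast range quoted above. The function names are mine, and the two classic IS-IS multicast addresses (01-80-C2-00-00-14 and 01-80-C2-00-00-15) are included only for comparison; treat the whole thing as a sketch rather than anything a real RBridge runs.

```python
# TRILL multicast range and "All IS-IS RBridges" address, as quoted in the text.
TRILL_MCAST_FIRST = 0x0180C2000040
TRILL_MCAST_LAST = 0x0180C200004F
ALL_ISIS_RBRIDGES = 0x0180C2000041

# Classic IS-IS multicast addresses (AllL1ISs / AllL2ISs), for comparison.
CLASSIC_ISIS = {0x0180C2000014, 0x0180C2000015}

def mac_to_int(mac: str) -> int:
    """Convert '01-80-C2-00-00-41' or '01:80:c2:00:00:41' to an integer."""
    return int(mac.replace("-", "").replace(":", ""), 16)

def classify(mac: str) -> str:
    """Hypothetical helper: say which control plane a destination MAC belongs to."""
    value = mac_to_int(mac)
    if value == ALL_ISIS_RBRIDGES:
        return "trill-isis"
    if TRILL_MCAST_FIRST <= value <= TRILL_MCAST_LAST:
        return "trill-other"
    if value in CLASSIC_ISIS:
        return "classic-isis"
    return "not-control"

for mac in ("01-80-C2-00-00-41", "01-80-C2-00-00-14", "00-00-fe-11-22-33"):
    print(mac, "->", classify(mac))
```

Because the two protocols listen on different group addresses, a regular IS-IS router never even sees the TRILL Hellos as something addressed to it, and vice versa.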

Lots of theory for sure, but I’m guessing you’re still wondering how TRILL actually works, so let’s go there next.

How TRILL Works

We know that TRILL uses IS-IS as its underlying protocol; this is used to ensure that every RBridge has a picture of the network so that an optimal routing tree can be created, including support for Equal Cost Multi Path (ECMP) to a destination. The tree that’s created isn’t based on the actual MAC addresses, but rather on the RBridge IDs that are exchanged. IS-IS is used to figure out the best path or paths to get frames from RouterA to RouterB through any number of intermediate RBridges.
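To make the ECMP idea concrete, here is a hedged Python sketch of a shortest-path calculation that remembers every equal-cost first hop instead of just one. It is a generic Dijkstra variant, not TRILL's actual IS-IS implementation, and the four-RBridge square topology and link costs are assumptions chosen for illustration.

```python
import heapq

def ecmp_next_hops(graph, source):
    """graph: {node: {neighbor: cost}} with positive costs.
    Returns {destination: (cost, set of equal-cost first hops)}."""
    dist = {source: 0}
    first_hops = {source: set()}
    heap = [(0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for peer, link_cost in graph[node].items():
            new_cost = cost + link_cost
            # Leaving the source, the first hop is the neighbor itself;
            # further out, we inherit the first hops used to reach `node`.
            hops = {peer} if node == source else first_hops[node]
            if peer not in dist or new_cost < dist[peer]:
                dist[peer] = new_cost
                first_hops[peer] = set(hops)
                heapq.heappush(heap, (new_cost, peer))
            elif new_cost == dist[peer]:
                first_hops[peer] |= hops  # another equal-cost path
    return {n: (dist[n], first_hops[n]) for n in dist if n != source}

# Hypothetical square of RBridges: RB1 reaches RB3 via RB2 or RB4.
graph = {
    "RB1": {"RB2": 10, "RB4": 10},
    "RB2": {"RB1": 10, "RB3": 10},
    "RB3": {"RB2": 10, "RB4": 10},
    "RB4": {"RB1": 10, "RB3": 10},
}
table = ecmp_next_hops(graph, "RB1")
for dest in sorted(table):
    cost, hops = table[dest]
    print(dest, cost, sorted(hops))
```

Note that the table is keyed by RBridge, not by MAC address, which is exactly the point made above: the tree is built over RBridge IDs and knows nothing about end stations.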

When an RBridge stores a mapping (my terminology) for a destination MAC address, it stores the MAC address along with the ID (or ‘Nickname’) of the RBridge that that MAC is connected to (i.e. the destination RBridge):

This level of abstraction means that when there’s a topology change in the network, only the RBridge tree needs to be recalculated and a new path to the other RBridges installed in the forwarding tables; MAC entries don’t necessarily need to change, as they point to the RBridge ID and are not directly part of the tree. By way of analogy, this is similar to a redistributed External OSPF route advertised with a forwarding address set to 0.0.0.0 – the routing decision is made based on recursive lookup of the Advertising Router’s IP; consequently a Shortest Path does not need to be calculated for the prefix contained in the External route, just to the Advertising Router IP.

Incidentally, with Spanning Tree Protocol a network link failure leads to a Spanning Tree recalculation, and to accelerated CAM table aging (i.e. the bridging / MAC address forwarding tables are flushed). With TRILL if a network link fails, the underlying topology is recalculated in IS-IS, but assuming that there is still an alternate path available to get to the egress RBridge, the MAC mappings do not need to be flushed. This helps reduce flooding after a failure, and because of the rapidity with which IS-IS recalculates paths, interruption to traffic flow is minimal.
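The two-level lookup can be sketched in a few lines of Python. The MAC address and RBridge1 come from the example used in this post; the next-hop names and the `neighbor_lost` helper are hypothetical, and a real RBridge would re-run its IS-IS calculation rather than simply filter a list.

```python
# MAC 'mapping' table: destination MAC -> egress RBridge nickname.
mac_to_egress = {"00-00-fe-11-22-33": "RBridge1"}

# Forwarding table built from IS-IS: egress RBridge -> usable next hops.
# (RBridge2 and RBridge4 are made-up neighbors for illustration.)
next_hops = {"RBridge1": ["RBridge2", "RBridge4"]}

def lookup(dest_mac):
    """Resolve a MAC in two steps: MAC -> egress RBridge -> next hops."""
    egress = mac_to_egress.get(dest_mac)
    if egress is None:
        return None  # unknown destination: the frame would be flooded
    return egress, list(next_hops[egress])

def neighbor_lost(neighbor):
    """Simplified topology change: only the RBridge table is touched."""
    for egress in list(next_hops):
        next_hops[egress] = [h for h in next_hops[egress] if h != neighbor]

before = lookup("00-00-fe-11-22-33")
neighbor_lost("RBridge2")
after = lookup("00-00-fe-11-22-33")
print("before failure:", before)
print("after failure: ", after)
```

The MAC mapping is identical before and after the failure; only the path towards the egress RBridge changed, so nothing needs to be flushed or re-learned.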

“Routing” Frames

Let’s assume that I have an RBridge – RBridge3 – which has the MAC mappings and IS-IS forwarding tables illustrated above. From the perspective of RBridge3, to get to 00-00-fe-11-22-33, it knows it needs to send traffic to RBridge1. If I have other RBridges in between RBridge1 and RBridge3, there are two ways to ensure that the traffic gets to the other end as planned. One way is to ensure that every RBridge in the network has a full list of MAC mappings, and can therefore make its own forwarding decision. This is good, but means that RBridges in the network Core may face a large burden in terms of the size of tables they have to maintain – kind of the opposite of how we normally like to utilize a Core. The alternative is to encapsulate the original frame in a new frame whose destination is RBridge1. Intermediate RBridges don’t need to know the destination MAC, they just need to know how to forward traffic to RBridge1. Well that’s easy – IS-IS has already ensured that the RBridges all know about each other! And so that’s the choice that was made – the ingress RBridge encapsulates the frame using a special TRILL Ethertype (0x22F3) and sends it to the destination RBridge. It’s still a “Hop by Hop” routing decision – each RBridge makes its own decision how best to get to the destination RBridge – but the choice of egress RBridge was determined by the ingress RBridge.

As a TRILL-encapsulated frame traverses the network, at each hop the destination address of the frame must change (just like it does when routing IP), and inside the frame there must be a record of where the frame is ultimately going (the egress RBridge), again kind of like the destination IP address in an IP packet. This is accomplished using an outer (Transport) header and an inner (TRILL) header prepended to the original frame. You can read in more sickening detail and with fewer crass generalizations about the frame formats in the IETF Internet-Draft “RBridges: Base Protocol Specification”.

Here’s a representation of a frame that ingressed at RBridge1, is ultimately destined for RBridge3, and is encapsulated by TRILL to traverse RBridge2 on the way:

The frame as shown between RBridge1 and RBridge2 can be decoded thus:

The egress RBridge – i.e. destination – (from TRILL Header) is RBridge3.
The ingress RBridge (from TRILL Header) is RBridge1.
The sending device for this link (from Transport Header) is RBridge1.
The destination device for this link – i.e. the next hop – (from Transport Header) is RBridge2.

Once RBridge2 receives it and forwards it on towards RBridge3, the TRILL Header remains unchanged, and the Transport Header will show a source of RBridge2 and a destination on that link of RBridge3. This is all suspiciously similar to watching an IP packet encapsulated in Ethernet traversing some routers 😉
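The hop-by-hop rewrite is easy to model. The Python sketch below follows the RBridge1 to RBridge2 to RBridge3 example, with some loud simplifications: the Transport header really carries port MAC addresses rather than RBridge names, the field layout is not the real wire format, and the starting hop count of 16 is an arbitrary value I picked. The hop count itself is the loop-protection mechanism that Algorhyme V2 refers to.

```python
from dataclasses import dataclass, replace

TRILL_ETHERTYPE = 0x22F3  # Ethertype quoted earlier in this post

@dataclass(frozen=True)
class TrillFrame:
    outer_dst: str   # Transport header: next hop on this link
    outer_src: str   # Transport header: sender on this link
    egress: str      # TRILL header: egress RBridge
    ingress: str     # TRILL header: ingress RBridge
    hop_count: int   # TRILL header: decremented at every hop
    payload: bytes   # the original Ethernet frame, untouched

def encapsulate(payload, ingress, egress, next_hop, hop_count=16):
    """Ingress RBridge: choose the egress once, then send to the next hop."""
    return TrillFrame(next_hop, ingress, egress, ingress, hop_count, payload)

def forward(frame, me, next_hop):
    """Transit RBridge: rewrite only the outer header and the hop count."""
    if frame.hop_count <= 0:
        raise ValueError("hop count exhausted, frame dropped")
    return replace(frame, outer_src=me, outer_dst=next_hop,
                   hop_count=frame.hop_count - 1)

link1 = encapsulate(b"original frame", "RBridge1", "RBridge3", "RBridge2")
link2 = forward(link1, me="RBridge2", next_hop="RBridge3")
print(link1)
print(link2)
```

Comparing the two printed frames shows the ingress and egress fields staying put while the outer source and destination move along the path, just as in the decode above.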

This is all good for unicast MACs, but TRILL also must support the transport of Broadcast/Multicast frames. TRILL creates optimal Broadcast/Multicast trees, and uses Reverse Path Forwarding checks to ensure that there are no loops. I’ve honestly not seen much more information about the nature of those trees, so I’m going to leave it there and try to be content with the rather skimpy knowledge that they are supported as efficiently as possible.

We now know:

that IS-IS is used to build a topology of RBridges;
that frames are TRILL encapsulated from ingress to egress in a TRILL network;
that RBridges make a hop-by-hop routing decision;
that MAC addresses map to egress RBridges;
and that a MAC address mapping is only required at ingress RBridges that need to talk to the destination MAC;
that broadcast/multicast MAC flooding is supported.

So far so good, but how do we learn those MAC addresses in the first place?

Learning MAC Addresses

This is, amusingly enough, the part of the protocol that seems to have got the least attention when it’s being presented, which is a little odd when you consider that without the ability to learn MAC mappings, nothing will work. Since again we have a rather important part of the protocol with only scant information available, I will share what I know with a little bit of extrapolation and hopefully we’ll be in the right ballpark.

RBridges learn MAC addresses through four basic mechanisms:

Locally received native (non-TRILL) frames. RBridges have to deal with end stations just like regular bridges, so they will build a regular MAC address forwarding table for locally connected devices.
Static (manual) configuration. Yep, you can force a MAC address mapping.
From received TRILL frames. Remember the TRILL header within the frame format I showed earlier? It includes the source (ingress) RBridge ID and the destination (egress) RBridge ID. If an RBridge receives a TRILL encapsulated frame, it can extract the ingress RBridge ID and the Source MAC address, and it will add a MAC mapping for that source MAC pointing back to the ingress. This means that (a) return traffic won’t flood initially, and (b) the path is assured to be symmetrical in terms of ingress/egress RBridges.
(optional) Explicitly distribute MAC addresses using TRILL ESADI (End Station Address Distribution Information) protocol. In effect, TRILL can use a special protocol to distribute the known local (native) MAC addresses to all other RBridges so that they pre-populate their MAC mappings. This is similar in some ways to the way OTV (Overlay Transport Virtualization) shares MAC addresses between sites.

I’m not convinced whether using ESADI is going to be something I would want to do, but it’s certainly an option that can reduce the amount of flooding in the TRILL domain. And once we have MAC mappings, TRILL encapsulation takes over.
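Here is a rough Python sketch of a MAC table fed by the four learning mechanisms listed above. The class and method names are hypothetical, and the rule that a static entry is never overwritten by dynamic learning is my own assumption rather than something taken from the TRILL drafts.

```python
class MacTable:
    """Toy MAC mapping table: mac -> (how it was learned, where it lives)."""

    def __init__(self):
        self.entries = {}

    def _install(self, mac, how, where):
        current = self.entries.get(mac)
        # Assumption: static configuration always wins over dynamic learning.
        if current and current[0] == "static" and how != "static":
            return
        self.entries[mac] = (how, where)

    def learn_native(self, src_mac, port):
        """1. Source MAC of a native (non-TRILL) frame seen on a local port."""
        self._install(src_mac, "local", port)

    def configure_static(self, mac, egress_rbridge):
        """2. Manually configured mapping."""
        self._install(mac, "static", egress_rbridge)

    def learn_from_trill(self, inner_src_mac, ingress_rbridge):
        """3. Inner source MAC plus ingress RBridge from a received TRILL frame."""
        self._install(inner_src_mac, "remote", ingress_rbridge)

    def learn_from_esadi(self, mac, origin_rbridge):
        """4. (optional) Mapping announced by another RBridge via ESADI."""
        self._install(mac, "esadi", origin_rbridge)

    def lookup(self, mac):
        return self.entries.get(mac)  # None means unknown: flood

table = MacTable()
table.learn_native("00-00-fe-aa-bb-cc", "Ethernet1/1")
table.learn_from_trill("00-00-fe-11-22-33", "RBridge1")
print(table.lookup("00-00-fe-11-22-33"))
print(table.lookup("00-00-fe-99-99-99"))
```

The third method is the one doing the heavy lifting in practice: the first frame in one direction teaches the far end where to send the reply, which is why return traffic does not need to be flooded.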

Compatibility

TRILL is all very well, but how will it fit into your network? First, whether you are looking at TRILL or FabricPath, both have seamless integration with existing STP networks. The idea is that TRILL can be installed in stages – maybe with the Core first, then expand outwards towards the access layer over time?

Multiple RBridges on a LAN Segment

Having multiple RBridges active on a LAN segment could be an issue if they all start forwarding traffic over the TRILL network, as this would cause both traffic duplication and also confusion in terms of the appropriate return path with which to populate the MAC mapping tables. Consequently, RBridges on a VLAN see each other and elect a Designated RBridge (DRB) for the segment, which in turn normally becomes the Appointed Forwarder that is exclusively responsible for sending/receiving frames on that shared segment while all other RBridges effectively are in a kind of standby mode. Technically (i.e. in the protocol specifications) it is possible for a DRB to make other RBridges Appointed Forwarders, but I am not aware of this being implemented yet, and the likelihood is that the DRB will do the AF job itself.
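The election can be sketched as below. I am assuming that the highest priority wins and that ties go to the numerically higher MAC address, which is how ordinary IS-IS designated-router election behaves; I have not checked that rule against the TRILL specification, and the candidate names, priorities and addresses are invented.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    priority: int
    mac: str

def mac_to_int(mac: str) -> int:
    return int(mac.replace("-", "").replace(":", ""), 16)

def elect_drb(candidates):
    """Assumed rule: highest priority wins, higher MAC breaks a tie."""
    return max(candidates, key=lambda c: (c.priority, mac_to_int(c.mac)))

def appoint_forwarders(drb, vlans):
    """Common case described above: the DRB does the AF job itself."""
    return {vlan: drb.name for vlan in vlans}

# Hypothetical RBridges sharing one LAN segment.
segment = [
    Candidate("RB-A", 64, "00-00-fe-00-00-01"),
    Candidate("RB-B", 64, "00-00-fe-00-00-02"),
    Candidate("RB-C", 32, "00-00-fe-00-00-ff"),
]
drb = elect_drb(segment)
forwarders = appoint_forwarders(drb, vlans=[10, 20])
print("DRB:", drb.name)
print("Appointed Forwarders:", forwarders)
```

If a DRB did hand out the Appointed Forwarder role per VLAN, `appoint_forwarders` is where that policy would live; here every VLAN simply maps back to the DRB.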

Vendor Support

Obviously Cisco is a player here with FabricPath (which they describe as a pre-standard superset of TRILL). However, the only hardware currently supporting TRILL is the Nexus 7000, and only on some linecards. You may have heard the saying around lakes and marinas that the word “boat” is in fact an acronym for “Bust Out Another Thousand”; I have a feeling that the word “Nexus” may fall into a similar category – but this time something about power and cooling. Still, if you are following Cisco’s desired upgrade path and you are forklifting your old 6500/7600 platform and replacing them with Nexus 7ks, then you are good to go!

TRILL is not compatible with FabricPath (despite the similarities) – so if you choose FabricPath, you are locked into Cisco hardware going forward. Cisco does offer a TRILL-compatibility mode though, which should interoperate with other vendors’ TRILL equipment, although that would then lose you the ‘superset’ features of FabricPath.

Brocade offers VCS (Virtual Cluster Switching) – which is a multi-pathing fabric based on TRILL – in their VDX product line. As noted in the comments by Omar Sultan, I will clarify that while the data plane in VCS is TRILL, the control plane (i.e. what would be IS-IS) instead uses Fabric Shortest Path First (FSPF) – perhaps not a surprise given the rest of Brocade’s product line, and a nice way to reuse some existing code.

Currently you will not see vendors like Avaya, Nortel, HP (H3C/3Com) and Juniper supporting TRILL; we’ll look at what they are supporting as this series continues.

Do I Need TRILL?

That, my friend, is the million dollar question. Theoretically, it could help reduce failure recovery times after link failures; it can allow you to better utilize your bandwidth by allowing multipathing, and (one hand clapping) it can get rid of STP for you. On the other hand, it’s a fairly new technology that is supported by a small number of vendors with proprietary twists, and may lock you into something you didn’t want to be locked into. Opponents (or at least, proponents of alternative solutions – particularly Shortest Path Bridging, SPB) argue that TRILL’s new protocols/encapsulations mean redeveloping OA&M tools, which means a long lead time and high expense before you can manage your network properly. They also argue that TRILL may require new ASICs in routers to be supported (perhaps why it’s not being offered in the 7600/6500 platforms), and thus may mean requiring a full forklift exercise in order to implement. On the other hand, if you just implemented a Nexus7k network, it’s of interest, don’t you think? That and OTV… which should be coming in a later post 🙂

Cisco Configuration

For interfaces that connect to other FabricPath RBridges, this is the rather challenging configuration required. Since this is NXOS, the first task is to install the feature-set so that you’ll have the fabricpath commands available, then enable it. In the default VDC:

N7K(config)# install feature-set fabricpath

Then in the VDC in which you want to run FabricPath, enable the feature-set:

N7K-CORE(config)# feature-set fabricpath

Next up, define the vlan(s) that you’d like to use to carry the FabricPath traffic:

N7K-CORE(config)# vlan xxx
N7K-CORE(config-vlan)# mode fabricpath

Then finally, configure the interface (or port-channel) that connects to the other FabricPath switch(es):

N7K-CORE(config)# interface Ethernet1/1
N7K-CORE(config-if)# description Fabricpath uplink
N7K-CORE(config-if)# switchport mode fabricpath

And… that’s about it. For monitoring, you need the “show fabricpath” commands.

Feedback

TRILL is a fast-moving beast to keep track of, and I’m positive that something I’ve said will turn out to be “old information” that has since been superseded. There’s also limited detailed information out there (especially as it isn’t completely standardized at this time – note the Internet-Drafts that were referenced) so if I have misinterpreted any part of TRILL’s functioning, again please let me know. Basically, keep me honest here please!

I’d love to hear your thoughts on whether you plan to deploy FabricPath or VCS fabric, and if you have already, what your experience is from both a functional and operational perspective, good or bad.

Next time, I’ll be making a hash of Shortest Path Bridging.

Updated 2011-05-18 @11:40am: Clarified use of FSPF in VCS TRILL implementation following Omar’s comment below.

10 Comments on Layer 2 Routing (sort of) and TRILL

Omar Sultan May 16, 2011 at 7:38 pm

John:

Really well done post. The “do I need TRILL” question is a key one. This collection of technologies allows you to do some interesting things, but whether they are actually helpful or not really depends on the specifics of the environment. It’s not a networking panacea and you can spend a lot of time and resources and not really gain any material benefit. Posts like this are great for helping customers make informed decisions. At the end of the day, as you say, it’s nice to have the option to use TRILL (or FabricPath) where it makes sense.

As far as Cisco specific info, you are correct the F1 module for the N7K supports FabricPath and will support TRILL. The recently announced Nexus 5500 is also TRILL/FP capable and support will be turned on in the future as part of an NX-OS upgrade.

From an interoperability perspective, I explain TRILL vs FP as being similar to HSRP vs VRRP–you can choose the extended feature-set or standards compliant. Although, at this point, the discussion is somewhat moot, since the list of folks committed to TRILL is short, I believe right now, it’s us and IBM (BNT)–BRCD’s version of TRILL does not use IS-IS.

Finally, while Cisco Marketing does deserve its share of abuse, you can’t blame “FabricPath” on the marketing guys this time. 🙂

Regards,

Omar Sultan (@omarsultan)

Cisco

- John Herbert May 16, 2011 at 9:14 pm
  
  Great information and comments about TRILL and FP, Omar – thank you! I apologize wholeheartedly to Cisco Marketing for besmirching their name. All I need now is one name.. just one name to blame 😉
  
  Thanks again!
  
Brad Fleming May 31, 2011 at 3:08 pm

To be clear, Brocade is not using IS-IS for the control plane because the standards were not 100% complete.

http://community.brocade.com/community/brocadeblogs/vcs/blog/2010/12/07/vcs-ethernet-fabric-and-fcoe-traffic

See the “sidebar” section about 2/3 of the way down the page.

- John Herbert May 31, 2011 at 7:20 pm
  
  *nod* I had heard that Brocade was likely to move to IS-IS eventually. When that actually happens, I guess we’ll see. After all, if we end up with just Cisco (pushing the proprietary Fabricpath extensions) and Brocade (using FSPF) then they won’t interoperate even if Brocade does implement IS-IS. I’m very curious to see where the market ends up going.
  
Naren June 23, 2011 at 3:11 am

John,

Could you talk little bit on the Designated vlan concept in TRILL. What is the purpose of this Designated vlan?

Thanks,

Naren

- John Herbert June 23, 2011 at 5:44 pm
  
  Hi Naren,
  
  Great question. There’s obviously a little bit more to implementing TRILL than I’ve covered above, and Designated VLANs is one of those areas I quietly skipped over to keep things a little simpler from the perspective of explaining the concept. Let me throw a few things out there and see if I can explain it simply.
  
  Some background points that will help to explain things:
  
  1) When RBridges see other RBridges on a multi-access link, they will determine between them which is to be the Designated RBridge (DRB). I should note that on Point-to-Point (P2P) links, no DRB is elected.
  
  2) When an RBridge receives a native (i.e. non-TRILL) frame that it’s going to forward as TRILL-encapsulated, it first adds an 802.1q header to the frame so that the origin VLAN will be known when the frame is decapsulated at the egress RBridge. Thus when the frame format shows the “original Ethernet frame”, it’s really the original frame plus an 802.1q header. You could, if you wanted to make the Shortest Path Bridging folks laugh quietly, liken this a little to QinQ – you’re sending TRILL-encapsulated frames sourced from multiple VLANs over a single VLAN, and inside the TRILL data frame the 802.1q header in the “original” packet means it can be ‘demuxed’ correctly at the other end. Ugh, horrible analogy 🙂
  
  3) The reality is that links between RBridges are unlikely to be carrying a single VLAN, but rather they’re likely to be 802.1q trunk links with many VLANs on them. You don’t want to send out TRILL-IS-IS Hellos and run an instance of IS-IS on every VLAN, as that wouldn’t be scalable. It would also be pointless, as TRILL encapsulated frames are not forwarded on the VLAN on which the frame ingressed; rather the TRILL data frames are forwarded on a common VLAN – the Designated VLAN.
  
  So, if we put all that together:
  
  – On any given link, there must be a single VLAN that the RBridges agree to use for the exchange of TRILL-IS-IS and TRILL data.
  
  – On a multi-access link, the DRB dictates what the Designated VLAN will be; other (non-DRB) RBridges on that link MUST use whatever VLAN the DRB dictates.
  
  – On a point-to-point link, the RBridges use tie-break mechanisms to determine whose Designated VLAN should reign supreme (since there’s no DRB)
  
  – The best design obviously would be that you configure all RBridges to prefer the SAME Designated VLAN, so that if the DRB changes, you don’t change Designated VLAN as well.
  
  – You also need to ensure that all RBridges on a link have connectivity to that Designated VLAN. Common sense, really.
  
  So in summary, the Designated VLAN is the VLAN where TRILL-IS-IS really runs, and over which TRILL data forwarding between RBridges occurs. Make sure all RBridges on a link have the same preferred Designated VLAN configured, and ensure they all have connectivity to that VLAN.
  
  Again, there’s more complexity to this than I’ve covered here, but it’s a good start. Hope that helps a little? If you are very bored, the RFCs make interesting, if sometimes confusing, reading.
  
  (Hmm; based on the length of this response, it may be worth me adding a new post to cover this better…gah! Sorry about that 🙂 ).
  
  - Naren June 28, 2011 at 2:06 am
    
    Hi John,
    
    Really appreciable for giving the info on Designated Vlan. The concept is clear and bit confusing:-) But very informative. Its really a good opportunity to work in such an interesting protocol. (Ha ha ha)
    
Tony Bourke June 29, 2011 at 12:06 pm

I think the biggest TRILL/FabricPath-blocking issue is the fact that only the Nexus 7Ks can do it. I don’t think the Cat 6Ks will ever be able to do it, nor will the Nexus 5000s, only the 55XX, and not right now.

- John Herbert June 29, 2011 at 12:33 pm
  
  I think that’s going to be true of any new technology to some extent, but I agree it may be a big issue for Cisco. The fact that TRILL is “all new” is used as an argument by some Shortest Path Bridging proponents for why TRILL isn’t ready for big time; they claim that it will require new ASICs which will stop it being so readily back-ported to existing hardware (versus SPB which they believe will not have this issue).
  
  I’m not clear on whether that’s entirely true or not, but certainly the fact that Cisco are only offering Fabricpath on 7k and now 55xx so far (and I haven’t heard a roadmap for it being on other platforms) may indeed be a telling sign…
  
manish kevre March 4, 2012 at 4:42 am

Thanks For Information it helped me in my seminar On DATA CENTER NETWORKING

