Wednesday, 11 March 2015

IEEE STP & PVSTP+ Interoperability - A Closer Look

IEEE STP (802.1D) is the standard, so it has absolutely no respect for anything other than IEEE BPDUs, nor will it care about anything else. The only BPDU it understands is the IEEE BPDU: destination MAC address 0180.C200.0000, no tags (so no Dot1q tag), encapsulated in an IEEE 802.3 LLC Ethernet frame. That's it. End of story.

So what would happen in each of the following cases?
  • If IEEE BPDUs arrive with a VLAN tag: the BPDU will get dropped/ignored.
  • If IEEE BPDUs arrive with a different destination MAC address: well, this doesn't make any sense, because STP-enabled ports on the switch don't listen on any multicast group address other than the IEEE STP one (0180.C200.0000), so the frame will be switched according to normal switching rules.
  • If something other than an IEEE BPDU arrives: this is not even a question, right? This is what switches do. They simply switch frames out of the right port according to the MAC address table, and if the switch doesn't know the destination MAC address yet, it floods the frame out of all ports except the port the frame came in on.
  • If it sees a multicast address other than the one it is listening on: again, normal switching rules apply. The switch will simply flood that traffic as well (let's say that IGMP and CGMP are not turned on).
  • Wait.. what will happen if it sees a PVSTP+ BPDU? "Hmmm.. what? What PVSTP+ BPDU? Haven't heard of anything like that before.." says IEEE STP (802.1D). To IEEE STP, this is simply an unknown multicast frame. So the switch simply floods it out of all ports except the one it arrived on.
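
The bullets above boil down to a very small decision rule. Here is a rough Python sketch of it, just to make the logic explicit. The function name and the returned strings are made up for illustration, and the PVSTP+ destination MAC used in the usage lines (0100.0CCC.CCCD) is not mentioned above; it is the address Cisco uses for these BPDUs.

```python
IEEE_STP_MAC = "0180.c200.0000"  # the only address an IEEE STP switch listens on


def ieee_switch_action(dst_mac: str, tagged: bool) -> str:
    """Hypothetical helper: what an 802.1D-only switch does with a frame."""
    if dst_mac.lower() == IEEE_STP_MAC:
        # Only untagged IEEE BPDUs are understood; tagged ones are ignored.
        return "drop" if tagged else "process as IEEE BPDU"
    # Anything else, including PVSTP+ BPDUs, is ordinary traffic.
    return "switch normally (flood if destination unknown)"


print(ieee_switch_action("0180.C200.0000", tagged=False))
print(ieee_switch_action("0180.C200.0000", tagged=True))
print(ieee_switch_action("0100.0CCC.CCCD", tagged=True))
```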

It is vital to understand the above behavior, or the "arrogance" of IEEE STP, to understand why Cisco had to include all the bits and pieces and perform checks to make sure that Cisco's flavor of STP, "PVSTP+" (PVST+ in Cisco's own documentation), works in harmony with IEEE STP.

Now, with this in mind, let's think about what Cisco had to go through when they decided to come up with their own, more efficient STP/BPDU flavor while making sure that it can actually work with standard IEEE STP.

First of all, they can't just manipulate the IEEE BPDU, because it is the standard and Cisco has to make sure that their switches support this standard as well.

So they thought.. "hmmm what would allow us to develop our own BPDU protocol.." and someone at the back of the room said "I know! I know!.. why don't we use 802.3 SNAP Ethernet Format.. it allows us to define our own protocol specific to our Organization". And everyone was like.. "Dude you are Awesome!! :)".

802.3 SNAP provides a way to identify your organization and a custom protocol specific to that organization. Basically, once you've registered your organization and have your own OUI (a unique ID for the organization), you can simply define your own protocols on top of it. If this doesn't make any sense to you, please have a read of the article I wrote which explains the ins and outs of modern Ethernet formats.
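
To make the difference between the two encapsulations concrete, here is a small Python sketch of the headers that follow the 802.3 length field. The byte values are not taken from the text above: the IEEE BPDU uses LLC DSAP/SSAP 0x42 with control 0x03, while the Cisco BPDU uses the SNAP LLC header (0xAA, 0xAA, 0x03) followed by Cisco's OUI 00-00-0C and protocol ID 0x010B. The function names are hypothetical.

```python
IEEE_BPDU_LLC = bytes([0x42, 0x42, 0x03])   # DSAP, SSAP, control
SNAP_LLC = bytes([0xAA, 0xAA, 0x03])        # "a SNAP header follows"
CISCO_OUI = bytes([0x00, 0x00, 0x0C])       # organization identifier
PVST_PID = (0x010B).to_bytes(2, "big")      # protocol within that organization


def llc_header(kind: str) -> bytes:
    """Build the LLC (and SNAP) header for the given BPDU flavor."""
    if kind == "ieee":
        return IEEE_BPDU_LLC
    if kind == "pvstp+":
        return SNAP_LLC + CISCO_OUI + PVST_PID
    raise ValueError(f"unknown BPDU kind: {kind}")


def identify(header: bytes) -> str:
    """Classify a frame by the bytes that follow the 802.3 length field."""
    if header[:3] == IEEE_BPDU_LLC:
        return "IEEE BPDU"
    if header[:8] == SNAP_LLC + CISCO_OUI + PVST_PID:
        return "PVSTP+ BPDU"
    return "unknown"


print(llc_header("pvstp+").hex(), identify(llc_header("pvstp+")))
```

The OUI plus protocol ID pair is exactly the "your organization, your protocol" hook that SNAP gives you.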

Ok. So we are all set. We will be using 802.3 SNAP frames for PVSTP+ BPDUs. So everyone is happy. All good. End of story...? Well, not so fast, Cisco. What if I want to connect my non-Cisco switches that run IEEE STP to a Cisco switch that runs PVST+?

Hmmm.. now you are talking.. Now Cisco has a problem to solve. 
  • PVSTP+ runs one Spanning Tree instance per VLAN. IEEE STP, on the other hand, runs a single STP instance, and that instance is not attached to any VLAN. (No, it doesn't have anything to do with VLAN1, if that's what you were thinking.) So Cisco has to pick an STP instance that is mapped to some VLAN. This STP instance will then be made responsible for interoperating with IEEE STP.
    • So Cisco picked the STP instance attached to VLAN1. This means that, on trunk (dot1q) links, regardless of the native VLAN, the Cisco switch (running PVSTP+) will use its VLAN1 STP instance to communicate with switches that only understand IEEE STP.
  • Now, as we discussed earlier, IEEE STP only understands and cares about IEEE BPDUs. So if you want your PVSTP+ VLAN1 instance to communicate with IEEE STP switches, the only way to do this is to make sure that when PVSTP+ deals with IEEE STP, it uses the only language that IEEE STP understands: untagged IEEE BPDUs. Meaning, you will have to send the VLAN1 instance's BPDUs as (untagged) IEEE STP BPDUs on these links.
  • So what about the access ports that plug in to IEEE STP switches?
    • Cisco switches always use IEEE BPDUs on their access ports by default. So problem solved.
    • If you have an IEEE STP switch plugged in to an "access VLAN 10" port on a Cisco switch that runs PVSTP+, the Cisco switch will send untagged IEEE BPDUs out of that port. This essentially means that the VLAN 10 STP instance on the Cisco switch will converge with the IEEE STP instance on the non-Cisco switch.

In summary, we can narrow this behaviour down to the following:
  • On trunk ports, Cisco sends out IEEE BPDUs (untagged, of course) corresponding to its VLAN1 PVSTP+ STP instance REGARDLESS of the native VLAN configured on the trunk.
  • On access ports, Cisco switches will always send out IEEE BPDUs (untagged) corresponding to the VLAN configured on that port.

Is that all? Well, not really. There are other cases that Cisco had to sort out, specifically the following.
  • What happens if the Cisco port is connected to a shared medium that contains a mix of Cisco and non-Cisco switches, and you want all Cisco switches within that shared segment to run PVSTP+ while the Cisco and non-Cisco switches interoperate at the same time?
    • Well, we already discussed how the Cisco and non-Cisco switches will sort things out. The Cisco switch simply sends out untagged IEEE BPDUs corresponding to VLAN1. So no problems there.
    • The issue is that, if the segment is also shared with other Cisco switches that run PVSTP+, those switches need to receive PVSTP+ BPDUs pertaining to each VLAN instance so they can converge per VLAN instance. So the solution is that you simply send PVSTP+ BPDUs out of that port, tagged with the relevant VLAN ID.
    • What about the native VLAN? Are we tagging PVSTP+ BPDUs with the native VLAN ID as well? NO. PVSTP+ BPDUs pertaining to the native VLAN on that port will be sent out without the dot1q tag.
  • How about VLAN1? We are already sending IEEE BPDUs out of the port. Does this mean that the port also sends out PVSTP+ BPDUs in addition to that?
    • YES, it will send out PVSTP+ BPDUs pertaining to the VLAN1 instance as well, tagged or untagged depending on the trunk's native VLAN configuration.

Let's consider different port configurations and see what types of STP frames we can expect in each case.

Access port BPDU generation:

  Access VLAN 100:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 100
  Access VLAN 200:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 200
  Access VLAN 1:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 1

Trunk port BPDU generation:

  Native VLAN 1, Allowed VLAN 1,100,200:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1
    • TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100
    • TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200
  Native VLAN 100, Allowed VLAN 1,100,200:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • TAGGED VLAN 1 : PVSTP+ BPDU corresponding to VLAN 1
    • UNTAGGED : PVSTP+ BPDU corresponding to VLAN 100
    • TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200
  Native VLAN 200, Allowed VLAN 1,100,200:
    • UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • TAGGED VLAN 1 : PVSTP+ BPDU corresponding to VLAN 1
    • TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100
    • UNTAGGED : PVSTP+ BPDU corresponding to VLAN 200
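
The listings above follow a simple rule, and it can be handy to see it written down as code. Below is a minimal Python sketch that regenerates them. The function names and the tuple layout (tag, BPDU type, VLAN) are my own for illustration, and the sketch assumes VLAN 1 is always carried on the trunk, as in every example above.

```python
from typing import List, Optional, Tuple

# (dot1q tag or None for untagged, BPDU type, VLAN instance it belongs to)
Bpdu = Tuple[Optional[int], str, int]


def access_port_bpdus(access_vlan: int) -> List[Bpdu]:
    """Access ports only ever send untagged IEEE BPDUs for their own VLAN."""
    return [(None, "IEEE", access_vlan)]


def trunk_port_bpdus(native_vlan: int, allowed_vlans: List[int]) -> List[Bpdu]:
    """Trunk ports send one IEEE BPDU for VLAN 1 plus one PVSTP+ BPDU per VLAN."""
    # Assumption: VLAN 1 is allowed on the trunk, so its IEEE BPDU is always sent.
    bpdus: List[Bpdu] = [(None, "IEEE", 1)]
    for vlan in sorted(allowed_vlans):
        # The native VLAN's PVSTP+ BPDU goes out untagged, all others tagged.
        tag = None if vlan == native_vlan else vlan
        bpdus.append((tag, "PVSTP+", vlan))
    return bpdus


for tag, kind, vlan in trunk_port_bpdus(100, [1, 100, 200]):
    label = "UNTAGGED" if tag is None else f"TAGGED VLAN {tag}"
    print(f"{label} : {kind} BPDU corresponding to VLAN {vlan}")
```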

More cases that Cisco had to deal with.

Cisco hates native VLAN mismatches for obvious reasons. But as you can see, this whole business of untagging BPDUs is a recipe for a native VLAN mismatch. For example, what would happen if you have two switches (Switch-1 and Switch-2) connected with a trunk, where Switch-1 has native VLAN 100 configured on its trunk port and Switch-2 has native VLAN 200?

In this case, from Switch-2's perspective, it receives UNTAGGED PVSTP+ BPDUs, which will be put in to VLAN 200 and processed within the VLAN 200 STP instance. This is what the native VLAN configuration does to untagged frames. But if you think about it, these UNTAGGED PVSTP+ BPDUs were sent by Switch-1, and it was only untagging VLAN 100 BPDUs. So we have officially created a native VLAN mismatch.

So Cisco was wondering.. can we somehow indicate the original VLAN information inside the UNTAGGED PVSTP+ BPDUs? And a Cisco employee who was sitting in the corner of the room said "Hey, why don't we just introduce a new TLV record in to the PVSTP+ BPDU and simply indicate this info in there.." and everyone was like.. "Dude, can we do that?" And the Cisco employee was like "Well, this is our protocol.. we can do anything with it, right..?" and everyone was like... "Man, you are a genius!!"

So Cisco decided to introduce a special TLV record to indicate the original VLAN information inside the PVSTP+ BPDUs for all VLAN instances including the Native VLAN.

For example,

Trunk port, Native VLAN 1, Allowed VLAN 1,100,200:
  • UNTAGGED : IEEE BPDUs corresponding to VLAN 1
  • UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
  • TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
  • TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
Trunk port, Native VLAN 100, Allowed VLAN 1,100,200:
  • UNTAGGED : IEEE BPDUs corresponding to VLAN 1
  • TAGGED VLAN 1 : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
  • UNTAGGED : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
  • TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200

Well, that resolves the problem, right? Now the receiving switch can check whether the VLAN it would place the BPDU in (the tag, or the native VLAN configured on its port for untagged BPDUs) actually matches the VLAN indicated in the TLV. So the switch knows for sure that it is not mixing VLANs. If a switch finds a discrepancy, it puts the port into an inconsistent state and blocks it for the affected VLANs. (By the way, CDP can also detect native VLAN mismatches using its own mechanism.)
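
The check itself is tiny. Here is a minimal Python sketch of it, assuming the receiving switch only needs three inputs: the tag on the incoming PVSTP+ BPDU (None if untagged), the VLAN carried in the TLV, and the native VLAN of the receiving port. The function name is hypothetical.

```python
from typing import Optional


def pvstp_bpdu_is_consistent(tag: Optional[int], tlv_vlan: int, native_vlan: int) -> bool:
    """Compare the VLAN a BPDU lands in with the VLAN it was originated for."""
    # An untagged frame is placed in the receiving port's native VLAN.
    effective_vlan = native_vlan if tag is None else tag
    return effective_vlan == tlv_vlan


# Switch-1 (native VLAN 100) sends its VLAN 100 BPDU untagged;
# Switch-2 (native VLAN 200) would place it in VLAN 200.
print(pvstp_bpdu_is_consistent(tag=None, tlv_vlan=100, native_vlan=200))
# Same BPDU received on a port whose native VLAN is also 100.
print(pvstp_bpdu_is_consistent(tag=None, tlv_vlan=100, native_vlan=100))
```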

Before moving forward, we need to clarify one thing. As you can see, between two Cisco switches both types of BPDUs are exchanged for VLAN1. So which one gets priority? Always the IEEE BPDUs. The VLAN1 PVSTP+ BPDUs are there to help the switches identify any misconfiguration in the transit path using the TLV field.

More Cases To Understand

What happens if two Cisco switches are connected through a non-Cisco switch and both inter-switch links are configured as trunks?
As for the non-Cisco switch, everything is sorted. It only understands IEEE BPDUs, and both Cisco switches connected to it send out IEEE BPDUs (corresponding to their VLAN1 instance). The non-Cisco switch in turn sends IEEE BPDUs (NOT attached to any VLAN, by the way) towards the Cisco switches, where they get processed as part of the VLAN1 PVSTP+ instance on each Cisco switch. This behavior doesn't depend on configuration changes made on any of the three switches or their ports. So we are all good as far as the non-Cisco switch goes.

How about the Cisco Switches? Well let's see how it would behave in different scenarios. 
  • Switch-1 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
  • Switch-2 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
What Switch-1 sends out -> What Switch-2 receives
  • Sent: UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • Received: UNTAGGED : IEEE BPDUs (originated by the non-Cisco switch) - processed against the VLAN1 PVSTP+ instance
  • Sent: UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
    • Received: UNTAGGED : PVSTP+ BPDU, TLV -> Original VLAN 1 (not processed further, informational only)
  • Sent: TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
    • Received: TAGGED VLAN 100 : PVSTP+ BPDU, TLV -> Original VLAN 100 - processed against the VLAN100 PVSTP+ instance
  • Sent: TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 200 - processed against the VLAN200 PVSTP+ instance


As you can see, we don't have any issues here. Everything is working as normal. No VLAN mismatch. Life is beautiful.. :)

OK, let's say that on the non-Cisco switch, the port connecting to Switch-1 has been configured so that it tags untagged traffic as VLAN 200 (equivalent to a "native VLAN 200" command). Let's see how this is processed by Cisco Switch-2.

What Switch-1 sends out -> What Switch-2 receives
  • Sent: UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • Received: UNTAGGED : IEEE BPDUs (originated by the non-Cisco switch) - processed against the VLAN1 PVSTP+ instance
  • Sent: UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 1 - (200 != 1) VLAN inconsistent port
  • Sent: TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
    • Received: TAGGED VLAN 100 : PVSTP+ BPDU, TLV -> Original VLAN 100 - processed against the VLAN100 PVSTP+ instance
  • Sent: TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 200 - processed against the VLAN200 PVSTP+ instance


As you can see here, the non-Cisco switch will simply tag the untagged PVSTP+ BPDUs as VLAN 200 (to the non-Cisco switch, this is just random multicast traffic) and send this traffic out as VLAN 200 TAGGED traffic towards Cisco Switch-2. Before processing the BPDU within the VLAN 200 STP instance, Cisco Switch-2 will check whether the VLAN tag (200) matches the VLAN indicated in the TLV field (1). In this case it doesn't, so it will put that port in to the VLAN inconsistent state.

Ok, next we reconfigure Switch-2 so that it has native VLAN 100, and remove the "tag untagged traffic as VLAN 200" configuration from the non-Cisco switch.

  • Switch-1 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
  • Switch-2 Trunk port configuration: Native VLAN 100, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL
What Switch-1 sends out -> What Switch-2 receives
  • Sent: UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • Received: UNTAGGED : IEEE BPDUs (originated by the non-Cisco switch) - processed against the VLAN1 PVSTP+ instance
  • Sent: UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
    • Received: UNTAGGED : PVSTP+ BPDU, TLV -> Original VLAN 1 - placed in VLAN 100 (Switch-2's native VLAN) as it enters the port: native VLAN mismatch
  • Sent: TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
    • Received: TAGGED VLAN 100 : PVSTP+ BPDU, TLV -> Original VLAN 100 - processed against the VLAN100 PVSTP+ instance
  • Sent: TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 200 - processed against the VLAN200 PVSTP+ instance

As you can see, since the native VLAN on Switch-2 is configured as 100, any untagged frame coming in to the switch will be placed in VLAN 100 before being processed (except for the IEEE BPDUs, which always get processed against VLAN 1 regardless of the native VLAN configuration on the port, as discussed earlier). But in this case, if the switch processed the incoming untagged PVSTP+ BPDU against the VLAN 100 STP instance, that would be wrong, since the BPDU was actually originated on VLAN1 on Switch-1. So with the help of the TLV field, the switch can now determine whether there's a discrepancy in the VLAN configuration, and if one is found, the port will go in to the native VLAN mismatch ("Inconsistent Peer VLAN ID") state.

Let's take another scenario. This time, the native VLAN is changed on Switch-1 to be VLAN 200 (nothing fancy done on the non-Cisco switch).

  • Switch-1 Trunk port configuration: Native VLAN 200, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
  • Switch-2 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL

What Switch-1 sends out -> What Switch-2 receives
  • Sent: UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • Received: UNTAGGED : IEEE BPDUs (originated by the non-Cisco switch) - processed against the VLAN1 PVSTP+ instance
  • Sent: TAGGED VLAN 1 : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
    • Received: TAGGED VLAN 1 : PVSTP+ BPDU, TLV -> Original VLAN 1 (not processed further, informational only)
  • Sent: TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
    • Received: TAGGED VLAN 100 : PVSTP+ BPDU, TLV -> Original VLAN 100 - processed against the VLAN100 PVSTP+ instance
  • Sent: UNTAGGED : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
    • Received: UNTAGGED : PVSTP+ BPDU, TLV -> Original VLAN 200 - placed in VLAN 1 as it enters the port: native VLAN mismatch


In this case, Switch-2 will receive untagged PVSTP+ BPDUs that would be put in to VLAN1. But with the help of the TLV field, it can figure out that the BPDU originated from VLAN 200. So the switch knows that there is a native VLAN mismatch in the transit path.

Ok, one last scenario. This one is a bit fancy. Consider 2 Cisco switches connected through two non-Cisco switches as follows.

Cisco Switch-1 <--> Non-Cisco Switch-1 (Native VLAN 100) <--> (Native VLAN 200) Non-Cisco Switch-2 <--> Cisco Switch-2

As you can see, what we are doing here is basically swapping VLAN 100 with 200 as traffic traverses the non-Cisco switches. "Native VLAN 100" means that the switch will untag VLAN 100 frames as they get sent out of that port, and will tag untagged frames arriving at the port with VLAN 100 before they enter the switch. We also have the Cisco switches configured with proper native VLAN assignments (no discrepancies in the configuration).

  • Switch-1 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
  • Switch-2 Trunk port configuration: Native VLAN 1, Active VLAN 1,100,200 , Allowed VLANs on the Trunk: ALL 
Ok let's analyze behavior here..

What Switch-1 sends out -> What Switch-2 receives
  • Sent: UNTAGGED : IEEE BPDUs corresponding to VLAN 1
    • Received: UNTAGGED : IEEE BPDUs (originated by the non-Cisco switch) - processed against the VLAN1 PVSTP+ instance
  • Sent: UNTAGGED : PVSTP+ BPDU corresponding to VLAN 1, TLV -> Original VLAN 1
    • Received: UNTAGGED : PVSTP+ BPDU, TLV -> Original VLAN 1 (not processed further, informational only)
  • Sent: TAGGED VLAN 100 : PVSTP+ BPDU corresponding to VLAN 100, TLV -> Original VLAN 100
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 100 - tag (200) doesn't match the TLV field VLAN (100) - VLAN inconsistent port
  • Sent: TAGGED VLAN 200 : PVSTP+ BPDU corresponding to VLAN 200, TLV -> Original VLAN 200
    • Received: TAGGED VLAN 200 : PVSTP+ BPDU, TLV -> Original VLAN 200 - processed against the VLAN200 PVSTP+ instance

As you can see, the non-Cisco switches will now effectively turn VLAN 100 into VLAN 200 in transit. But they CAN'T and WON'T change the TLV field (because switches, by definition, don't change stuff inside the payload/BPDU). So whatever changes are made in transit can be detected by comparing the incoming tag with the TLV. So in this case, Cisco Switch-2 will put the port in to the VLAN inconsistent state and block it for the affected VLAN.
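
This last scenario can be traced hop by hop with a few lines of Python. The sketch below models each non-Cisco switch as nothing more than an ingress native VLAN and an egress native VLAN. It assumes the ports facing the Cisco switches use native VLAN 1, which the scenario above doesn't state explicitly, and the function and variable names are hypothetical.

```python
from typing import Optional


def transit(tag: Optional[int], ingress_native: int, egress_native: int) -> Optional[int]:
    """Return the tag a frame carries after crossing one transit switch."""
    # Untagged frames join the ingress port's native VLAN.
    vlan = ingress_native if tag is None else tag
    # Frames in the egress port's native VLAN leave untagged.
    return None if vlan == egress_native else vlan


# PVSTP+ BPDUs sent by Cisco Switch-1 (native VLAN 1): (tag, TLV VLAN)
sent = [(None, 1), (100, 100), (200, 200)]

for tag, tlv_vlan in sent:
    # Non-Cisco Switch-1: Cisco-facing port native 1 (assumed), inter-switch port native 100.
    hop1 = transit(tag, ingress_native=1, egress_native=100)
    # Non-Cisco Switch-2: inter-switch port native 200, Cisco-facing port native 1 (assumed).
    hop2 = transit(hop1, ingress_native=200, egress_native=1)
    # Cisco Switch-2 (native VLAN 1) compares where the BPDU lands with the TLV.
    landed_in = 1 if hop2 is None else hop2
    verdict = "consistent" if landed_in == tlv_vlan else "VLAN inconsistent"
    print(f"TLV VLAN {tlv_vlan}: arrives in VLAN {landed_in} -> {verdict}")
```

Only the BPDU originated for VLAN 100 ends up in the wrong VLAN, which matches the listing above.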

IEEE STP & PVSTP+ Convergence

Now that we understand how the BPDUs work at a much deeper level, let's have a look at the big picture and see how all these things work together to build a loop-free topology in each area (Cisco and non-Cisco) of the switching infrastructure.

Let's consider the following switching arrangement, where you have 4 non-Cisco switches connected to two sets of Cisco switches as shown below.







So I guess we covered most cases, if not all. It's easier to understand why the protocols are implemented the way they are than to memorize random facts about them. That way you can derive the result of any scenario that you come across.

If you find any missing facts or any other interesting scenarios, please note them in the comment section.. that would be beneficial for everyone :)