Per-VLAN Spanning Tree Plus (PVST+)

By Denis Borchev | July 3, 2015

The Spanning Tree Protocol is one of the oldest ways to ensure a loop-free topology in an Ethernet network. Why we need it, where we need it and how it works are the subject of today’s recap post.

PVST+ and Spanning Tree Protocol

[Figure: STP mindmap]

Motivation for the Spanning Tree Protocol

Ethernet was designed as a common bus, without any built-in means of loop prevention. There’s just no “TTL” field to kill the frame if it goes through too many bridges, because there were no bridges around when the protocol was created.

It is only a matter of time until a looping event occurs in any Ethernet network so the key design factor is to limit the impact. The bigger the failure domain, the more impact to the campus network or data centre.

Source <https://etherealmind.com/network-dictionary-blast-radius/>

As Ethernet grew in popularity, so did the size of the networks built on this technology. Naturally, engineers wanted to build some reliability into the networks, and a simple way to do it was to add redundant links. That’s where loops come from.

A loop in a large Ethernet domain (a VLAN spanning the whole of the enterprise campus network is, sadly, a common example) causes frame replication. Bridges (and switches) stop known unicast frames from looping, but they have no way to prevent a flooded frame (unknown unicast, broadcast or multicast) from being looped. Such a frame loops forever or until the network is effectively dead, whichever comes first.

A clever idea came to one engineer’s head: force a naturally loop-free tree topology out of whatever physical topology the network is built on.

STP does not distribute topology information – no bridge in the domain is aware of the whole topology, not even the Root. The STP control plane knows less about the topology than, say, RIPv1. RIP at the very least communicates costs (hop counts) to reachable destinations. STP communicates only the cost to the Root. An STP Root is not a destination, it’s a hop, and its position in the network is arbitrary (from the protocol’s standpoint).

STP high level operation

Spanning Tree building is based on two key concepts:

  • A tree must have a root
  • each leaf must select one shortest path up (or down…) to the root

PVST+ differs from basic STP in only one major thing: it allows different VLANs of the domain to have their own independent trees, each with its own Root and costs.

From these stem the port roles of STP:

  • Designated port: on each segment, the port of the switch that is closest to the Root (the downlink from the Root); consequently, all of the Root’s own ports are Designated
  • Root port: on any non-root bridge, the port that is closest to the Root (the uplink to the Root)

A newer variation of STP, Rapid STP, brought in two more roles:

  • alternate – a “feasible successor” for the root port
  • backup – a “feasible successor” for the designated port

 

Any path which differs from the “lowest cost to the Root” is blocked. That is, Learning and Forwarding (two functions of an Ethernet bridge) through it are prevented, thus breaking possible loops.
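To make the “keep the lowest-cost path, block the rest” idea concrete, here is a minimal Python sketch. It is not how STP actually runs: real bridges work this out in a distributed way from BPDUs, while this sketch computes lowest-cost paths centrally and ignores tie-breakers. The function name forwarding_links and the four-switch square are made up for illustration.

```python
import heapq


def forwarding_links(root, links):
    """Return the links that stay forwarding; every other link is blocked.

    links maps a (bridge, bridge) pair to the cost of that link.
    """
    neighbors = {}
    for (a, b), cost in links.items():
        neighbors.setdefault(a, []).append((b, cost))
        neighbors.setdefault(b, []).append((a, cost))

    best_cost = {root: 0}   # lowest known cost to the Root, per bridge
    uplink = {}             # bridge -> neighbor on its best path to the Root
    queue = [(0, root)]
    while queue:
        cost, bridge = heapq.heappop(queue)
        if cost > best_cost[bridge]:
            continue        # stale queue entry
        for peer, link_cost in neighbors[bridge]:
            new_cost = cost + link_cost
            if new_cost < best_cost.get(peer, float("inf")):
                best_cost[peer] = new_cost
                uplink[peer] = bridge
                heapq.heappush(queue, (new_cost, peer))
    return {frozenset(pair) for pair in uplink.items()}


# Hypothetical square of four switches; one link is slower (cost 19).
links = {
    ("Arachnid", "Betazoid"): 4,
    ("Arachnid", "Cranberry"): 4,
    ("Betazoid", "Dragonfly"): 4,
    ("Cranberry", "Dragonfly"): 19,
}
forwarding = forwarding_links("Arachnid", links)
blocked = [pair for pair in links if frozenset(pair) not in forwarding]
print(blocked)  # [('Cranberry', 'Dragonfly')]
```

Four switches end up with three forwarding links, which is exactly what a tree needs; the fourth link is the one that would have closed the loop.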

STP Root Bridge Election

Spanning-Tree Root Bridge election is based on two main things:

  • Bridge Priority
  • the bridge’s MAC address

In PVST+, the 16-bit Bridge Priority field is subdivided into a 4-bit Priority and a 12-bit Extended System ID, which carries the VLAN ID. Together with the MAC address, these form the Bridge ID. Because we need a unique Bridge ID for each VLAN, there were two solutions: use a separate MAC address for each tree (i.e. VLAN), or split the Priority field as described. The split is clever, because the other option means we need as many MAC addresses on each bridge as we’d like to have STP instances.
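A small Python sketch may help to see how one MAC address yields a distinct Bridge ID per VLAN. It assumes the usual layout of a 4-bit priority, a 12-bit VLAN ID and a 48-bit MAC address; the function name bridge_id and the MAC address are made up for illustration.

```python
def bridge_id(priority: int, vlan_id: int, mac: str) -> int:
    """Pack priority, VLAN ID and MAC address into one 64-bit Bridge ID."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    mac_value = int(mac.replace(":", "").replace(".", ""), 16)
    # The priority is a multiple of 4096, so it only uses the top 4 bits of
    # the 16-bit field; the VLAN ID fills the remaining 12 bits.
    return ((priority + vlan_id) << 48) | mac_value


# Same switch, two VLANs: two distinct Bridge IDs from a single MAC address.
vlan10 = bridge_id(32768, 10, "00:11:22:33:44:55")
vlan20 = bridge_id(32768, 20, "00:11:22:33:44:55")
print(hex(vlan10))      # 0x800a001122334455
print(vlan10 < vlan20)  # True
```

Because the priority sits in the most significant bits, comparing two Bridge IDs as plain integers compares priority first and MAC address last.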

Note: PVST used ISL, PVST+ uses dot1q.

The selection is based on a simple rule: the lower, the better. The bridge with the lowest priority wins; if the priorities are equal, the lowest MAC address wins.

[Figure: STP Root election based on Bridge Priority]

In this simplistic example, Cranberry will win the election.

N.B.: As a consequence of the MAC address tie-breaker, older switches, which tend to have lower MAC addresses, will win the election if the priorities are left at their default values.

[Figure: STP Root election based on MAC address]

In this illustration, the Arachnid switch will be the Root. (And with all other things being equal, Dragonfly will block its link to Cranberry to break the loop).

Side note: this arrangement is not very natural, because switches often have ten MAC addresses each, making several consecutive Bridge IDs in one network unlikely.

In a little more detail, the election happens this way:

  1. A bridge initializes and starts sending STP BPDUs out of all its ports, claiming itself to be the Root.
  2. If the bridge receives a BPDU with a Root ID lower than its own, it stops advertising itself as the Root and starts sending out BPDUs with that lower Root ID as the Root.

Eventually, all bridges in the domain must agree on the Root bridge ID.

“Eventually” here means the convergence time of the STP domain, which depends on the STP timers (discussed further below) and the domain diameter.

It is highly recommended to design an STP domain so that it does not span more than 7 bridges, because of the high propagation delay (i.e. high convergence time) and a minor possibility of temporary loops (discussed at length in some networking books).
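The election and the effect of the domain diameter can be sketched as a toy simulation in Python. This is a simplification under loud assumptions: BPDUs are exchanged in synchronous rounds, timers are ignored, and a Bridge ID is modelled as a (priority, MAC) tuple with made-up values. The function name elect_root is hypothetical.

```python
def elect_root(bridge_ids, links):
    """Simulate the Root election; return (beliefs, rounds needed).

    bridge_ids: dict of bridge name -> Bridge ID (lower is better)
    links:      list of (name, name) pairs, one per physical link
    """
    # Step 1: every bridge starts out claiming to be the Root itself.
    believed_root = dict(bridge_ids)
    rounds = 0
    while True:
        updated = dict(believed_root)
        for a, b in links:
            # Step 2: a bridge that hears a lower Root ID adopts it.
            best = min(believed_root[a], believed_root[b])
            updated[a] = min(updated[a], best)
            updated[b] = min(updated[b], best)
        if updated == believed_root:
            return believed_root, rounds
        believed_root = updated
        rounds += 1


# Four switches in a chain, equal priorities, Arachnid has the lowest MAC.
ids = {
    "Arachnid": (32768, 0x0A),
    "Betazoid": (32768, 0x0B),
    "Cranberry": (32768, 0x0C),
    "Dragonfly": (32768, 0x0D),
}
chain = [("Arachnid", "Betazoid"), ("Betazoid", "Cranberry"),
         ("Cranberry", "Dragonfly")]
beliefs, rounds = elect_root(ids, chain)
print(set(beliefs.values()) == {ids["Arachnid"]}, rounds)  # True 3
```

The number of rounds equals the number of hops from the Root to the farthest bridge, which is why a wider domain takes longer to agree.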

To influence Root election, there are several configuration commands:

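To make a switch the Root (Cisco IOS, global configuration mode):

spanning-tree vlan vlan-id root primary

– this sets the priority to 24576 (two steps below the default of 32768) or, if that would not be low enough to win the election, to 4096 less than the current Root’s priority.

To make a switch a backup for the Root:

spanning-tree vlan vlan-id root secondary

– this sets the priority to 28672, one step above the 24576 that root primary normally sets.

To set priority manually:

spanning-tree vlan vlan-id priority priority

For the priority number, the range is 0 to 61440 in increments of 4096 (the lower 12 bits of the field are taken by the VLAN ID). Valid priority values are 0, 4096, 8192, 12288, 16384, 20480, 24576, 28672, 32768, 36864, 40960, 45056, 49152, 53248, 57344, and 61440. All other values are rejected.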

 

STP Path Selection with Port Cost

After the Root is selected, each bridge, independently, chooses the least-cost path to the Root bridge.

This is called the “root port selection”, and it involves a little more than just Cost. The candidates are compared on, in order:

  • lowest Root Path Cost
  • lowest neighbor Bridge ID
  • lowest Port ID (the neighbor’s first, then the local one as a final tie-breaker)

The Root Path Cost is the sum of the port costs along the path. Each port cost is derived from the interface speed, as per the IEEE recommendation.

[Figure: STP path selection based on port cost]

Here, one link is Gigabit Ethernet, and the other one is FastEthernet (not so fast these days). A FastEthernet link has an STP cost of 19, and a Gigabit link has a cost of 4. Thus, between the Arachnid and Betazoid switches, the Gigabit link will be forwarding, and the FastEthernet link will be blocked.
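The whole comparison chain fits naturally into a tuple comparison. The Python sketch below is an illustration under stated assumptions: the Candidate class, the interface names Fa0/1 and Gi0/1, and all Bridge ID and Port ID values are made up; only the order of the tie-breakers and the costs of 19 and 4 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """What a bridge knows about one of its ports that hears the Root."""
    local_port: str
    root_path_cost: int      # cost advertised by the neighbor + local port cost
    neighbor_bridge_id: int
    neighbor_port_id: int
    local_port_id: int


def select_root_port(candidates):
    """Pick the root port: the lowest value wins at every tie-breaker."""
    return min(
        candidates,
        key=lambda c: (c.root_path_cost, c.neighbor_bridge_id,
                       c.neighbor_port_id, c.local_port_id),
    )


# Betazoid hears the Root (Arachnid) directly on two links of different speed.
ARACHNID = 0x8001_0000_0000_000A   # hypothetical Bridge ID
candidates = [
    Candidate("Fa0/1", 0 + 19, ARACHNID, 0x8001, 0x8001),
    Candidate("Gi0/1", 0 + 4, ARACHNID, 0x8019, 0x8019),
]
print(select_root_port(candidates).local_port)  # Gi0/1
```

The FastEthernet port has the lower Port ID here, but that never comes into play, because the Root Path Cost is compared first.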

A good and concise blog post on STP port costs: https://packetlife.net/blog/2008/sep/5/spanning-tree-port-costs/

To influence the port cost, we can use this under the interface configuration (Cisco IOS):

spanning-tree [vlan vlan-id] cost cost

The cost value can range from 1 to 200000000.

 

STP Path Selection with Port Priority

Port Priority, as part of the Port ID, is the next step after the path cost and the neighbor Bridge ID have been considered.

Port ID consists of the Port Priority and the internal port number. Again, the lower the better; thus, given the same port priority, the port with the lowest number will be chosen as the root port.
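As a quick illustration, the Port ID can be modelled in Python as the priority in the high byte and the port number in the low byte. The 8-bit/8-bit split is an assumption that matches a 0 to 255 priority range (newer implementations split the field differently), and the function name port_id is made up.

```python
def port_id(priority: int, port_number: int) -> int:
    """Combine Port Priority (high byte) and port number (low byte)."""
    if not 0 <= priority <= 255:
        raise ValueError("port priority must be in 0..255")
    if not 1 <= port_number <= 255:
        raise ValueError("port number must be in 1..255 in this sketch")
    return (priority << 8) | port_number


# Default priority on both ports: the lower port number wins.
print(port_id(128, 1) < port_id(128, 2))  # True
# Lowering the priority of port 2 makes it preferred over port 1.
print(port_id(64, 2) < port_id(128, 1))   # True
```

Since the priority occupies the more significant bits, any change of priority outweighs the port number.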

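We can influence it under the interface configuration (Cisco IOS), per VLAN on a trunk port:

spanning-tree vlan vlan-id port-priority priority

or, for access ports:

spanning-tree port-priority priority

For priority, the range is 0 to 255; the default is 128.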

N.B.: There are different topologies where this selection makes sense (all other things being equal):

  • [Figure: STP Port ID selection, just two switches] The lowest port number is 1, on the neighbor, thus it is the one selected.
  • [Figure: STP Port ID, two switches and a hub] In the second topology, the Arachnid switch will block its port #2 to break the loop.
  • [Figure: STP Port ID, two switches and a hub, reversed] In the third topology, the Betazoid switch will block its port #2. It sees Arachnid’s BPDUs coming from two ports at once, and the port ID of Arachnid is the same in both instances.

STP Convergence Timers

Several timers determine STP convergence:

source: Understanding and Tuning Spanning Tree Protocol Timers

hello—The hello time is the time between each bridge protocol data unit (BPDU) that is sent on a port. This time is equal to 2 seconds (sec) by default, but you can tune the time to be between 1 and 10 sec.

forward delay—The forward delay is the time that is spent in the listening and learning state. This time is equal to 15 sec by default, but you can tune the time to be between 4 and 30 sec.

max age—The max age timer controls the maximum length of time that passes before a bridge port saves its configuration BPDU information. This time is 20 sec by default, but you can tune the time to be between 6 and 40 sec.

N.B.: Rapid Spanning Tree optimizes this a lot, and not only by tuning timers.

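In PVST+ operation, the timers (as well as many other parameters) can be tuned on a per-VLAN basis (Cisco IOS, global configuration mode):

spanning-tree vlan vlan-id hello-time seconds
spanning-tree vlan vlan-id forward-time seconds
spanning-tree vlan vlan-id max-age seconds

For hello-time seconds, the range is 1 to 10; the default is 2.

For forward-time seconds, the range is 4 to 30; the default is 15.

For max-age seconds, the range is 6 to 40; the default is 20.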
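To get a feeling for what these timers add up to, here is a rough Python sketch of the classic (non-rapid) STP arithmetic. The function names are made up, and the inequality in check_timers is the relationship between the timers as I recall it from IEEE 802.1D, so treat it as an assumption rather than a quote from the standard.

```python
def check_timers(hello=2, forward_delay=15, max_age=20):
    """Validate the configurable ranges and the assumed 802.1D relationship."""
    if not 1 <= hello <= 10:
        raise ValueError("hello time must be 1..10 s")
    if not 4 <= forward_delay <= 30:
        raise ValueError("forward delay must be 4..30 s")
    if not 6 <= max_age <= 40:
        raise ValueError("max age must be 6..40 s")
    # Assumed relationship: 2*(forward_delay - 1) >= max_age >= 2*(hello + 1)
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)


def convergence_after_indirect_failure(forward_delay=15, max_age=20):
    """Wait for the stored BPDU to age out, then listening plus learning."""
    return max_age + 2 * forward_delay


def convergence_after_direct_failure(forward_delay=15):
    """Link-down is seen at once, so only listening plus learning remain."""
    return 2 * forward_delay


print(check_timers())                        # True
print(convergence_after_indirect_failure())  # 50
print(convergence_after_direct_failure())    # 30
```

With the defaults, a port can take up to 50 seconds to start forwarding after a failure somewhere else in the domain, which is the main reason timers are tuned with care.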

Further reading on Spanning Tree and PVST+

Denis Borchev

Engineer at Netcube LLC
I am a networking engineer, a geek and a generally nice person=)
Computer Networking Engineer with some experience; MSc Applied CS, CCIE #53271