[Cisco Nuggets] troubleshoot routing

This Cisco video training course with Anthony Sequeira and Keith Barker covers how to troubleshoot routing and switching technologies, such as VLAN troubleshooting, solving EIGRP adjacency failures, routing redistribution, and more.

Related area of expertise:

Cisco networking level 2

Recommended equipment:

Two Cisco IOS-based Catalyst switches
Two IOS-based Cisco routers
Note: This equipment is recommended but not required. Students can also follow along with popular simulator products or emulators.

Related certifications:

CCENT
CCNA R&S
CCNP R&S
CCIE R&S

Related job functions:

Network Administrator
Network Engineer
Data Center Administrator
Data Center Engineer

Trainers Anthony Sequeira and Keith Barker guide learners through the “art” of troubleshooting Routing and Switching core technologies. Learn the theory and the practical steps for ensuring success in actual network troubleshooting, as well as success in all levels of professional Cisco certifications.

VIDEO TITLES AND DESCRIPTIONS

1. Course Introduction (4 min)

Are you ready for a very exciting and unique CBT Nuggets course taught by Anthony Sequeira AND Keith Barker? Join Anthony Sequeira as he guides you through the goals of this important course.

2. The Art of VLAN Troubleshooting (19 min)

In this video, Keith Barker and Anthony Sequeira review VLAN technologies with you, and then cover the art of troubleshooting these important Layer 2 constructs.

3. Got Trunk? (31 min)

What could possibly go wrong with a trunk? In this Nugget, Anthony and Keith walk you through the plethora of issues that could cause trunking to fail, including native VLAN errors, incompatible pairing, DTP-related issues, and more. Commands used in this video, along with a topology diagram, are available in the NuggetLab area.

4. Troubleshooting VTP (27 min)

In this Nugget, Anthony and Keith walk you through the step-by-step process for quickly identifying and resolving issues with the VLAN Trunking Protocol (VTP) from Cisco. While many network engineers refuse to use this protocol, should you decide to use it, this Nugget is key to helping you do so safely and effectively.

5. Surviving STP (41 min)

Join Keith Barker and Anthony Sequeira in this Nugget as they walk you through surviving the Spanning Tree Protocol at Layer 2 in your network infrastructure.

6. Multiple Spanning Tree (MST) (30 min)

In this Nugget, Anthony Sequeira guides you through the proper configuration, verification, and troubleshooting of this new Spanning Tree variant.

7. Don’t Fumble Your Bundle! (22 min)

EtherChannel is an awesome way to fool Spanning Tree Protocol and get more bandwidth between devices. It also adds redundancy in the design. This Nugget shows you how to build it right, and how to troubleshoot an existing install.

8. Foolproof Frame-Relay (34 min)

Students will often struggle when it comes to Frame Relay configuration and troubleshooting. One of the main reasons why is the tremendous number of options that exist for its implementation. This Nugget makes simple work of these many options and their troubleshooting.

9. Troubleshooting PPP (14 min)

PPP is simple, but where students can struggle is with the authentication methods of PAP and CHAP – this Nugget makes PPP troubleshooting a snap.

10. Solving EIGRP Adjacency Failures (24 min)

Do you have EIGRP speakers that are refusing to peer? After watching this CBT Nugget from Anthony Sequeira and Keith Barker, you’ll be able to solve this problem efficiently and completely! This is one of many targeted troubleshooting videos in this exciting series.

11. Where are My EIGRP Routes?!?! (28 min)

EIGRP neighbors seem up and happy, but when we go to view the routing table, we are missing routes! Where did they go? In this Nugget, Anthony and Keith explain what can cause the routes to not show up, and then walk you through the verification and remediation of these issues at the CLI. Commands used in this video, along with a topology diagram, are available in the NuggetLab area.

12. OSPF Refuses to Neighbor!!! (26 min)

In this Nugget, Keith Barker guides learners through the process of ensuring a FULL adjacency is formed in OSPF.

13. Where are My OSPF Routes?!?! (30 min)

OSPF is the most complex IGP by far. This makes missing route prefixes all the more tricky to track down. In this important Nugget, Keith Barker will guide you through making this troubleshooting adventure a fun and painless one.

14. Surviving RIP (17 min)

Many learners think that RIP must be an absolute breeze to troubleshoot since the protocol is so simple in its operation. Sadly, this is pretty far from the truth. Join Keith and Anthony in this Nugget that provides valuable troubleshooting guidance.

15. BGP Refuses to Neighbor!!! (27 min)

In this action-packed Nugget, CBT Nugget trainers Anthony Sequeira and Keith Barker guide you through the art of BGP neighbor troubleshooting.

16. Where are My BGP Routes?!?! (34 min)

In this Nugget, Keith and Anthony ensure that what seems complex — troubleshooting missing prefixes with BGP — is really simple. It all begins with an analysis of the BGP table. Don’t skip this Nugget!

17. Foolproof Policy-Based Routing (PBR) (23 min)

Want to route packets your own way instead of following the routing table? This Nugget is for you. This video details the implementation, verification, and troubleshooting of Policy-Based Routing (PBR).

18. Solving Generic Routing Encapsulation (GRE) (18 min)

Building GRE tunnels is a great way to move passenger protocols buried beneath GRE and transport them over otherwise incompatible transport networks. In this Nugget, Keith walks you through correcting common configuration and recursive routing issues surrounding Generic Routing Encapsulation (GRE). The topology diagram and packet capture are available in the NuggetLab area.

19. Routing Redistribution (25 min)

Redistribution can strike fear into the hearts of engineers everywhere. In this critical Nugget of the series, Anthony and Keith banish this fear forever. This is a Nugget not to be missed!

20. Troubleshooting in R&S Cert Exams (24 min)

In this conclusion to the course, Anthony guides students through the specific ways in which they encounter troubleshooting in the various R&S exams. Specific and important strategies are provided.

===========================

Course Introduction

00:00:00
Welcome to our introductory Nugget on Cisco routing and switching troubleshooting mastery. In this Nugget, I’m going to go ahead and set up this entire course for you. Let’s get started. This is indeed a very unique course for CBT Nuggets. And it all starts out very uniquely because it’s going to be a dual instructor course.

00:00:24
That’s right, you will hear from myself, Anthony Sequeira, and my dear friend Keith Barker throughout the Nuggets of this particular course. We’ll both be bringing our opinions, our expertise to the various content areas that we move through in this exciting event.

00:00:43
Now, something else that’s very unique about this course is the fact that it will focus on various levels of Cisco certification. As a matter of fact, within the first couple of moments of each Nugget, we will show you this little chart here. And we will literally tick off the boxes that will be dealt with in that particular Nugget.

00:01:05
So for example, if a topic deals with– oh, let’s say spanning tree protocol, and this is relevant from a CCNA to CCIE level, we will indeed indicate that. Now, when Keith and I were designing the course, we really said we wanted a good solid mix of both theory and practical applications in this course.

00:01:32
So that’s what we really, really strived to do. In every single Nugget, as a matter of fact, you’ll get a good dose of theory. You’ll also get a good dose of practical command-line examples of real-world troubleshooting scenarios and problems. We are addressing both the real-world things that we could encounter in a network.

00:01:58
And of course, we are going to cover certification-based and certification favorite areas of troubleshooting. Now, what is our overall content? Well, it’s core R and S. That’s right, in order to make this a manageable size and a course you could get through in the typical amount of CBT Nuggets course time, we did confine this to core routing and switching.

00:02:25
So what you’re going to see is us beginning with a wide variety of core layer 2 topics, things like VLANs; trunking; VLAN Trunking Protocol; Spanning Tree Protocol; EtherChannel technologies; from the wide area network perspective, frame relay; and the Point-to-Point Protocol.

00:02:50
Then we’ll move to core layer 3 protocols like RIP version 2, EIGRP, OSPF, and BGP. We’ll talk about generic routing encapsulation and policy-based routing. And then we’ll wrap it up with redistribution. So let me just take a moment and tell you what troubleshooting topics you might find in the CCIE R and S practical lab exam that we tabled for other troubleshooting series so that we could really give and dedicate time to them.

00:03:32
Things like network security, multicast, performance routing– these topics are certainly important in the scope of the CCIE routing and switching practical. But we really tabled them so that we could focus in this course on the core of routing and switching.

00:03:56
Another one just came to mind– MPLS. Obviously, the topic area of MPLS– this is something we could dedicate an entire CBT Nuggets course on. So Keith and I realized that we wanted to give amazing focus to the core layer 2, layer 3, and redistribution topics.

00:04:17
So that’s what you’re going to find in these 20 Nuggets that make up this particular course. So are you ready? I know Keith and I are ready. Please join us in the very first Nugget of true content where we take a look at the art of VLAN troubleshooting. I hope this has been informative for you.

The Art of VLAN Troubleshooting

00:00:00
On behalf of Keith Barker and myself, welcome to this CBT Nugget in our Core Routing and Switching Troubleshooting Series here at CBT Nuggets. In this Nugget, we’re going to take a look at the art of VLAN troubleshooting. It’s not necessarily a science, since there can be several approaches to this, so we’ll show you our tried and true methods of ensuring our layer 2 infrastructure is built properly with those VLANs.

00:00:27
This particular topic is going to be of interest to many of you, since it does indeed have relevance for every level of our Cisco certifications from CCENT all the way to CCIE. Let’s jump in. So as you know, we love to break these Troubleshooting Nuggets down into three main areas.

00:00:48
We’re going to go ahead and review VLAN technology in a world of Cisco devices, we’re going to go ahead and share with you the art of troubleshooting these particular VLANs, and then we’re going to bring in Keith Barker and demonstrate this at the command line.

00:01:06
We recall that VLANs create our layer 2 broadcast domains. That’s right. These are very important structures for segmenting our layer 2 infrastructure so that we don’t have an unnecessary proliferation of broadcast traffic. Broadcasts are generally frowned upon, and the last thing that we would want to do is dramatically increase the number of broadcasts devices need to hear because they’re not in a segmented virtual local area network environment.

00:01:42
When we are building our VLAN infrastructure, we need to be very careful to do it properly. As my dear friend Keith Barker always says, if layer 2 isn’t happy, layer 3’s going to be really, really angry. So we want to make sure we get this right at layer 2 so that we’ll have success for upper layer protocols.

00:02:06
Now when we say VLAN, we’re actually talking about two types. There’s the data VLAN and then there’s the voice VLAN. The voice VLAN is often known by the term the auxiliary VLAN. I don’t care what you call it, let’s just all agree that this is a VLAN separate and distinct from the data VLAN and it’s for our voice over IP traffic.

00:02:32
Having a voice VLAN is such a great idea, because when your organization decides to implement voice over IP technologies, they can drop in those devices, those voice over IP phones, and they don’t have to renumber their environment to accommodate them. If there’s a separate voice VLAN, they can easily drop those phones in, give them any particular IP addressing scheme that they wish, and not impact the addressing scheme that we use in the data environment.

00:03:04
Now, do you remember Pepsi One? It was a diet soft drink that would contain just one calorie per serving. Well in a Cisco world, when it comes to VLANs, we have an important VLAN 1. VLAN 1 is the default VLAN. This is automatically created for you by default on your Cisco devices, and every single port is a member of this particular VLAN 1. The default VLAN is its name, and it’s also the default for your trunk links’ native VLAN.

00:03:47
This is the one and only one VLAN in an 802.1Q environment that is not tagged with a VLAN identifier. Now you probably realize that VLAN 1’s usage is now frowned upon due to security reasons, so what we do is we get on a Cisco device and we begin moving the ports on that device off of VLAN 1– yes– and putting them into the VLANs where they’re going to be utilized.

00:04:19
What if you have a port that is not going to be utilized on your particular device? Well you’ll create a dummy VLAN, like VLAN 99, and you’ll put that port in that unused dummy VLAN. So the idea is to move everything off of this default VLAN 1 from a security perspective. Now as networks got bigger and bigger, the number of VLANs that administrators wanted to be able to potentially create grew, and this led to the concept of the extended range VLANs.

00:04:59
That’s right. If a thousand VLANs just doesn’t cut it for you, with the extended range VLANs, you can create even more, specifically VLANs from 1,006 to 4,094. Holy VLANs, Batman. Would we ever need a network with this many particular VLANs? I seriously doubt it.

00:05:24
You’ve probably designed something that is an atrocity. But one of the things that’s going to be very interesting for us when we are troubleshooting VLANs is to realize that when we go to create VLANs of this extended range, there are going to be particular requirements we need to make sure that are in place.

00:05:47
Now creating a VLAN is just amazingly simple. We go in global configuration mode, and we say VLAN, and then we go ahead and give the VLAN identifier, like VLAN 100. In fact, the operating system allows us to be really clever with this. You could say VLAN 100, and then you could say comma VLANs 110– or rather 111– through 115, and those would be created. You could go ahead and say comma 118 comma 121– you get the idea.
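
As a rough illustration of the comma-and-range shorthand just described, the following Python sketch expands a list such as 100,111-115,118,121 into individual VLAN IDs. The function name `expand_vlan_list` is hypothetical, and this only models the notation; it is not how the switch itself parses the command.

```python
def expand_vlan_list(spec: str) -> list[int]:
    """Expand a VLAN list such as "100,111-115,118,121" into sorted VLAN IDs.

    Hypothetical helper for illustration only; it mirrors the notation the
    transcript describes for the global "vlan" command.
    """
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as 111-115 covers both endpoints.
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"range is reversed: {part}")
            vlans.update(range(low, high + 1))
        else:
            vlans.add(int(part))
    for vlan_id in vlans:
        # Usable VLAN IDs run from 1 to 4094.
        if not 1 <= vlan_id <= 4094:
            raise ValueError(f"VLAN ID out of range: {vlan_id}")
    return sorted(vlans)


if __name__ == "__main__":
    print(expand_vlan_list("100,111-115,118,121"))
```

Run on the list from the example, the sketch yields VLAN 100, VLANs 111 through 115, VLAN 118, and VLAN 121.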

00:06:30
So we can create VLANs very efficiently in this manner from the command line. By the way, when you create a single VLAN, like VLAN 100, you will be in a VLAN configuration mode at that point, and you can go ahead and give that VLAN a name, like first floor west, and this is going to give you a meaningful identifier obviously for that particular VLAN, because by default, VLANs are going to get a name that just indicates their ID.

00:07:09
So VLAN 100 would get a default name of VLAN0100. Now once you’ve got your VLANs created, going to a particular interface, like fast ethernet 0 slash 10, and assigning that interface to the VLAN is simple. We tend to like to say switchport mode access and ensure that the port is an access port, and then say switchport access VLAN 100. Technically, all we need to do is the switchport access VLAN command and that particular port will participate in VLAN 100 if it is indeed an access port. And that’s the thing– we typically like to do the switchport mode access command to ensure that the port is truly an access port and is participating in VLAN 100. But what about that voice VLAN? Well, no problem.

00:08:15
We just issue switchport voice VLAN and give the voice VLAN identifier that we are going to use. Now the art of troubleshooting these layer 2 VLANs is to understand first of all some caveats. For instance, that extended range of VLANs that we mentioned– those are only going to be available if you’re in VTP transparent mode.

00:08:46
And it’s at this point I should probably do a little commercial for the VTP Troubleshooting Nugget of this series. Certainly check that if this sounds like a foreign language to you. So in order to do those extended range VLANs, we want to ensure that we’re in transparent mode.

00:09:04
By the way, you can’t create VLANs on your local device if you’re in VTP client mode. Yeah, so you’ll get an error message when you try and create a VLAN on your local device and you’re in that VTP client mode. Now the good news is the error message is pretty darn clear.

00:09:25
It is in plain English, and it tells you you’re in client mode. You’re trying to create a VLAN. Go to CBT Nuggets and get some training. So I don’t think it said that part, but anyways, you get the idea. You will not be able to create those VLANs in the client mode, because of course, the client mode is to obtain its VLANs from the server system that is on the network from a VTP perspective.
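
The two caveats above (extended-range VLANs need VTP transparent mode, and a VTP client cannot create VLANs locally) can be summarised in a small Python sketch. It assumes the VTP version 1 and 2 behaviour the transcript describes; other VTP versions and modes are deliberately not modelled, and the name `can_create_vlan` is hypothetical.

```python
# Extended-range VLAN IDs as given in the transcript: 1006 through 4094.
EXTENDED_RANGE = range(1006, 4095)


def can_create_vlan(vtp_mode: str, vlan_id: int) -> tuple[bool, str]:
    """Return (allowed, reason) for creating vlan_id locally in a VTP mode.

    Simplified model of the caveats in the transcript (assumes VTP v1/v2).
    """
    mode = vtp_mode.lower()
    if mode not in {"server", "client", "transparent"}:
        raise ValueError(f"unknown VTP mode: {vtp_mode}")
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    if mode == "client":
        # Clients learn their VLANs from the VTP server.
        return False, "VTP client mode cannot create VLANs locally"
    if vlan_id in EXTENDED_RANGE and mode != "transparent":
        return False, "extended-range VLANs need VTP transparent mode"
    return True, "ok"


if __name__ == "__main__":
    for mode, vlan in [("client", 100), ("server", 2000), ("transparent", 2000)]:
        print(mode, vlan, can_create_vlan(mode, vlan))
```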

00:09:55
I know what you’re thinking right now. You’re saying, Anthony, I don’t use VTP. VTP has too many security concerns for me. Hey look, that’s fine. So this particular issue is not going to exist for you if you are not utilizing VTP in any capacity in your environment, but we need to be aware of this especially when we are troubleshooting someone else’s environment.

00:10:23
Now speaking of troubleshooting someone else’s environment, we now really need to get to the point of what VLANs exist and what ports are participating in those VLANs. And I have great news for you. As Keith will demonstrate for us at the command line, there is a command that is worth its weight in gold.

00:10:50
It’s show VLAN brief. Why we love show VLAN brief is it is going to show us what VLANs exist on that particular switch, and it’s going to show us what interfaces are participating in that particular VLAN. So this is definitely a command that we want to have in our hip pocket.

00:11:20
Show VLAN brief– does the VLAN exist? And what particular interfaces are participating in that VLAN? Now should we need more detail about a particular interface and its VLAN configuration and participation, a command that we’re going to love is show interface and then you specify an interface like fast ethernet 0 slash 10, for instance, and then you say switchport.

00:11:54
We will get all the details– we’ll probably get even more details than we need about the VLAN configuration on that particular switchport. These commands are golden for us as we approach this art of troubleshooting our layer 2 VLANs. Well they say a picture’s worth a thousand words.

00:12:19
I edited that a little bit. Yeah, that phrase– I say Keith Barker’s demonstrations are worth a thousand words. Let’s turn it over to Keith right now as he demonstrates this art at the command line. Hey, thanks Anthony. So where do we begin? The very first thing we got to do is validate that there really is a problem.

00:12:40
So from this computer– this is the Windows computer. We’re pinging over to 10.0.0.55, the Linux box, and sure enough, it times out all four ping attempts. Also for the sake of this troubleshooting, let’s say we’ve also looked at the Linux box and the Windows box to validate that their IP addresses and their interfaces are all up and happy, happy.

00:12:59
So next, let’s take a look at the switch. I just want to validate that the ports, fa0/4 and fa0/5 are both in the same VLAN. It’s going to be really important for those two devices– the Windows and the Linux– to be able to ping across that VLAN. I want to make sure the ports are allocated correctly.

00:13:17
Show VLAN brief is the command we used, and I see fa0/5– that’s the port that the Linux box is connected to– but I do not see in this list fa0/4 anywhere. So 0/5 is in VLAN 100. So the question is, where, oh where has our little fa0/4 gone? Maybe it’s a trunk.

00:13:38
If it was operating as a trunk, it wouldn’t show up as an assigned access port. So we do a show interface trunk– oh my gosh, we have 0 trunks, so it’s not currently trunking. So in this video, we’re troubleshooting VLANs, but if we don’t have a port that’s associated with that VLAN, that is absolutely going to cause a failure of communicating across that VLAN.

00:13:58
So let’s do this. Let’s go take a closer look at our beautiful missing friend fa0/4, and we’ll do that with a show interface fa0/4. So here’s our command, show interface fa0/4. It gives us a whole bunch of information, including the status. So it says the interface is up, but the line protocol is down, and it says monitoring, which implies that this interface is a destination port for a span session, a monitoring session, and as a result, it’s down from a usability perspective.

00:14:31
So because it told us it was a monitoring port, let’s go ahead and do a show monitor just to validate what monitor sessions we have and what we’re looking at. And this says we have one monitor session labeled Number 1. We’re looking at all the traffic that’s going ingress on VLAN 300 and the destination port is fa0/4, which is why up here it’s showing as down monitoring.

00:14:54
So if we’re not using this monitor session, we can go ahead and remove it altogether, or if we were using it, we can allocate the destination of that traffic to some port other than the access port that our Windows 7 computer is trying to use. So let’s do this.

00:15:08
Let’s just go ahead and remove the monitoring session. We’ll go into configuration mode. We’ll say no monitor session Number 1, and that should relieve that port of being a monitor port. And now let’s go ahead and take a look at the status of that interface just to verify that it’s up and happy.

00:15:25
So there’s fa0/4 and 5. I’m using a slightly different tool to look at it this way. It shows that both of those interfaces are connected, they’re both associated with VLAN 100, and they both have negotiated full duplex and 100 megabits per second, which is great.

00:15:40
So let’s go back to our Windows computer and just try another ping and see if that now has corrected the problem so it’ll work. So we’ll go back to our Windows machine, I’ll use the up arrow key, and we’ll do a ping of 10.0.0.55 from the Windows over to the Linux.

00:15:56
And it says, reply from myself. Hey, that destination host is unreachable. So based on results, it is not working yet. We still have a VLAN problem. So what else could it be? Let’s see, the ping is still not working between the PC and the Linux box. Let’s take a closer look at VLAN 100 with a show VLAN ID 100. I’m putting the do command in front of there because I’m sitting in configuration mode.

00:16:22
If I was at privileged mode, it would just be show VLAN, the keyword ID, and then the VLAN that you want to look at the details for. Down here, it’s referring to some details regarding private VLANs, which because it’s empty, there are no private VLANs configured, so we don’t have to troubleshoot that.

00:16:36
And I also see for VLAN 100, the name is engineering, there’s port 4 and there’s port 5, which is great. But I notice a couple things that might be a little concerning. One, it says status of the VLAN is active, but all these ports it says are unsupported.

00:16:51
What do you mean they’re unsupported? They’re ethernet ports. And down here, regarding the media type, it says it’s fdnet, which sounds like something related to fddi, and we’re not dealing with fddi, we’re dealing with ethernet up here. So somebody’s been messing around with the media type for our ethernet switch, so we’re going to go ahead and fix that.

00:17:11
Because it says the ports are unsupported for that media type, that’s why that’s coming up and the media type is fdnet. We’re going to go back into VLAN configuration mode for VLAN 100, and we’re going to say media ethernet, and then let’s do a show VLAN ID 100 and see if we have a different result.

00:17:31
So now the media type is ethernet. That’s fantastic. So let’s go back to the PC and try another ping. So we’ll bring back our Windows computer, we’ll use the up arrow key, we’ll do a ping– 10.0.0.55– and– oh, nuts. Sometimes it takes a little bit for that ARP request to be resolved, but this is not happening.

00:17:50
So as that times out in the background, what else could it be? So let’s take a look. There’s a couple different ways to look at the VLAN information. We did a show VLAN ID. Let’s do a show VLAN brief, and that’s just yet a different way of looking at it. So this lshut portion– that doesn’t look good.

00:18:08
I mean, anything that is shut down is probably not going to be too effective for moving packets. So what this is the result of is that someone has shut down the actual VLAN. They went into VLAN configuration mode and said shutdown, so no traffic is moving on this VLAN.

00:18:24
So let’s fix that. Let’s go into configuration mode for that VLAN. We’ll do a no shutdown. We’ll do an exit. And then we could try several things. We can do a show VLAN brief and verify the status, or if we think that’s it, we could just validate with a ping.

00:18:38
If our ping works across that VLAN, we know that was the missing link. We could use the up arrow key and do a show VLAN brief just for grins, and this appears to be a lot happier. I don’t have that shut message regarding that VLAN. So let’s go ahead and bring back the PC in and try that ping one more time.

00:18:57
And there we have a happy, happy response from 10.0.0.55. Now our last step, of course, if we had all these issues, we’d want to make sure we saved our changes. Now in a production environment, what you and I would want to do is make sure that before we start making changes to this switch that we had a workable backup copy of the configuration so that if we need to restore it 100% to the way it was before we showed up, we can do exactly that.
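
The order of checks in this demonstration can be sketched as a small Python function. The dictionary keys (`span_destination`, `access_vlan`, `media`, `shutdown`) and the returned labels are hypothetical stand-ins for what the show commands reveal. They are not real IOS output fields, and the sketch covers only the faults met in this demonstration.

```python
from typing import Optional


def first_vlan_fault(port: dict, vlan: dict) -> Optional[str]:
    """Return a label for the first fault found, or None if all checks pass.

    Follows the order used in the demonstration: port role, port-to-VLAN
    assignment, VLAN media type, then VLAN shutdown state.
    """
    if port.get("span_destination"):
        # A SPAN destination port shows as down (monitoring).
        return "span-destination"
    if port.get("access_vlan") != vlan.get("id"):
        return "wrong-vlan"
    if vlan.get("media", "ethernet") != "ethernet":
        # In the demo the media type had been changed to fdnet.
        return "bad-media"
    if vlan.get("shutdown"):
        return "vlan-shutdown"
    return None


if __name__ == "__main__":
    # The state found at the start of the demo, as hypothetical data.
    vlan = {"id": 100, "media": "fdnet", "shutdown": True}
    port = {"access_vlan": 100, "span_destination": True}
    print(first_vlan_fault(port, vlan))
```

Each fix in the demonstration clears one check and exposes the next, which is why the ping only succeeds after the third correction.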

Got Trunk?

00:00:00
Trunk links are important Layer 2 constructs that allow us to carry the traffic of VLANs from one Cisco device to another. In this CBT Nugget, Keith Barker and myself will show you how to efficiently and effectively solve any issues that might be causing your trunk links not to form or not to function as desired.

00:00:25
This particular Nugget does indeed have relevance for every level of Cisco certification, from the CCENT all the way up to the CCIE. So let’s get started. Keith and I will break this important Nugget up into three main parts. First, we’ll begin by reviewing trunk link technologies.

00:00:45
This will really help us find potential pitfalls we can run into when we’re trying to implement these trunk links. Finally, Keith will breathe life into the theory by demonstrating key trunk link verifications and troubleshooting processes at the command-line.

00:01:04
So trunk links are absolutely critical on our Layer 2 infrastructure. There is a trunk between switch 1 and switch 2, and that particular trunk link can carry the traffic of multiple VLANs. Here we have this machine in VLAN 10, this machine in VLAN 20. They’re going to want to potentially communicate with each other.

00:01:26
And it is a trunk link which can carry the traffic of those multiple VLANs between the Cisco switches. Early on, Cisco invented their own inter-switch trunking protocol. It was called Inter-Switch Link, or ISL. ISL is really falling out of favor. In fact, you won’t even find it as a trunking option on many modern Cisco switches.

00:01:54
What has taken the world by storm? It’s 802.1Q. The 802.1Q is an IEEE standard for trunking between our devices. It now offers all of the great features that ISL had and even more. By the way, one of my favorite stories about ISL is when Keith Barker, our wonderful co-instructor in this CBT Nugget series and my dear friend, was talking to John Chambers.

00:02:27
Yeah, I’m not kidding. John Chambers, long-time CEO– and he’s retiring soon from Cisco Systems– but the long-time CEO of Cisco was talking to Keith Barker during a keynote demonstration. Keith was literally demonstrating online training to John Chambers.

00:02:46
And John Chambers asked Keith, hey, could you walk me through the creation of a trunk link? And by the way, I want to use 802.1Q not ISL. That was many years ago at Cisco Live, and it was my key moment that I realized that Cisco would indeed be turning their backs on the Inter-Switch Link protocol.

00:03:08
How does the 802.1Q trunking protocol function? Well, what it does is it takes the frame, and it goes ahead and it inserts a tag into the frame. Yeah. We call this the 802.1Q tag. And inside this tag is indeed a field for the VLAN ID. By the way, there’s also a field in the tag for a class of service setting.

00:03:38
And we know this is so incredibly important in quality of service disciplines. Now an interesting concept, by the way, about this 802.1Q tagging protocol is that one and exactly one VLAN will not get a tag. Yeah. This is called the 802.1Q Native VLAN. So this is one and only one VLAN that does not get any tag value.
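
To make the tag layout concrete, here is a minimal Python sketch of the 4-byte 802.1Q tag: a 16-bit tag protocol identifier of 0x8100, then 3 bits of class of service, 1 drop-eligible bit (left at 0 here), and a 12-bit VLAN ID. The function names are hypothetical, and `tag_for_trunk` assumes the default behaviour described above, where the native VLAN is sent untagged.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def dot1q_tag(vlan_id: int, cos: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag for a VLAN ID and class of service."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    if not 0 <= cos <= 7:
        raise ValueError(f"class of service out of range: {cos}")
    # Tag control information: CoS in the top 3 bits, VLAN ID in the low 12.
    tci = (cos << 13) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def tag_for_trunk(vlan_id: int, native_vlan: int = 1, cos: int = 0) -> Optional[bytes]:
    """Return the tag a trunk would insert, or None for the native VLAN."""
    if vlan_id == native_vlan:
        return None  # exactly one VLAN, the native VLAN, travels untagged
    return dot1q_tag(vlan_id, cos)


if __name__ == "__main__":
    print(dot1q_tag(100, cos=5).hex())
    print(tag_for_trunk(1))
```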

00:04:06
What was the idea behind the Native VLAN concept when it was invented? Well, the idea was if something happened to the trunk link, if the trunk link wasn’t functioning properly, at least one VLAN, theoretically, could travel between these devices. Because, again, it would be an untagged VLAN.

00:04:29
Today the Native VLAN is frowned upon. And this is primarily due to security concerns, things like double tagging of 802.1Q frames and sending them over the link to have one VLAN be able to send traffic into another VLAN. So there are definitely security concerns with the Native VLAN.

00:04:53
And that’s why Cisco today, in high-security environments, will recommend that the Native VLAN essentially not be used. How do you do this? Well, you set the native VLAN to a VLAN that you are not utilizing in your infrastructure, a dummy VLAN, so to speak.

00:05:12
Something else that Cisco did to guard against Native VLAN security concerns was they actually built switches that allow the tagging of the Native VLAN. That’s right. Every single VLAN, including the Native VLAN, will receive a tag in that environment. When Cisco was working on their trunk technologies, they tried to do you a favor.

00:05:35
And they invented something called Dynamic Trunking Protocol. What they tried to do was categorize their switches as, let’s say, core switches and then, let’s say, access layer switches. And what they would do would be to take the Dynamic Trunking Protocol settings of one of these devices and set them appropriately.

00:05:59
For instance, every single port on what Cisco considered a core switch would be set to a DTP mode of dynamic desirable, for instance. These ports are going to dynamically try and form trunk links. They would take the access layer devices, or what they considered access layer devices, and they would set them to dynamic auto.

00:06:26
Sure enough, this would cause a trunk link to automagically form between the core and the access layer device. As you might guess, I think we better thoroughly review this Dynamic Trunking Protocol so we can see all of the different modes and understand intimately their behavior.

00:06:49
Students will often be driven crazy by the concept of DTP, but you don’t need to be. All you got to do is remember the possible modes and then just remember whether or not that device will proactively try and trunk with the other side. For example, we have the ability to go in and under an interface say switchport mode, and we can say access.

00:07:21
This is a DTP mode of off. That’s right. We are not going to trunk when we do switchport mode access. It’s not a trunk port we’re creating, right? We are creating an access port. So if we were to say switchport mode trunk on this side, and we were to say switchport mode access on this side, we would not form a trunk link, because one of those sides is in the off setting.

00:07:56
Another potential command we can issue is, I just mentioned it, switchport mode. And the command is trunk, switchport mode trunk. And this is considered our DTP on setting. By the way, that looks like it might be auto, so let me fill that in with access. OK.

00:08:17
So now if we were to have a trunk setting on this side and a trunk setting on this side, of course we would indeed form the trunk link, right? Because each side is proactively trying to form a trunk link with the other side. By the way, if you are on an older Cisco device that supports both ISL and 802.1Q, before you can enter this switchport mode trunk command and set the DTP mode to on, you must indicate which of the encapsulations you want to use.

00:08:55
This is with the command switchport trunk encapsulation, and then you’ll choose either ISL or 802.1Q. Again, on modern Cisco devices this is not an issue, because they’re doing away with ISL support on the device entirely. How about other modes? Sure, they exist.

00:09:15
If we do switchport mode dynamic desirable, as mentioned before, this is the mode of desirable. And sure enough, that side will proactively try and form a trunk. So let’s say we had dynamic desirable on one side, and then we had the trunk setting on the other side.

00:09:38
They’re both going to try and form a trunk link, and we’ll get the trunk link formed. We can do switchport mode dynamic auto. This is, of course, our auto mode. And as I indicated, this particular mode is willing to trunk as long as it is asked by one of the switches.

00:10:02
So in other words, if we have the auto mode on one side and we have the desirable mode on the other, we will indeed get a trunk link. The auto side is not proactively trying to trunk, but it will accept the desirable side’s attempt at forming the particular trunk link.

00:10:24
One of the things that’s very confusing about all this is that switchport mode access is considered a DTP mode of off. But there is a command that you can utilize that really does turn off DTP. And that command is switchport nonegotiate. Yeah. This is a very important command, as it truly does silence DTP messages.

00:10:55
Switchport nonegotiate can be used in conjunction with either switchport mode access or switchport mode trunk. And it is truly turning off the DTP process, as I alluded to. So once again, in this Nugget, once we review the particular technology, the potential pitfalls become pretty obvious, right? What if we have an old device that’s trying to do ISL, and we have a modern device that’s trying to do 802.1Q? That’s not going to work out all that well, so this is a potential problem with the formation of a trunk.

00:11:39
It’s encapsulation mismatch. That’s obviously a bit rare, because ISL is fading from support. But what about a Native VLAN mismatch? Sure, this could happen. You have a shiny new switch, switch 1. And this particular switch is configured with a native dummy VLAN of 999. You then trunk over to switch 2, and switch 2 is in the default native VLAN configuration.

00:12:09
What is the default Native VLAN? It’s VLAN 1. So sure enough, this trunk link will have issues due to the Native VLAN mismatch. Layer 1 issues could exist. It’s been scientifically proven that if your port is in the shutdown state, it won’t form a trunk.

00:12:28
So watch out for that. We reviewed the dynamic trunk protocol, and you saw how combinations of DTP could cause problems. For instance, if we have one side set to dynamic auto, and we have the other side set to dynamic auto, there will not be a trunk link.
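To make those combinations concrete, here is a minimal Python sketch of the pairing rules just reviewed. It is a simplified model, not anything Cisco ships: the `Port` class and the `operational_mode` and `link` functions are hypothetical names made up for illustration, and the model assumes matching encapsulation, matching VTP domains, and ports that are up.

```python
from dataclasses import dataclass

MODES = {"access", "trunk", "desirable", "auto"}


@dataclass(frozen=True)
class Port:
    """One end of a link (hypothetical model, not a Cisco API)."""
    mode: str                  # "access", "trunk", "desirable", or "auto"
    nonegotiate: bool = False  # True models "switchport nonegotiate"

    def __post_init__(self):
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode}")
        # nonegotiate only makes sense with the static modes
        if self.nonegotiate and self.mode in {"desirable", "auto"}:
            raise ValueError("nonegotiate requires access or trunk mode")

    def invites(self) -> bool:
        """True if this side proactively asks the peer to trunk via DTP."""
        return self.mode in {"trunk", "desirable"} and not self.nonegotiate

    def accepts_when_asked(self) -> bool:
        """Dynamic auto stays quiet but agrees if the peer asks."""
        return self.mode == "auto"


def operational_mode(local: Port, peer: Port) -> str:
    """Return how the local port ends up operating: 'trunk' or 'access'."""
    if local.mode in ("access", "trunk"):
        return local.mode  # static modes do not depend on the peer
    if local.mode == "desirable":
        # desirable asks; it needs a peer that also asks or that says yes
        ok = peer.invites() or peer.accepts_when_asked()
        return "trunk" if ok else "access"
    # auto only trunks when the peer sends the invitation
    return "trunk" if peer.invites() else "access"


def link(a: Port, b: Port):
    """Operational mode of each end of the link, as a (side_a, side_b) pair."""
    return operational_mode(a, b), operational_mode(b, a)
```

In this model, `link(Port("auto"), Port("auto"))` leaves both sides as access ports, while `link(Port("trunk", nonegotiate=True), Port("desirable"))` leaves the first side trunking and the second side as an access port, which is the kind of one-sided trunk that is hard to spot from only one end.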

00:12:48
This reminds me of the high school dance. All of us boys would sit on one side of the gymnasium, all the girls would sit on the other side of the gymnasium, and no one would ask each other to dance. So no dancing ensued. Hey, something else to consider is the VLAN Trunking Protocol and its tie-in with the dynamic trunk protocol.

00:13:13
We’ll be covering VTP and VTP troubleshooting in a Nugget. But for right now, I just want you to realize that if you have VTP domain 1, let’s say, and you have VTP domain 2, you cannot form a trunk link between those two if you’re going to use any form of dynamic trunk protocol.

00:13:36
You would have to get in a no-negotiate type of environment in order to create a trunk link between two different VTP domains. Well, Keith, we now know about the potential pitfalls. Can you bring this to life for us at the command-line? Hey, thanks, Anthony.

00:13:55
I would love to. Let’s use this topology. We’ve got two switches, switch 1 on the top, switch 2 on the bottom. Switch 2 has already configured its first five ports with different states. We have an access port. We have a dynamic auto, dynamic desirable. Port 4 is on with negotiation of DTP still on by default. And port 5 is on, but negotiation is turned off. What we get to do is we’re going to take a look at the corresponding five ports up on switch number 1. And we can configure and mix and match and observe firsthand the results of either proper or improper trunk configurations.

00:14:30
We’re going to start off on a very clean slate. We’re going to default interfaces 1 through 5, and I’m going to shut them down. I’ve also turned off logging to the console. That way, you and I can focus on the Show commands that are going to help reveal the status of the interfaces.

00:14:45
I loved Anthony’s analogy of the dance in high school, where on one side you have a whole line of people, and on the other side you have a whole line of people. And if neither one invites the other side to dance, then there’s not going to be any dancing happening.

00:14:58
Well, that’s exactly the situation if we have two sides, both set for dynamic auto. Let’s use port 2 on switch 2. It’s set to dynamic auto. On switch 1, that we’re configuring, we’ll set it to dynamic auto. And it’ll just sit there. It will not become a trunk, because neither side invited the other one to the party.

00:15:17
So if we issue the command show interface trunk (and of course we’re putting the do in front of it, because we’re in configuration mode), we absolutely have zero trunking going on. One of the critical commands that we’re going to use quite a bit as we troubleshoot trunks together is show interface yadda, yadda.

00:15:33
And then specify the keyword switchport, which will give us the details regarding switchport configuration. I’m also throwing on there this piece right here, please exclude any lines that have the word private or unknown in it, just so this fits in a nice, tidy space.

00:15:48
The first 10 lines are what’s most critical for us here. Right here it’s saying, I am administratively configured to be dynamic auto. And that’s great. That’s how it’s configured. And sometimes, on some switches, that may or may not be the default if we didn’t hard-code a command there.

00:16:02
However, what’s really important function-wise is how it is operating. It’s like this was the intention, and this is the result. Operational mode is static access because it didn’t get an invite to dance from the other side. On the other side, if it was set for desirable or on, there would have been an invite with DTP.

00:16:20
But as a result of both sides being auto, it’s a no-go. The big joke is it “auto” work, because both sides are set to auto, but it doesn’t. What happens if we change the rules a little bit? Let’s go ahead and work with the same switchport, FA 0/2. I’m going to shut it down just for a moment.

00:16:36
And let’s tell it we want it to be desirable. To do that, we go in interface configuration mode, and we simply specify that we want to be dynamic desirable. I’m also specifying, as Anthony mentioned, on some switches that support ISL and 802.1Q, we can actually hard-code the encapsulation protocol we want to use if we negotiate a trunk.

00:16:58
And because dot 1Q is the standard, that’s why we’re throwing that in there as well. Now here’s the question. Our switch is desirable. The other side is set to auto. Is that going to work? Based on what Anthony said, as long as one side is asking the other to dance, it’s going to go ahead and bring up the trunk on both sides.

00:17:15
How would we verify that the trunk actually came up? Because I turned off all the logging to the console. A really easy command is show interface trunk. And that will reveal the details if there’s a trunk or not. It says there is, which is great. Here’s our current mode, which is accurate.

00:17:30
There’s our trunking, because I hard-coded that. It says we’re doing trunking, and the Native VLAN is VLAN 1. We’ve got three VLANs cooking on this switch, VLANs 1, 10, and 20. And this is probably the most important line out of the output here. We are currently forwarding for all three of those VLANs.

00:17:49
As Anthony mentioned, VLAN 10 and VLAN 20 would both get tags as frames are sent for those VLANs. And any transit traffic for VLAN 1, because it’s the Native VLAN, would go across without a tag. We do not tag the Native VLAN by default. You know what’s even better than looking at a trunk and looking like it’s working? Actually have it do work.

00:18:07
So let’s send some traffic from switch 1 over to switch 2 over the VLAN 10 trunk, and it should make it. If it doesn’t, we need to troubleshoot more. But it should make it if both sides are trunking. That looks great. Let’s also send a packet over VLAN 20. And VLAN 20 is associated with the 192.168.2 network. For the last octet, switch 2 has dot 2. And for the last octet, switch 1 has dot 1. That looks great.

00:18:33
We had connectivity across both VLANs going over the trunk. Sometimes when it’s working, we just think, well, great, it’s working. I’m going to leave. But I also want to do a show interface FA 0/2 switchport, just so we can take a look at the details that might come in extremely handy if there is a problem and two sides are not talking together.

00:18:52
Here’s our command, show interface FA 0/2 switchport. It says it’s administratively configured as a desirable. And it’s currently operating as a trunk, which equals success. We’ll also notice down here that the negotiation of trunking is on. And that will continue to be there, unless we specify no negotiation.

00:19:10
So that looks fantastic. Anthony also mentioned that if we have one side set to on, dynamic trunking is still functioning in the background. So we could have one side be on and the other side be auto. Let’s go ahead and try that using interface FA 0/4 on our local switch, which, again, connects to FA 0/4 on switch 2. So we’ve shut down FA 0/2, and let’s configure interface FA 0/4. We’ll go into interface configuration mode.

00:19:37
We’ll specify the trunking encapsulation methodology, whether it’s dot 1Q or ISL. We’ll choose dot 1Q. And we’ll simply say it’s switchport mode dynamic auto. Now we’re going to be the auto in this equation, and switch 2 is going to be set to on. And a trunk should still operate.

00:19:54
Because even though the other side is on, dynamic trunk protocol is still active on that other side. As a result, it should be inviting us to the party. To verify that, we can do a show interface trunk, just to make sure that the trunk is up and happy, happy, which it looks like it is.

00:20:10
And again, one of our most important lines is right there, what VLANs are being forwarded. We have multiple paths. We could have spanning tree that’s blocking one or more of our paths. But in our troubleshooting here, we’ve isolated all our traffic to the interface that we’re working with, so spanning tree on its own won’t be a problem regarding forwarding traffic.

00:20:28
It’s also important to note that I’ve turned on rapid spanning tree on both of these switches, so we don’t have to wait the normal 30 seconds for the listening and learning before forwarding. Let’s do a ping over VLAN 10. That should work. The trunk looks solid, so we’ll ping switch number 2 over VLAN 10 by pinging its VLAN 10 interface. And that seems to be working.

00:20:50
Then we’ll go ahead and ping switch 2’s VLAN 20 interface, which ends in 2.2. And that seems to be working as well. Just to confirm the state and the details for our interface, we could do a show. And what we’d expect to see is that we’re administratively configured as dynamic auto, but we’re operating as a trunk, because DTP allowed us to negotiate that.

00:21:11
So there’s our administratively configured mode, and there’s our operating mode. And we can thank the negotiation of trunking for making that happen. This commercial brought to you by DTP. All right. Anthony mentioned that having the same Native VLAN on both sides of the trunk is important.

00:21:27
And I would agree. One reason is because if you don’t, you’re going to get little CDP messages saying, hey, we don’t have the same Native VLAN. There is, however, another problem. And that is not all your traffic is going to go through. So let’s do this. Right here on switchport FA 0/4, we’re going to go ahead and change the Native VLAN.

00:21:45
The syntax is really simple, switchport trunk native vlan 20. So instead of being VLAN 1, it’s now VLAN 20. What kind of a disruption of service is that really going to cause? Well, let’s take a look at it. Let’s say we have switch number 1 and switch number 2, and we have a frame that needs to be sent down.

00:22:02
That’s for VLAN 10. No problem. We simply put an 802.1Q tag saying VLAN 10. Switch 2 receives it and continues forwarding it. No problem whatsoever. The problem starts when we have VLAN 20 traffic. So switch 1 says, oh, that’s my Native VLAN. No tag. That frame hits switch 2 with no tags. And switch 2 says, oh, my Native VLAN is VLAN 1, and switch 2 believes that that frame belongs to VLAN 1. If it’s a broadcast, it’s being broadcast to everybody in VLAN 1. And it’s very likely that your IP addressing scheme up in VLAN 20 is not the same IP addressing scheme that you have in VLAN 1. But logically, those are one giant broadcast domain now.
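That walk-through can be sketched in a few lines of Python. This is only an illustrative model of how an 802.1Q trunk handles the Native VLAN; the function names are invented for this example and nothing here comes from a Cisco tool.

```python
from typing import Optional


def tag_on_wire(vlan: int, sender_native: int) -> Optional[int]:
    """Tag the sender puts on the frame; None means the frame goes untagged."""
    return None if vlan == sender_native else vlan


def vlan_at_receiver(tag: Optional[int], receiver_native: int) -> int:
    """VLAN the receiver assigns: untagged frames land in its own Native VLAN."""
    return receiver_native if tag is None else tag


def cross_trunk(vlan: int, sender_native: int, receiver_native: int) -> int:
    """VLAN a frame ends up in after crossing the trunk in one direction."""
    return vlan_at_receiver(tag_on_wire(vlan, sender_native), receiver_native)
```

With switch 1 set to Native VLAN 20 and switch 2 left at Native VLAN 1, `cross_trunk(10, 20, 1)` stays in VLAN 10, while `cross_trunk(20, 20, 1)` comes out in VLAN 1, which is exactly the leak between broadcast domains described above.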

00:22:42
That’s bad news. To test that, let’s do a quick ping over VLAN 10, which should still operate. No problem. Gets a tag. The receiving side sees the tag, no worries. And if that does work, that means that at least partially our trunk is working. However, VLAN 20 traffic will not be as fortunate. So this ping, I’m going to put a repeat count of 2 on it so we don’t have to wait for all five pings in a row to fail.

00:23:06
But it is not going to make it. The correct solution to this problem is fixing the trunk. Don’t try to reverse engineer it and start putting different IP addresses in VLAN 1 for connectivity. You just want to make sure the trunk is working properly on both sides.

00:23:21
For this interface, which is interface FA 0/4, what I’d like to do is go ahead and shut it down. And I’m also going to default that interface so that if we come back to it for another exercise it won’t have that weird non-default Native VLAN. Anthony mentioned that, as a security measure, we might want to change the Native VLAN.

00:23:39
And that’s perfectly fine, as long as you do it where? On both sides. Make sure that the Native VLAN between two devices that are on a trunk link, make sure they agree on what that Native VLAN is. For our next scenario, let’s do this. Let’s have switch 1 be on, but we’ll turn off negotiation. And on switch 2, we’ll set it to on. And I want you to think, what are the results of this going to be? One of them is on with no negotiation.

00:24:03
The other is on with negotiation. Is it going to work? Based on what Anthony shared with us, the answer is, yes. We don’t need to negotiate anything. We turn trunking on on both sides, it’s going to be happy, happy. There’s no need for negotiation if both sides are turned on and they are using the same encapsulation type.

00:24:21
In this case, it’s dot 1Q on both sides. To verify it, we can do a show interface trunk, and then we can also take a look at the details from a switchport perspective of FA 0/4. And this, my friend, looks very, very healthy. So the trunking is on, it’s doing 802.1Q, the status is currently trunking, and then we have the default Native VLAN.

00:24:42
We’re forwarding from a spanning tree perspective on this port, which is the trunk for VLANs 1, 10, and 20. We’re administratively configured as a trunk. We’re operating as a trunk. And the negotiation of trunking is off, but that’s perfectly OK, because no one needs it.

00:24:58
Neither of these parties, switch 1 or switch 2, on this port needs it, because they’re both told to be a trunk with 802.1Q. Of course, nothing feels like success like a working ping. So let’s do a ping over VLAN 10. And I suppose we could do a ping over VLAN 20. They are both going to be successful if the trunk is working.

00:25:17
And it appears that it is. All right. Let’s up the ante just a little bit. Let me go ahead and shut down that interface. And let’s try using switch 1 being on with no negotiate and the other side, switch 2, being dynamic desirable. Now port 3 on switch 2 is already configured as dynamic desirable.

00:25:36
All we need to do is configure our local FA 0/3, tell it to go ahead and be a trunk, which is on, tell it to not negotiate, and let’s take a look at the results. As we implement this, I would love you to think, is this going to work, yes or no? Based on what Anthony shared and what you know of switching and trunking, will this configuration work with our side being on, with no negotiate, and the other side being dynamic desirable? What I keep coming back to is what Anthony said about that dance.

00:26:07
Somebody has to invite the other side to dance in the dynamic environment. In this case, switch 2, if it’s dynamic desirable and we have turned off negotiations, meaning we are not inviting anybody to any party, even though we’re on, because negotiations in DTP have been shut down, the other side won’t get the invite, and we won’t get a trunk.

00:26:29
To validate that, we’re going to do a show interface trunk. This is the surprising part. As we look at this, it looks like, hey, this looks pretty healthy. I’m a trunk. I’m currently trunking. My Native VLAN is VLAN 1. And look at this. VLANs in spanning tree forwarding state and not pruned.

00:26:44
Looks like I’m good to go. However, are we really forwarding traffic successfully to the other side? Is the other side really trunking? And the answer is no. The other side isn’t. The other side isn’t trunking, because it never got the invite from us because DTP was turned off due to this command right there.

00:27:02
So if we want to take a look at the details of the interface, it’s really tough from this side to see that there’s any type of problem. Because the show interface FA 0/3 switchport is going to reveal that we’re configured as a trunk, we’re operating as a trunk, everything looks happy, happy.

00:27:17
Yet if we try to do a ping, for example, over VLAN 10 or VLAN 20, it’s not going to be too happy. And so our ping, which so easily responded previously when we had functioning trunks on both ends, now is not working, even though this side, because it’s set to on, looks like it’s happy, happy and no problems.

00:27:39
In reality, we are sending 802.1Q tags for VLAN 10 for this traffic down the trunk. The problem is the other side is just sitting there waiting to be invited to trunk. And because it’s not acting as a trunk, the other side, it’s simply an access port. I want you to see both sides of this, so let’s flip it around.

00:27:57
Let’s go ahead and shut down that interface, and let’s go to interface FA 0/5. Let’s do this. What we’ll do is we’ll make our side dynamic desirable. And switch 2 on port 5 is already set to on with no negotiate. That way, you and I can experience the feeling of never being invited.

00:28:18
No DTP messages, and as a result, we’re not going to go ahead and initiate a trunk on our side. The config is pretty simple. Into interface FA 0/5 we go. I’ll tell it to use dot 1Q, and I’ll specify the mode as dynamic desirable and then bring that interface up.

00:28:35
Again, I’ve turned off logging, so we’re not going to see the console messages popping up regarding what’s happening immediately. Here we could guess what’s going to happen. And my guess is that because DTP is turned off on the other side, switch 2, and because we’re desirable, we will not have a trunk.

00:28:51
So to validate that, we can do a show interface trunk, hope for the best, and see that the reality is we have no trunk. If we do the show interface for FA 0/5, we can validate the details of how it’s administratively configured and how it’s currently operating. And it’s so sad that nobody ever asked it to the dance.

00:29:11
Nobody ever invited it via DTP to become a trunk. Let’s do one more. Let’s go ahead and set both sides up for on and both sides for no negotiate. If we do that, what do you expect the results would be if DTP is turned off on both sides yet we set both sides trunking to on? Now if you’re saying, well, Keith, that’s an easy one.

00:29:31
If both sides are on trunking, they don’t need DTP, and that trunk should be functional on both sides. I would 100% agree with you that that should be the result. So let’s validate that. Trunking on on switch 1, trunking on on switch 2, DTP, no negotiate used on both sides.

00:29:49
And the result is– drum roll, please– with a show interface trunk, Houston, we don’t have a problem. It looks pretty good. And if we did a ping, the ping should work as well on either VLAN 10 or VLAN 20. Let’s go ahead and send a ping across the VLAN 20 interface. That ping over to switch 2 should be successful. And it is.

00:30:11
One other element that I find very interesting, and it does happen occasionally, is let’s say I have three switches, switch 1, and we have switch 2, and we have switch 3. And let’s say they are all set up successfully with trunks. So the trunk negotiation is working, or they’re hard-coded to on on both sides or have a working pairing.

00:30:27
And the trunks are solid. However, a customer in VLAN 10 over here can’t talk to a customer in VLAN 10 over here. Why not? Well, if the middle switch doesn’t have VLAN 10 defined– for example, it had it, and then it was deleted, or it doesn’t exist for some other reason– a switch cannot forward frames for a VLAN which it does not know about.

00:30:48
So if it receives a frame that comes in, it’s tagged for VLAN 10, and it doesn’t have VLAN 10 locally defined anywhere as a VLAN, that frame of traffic goes directly to the bit bucket. It’s done. Please make sure those VLANs exist on every switch that you expect to be in the transit path.
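As a rough illustration of that transit-path check, here is a small Python sketch. The switch names and VLAN lists are made-up sample data, and `first_drop` is a hypothetical helper, not a command on any device; it assumes the trunks between the switches are already working.

```python
from typing import Dict, List, Optional, Set


def first_drop(vlan: int, path: List[str],
               vlan_db: Dict[str, Set[int]]) -> Optional[str]:
    """Return the first switch on the path that lacks the VLAN, else None."""
    for switch in path:
        if vlan not in vlan_db[switch]:
            return switch  # tagged frames for this VLAN are discarded here
    return None


# Hypothetical VLAN databases: the middle switch has lost VLAN 10.
vlan_db = {
    "switch1": {1, 10, 20},
    "switch2": {1, 20},
    "switch3": {1, 10, 20},
}
path = ["switch1", "switch2", "switch3"]
```

Here `first_drop(10, path, vlan_db)` points at `switch2`, while `first_drop(20, path, vlan_db)` returns `None` because VLAN 20 exists on every switch in the path.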

Troubleshooting VTP

00:00:00
What? You’re not using VTP? Are you kidding me? VTP is awesome. The VLAN Trunking Protocol will save you tons of time and effort. All right. In all seriousness, if you are going to use VTP and you’re interested in learning to troubleshoot this particular protocol, this Nugget is for you.

00:00:19
By the way, this Nugget applies to several layers of Cisco certification, including the CCENT, the CCNA, the CCNP, and the CCIE. Let’s jump in. Now Keith Barker’s and my goal in this Nugget is simple. We want to review VTP with you. We want to show you the most common things that can go wrong and how easy they are going to be for us to troubleshoot.

00:00:45
And we’re going to breathe life into this protocol at the command-line interface. If you’ve been in my training before for Cisco Systems, you know that I really can’t stand the name VTP, the VLAN Trunking Protocol. This is pretty deceptive. I wish Cisco had named this protocol the VLAN Management Protocol, because we know that’s what it’s all about.

00:01:10
It’s all about our ability to go and create VLANs, let’s say 10 through 20, on switch 1 and have those VLANs magically propagate over to switch 2. But there is one thing about the name that is nice. And that is the keyword trunking in the name does remind us that this particular protocol is only going to work over our trunk links.

00:01:37
And for our purposes, of course, we are talking about 802.1Q trunk links. So in order to propagate VLAN information magically over the network, we must have our trunk links in place. That’s obviously going to be our first step in troubleshooting, right? We have to make sure we’re trunking.

00:02:00
Otherwise the VTP information is not going to propagate. Now let’s review the modes of VTP. Our device is going to default to a Server mode of operation. A VTP server allows you to create and manage your particular VLANs on the device. It will indeed attempt to propagate this information over to your other servers and your client systems, so another mode we have is Client.

00:02:33
The mode of Client is going to be a slave system to your VTP servers. In other words, you cannot go to a VTP client and create and edit VLANs. This is certainly why Cisco defaulted to the Server mode of operation. If they had defaulted to a Client mode, you could only imagine the tech support calls they would receive when people cannot create VLANs on their local switch.

00:03:00
Another mode of operation that causes some confusion is called Transparent. With Transparent mode, the device can indeed participate in a VTP domain. It just will forward advertisements and not listen to them. So you can create, edit, and manage the VLANs all you like on a transparent device, but this information will not be propagated anywhere.

00:03:28
Now the way in which we used to turn off VTP if we didn’t want to use it, maybe we had security concerns regarding this particular protocol, the way we used to turn it off would be to go to every single device and set it to Transparent mode, the logic being that we cannot overwrite any VLAN information anywhere, because everybody’s transparent.

00:03:54
No one can affect anyone else. But Cisco has now, for clarity, added a fourth mode of operation, and that is the Off mode. So VTP mode Off is a more clear-cut way in which to completely disable the VTP protocol from your environment. If you are using VTP, you’ve got it enabled, its operation is very, very simple.

00:04:21
It works with what we call the configuration revision number. And Keith will actually show you this at the command-line. Our key verification command with this protocol is simply show vtp status. We get 90% of what we want to get done from a verification standpoint with this particular command.

00:04:42
In here, we see configuration revision number. So let’s say I add these VLANs. What will happen on the server is the configuration revision number will go from 0 to 1. All your other servers and clients out there will notice that they are at a database version of 0, and they will go ahead and listen to the new configuration revision database of 1, and they will write those changes to their particular database.

00:05:11
By the way, this is all going to take place as long as everyone is in the same VTP domain– this is a case-sensitive name that we set– and everyone has the same VTP password. The VTP password is optional, but obviously a recommended step for your VTP environment.
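The acceptance rules just described can be sketched as a toy Python model. Treat it as a simplification under stated assumptions: `VtpSwitch` is a hypothetical class, each call to `add_vlans` counts as one database change, and the model ignores trunk state, VTP versions, and the way a switch with no domain name set can adopt the first domain it hears.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class VtpSwitch:
    """Toy model of one switch's VTP state (not a real API)."""
    domain: str                    # case-sensitive VTP domain name
    password: str = ""             # optional, but must match when used
    mode: str = "server"           # "server", "client", or "transparent"
    revision: int = 0              # configuration revision number
    vlans: FrozenSet[int] = frozenset({1})

    def add_vlans(self, *vlan_ids: int) -> None:
        """Create VLANs locally; clients are not allowed to do this."""
        if self.mode == "client":
            raise PermissionError("VLANs cannot be created on a VTP client")
        self.vlans = self.vlans | frozenset(vlan_ids)
        if self.mode == "server":
            self.revision += 1     # assumption: one change, one increment

    def receive(self, sender: "VtpSwitch") -> bool:
        """Apply an advertisement from sender; return True if it was accepted."""
        if self.mode == "transparent":
            return False           # transparent switches do not listen
        if (self.domain, self.password) != (sender.domain, sender.password):
            return False           # domain (case-sensitive) and password must match
        if sender.revision <= self.revision:
            return False           # only a higher revision number wins
        self.vlans, self.revision = sender.vlans, sender.revision
        return True
```

In this sketch, a client in domain `TEST` accepts a server’s revision 1 database, while a switch whose domain is spelled `test` ignores it, which is why the domain name and password are the second thing to check after the trunk.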

00:05:35
When we talk troubleshooting, really step number one, let’s make sure we have an 802.1Q trunk connecting these devices, or we’re going to have big problems. Step number 2, let’s check that case-sensitive domain name for VTP. And also let’s go ahead and check and make sure that we have a matching VTP password.

00:05:58
Something I want to make you aware of that will happen with VTP that’s really pretty interesting is someone will go and they’ll build their 802.1Q trunk. OK? And they’ll have their VLANs created over here. Let’s say it’s VLANs 10 through 20 on this particular device.

00:06:17
Then what they do is they go ahead and set the VTP domain over here to some name. Let’s say they set that name to TEST. And then they’ll come over here, and they’ll do their show vlan, and they will be absolutely amazed because all of the VLANs, 10 through 20, are magically over there. What in the world happened here? Well, VTP sees that you had a trunk link.

00:06:49
It starts hearing the VTP messages over here. Remember, this thing will default to Server mode, which will certainly allow it to listen to VTP advertisements. And because you have a trunk, because you set the domain name over here, it will indeed automatically join that particular domain.

00:07:12
So this is pretty startling when it happens to you in a switched infrastructure. Remember again now, you created your VLANs over here. You created your trunk between these two devices. And then all you did was put this in a VTP domain of TEST. That’s it. That’ll trigger this side joining the VTP domain of TEST, this side getting the VLANs.

00:07:37
Ways in which you could stop this automatic propagation, of course, would be to go in and not build the trunk between the two devices or immediately put in a VTP password. Now no propagation would occur to switch 2 until switch 2 has that particular password.

00:07:59
Let’s talk about a third troubleshooting issue with VTP, and that is you overwrite your entire database. That’s right. You destroy your switched infrastructure. How this can happen is– and this is one of the reasons why administrators will be afraid to even implement VTP– is that some junior admin– gosh, you know we love to pick on those junior admins– comes along and connects a switch into this infrastructure.

00:08:31
And this particular switch is in, let’s say, Server mode. It could even be in Client mode, but let’s just say it’s in Server mode for our VTP domain of TEST. It has no VLANs in its database. But there’s a problem, folks. It has a configuration revision number of, like, 101. So it has a high configuration revision number, compared to, let’s say, a configuration revision number of 2 on these particular devices.

00:09:02
Well, yes, you guessed it. In this scenario, this device can overwrite all the VLANs of our infrastructure. What is the solution to this huge potential problem with VTP? Reset the configuration revision number on this particular device that we are introducing.

00:09:22
What’s the easy way to reset it? I’ve seen lots of recommendations on this. And it really is funny. Someone will say, stand on your head, reboot the server, and then do a handstand. Crazy. All we need to do is we just need to rename the VTP domain. That’s right.

00:09:45
When you rename your VTP domain, you will reset the configuration revision number to 0, and now you can safely introduce this particular device. If it’s TEST, we could rename it to TEST1, and that’ll reset the configuration revision number, and then reset the name back to the original TEST and safely introduce your device.
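Here is a minimal Python sketch of that overwrite scenario and the rename trick. The `Vtp` class and both helper functions are hypothetical and exist only to illustrate the revision-number logic; passwords and VTP versions are left out to keep it short.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class Vtp:
    """Just the VTP fields that matter for the overwrite problem."""
    domain: str
    revision: int
    vlans: FrozenSet[int]


def would_overwrite(newcomer: Vtp, existing: Vtp) -> bool:
    """True if the newcomer's database would replace the existing one."""
    return (newcomer.domain == existing.domain
            and newcomer.revision > existing.revision)


def rename_domain(switch: Vtp, new_name: str) -> None:
    """Changing the domain name resets the configuration revision to 0."""
    if new_name != switch.domain:
        switch.domain = new_name
        switch.revision = 0


# The scenario from the whiteboard: empty database, revision 101.
newcomer = Vtp("TEST", 101, frozenset({1}))
production = Vtp("TEST", 2, frozenset({1, 10, 20}))
danger_before = would_overwrite(newcomer, production)

# The fix: rename away and back before connecting the switch.
rename_domain(newcomer, "TEST1")
rename_domain(newcomer, "TEST")
danger_after = would_overwrite(newcomer, production)
```

In this sketch `danger_before` is `True` and `danger_after` is `False`, because after the two renames the newcomer sits at revision 0 and will learn from the production switches instead of overwriting them.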

00:10:10
Once again, as Keith Barker will demonstrate, the key here is that show vtp status command, so we always are aware of our particular configuration revision number. Something else that I want you to watch out for that is certainly more rare but it could become a problem is the VTP version.

00:10:30
Differing VTP versions are not compatible with each other. Currently, there are versions 1, 2, and 3 of VTP out there. Most of us are going to go ahead and settle on VTP version 2. This is set at the command line with the vtp version command. Just make sure all of your devices are version 2 capable, and then go ahead and set that version on each device for success with VTP.

00:11:00
Keith, there we have it, the four main troubleshooting areas for our VLAN Trunking Protocol. I’d love for you to go ahead and demonstrate this for us at the command line.

I would love to demonstrate this. Anthony, let’s use the same topology that you drew for us, switch 1, 2, and 3. And the interfaces that we’re going to use in this demonstration, let’s use 0/5 on both of the interfaces between switch 1 and 2 and 0/6 on both of the interfaces between switch 2 and switch 3 for our trunks. One of the items that Anthony mentioned was that if we don’t have a trunk it’s not very likely that VTP is going to have anywhere to play.

00:11:37
So let’s do a couple things real quick on switch 1 to verify its connectivity over to switch 2. If we do a show cdp neighbors on the interface FA 0/5, that will validate if we see the neighbor that, number one, we have a cable connected and, number two, that Layer 2 seems to be working, because that’s where CDP operates.

00:11:56
And based on the results, we do see a neighbor, switch 2, on our local interface of FA 0/5. And on the remote end, it’s using its own FA 0/5 as well. That’s fantastic. Basic connectivity has been validated. Next, because we are going to need a trunk for VTP to operate, let’s validate whether or not we have any trunking in place by doing a show interface trunk here on switch 1. And sure enough, we’ve got FA 0/5 currently set to on using 802.1Q in trunking status. Native VLAN is 1, and we have forwarding down here. For this demonstration, I have another path for VLAN 1 that I’m using to Telnet to each of these switches for management.

00:12:36
So having VLANs 2 through 4 in a forwarding state on that trunk looks really, really good. You know, while we’re here, let’s also take a look at VLANs that currently exist on this switch. We’ll do a quick show vlan brief. I’m also going to exclude any lines that have 100 in them. That way, we don’t have to waste real estate by looking at the FDDI and Token-Ring-related VLANs.
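As a rough illustration of what that exclude filter does, here is a small Python sketch. The sample lines are a simplified, hypothetical stand-in for show vlan brief output (only the VLAN number and name), and the function uses a plain substring match, whereas the IOS filter takes a regular expression.

```python
def exclude(lines, text):
    """Drop every line that contains the given text (plain substring match)."""
    return [line for line in lines if text not in line]


# Hypothetical, trimmed-down sample: VLAN number and name only.
sample = [
    "1    default",
    "2    VLAN0002",
    "3    VLAN0003",
    "4    VLAN0004",
    "1002 fddi-default",
    "1003 token-ring-default",
    "1004 fddinet-default",
    "1005 trnet-default",
]

kept = exclude(sample, "100")
for line in kept:
    print(line)
```

One caveat with this shortcut: a VLAN numbered 100, or any line that happens to contain 100, would be hidden as well.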

00:12:57
Here we have the default VLAN, and we have VLANs 2, 3, and 4 that have been created. Now if VTP is correctly working, these VLANs should be advertised down this trunk port. So switch 2 and, subsequently, switch 3 should all have the same VLAN database. Anthony shared with us the primary command that we’re going to use to validate a lot of our VTP information, and that is a show vtp status.

00:13:20
So right here on switch 1, let’s take a closer look at the details of VTP using that command. This switch is currently running VTP version 2. The configuration revision number is 7 at the moment. We have eight VLANs. That’s these four right here. And then there’s four more that are default for Token Ring and FDDI and so forth.

00:13:38
It’s in server mode. The VTP domain is CBT Nuggets. And this is the MD5 digest. So that digest should match on all the devices in this VTP domain that have the same password configured. One item I’ve seen students struggle with is regarding passwords with VTP.

00:13:54
They do a show run. They’re looking for the password. It is not displayed in the running config. If we do want to see it, however, it’s very, very simple to do. Just a show vtp password command, and it will be happy to show you exactly what the VTP password is on this device.

00:14:11
Here’s the current password that’s in use for VTP on switch 1. Armed with this information, let’s go take a look at the details on switch 2. We’ll make a road trip over to switch 2. And let’s just do a quick check, just to validate if those same VLANs exist here or not.

00:14:27
We’ll do a show vlan brief, and we are missing some VLANs. We have the default VLAN, which everybody gets. But it appears that VTP is not working correctly between switch 1 and switch 2. Because if it were, our VLAN database would match the VLAN database over on switch 1, and we are missing some VLANs. If the trunk isn’t functioning here, that’d be a great reason why VTP also is not working.

00:14:52
So let’s just do a quick check and do a show interface trunk, just to see if there’s any active trunks on switch 2. It appears there are no active trunks, which would also explain why VTP isn’t working. We already verified our connectivity at Layer 1 and 2 from switch 1’s perspective. However, if we started at switch 2, we could use that same technique here just to validate that we have a physical cable connected and we’re connected to the appropriate neighbor.

00:15:18
And based on results, we are. So we’ve got our physical connectivity. Layer 1 and layer 2 are working. We just don’t have a trunk. While we’re here, let’s also just validate that we have good connectivity down to switch 3 over our interface FA 0/6. That way, we can rule out any physical issues as a connectivity problem for VTP.

00:15:37
So this is great news. We also have connectivity over to switch 3 over FA 0/6. You and I can now rule out Layer 1 and Layer 2 issues in our topology as being a problem. Let’s take a closer look at the details of FA 0/5– that’s the connection between switch 2 up to switch 1– just to see what’s going on with it.

00:15:57
We can also do a show run. And that would show us how it’s administratively configured. But doing a show interface command with the switchport keyword gives us more of the details about what’s actually happening with that interface. The administrative mode is dynamic desirable.

00:16:11
But it’s currently operating as an access port. It appears that negotiation has not been disabled or turned off for trunking. But yet, based on results earlier, we saw that this switch is definitely not doing trunking. In our Nugget on troubleshooting trunks, Anthony mentioned something about VTP and that with dynamic negotiation of trunks if the VTP domain was different between the two switches they would not dynamically negotiate a trunk.

00:16:38
Now that could be the case right here, because this side is doing dynamic negotiation of trunks. Let’s just do a quick check to validate what the VTP domain is on this switch. Because if the VTP domain is different than switch 1 and we’re trying to use dynamic on our negotiation– And this is usually the part in the story where the student comes up and says, hey, I thought that trunking was supposed to be set up first, and then we could add VTP on top of it.

00:17:04
Now you’re saying that the VTP domain name is going to prevent trunking from correctly negotiating? And the answer is absolutely yes. To help reinforce that, check this out. Here’s a DTP packet that I collected off of one of the trunk links. And in this DTP advertisement, it has a domain.

00:17:22
Where did it get that? Well, that happens to be the VTP domain. If that VTP domain is not equal on both sides, the side trying to negotiate the trunk is going to give up. He is not going to negotiate the trunk. So as shocking as that seems to a lot of people, the VTP domain name could actually cause a failure of the trunk when that name doesn’t match on both sides.
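The interaction between DTP and the VTP domain can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function name forms_trunk and the mode strings are illustrative, it ignores the case where one side has no domain name set, and it only answers whether both ends end up trunking.

```python
def forms_trunk(mode_a, mode_b, domain_a, domain_b):
    """Return True if both ends of the link end up trunking (simplified model)."""
    modes = {mode_a, mode_b}
    if "access" in modes:
        return False
    if modes == {"trunk"}:
        return True  # both sides hard-coded on, nothing left to negotiate
    # At least one side relies on DTP. Per the transcript, a VTP domain
    # mismatch makes the negotiating side give up.
    if domain_a != domain_b:
        return False
    # Negotiation fails only when both sides passively wait (auto/auto).
    return modes != {"dynamic auto"}


# Switch 1 is set to on, switch 2 is dynamic desirable, as in the demonstration.
print(forms_trunk("trunk", "dynamic desirable", "CBT Nuggets", "Bob's VTP"))
print(forms_trunk("trunk", "dynamic desirable", "CBT Nuggets", "CBT Nuggets"))
```

The first call models the broken state and the second models the link after the domain name is corrected.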

00:17:46
This is a pretty easy fix. We have Bob’s VTP as the domain on switch 2. We can simply change it. We’ll go into configuration mode. We’ll say VTP domain CBT Nuggets, just as it was on switch 1. And then we’ll go ahead and bounce the interface. We’ll shut it down, bring it back up, just to force everything to be nice and fresh, and then we’ll take a look at the results to see whether or not that’s going to, first, fix our trunk situation and, secondly, if that’s going to allow VTP to correctly advertise VLANs over that trunk.

00:18:16
Another option, of course, would be to hard-code that trunk to be on instead of negotiating. So let’s validate a couple of things. Let’s validate, first of all, that the VTP domain name did change to CBT Nuggets on this device, which it did right here.

00:18:31
That’s great news. And let’s also validate whether or not that change we made of the VTP domain is now going to cause our trunks to show up. This is a very good sign. We have ports 5 and 6 both as trunks, and they’re both forwarding VLANs 1 through 4. And it really is amazing how a little change like a VTP domain name can be the difference between a working and non-working trunk.

00:18:54
Let’s also verify the results of getting those VLANs whether they show up or not. We’ll do a show vlan brief. I’m going to remove the FDDI and Token-Ring-related ones. And there we have it, VLANs 1, 2, 3, and 4. So it appears the VLAN database on switch 1 and switch 2 has now been synchronized. So the incorrect VTP name actually caused the trunk to fail and also, subsequently, caused VTP to fail.

00:19:17
Before we leave switch 2, let’s do one more show vtp status just to take a look at the details. Here we have a configuration revision of 7. The VTP domain is CBT Nuggets. And the MD5 digest is going to be the same as switch 1’s. We know that because this is based on the password.

00:19:35
And because VTP is operating correctly, this switch must have the same password as switch 1. So here’s my question. Do you remember the command that we would issue to see the VTP password? And if you’re saying, Keith, Keith, I know, it’s show vtp password, you would be absolutely correct.

00:19:52
So let’s do that. We’ll do a show vtp password, and it’s going to result in the same password we have on switch 1, because VTP is working. Let’s make a road trip over to switch 3. This is the interface FA 0/6 that’s connecting it over to switch number 2. First let’s verify and see if those same VLANs exist.

00:20:09
If VLANs 1 through 4 exist, there’s a good possibility that they may be learned through VTP. And if there are no additional VLANs, as there aren’t here, that indicates a problem with VTP. So let’s verify our trunks. We’ll do a show interface trunk, just to validate that we do have a trunk link between switch 3 going up to switch 2. And it appears that we do.

00:20:31
However, we’re not doing any forwarding whatsoever. And that’s because scientifically it’s been proven that you cannot forward for a VLAN that doesn’t exist. So if this switch doesn’t have VLANs 2, 3, or 4 and it’s currently forwarding for VLAN 1 on another interface, which it is for management, that’s a great explanation of why we’re not doing forwarding for any VLANs, because they simply don’t exist on this switch.
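The reasoning here boils down to a set operation, sketched below in Python. The function name and arguments are hypothetical, and the idea that VLAN 1 is held back on this trunk by Spanning Tree because it forwards on the separate management path is an assumption drawn from the description above.

```python
def trunk_forwarding(allowed, existing, stp_blocked=frozenset()):
    """VLANs a trunk can forward: allowed on the trunk, present in the
    local VLAN database, and not blocked by Spanning Tree on this port."""
    return sorted((set(allowed) & set(existing)) - set(stp_blocked))


allowed = range(1, 4095)  # a trunk that allows every VLAN
# Switch 3 before VTP is fixed: only VLAN 1 exists, and VLAN 1 forwards elsewhere.
print(trunk_forwarding(allowed, existing={1}, stp_blocked={1}))
# The same trunk once VLANs 2 through 4 exist locally.
print(trunk_forwarding(allowed, existing={1, 2, 3, 4}, stp_blocked={1}))
```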

00:20:56
Here on switch 3 it’s very likely we have some kind of a problem with VTP specifically. We’ll do a show vtp status, see what the details are. And I want you to notice a couple things. Take a look at the MD5 value, which should match, if the passwords are correct, between the three switches.

00:21:12
And secondly, also the configuration revision number. This guy has a configuration revision number of 10, which is different than switch 2 that we just looked at. In fact, we can jump over there just for a moment to compare. So its configuration revision number was 7. And that’s a bad sign, because the configuration revision number should be identical across all switches that are synchronized regarding VTP and their VLANs.

00:21:34
So let’s go back to switch 3. And if this MD5 digest is different than on switch 2, which, by the way, it is, that would be an indication that the password is not correct. So we can quickly verify what the password is with the show vtp password, and this says the password is NewPassword, which is different than the password set on switch 1 and switch 2. And that would cause a failure to synchronize VLAN databases, because the MD5 hash, the password, if you will, doesn’t match between the devices.
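Why a different password shows up as a different digest can be illustrated with Python's hashlib. To be clear, this is not the real VTP digest computation; the function toy_digest and its input format are invented for illustration, and the password OldPassword for switches 1 and 2 is a placeholder, since the transcript never states it.

```python
import hashlib


def toy_digest(domain, password, vlans):
    """Illustrative only: NOT the actual VTP algorithm. It just shows that
    changing the secret changes the resulting MD5 value."""
    material = f"{domain}|{password}|{sorted(vlans)}".encode()
    return hashlib.md5(material).hexdigest()


vlans = {1, 2, 3, 4}
switch2 = toy_digest("CBT Nuggets", "OldPassword", vlans)  # placeholder password
switch3 = toy_digest("CBT Nuggets", "NewPassword", vlans)  # value shown on switch 3

print(switch2 == switch3)  # digests differ, so the databases will not synchronize
```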

00:22:04
Here’s the sad part of our story. The solution to this is really simple, right? It’s like, well, we can just change the password, which we can. However, Anthony warned us about taking a look at the configuration revision number. He said that the highest one, the one highest numerically when a VTP switch comes online, is going to use its database.

00:22:24
And that’s going to be replicated across the entire switched environment in that VTP domain. Now what we could do is we could create, for example, a brand new VLAN to test it. Because we are in the same VTP domain, we have the same password, if we create a new VLAN, that new VLAN should be replicated across the other switches in this VTP domain.

00:22:44
So if we create VLAN 50, for example, and give it a name and then exit, that VLAN should exist not only locally, but it should also be synchronized to the other switches. Let’s you and I validate that that VLAN exists locally first, and then we’ll go take a look at switch 2 and switch 1, respectively. We’ll do a show vlan brief, and ta da! There it is, VLAN 1 and VLAN 50, that brand new VLAN that we just created.

00:23:09
Now what you and I get to do, let’s make a road trip over to switch 2, and let’s just validate whether or not that new VLAN 50 exists on switch 2. If VTP is working, it should be there. I want you to notice something very carefully about this output. VLAN 50 is there indeed. But what’s missing? Oh my gosh.

00:23:29
We’re missing VLANs 2 and 3 and 4. Why? Because the configuration revision number on switch 3 was higher than the rest of the switches. And once we got the password resolved and those VTP advertisements were being sent and the configuration revision number was greater on the advertisements that switch 3 was sending, the other two switches said, hey, that must be the most accurate, and they synchronized their VLAN databases to match what switch 3 had. The sad part is if we had 200 or 300 computers all on VLANs 2, 3, and 4, those computers, those devices are now dead in the water.
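Here is a minimal Python sketch of that overwrite. It is a toy model with hypothetical names, it assumes the domain and password already match, and it assumes that creating VLAN 50 bumped switch 3 from revision 10 to 11.

```python
def receive_advert(local, advert):
    """Apply the rule described above: a higher configuration revision in the
    same domain replaces the local VLAN database. Returns True if overwritten."""
    if advert["domain"] == local["domain"] and advert["revision"] > local["revision"]:
        local["vlans"] = set(advert["vlans"])
        local["revision"] = advert["revision"]
        return True
    return False


switch2 = {"domain": "CBT Nuggets", "revision": 7, "vlans": {1, 2, 3, 4}}
# Switch 3 had only VLAN 1 at revision 10, then VLAN 50 was created (assumed revision 11).
switch3 = {"domain": "CBT Nuggets", "revision": 11, "vlans": {1, 50}}

receive_advert(switch2, switch3)
print(sorted(switch2["vlans"]))  # VLANs 2, 3, and 4 are gone
```

Had switch 3 been reset to revision 0 first, the comparison would fail and switch 2 would keep its database.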

00:24:04
They cannot send or receive frames on the switches they’re connected to or on any switches that are part of this VTP domain. So not only is this devastating in a production environment, it’s also devastating in a lab environment. For example, if you’re working on some scenario at the CCIE lab, and you make a change, and it wipes out a bunch of VLANs accidentally, that’s not a good thing.

00:24:24
Anthony did tell us the solution for preventing this in the future. What we should have done is on switch 3, do you remember the trick? It’s to rename the VTP domain name to something temporary and then name it back. That will force the configuration revision to 0, which will also trigger the switch to send out a request saying, hey, I need information about this VTP domain.

00:24:46
And that’ll trigger another switch to send an advertisement to give it all the details. So by going from one name to another and back, it set the configuration revision to 0. And what we’re seeing right here is that it re-synced up with the rest of the network. Now what I thought would be handy is I captured all of that inside of Wireshark so we could look at the details.

00:25:04
So if we do a display filter and look for just VTP packets, here’s our request right here. When we flipped him to a different domain and then flipped him back, he sent out a request in VTP. And the request said, yes, I’m looking for information about this VTP domain name CBT Nuggets.

00:25:21
That’s packet 118 right here. Here in frame 119, we have another switch, very likely switch 2, that’s advertising a summary advertisement. Now a summary advertisement in VTP is like a Reader’s Digest overview. It’s saying things like here’s the VTP domain, there’s our current configuration revision number, and here’s the MD5 digest currently.

00:25:43
It also indicates in this summary advertisement that there’s one more advertisement to follow, and that’s the subset advertisement. That contains the details for the VLANs. So if we move to frame 120, this subset advertisement includes, for example, the default VLAN, VLAN 1. And this guy should be VLAN 50 right here. With this VLAN identifier, you might say, well, Keith, it was VLAN 50. It wasn’t VLAN 32. And the reality is that 32 is 50 written in hex. So 16 times 3 is 48, plus 2 more equals a decimal of 50. So 32 in hex is 50 in decimal. And there’s the name of our new VLAN as well.
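If you want to double-check that hex arithmetic away from the capture, a couple of lines of Python will do it. The variable names are just for illustration.

```python
# The VLAN identifier shown in the capture is 32 in hexadecimal.
from_capture = int("32", 16)  # parse the string "32" as base 16
by_hand = 3 * 16 + 2          # three sixteens plus two ones

print(from_capture, by_hand, hex(50))
```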

00:26:23
The one above it should be VLAN 1, which it is. And these bottom four should be the Token Ring and FDDI ones, which they are as well. Then once switch 3 got those, it was so happy about that information it sent its own summary advertisement and its own subset out regarding that same information.

00:26:41
And in VTP, even if there are no changes, these periodic advertisements are going to happen to assist in keeping all the VLANs synchronized across the switches in the VTP domain. The diagram of the topology that we’re using, along with that capture file that we just looked at, is available in the Nugget Lab files for this course.

Surviving STP

00:00:00
Spanning Tree Protocol is a technology with issues. Yeah, there are many technologies popping up today like TRILL or Cisco’s implementation called FabricPath that literally eliminate Spanning Tree Protocol. But if you have this STP running in your switched infrastructure, let’s, in this Nugget, really focus on surviving it.

00:00:24
By the way, this is another one of those particular Nuggets that has relevance for CCNA R&S, CCNP R&S, and of course, CCIE R&S. Now, let’s start this Spanning Tree Protocol review section by examining the different flavors of Spanning Tree Protocol that could indeed exist in your network infrastructure.

00:00:47
We have classic Spanning Tree Protocol, which is 802.1D. 802.1D will show up in the operating system, by the way, as an IEEE designation, when we do our show spanning-tree command. So 802.1D, classic Spanning Tree Protocol, we’re really not going to see that implemented much anywhere today.

00:01:15
In fact, it’s not even the default any longer on these Cisco Catalyst switches. What we’re going to see instead is 802.1W, Rapid Spanning Tree Protocol. Specifically, if you want to implement this on a Cisco device, you’re going to implement a mode called Rapid Per-VLAN Spanning Tree Plus.

00:01:43
So it’s Rapid Per-VLAN Spanning Tree Plus that we will implement in order to get our 802.1W technology. 802.1S could also be implemented. This is Multiple Spanning Tree Protocol. And this is going to be a mode in the Cisco device of Multiple Spanning Tree Protocol.

00:02:09
When you implement Multiple Spanning Tree Protocol, it’s important to realize that you are getting our 802.1W behavior built in. So how can we implement Multiple Spanning Tree Protocol along with Rapid Spanning Tree Protocol? Just implement Multiple Spanning Tree. Now, it would be logical for you to think that you would have a problem in a mismatched type of environment.

00:02:37
For instance, what if we had a Switch 1, and this switch was connected to Switch 2. And the switch on the right was running the rapid per VLAN Spanning Tree Plus, or 802.1W. And this other device was running 802.1D, was running our classic Spanning Tree Protocol.

00:02:59
Interestingly enough, there is not a problem here. There’s a problem from a performance perspective. You’re not going to get the performance gains of Rapid Spanning Tree Protocol, until all of your devices are running it. But the two devices can coexist, just fine.

00:03:18
Now, keep in mind, in a certification environment, this might be considered an incorrect configuration that you need to troubleshoot. So be sure to watch out for this. And it’s our show spanning-tree command that will quickly verify what flavor of Spanning Tree Protocol we’re running.

00:03:38
Now, successfully troubleshooting Spanning Tree Protocol really does hinge upon you having mastered the Spanning Tree Protocol process that takes place behind the scenes in your infrastructure. Now, it’s really important for you, when practicing with Spanning Tree Protocol, to really simplify things.

00:04:00
An easy way to simplify things is to go ahead and connect four switches like this in a square. Yeah, that’s a really easy way for you to visualize and practice with Spanning Tree Protocol. As you might guess, one of the ports in this topology is going to block, and having one of the ports block will prevent our loop condition– so four devices connected in a square.

00:04:27
By the way, you will probably have other connections between these switches in your lab environment. So just be sure to do a shut on all of the interfaces, except those interfaces that we are using in our scenario. So shut everything down and have the only links active being the four links that we see before us in this topology diagram.

00:04:53
OK. With that said, what is the four-step Spanning Tree Protocol process that you want to have committed to memory? Really, you want to be absolutely clear on this process, to aid in your troubleshooting. Step one of the Spanning Tree Protocol process is, there is a root bridge that is elected, right?

00:05:17
The root bridge, the king of the hill, the root of the Spanning Tree that is developed. The root bridge election, as you know, is going to be won by the lowest bridge ID. The bridge ID is made up of a configurable priority value, plus the VLAN identifier (called the Extended System ID), plus the MAC address of the device.

00:05:47
So if you do not manipulate the priority value, the lowest MAC address is going to dictate which device is the root bridge. Step two, on each non-root bridge, we have one root port that is elected. The root port is the best path, from a bandwidth perspective, back to the root bridge.
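The root bridge election is easy to model in Python, because comparing bridge IDs is just comparing tuples. This is a sketch under assumptions: the helper bridge_id, the switch names, and the MAC addresses are all made up for illustration.

```python
def bridge_id(priority, vlan, mac):
    """Comparable bridge ID: (priority plus Extended System ID, MAC as an integer)."""
    return (priority + vlan, int(mac.replace(":", ""), 16))


# Hypothetical switches, all left at the default priority of 32768 for VLAN 1.
switches = {
    "S1": bridge_id(32768, 1, "00:1a:2b:00:00:30"),
    "S2": bridge_id(32768, 1, "00:1a:2b:00:00:10"),
    "S3": bridge_id(32768, 1, "00:1a:2b:00:00:20"),
}
root = min(switches, key=switches.get)  # priorities tie, so the lowest MAC wins
print(root)

# Lower the priority on S3 and the MAC address no longer matters.
switches["S3"] = bridge_id(4096, 1, "00:1a:2b:00:00:20")
root_after = min(switches, key=switches.get)
print(root_after)
```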

00:06:16
If there is a tie in the bandwidth values back to the root bridge, then tie breakers are used. And this would include the bridge ID (lower is better) and the interface identifier (lower is better). And these are fairly arbitrary, right? The first thing that is looked at, thank goodness, is bandwidth, to make sure we don’t block a higher bandwidth path instead of a lower bandwidth path.

00:06:47
So let’s say, in our example here, that Switch 2 is elected as the root bridge. That’s step one. In step two, we would have one root port on each non-root bridge be elected. And just to make things clear, let’s make this a 100 megabit per second link. And let’s make these one gigabit links.

00:07:16
OK. So these are one gigabit links. And then we have a Fast Ethernet link over there. OK, Switch 1, the best path to Switch 2 from a bandwidth perspective is right here, so the root port on Switch 1 gets elected. Right over here on Switch 3, we have a root port. It’s got a nice gigabit link right back to Switch 2. Now, what about Switch 4? Well, do we want to go via the Fast Ethernet? Or do we want to go via the gigabit? Sure, the gigabit will be chosen.

00:07:50
So our root port is right there. Those are the elections that would take place by default, with step two. Now, step three– on each link we are going to have a designated port elected, so each link gets exactly one designated port. Guess what? On the root bridge, by definition, all of our ports are designated ports.

00:08:22
And look at that. That takes care of the designated port on this link. That takes care of the designated port on this link. On the top link, this would be our designated port, because again, it’s got preferential bandwidth back to our root. And then the question becomes this link right here. On this link, the designated port is located on Switch 3, because Switch 3 has the lower cost back to the root bridge.

00:08:53
And now Spanning Tree Protocol is at its fourth step. Everything left over is non-designated and would be blocking. There’s only one port left over, and it’s right here. That’s our non-designated blocking port. And it’s no coincidence that it is the slower link of our topology, that has the blocking port.
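The four steps can be walked through in a short Python sketch. Several things here are assumptions rather than facts from the drawing: the exact square (Switch 1 to 2, 2 to 3, and 3 to 4 at one gigabit, Switch 1 to 4 at Fast Ethernet), the bridge IDs, and the use of path costs 19 and 4 for 100 megabit and gigabit links. Ports are named by the neighbor they face, and the interface-identifier tie breaker is left out.

```python
# Path costs: 100 Mb/s -> 19, 1 Gb/s -> 4.
COST = {100: 19, 1000: 4}

# Assumed square topology (link speed in Mb/s); the Fast Ethernet link is S1-S4.
LINKS = {("S1", "S2"): 1000, ("S2", "S3"): 1000,
         ("S3", "S4"): 1000, ("S1", "S4"): 100}

# Hypothetical bridge IDs as (priority, MAC as integer); S2 has the lowest priority.
BRIDGES = {"S1": (32768, 0x1), "S2": (4096, 0x2), "S3": (32768, 0x3), "S4": (32768, 0x4)}


def neighbors(switch):
    """Yield (neighbor, link cost) for every link attached to the switch."""
    for (a, b), speed in LINKS.items():
        if switch in (a, b):
            yield (b if switch == a else a), COST[speed]


def run_stp():
    # Step 1: the lowest bridge ID becomes the root bridge.
    root = min(BRIDGES, key=BRIDGES.get)

    # Lowest cumulative cost from each switch back to the root (simple relaxation).
    cost = {sw: (0 if sw == root else float("inf")) for sw in BRIDGES}
    for _ in BRIDGES:
        for sw in BRIDGES:
            for nb, link_cost in neighbors(sw):
                cost[sw] = min(cost[sw], cost[nb] + link_cost)

    # Step 2: each non-root bridge elects one root port (lowest cost, then bridge ID).
    root_port = {}
    for sw in BRIDGES:
        if sw != root:
            nb, _ = min(neighbors(sw), key=lambda n: (cost[n[0]] + n[1], BRIDGES[n[0]]))
            root_port[sw] = nb

    # Step 3: on each link, the end with the lower root cost is the designated port.
    designated = {link: min(link, key=lambda sw: (cost[sw], BRIDGES[sw])) for link in LINKS}

    # Step 4: any port that is neither root nor designated is blocking.
    blocking = [(sw, other)
                for (a, b) in LINKS
                for sw, other in ((a, b), (b, a))
                if designated[(a, b)] != sw and root_port.get(sw) != other]
    return root, root_port, designated, blocking


root, root_port, designated, blocking = run_stp()
print(root, root_port, blocking)
```

With these assumed values the only blocking port is the one on Switch 4 facing Switch 1, which sits on the slower link, matching the walkthrough.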

00:09:18
If you look at something like Rapid Spanning Tree Protocol, our 802.1W technology, it’s still going to do all this that we see. It does a little more. And it does it using a new Proposal and Agreement process to speed things up. But these four steps really do become the gospel, from a Spanning Tree Protocol perspective.

00:09:44
Now, to wrap up this review, let’s remind you of the controls that you have, as an administrator, to literally have this tree constructed in a different manner. Obviously, one of the controls that you have is the ability to influence that step one, to influence the election of the root bridge.

00:10:07
Remember, we do this by manipulating the bridge priority. So I’ll say, B and then priority. We manipulate the bridge priority. And there are two ways to do that. You can set it manually, with the keyword priority in the spanning-tree command. Or you can do it by setting a root primary.

00:10:35
And then you could go in and set a root secondary. That’s the syntax we would use, in order to have the switch examine the priorities around it and try and set itself automatically lower. I don’t know about you, but I don’t like things happening automatically, whenever I can avoid it in my network.
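How the automatic option picks a value can be sketched roughly in Python. Treat this as a simplified model, not a definitive description of the macro: it assumes root primary settles on 24576 when that is enough to win, otherwise goes 4096 below the current root's priority, and that root secondary uses 28672. The function name is hypothetical.

```python
ROOT_SECONDARY_PRIORITY = 28672  # assumed value applied by root secondary


def root_primary_priority(current_root_priority):
    """Simplified model of the priority chosen by root primary."""
    if current_root_priority > 24576:
        return 24576
    candidate = current_root_priority - 4096  # one increment below the current root
    if candidate < 0:
        raise ValueError("cannot go lower than the current root's priority")
    return candidate


print(root_primary_priority(32768))  # current root at the default priority
print(root_primary_priority(8192))   # current root already lowered by hand
```

Setting the priority manually avoids depending on what the neighbors happen to be advertising at the moment the command is entered.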

00:10:55
So I would tend to use the priority manually, in order to force a particular device to be the root bridge. Now, in our steps two and three of the process, we had root ports being selected, and we had the designated ports being selected. We can manipulate that by manipulating the Spanning Tree Protocol costs assigned to links.

00:11:21
Or we can manipulate the port priority value that is assigned to interfaces. So we can manipulate which ports are chosen for specific roles, using one of two values. And finally, it’s important for us to know about the Spanning Tree Protocol toolkit, as we troubleshoot and survive Spanning Tree Protocol.

00:11:48
The tools UplinkFast and BackboneFast, these tools are really gone. Because classic Spanning Tree Protocol 802.1D really isn’t being implemented in networks all that much anymore, those two tools, which existed for classic Spanning Tree Protocol, they’ve faded out of usage as well.

00:12:15
The one tool that we are still utilizing, even in Rapid Spanning Tree Protocol environments, is PortFast. This is the tool that we use for connections out to servers, for instance, or workstations. It is still used, even in rapid and multiple Spanning Tree Protocol environments.

00:12:41
Don’t forget about BPDU Guard. We can go ahead and turn this on, on our PortFast ports. And this will err-disable a port, should someone decide to plug in a device that can create a temporary bridging loop, like another switch. Another tool worth mentioning from the toolkit is Root Guard.

00:13:06
With Root Guard, we can go in, and we can select a particular root bridge. And then, we can set up Root Guard on the ports of that device, so that no other device can come in and try to take over that root bridge role, using what we call superior Bridge Protocol Data Units.

00:13:30
Yeah, no way you can come in with a lower priority and wrestle that root bridge role away from us. Finally, there is Loop Guard. Loop Guard is really interesting. Let’s say we have a topology, and this port is in the blocking state. And we want that port to stay in the blocking state.

00:13:52
We don’t want convergence to block some other port and move that into a forwarding state. We want the topology rather static. We would rather be alerted to a failure and go correct it than run the risk of having a loop in our topology. That’s what Loop Guard is all about.

00:14:16
It will literally stop the normal convergence process of Spanning Tree Protocol. Very conservatively, it stops that normal convergence process. So there’s really no chance of a loop, thanks to maybe a switch being too busy from a CPU perspective to properly process BPDUs and then make a port go to forwarding, that would normally be blocked and create a loop.

00:14:47
So Loop Guard is really unique in that it stops the normal convergence process of Spanning Tree Protocol. Well, this is certainly a lot of information. And to really drive it home and to make sense of it at the command line, we’re, of course, going to turn to our good friend, Keith Barker.

00:15:08
Keith, take it away.

Hey, thanks, Anthony. I would love to. One of the challenges that we all face when troubleshooting Spanning Tree is believing in stuff that perhaps isn’t actually configured. For example, let’s say you and I are given this topology. And we’re asked to go ahead and either investigate or troubleshoot the Spanning Tree.

00:15:27
The first thing that we might want to do is validate that these priorities on these switches really are what they say they are. Because as you and I know, the lowest bridge ID is going to become the root of Spanning Tree. So how do we validate that? Well, one quick and easy way to do that is show spanning-tree.

00:15:44
And then comes the VLAN that you’re interested in (because remember, every single VLAN can have its own separate instance of Spanning Tree with per-VLAN Spanning Tree), and then the keyword bridge. And what that’s going to do is identify for us, immediately, what the bridge ID is, including the bridge priority for that switch.

00:16:03
So the first thing we ought to do is validate for VLAN 100 that these priorities are in the right ranges. So to verify that, let’s start on Switch 1. And let’s do a show spanning-tree vlan 100 bridge, just to validate that the priority, as part of the bridge ID, really is in the 32,000 range. So based on results, it is.

00:16:25
And what we want to do is, on every single switch, just go through and quickly validate that the bridge priority for that specific VLAN is set correctly. So we’ll do that on Switch 2. We’ll do a show spanning-tree vlan 100 bridge, just to validate that the priority value in the bridge ID on Switch 2 is somewhere in the 12,000 range, which it is, based on this value right here.

00:16:46
And we want the same thing on Switch 3, just to validate it, and also on Switch 4. Now, the actual bridge priority values are in increments of 4096. So we do 0, 4096, 8192, and so forth. So on Switch 3, 8192 is the implemented priority value, which becomes part of that bridge ID, which so far is the lowest of all the switches in the topology.
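The reason the values land in a range rather than on an exact multiple of 4096 is that the VLAN number is added to the configured priority. Here is a small Python sketch of that arithmetic. The helper names are hypothetical, and the configured value of 12288 for Switch 2 is an assumption inferred from the 12,000 range mentioned above.

```python
def is_valid_priority(priority):
    """Configurable bridge priorities run from 0 to 61440 in steps of 4096."""
    return 0 <= priority <= 61440 and priority % 4096 == 0


def displayed_priority(configured, vlan):
    """Priority shown per VLAN: configured value plus the VLAN ID (Extended System ID)."""
    return configured + vlan


# Configured values for VLAN 100; 12288 on Switch 2 is an assumed value.
configured = {"Switch 1": 32768, "Switch 2": 12288, "Switch 3": 8192, "Switch 4": 32768}
shown = {name: displayed_priority(value, 100) for name, value in configured.items()}
print(shown)
print([p for p in range(0, 20000, 4096)], is_valid_priority(8000))
```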

00:17:10
And let’s also go and validate Switch 4, that it indeed is in the 32,000 range, which would be the default. So that’s a fantastic start for us, to validate that the actual priority values that are in the actual switches reflect what’s shown in the topology for the VLAN in question.

00:17:27
Now, our next step should be to identify who the actual root bridge is, and more importantly, to validate that all the switches agree on who the root is for VLAN 100. So here’s my question for you. Out of the information that we just looked at, based on the bridge IDs that we just looked at for all these switches, which of all these switches should be the root switch or the root bridge for VLAN 100? Now, if you’re saying, Keith, that’s so easy.

00:17:50
I know that. It is Switch 3, which has the lowest priority, which equates to the lowest bridge ID. And I think you’d be absolutely right. So our next step would be to validate that all these switches in this topology agree that Mr. Switch 3 is the actual root. Now, there are a couple of ways we can see that.

00:18:09
We could do it with a show spanning-tree vlan 100 and look at all the information. Or we could specifically ask the switches, hey, tell me who you think the root is. And they should all agree. The other cool thing is that on the non-root switches that have root ports, this command will also show us what those root ports are for VLAN 100. So here’s what I’d love for you to do right now.

00:18:30
Before we actually use this command, I’d like you to take this topology; you could just pause the screen right here. Go ahead and pause me, so you can take a look at it. Or if you want to, in the Nugget Lab files for this video, you can also download this topology, regarding VLAN 100. And I’d like you to identify which of the ports are going to be root ports.

00:18:50
Now, on the root switch, all the ports are going to be forwarding designated ports. But Switch 1, Switch 2, and Switch 4 are each going to have one, and only one, root port. I’d like you to identify which of their interfaces are going to be root ports. So here’s what I want you to do.

00:19:07
Go ahead and pause me right now. And when we come back, we’ll use this command to validate that all the switches agree that Switch 3 is the root, and we’ll also identify the single root port on each of the non-root switches. So we have reason to believe that Switch 3 is the root of Spanning Tree for VLAN 100. And one of the giveaways is this– the root cost is zero, meaning it doesn’t cost anything to reach me.

00:19:30
I am the root. And there’s no Root Port listed, which is another telltale sign that this switch is the root for VLAN 100. So let’s do this same command over on Switch 1. So on Switch 1, based on what I’ve asked you to do, you’ve already determined or guessed which port will be the root port. And we can validate that by using the command show spanning-tree vlan 100 root. That will indicate who Switch 1 believes is the root. And it will also indicate the root port.

00:19:59
So as we look at the bridge ID of the root, Switch 1 believes that this guy is the root. And this also happens to be Switch 3’s bridge ID. It also says that Gi0/17 on Switch 1 is the root port. And there is a root cost of 19. I’d like to chat with you a little bit about why that is.

00:20:18
So if this is Switch 1, and this is port 17, Gi0/17. And this is Switch 3. And this is Fast Ethernet 0/17, and they’re connected together. And from this output, it shows that’s the root port. Why do we have a cost of 19? In Spanning Tree, if we’re operating at 100 megabits per second, that port is considered to have a cost of 19. If we’re at 1,000 megabits per second, which is also known as one gig, that port would have a cost of 4. So how come this gigabit interface is currently showing a cost of 19? Because here’s what happened– the root is advertising a cost of 0, hey, it doesn’t cost anything to reach me.

00:21:02
And Switch 1, when it receives those BPDUs, is adding on the local cost of its interface, for a total cost to reach the root. So that implies that Switch 1 added a cost of 19. So 19 plus 0 equals 19. And that’s what we’re seeing right here. So a great question is, why in the world does this path cost 19 and not 4, given that it’s a gigabit interface?

00:21:24
And the reality is that these two are directly connected, and they auto-negotiated. And because Fa0/17, that port, can’t do a gigabit, they both negotiated at 100 megabits per second. And so from a Spanning Tree perspective, the cost of a 100 megabit interface is 19. And that’s what it’s going to go ahead and use.
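
Here is a small Python sketch of that arithmetic (not from the video). The 19 and 4 come from the discussion above; the 10 Mbps and 10 Gbps entries are the usual legacy short-mode values, added only for completeness. The function names are hypothetical, and taking the minimum of the two ports' top speeds is a simplification of auto-negotiation.

```python
# Legacy (short-mode) Spanning Tree port costs, keyed by link speed in Mbps.
# 100 -> 19 and 1000 -> 4 are the values quoted in the video; the other two
# are the standard legacy values, included here as an assumption.
PORT_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}


def negotiated_speed(local_max_mbps: int, peer_max_mbps: int) -> int:
    """Simplified auto-negotiation: the link runs at the slower port's top speed."""
    return min(local_max_mbps, peer_max_mbps)


def root_path_cost(advertised_cost: int, link_speed_mbps: int) -> int:
    """Cost to the root = cost advertised in the BPDU + cost of the receiving port."""
    return advertised_cost + PORT_COST[link_speed_mbps]


# Switch 1 Gi0/17 (gigabit capable) connected to Switch 3 Fa0/17 (100 Mbps max).
speed = negotiated_speed(1000, 100)
print(speed, root_path_cost(advertised_cost=0, link_speed_mbps=speed))
```

The point of the sketch is that the cost follows the negotiated speed, not the name of the interface.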

00:21:44
So here’s my question for you. Did you get it right? Did you know that Gi0/17 was going to be the root port on Switch 1? If so, great, great job. Let’s go ahead and take a look at Switch 2, to verify that Switch 2 knows who the root is and which root port Switch 2 will be using.

00:22:01
So road trip over to Switch 2, and on Switch 2, we’ll do show spanning-tree vlan 100 root. Survey says, the exact same bridge ID, which happens to be the bridge ID of Switch 3, which is great, and a path cost to get to the root of 19. Switch 3 is advertising a cost of 0. And Switch 2’s local interface, Fa0/19, which is the root port, because it’s operating at 100 megabits per second, has a cost of 19. It added those up, the advertised cost plus the cost of the interface, for a total cost of 19. 19 plus the advertised 0 equals a cost to get to the root of 19. Fantastic.

00:22:40
Did you get that one correct? And if so, excellent, excellent work. Let’s go over to Switch 4. And on Switch 4, we’ll ask him the same question. Who do you think the root is, and by the way, what is the root port? He should report also, if Spanning Tree is working, that the bridge ID of the root is the same as what everybody else believes, which is Switch 3’s bridge ID. The root port it’s going to use is Fa0/23. And the cost is 19. Again, that’s because the root, which is Switch 3, advertised a cost of zero.

00:23:11
And the local fast ethernet, operating at 100 megabits, has a cost of 19. It added the advertised cost plus the local interface cost, for a total cost of 19 to get to the root. Now, my question for you is, did you get the port right? Did you choose port Fa0/23 as the root port? Or did you choose Fa0/21? And what I’ve discovered is that most people would have incorrectly guessed the root port that Switch 4 is using to get to the root.

00:23:40
And what you and I get to do right now is take a moment to discuss why Fa0/23, the higher numbered port, got chosen, compared to a lower numbered port. Because in Spanning Tree it’s like, everything lower is better, right? So how come it chose the higher numbered port? So let’s take a look at the details of why that’s occurring.

00:23:58
So here we have Switch 3, which is the root. And we have Switch 4, which is not the root. And on Switch 3, we have a couple ports. We have Port 21 and Port 23. We also have those same ports on Switch 4, Port 21 and Port 23. And you can see the diagram right over there, if you’d like to take a look at it.

00:24:14
And what’s happening is, Port 21 is connected to Port 23 and vice versa over here. So Switch 4, when it’s going through its Spanning Tree process, says, for VLAN 100, oh my goodness. I’m receiving BPDUs here and here, for that same VLAN. And as a result, if I start forwarding on both of those ports, I will be creating a loop.

00:24:33
And that’s what Spanning Tree is there to prevent. So what it does– it says, I need to go ahead and block on one of these ports. Which one do I use? It’s always based on cost first. Now, the cost on Switch 3 that’s being advertised is the cost of 0, cost of 0. The local port cost on Ports 21 and 23 is 19. So as Switch 4 considers the cost to get back to the root, he has a problem.

00:24:54
He has 19 on 21 and 19 on 23. It’s the same exact cost. So what exactly is the tiebreaker? Well, after cost we have the bridge ID. Now, because this is Switch 3 in both cases, the bridge ID of the advertising or designated switch is going to be exactly the same.

00:25:12
So because it’s equal, that’s not a tiebreaker. So it’s cost first, then the lowest bridge ID. And now, it gets down to ports. There is port priority. And there’s also Port ID. Now, the port priority, by default, on Port 21 and Port 23 is the same– we’re talking about the advertising ports on Switch 3. So the port priority is equal.

00:25:35
So that is not a tiebreaker either. So we have cost first, then the lowest bridge ID, then the port priority, lower again being better. And finally it boils down to port number. So check this out. Because all these other factors are equal and the BPDU is coming down 23 and 21, the cost is going to be equal, from the perspective of Switch 4 on the two ports. The bridge ID is equal.

00:25:57
The port priority is going to be equal. And the port priority is in the advertisements of the BPDUs, coming from Switch 3, coming to Switch 4. And so it boils down to port number. But it’s not– I repeat, it is not the receiving port number– lowest is better.

00:26:12
It is the advertising port. So because 21 is lower than 23, the BPDU advertised out of Switch 3’s Port 21, which is received on our Port 23, will be preferred because of the lowest advertised port ID, when all other factors are equal. And that, my friend, is why, on Switch number 4, we are using Fa0/23 as our root port, to get to the root. Because you and I are right here at the command line interface, let’s go ahead and use another show spanning-tree command, this time, with the keyword detail at the end of it, to actually see those details, regarding the advertised port number.

00:26:50
So if we scroll up just a little bit, this output shows us that we’re in the IEEE-compatible Spanning Tree. That’s 802.1D. Here’s our default priority of 32,000 and change that we’re using. Here is the information about our current root, which is Switch 3. And if we scroll down just a little bit, here’s the details regarding Port 21 and Port 23. So this poor switch has to decide, out of Port 21 and Port 23, which one to forward on. So it’s going to go for, the lowest is best.

00:27:19
The first thing it looks at, always, always, always, is the cost. The advertised cost that it received on the BPDUs that came in on Port 21 has a cost of 0. Plus the local port cost of 19, because it’s operating at 100 megabits per second, is a total of 19 for Port 21. And on Port 23, same thing, the advertised cost is zero. The local port cost is 19. So cost is a wash.

00:27:43
They’re equal cost to get to the root. The second thing we’re going to look at is Bridge Identifier. Because the reality is, we could be learning these BPDUs from two separate switches. So whichever switch, that’s advertised in this BPDU, has the lowest bridge ID.

00:27:57
We’ll choose that path as our next decision point. So the designated switch for this segment has a bridge ID of this, which happens to be the root. And it’s the same switch for both ports, so that is equal. So it will say root on both. So we can’t use bridge ID as a tiebreaker.

00:28:13
The next thing that we’re going to consider– and this is important– is the advertised priority and Port ID, not our local priority for an interface but the advertised priority and Port ID, in the BPDUs that came in. So what that means is, on Port 21 the BPDU came in with a priority of 128. And on Port 23, it came in with a priority of 128. Both of those were sourced by Switch 3. And because the priority is exactly the same, the next decision point is the actual Port ID itself, which is this portion.

00:28:47
So on our local Port 21, the Port ID in the advertised BPDU that we received was 23. And on Port 23, the advertised BPDU Port ID that we received was 21. And as a result, this one is lower. And that is why, my friend, we are forwarding on Port 23 instead of Port 21. So if we wanted to manipulate the actual port priority, to change which interface, which port we’d be using, we would have to change the port priority on Switch 3, the advertising designated switch for that segment.
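
The decision order walked through above (cost, then the sender's bridge ID, then the sender's port priority, then the sender's port number) can be sketched in Python as a simple tuple comparison. This is an illustration, not switch code: the class and function names are hypothetical, the MAC digits after 000e are invented, and 8292 assumes Switch 3's priority of 8192 plus VLAN 100.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    local_port: str                    # port on Switch 4 that received the BPDU
    root_path_cost: int                # advertised cost + local port cost
    sender_bridge_id: Tuple[int, str]  # (priority, MAC) of the designated switch
    sender_port_priority: int          # port priority carried in the BPDU
    sender_port_number: int            # port number carried in the BPDU


def pick_root_port(candidates: List[Candidate]) -> str:
    """Lowest tuple wins; the local port number is never part of the key."""
    best = min(
        candidates,
        key=lambda c: (
            c.root_path_cost,
            c.sender_bridge_id,
            c.sender_port_priority,
            c.sender_port_number,
        ),
    )
    return best.local_port


SWITCH3_ID = (8292, "000e.0000.0003")  # MAC remainder is made up for the example

# Default case: Switch 4 Fa0/21 hears Switch 3 port 23, Fa0/23 hears port 21.
default_case = [
    Candidate("Fa0/21", 19, SWITCH3_ID, 128, 23),
    Candidate("Fa0/23", 19, SWITCH3_ID, 128, 21),
]

# Tuned case: port priority lowered to 64 on Switch 3's port 23 (the sender).
tuned_case = [
    Candidate("Fa0/21", 19, SWITCH3_ID, 64, 23),
    Candidate("Fa0/23", 19, SWITCH3_ID, 128, 21),
]

print(pick_root_port(default_case))  # Fa0/23
print(pick_root_port(tuned_case))    # Fa0/21
```

Notice that nothing configured locally on Switch 4's ports appears in the comparison key, which is why changing the port priority there has no effect on the choice.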

00:29:22
Because it’s the received priority that Switch 4 is receiving, that it’s making that decision on, what happens a lot is, people will change the actual port priority on Switch 4. It doesn’t affect the Spanning Tree decision whatsoever. And they wonder why.

00:29:36
And now we know. Because it’s the received priority that we receive from the designated switch on that segment, and not our locally configured port priority that we’re making the decision on. So the reality is, most people would incorrectly guess what the forwarding port or the root port would be on Switch 4. But now that you know the decision process and how it operates, you can not only predict what it will do.

00:29:59
You can also make changes to a port priority on the upstream switch, in this case Switch 3, to change that decision. So we’ve taken a look at the bridge and root options and also the detail options, to take a look at the advertised priorities that Switch 3 was sending on these ports. And another really cool feature is the Summary option.

00:30:20
The Summary will give us an overview of the Spanning Tree add-ons, I like to call them, like Loop Guard and Root Guard and so forth, that can identify whether or not those features are running on our switch. So for troubleshooting, if we’re trying to identify some feature that might be causing a hindrance in Spanning Tree, the Summary Keyword is a wonderful tool that can help us identify that very quickly.

00:30:41
So for example, if we do a show spanning-tree vlan 100 summary on any of these switches, it can share with us the details regarding PortFast, BPDU Guard, BPDU Filter, Loop Guard, UplinkFast, BackboneFast, the bridge ID of the root for that Spanning Tree, and a quick overview of how many interfaces are in what states in Spanning Tree.

00:31:02
So I’ve got another question for you. It shows us in this summary that we have three interfaces that are currently active, one forwarding, which is our root port, and two more that are blocking. So as we look at the topology, my question for you is, why is it that Gi0/13 and 15 are both in blocking state, between Switch 1 and Switch 2, when Switch 1 has gigabit interfaces and Switch 2 has fast ethernet interfaces? So let’s consider the logic that Switch 1 and Switch 2 are both going through.

00:31:31
Now, because these links here are gigabit on Switch 1 and fast ethernet on Switch 2, just looking at it, one thinks, oh, Switch 1 is better, better, better. The reality is, they’ve auto-negotiated both of them to 100 megabits per second, for all the links. And the real question is not what the speed of these links is.

00:31:49
The real question is, what is the cost to get to the root? The Switch 1 cost to get to the root is 19. And that’s through its Gi0/17 interface. The cost of Switch 2, to get to the root, is a cost of 19, through its Fa0/19 interface. So the cost to the root is equal.

00:32:06
And my question for you is, do you remember what the next decision point is, after cost? And the next decision point is the lowest bridge ID. And if we look at the topology, the bridge ID for Switch 1, the priority is in the 32,000 range. And the priority for Switch 2 is in the 12,000 range, which makes Switch 2 have a much better bridge ID. Lower is better with Spanning Tree.
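
As a rough Python sketch of that contest on the links between Switch 1 and Switch 2 (not from the video): each switch offers its cost to the root and its bridge priority, and the lowest pair wins the designated role. The function name is hypothetical, the exact priorities 32768 and 12288 are assumed stand-ins for "the 32,000 range" and "the 12,000 range", and the MAC address tiebreaker is left out because the priorities already differ.

```python
def designated_switch(candidates):
    """candidates: iterable of (name, root_path_cost, bridge_priority).

    Lowest root path cost wins; on a tie, lowest bridge priority wins.
    (A full bridge ID comparison would also include the MAC address.)
    """
    return min(candidates, key=lambda c: (c[1], c[2]))[0]


# Both switches reach the root at a cost of 19 for VLAN 100.
segment = [
    ("Switch 1", 19, 32768 + 100),  # assumed default priority plus VLAN number
    ("Switch 2", 19, 12288 + 100),  # assumed configured priority plus VLAN number
]

print(designated_switch(segment))
```

The switch that loses this comparison has neither a root port nor a designated port on that link, so its side ends up blocking.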

00:32:30
And as a result, Switch 2 would actually win that little competition on both of its links. And it would be forwarding on both of its interfaces between Switch 1 and Switch 2. And Switch 1, as we see right here, is blocking on 13 and 15 for that reason. Also, in this output, we see that Rapid Spanning Tree is enabled, and the peers that we’re connecting with over our 13, 15, and 17 are running Rapid Spanning Tree as well. If they weren’t, if they were running legacy 802.1D, it would show as (STP) in parentheses, right next to it.

00:33:04
So for example, if we go to the root switch, which is Switch 3, and do a show spanning-tree vlan 100, because Switch 4 is running the legacy IEEE 802.1D, the two interfaces that we’re using from Switch 3 to connect over to Switch 4 show the (STP). It doesn’t change the math or the actual cost factors, because Rapid Spanning Tree is backwards compatible with Spanning Tree.

00:33:28
But that’s a quick way of validating that a peer is running STP. So here’s what you and I get to do next. We’re going to take the same exact topology, and I’d like you to look at it and tell me what you see that’s different from our previous topology. Well, the physical interfaces are the same.

00:33:43
So what’s different? Oh, I see, this is VLAN 200. Oh, and check this out. The actual priorities on the switches are different, from a spanning tree perspective, than for VLAN 100. It’s important for us to remember that, if we have four different VLANs, we could have four totally different instances of Spanning Tree, with, for example, Switch 1 being the root of one of those Spanning Trees, Switch 2 being the root of another one.

00:34:07
And we need to pay attention to which VLAN we’re working with, as we work through a topology. So looking at this, which one of these would be the root of a Spanning Tree 200? Now, if you’re saying, oh, I know. It’s this one, because it has the lowest priority.

00:34:20
You’d be absolutely correct. And the next thing to do, before we start believing that, we’d want to use this command right here, just to validate what the bridge IDs are, for each of the switches, to make sure we don’t have any misconfiguration there. If we discover a discrepancy between the topology and what the actual priorities are for those bridge IDs, we could then administratively change that back to what it should be.

00:34:44
Once we have the priorities straight, we would take a look at verifying who the root is, from every switch’s perspective. For example, if Switch 1 believes it’s the root for Spanning Tree VLAN 200 and Switch 4 also believes it’s the root for Spanning Tree 200, it’s quite possible we have a break somewhere in the network.
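
That agreement check is easy to sketch in Python. This is an illustration only: the function names are hypothetical, and the root IDs below are invented strings standing in for whatever each switch reports for VLAN 200.

```python
from collections import defaultdict


def group_by_root(reported_roots):
    """Map each reported root bridge ID to the switches that reported it."""
    groups = defaultdict(list)
    for switch, root_id in reported_roots.items():
        groups[root_id].append(switch)
    return dict(groups)


def has_single_root(reported_roots):
    """True when every switch names the same root bridge."""
    return len(set(reported_roots.values())) == 1


# Invented example: Switch 1 and Switch 4 each claim to be the root.
reported = {
    "Switch 1": "32968 001c.0000.0001",
    "Switch 2": "32968 001c.0000.0001",
    "Switch 3": "32968 001c.0000.0001",
    "Switch 4": "32968 0021.0000.0004",
}

if not has_single_root(reported):
    for root_id, switches in group_by_root(reported).items():
        print(root_id, "is seen as root by", ", ".join(switches))
```

More than one group in the output suggests the Layer 2 domain is split, and the group membership hints at where to look for the break.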

00:35:03
So here’s what I’m asking you to do– this topology diagram for VLAN 200 is also in the Nugget Lab file. So you can either grab it and work with it or pause it right here. And what I’d like you to do, based on the information provided, is identify who will be the root for the entire Spanning Tree for VLAN 200. I’d like you to identify the root ports.

00:35:25
Again, those are on non-root switches that have the root ports; then the designated ports; and then the ports finally that are in blocking state. And sometimes those are referred to as discarding, in the Rapid Spanning Tree terminology. So when you’ve identified the roles of each of the ports shown in this topology, let’s go ahead and continue the video.

00:35:45
And we’ll walk through it together. So let’s start from what we think will be the root bridge and work our way out. So Switch 2 with a priority in the 8,000 range should be the root. And sure enough, it says, “This bridge is the root.” So this top part identifies the root bridge information.

00:36:02
And this information right here identifies the local bridge information. And because this device is the root, these are going to match. And on a root switch, of course, all the ports are designated. And they’re all going to be forwarding. So next, let’s go up to Switch 1 and ask Switch 1 about its status for VLAN 200. Now, it has two directly connected interfaces to the root, Gi0/13 and Gi0/15. So this top part is identifying the root switch.

00:36:31
And this bottom part is identifying our own Local Switch. And it’s very clear why we’re not the root. It’s that somebody else had a better priority, a better bridge ID than us. So our bridge ID has a priority of 32,000 and change. The root has a priority of 8,000 and change. And that’s why we’re not the root.

00:36:50
But here’s something peculiar. Check this out. This shows that we’re forwarding on Gi0/15. That’s our root port, and we’re blocking on 13. How come we’re not forwarding on 13 and blocking on 15? Why is that? And if you’re saying, Keith, I know. You explained earlier that, if we have two ports that are directly connected, it is the advertising interface’s lowest priority.

00:37:11
And if that’s equal, it’s the advertising interface’s lowest port ID number that’s going to be chosen. And because Gi0/15, if we take a look at the topology, is connected to Switch 2’s Fa0/13, the advertisement out of 13 would’ve been lower than the advertisement out of 15– because of the lower port number.

00:37:31
And that is why this switch, Switch 1, is choosing Gi0/15 to forward on. Another question I have is, why is Gi0/17– why are we blocking on that? So we have a little competition between this local switch, Switch 1, and Switch 3, on who’s going to be designated for that link between Switch 1 and Switch 3. So the logic would go like this.

00:37:54
Who has the best cost to get to the root, Switch 1 or Switch 3? And if they both have a cost of 19 to get to the root, which they do, the next choice would be the lowest bridge ID, between Switch 1 and Switch 3. So if we looked at the details of Switch 1 and Switch 3, which we can do right now– so let’s go ahead and go over to Switch 3 for a moment. And on Switch 3, we’ll do a show spanning tree for VLAN 200. So on Switch 3, the bridge ID is 32,768 plus the VLAN number, followed by the base MAC address.

00:38:29
And these values, right here, 32,768 and 200, should be exactly the same as on Switch 1. So we’re back to Switch 1, and those are exactly the same. It boils down to the MAC address. So here’s the MAC address on Switch 1– 001C, et cetera. So let’s take a look at the base MAC address over on Switch 3. So on Switch 3, the base MAC address is 000e, which is numerically lower than Switch 1’s, which makes the bridge ID better.
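
One way to picture that comparison is to treat the whole bridge ID as a single number: the priority plus VLAN number in the top 16 bits, and the 48-bit base MAC address below it. Here is a minimal Python sketch under that assumption. The function name is hypothetical, and only the leading 001c and 000e come from the output; the remaining MAC digits are invented.

```python
def bridge_id(priority: int, vlan: int, mac: str) -> int:
    """Pack (priority + VLAN) and the base MAC into one comparable integer."""
    mac_value = int(mac.replace(".", "").replace(":", "").replace("-", ""), 16)
    return ((priority + vlan) << 48) | mac_value


# Only the first four hex digits of each MAC are from the video output.
switch1 = bridge_id(32768, 200, "001c.0000.0001")
switch3 = bridge_id(32768, 200, "000e.0000.0003")

# Same priority and VLAN, so the MAC address decides; lower is better.
better = "Switch 3" if switch3 < switch1 else "Switch 1"
print(better)
```

Because the priority sits in the high-order bits, any difference in priority outweighs any difference in MAC address, which matches the order the tiebreakers are applied in.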

00:39:00
And that is the reason that Fa0/17 is the designated interface for that segment. And Gi0/17 on Switch 1 is currently blocking. This is also showing us on Switch 3 that ports 21 and 23 are designated. And these are the connections to go over to Switch 4. And that Switch 4 is currently running the IEEE 802.1D. And we are running 802.1w, Rapid Spanning Tree. Probably the most critical skill in troubleshooting Spanning Tree is to, first of all, make sure that we understand how it works and correctly predict and anticipate the behaviors of Spanning Tree.

00:39:38
And then we can use these commands to validate that Spanning Tree is operating correctly. For example, if we do some Spanning Tree command and we see that Switch 1 thinks it’s the Spanning Tree root for VLAN 200, and Switch 4 also thinks that it’s the root for Spanning Tree 200, what does that mean? It means that we do not, for whatever reason, have a contiguous Layer 2 broadcast domain. There’s a break somewhere.

00:40:01
Maybe we have interfaces that are disabled, or we have switch ports that are configured as Layer 3 interfaces instead of Layer 2 interfaces or trunks that are not configured to allow the appropriate VLANs through the entire Spanning Tree Domain. Those could all be reasons why Spanning Tree isn’t making it all the way across your Layer 2, or what should be your Layer 2 domain. So here’s what I’d like you to do.

00:40:23
The topology diagrams, as I mentioned, are up in the Nugget Lab files for this video. I’d like you to do one additional exercise. I’d like you to take this switch right here, for VLAN, let’s say, 300 and give it a priority of 4096, just logically on paper.

00:40:38
And then, once again, using all the other priorities that are shown, I’d like you to take the steps of identifying which ports will be the root ports, which ports will be designated and which ports will be blocking slash discarding. And that is your homework assignment from this video.

Multiple Spanning Tree (MST)

00:00:00
Now, as you know, in the Nugget just previous to this one, we took a look at troubleshooting Spanning Tree Protocol. And our focus was on classic and rapid Spanning Tree Protocol. Well, it’s time for us to go ahead and zero in on multiple Spanning Tree. This represents yet another Nugget that covers Cisco certification levels in Routing and Switching, from CCNA all the way through CCIE.

00:00:28
Multiple Spanning Tree Protocol addresses how many Spanning Trees you’ll have in your infrastructure. You see, when Spanning Tree first began, it was a common Spanning Tree approach that would be utilized. Every single VLAN in your infrastructure would follow the same exact Spanning Tree topology.

00:00:52
So if you had a port blocking in a triangular configuration, like this one, that would be the blocking port for every single VLAN in your topology. Now, what is an obvious problem with that? Well, you can’t really initiate any kind of load balancing, can you? Yeah, this particular link here is indeed shut down for absolutely all traffic in your infrastructure under the common Spanning Tree.

00:01:22
So Cisco came along and said, look, there’s got to be a better way. Let’s do per VLAN Spanning Tree or PVST. The idea here is every single VLAN will have its own Spanning Tree topology. Now, you can do some really clever things. You can go ahead and, let’s say, set up switch one as the root for a certain set of VLANs.

00:01:49
Set up something like switch three as the root for another set of VLANs. And now you’ll have another blocking port for a different group of VLANs. You can implement some crude load balancing thanks to per VLAN Spanning Tree. So this was the landscape for a real long time with Spanning Tree protocol, right? It was per VLAN Spanning Tree protocol.

00:02:16
And every single VLAN gets its own topology. What’s the problem here? Well, what about for environments where we don’t want or need that? We are doing a lot of processing in an environment where we have a lot of VLANs, really, unnecessarily. Right? I mean, if a common Spanning Tree would have worked for you, you’re forced to use per VLAN Spanning Tree in the Cisco environment unnecessarily.

00:02:48
Well, along comes multiple spanning tree protocol or 802.1s. 802.1s really seeks to give us the best of both worlds. If we want something that looks like common Spanning Tree, we can implement it. If we want the flexibility of per VLAN Spanning Tree, we can implement it.

00:03:11
In fact, the idea behind this technology is for you to be able to implement the exact number of topologies that you want. That’s right, I said it. The exact number of topologies. If you have 200 VLANs in your infrastructure and you want one topology for VLANs two through 50, you want a second topology for 51 through 110, and you want a third topology for 111 through 200, you can go ahead and create that. That is the exact number of topologies you could create.
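
The three-topology example can be sketched in Python as a lookup from VLAN to instance. This is a sketch, not switch behavior copied from the video: the instance numbers 1 through 3 and the function name are assumptions, and returning 0 for an unmapped VLAN reflects the standard MST default instance (instance 0), which the video has not covered at this point.

```python
# Assumed instance numbers for the three topologies described above.
INSTANCE_VLANS = {
    1: range(2, 51),     # VLANs 2 through 50
    2: range(51, 111),   # VLANs 51 through 110
    3: range(111, 201),  # VLANs 111 through 200
}


def instance_for_vlan(vlan: int) -> int:
    """Return the MST instance a VLAN is mapped to, or 0 if it is unmapped."""
    for instance, vlans in INSTANCE_VLANS.items():
        if vlan in vlans:
            return instance
    return 0  # unmapped VLANs fall into the default instance


for vlan in (1, 2, 50, 51, 110, 111, 200):
    print(vlan, instance_for_vlan(vlan))
```

Three entries in the table means three Spanning Tree topologies to compute, no matter how many VLANs ride on each one.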

00:03:55
And you can map them to those exact VLAN identifiers. Now, one of the first questions that comes to mind here, especially with our troubleshooting slant in this course from CBT Nuggets, is could this infrastructure co-exist with a traditional per VLAN Spanning Tree infrastructure? And the answer is, yes, it can.

00:04:19
Sure enough, the MST domain, which I’m going to review with you how to create, can coexist with the earlier technology. Sure enough, this domain looks like one big Spanning Tree switch to that legacy type per VLAN Spanning Tree environment. Obviously, in most environments, of course, we want to move to MST everywhere so we can get the full benefit.

00:04:49
But as we’re making that transition, sure enough, we can indeed coexist with per VLAN Spanning Tree. Another thing I want to remind you of that we mentioned in the previous Nugget, but it’s worth mentioning here again, is the 802.1s implementation from Cisco Systems. When we configure it, inside it, we are getting 802.1w. So we are getting rapid Spanning Tree Protocol as the technology that’s running inside of this 802.1s environment.

00:05:27
The first thing that you’re going to want to do is make sure all of your devices support multiple Spanning Tree Protocol. Remember, one way to do this is the feature navigator. We can go to cisco.com in our web browser, forward slash go forward slash fn for feature navigator.

00:05:46
Now, if you’ve already got the device in place, you can go to global configuration mode and just do spanning-tree mode, question mark, and the context sensitive help is going to show you the modes that are available on your device. And we are looking for a mode of MST to indicate that our device supports this technology.

00:06:12
Now, as Keith will demonstrate at the command line, in order to actually configure MST as your particular technology, you have to enter MST configuration mode. And that is not done with the spanning-tree mode command. So once we set the overall flavor of Spanning Tree on the particular device with the spanning-tree mode command, we will then enter MST configuration mode.

00:06:43
And again, Keith will go ahead and demonstrate that. Now, once we’re in the right configuration mode, we configure our domain. And for this, we’re really going to want Notepad. Yeah, Notepad is the critical ingredient for an MST configuration. Why? Well, because when we configure multiple Spanning Tree, there are three elements that we need to configure identically on all of our switches.

00:07:16
Three elements that must be letter for letter identical. And that is why Notepad is recommended here. What are those three elements? They are your MST region name, your MST revision number,– that’s a u, by the way– and your instances. And of course, to what VLANs those instances are mapped.

00:07:50
Yeah, this is what we are going to place in Notepad because we need this identical configuration on all of our MST devices. So we have a region name. We could say TSCLASS is our region name. And is the region name case sensitive? I don’t know. I really don’t.

00:08:14
But why even get into that? When I’m naming things at the IOS, I always use full uppercase in names. Now, case sensitivity is never an issue for me. I also like to use uppercase for all of my names because they will stand out when I’m reviewing a configuration output.

00:08:34
So TSCLASS is our region name. Our revision number. This is just a number that never increments. You can think of it as the second part of the name. So you know how when we name software we might say, like, Windows 8, heaven forbid. So that’s a two part name, right? And then, later on we’ll create Windows 9. Ooh, can’t wait.

00:08:58
So it’s just a two part name. Notice that if we were to revamp our multiple Spanning Tree Topology, we could go in and call it TSCLASS2. And now we could convert everyone over from a name perspective to TSCLASS2. So the revision number is just the second part of our name.

00:09:20
One of the things I see students doing is confusing this with the configuration revision number that we might have in a VTP type environment. And that number, of course, increments when there’s changes to the VLAN database. Please, this is nothing like that.

00:09:40
This is just the second part of the name of our region. And then our instances to VLANs. Following an earlier example, we might have instance number one, which we map to VLANs 1 through 10. We might have instance number two, which we map to VLANs 11 through 20. And then we might have instance number three, 21 to 30. And these instances to VLAN references are going to be duplicated exactly as they appear here on all of your devices.

00:10:17
So all of this configuration is going to go into our Notepad. And then we’re going to be pasting it into all of the devices that we want to run multiple Spanning Tree Protocol. So your configuration of MST, as you might guess, is pretty different from what we are used to with either classic or rapid Spanning Tree Protocol.
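
If you would rather generate that Notepad text than type it, here is a hedged Python sketch that renders one identical block for every switch. The function name is hypothetical, and the revision value of 1 is an assumption, since no number is given here. The command syntax is written as I understand it for Catalyst IOS; verify it against your own platform before pasting.

```python
def render_mst_config(name: str, revision: int, instances: dict) -> str:
    """Build the text block to paste into every switch in the MST region."""
    lines = [
        "spanning-tree mode mst",
        "spanning-tree mst configuration",
        f" name {name}",
        f" revision {revision}",
    ]
    for instance, vlans in sorted(instances.items()):
        lines.append(f" instance {instance} vlan {vlans}")
    lines.append(" exit")
    return "\n".join(lines)


# Region name and VLAN mappings follow the example; revision 1 is assumed.
config_text = render_mst_config(
    name="TSCLASS",
    revision=1,
    instances={1: "1-10", 2: "11-20", 3: "21-30"},
)
print(config_text)
```

Generating the block once and pasting it everywhere is the same idea as the Notepad approach: every switch gets the name, revision, and mappings letter for letter.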

00:10:40
But I got some great news for you. When it comes to configuration and troubleshooting thereafter, it’s very, very simple. Remember how everything used to be spanning-tree vlan, right? We would indicate a particular VLAN for our manipulations. Or even if we were doing some kind of a show command, we were interested in the VLAN that we were scrutinizing.

00:11:07
Well, guess what? Now, we’re just going to be doing a big search and replace. It’s going to be Spanning Tree MST in both our configuration and our show commands. And what’s amusing about this, by the way, is that you can still do a show Spanning Tree VLAN command.

00:11:30
Let’s say we looked at show Spanning Tree VLAN 10 in an MST environment. That’ll still work. And it’ll still give us the Spanning Tree properties of the particular VLAN. But, of course, it’s going to tie it back with information as to what particular instance you are participating in.

00:11:48
So multiple Spanning Tree Protocol. Yes, it’s a new world with a new configuration. But it is, indeed, going to leverage what we have mastered already with Spanning Tree Protocol. Keith, I think it’s time to see this particular technology come to life at the command line.

00:12:09
Thank you, Anthony. In this topology that we get to use as we troubleshoot Spanning Tree, I’d like you to presume that you and I have been called to a customer site who has implemented MST, but it’s not working correctly. Their objective, they told us, was to have these four switches in one single MST region.

00:12:27
And for instance number one, they wanted to have VLANs 11 through 20. And for instance two, they wanted to have VLANs 21 through 40. And additionally, they wanted switch two to be the root of MST instance one. And they wanted switch three to be the root of MST instance number two.

00:12:45
And after just a little bit of looking at this network, they told us that they cannot correctly get it going on their own. And that’s why they need your help and my help. And that’s why we’re here. I like how Anthony mentioned that we want to use the Feature Navigator to make sure that the IOS we’re running on the switch supports multiple Spanning Tree.

00:13:02
That’s fantastic. Another question we definitely want to ask is are we running a version of IOS that supports the current and implemented IEEE standard for MST because check this out. This is saying they have four of these devices. The MST implementation on Cisco IOS 12.x and higher, that version, is based on the standard.

00:13:22
The MST implementations in earlier Cisco IOS releases are prestandard, which means you may or may not have really good compatibility with other devices that are also running the standard 802.1s. So what we definitely want to do is validate, you and I, that we’re running on our platforms a version of IOS that supports 802.1s and not some prestandard implementation of MST.

00:13:47
So a quick way of validating that is, first of all, find out what the minimum version is for the platform you’re running. And then just do a show version command. In this output, I’m just going to do a pipe, and sort it out based on any lines that include the word version.

00:14:01
So on this 3560, I’ve got 12.2(55) which does, on this platform, support the 802.1s. And we want to do that same thing on each of our switches. So we’ll go to switch two and say show version. And it’s going to, again, pipe it to include just the lines that contain the word version.

00:14:20
And this has 12.2(44), which is high enough. Let’s go to switch three, and verify the same thing with the show version command. This is also a 12.2(44) on a 3550. And let’s go check switch four. So in this switch, as well, we also have a version of IOS that supports the IEEE implementation of MST.

00:14:40
Another really important thing that has to match on all four switches for this to work is a matching region name, the revision number, which Anthony pointed out is really like part of the name itself, as well, and the mappings, which instances of Spanning Tree are going to be mapped to which VLANs.

00:14:56
All three of those items have to be identical in order for them to become part of one MST region. So here on switch one, let’s go ahead and validate those elements. The region, the revision number, and the mappings. And probably the quickest way of doing that is a show run.
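
The three-way match described here can be sketched in a few lines of Python. This is only an illustration, not switch code: the names MstRegionConfig, vlan_range, and region_mismatches are made up for this example, and the instance two mapping on switch three is assumed to be correct because the transcript does not say otherwise. The region name, revision number, and instance one mappings are taken from the walkthrough.

```python
from typing import NamedTuple


class MstRegionConfig(NamedTuple):
    """The three items that must be identical across one MST region."""
    name: str             # region name, compared case-sensitively
    revision: int         # revision number
    instance_vlans: dict  # instance number -> frozenset of VLAN IDs


def vlan_range(first, last):
    """Inclusive VLAN range, e.g. vlan_range(11, 20) covers 11 through 20."""
    return frozenset(range(first, last + 1))


def region_mismatches(a, b):
    """Return which of the three items differ between two switches."""
    problems = []
    if a.name != b.name:
        problems.append("name")
    if a.revision != b.revision:
        problems.append("revision")
    if a.instance_vlans != b.instance_vlans:
        problems.append("mapping")
    return problems


# Values from the walkthrough; instance 2 on switch three is an assumption.
switch1 = MstRegionConfig("OUR-REGION1", 6783,
                          {1: vlan_range(11, 20), 2: vlan_range(21, 40)})
switch3 = MstRegionConfig("Our-REGION1", 6783,
                          {1: vlan_range(11, 19), 2: vlan_range(21, 40)})

print(region_mismatches(switch1, switch3))
```

An empty list means the two switches would agree on all three items; anything else names the item to go and inspect in the running config.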

00:15:12
I’m just going to do a pipe to begin the output at the Spanning Tree section of the running config. So let’s do that on all four switches. So here on switch two, we’ll do a show run, pipe, begin Spanning Tree. And then also on switch three and also on switch number four. So Anthony mentioned that one thing that he often does when configuring MST is use Notepad and do a copy and paste so it’s identical for every single system when comparing them.

00:15:38
That’s also not a bad idea if we wanted to copy and paste the configuration for Spanning Tree into a separate document just to look at each one of them note by note or letter by letter to make sure they’re the same. So here because they’re in the same position on the screen, this is switch one.

00:15:52
I’m going to go to switch two. And all we’re doing is looking for any differences that pop up in this Spanning Tree configuration. We’ll go to switch three and switch four. And I do notice– check this out– on switch three the O-U-R has the U-R in lowercase.

00:16:08
And that is not the same if we take a look at switch two and switch one. They have an uppercase OUR. So when they had configured this, if they used the technique that Anthony shared of copy paste from a Notepad, they wouldn’t have made that typo. So let’s go back to switch three.

00:16:23
And because the name of the region is incorrect, this won’t be part of the overall Spanning Tree region like switch one, two, and switch four are. So let’s do a quick check. We’ll do a show Spanning Tree for MST instance number one. And what this output shows us is that for MST instance number one, we have VLANs 11 through 19. Switch three is claiming to be the root of this MST instance.

00:16:49
And what really gives it away that we are not participating correctly as part of one big happy MST region is that switch three believes it has some boundaries between itself and another region on ports 21 and 23. So if we look at our topology, ports 21 and 23 go over to switch four.

00:17:06
And if switch three and switch four are part of the same region, we shouldn’t have this indicator that it’s a boundary between MST regions. So we’ve identified that the region name was incorrect. It’s capital O-U-R on all the other switches, and switch three is using capital O and lowercase U-R. So let’s go fix it.

00:17:23
We’ll go back into Spanning Tree MST configuration mode. And we’ll say name our region one spelled correctly. And then let’s see if that corrects the issue. And by the way, to go into configuration mode for MST, this is exactly how we would do it. Spanning Tree MST configuration, Enter, and that puts us into MST configuration mode.

00:17:45
Now there’s a couple of different ways we could validate the region name that we’re advertising in BPDUs. We could look at the config, like we just did, or we could do a debug. For example, I’m going to turn on monitoring to the terminal I’m sitting at. And I’m going to do a debug Spanning Tree MSTP BPDU transmit.

00:18:02
So for any sent BPDUs, we’ll get a nice debug of those on the screen. And then I’m going to turn debugging off. I’m also turning off terminal monitor. So right here, these are all outbound BPDUs that we’re sending. We have our region name, which is OUR-REGION1. We have the revision number, which is, in this case, 6783. And these have to match identically on all the switches in that MST region for it to be considered a single MST region.

00:18:29
Now, I’ve got a question for you. What was the other item that has to match in addition to these guys in order for the devices to become part of the same MST region? And if you’re saying, Keith, I know, it’s the mapping, you’re right. The mapping of the instance– for example, instance one applies to VLANs 11 through 20, and instance two maps to VLANs 21 through 40 and so forth– is the third element that has to match on all the devices in the same MST region for them to believe that they’re in the same MST region.

00:19:02
So now that we’ve corrected the name on switch three, let’s go back and see whether or not that makes a difference. So we’ll do a show Spanning Tree for MST instance number one. And hopefully, that boundary message– ouch– will go away. No, it’s still there.

00:19:19
So in our debug, we validated the region name and the revision number. The only reason that this switch would not consider itself to be in the same region with another switch is that one of the three items does not match. And because we now know that this matches and this matches, the only thing left is the mapping.

00:19:37
So let’s validate the mapping here on switch three just one more time. We’ll do a show run. And we’ll just start the output where the line shows up of Spanning Tree. I’ll hit Q to stop the output. So here’s our region, here’s our revision, and there’s our instance to VLAN mappings.

00:19:53
Now, what in the world is wrong with those? Let’s compare them. Let’s compare them against switch four. That’s the guy he’s complaining about. So we’ll go to switch four. And on switch four that we’re now looking at, I see it’s 11 through 20. And over on switch three, it’s 11 through 19. So which one is correct? Is it switch three or switch four? And what we ought to do is look at our documentation of what they should be.

00:20:15
Or if there is no documentation, maybe, look at the other switches– switch two and switch one– just to validate what the common consensus is. And in this case, because all of the switches– switch one, two, and four– all agree on these instance to VLAN mappings, it’s very likely that that is what was intended.

00:20:36
And it’s also very important that switch three have exactly those same settings. So on switch three, we have the technology. We can correct those problems. We’ll go into configuration mode. We’ll go further into MST configuration mode. And we’ll set up instance one and instance two with the same exact mappings as switch one, two, and four currently have.

00:20:58
And that’s instance one, which is going to be mapped to 11 through 20. And instance two is mapped from 21 through 40. And then we can verify that with a show Spanning Tree for MST instance one. And right here, it’s confirming that this instance– instance number one– applies to VLANs 11 through 20. And check this out.

00:21:16
Down here, ports 21 and 23 no longer have that boundary message indicating a boundary of an MST region, which implies that switch three and switch four are a part of the same MST region. So for these VLANs 11 through 20, which is MST instance number one, let’s follow the ports, and find out who the root is.

00:21:34
So if we look at the root port, which is Fa0/23 on switch three, if we look at our topology diagram, that leads over to switch number four. So if we go over to switch number four, it’s very, very likely that switch number four is going to be the root of MST instance number one.

00:21:52
So let’s go over to switch four and let’s ask him. Let’s see the Spanning Tree information for MST instance number one. And sure enough, he says, I am the root for this instance, which means VLANs 11 through 20. We are currently looking at the root, which is switch number four.

00:22:07
Now, another interesting thing, too. Let’s use the Up Arrow key. And let’s swap out that one with a two. He is also the root for instance number two. Now, why is that? It goes back to basic Spanning Tree. The lowest bridge ID is going to win. So what that means is, if we simply implement MST, yes, we are going to be cutting down on the BPDUs compared to per-VLAN Spanning Tree because we only have two configured instances in addition to instance zero, which we’ll talk about in a moment.

00:22:36
And that is going to cut down on our BPDUs. However, we haven’t really changed the topology. The topology, the logical path that we’re going to flow through for instance one and instance two, is all identical because switch number four has the lowest bridge ID.
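
The "lowest bridge ID wins" rule can be illustrated with a short Python sketch. The MAC addresses and the default priority of 32768 on every switch are hypothetical values chosen for the example (the transcript does not show them), and bridge_id and elect_root are made-up helper names. Within one instance, every switch carries the same instance number in its priority field, so it is left out of the comparison here.

```python
def bridge_id(priority, mac):
    """Bridge ID as a comparable tuple: priority first, then MAC address."""
    return (priority, int(mac.replace(".", ""), 16))


def elect_root(switches):
    """Return the name of the switch with the lowest bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


# Hypothetical values: identical priorities, so the lowest MAC decides.
instance1 = {
    "switch1": (32768, "001a.aaaa.0300"),
    "switch2": (32768, "001a.aaaa.0200"),
    "switch3": (32768, "001a.aaaa.0400"),
    "switch4": (32768, "001a.aaaa.0100"),
}
print(elect_root(instance1))   # the switch with the lowest MAC

# Lowering the priority on switch two beats any MAC address comparison.
instance1["switch2"] = (24576, "001a.aaaa.0200")
print(elect_root(instance1))
```

Because the priority is compared before the MAC address, changing the priority for one instance is enough to move the root for that instance without touching the others.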

00:22:52
Let’s do one more thing on switch four. And let’s do a show Spanning Tree for MST instance zero. And that’s the one you get by default. Any VLANs that are not mapped specifically are going to fall into the Spanning Tree for instance 0. And instance 0 is also used for functions such as common Spanning Tree, which helps us to sort out all the details.
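
As a small illustration of the "unmapped VLANs fall into instance 0" rule, here is a Python sketch using the customer's intended mappings from this scenario. The function name instance_for_vlan is hypothetical.

```python
def instance_for_vlan(vlan, mapping):
    """Return the MST instance a VLAN belongs to; unmapped VLANs use 0."""
    for instance, vlans in mapping.items():
        if vlan in vlans:
            return instance
    return 0


# Intended mappings from the scenario: instance 1 is VLANs 11-20,
# instance 2 is VLANs 21-40. range() excludes its end value.
mapping = {1: range(11, 21), 2: range(21, 41)}

for vlan in (1, 15, 40, 100):
    print(vlan, instance_for_vlan(vlan, mapping))
```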

00:23:11
When we have multiple MST regions, we still need to make sure we have a single path through them. So the challenges that you and I have solved, so far, are that the region name was not correctly spelled on switch three. And also, the VLAN to instance mapping wasn’t the same on switch three.

00:23:27
We corrected both of those. And now, unless there’s other errors regarding configuration, the four switches that we see in the topology in the upper right hand corner are all part of the same MST region. Now, the next challenge is, based on the diagram and what they wanted, the customer still doesn’t have the root for MST one sitting at switch two.

00:23:47
Nor do they have the root of MST two sitting at switch three. However, we can fix that. It’s so easy. Let’s go over to switch two. And just like for traditional Spanning Tree or rapid Spanning Tree, all we have to do is artificially lower the priority on the device for that instance in this case.

00:24:06
So all we’ve done here is in global configuration mode said for Spanning Tree MST instance number one, we want to be the root. And I followed it with the keyword primary. So what this command does is it says for MST instance number one, I want this switch, which is switch number two, to be the root.

00:24:22
And how does it do that? Really simple. It knows who the current root is, which was just a moment ago. Remember, it was switch number four. Well because it knows the bridge ID of the root, all it needs to do is manipulate the priority for MST instance number one on the local switch.

00:24:37
In this case, switch two. And as soon as we have a lower bridge ID because of a lower priority, that will automatically make this switch the new root for that instance of Spanning Tree. And the key word primary, we have two options here. We could say primary or secondary.

00:24:52
And the purpose of this keyword right here, primary, versus the option secondary is that we can control the pecking order in the event this switch fails. OK, this switch fails. Who is going to be the new root for MST instance number one? And it would be the switch where we had configured secondary as the second in line to become the root.

00:25:12
And all that’s really happening behind the scenes is we’re manipulating priority. So for example, if we wanted to see exactly what that command changed the priority to, we’ll do a show run. Say I want to include any lines that have the word priority in them.

00:25:26
And right here, we can see exactly the priority that was changed on this configuration. So switch two knew exactly what the bridge ID was of the current root, which was switch four. It set a bridge priority that would make its own bridge ID the newest and best

00:25:42
for instance number one, causing it to become the root. If we had chosen, for example, that we want switch one to be the backup or the next in line, we could have gone to switch one and said Spanning Tree MST one, root, and the keyword secondary, which would also modify the priority on switch one.

00:26:00
But it wouldn’t beat the current root. But rather, it would give switch one a priority, which would be next in line. So if switch two did go away and disappeared off the face of the earth, switch one would have the next best bridge ID and become the successor, if you will, or the next root for MST instance number one.

00:26:18
Also based on results here, the show of the Spanning Tree for MST one indicates that this switch is the new root for MST number one. Now, also based on their topology, they wanted switch number three to be the root of MST instance number two. So let’s go ahead and do that right here on switch three.

00:26:35
We’ll go into configuration mode. And say Spanning Tree MST two root primary. The other option we could have used was to simply specify a priority instead of using the keyword root. But using the keyword root is, simply, a method of automating the process to automatically configure a priority that’s good enough to become the root for that instance of Spanning Tree.
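
The automation behind the root keyword can be approximated in Python. This is a sketch under assumptions, not the IOS implementation: it follows the commonly documented Catalyst behavior of using 24576 when that is enough to win and otherwise going 4096 below the current root, and the handling of a current root sitting at exactly 24576 is this sketch's own choice. The function name root_primary_priority is hypothetical.

```python
PRIMARY_TARGET = 24576  # commonly documented value used by "root primary"
STEP = 4096             # priorities are configured in multiples of 4096


def root_primary_priority(current_root_priority):
    """Pick a priority low enough to beat the current root (simplified)."""
    if current_root_priority > PRIMARY_TARGET:
        return PRIMARY_TARGET
    lower = current_root_priority - STEP
    if lower < 0:
        # Nothing lower is available; a priority has to be raised elsewhere.
        raise ValueError("current root already uses the lowest priority")
    return lower


print(root_primary_priority(32768))  # current root at the default priority
print(root_primary_priority(8192))   # current root already lowered by hand
```

Setting the priority by hand with an explicit value gives the same result; the keyword only saves the lookup of the current root's priority.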

00:26:56
Of course we’d want to verify everything that we do. So let’s do a show Spanning Tree for MST instance number two, which indicates that we are indeed the root for this instance, which is governing VLANs 21 through 40. And as we would expect, on a root switch, all of our ports are going to be designated and forwarding.

00:27:15
Now, just for grins, let’s take a road trip over to switch four. And ask switch four, who not too long ago was the root for MST zero, one, and two, about his opinion of the current MST two. So for MST two, he says, OK, my root port is Fa0/23. And I’m currently blocking on 21. Now, I’ve got a question for you.

00:27:38
Take a look at the topology. Why in the world are we forwarding on port 23, which appears as a higher numbered port, instead of port 21? Now, in our previous video on troubleshooting Spanning Tree– that was the 802.1d and 802.1w– if you recently watched that video, you already know the answer to this question.

00:27:58
And that’s because if we look at the topology, the cables from 21 and 23 on switch number three– and the topology’s right there, by the way– on switch four, 21, and 23, they are cross connected as such. So the reason that switch four is forwarding on port 23 is because all other things being equal, the cost and everything else, it boils down to the port priority and the port ID.

00:28:25
And because the port ID advertised from port 21 is lower than the one advertised from port 23, it’s for that reason that switch four is choosing to forward on port 23. So with the port priority, it’s the advertised port priority that we need to pay attention to. So here’s what I’d love for us to do just to drive this point home.

00:28:44
Let’s go over to switch three where it’s advertising on its ports 21 and 23. And let’s manipulate it so that port 23 has a lower port priority. So at the moment, the default priority is 128 for all the ports. Then we have the port number as the tiebreaker.

00:29:01
So port 21 on this switch connects over to switch four on port 23. And port 23 on this switch connects over to port 21. So what I’d like to do is let’s take this port right here– Fa0/23– and give it a priority of 64, making it better. And as a result of a lower advertised port priority with all other costs being equal, that will cause switch four to start forwarding on port 21 if we make the advertised port priority on 23 the best one available. So let’s go ahead and do that.

00:29:31
We’ll go into configuration mode. We’ll go into interface config for Fa0/23. And we’ll say Spanning Tree MST two port priority 64. A lot of commands are similar to the older legacy Spanning Trees that we’ve used. It’s just that we’re substituting in MST2 as opposed to the VLAN that we’re wanting to work with.

00:29:50
So let’s just locally validate whether or not that priority changed for port Fa0/23 right here on switch three. And sure enough, our local priority now is 64. And that means if we go over to switch four– and I use an Up Arrow key to do that same command again– now they have flipped roles.

00:30:09
See, previously, we were forwarding on Fa0/23. But now as we learn a lower port priority being advertised to us and learning that on port 21, we are now using port 21 as a root port. These numbers right here are reflecting the local interface’s priority and port number.
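
The root port flip just demonstrated can be reproduced with a small Python model of the tie-break on switch four. The path cost and the sender bridge ID are hypothetical placeholders (they only need to be equal on both links), ReceivedBpdu and pick_root_port are made-up names, and the final tie-break on the local port ID is left out because it is not needed here. The cross-connected cabling and the priorities of 128 and 64 are from the walkthrough.

```python
from typing import NamedTuple


class ReceivedBpdu(NamedTuple):
    local_port: str            # port on switch four that heard the BPDU
    root_path_cost: int        # hypothetical, equal on both links
    sender_bridge_id: tuple    # hypothetical (priority, MAC as int)
    sender_port_priority: int  # advertised by switch three
    sender_port_number: int    # advertised by switch three


def pick_root_port(bpdus):
    """Lowest cost, then sender bridge ID, then advertised port ID wins."""
    best = min(bpdus, key=lambda b: (b.root_path_cost, b.sender_bridge_id,
                                     b.sender_port_priority,
                                     b.sender_port_number))
    return best.local_port


switch3_id = (24576, 0x001AAAAA0400)  # hypothetical bridge ID

# Cross-connected: switch three Fa0/21 lands on switch four Fa0/23, and
# switch three Fa0/23 lands on switch four Fa0/21.
before = [ReceivedBpdu("Fa0/23", 200000, switch3_id, 128, 21),
          ReceivedBpdu("Fa0/21", 200000, switch3_id, 128, 23)]
after = [ReceivedBpdu("Fa0/23", 200000, switch3_id, 128, 21),
         ReceivedBpdu("Fa0/21", 200000, switch3_id, 64, 23)]

print(pick_root_port(before))  # default priorities on switch three
print(pick_root_port(after))   # switch three Fa0/23 lowered to 64
```

Note that the local port numbers on switch four never enter the comparison; only what switch three advertises decides the outcome.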

00:30:27
And remember, everything else being equal, we’re going to pay attention to the lowest priority being advertised to us to make our decision on which port to forward on and which port to block on. We have had a great time. On behalf of Anthony and myself, we hope this has been informative for you.

Don’t Fumble Your Bundle!

00:00:00
This is a Nugget on “Don’t Fumble Your Bundle.” In this Nugget, we will ensure– we being Keith Barker and myself, Anthony Sequeira– we’ll make sure that when it comes to your EtherChannel bundles, they function flawlessly. Should you inherit a network where there’s a problem with the EtherChannel, we’ll teach you how to verify the issue at hand and quickly resolve your problem.

00:00:25
So you know the drill. You have a couple of key switches in your infrastructure. These switches are located in a key location. And you want to go ahead and connect these devices with several links, because of the incredible bandwidth that’s going to be going between these two key switches.

00:00:42
So you do a smart thing. You go ahead and you take one cable and connect them, and then you take another cable and connect them. And lo and behold, you notice that only one cable’s being utilized. Only one cable is being utilized because of that pesky Layer 2 protection mechanism called spanning tree protocol. Sure, spanning tree protocol says we need to prevent a loop in this situation, and blocks one of those links.

00:01:09
EtherChannel is not a spanning tree protocol feature. I repeat, EtherChannel is not an STP feature. Many students think that it is, because it’s always taught alongside of spanning tree protocol in your lower level curriculums. EtherChannel is going to trick spanning tree protocol.

00:01:29
It’s going to say, all right this is actually an interface called port channel 1. And this single interface there’s no need to block. It’s that simple. We often refer to the EtherChannel, of course, as a bundle. And you’ll often see it represented as the links circled when you look at a topology diagram.

00:01:50
A nice thing about EtherChannels of course, is that they can be Layer 2 structures or they could be Layer 3 structures if your switches are multi-layer and support Layer 3 routing. So you could literally go ahead and create a Layer 3 EtherChannel which would respond to a particular IP address assigned to the port channel.

00:02:10
Now let me give you the run through on creating these EtherChannels trouble free. And this will obviously lead to a discussion of what we would look for when we’re troubleshooting an existing EtherChannel. So my first step, shut down those physical interfaces on one side of the equation.

00:02:28
Let’s say we start over here at switch 2. So step one, we’re going to go into the physical interfaces and we are going to shut them down. Why do we do this? Well, we don’t want to bring up the EtherChannel, or attempt to on this side and have some kind of incompatibility with the other side and then have that EtherChannel get in an error disabled state.

00:02:51
On some of your Cisco switches, the error disabled state of the EtherChannel literally requires a reset of the entire device, a reboot. Oftentimes, you’ll need to do this in order to get things working again. We want to avoid that entire nightmare. We want to make sure we build this thing right the first time.

00:03:09
So we are going to shut down a side of the link to make sure we don’t have incompatibility problems. Step two, you want to make sure your links are indeed twins. That’s right, we’re going to make sure that in this case, in this example, all four physical interfaces are twins in their configuration, and are twins in their capabilities.

00:03:32
By this, I mean let’s make sure they’re all gigabit per second links, for example, operating in full duplex mode of operation. So we want to just ensure that the links are both physically, and from a configuration perspective, identical. The next step is to go ahead and make a decision.

00:03:53
Are we going to do a dynamic creation of the EtherChannel– attempt to, that is– or are we going to do a manual? Remember, the modes that you would have as options would be a mode of on. This is the static manual configuration. And then with the link aggregation control protocol, you’re going to have the modes of active or passive.

00:04:20
And then with the port aggregation protocol of Cisco, your modes are going to be auto versus a desirable setting. Now once we decide on the actual mechanism we’re going to use to create the EtherChannel, we just gotta make sure on each side we’re operating in the correct modes.

00:04:40
For instance, if we want to do on and on, that’s going to work beautifully. And this is what I do. I keep link aggregation control protocol and PaGP out of the equation. But if you wanted to do LACP for instance, you could go ahead and do active on one side and passive on the other, and this would successfully form the EtherChannel using that dynamic protocol.
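
The mode combinations can be summarized in a short Python sketch. The function name bundle_forms is hypothetical; the rules encoded are the standard ones: on only pairs with on, LACP needs at least one active side, and PAgP needs at least one desirable side.

```python
LACP_MODES = {"active", "passive"}
PAGP_MODES = {"desirable", "auto"}
ALL_MODES = LACP_MODES | PAGP_MODES | {"on"}


def bundle_forms(side_a, side_b):
    """Return True if the two channel-group modes can form a bundle."""
    modes = {side_a, side_b}
    if not modes <= ALL_MODES:
        raise ValueError(f"unknown mode in {sorted(modes)}")
    if modes == {"on"}:
        return True                   # static on both sides
    if modes <= LACP_MODES:
        return "active" in modes      # someone must start LACP
    if modes <= PAGP_MODES:
        return "desirable" in modes   # someone must start PAgP
    return False                      # mixed protocols, or on with dynamic


for pair in [("on", "on"), ("active", "passive"), ("passive", "passive"),
             ("desirable", "auto"), ("auto", "auto"), ("on", "active")]:
    print(pair, bundle_forms(*pair))
```

When inheriting a broken bundle, checking the configured mode on each side against a table like this is a quick first step before looking at speed, duplex, or trunk settings.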

00:05:05
So think about your troubleshooting steps and how easy they would be. We want to go in, if we inherit an existing EtherChannel, and ensure that the interfaces are indeed twins in their configuration. In the instance of trunking, are they forming the trunk links properly? Are all the interfaces allowing the same range of permitted VLANs? These are the types of twin-like configurations that we need.

00:05:34
And then we would know to inspect the commands that were used to actually build the EtherChannel. By the way, as Keith will demonstrate for us here in a moment at the command line, the key command that we want to utilize for verifying the health of the EtherChannel is going to be show EtherChannel.

00:05:54
And do a question mark, by the way, because you’ll see lots of various levels of scrutiny you can give. You could do a show EtherChannel scrutinizing a particular port channel and all of its details. Or you could say show EtherChannel summary and get a higher level overview of all of the EtherChannels that are created on your particular device.

00:06:18
So Keith, we don’t want to fumble. Please help me drive this right into the end zone for a score. Barker’s got the ball! He’s at the 20! He’s at the 10! Touchdown! And the crowd goes wild! Hey thanks Anthony, for the opportunity. Now you recommended, Anthony, that I should do two basic things.

00:06:36
Number one is show them the correct process for building an EtherChannel in the first place. And then after we’ve done that, do a second scenario where we’re troubleshooting an existing EtherChannel. So let’s do that right now. So we’ve got two switches that are connected together.

00:06:49
I’ve got switch 1 and switch 2, and I’ve got six interfaces connected back to back. fa 0/1 all the way through fa 0/6. So when creating EtherChannel, one of the best practices that I like to follow is to default all interfaces that you’re going to use. So you just make sure there’s no leftover or configuration differences between any of the interfaces.

00:07:11
And to do that, I’m going to use the default interface range command just to basically wipe those six interfaces clean. The next thing that you and I are going to do together, we’re going to go ahead and shut down that range of interfaces by going into interface range configuration mode for them, and doing a shut down.

00:07:27
Another side benefit of doing this shut down before we do the config and then the no shut down after, is that if we’ve been mucking around with the interface configs and we have an error disabled interface, the shut and the no shut’s going to give that interface an opportunity to go ahead and try again.

00:07:43
So let’s also validate our interfaces. I’m going to do a quick show interface status just to validate the details of those six interfaces. They should all be disabled, ’cause we just shut them down. The default VLAN will be one. Auto negotiate’s on for speed and duplex.

00:07:58
And they’re capable of 10 or 100 megabits per second. So that’s all well and good. And with those interfaces shut, we’re going to do several things. On this switch, it does support ISL or dot1q, so we’ll tell it that we want to use dot1q for trunking. I’m also going to tell it that I do not want to negotiate a trunk.

00:08:13
I want these interfaces to all be trunks. And then we’re going to allow VLANs 10, 20, and 30. By default, all VLANs are allowed. And then finally, I’m going to specify what channel group I want this to be a member of. And we’re going to do channel group 1. And what that did for us, it automagically creates this brand new PO1 interface. So that’s PO1, it’s a Layer 2 EtherChannel. And because of the configuration we just applied to all those interfaces that are now members of that EtherChannel, it’s going to be operating as a trunk.

00:08:42
Or when we bring up the interfaces, it should be operating as a trunk. So what you and I are going to do, we’re going to make a road trip over to switch 2 and do these same exact commands. We’re going to go ahead and default that same range of interfaces of fa 0/1 through 6. We’re going to make sure that we shut down all those interfaces in interface range configuration mode.

00:09:02
We’ll just do a quick verification to make sure we see the status of all those interfaces as being VLAN1, autonegotiation, nothing special configured on them. And we’ll go ahead and put the same configuration as we had on the other side. So encapsulation dot1q, switchport mode trunk, allowed VLANs of 10, 20, and 30, and channel group 1 mode active. Now Anthony shared with us some details about negotiating an EtherChannel.

00:09:27
We could use port aggregation protocol, where the two options are auto and desirable. Or we could use the link aggregation control protocol– that’s the IEEE standard– and the two keywords for that are active or passive. So if we’re using the keyword active on both sides, that implies that we’re using the IEEE standard for negotiating that EtherChannel.

00:09:47
Now what we need to do is bring both sides up. I’m going to go to switch 1 and simply say no shut down. And then we’ll get out of configuration mode. We also need to bring up the interfaces on switch 2. So let’s make a road trip over to switch 2. But here let’s do something a little special here as well.

00:10:03
Let’s go ahead and do a debug of EtherChannel. And also because I’m connected on a VTY line and we’re not seeing any console and debug messages by default right here, let’s go ahead and do a terminal monitor so we can see the debug messages that are going to be generated once we bring the interfaces up.

00:10:20
So we’ll do a no shut down. We’ll get out of interface range configuration mode and then just wait for the fireworks, which shouldn’t take too long. And there they are. Now what I’d like to do is let me go ahead and turn off debugging. So now we can take a look at this output without having future debug messages interrupt our screen.

00:10:38
So the bottom line is all these interfaces came up, the port channel was negotiated, and as a result, the interface port channel number 1 is now in an up state. So in a perfect world, when everything is configured correctly and operating normally, if we did for example, a show interface status and just wanted to include the details regarding trunking, so for example, show interface status would show us all the interfaces.

00:11:01
I only want to include the ones that had the keyword trunk. And if you’ll notice fa 0/1-6 all show as a trunk. Also we have this logical interface, which is PO1, also represented as a trunk. So this interface was created automagically for us simply by us assigning one or more interfaces to that channel group.

00:11:20
If we want to take a look at the details from a switching perspective of port channel interface number 1, we could do a show interface PO1 switchport. And we can see right here that it’s operating as a trunk using dot1q, and it’s currently allowing VLANs 10, 20, and 30 on that trunk. To see exactly which interfaces are participating in this bundle of interfaces for the EtherChannel, we can issue the command show EtherChannel summary.

00:11:46
And it will give us not only the status of the EtherChannel, but it also gives the details on which interfaces belong to that group. So here we have our port channel. We have this capital S, which says it’s a Layer 2 EtherChannel. We have the U character, which says in use.

00:11:59
Also that could be interpreted as meaning it’s currently up. The protocol that was used to negotiate this EtherChannel is the link aggregation control protocol. That’s the IEEE standard. And that’s because we use the keyword active on both sides. And the other combination that would work is active and passive.

00:12:16
That would also allow them to negotiate. If they said passive on each side, neither one would initiate the negotiations and an EtherChannel would not be formed. It also shows the individual ports involved in that EtherChannel. And there we have beautiful fa 0/1 all the way through 6. And this capital P after each of those is just indicating that they are bundled as part of the EtherChannel.

00:12:39
Now here’s something interesting. Even though we did a show interface status and it showed each of the individual interfaces as trunks, in reality, logically from a spanning tree perspective, the only trunk that we have functioning out of those six interfaces is the one logical port channel number 1. And with the command show interface trunk, it’s indicating that port channel 1 is currently forwarding for VLANs 10, 20, and 30. Now the beautiful thing that Anthony pointed out was that EtherChannel is not really directly tied to spanning tree.

00:13:09
However it does lie to spanning tree. And because all six interfaces are appearing as one logical big interface, spanning tree sees it as only one path. So it doesn’t have to do blocking on five out of the six of the interfaces. So our command here shows spanning tree for VLAN 10. So we’re running rapid spanning tree.

00:13:28
Switch number 2 is the root of spanning tree. And we’re currently doing our forwarding on port channel number 1, so we’re in a forwarding status. Because we are the root, we’re going to be designated on all our ports for that VLAN. And our cost is 6. So for Fast Ethernet at 100 megabits, the default cost is 19. For 1,000 megabits, or one gig, the cost is 4. Again, in spanning tree, lower is better.

00:13:50
So because we’re aggregating six 100 megabit links, the cost is somewhere in between the 19 and 4. And it’s not a linear scale, and that’s why it’s giving us a cost of 6. Which is a whole bunch better than 19, but not as good as 4. If we looked at the other side, switch 1, who is not the root for VLAN 10, and did that same exact command, show spanning tree for VLAN 10, and it’s also forwarding, the cool thing is there is no blocking from a spanning tree perspective because spanning tree sees that port channel as one big pipe.
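As a rough illustration of the "one big pipe" idea, here is a minimal Python sketch. The function name `blocked_links` is made up for this example, and it assumes the simple case of two switches joined only by parallel links, where spanning tree keeps one logical path forwarding and blocks the rest.

```python
def blocked_links(parallel_logical_links: int) -> int:
    """Redundant parallel links between two switches that STP would block.

    Assumption: only two switches, joined by parallel links, so exactly one
    logical link can forward per VLAN and every other one is blocked.
    """
    return max(parallel_logical_links - 1, 0)


# Six separate trunks look like six paths to spanning tree.
separate = blocked_links(6)
# One port channel bundling the same six ports looks like a single path.
bundled = blocked_links(1)
print(separate, bundled)
```

With six separate links, five are blocked. With the bundle, nothing is blocked and all six physical ports can carry traffic.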

00:14:21
So as spanning tree says yes and is allowing traffic to flow across this port channel, it’s up to the switches themselves to perform the actual load balancing of traffic over the various six interfaces. And a logical next question that you might ask is, OK Keith, how exactly does it do load balancing? Well, there’s a couple factors involved with that.

00:14:39
Number one, we need to take a look at the type of switch and the methods in which it could do load balancing. And then secondly, we would want to configure it for the load balancing method that we want to use and that is supported by the switch. So the default behavior on this switch is to load balance based on the source MAC address.

00:14:57
So what that implies is that the same MAC address would always use the same exact physical interface as that traffic is sent over the trunk. Now you may or may not want to always use that mechanism on this switch for load balancing across the EtherChannel.
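To make the "same MAC address, same physical interface" point concrete, here is a small Python sketch. The hash used (low-order byte of the MAC modulo the number of links) is an assumption chosen for illustration only. Real switches use their own platform-specific hash, and `pick_member_link` is a hypothetical name.

```python
def pick_member_link(src_mac: str, links: list[str]) -> str:
    """Pick a member link from the source MAC (illustrative hash only)."""
    # Assumption: low-order byte of the MAC, modulo the number of links.
    # This is NOT the actual hash of any particular switch platform.
    digits = src_mac.replace(":", "").replace(".", "").replace("-", "")
    low_byte = int(digits[-2:], 16)
    return links[low_byte % len(links)]


links = [f"Fa0/{n}" for n in range(1, 7)]  # Fa0/1 through Fa0/6
first = pick_member_link("00:1a:2b:3c:4d:05", links)
second = pick_member_link("00:1a:2b:3c:4d:05", links)
print(first, second)
```

Whatever the real hash is, the property that matters is that it is deterministic: a given source MAC always lands on the same member link, so one very busy host cannot spread its traffic across the bundle.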

00:15:13
So the cool thing is we can actually control that. We can go ahead and specify the port channel load balance command. Use a question mark, and then you’ll see all the options that are supported on your switch. For example, source MAC address, destination MAC address, ports that may be involved in the source or destination address.

00:15:29
You can then choose the exact method that you want to use on your EtherChannel for the load balancing. You and I have seen a working EtherChannel. Let’s take a look at a not working or incorrectly configured EtherChannel. A customer put it in place, it’s not operating correctly, and they’ve asked you and I to come in and take a look at their config and correct anything that needs to be corrected in order for the EtherChannel to become up and active.

00:15:54
So the first thing that you and I might want to do is just validate, either through visually looking at it, or running some commands, that they have the physical cables connected between the two devices that are going to be using EtherChannel. If Layer 1 isn’t happy, there ain’t no way Layer 2 and higher are going to be happy as well.

00:16:10
So we can use the show CDP neighbors command. And this output looks great. So locally on switch 1 it sees switch 2 as a neighbor on all of these interfaces, and they happen to match. So we know that all the physical cables are in place, and from a CDP perspective, there’s basic connectivity.

00:16:26
So next, let’s just see if there’s EtherChannel that’s up and active. We know the command, it’s show EtherChannel summary. And we’ll just take a look and see if it’s active. And this doesn’t look too positive. However, it does give us some information. They do have a port channel number 1. It exists.

00:16:41
The S right here refers to it being a Layer 2, which is fine. And the D is saying that it is currently down. That’s unfortunate, it’s not working. The protocol that was used to negotiate it is link aggregation control protocol. That’s the IEEE standard. And all of these ports, instead of having the beautiful P there indicating they’re bundled as part of that port channel group, they have this I, which says standalone here.

00:17:05
Effectively means they are not cooperating together in a functioning, working EtherChannel. Another way that we could look at some of the details of this EtherChannel is a show interface with the interface name of PO1, and then followed with a switch port keyword. And this is also telling us that it’s operationally down.

00:17:22
Not functioning. So this is switch 1. If we scroll up just a little bit, this is negotiated using link aggregation control protocol. Let’s go take a road trip over to switch 2 and take a look and see what it says about its port channel. So we’ll use the same command of show EtherChannel summary.

00:17:37
And we’ll just look for any differences. So we have the S here, which represents Layer 2, which is perfectly fine. We have the D, which says is currently down. One significant difference that I see right here is that switch 1 was using link aggregation control protocol from the IEEE, and switch 2 was using port aggregation protocol, which is Cisco proprietary.

00:18:00
They both have to be using IEEE, or they both have to be using Cisco’s proprietary method. Or another option is we could say, I don’t want to do any negotiation, and both sides could simply be set to on. But having one side with link aggregation control protocol and the other side with port aggregation protocol, that is not going to form an EtherChannel.
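The pairing rules described here can be summarized in a short Python sketch. The function name `channel_forms` is hypothetical. The mode keywords are the ones mentioned in the video (active and passive for LACP, desirable and auto for PAgP, and on for no negotiation).

```python
LACP_MODES = {"active", "passive"}    # IEEE link aggregation control protocol
PAGP_MODES = {"desirable", "auto"}    # Cisco port aggregation protocol


def channel_forms(mode_a: str, mode_b: str) -> bool:
    """Return True if the two channel-group modes can form an EtherChannel."""
    pair = {mode_a, mode_b}
    if pair == {"on"}:
        return True                    # both sides static, no negotiation
    if pair <= LACP_MODES:
        return "active" in pair        # at least one side must initiate
    if pair <= PAGP_MODES:
        return "desirable" in pair     # at least one side must initiate
    return False                       # mixed protocols, or on with a negotiator


for a, b in [("active", "active"), ("active", "passive"),
             ("passive", "passive"), ("active", "auto"), ("on", "on")]:
    print(a, b, channel_forms(a, b))
```

The ("active", "auto") case is the customer's situation in this troubleshooting scenario: LACP on switch 1 and PAgP on switch 2, so no bundle forms.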

00:18:18
So we definitely need to correct that. So currently on switch 2, if we went and looked at the individual interfaces, either fa 0/1 or 2 or 3 or 4 or 5 or 6, they’re going to include in their configuration the keyword of either auto or desirable, which are both pointing towards port aggregation protocol from Cisco.

00:18:36
Now to fix this, because the port channel already exists, we can go to interface configuration for the port channel, specify the keyword of active, which would cause us to use link aggregation control protocol instead of Cisco’s proprietary port aggregation protocol.

00:18:50
So here’s what you and I are going to go ahead and do. We’re going to go into configuration mode here on switch 2. We’re going to remove the port channel interface number 1 completely. And then we’ll go back to those six interfaces and instead of using the keyword of auto, we’ll go ahead and use the keyword of active for the channel group command.

00:19:08
Also what we’re going to do is shut down the range and bring them up again to give every single one of those interfaces a nice fresh chance at becoming part of a correctly negotiated EtherChannel. So we’ll do a no shut down. And with any luck, if that was the only problem, our switches in the background should be negotiating and establishing an EtherChannel.

00:19:28
So let’s do some verifications. Let’s do a show EtherChannel summary on both sides. And if it’s working, the EtherChannel should be showing as up, and all the interfaces should be shown as bundled as part of that EtherChannel group. So based on this output, port channel 1 is still Layer 2, and this U is a lot better than the D that was there previously.

00:19:48
We’re using the IEEE standard for negotiation. And all these six ports are currently saying that they are bundled as part of this port channel. That looks great! Let’s use the same exact command over on switch 1. So we’ll go to switch 1. We’ll issue the command show EtherChannel summary.

00:20:04
And so here on switch number one, it’s also looking very, very positive. So it’s Layer 2, it’s currently up, it used the same protocol, and these are all the interfaces that are participating in the port channel on switch 1. So let’s do a couple more commands just for some verification.

00:20:19
Let’s do a show interface PO1 switch port. I’m going to go ahead and remove any of the lines that have the words private VLAN in it, because we’re not really needing to look at that. And I’m also going to do a show interface trunk as well. So from the show interface switch port command, it’s showing us that the VLANs 10, 20, 30 are all being allowed on the trunk.

00:20:38
And from the command show interface trunk, it shows that we’re currently in spanning tree forwarding state for those three VLANs on that port channel. If we go to switch 2, and do those same commands, we should have similar results. So let’s do a show interface trunk just to validate.

00:20:54
And look at this! On port channel 1 on switch 2, it’s only forwarding for VLANs 10 and 30. That’s not 10, 20, and 30. What’s going on? Well, that is a misconfiguration. Now the IEEE’s link aggregation control protocol doesn’t care about that, so it allowed the trunk to form.

00:21:11
However, because it’s still a misconfiguration, we really ought to correct that as well here on switch 2. So here on switch 2 what we’re going to do is we’re not going to go back into the physical six interfaces. Because the port channel interface already exists as a new logical interface, let’s go ahead and simply modify the details of that port channel interface.

00:21:33
Any changes we make there will be inherited by the physical interfaces that are part of that channel group. So we’ll go into port channel interface number 1, and we’ll say switch port trunk allowed VLAN and we’ll simply say add, meaning take what permitted VLANs you currently have on this trunk and we’re simply going to add 20 to it. Of course, we’d want to validate that what we put in is giving us the results that we’re expecting.
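The effect of the add keyword can be pictured with Python sets. This is only a model of the allowed-VLAN list, with hypothetical function names. It contrasts adding to the existing list with replacing it outright, which is what happens when the add keyword is left off.

```python
def allowed_vlan_add(current: set[int], vlans: set[int]) -> set[int]:
    """Model of 'allowed VLAN add': keep what is there and add to it."""
    return current | vlans


def allowed_vlan_replace(current: set[int], vlans: set[int]) -> set[int]:
    """Model of the command without 'add': the old list is overwritten."""
    return set(vlans)


switch1 = {10, 20, 30}
switch2 = {10, 30}                            # the misconfigured side

fixed = allowed_vlan_add(switch2, {20})       # {10, 20, 30}
broken = allowed_vlan_replace(switch2, {20})  # {20} only
print(fixed == switch1, broken == switch1)
```

Forgetting the add keyword would leave only VLAN 20 on the trunk, which is why the verification step right after the change matters.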

00:21:56
So by verifying that that extra VLAN is now part of the allowed VLANs on that trunk, that matches exactly the same VLANs that switch number 1 is currently allowing on the corresponding trunk on its side. And just as a confirmation of what’s happening in the background, if we did now look at one of those physical interfaces that are part of the EtherChannel, if we did a show run for example on the interface, those parameters, such as the allowed VLANs, those commands have been inherited by these interfaces that are part of that channel group.

00:22:26
So in our troubleshooting, we had two problems. Number one, we had the incorrect or incompatible protocols being used to negotiate the EtherChannel. And secondly, we didn’t have the exact same configuration regarding which VLANs were allowed on that trunk.

Foolproof Frame-Relay

00:00:03
I think one of the reasons students tend to struggle with Frame Relay configurations is there’s just too many options. In this Nugget, one of my goals is to make sure that I sort out all these options for you. Also, we’re going to be joined, of course, by our dear friend, Keith Barker, who’s really going to bring this to life for us at the command line.

00:00:21
Hey, and this is another one of those Nuggets that’s going to help Cisco certification candidates from CCNA on up to CCIE. So I guess it’s not all that surprising that the very first option that we have to deal with comes with Frame Relay itself. That’s right.

00:00:37
There is a Cisco variant. And then there’s the IETF variant. Now, as Keith Barker has demonstrated for me in the past, you can actually have these two co-exist in your lab environment when you build a Frame Relay connection to practice with. It’s not going to cause any problems having, let’s say, Cisco’s Frame on one side and the IETF version on the other.

00:01:03
But it’s important to think about from a certification standpoint. Right? Maybe Cisco states that you need to use the IETF or the Cisco variant. And you’d, obviously, want to do that consistently in your Cloud. Then there’s our language of love. The LMI. This is the protocol that is used between our local customer equipment and the Frame Relay switch of our service provider.

00:01:28
As you know, there’s a Cisco variant here. There’s an ANSI variant. And then there’s a Q.933a. Obviously, we want to watch out for this being preset. That could cause issues. We know the default behavior on a Cisco device is to auto sense. I like how they give it this fancy term of auto sense when, really, all your Cisco router’s going to do is it’s going to fire off one.

00:01:55
See if it works. Fire off another. See if it works. Fire off a third. See if it works. So watch out for manually set LMI settings that might be incorrect. And now we have the issue of layer 2 to layer 3 resolution. Sure, there are, indeed, options here. For instance, we can utilize InverseARP.
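The "fire off one, see if it works" description of auto sense can be sketched in Python as below. This follows the speaker's simplified description rather than the router's actual implementation. The try order and the names `autosense` and `switch_answers` are assumptions for illustration.

```python
from typing import Callable, Optional

# The three LMI variants named in the video. The order is an assumption.
LMI_TYPES = ["cisco", "ansi", "q933a"]


def autosense(switch_answers: Callable[[str], bool]) -> Optional[str]:
    """Try each LMI type in turn and keep the first one the switch answers."""
    for lmi_type in LMI_TYPES:
        if switch_answers(lmi_type):
            return lmi_type
    return None


switch_lmi = "ansi"                              # what the provider switch speaks
detected = autosense(lambda t: t == switch_lmi)  # auto sense finds it

manual_setting = "q933a"                         # a hard-coded, wrong LMI type
manual_works = manual_setting == switch_lmi      # never tries anything else
print(detected, manual_works)
```

The second half shows the trap mentioned above: a manually set LMI type skips the trial loop entirely, so a wrong value simply never works.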

00:02:19
The automated way in which we’re going to try and learn the mapping of the local data link connection identifier, or DLCI, to the remote layer 3 IP address. The other way in which we can do it is our static mappings. Now, I like to go ahead and throw in a third thing, here, for you to think about.

00:02:41
And that is NA, not necessary. And this, of course, would be the environment when we are point to point with our Frame Relay connection. If we’re point to point and there’s only one other Frame device out that circuit to communicate with, we don’t really have to worry about doing anything with layer 2 to layer 3. In fact, this is one of the reasons that point to point circuits are smiled upon by engineers of the wide area network.

00:03:10
And this leads into our discussion, beautifully, of the particular interface that we are dealing with. Is it a multipoint interface, or is it a point to point interface? Something that students tend to forget is the fact that a physical interface– let’s say we’re dealing with serial 0/0/0 on a particular router– is a Frame Relay multipoint interface.

00:03:43
So on that interface, we’re going to have to decide are we going to do InverseARP? Or are we going to do our static mappings? Now, if we have a sub-interface, another option, right? We can go point to point. Or we can go multipoint. As I alluded to a moment ago in this Nugget, with the point to point, there is no need for layer 2 to layer 3 mappings, right? Via InverseARP or our static mappings.

00:04:15
So nothing to worry about there. All we worry about is assigning the data link connection identifier to that particular circuit. You see, DLCIs are assigned by LMI to the physical interface. So when we’re working with sub-interfaces, we have to make sure we get them there ourselves.

00:04:34
And that brings up the multipoint sub-interface. We got to make sure the DLCIs are known there. So we, typically, do our static mapping, which does two things for us. It makes sure we can resolve the local DLCI to the remote layer 3 address. And it also ensures the DLCIs are known on that sub-interface.
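The options sorted out so far can be collected into one small Python lookup. The function name, the interface labels, and the dictionary keys are made up for this sketch. The content follows the discussion above: physical interfaces are multipoint and get their DLCIs from LMI, while sub-interfaces need the DLCI assigned by hand.

```python
def frame_relay_needs(interface_kind: str) -> dict:
    """What each Frame Relay interface type needs, per the discussion above."""
    if interface_kind == "physical":
        # Multipoint by nature. LMI hands the DLCIs to the physical interface.
        return {"resolution": ["inverse-arp", "static-map"],
                "assign_dlci_manually": False}
    if interface_kind == "multipoint-subinterface":
        # A static map both resolves layer 3 and places the DLCI here.
        return {"resolution": ["inverse-arp", "static-map"],
                "assign_dlci_manually": True}
    if interface_kind == "point-to-point-subinterface":
        # Only one device at the far end, so no mapping is necessary.
        return {"resolution": [],
                "assign_dlci_manually": True}
    raise ValueError(f"unknown interface kind: {interface_kind}")


for kind in ("physical", "multipoint-subinterface",
             "point-to-point-subinterface"):
    print(kind, frame_relay_needs(kind))
```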

00:04:57
There’s a lot of options here, aren’t there? And guess what? We got another option. To broadcast or not to broadcast? Yeah, that is the question here. So we have another issue. And that is, are we going to configure the circuits to carry broadcast traffic? Now, this leads to a very, very interesting situation.

00:05:21
This is one of my favorites when it comes to Frame Relay. Let’s say that we have a DLCI, on this device, of 102 to carry from R1 to R2. A DLCI of 103 to carry from R1 to R3. Here, we have 301 and here we have 201. On this R3 device, when we’re doing our Frame Relay mappings, we can go ahead and have a mapping, obviously, to the hub device.

00:05:52
So we would have DLCI 301 going to the IP address of R1. And we could put the broadcast keyword on this in order to make sure we send broadcast packets over that particular circuit. Now, what a lot of students will do is they will go in and do a Frame mapping over the same DLCI for the IP address of R2.

00:06:22
And they will go ahead and put the broadcast keyword on. This does, indeed, work. And at first glance, everything looks fine. But what they have done is they’ve attached the broadcast keyword to the same DLCI two times. Haven’t they? And this will, indeed, result in double the traffic, unnecessarily, over the circuit.

00:06:50
Cisco, in certification environments, does consider this a misconfiguration. So I want you to be really aware of this fact. What I do to always make sure I guard against this is I just, faithfully, will apply the broadcast keyword to the Frame Relay mapping to the hub.
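Here is a minimal Python sketch of the double-broadcast problem on R3 as the speaker describes it. The tuple layout and the name `broadcast_copies` are hypothetical, and the 10.0.0.x addresses are borrowed from the demonstration later in this Nugget.

```python
from collections import Counter


def broadcast_copies(maps: list[tuple[str, int, bool]]) -> Counter:
    """Count copies of one broadcast sent per DLCI.

    maps holds (remote_ip, local_dlci, broadcast_keyword) entries.
    Assumption, per the speaker: each map entry that carries the broadcast
    keyword produces one copy on its DLCI.
    """
    return Counter(dlci for _ip, dlci, bcast in maps if bcast)


# Both spoke mappings use DLCI 301 and both carry the broadcast keyword.
doubled = [("10.0.0.1", 301, True), ("10.0.0.2", 301, True)]
# Broadcast only on the mapping to the hub.
clean = [("10.0.0.1", 301, True), ("10.0.0.2", 301, False)]

print(broadcast_copies(doubled)[301], broadcast_copies(clean)[301])
```

Both versions keep reachability to R1 and R2. Only the number of broadcast copies crossing the single PVC changes.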

00:07:15
In reality, it doesn’t matter where you do it. It could be on the mapping to the hub. Or it could be on the mapping to the remote spoke. But I just– as a matter of best practice– will always do it on the mapping to the hub device. And as we view this classic hub and spoke topology, it’s worth reminding you of split horizon.

00:07:41
Yeah, split horizon is going to be a major challenge for some of your routing protocols. A classic would be EIGRP. Remember what split horizon will do. R1 will take in an update. Let’s say we’re using a physical 0/0 interface. It will take in an update from R3. And because it is that same interface leading out to R2, it will not send it out to R2. We can solve this by using sub interfaces, of course.

00:08:15
Or we can disable split horizon for EIGRP. These are concerns that you want to be aware of when troubleshooting Frame. So [INAUDIBLE] options, options, and more options. The great news, of course, is at the command line we have a really excellent robust show command tool set that’ll allow us to very easily verify our Frame environment.
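The split horizon behavior on the hub can be sketched in a few lines of Python. The route table, the interface names, and the function `advertise` are all illustrative assumptions. The rule itself is the one just described: do not send a route back out the interface it was learned on.

```python
def advertise(routes: dict[str, str], out_interface: str,
              split_horizon: bool = True) -> list[str]:
    """Return the prefixes sent out of out_interface.

    routes maps a prefix to the interface it was learned on.
    """
    return [prefix for prefix, learned_on in routes.items()
            if not (split_horizon and learned_on == out_interface)]


# R1 learned R3's network on the same physical interface that leads to R2.
r1_routes = {"3.3.3.0": "Serial0/0", "1.1.1.0": "Loopback0"}

with_sh = advertise(r1_routes, "Serial0/0")
without_sh = advertise(r1_routes, "Serial0/0", split_horizon=False)
print(with_sh, without_sh)
```

With split horizon on, R2 never hears about R3's network. Turning it off for the protocol, or moving each spoke to its own sub-interface so the learned-on and outgoing interfaces differ, lets the update through.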

00:08:47
And with that, I’ll turn it over to our dear friend, Keith Barker. Thank you, Anthony. I appreciate the opportunity. I still remember when Frame Relay was a shiny, brand new technology. In fact, back in the ’90s at Paramount Pictures, I remember us rolling out Frame Relay to replace some of our leased lines.

00:09:03
And it was very, very exciting. Now, a couple decades later, Frame Relay is not as shiny and new. However, it is still in some environments. And, certainly, it’s still part of our certification world. So for those reasons, it’s important to be able to understand it and troubleshoot it.

00:09:18
The diagram that you and I get to use is this one right here, which is the same one Anthony drew for us. I took the liberty of adding some additional networks hanging off of R1. The 1.1.1 network here, and the 2.2.2 network there, and the 3.3.3 network here. And we’re also running a couple routing protocols.

00:09:32
RIP– just for old time’s sake– and EIGRP are both running on all of these routers. We’re also using a multipoint network. What does that mean? It means we have three or more devices on the same logical IP subnet. And that common subnet is the 10 network. This topology right here, it’ll be on the screen as we do our demonstrations together.

00:09:52
And it’s also available in the Nugget lab files for this video. So let’s start at the hub. We’ll start at R1 and just very quickly say, is Frame Relay providing any PVCs available to us? It’s a simple show Frame Relay PVC command. And here’s the high level overview that says we have one deleted PVC.

00:10:13
And that deleted status, or deleted PVC, means one thing at the router. The local router is trying to use a specific PVC. But the switch, the Frame Relay switch, isn’t advertising that PVC to the router. So another name for deleted could be “I think I should be using this, but I don’t see it,” says the router.

00:10:31
The other thing that’s a little bit concerning is that I see PVC 102 here, but I don’t see PVC 103. And because this shows as deleted, that leads me to believe that the LMI or the switch is not advertising any PVCs up to this router. So how could we verify that? Pretty simply.

00:10:50
We’ll just do a show Frame Relay LMI. And that will help verify whether or not the language of love, as Anthony called it, is working between our router and the edge of the Frame Relay switch. And one of the first lines that we might want to look at are these guys, right here.

00:11:07
So what should happen in a healthy environment is, if we sent, for example, 100 requests, we should have 100 status messages. So we have 100 to 0. And it also shows 99 timeouts. And if we refreshed this by using an up arrow key and pressing Enter, what that really looks like is that LMI is not working successfully between the router and the switch.

00:11:30
Referring to the Frame Relay switch that we’re connected to on our serial interface. We also know that serial 1/0, based on this output, is where we have the connectivity to the WAN provider. Also, right here for the LMI type, it says CCITT. And Anthony mentioned that the LMI is auto detected.

00:11:47
So unless we had hard coded this incorrectly, that should be working. However, because it’s not working, let’s go take a look at the interface to see if there’s any administratively configured LMI type there. So we’ll do a show run for interface serial 1/0 and take a look. And sure enough, somebody’s administratively configured the LMI type.

00:12:09
So configuring this LMI type results in this output. And based on these numbers here, that LMI type is not what is currently being supported on the Frame Relay switch. So what do we do about it? Well, Anthony mentioned that it’s auto detected. So let’s do this together.

00:12:25
Let’s go into configuration mode for interface serial 1/0. And, simply, say no. I think that was Nancy Reagan’s policy for drugs. Just say no. But we’ll just say no to the Frame Relay LMI type that will let the default be used. And then it can auto negotiate the LMI type with the Frame Relay switch.

00:12:42
That would be a good troubleshooting step. So now that we’ve done that, let me go ahead and clear the counters so that any old timeouts will be gone from the show of Frame Relay LMI command. And let’s just verify whether or not we’re having better responses with LMI between our router and the switch.

00:12:58
So here it says, regarding status, we’ve sent one message and we’ve received one. And let’s go ahead and use the Up Arrow key just to verify that it’s continuing in the right direction. So, for example, if we had 1,000 over here– we had 999 with one timeout– that’s no problem.

00:13:15
It’s probably at the beginning of the negotiation or what have you. But it’s super healthy if, when this number increases, this number increases correspondingly. It also happens to be using the LMI type now of Cisco, which it negotiated with the switch. So based on this, it now appears that LMI is happy, happy.
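The way these two counters are being read can be written down as a tiny Python check. The function name and the tolerance of one missed reply are assumptions taken from the examples in this walkthrough (100 sent with 0 received is broken, 1,000 sent with 999 received is fine). It is not an official threshold.

```python
def lmi_healthy(enquiries_sent: int, status_received: int,
                max_missed: int = 1) -> bool:
    """Rough health check: replies should keep pace with requests."""
    if enquiries_sent == 0:
        return False  # nothing sent yet, so nothing to judge
    return enquiries_sent - status_received <= max_missed


before_fix = lmi_healthy(100, 0)       # the mismatched LMI type
after_fix = lmi_healthy(1, 1)          # first exchange after the fix
long_running = lmi_healthy(1000, 999)  # one early timeout is no problem
print(before_fix, after_fix, long_running)
```

A single snapshot is less telling than the trend, which is why the command is repeated: both counters should keep climbing together.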

00:13:33
I remember from grade school, or somewhere else, a joke that went something like this. If April showers bring May flowers, what do May flowers bring? And the answer is pilgrims! That’s the joke. Well, if we have LMI, what does LMI bring to the table? And the answer is it should be delivering to R1 the information about two PVCs that are available.

00:13:53
PVC labeled 102 and also DLCI 103. And to see that, we can just issue the command show Frame Relay PVC. And the output should reveal the PVCs that either are being advertised to us or that the router still believes it should be using. So with this command, show Frame Relay PVC, it shows us we have DLCI 102 and DLCI 103. That’s a very good sign.

00:14:16
It’s showing us that DLCI 102 is associated with a sub-interface 1/0.123. And that DLCI 103 is associated with the physical interface. So Anthony mentioned that, by default, all PVCs are associated with the physical interface unless something pulls them down, if you will, to the sub-interface.

00:14:34
With a multipoint interface, perhaps, we have a Frame Relay mapping. Or with a point to point sub-interface, we’re using a Frame Relay interface DLCI statement. In either case, both of these PVCs should be associated with a single interface because we’re on a multipoint network.

00:14:50
So the next thing I’d like to do is just validate which interface owned the IP address of 10.0.0.1, which is the IP address on the Frame Relay network that R1 is using. So we’ll do a show IP interface brief. We’ll exclude from the output any lines that have the word unassigned in it.

00:15:07
And this says serial 1/0.123 has the IP address of 10.0.0.1. So based on that, both of these PVCs should be associated with that sub-interface. And because we have R1, R2, and R3 on this multipoint network, that sub-interface needs to be a multipoint sub-interface.

00:15:27
So at this point, the possibilities include– we haven’t included a Frame Relay map for DLCI 103, which is why the physical interface is still claiming ownership of that PVC. So a quick way to validate all of the Frame Relay maps that are currently present is with a show Frame Relay map command.

00:15:45
So let’s go ahead and do that. Let’s do a show Frame Relay map just to see what current mappings are in place on this router. We should have two that are present. One goes to R2 and one goes to R3. And based on the output here, we only have one in place. It’s associated with serial 1/0.123. It’s talking about DLCI 102. And instead of having an IP address that we’re mapping, it says point to point.

00:16:09
And that, my friend, implies that this sub-interface is the wrong type of sub-interface for our topology. We need a multipoint sub-interface where we do Frame Relay mappings. So let’s take a look at the details of this sub-interface. We’ll do a show run for interface 1/0.123. And just see, exactly, what’s configured in it.

00:16:28
It’s very likely, based on our results earlier, to be a point to point sub-interface and sure enough, right there. So the IP address is correct. The Frame Relay interface DLCI statement was used, which is appropriate for a point to point sub-interface. However, if this had been a multipoint,

00:16:44
we would then use Frame Relay map commands for the two PVCs. So let’s do this. Let’s go ahead and replace that sub-interface. We’ll go into configuration mode. We’ll say get rid of that serial 1/0.123 as it currently exists. It’ll warn us that, maybe, some traces of a sub-interface aren’t completely gone.

00:17:01
That’s OK. And we’ll recreate a brand new sub-interface, this time, using multipoint. We’ll give it an IP address. And we’ll put in the appropriate map statements that say basically, hey, to get to R2’s remote IP address of 10.0.0.2, use the local PVC DLCI 102. And to get to the remote IP address of R3, which is 10.0.0.3, use the local DLCI of 103 as an on-ramp, if you will, to get to that destination.

00:17:28
We’ve also included the broadcast keyword on both of those Frame Relay mappings. Why? Because we want dynamic routing protocols that are using multicast to be able to forward those packets across those PVCs to the peers on the other side. Now, our next step, after we make a change, we want to go ahead and verify that what we put in place is currently working.

00:17:48
So let’s just do a quick verification with a show Frame Relay map. Just to validate, we have our two Frame Relay maps in place, which we do. And they both include the broadcast option, which we configured. So local DLCI 102 is used to reach the remote IP address of 10.0.0.2. And local DLCI of 103 is used to reach the remote IP address 10.0.0.3. That looks perfect.

00:18:11
So let’s test our connectivity. Let’s do a quick ping over to 10.0.0.2. We’ll do a repeat count of two so we don’t have to do a control break if it times out and not have to wait for all the pings to fail. And survey says that R2 is not reachable yet. So that could be a local issue on R1 or it could be a remote issue on R2. Let’s also try to ping R3 as well. So we’ll ping 10.0.0.3. We’ll do a repeat count of two there as well.

00:18:37
Wow, look at that. So we have success. So we can reach one of our spokes. Just not both of them. And in the networking world, one out of two is bad. So we need to get full connectivity to both of our peers. Another thing I’d like to do is let’s just take a look at our EIGRP neighbors.

00:18:53
Now, RIP and EIGRP are both running. But I just want to validate that we have EIGRP enabled on our serial interface here on R1. Now because we did have a successful ping over to R3, let’s just also verify whether or not we have any EIGRP neighbors. And survey says, no, we don’t.

00:19:12
So we have some basic connectivity, but we still have a problem. It appears between R1 and R3 even though we have some basic connectivity. So R2, we got no pings to at all. Let’s make a road trip up to R2 and talk to it. And say, you know what Mr. R2? How are you doing? How’s the basic language of love, the LMI, doing between yourself and the Frame Relay switch? Is it working correctly? And to find that, we can do the command show Frame Relay LMI.

00:19:38
And what this says is that we have a successful LMI, which is great. OK, so let’s see what PVCs that we currently have available that we learned through the LMI or we have locally mapped. We’ll do a show Frame Relay PVC. And I’m going to hit the Space Bar one time so we can see all of the output.

00:19:56
So we’ve got one that’s unused and one that’s deleted. OK, so DLCI 102. That DLCI should not exist up at R2. And because the status says deleted, what that implies is that the router is trying to use DLCI 102 but the Frame Relay switch is not advertising a DLCI 102 as being available to R2. So it also looks like it’s on a sub-interface.

00:20:21
Hopefully, it’s a multipoint sub-interface. And the other DLCI, 201, is currently associated with the physical interface, which is the default behavior until we do something to pull that PVC down. Like, a Frame Relay interface DLCI statement on a sub-interface for point to point or a Frame Relay map command on a multipoint sub-interface.

00:20:40
So what this looks like to me is like somebody has misconfigured a Frame Relay map or a Frame Relay interface DLCI statement using what they thought was the correct DLCI but, actually, did an incorrect number instead. So let’s take a look. Let’s go look at the details of what’s configured in serial 1/0.123. We won’t have to guess.

00:21:00
So it is a multipoint. That’s a good start. The IP address is correct. And we have two Frame Relay mappings. One that goes to R1 and one that goes to R3. They’re both pointing down the same PVC, which is wrong. This is not the PVC that R2 has available to it from the switch.

00:21:18
It should be 201, not 102. And that’s the reason that it showed 102 as a deleted PVC because there’s no PVC really available identified by DLCI 102. Yet, the router was trying to use it. So we need to correct both of those mappings and use the correct DLCI.
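The reasoning behind the deleted and unused entries can be sketched in Python by comparing what the router is configured to use with what the switch advertises over LMI. The function name `pvc_view` and the "in use" label are assumptions for this sketch. The real command output carries more detail than this.

```python
def pvc_view(configured: set[int], advertised: set[int]) -> dict[int, str]:
    """Classify each DLCI the way this troubleshooting session reads it."""
    view = {}
    for dlci in sorted(configured | advertised):
        if dlci not in advertised:
            view[dlci] = "deleted"  # router wants it, switch never offered it
        elif dlci not in configured:
            view[dlci] = "unused"   # switch offers it, nothing local uses it
        else:
            view[dlci] = "in use"   # configured locally and offered by switch
    return view


r2_before = pvc_view(configured={102}, advertised={201})
r2_after = pvc_view(configured={201}, advertised={201})
print(r2_before, r2_after)
```

Before the fix, R2 shows exactly the pattern seen on screen: 102 deleted and 201 unused. Pointing the mappings at 201 clears both symptoms at once.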

00:21:36
So let’s do it. We have the technology. We’ll go into configuration mode. We’ll go into interface configuration mode. Once you have the sub-interface defined as point to point or multi-point, to go back into that interface, you do not have to add that keyword again.

00:21:50
It’s already defined. You can just go into the sub-interface based on the number. And then we’ll get rid of the two mappings that are incorrect. And we’ll replace the two mappings as they should be using the local DLCI of 201 to reach either 10.0.0.1 or 10.0.0.3. It’s going to be the same exact PVC going up through R1 in either case.

00:22:11
And we’ve added the broadcast keyword, as well, for the PVC going to R1’s address. And that will allow our routing protocols and broadcast and multicast traffic that need to be forwarded to be forwarded over that PVC. And the fact that we just got an EIGRP adjacency message is a very, very good sign that things are working.

00:22:30
So let’s do some verifications as well. We’ll do a show Frame Relay map. We should have, basically, two mappings using the same local DLCI of 201. So we’ll make sure they’re both present. And as Anthony said, we only need to broadcast on one of them. And as part of good form, we might want to include that on the mapping that points back to the central device, the hub, in our hub and spoke topology.

00:22:51
We can also do a show Frame Relay PVC just to validate that we have one PVC that’s currently active and associated with the correct sub-interface, which it is. And let’s see if we have any EIGRP learned routes. That’s really the proof of the pudding, as well.

00:23:06
If we do a show IP route EIGRP, have we learned a route? Look at that. We have. We learned the loopback interface. It’s on R1. We’ve learned it via EIGRP. So it looks like we’ve cleared up a lot of the Frame Relay issues between R1 and R2. Let’s go take a look, next, at R3. Now, in R3, we already did a successful ping. But you’ll notice that we have some EIGRP adjacency flapping.
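The R2 verification steps just described correspond, roughly, to the commands below. Output is omitted here because it depends on the lab. What to look for is taken from the walkthrough: two mappings on DLCI 201, one active PVC on the sub-interface, and R1's loopback as an EIGRP route.

```text
! R2: verification after correcting the mappings
show frame-relay map
show frame-relay pvc
show ip route eigrp
```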

00:23:31
It looks like it came up and went down, went up and came down, went up and came down. That is absolutely not healthy. What might cause something like that? Well, in our troubleshooting video on troubleshooting EIGRP adjacencies, one of the things that we discussed was that, perhaps, we don’t have a bidirectional communication path between the two devices.

00:23:52
So what might cause this lack of full bidirectional communication? The answer is let’s take a look at the details. On R3, we’ll do a show Frame Relay LMI, which I think is going to be working because we can ping back and forth between R1 and R3. We already tested that.

00:24:08
So the LMI looks good, based on these numbers right here. We could also validate that the PVC is available, which is PVC 301. So it looks like DLCI 301 is associated with the physical interface, which is perfectly fine. And because we have traffic and the ping also did work between R1 and R3, it looks like our PVC is in good shape. So let’s take a look at the Frame Relay mappings involved.

00:24:31
So we’ll do a show Frame Relay map. And this says I’ve got two mappings. I’ve got one to reach the remote IP address of 10.0.0.1. Use the local DLCI of 301. And reach 10.0.0.2, which is R2, the other far spoke. Use DLCI 301. And that, actually, looks great because we only have one PVC to use.

00:24:49
So if we ever send a packet to 10.0.0.2, in measurable terms, it’s going to be sent over the PVC to R1 who will de-encapsulate it, look at it, make a routing decision, and then forward it back on its PVC 102 that leads up to R2. But I do see something missing here in the mapping for R1. And that is I do not see the keyword broadcast.

00:25:13
And that would explain the behavior of EIGRP. It’s because broadcasts are not being allowed through. Now, the broadcast keyword applies to broadcast and multicast traffic. So effectively, because R1 is not seeing the messages from EIGRP periodically, it gives up, reestablishes, and then times out again.

00:25:31
And that cycle will repeat over and over again until we have a correct mapping that includes the broadcast option. I suppose another option we could use is a neighbor statement inside of EIGRP. But it would be much better to fix the incorrect Frame Relay problem instead of using a separate band-aid.

00:25:50
Another thing I’d like to do is let’s validate that the traffic to reach 10.0.0.2 really is being routed through R1. If we do a trace route from R3 that says, hey, I’m trying to ping 10.0.0.2, check out the path it takes. It’s sent over the PVC to R1. So that’s our first hop.

00:26:07
R1 makes a routing decision and then forwards it to 10.0.0.2. So even though we have a multi-access network (we have three devices all on the 10.0.0.0/24 space), all our traffic, literally, is being routed through our hub. And we’ve got our adjacency for EIGRP just bouncing periodically.

00:26:27
Let’s fix that right now. Let’s do a show run for interface serial 1/0. And that confirms we do not have the broadcast keyword that we need. So let’s add it. We’ll go into configuration mode. We’ll go interface serial 1/0 because that’s where the current IP address is.

00:26:44
We’re not using any sub-interfaces. And we’ll overlay a brand new Frame Relay map, including the keyword broadcast, which will now allow our EIGRP to successfully come up, and stay up, functionally between R1 and R3. And what I’m also going to do because I didn’t have a functioning EIGRP neighborship anyway, I’m going to do a clear IP EIGRP neighbors just to, kind of, refresh everything, especially in this lab environment.
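A minimal sketch of the R3 fix, assuming the existing mapping is the one described earlier (10.0.0.1 via DLCI 301 on the physical interface). Re-entering the mapping with the broadcast keyword replaces the old entry.

```text
! R3: re-enter the mapping to the hub, this time with the broadcast keyword
configure terminal
 interface Serial1/0
  frame-relay map ip 10.0.0.1 301 broadcast
 end
! Reset the EIGRP neighborships (reasonable in a lab, disruptive in production)
clear ip eigrp neighbors
```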

00:27:11
And then what you and I want to do is go back in, validate that the neighborship came up (we can wait for a console message for that), and then take a look at any EIGRP routes that we’ve learned. Hopefully, we’re going to be learning about the 1.1.1 network that R1 has configured. And also, we’d like to learn about the 2.2.2 network that R2 has configured. Those should both be EIGRP learned routes on R3. And there’s the adjacency.

00:27:36
So let’s take a look and see if we have any EIGRP learned routes on R3. And we’ve got one of them. And that is the 1.1.1 network. That’s the loopback. What we are missing is the 2.2.2 network from R2. Now, on all these routers, they’re all running EIGRP. And just for grins, they’re all running RIP.

00:27:56
And the reason we’re seeing this route, right here, as EIGRP is because of the lower administrative distance of 90 for EIGRP internal routes as compared to RIP’s default administrative distance, which is 120. So also, let’s verify that we can ping that IP address.

00:28:12
I know we can see it in the routing table. Let’s go ahead and do a ping of 1.1.1.1. And that is successful. We could also do a ping and source it from our loopback zero interface. And that would also validate that R1 has reachability to our loopback address of 3.3.3.3. So R3 does not have all the routes present. Let’s go back to R1 and ask R1 about what routes it has learned via EIGRP.

00:28:36
Now, it should see the loopbacks of R2 and R3, which it does. If we go up to R2, however, and we do the same command, show IP route EIGRP, what it shows is that it only knows about the 1.1.1. So the spoke R2 doesn’t see R3’s networks. And R3 doesn’t see R2’s networks. So let’s talk about why that is.

00:29:00
And the reason that is is because EIGRP and RIP are both examples of distance vector routing protocols. And I know, EIGRP is like an advanced distance vector routing protocol and has neighborships and all the rest. But the reality is that split horizon is not allowing R1, on its sub-interface, to go ahead and forward the routes it learns on that sub-interface out on the same interface.

00:29:26
So as R2 is advertising its 2.2.2 network, R1 says great. I’m learning that on my sub-interface. But it doesn’t repeat it out that same interface down to the second DLCI in any advertisements because of split horizon. So a show IP interface of the sub-interface on R1 shows that split horizon, by default, is enabled.

00:29:46
So the fix for that is what? Well, we could go ahead and disable split horizon. So let’s go ahead and do that. Let’s go into interface configuration mode for our sub-interface 123. We’ll say no IP split horizon. And then let’s do a clear IP EIGRP neighbor just to make sure everything is fresh and happy with the neighborships.

00:30:05
Now, on a serial link, the EIGRP neighborship may take a moment or two longer than it would on ethernet. But based on the messages on the screen, the neighborships have come back up. I also want to validate inside of our interface that that change took effect.

00:30:19
So let’s do a show IP interface for sub-interface 123. And just have it show us the lines that include the word split just to verify that the change we made took. So it says split horizon is now disabled. So now, what should happen? If we go up to R2 and take a look at what we’ve learned via EIGRP, hopefully, we’ll see the 1.1.1 network and the 3.3.3 network. And it is still not there.
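The steps up to this point can be sketched as follows on R1. As the walkthrough goes on to explain, this interface-level command affects RIP only, which is why the EIGRP routes are still missing afterward.

```text
! R1: disable split horizon on the multipoint sub-interface (affects RIP, not EIGRP)
configure terminal
 interface Serial1/0.123
  no ip split-horizon
 end
clear ip eigrp neighbors
! Confirm the interface-level setting by filtering for the word "Split"
show ip interface Serial1/0.123 | include Split
```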

00:30:44
So here’s what’s happening. That command, the no IP split horizon that we did on the interface, only applies to RIP. It does not apply to EIGRP. So if we wanted to go ahead and remove it just for a moment, we’ll take EIGRP out of the equation so it’s no longer running on R1. What we’re going to see is that the RIP learned routes from R2 and R3 will show up on the other peer because we’ve now disabled the split horizon on the IP interface on the hub.

00:31:13
So to verify that, let’s go over to R3. And on R3, if we do a show IP route for just RIP learned routes, there we have our RIP learned routes. So that is the loopback off of R2. That’s the loopback off of R1. And if we went up to R2, we’d see the complementary set. We’d see the loopback of R1 and the loopback of R3, those networks, up on R2. However, what we’re not going to see is the EIGRP learned routes because there’s a separate disable split horizon command just for EIGRP.

00:31:46
We need to apply that one if we want to see the EIGRP learned routes. So let’s do a couple things. Let’s go to R1 and, first of all, let’s go ahead and put back EIGRP. We’ll go into configuration mode. Router EIGRP 1. Network 0.0.0.0. On IOS 15 and higher, the auto summary is off by default. So we don’t need to worry about that.

00:32:05
And then, let’s go ahead and disable, specifically, split horizon for EIGRP autonomous system number one. So we’ll go into interface configuration mode. We’ll say, interface serial 1/0.123, please, no IP split horizon for that specific EIGRP autonomous system.

00:32:23
And that should allow the split horizon to be disabled, which should allow R2 and R3 to see the remote spokes’ networks learned via EIGRP. So if we go back to R3 and we do a show IP route for routes learned via EIGRP, we should see the 1.1.1 and the 2.2.2 networks now replaced with EIGRP learned routes, again, because of the better administrative distance.
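Put together, the EIGRP-specific fix on the hub might look like this. The AS number (1), network statement, and sub-interface are the ones mentioned in the walkthrough.

```text
! R1: put EIGRP back, then disable split horizon specifically for EIGRP AS 1
configure terminal
 router eigrp 1
  network 0.0.0.0
 exit
 interface Serial1/0.123
  no ip split-horizon eigrp 1
 end
! R3 (and R2): check that the other spoke's loopback is now an EIGRP route
show ip route eigrp
```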

00:32:47
Also, as a test, we could ping the loopback address of R2 from our loopback address on R3 just to validate that R2 also has a route back to 3.3.3.3. So we’ll do that real quick. And if that works, that, my friends, is a home run. One other item that I want to share with you as we take a look at troubleshooting (and this applies mostly to the CCIE level) is this: what if you have to ping your own Frame Relay interface address? For example, R1 has 10.0.0.1. If we try a ping from R1, the layer 3 to the layer 2 mapping isn’t there for his own local IP address.

00:33:23
He’s got Frame mappings for R2 and R3 but not his own. So if there’s a wacky requirement that you have to be able to ping your own Frame Relay address, all we have to do is follow the kind instructions of my high school teacher, Mr. Chickini, when he said feed the

00:33:39
bird what it wants. If there’s no layer 3 to layer 2 mapping for our own IP address, let’s go ahead and give it one. So all we’re going to do is go into interface configuration mode and say Frame Relay map my own IP address, and pick any one of your valid PVCs.

00:33:55
And then you have a frame relay map for your own IP address. And you might ask, OK Keith, what’s that really going to buy me? And the answer is, if there’s a requirement to ping your own IP address, it could get you two points. So I tossed that one out there for CCIEs.
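As an illustration, on R1 this could look like the following. DLCI 102 is used because it was described earlier as R1's PVC toward R2. Any valid local DLCI should work equally well, and the choice here is an assumption.

```text
! R1: map the router's own address to one of its valid local DLCIs
configure terminal
 interface Serial1/0.123
  frame-relay map ip 10.0.0.1 102
 end
! The ping to the local address now has a layer 3 to layer 2 mapping to use
ping 10.0.0.1
```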

Troubleshooting PPP

00:00:00
The Point-to-Point Protocol is so entirely simple you might wonder why we’re even bothering to shoot a Nugget on its troubleshooting. Well, there are some issues with PPP that learners will run into, and specifically, it typically revolves around authentication.

00:00:20
Let’s cover that and a bit more in this Nugget. Now, in order to walk you through the troubleshooting of PPP, let’s go ahead and jump right into an example. So we have R1 and R2 connected over a serial 1/0 connection, with a simple 10.10.10.1 on R1 and 10.10.10.2 on R2. First step, let’s check for IP connectivity.

00:00:45
So we’ll go to the R1 device, and we will ping 10.10.10.2. Now, what is the encapsulation in use? Well, as you know, on a Cisco device, the default serial encapsulation is going to be Cisco’s flavor of HDLC. We can prove this by doing show interface serial 1/0, and under the encapsulation area, we can see that we’re at HDLC.

00:01:13
So first problem? Yeah, the first potential problem, of course, is that you would have an encapsulation mismatch. One side is speaking HDLC, while the other has been converted to the PPP config. So let’s do that, and let’s see what results at the command line.

00:01:30
On this side, on the R1 side, I’ll go ahead and say encapsulation PPP. And we should see the interface’s line protocol, which is roughly equivalent to layer 2 of the OSI model, we should see it come crashing down. So if we do a show IP interface brief, we see the telltale encapsulation mismatch setting of up, down.

00:01:57
Look at that. So we’re up, down on this particular interface. That is a surefire indication that we have an encapsulation mismatch. Well, let’s fix it. We’ll go ahead over to the R2 device, and on the R2 device, we’ll say interface serial 1/0. We’ll say encapsulation PPP, and we should see the circuit come back up.

00:02:22
And we see that indeed happens. Line protocol on interface serial 1/0 changed state to up. And we can go ahead and do a final verification of a ping over to the R1 device. So let’s track the commands that we have in place on these devices. So we know it’s interface serial 1/0 on each device, and we have under that now encap PPP on each side.
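The commands tracked so far amount to the sketch below. Interface names are as given in the walkthrough. The `include` filter is just one convenient way to pull out the encapsulation line.

```text
! R1
configure terminal
 interface Serial1/0
  encapsulation ppp
 end
! R2 (until this side matches, the link shows up/down)
configure terminal
 interface Serial1/0
  encapsulation ppp
 end
! Either router: confirm the encapsulation and the up/up state
show interfaces Serial1/0 | include Encapsulation
show ip interface brief
```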

00:02:56
And that results in a happy, happy PPP circuit. Now, an area of huge trouble for students when it comes to the Point-to-Point Protocol is, indeed, authentication. And that’s what we want to go ahead and focus on in this Nugget. As you probably recall, there are two options for authentication.

00:03:19
There’s the clear text PAP, and there is the non-clear text CHAP. And obviously, CHAP offers us a measure of security, where PAP really doesn’t because it is a clear text exchange of the password information that’s utilized. But we’re talking about certification here, aren’t we? So in all of the various certifications from NA to IE, it certainly is possible that we get PAP thrown at us from a configuration standpoint.

00:03:52
What we’re going to do is we’re going to demonstrate something that takes students often by great surprise. We’re going to prove to you that these are indeed uni-directional type protocols. Yeah. They don’t need to be configured bi-directionally. And that’s what so many students think.

00:04:15
And we really can’t blame those students because of the fact that it’s often taught where you’re configuring it dual ways, right, bi-directionally. What we’ll do is we’ll begin by having R1 be our PAP server. Now, that terminology, I really make up, right? There’s no such thing as a PAP server.

00:04:40
But I like to think of these as client-server type technologies. It aids me in my configuration. And we’ll have this R2 device be a PAP client. Then, as you might guess, we’re going to go in and have R2 be a CHAP server, for example, and have R1 be the CHAP client.

00:05:00
But first, let’s focus on this one-way Password Authentication Protocol or PAP that we’re going to configure. Let’s go to R1 and enter the appropriate commands for PAP on that side, and then enter our appropriate commands on R2 for the PAP client functionality.

00:05:21
Now, when you’re configuring something like this, you’re going to break it. You’re going to configure one side and not the other, and you’re going to break the circuit. In fact, we know what it would look like broken. Sure enough, it would be in an up/down state on each side of the link.

00:05:39
So do yourself a favor. Don’t break it. We’ll go into interface serial 1/0, and we’ll say shut, and we’ll shut down this particular circuit. OK. Here we are on R1. Let’s do the quote “server” unquote PAP configuration. We say ppp authentication pap. And guess what? That’s it.

00:06:04
That’s it. That’s all that’s required on this side. Let’s go over to the R2 device and do the quote “client” unquote configuration. Over here, what we can do is go in under the interface and say ppp pap. The sent-username we’ll try and log in with will be R1, and the password that we will attempt to log in with will be cisco.

00:06:33
Now, it’s at this point that you realize, OK, wait a minute. We need this username and password defined over in R1, don’t we, so that this process will work. Yeah. So let’s jump over to the R1 device, and we’ll go to the username and password database. And we’ll say there’s going to be an entry for R2 (and I think I made a mistake we’re going to correct here in a moment): username R2, password will be cisco. OK.

00:07:07
Now, let’s go over to R2 and see what username and password it is sending. OK, yeah. We did that wrong. It’s sending R1. I need it to send R2. No problem. We’ll go up arrow. We’ll say no. We’ll go up arrow again a couple times, and we’ll change this to R2. OK. There we go.

00:07:28
So let’s summarize these commands that we’ve issued on our whiteboard. Let me get a new color and indicate that it is a command that is related to PAP. So PAP will be in the color red. You ready? So over on R1, we said PPP authentication pap. And over here, we did a username and password entry for R2. And then over here on R2, the supposed “client” in our situation, what do we do? We said ppp pap sent-username and password.

00:08:18
OK. So that’s it. That’s going to enable the use of password authentication protocol between these devices. R1 is challenging R2 for a username and password with this command right here. R2 is able to respond with this command right here. And that is checked against the username and password database on R1. Well, this is all fine and good if it works.
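The one-way PAP setup summarized on the whiteboard can be sketched as follows. The username, password, and interface are the ones used in the walkthrough. The terms "server" and "client" are the trainer's informal labels, not Cisco terminology.

```text
! R1 ("server"): demand PAP from the peer and hold the credentials to check against
configure terminal
 username R2 password cisco
 interface Serial1/0
  ppp authentication pap
 end
! R2 ("client"): send the credentials that R1 has in its local database
configure terminal
 interface Serial1/0
  ppp pap sent-username R2 password cisco
 end
```

Shutting the interface before these changes and bringing it back with no shut afterward, as the walkthrough does, avoids the link sitting in an up/down state halfway through.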

00:08:48
Let’s jump back to the command line and see if this indeed functions. Here we are back on R1. This is where we did the shutdown of the interface. Let’s no shut the interface. It helps to be in the interface when we do that. And what we’re looking for here is an up/up indication.

00:09:09
And that’s exactly what we have. Can we ping the other side? Of course we can. Let’s do our ping, and we can ping. So one-way PAP authentication is now configured. So now our job is to put in the commands that would be required for CHAP. And again, we want to do a one-way CHAP example here.

00:09:36
Well, under R2, we’re going to go in with PPP authentication CHAP. So sure enough, R2 is going to be challenging R1, and it’s going to be ensuring that R1 does the CHAP authentication. Now, CHAP utilizes a shared secret. On R1, we already have this password set up for R2 of cisco. So what we’re going to do is, over on R2, we’re going to go in and say username R1 password cisco. Now they each have a shared secret that they can utilize for the CHAP authentication process.

00:10:30
And that’s it. That’s it. So two commands now over on R2. And thanks to the fact that earlier we had put this username and password entry in for our PAP authentication process, we should be good to go. Let’s jump to the command line and try it. So we’ll go over to R2. We’re suggesting that is the only place we need to make configurations, right? We’ll go in there, and we’ll shut the interface.

00:11:01
We know we would potentially cause it to come crashing down as we configure this. We’ll say ppp authentication chap. And then on this R2 device, we need to set up the username and password entry. And the big thing is, that password needs to match the password that’s been configured over on the R1 device. That is a shared secret that is not going to be sent across this WAN link, making this much, much stronger than password authentication protocol.
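A minimal sketch of the one-way CHAP addition on R2. It assumes the routers' hostnames are R1 and R2, since CHAP identifies each side by hostname by default, and it relies on the `username R2 password cisco` entry already present on R1 from the PAP step.

```text
! R2 ("server"): challenge the peer with CHAP
configure terminal
 ! Shared secret for the peer; must match the password R1 holds for R2
 username R1 password cisco
 interface Serial1/0
  shutdown
  ppp authentication chap
  no shutdown
 end
```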

00:11:38
All right. All that’s left for us to do is a no shut. And with our no shut, we are indeed looking for an up/up designation. So we no shut the circuit, and there’s our layer 1 up. And then the big thing that matters here to us is the line protocol coming up, and there it is.

00:12:04
Awesome. Can we ping? Of course we can because we are in an up/up state. So notice, this is not difficult, and it’s probably more flexible than you had thought. Let’s go ahead and review one last time what we did with authentication here. In this Nugget, we went ahead and we set up PPP as the encapsulation on this WAN circuit.

00:12:35
It is unbelievably simple to do. We just say encapsulation PPP on each side of the circuit, and we go from an HDLC WAN connection to a Point-to-Point Protocol WAN connection. Then we went and made R1 what I like to term a PAP server and R2 a PAP client, and this was with these simple commands that we see in red.

00:13:02
Then we went and made R2 what I would call a CHAP server, and we made R1 a CHAP client. And this was by adding the commands that we see in purple. Just two commands required here on R2 in order to do that. So we have proved that the CHAP and PAP processes are not bidirectional, they’re uni-directional.

00:13:32
And you can, if you want to, configure them bidirectionally, and that is often the case, by the way. What’s the real-world case? Well, it’s to configure CHAP authentication over here and CHAP authentication over here, so they are both acting as servers and clients to each other.

00:13:53
But remember, in certification environments it’s not always that we’re going to discover the real-world type of configuration, right? Remember, too, that we didn’t need any sophisticated show commands or debugging commands here at all. When we’re working with the WAN encapsulation, we do our show IP interface brief as our key command.

Solving EIGRP Adjacency Failures

00:00:00
There’s a myth about EIGRP, and that myth is it is very simple and its configuration is very simple. This is a bit of a myth and can cause engineers real distress when two devices do not form an EIGRP adjacency correctly. This is because engineers are convinced of the simplicity, and they wouldn’t really even know what to check to find out what has gone wrong with the adjacency.

00:00:32
In this CBT Nugget, Keith Barker and myself are going to not only guide you through the adjacency process in EIGRP, but we’re also going to ensure that you can accurately and efficiently troubleshoot any problem in the formation of such an adjacency. In order for you to troubleshoot EIGRP adjacency failures like an absolute pro, we’re going to break this Nugget down into three main areas.

00:01:03
First, we’re going to examine in great detail how EIGRP forms these adjacencies to begin with. From that information, it will be very easy for us to come up with a list of possible problems, and then we’ll have Keith Barker walk us through some demonstrations on just how easy it is to identify these particular problems and solve them.

00:01:28
Now this is certainly an exciting topic in its own right for the real world troubleshooting we might often have to do. But it’s exciting for those of you interested in Cisco certifications, because this particular Nugget does indeed pertain to CCENT, CCNA, CCNP, and CCIE certifications.

00:01:51
So just how do two routers form an EIGRP neighbor relationship? Well, here we have R1 and R2. And notice they’re connected by a serial link. How they will form their adjacency is each of these interfaces, once configured properly for EIGRP will periodically multicast hello packets.

00:02:14
That’s right. It is a packet type, one of the five EIGRP packet types, called a hello packet. And these will be multicast on the circuit so these neighbors can find each other. An obvious question here is, how often are these multicast packets sent? Well, the answer is, by default every five seconds.

00:02:37
And it doesn’t really matter the type of link you’re dealing with. If it’s a broadcast Ethernet link, it’s going to be five seconds. If it’s a point-to-point serial link, it’s going to be five seconds. But Cisco decided to do one important thing. They said, if the speed of the link is T1 or greater, then go ahead and aggressively multicast the hellos at five seconds.

00:03:03
If it is less than T1 speed, the new default timer interval is 60 seconds. This is a great idea, right? We have a slower physical media, so we want to go ahead and reduce the overhead due to hellos on that slower link. Another very important piece of information to know here is, what is the multicast address that is used for this neighbor formation with hello packets? It’s 224.0.0.10. What do we notice about this particular multicast group address? Well, this is going to be link-local only, so these multicasts will not propagate beyond the local link, perfect for adjacency formation.

00:03:54
From this discussion, we begin to immediately imagine potential issues. Remember what we said? The interfaces must be properly configured for EIGRP? Well, that’s the first area we should look at. Are the interfaces properly set up? Yeah. Are they properly configured for EIGRP? And this involves our network statement.

00:04:23
Are the network statements configured accurately? Also keep in mind that the AS numbers in the EIGRP configuration, they must match. That’s right. If we have autonomous system 100 configured on R1 and autonomous system 1 configured on R2, we will not form an adjacency.

00:04:49
In fact, it used to be that engineers would try and increase the scalability of EIGRP by configuring it in multiple AS domains and then redistributing between those domains. This is now frowned upon. And Cisco has invented things like the stuck in active query behavior and EIGRP Stub devices in order to eliminate the need for administrators to configure these multiple domains in their environment.

00:05:25
Remember how we said the hellos would be sent? They would be multicast. So sure enough, does our link support it? What if our link is a unicast link incapable of supporting broadcasts or multicasts? Well, in this case, we can go ahead and configure the neighbor command.

00:05:48
What the neighbor command does is it forces the sending of unicast hello packets across the link. One potential configuration issue, of course, would be that you have R1 configured with the neighbor command to unicast and R2 configured without the neighbor command, and it’s defaulting to a multicast behavior.
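To make the symmetry requirement concrete, here is a hedged sketch with the neighbor command on both sides. The AS number, addresses, and interface are illustrative assumptions, not values taken from a configuration shown in the video.

```text
! R1: send unicast hellos to R2 (illustrative AS number and addresses)
router eigrp 100
 neighbor 10.0.0.2 Serial1/0
! R2: the matching statement pointing back at R1
router eigrp 100
 neighbor 10.0.0.1 Serial1/0
```

If only one router carries the neighbor statement, one side unicasts while the other multicasts, which is the mismatch described above.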

00:06:12
Now something else that we like to do with EIGRP is we like to configure MD5 authentication so that our neighborships are secured. This is definitely something we have to look for when we’re troubleshooting EIGRP adjacencies. Are both configured, R1 and R2, for the authentication? And is the authentication configured correctly? Something else to watch out for is those K-Values.

00:06:41
Cisco really recommends we don’t manipulate K-Values. The K-Values are constants that manipulate the EIGRP metric calculation formula. We know, by default, it’s bandwidth and delay that are considered in the calculation of a metric. If we manipulate the K-Values, we manipulate the way in which EIGRP calculates the metric.

00:07:08
If we are to manipulate these K-Values on, let’s say, R1, then these same K-Values must be in place on R2. And this is going to be another ingredient that we would check if we are unable to form adjacency. Is there a way for you to silence a device, let’s say R1, from sending hellos? Sure.

00:07:33
This is, of course, accomplished with the passive interface command. This is something else that we want to look for to make sure that a particular device has not been silenced from sending hellos. Notice, of all of these possible issues we’ve identified, they’re all pretty darn particular to EIGRP, aren’t they? What I do after thinking about all of these potential issues that could exist with EIGRP’s adjacency formation, I then think about Layer 1, Layer 2, and Layer 3 configurations that could impact these interfaces.
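For reference, the EIGRP-specific items in this list show up in a configuration roughly as follows. This is a sketch of what to look for, not a recommended configuration. The key chain name, key string, AS number, and interface are hypothetical.

```text
! Authentication: both routers need MD5 mode and a matching key on the link
key chain EIGRP-KEYS
 key 1
  key-string ExampleSecret
interface Serial1/0
 ip authentication mode eigrp 100 md5
 ip authentication key-chain eigrp 100 EIGRP-KEYS
!
router eigrp 100
 ! K-values in the order: tos k1 k2 k3 k4 k5 (these are the defaults, K1 = K3 = 1)
 metric weights 0 1 0 1 0 0
 ! A line like this silences hellos on the interface; look for it, do not add it
 passive-interface Serial1/0
```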

00:08:15
For instance, are we administratively shut down? Well, that would be a Layer 1 problem that would keep our neighborships from forming. Do we have a uni-directional link? Yeah, a uni-directional link would be a Layer 2 problem that would certainly impact the adjacency formation.

00:08:33
And then do we have any Layer 3 stuff going on that might impact us? What would be an example of a Layer 3 problem? Well, how about a security configuration, specifically something like an access control list that is preventing the formation of the adjacency.

00:08:54
We have reviewed the adjacency formation process. And from that, we have compiled this superb list of possible issues that we could have. Let’s have Keith Barker breathe life into this theory at the actual command-line. Well, the very first thing we might want to do is just verify we have basic connectivity between R1 and R2. So how would we verify that? Probably with something like a basic ping.

00:09:23
If we can’t ping or don’t have connectivity to that neighbor, the opportunity for an adjacency is very unlikely. However, in this case, it looks like our pings are successful. Next, if we’re troubleshooting an EIGRP adjacency, why not just verify that we are actually having a problem? We could do a show IP EIGRP neighbors, just to verify if any neighbors show up.

00:09:43
And sure enough, R1 has no EIGRP neighbors. Anthony mentioned some very important things that have to be in place for adjacencies to make sure they happen. We have to have interfaces that are enabled for EIGRP. We have to have the same autonomous system numbers between two neighbors.

00:10:00
We need to make sure the K-Values match up and also make sure if we have any passive interfaces we’re not passive on an interface that we want to use to communicate with a neighbor. One way, or a couple of commands, we can use to verify all of that information is a show IP EIGRP interface and a follow-up show IP protocols.
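The checks mentioned so far map onto these commands, run on each router in turn. The comments paraphrase what the walkthrough looks for in each output.

```text
! Basic reachability to the neighbor's address comes first (ping)
! Are there any neighbors at all?
show ip eigrp neighbors
! Which interfaces are actually participating in EIGRP?
show ip eigrp interfaces
! AS number, K-values, network statements, and passive interfaces
show ip protocols
```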

00:10:19
So what do we have here? Well, we know we have two interfaces on R1 that are enabled for EIGRP. We have loopback 0, and we also have serial 1/0. That looks like a good sign. We have an EIGRP autonomous system number 10. Now if both sides were using AS number 10, that would be perfect.

00:10:35
We’ll check out the other side here in just a moment. The K-Values are at their default, so K1 is set to a 1 and K3 is set to a 1, and all the others are off. So as long as those K-Values match on the other side, that would not be a problem.

00:10:48
They need to match. And I don’t see any passive interface statements down here. Because both interfaces show up in the EIGRP interface list, and I don’t see any that are passive, I’m not looking at that as a specific problem on this side of the equation. Let’s go take a look at the same parameters over on R2. With a road trip over to R2, we’ll issue the same exact commands.

00:11:09
We’ll do a show IP EIGRP interface. And this looks like a problem already. We’ll do a show IP protocols also. As far as the autonomous system numbers being the same, R2 is using 100, and R1 is using 10. So that needs to be corrected, if they’re ever going to be neighbors.

00:11:27
I also see over here on the interface list that R2 is not including its serial interface, 1/0, inside of EIGRP. It’s not enabled. Now that could be from a network statement. However, there is a network statement reflected right here, and it’s right below that we have the passive interface, which is causing EIGRP not to be enabled on that serial interface.

00:11:49
So presuming we want to use AS number 100, we can go ahead and leave that here on R2, and we’ll simply remove the passive interface command on R2 while we’re right here. The actual configuration to correct these issues is pretty simple once we know what the issue is.

00:12:03
However, I’d strongly recommend, before you start making changes in an environment where you may need to go back to something, always back it up. Make sure you have a backup that you can replicate or reproduce if you need to go back to your starting point.

00:12:17
We’ll go into EIGRP autonomous system 100, and we’ll simply say no. No passive interface serial 1/0, and that will take care of at least that problem. We can then use the up arrow a few times just to verify with the show IP EIGRP interface command right here, that now the serial 1/0 is enabled and participating in EIGRP. Based on our findings, we have an AS number mismatch.
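On R2, that change and the follow-up check might look like this, using the AS number and interface from the walkthrough.

```text
! R2: stop suppressing hellos on the link toward R1
configure terminal
 router eigrp 100
  no passive-interface Serial1/0
 end
! Serial1/0 should now appear in the EIGRP interface list
show ip eigrp interfaces
```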

00:12:43
So let’s go back to R1– I’ll move over to that tab– and let’s remove the old router process, the EIGRP 10, unless we are using it for some other reason. However, if we’re not, we’d remove it and then simply put the new one in, router EIGRP 100, network 1, network 10. In IOS 15.x, the auto-summary feature is disabled by default.

00:13:07
However, if you were on IOS 12.x or older– heaven forbid it was older than 12– you might also want to disable auto-summarization in EIGRP, because it’s very likely it’s going to cause you more problems than it is going to help you in today’s modern networks.

00:13:25
Whenever we make a change, we want to go back and verify, just to make sure that maybe what we changed corrected the problem. Again, our goal is an adjacency between R1 and R2. So we can do a show IP EIGRP interface, just to verify we have the interfaces we need.

00:13:40
Show IP EIGRP neighbors, in the event that R2 decided to start neighboring with us. He is not. There we have no neighbors based on this empty output right here. And the show IP protocols can help us verify that we have our network statements included correctly, which reflects why these interfaces are involved now.

00:13:58
And we have no sources of information. Effectively we have no EIGRP neighbors who are feeding us information. So it looks like our neighborship and our adjacency issue isn’t quite yet solved. Another thing that Anthony mentioned was that the link needs to support multicast, because that’s how the updates are done.

00:14:15
So we are on a serial link. We should take a closer look to find out what the details on that link are. One way of doing that is a show interface. What I’m doing here, I’m just going to pipe and just show specific outputs from that, the ones I want to focus on.

00:14:28
I want to look at the encapsulation and also the IP address. It shows that it’s frame relay. Now take a look at this. The IP address is 10.0.0.1/23. What is wrong with a slash 23-bit mask? And the answer is nothing, as long as R2 is also using that same mask.

00:14:44
For general network functionality, we’d want to have the same mask on both sides. But one of the interesting things about EIGRP is that you don’t have to have the exact same network length or mask length on both sides to form an adjacency. Other routing protocols, like OSPF, would have a huge problem with that.

00:15:02
But that’s not a problem with EIGRP to form an adjacency if the mask is slightly different. The other part here I wanted to look at was the encapsulation is frame relay, and here’s a frame relay mapping. So we have 10.0.0.2. We can reach that device via dlci 102. It’s a static mapping.

00:15:19
And there’s something missing right here. It’s the broadcast keyword. That broadcast keyword would give us the ability to go ahead and forward broadcasts and multicasts across that link. But because that is missing, that is not happening. This link does not support multicasts as it is.

00:15:36
Now how can we fix that? We could add the broadcast keyword on both sides. Or we can go ahead and send unicast messages by including a neighbor statement in our EIGRP configuration. Let’s add that unicast update support. We’re going to do that by going into Configuration mode, router EIGRP 100. And we’re going to say, dear Mr. R1, your neighbor is 10.0.0.2, and he’s reachable out of serial 1/0. That will cause unicast updates to be sent out.

00:16:07
And because we’re on a slow link (I think the bandwidth is currently set to 64 kilobits per second or something like that on this frame relay circuit), the hellos are going to be sent every 60 seconds, instead of the normal every 5 seconds that we’d find on high-speed links.
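The 60-second versus 5-second behavior is the default EIGRP hello interval, which depends on the link type. A minimal Python sketch follows, assuming the commonly documented rule that multipoint NBMA interfaces at T1 speed or slower use 60 seconds and that the default hold time is three times the hello interval. The function names are hypothetical, and the 64 kbit/s figure is the speaker's own estimate.

```python
T1_KBPS = 1544  # bandwidth of a T1 circuit in kbit/s


def default_hello_seconds(multipoint_nbma: bool, bandwidth_kbps: int) -> int:
    """Default hello interval: 60 s on slow multipoint NBMA links, otherwise 5 s."""
    if multipoint_nbma and bandwidth_kbps <= T1_KBPS:
        return 60
    return 5


def default_hold_seconds(hello_seconds: int) -> int:
    """Default hold time is three times the hello interval."""
    return 3 * hello_seconds


# The frame relay serial interface in the video, assuming bandwidth 64 kbit/s
hello = default_hello_seconds(multipoint_nbma=True, bandwidth_kbps=64)
print(hello, default_hold_seconds(hello))
```

This is why waiting a full minute after each change is reasonable on this link before concluding that the adjacency is still broken.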

00:16:22
So here we may want to just give it a moment. If we’re waiting for 60 seconds before an update might happen, we might want to wait a full 60 seconds. And then after 60 seconds, if we still don’t get the neighborship, we could say, well, perhaps it’s something else.

00:16:37
However, we’re whittling away at the multiple problems that would cause an adjacency not to be formed between two EIGRP speakers. It doesn’t look like it’s going to form an adjacency. What else did Anthony mention? He mentioned that authentication needs to be acceptable on both sides, meaning you may not use any authentication.

00:16:55
Or you may need to use plain text. Or you may use MD5. But it has to match on both sides. Let’s go ahead and take a peek. Because we’re sitting at R1, let’s take a peek at the details regarding authentication. To do that, I’m just going to simply do a show run of interface serial 1/0. And I’m going to say, please only show me lines that include the word authen.

00:17:16
So if there’s any authentication commands inside of that interface, they’ll show up here in the output. And there aren’t any. OK. That’s simple enough. There’s no authentication on R1. Let’s go take a peek at R2. Here on R2. Let’s go ahead and just take a peek.

00:17:30
We’ll do that same show command, say please show me the details for interface serial 1/0, only lines that include authentication. There on R2, they are using authentication. Let’s take a peek at which authentication keys are in use. We’ll do a show key chain.

00:17:45
And we have key 1. The string of our key is 156266783. There are no time constraints regarding when this key may or may not be valid. It’s always valid. And the key chain name is called keys-for-EIGRP. So on R2’s side, we’re using MD5. And this is the key chain that we’re using.

00:18:03
We need to implement the same keys over on R1, or we could remove the keys from R2. It’s very likely, however, that we’d want to add it to the other side, so let’s go ahead and add it back in over at R1. A little road trip over to R1. Let’s go ahead and give us a little more space on the screen.

00:18:22
And let’s just verify if we have a key chain. If we have the key chain already set up, we don’t have to recreate it. We’ll do a show key chain. That’s fantastic. You want to make sure that the text is right, the string, make sure there’s no extra 0’s or extra spaces or anything else that might make those strings different.

00:18:40
And then we’ll simply add that authentication back to the interface where we want the adjacency to happen. Now that that’s done, let’s go ahead and do a show IP EIGRP neighbors. And again, it may take up to 60 seconds on this slow serial link for those messages to be sent for the neighborship to go ahead and come up.

00:18:59
So we want to give it just a moment, just to make sure that if that was the last problem with our adjacency we aren’t continuing to troubleshoot when, in reality, we just have to wait a few more seconds. And look at that. It’s like, whoo-hoo. Party, party.

00:19:14
Won’t you be my neighbor? Yes, you will. This looks pretty good to me. So I’m thinking, wow, we’ve got our neighborship with 10.0.0.2. It says up, new adjacency. Let’s go ahead and take a look at the routes. Let’s do a show IP route. And this is IOS 15, so it’s showing me some local routes. But let’s actually put on a little appendage on that and say, let’s show IP route for EIGRP and the EIGRP learned routes.

00:19:37
Nothing. It says I have an adjacency, based on this message right here, but I’m not learning any routes. What I should be learning is the loopback address that’s over on R2. It is part of the EIGRP process. Let’s go take a look over at R2 for a moment and see what he says.

00:19:53
I don’t see any messages on the screen that talk about, hey, I’ve got a new neighborship. In fact, here, if we do a show IP EIGRP neighbors, R2 says, I got nothing. How can that be? R1’s saying, yeah, I got a neighborship, and R2’s saying, I don’t have a neighborship.

00:20:11
Hey, what else did Anthony mention that might be a problem? He mentioned that if the interfaces are shut down, that’s going to be an issue. If there’s uni-directional links, that might be an issue. There might be access control lists that are stopping it.

00:20:23
Let’s take a look and see if, perhaps, there’s any type of administrative controls, like an access list, that is preventing some of those hello messages from being received at either of these routers. I’m thinking R1 might be perfectly OK, because he thinks the adjacency is up.

00:20:38
He’s happy, happy. But R2 isn’t quite so happy. He’s not saying anything about that. So on R2, what I’m going to do is do a show access list, just to see if there are any access lists. See, if there aren’t any access lists, we don’t have to worry too much about an access list being applied.

00:20:52
However, because we do have an access list of 100 right here, we then want to find out is this applied, yes or no, to an interface that might be blocking some traffic. And if we do a show IP interface for serial 1/0, it says, you know what? The inbound access list is 100. So if we go back to the access list and we say, OK, we’re permitting ICMP from anywhere to anywhere, UDP from anywhere to anywhere, TCP from anywhere to anywhere, and denying everything else, there are 43 matches. Well, heck, should that allow EIGRP through? And the answer is if EIGRP was using UDP or TCP, the answer would be yes.

00:21:30
Unfortunately, EIGRP is its own protocol at Layer 4. And as a result, it’s not matching on lines 10, 20, or 30. It’s actually matching on line 40 to deny everything else in the IP stack that hasn’t previously been matched. If we look at a protocol analyzer of what’s happening, we’ve got some LMI stuff from Frame Relay.

00:21:50
If we scroll down a little bit, we start to see the hello messages. So here’s R1 telling R2, hey, EIGRP message, I want to be your neighbor. And if we open up Layer 3, it says, OK, protocol 88 is the protocol number for EIGRP that’s being sent. And unfortunately, when R2 gets that, he’s saying, you know what? That’s being denied.

00:22:10
And it sends an ICMP message back saying, you know what? I killed your packet. And if we look at the details of that ICMP message, it actually describes what it killed. It said, well, it was protocol 88. I would have let them in, but the access list just said no.

00:22:24
But I thought you should be aware that I killed your packet. Now if you go through this trace, which, by the way, we’ve included in the Nugget Lab files for this video, if you go all the way down, you’re going to see this repeating over and over again. So approximately about 217 seconds-ish from the beginning of this capture, R1 effectively gives up on the relationship.

00:22:44
If we go back to the console and go back to R1, we get this beautiful message. Yep. Retry limit exceeded. But I’m not going to give up, because I’m still receiving every 60 seconds the hello message from R2. So this process is going to go over and over and over again.

00:23:03
The adjacency will think it comes up on R1. And then after a period of time, it’ll say retry limit exceeded. And then it’ll start the cycle over again. The entire time, however, good old R2 never thought there was adjacency. And the end result is neither one are sharing EIGRP routes with each other.

00:23:23
Let’s go fix this over on R2. So over on R2 where the problem is with the access list, we can just simply modify the ACL. We’ll just stick in line 35 that says permit EIGRP. And once that’s there, then if we wait long enough, like 60 seconds or less, we should go ahead and be able to get our adjacency.

00:23:42
We’ll go into Configuration mode. We’ll go into named access configuration mode for access list 100 and add line 35. And that should be it. Now when we do a show IP EIGRP neighbor, we don’t have any neighbors yet. But I’m thinking within 60 seconds or less, depending on when the timer hits, we should get a neighborship.
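The access list behavior described in this part of the walkthrough (top-down, first match wins, EIGRP carried as IP protocol 88) can be modeled with a short Python sketch. The tuple layout and the `evaluate` function are hypothetical simplifications; source and destination matching are left out because every entry in the video uses any-to-any.

```python
# IP protocol numbers: ICMP=1, TCP=6, UDP=17, EIGRP=88
PROTOCOLS = {"icmp": 1, "tcp": 6, "udp": 17, "eigrp": 88}

# (sequence, action, protocol keyword); "ip" matches every IP protocol
acl_100 = [
    (10, "permit", "icmp"),
    (20, "permit", "udp"),
    (30, "permit", "tcp"),
    (40, "deny", "ip"),
]


def evaluate(acl, protocol_number):
    """Return (sequence, action) of the first matching entry, evaluated top down."""
    for seq, action, proto in sorted(acl):
        if proto == "ip" or PROTOCOLS[proto] == protocol_number:
            return seq, action
    return None, "deny"  # implicit deny when nothing matches


before = evaluate(acl_100, PROTOCOLS["eigrp"])
fixed_acl = acl_100 + [(35, "permit", "eigrp")]  # the line 35 added on R2
after = evaluate(fixed_acl, PROTOCOLS["eigrp"])
print("before:", before)
print("after: ", after)
```

Because line 35 sorts ahead of the deny at line 40, EIGRP hellos are permitted once it is added, while ICMP, UDP, and TCP keep matching their original lines.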

00:24:02
And I, for one, am willing to wait. However, I wouldn’t wait more than 60 seconds. If more than 60 seconds goes by and it hasn’t happened, there may yet be another problem. And there’s our adjacency. That looks great. As a result, we should have IP routes learned via EIGRP, which we can check with a show IP route EIGRP.

00:24:23
This says, sure enough, we’ve learned about the loopback interface on R1. And as a final test, let’s go ahead and ping it. To ping it, we’ll do a fair test. We’ll go ahead and ping it and source it from our loopback, which is 2.2.2.2. If that works, that’s a home run in both directions.

Where are My EIGRP Routes?!?!

00:00:00
Welcome to this CBT Nugget, “Where, oh, where is my EIGRP route?” That’s right. Nothing more frustrating than viewing your routing table, expecting to see some shiny new prefixes, and you don’t see them. In this Nugget, Keith Barker and I will walk you through the easy troubleshooting process to solve these missing EIGRP route problems.

00:00:25
If you are interested in Cisco certification, this particular video does pertain to the CCNA, CCNP, and CCIE certification levels. So in this Nugget, we will start out with a review of some of the basics involving EIGRP route propagation in your network infrastructure.

00:00:47
We’ll make sure we detail some common and maybe not so common issues that can occur when it comes to routes not showing up in your forwarding database. And as always, we’re going to turn to the expertise of Keith Barker to bring this subject to life at the command line.

00:01:07
One of the ways in which EIGRP is kind of like a link state routing protocol is that it utilizes three databases in its operation. What are those databases? Well the first is the neighbor database. That’s right. A database that is going to list the adjacent neighbors that the particular device possesses.

00:01:32
Next up, we have the topology table. The topology table is roughly equivalent to the link state database in a true link state routing protocol like OSPF. The forwarding database, otherwise known as the routing table, is the next database that is utilized.

00:01:57
So think about this: EIGRP forms neighbors and records these neighbors in a database. It then exchanges EIGRP prefixes. All of these prefixes on each device are stored in the topology table, and then the Diffusing Update Algorithm, or DUAL, will run and will take the very best routes and move them from the topology table into the routing table or the forwarding database.

00:02:33
Remember that when we look inside the topology table at all of the prefixes, we can have prefixes show up as p for passive. And that means that there is full reachability knowledge for that particular prefix and everything’s just fine. If we look in the topology table and we see a route marked as active, that means we have lost information about that particular prefix and we are currently in the process of querying the devices to learn additional information about how to reach that particular destination.

00:03:11
By way of review, remember EIGRP does something very interesting as well. It’s going to take the best routes and it’s going to move them into the forwarding database. And these best routes are called the successor routes. Now it will bookmark, if you will, second best routes.

00:03:33
And these second best routes are known as feasible successor routes. And there may be more than one of these that are second best paths to get to a particular destination. If these feasible successors exist, they remain in the topology table. They stay tucked away in the topology table unless there is a problem with the successor and then they can be immediately installed into the forwarding table as the new successor route.

00:04:10
There doesn’t have to be an active state. There doesn’t have to be all of this querying behavior. We can just immediately implement this feasible successor. I should say EIGRP immediately does this for us and implements this device prefix as the new successor route in the forwarding table.
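The successor and feasible successor idea can be sketched in Python. One detail goes beyond what is said here: the sketch uses the standard EIGRP feasibility condition (a neighbor's reported distance must be lower than the current feasible distance) to decide which "second best" paths qualify. The neighbor names and metrics are hypothetical.

```python
def classify_paths(paths):
    """paths maps neighbor -> (reported_distance, total_distance).

    Returns (successor, feasible_distance, feasible_successors).
    """
    successor = min(paths, key=lambda n: paths[n][1])
    feasible_distance = paths[successor][1]
    feasible = [
        n for n, (reported, _total) in paths.items()
        if n != successor and reported < feasible_distance
    ]
    return successor, feasible_distance, feasible


def after_successor_failure(paths, feasible):
    """Describe what happens when the successor is lost."""
    if feasible:
        best = min(feasible, key=lambda n: paths[n][1])
        return f"install {best} immediately (prefix stays passive)"
    return "prefix goes active; query neighbors"


# Hypothetical topology-table entries for one prefix
paths = {"R2": (1000, 3000), "R3": (2500, 4000), "R4": (3500, 4500)}
successor, fd, backups = classify_paths(paths)
print(successor, fd, backups)
print(after_successor_failure(paths, backups))
```

In this example R4 offers a path but fails the feasibility condition, so it is not kept as a backup; only R3 can be promoted without the prefix going active.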

00:04:35
Another interesting aspect of EIGRP’s behavior is the fact that prior to 15.X code, the EIGRP device will auto summarize by default. That’s right. It’ll act classful in nature by default. A lot of students are surprised by this. So if we were going before 15.X code, by default the EIGRP speaker will summarize on major network boundaries.

00:05:13
And this certainly, as we’ll allude to, could be a reason that we’re not seeing all the prefixes that we expect to see in our environment. Now what’s going to be our command worth its weight in gold to see if we have one of the common issues we’ll cover when it comes to EIGRP routes not being there? Well obviously, we can go to our device and we can issue show IP route EIGRP.

00:05:42
It’s nice to qualify EIGRP especially in the case of a very large routing table where we might see a lot of other different routes like BGP routes, maybe we have an OSPF environment we’re connected to. So show IP route EIGRP will just give us those EIGRP prefixes that have made it to that forwarding database that we discussed.

00:06:06
Now a potential situation here is you look in there with show IP route EIGRP and you have none. Yeah, no routes are showing up at all. What’s the first thing we want to check? We want to check to ensure we have the adjacencies established that we expect. This is, of course, with show IP EIGRP neighbor.

00:06:33
And if we are indeed having problems with the adjacencies in our topology that might be between R1 and R2, it might be between R2 and R3, or it might be between all of these devices we’re experiencing the adjacency failure, well you need to consult with our Nugget in this series in which we solve EIGRP adjacency problems for you.

00:07:01
Now if you check your neighborships and everything there is just fine but you still have absolutely no EIGRP prefixes, well now it’s time to check in on distribute lists. Are there distribute lists in place and are those distribute lists blocking all of your route prefixes? By the way, keep distribute lists in mind when we are troubleshooting an environment where just some of our prefixes are missing too because obviously, the distribute list logic could be messed up and perhaps it’s just blocking several of your EIGRP prefixes when you don’t want it to be doing so.

00:07:48
So always give consideration to both inbound and outbound distribute lists that may exist in the topology. Now might you be missing just some prefixes? Well again, we definitely want to consider our distribute lists situation, but we can really rule out adjacency issues, at least with our direct neighbor.

00:08:11
There may be some adjacency issues downstream, and that’s why we’re not seeing as many EIGRP prefixes as we would like. But oftentimes, when we’re not seeing the sheer volume of prefixes that we would expect, it’s that automatic summarization behavior. Sure, remember, EIGRP speakers prior to 15.X code are going to automatically summarize by default.

00:08:39
And we will, therefore, lose prefix details often between our discontiguous networks. We will see a summarized version of the EIGRP update as opposed to the individual component routes that we might want to see. Now there are two other cases where we can see some of our EIGRP routes not being there.

00:09:05
These are corner cases. But we want to make sure you are aware of them. Remember there is a behavior called EIGRP stub that can be configured. And with the EIGRP stub feature, you can choose to have the stub device only advertise a subset of potential routes.

00:09:29
For instance, you could say, OK R2, I only want you to advertise static routes and connected routes. This is done in the EIGRP stub configuration command. This could certainly lead to some prefixes being, quote, “hidden” unquote from your view. Another thing to watch out for, and this is why we have this additional EIGRP AS out here, is the corner case of where you have R3 existing in that external AS and it has an EIGRP router ID that is identical to, let’s say, the EIGRP router ID of R1. If this is the case, when these prefixes go to get redistributed into AS100, sure enough, the prefixes will not appear because of the duplicate EIGRP router ID issue.

00:10:34
Again, this is rare and is a corner case type situation. And it would only impact you being able to see EIGRP externals in the AS100 domain. But this is worth pointing out since tricks like this, while rare in production networks, might be fun for certification exam designers as we are moving through the various Cisco certifications.

00:11:03
So Keith, there are some of our common and admittedly not so common issues that we can run into as we are looking for our EIGRP prefixes. I’d love for you to show some of these great verification and problem resolution techniques at the command line. Hey thanks, Anthony, I would love to demonstrate verification and remediation at the command line.

00:11:29
To do that, let’s use this topology. So we’ve got R1, R2, and R3. They’re all connected together over a frame relay network. However, there’s not three PVCs. I’ve got one PVC from R1 to R2, and one PVC from R1 to R3. But it is a multi-point network as we have three devices all connected to this common 10.0.0 network. Hiding behind R2, we have this range of networks and behind R3, we have these. And we just have a 172.16.10 subnet up behind R1. So let’s begin just by verifying that we’re actually missing some routes.

00:12:02
So up on R1 we should have at least 20 EIGRP learned routes. And a great way to take a look at an overview of the routes we’ve learned is with the command show IP route summary. And it’ll give us like the Reader’s Digest version of the counts of routes that we have.

00:12:18
So I’ve got some great news, that’s this guy right here, EIGRP. We have zero classful networks, zero subnets. Effectively, we don’t have any EIGRP learned routes. Now why is that so great? Why do you sound so happy, Keith? Because now we have something to track down, to discover the reason for these missing routes.

00:12:37
So there could be several reasons for that. Let’s see if we even are enabled for EIGRP. So we’ll take a quick look and do a show IP EIGRP interface. So if our interfaces are enabled for EIGRP– which they are, that’s the interface connected to the frame cloud.

00:12:51
It’s enabled. I’ve also got a physical interface gig 2/0 which is also enabled for EIGRP up on R1. So why are we not learning any EIGRP routes? Well let’s just verify that we have neighbors. In our Nugget on troubleshooting EIGRP adjacencies, we identified how to troubleshoot a neighborship.

00:13:08
But you know what, this neighborship looks solid. R1 is an EIGRP neighbor with R2 and R3. So let’s go take a look at R2. R2 should be serving up at least 10 different routes. Let’s go take a peek at him. So on R2, we’ll simply do a show IP route EIGRP. That’ll show us if we’re learning anything from R1. It’ll also reveal if there’s any type of summarization that’s happening.

00:13:32
And sure enough, check this out right here. We have a summary route and yet another summary route. So this is the 10 network with an 8-bit boundary and the 172.16 with a 16-bit boundary. And those would be telltale signs that this router is currently doing auto summarization.
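To see why auto-summarization produces exactly a /8 for the 10 network and a /16 for 172.16, here is a small Python sketch of classful boundaries. The `classful_network` helper is a hypothetical name, and the host addresses passed to it are examples chosen from the networks in this topology.

```python
import ipaddress


def classful_network(address: str) -> ipaddress.IPv4Network:
    """Return the major (classful) network that contains the given IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        prefix = 8       # Class A
    elif first_octet < 192:
        prefix = 16      # Class B
    elif first_octet < 224:
        prefix = 24      # Class C
    else:
        raise ValueError("multicast/reserved space has no classful unicast network")
    return ipaddress.ip_network(f"{address}/{prefix}", strict=False)


for addr in ("10.0.0.2", "172.16.21.1"):
    print(addr, "->", classful_network(addr))
```

Every 172.16.x subnet collapses into the same 172.16.0.0/16 summary, which is why the individual subnets disappear when auto-summary is on.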

00:13:49
As Anthony mentioned, in IOS 15 and higher, auto summary is off by default but you can turn it on. With IOS 12 and older, the auto summary is on by default. So based on these entries right here, that’s absolutely what it looks like. We can also confirm that with show IP protocols.

00:14:07
And that would tell us explicitly regarding summarization. So it’s really simple to disable the auto summary which is on based on this output. We’re just going to go into router configuration mode for EIGRP and we’ll just go ahead and disable auto summary with a no auto summary.

00:14:25
And here in R2 let’s also just validate our neighborship. We should have a neighborship with R1 because we are in a frame relay multi-point network. We don’t have a PVC directly connecting us, R2, to R3. We only have a PVC up to R1. And we do have our neighborship which is great.

00:14:42
So let’s see if we have any routes here in R2 that we’ve learned from R1. We’ll do a show IP route EIGRP. Now this is great news because check this out, our summaries are gone so we’re not doing auto summarization anymore. That’s fantastic. And we’ve learned one network, that’s the 172.16.10, that’s the network that R1 is advertising. That’s a very good sign.

00:15:04
So let’s do this, now that we’re not summarizing anymore, let’s go back up to R1 and just do a quick peek and see whether or not we have any routes that we’ve learned from R2. So we’ll just issue a show IP route for EIGRP and we’ve got absolutely nothing.

00:15:21
So we’ve got our neighborship, but we’re still not learning any routes. I wonder if R2 is actually sending any routes. How can we determine that? Let’s go talk to R2 and ask him about which routes he’s advertising up to R1. Now to do that, in a closed track environment, I’m going to turn on a debug.

00:15:40
In a production environment, if we did an incorrect debug, we could bring down a router. So I want to make sure you understand that doing debugs is a pretty serious thing. And in this lab environment, I’m going to pretty much just do a debug of all IP EIGRP. And then I’m going to force the relationship.

00:15:55
I’m going to do a clear IP EIGRP neighbors and that will force the neighborship to reestablish. And we should see all the routes that are currently being advertised. In a production environment, we would consider using an access control list or something else in conjunction with a debug to help limit exactly what we’re going to see.

00:16:13
So here I’ve got these routes. I’ve got the 21, 22, 23, 24, 25, all the way through 29, and there’s 20 right here as well. And it says, you know what, I was thinking about sending these routes, however, I’m not going to do it because I’m configured as a receive only stub, says Mr. R2. Well if we need those routes, if we don’t have some type of a summary or a static route up on R1, we absolutely need these routes to show up on R1. So to do that, we’d go into router configuration mode and simply remove the receive only stub configuration.
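The effect of a receive-only stub can be illustrated with a minimal Python sketch. The `stub_advertised` function is hypothetical, and so are the assumptions that R2's 172.16.20 through 172.16.29 networks are connected routes with a /24 mask, since the video does not state either detail.

```python
def stub_advertised(routes, stub_options):
    """Return the prefixes a stub router would advertise.

    routes: list of (prefix, source), where source is e.g. "connected" or "static".
    stub_options: set of allowed sources, or {"receive-only"} to advertise nothing.
    """
    if "receive-only" in stub_options:
        return []
    return [prefix for prefix, source in routes if source in stub_options]


# Assumed: R2's ten networks are connected /24s
r2_routes = [(f"172.16.{n}.0/24", "connected") for n in range(20, 30)]

print(len(stub_advertised(r2_routes, {"receive-only"})))         # nothing is sent
print(len(stub_advertised(r2_routes, {"connected", "static"})))  # all ten are sent
```

Removing the stub configuration altogether, as done in the video, lifts the restriction entirely rather than widening the allowed set.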

00:16:43
So in R2 let’s go ahead and turn off the debug because we’re done with that. And we’ll do a config T, router EIGRP one, and we’ll say please turn off the receive only stub function on EIGRP. And it says, great, no problem. Happy to turn that off. Now that we’ve reestablished the neighborship between R2 and R1, those routes should all be flowing. Let’s go check that out up on R1. So we’ll go to R1. Let me give us a little visual separation here.

00:17:08
And let’s do a show IP route for the EIGRP learned routes. Hey, that’s improvement! So we’ve got five EIGRP learned routes. That’s terrific. But as I look at these routes, something seems odd. And the fact of the matter is each of these routes are odd numbered networks.

00:17:28
They’re odd numbered subnets– 21, 23, 25, 27, 29. Where are the even ones? Something is causing those routes– the even ones– not to be seen. And it’s very likely R2 not sending them or R1 not being willing to accept them and put them in the routing table.

00:17:46
So since we’re right here on R1, let’s take a look at the IP protocol. We’ll do a show IP protocols. And the part I want to focus on here is this piece right here. So if we had a distribute list that was set for inbound routes, that would be shown right here and there isn’t any here on R1. But it could be some type of a filter or distribute list down on R2. So let’s make a road trip down to R2. And on R2, let’s go ahead and give us a little bit of visual separation here.

00:18:16
And let’s turn on a debug again because I turned it off earlier. We’ll do a debug IP EIGRP. And then we’ll clear the EIGRP neighborships just so we can force it to happen. We don’t have to wait. So now that the neighborship has been cleared, we’ll give it a moment to come back up which should trigger all the routes being sent over to R1, or at least the routes that are being sent, we’ll be able to see the details behind it.

00:18:39
And that’s interesting. So here we have the odd route. It says go ahead and advertise, and advertise just the odd ones here. And then on the even ones, it says it’s been denied by a distribute list. So that’s absolutely what’s happening. The even routes are not making it out of the gate from Mr. R2. So let’s take a look at a show IP protocols which will also very quickly show us any distribute lists that are currently in place.

00:19:04
And here we go. Outgoing update filter list for all interfaces is one. And if we want to take a peek at that access list, it would look something like this. So the wildcard match says the first two octets have to match, so 172.16. And then regarding the third octet, it says I do not care about the first seven bits of that, but I do care about that last bit position, which based on this being a one and that last bit position also being on, only routes in the third octet which are odd are going to be permitted by this ACL.

00:19:36
And then, of course, the last octet we aren’t caring about. So that’s the distribute list that was currently being applied outbound on R2. Well that’s a pretty simple one. We can simply remove it or if we had other considerations we could edit that ACL to permit specific routes.
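The wildcard logic just described can be checked with a few lines of Python. The exact ACL entry is not read out in the video, so the base address 172.16.1.0 with wildcard 0.0.254.255 is an assumption that matches the description (first two octets exact, only the last bit of the third octet checked, fourth octet ignored). `wildcard_match` is a hypothetical helper.

```python
import ipaddress


def wildcard_match(address: str, base: str, wildcard: str) -> bool:
    """True if address equals base in every bit where the wildcard bit is 0."""
    addr = int(ipaddress.IPv4Address(address))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF  # bits that must match
    return (addr & care) == (ref & care)


BASE, WILDCARD = "172.16.1.0", "0.0.254.255"  # assumed entry for access list 1

permitted = [n for n in range(20, 30)
             if wildcard_match(f"172.16.{n}.0", BASE, WILDCARD)]
print(permitted)
```

A wildcard bit of 1 means "don't care", which is the inverse of how subnet mask bits work; that inversion is the usual source of confusion with entries like this one.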

00:19:52
In this case, let’s just go ahead and remove it because our goal is finding out and correcting why the missing routes aren’t showing up. So we’ll simply go back into router configuration mode and we’ll say please no distribute list one out Mr. Router. And once that’s done, all of the routes on R2, hopefully, unless there’s something else, all of those routes should be showing up over at R1. So let’s see whether or not those routes show up on R1. So up on R1, we’ll go back up there and just do a show IP route for EIGRP.

00:20:26
And sure enough, there we go. We have routes 20 through 29, which is 10 EIGRP learned routes that R1 now knows about because R2 advertised them. Now the next piece is, do those routes show up on R3? So let’s go down to R3 and see if those same routes show up over there.

00:20:46
Simple command, show IP route EIGRP, and it says great news, there are some of those routes. So we’ve got the 20, the 22, the 24, 26, 28, and 32. Now the 32 here, that’s a summary route going to null zero. So it appears that R3 is doing some manual summarization based on this.

00:21:09
It also appears there’s some kind of filtering going on. Now earlier we looked at R1. We did a show IP protocols, and it did not have an outbound filter list at all. So as a result, it’s very likely that R3 has some type of an inbound filter. A quick verification of show IP protocols will reveal if there’s any distribute lists involved.

00:21:30
And sure enough, there’s an inbound distribute list. And it’s also using access list number one. So we have two basic options. We could edit that access list and permit those additional routes or we could go ahead and simply remove the distribute list. Because our focus is on getting those routes, let’s just go ahead and remove that distribute list.

00:21:48
So to do that, we’ll go into router configuration mode and simply say no distribute list one in. And based on the output of show IP protocols, that should remove that distribute list. So it appears that R1 and R3 have had a little love chat, and now they have the additional routes.

00:22:04
Let’s see if those routes now show up on R3. We’ll simply do a show IP route for EIGRP learned information, and sure enough, there we go. One of the things I love about IOS 15 is that routes are ordered. If you look at older versions of IOS, it didn’t always put things in order for us.

00:22:21
In any event, there’s 20 all the way through 29. So there’s the 10 routes that R2 is sourcing. And then I’ve got this local summary. But you know what I don’t see? I don’t see the 172.16.10 network. That’s the subnet that R1 is sourcing. R2 saw it, but I don’t see it here in R3. And we do not have any distribute lists involved.

00:22:42
So now it gets interesting. Let’s go up to R1, and see if R1 has any of the routes from R3. So we’ll do a show IP route EIGRP. Again, there’s no distribute lists in place. We should have the subnets that begin with three in that third octet coming from R3.

00:23:00
And I don’t see any of those, nor do I see that summary, the manual summary that R3 has. All of those routes should be showing up here at R1. We’ve got the neighborship. We are exchanging information. R3 knows about the routes from R2, and that came through R1. What about the routes from each other, R1 and R3? So in R1, let’s take a closer look at show IP protocols and see what this could possibly be.

00:23:25
One of the things that’s not often realized by many people, and Anthony told us about it, was that there is a router ID inside of EIGRP. Now how does it select a router ID? It does it the same way OSPF does it, the highest loopback. If there’s no loopbacks, the highest IP address on any interface.

00:23:41
Or you can also configure the router ID manually in router configuration mode. So router one here has a router ID of 200.200.200.200. That’s just a loopback address on this router that it shows. Now why is that a big deal? Well, let’s say you’re running Anycast, and you have a couple of different devices that have the same exact loopback address and it happens to be the highest numbered loopback.

00:24:04
If that happens, you’re going to have multiple routers with the same router ID. And although they will become neighbors, and although they will share other people’s routes, a router will not accept a route that it has learned through EIGRP if the originating router has a router ID that matches its own.

00:24:25
It kind of freaks out, and says you know what, if this was sourced by my router ID, perhaps it really is my route that I sourced and there’s a loop somewhere. In any case, it will not install that route. So for example, let’s take a single route, let’s take the 20 subnet that R1 has learned from R2 that’s currently in the routing table for R1, do a show IP EIGRP topology for that specific route.

00:24:48
And in that route information, it says OK, the originating router is this. Now what is that? That is the router ID, the EIGRP router ID of R2. Because it’s different than the router ID in R1, it’s not a problem. However, if we make a road trip over to R3, and on R3 we do a show IP protocols, unfortunately, we’re going to see that the router ID on R3 is 200.200.200.200, identical to the router ID on R1. And that’s the reason that R1 just says no to any route sourced by R3. And vice versa, R3 says no, any route sourced by R1. Because it has that router ID as the originating router, neither one of them will buy into or be willing to implement each other’s sourced routes.
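The rule just described can be sketched in a few lines of Python. This is only a model of the behavior, not how IOS implements it: the class and function names are made up for illustration, R2's router ID of 2.2.2.2 and the /24 prefix lengths are assumptions, and only the 200.200.200.200 value comes from the demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EigrpUpdate:
    """A learned route plus the router ID of the router that originated it."""
    prefix: str
    originating_router_id: str


def accepts_route(local_router_id: str, update: EigrpUpdate) -> bool:
    """Refuse any route whose originating router ID equals our own."""
    return update.originating_router_id != local_router_id


R1_ROUTER_ID = "200.200.200.200"  # from show IP protocols on R1

# Hypothetical values: R2's router ID and the prefix lengths are assumed.
from_r2 = EigrpUpdate("172.16.20.0/24", "2.2.2.2")
# R3 shares R1's router ID, so R1 will not install this one.
from_r3 = EigrpUpdate("172.16.30.0/24", "200.200.200.200")

for update in (from_r2, from_r3):
    verdict = "install" if accepts_route(R1_ROUTER_ID, update) else "reject"
    print(update.prefix, verdict)
```

The same check runs in the other direction on R3, which is why neither router installs the other's sourced routes even though the neighborship itself is up.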

00:25:40
And that’s a problem. So to fix that, we just need to change the EIGRP router ID on either R1 or R3 because at the moment, they both have a loopback with a high numbered address. So to fix that, since we’re here at R3, let’s do it here. We’ll go into router configuration mode, and simply say EIGRP router ID, and then just make up a 32-bit number, one, two, three, four will work.
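As a rough sketch of the selection order described earlier (manual setting first, then highest loopback, then highest address on any other interface), here is a small Python model. The function name and the sample addresses other than 200.200.200.200 and 1.2.3.4 are hypothetical, and the model ignores details such as whether an interface is up.

```python
import ipaddress
from typing import Optional, Sequence


def select_router_id(
    loopbacks: Sequence[str],
    other_interfaces: Sequence[str],
    manual: Optional[str] = None,
) -> str:
    """Pick a router ID: manual value, else highest loopback, else highest interface."""
    if manual is not None:
        return manual
    candidates = loopbacks if loopbacks else other_interfaces
    if not candidates:
        raise ValueError("no addresses available to derive a router ID")
    # IPv4Address objects compare numerically, so max() finds the highest address.
    return str(max(ipaddress.IPv4Address(addr) for addr in candidates))


# Before the fix: the high-numbered loopback wins on R3, just as it does on R1.
print(select_router_id(["200.200.200.200", "3.3.3.3"], ["10.23.0.3"]))

# After the fix: the manually configured value overrides everything else.
print(select_router_id(["200.200.200.200", "3.3.3.3"], ["10.23.0.3"], manual="1.2.3.4"))
```

Setting the value by hand on one router is enough here, because the only requirement is that the two router IDs differ.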

00:26:03
And poof, we’re done. As soon as this neighborship has their little love chat, now they should both be willing to accept and implement the routes from each other because the router ID is different. So let’s go back up to R1. We could do it at R1 or R3. Let’s just do a show of the routes for EIGRP with a show IP route EIGRP.

00:26:23
So now we have the 30 network, the 31, and this summary that’s being advertised over from R3. Now if the summary is OK, great. It accounts for the routes that hide under that summarization. However, if we wanted the details, we could remove that summarized route off of R3, again, if we wanted each of the detailed routes.

00:26:42
Because we are looking for all the routes, let’s go ahead and take off that summarization on R3. We’ll do an interface config for a serial 3/3. That’s where the show IP protocol says it was being implemented. And we’ll simply say no. No to the summary address for EIGRP one, for the 172.16.32.0 network with the appropriate dotted decimal mask.

00:27:04
Now that that is done, if we go back up to R1, and we issue the same command of show IP route EIGRP, we now have our detailed routes which include 32 all the way through 39 which previously were hiding beneath a summary. So now if we do a show IP route summary like we did at the beginning of this verification and validation and remediation of these missing routes.

00:27:27
So now we’ve got a full network that’s this guy right here, classful 192.168.33 and 20 subnets, that’s 10 from R2 and 10 from R3. So in answer to the question that Anthony posed, where, oh where have my EIGRP routes gone? They were hiding behind summaries, we had distribute lists, we had stubs, we had matching router IDs.

00:27:49
And there’s one other place that we also could’ve looked if we were still missing some routes, and that’s right here. In our topology, if we are running frame relay, and we have a multi-point interface like R1 does, we’d also want to make sure that for EIGRP we want to disable split horizon.

00:28:08
That’s the attitude of learning routes on a single interface, and not being willing to advertise them out of that same interface. We have had a great time in demonstrating the verification and remediation of missing EIGRP routes. On behalf of Anthony and myself, we hope this has been informative for you, and we’d like to thank you for viewing.

OSPF Refuses to Neighbor!!!

00:00:01
I just got a call from a level two engineer who says the OSPF routers will not form adjacency and they asked for our help. Let’s begin. Our objective for you and I in this video is simple. We want to identify what would prevent two routers like R1 and R2 from becoming fully adjacent OSPF neighbors. So we probably ought to start with that term right there.

00:00:23
What does it mean to have a full adjacency? It simply means that the two OSPF routers are prepared, willing and able to exchange link-state updates with each other. And it’s also important to note that it is sometimes OK not to have a fully adjacent neighborship.

00:00:38
For example, on the same network if we had R3, R4, and R5 and we were all connected to the same common network, one of these devices would be a designated router, one would be a backup designated router. And then as an example, R1 and R5 would not become fully adjacent with each other.

00:00:58
Because they don’t need to. They would each have adjacencies with the designated router and the backup designated router. And that’s how they’d do their exchanges of the link-state updates on this network segment. But here’s what you and I get to do, we’re going to use this topology right here where there’s only two routers on each segment.

00:01:13
So we’re looking for full adjacencies between router 1 and router 2. And full adjacencies between router 2 and router 3. Also for our discussion, let’s have all of the interfaces connected to R1, they’re going to be in area 1. And we’ll have this network segment right here between R2 and R3, that’ll be area 0. Quite often when I think about OSPF adjacencies and what’s required to pull it off, I think of a homeowners association.

00:01:38
So for example, on a street or in a community all the homes that want to live on that street have to agree or abide by the principles of the homeowners association. And with OSPF, if we have a device for example that does not agree with all the parameters, it doesn’t get a fine or something else like that like we would with the homeowners association.

00:01:58
It simply is not allowed to become a neighbor on that street. Or in this case, on that network segment. So let’s take a look at the laundry list of items that have to be in agreement for two devices, two OSPF routers to become fully adjacent. They have to agree on what the actual street name is, the subnet.

00:02:14
And that includes the network address, as well as the mask length. The area, if router 1 thinks that this network is area 0, and the other router thinks it’s area 1, that’s going to be a problem. The timers for the hello and dead interval have to match. We have to have unique router IDs on each router.

00:02:31
So if we have two routers in an OSPF area and they are both for whatever reason using the same exact router ID, those guys are not going to neighbor up. They need to agree on what type of area they’re connected to. Is it a normal area or a stub area, or a not-so-stubby area? The MTU for the actual frames has to match up.

00:02:50
We do have options where we can tell OSPF, hey, you know what we don’t care about that. But by default it does care about the MTU size. For security reasons if we’re implementing authentication, we need to make sure that each device on the networks they’re connected to agrees on two things.

00:03:05
What type of authentication? And then, if we’re using a password we also have to have the correct password configured. With OSPF we have three authentication types: 0, 1, and 2. Type 0 is called null authentication, or no authentication required. Type 1 is simple text and type 2 is MD5. Another item that we need to agree on is to have a compatible network type.

00:03:25
Effectively it means, do we need to have a designated router for this segment? For example, a broadcast or non broadcast network type. They’re both looking for a designated router role on that network segment. On the other hand, a point to point or point to multipoint network type doesn’t require a designated router.

00:03:44
And there is a teeny bit of leniency in mixing and matching network types as long as you have two network types that both agree on whether a DR is needed or not. However, what we’re going to find is if we have two different network types that both agree on not to use a DR for example, point to point and point to multipoint, the timers may not be exactly the same.

00:04:05
So if we don’t use the same exact network type, we might have to go and manually manipulate the timers to get it to work correctly. At the end of the day, it’s a lot simpler to simply just use the exactly correct network type for those two devices. Here’s what you and I get to do, we’re going to take this topology and our objective is simple.

00:04:23
We want to have full reachability between the loopback on R1, which is 1.1.1.1, and the loopback on R3, which is 3.3.3.3. And our focus is to fix the adjacency, the full adjacency issues that we’re currently having between R1 and R2, and between R2 and R3. This topology is also available inside of the Nugget Lab files for this video.
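Before heading to the command line, the laundry list from the last few minutes can be written down as a small Python sketch. Treat it as a study aid rather than a model of the real protocol: the class, the field names, and the sample addresses are all made up for illustration, and the option to have OSPF ignore the MTU is not modeled.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import List

# Network types that expect a designated router on the segment.
DR_NETWORK_TYPES = {"broadcast", "non-broadcast"}


@dataclass(frozen=True)
class OspfInterface:
    router_id: str
    address: str       # interface address with prefix length, e.g. "192.0.2.1/24"
    area: int
    area_type: str     # "normal", "stub", or "nssa"
    hello: int
    dead: int
    mtu: int
    auth_type: int     # 0 = null, 1 = simple text, 2 = MD5
    network_type: str
    auth_key: str = ""


def adjacency_blockers(a: OspfInterface, b: OspfInterface) -> List[str]:
    """List every checklist item on which the two ends disagree."""
    problems = []
    if ipaddress.ip_interface(a.address).network != ipaddress.ip_interface(b.address).network:
        problems.append("subnet/mask")
    if a.area != b.area:
        problems.append("area")
    if a.area_type != b.area_type:
        problems.append("area type")
    if (a.hello, a.dead) != (b.hello, b.dead):
        problems.append("timers")
    if a.router_id == b.router_id:
        problems.append("duplicate router ID")
    if a.mtu != b.mtu:
        problems.append("MTU")
    if a.auth_type != b.auth_type or a.auth_key != b.auth_key:
        problems.append("authentication")
    if (a.network_type in DR_NETWORK_TYPES) != (b.network_type in DR_NETWORK_TYPES):
        problems.append("network type")
    return problems


# Hypothetical pair of routers that disagree on area type and timers only.
left = OspfInterface("10.0.0.1", "192.0.2.1/24", 1, "stub", 10, 40, 1500, 0, "broadcast")
right = replace(left, router_id="10.0.0.2", address="192.0.2.2/24",
                area_type="normal", hello=11, dead=44)
print(adjacency_blockers(left, right))
```

An empty list is the goal. Any single entry is enough to keep the two routers from becoming fully adjacent, which is why the troubleshooting that follows works through the items one at a time.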

00:04:46
Let’s begin our journey on R1 by just taking a look and seeing if there’s any current OSPF neighborships to begin with. And then we’ll work from there. So we’ll do a show IP OSPF neighbor. And survey says, we got nothing. One thing we might want to check is basic connectivity.

00:05:03
Now, we could do a show CDP neighbors. See which switch we’re connected to. And do that on R2 as well. But another really good way to validate we have basic connectivity would be to run traceroute to the IP address, in this case to R2. Now, if that works it’ll verify, one, that we have connectivity.

00:05:19
And secondly, it’ll also verify that we’re not going through multiple hops to get to that destination. So in this case, it’s great. We have direct connectivity to 10.12.0.2, which is the IP address of R2. Another thing we may want to verify is that OSPF is enabled on our interfaces, as well as the interfaces of R2. One way of checking a neighbor’s interfaces to see whether they’re enabled or not is to do a ping to the multicast address used for OSPF hellos.

00:05:49
So here we did a ping of 224.0.0.5 and we got a reply back from ourselves. That’s kind, isn’t it? And more importantly, we got a response back from 10.12.0.2. So we know that router 2 is enabled for OSPF and listening because he replied to our ping request.

00:06:07
It would also be really important for us to verify that R1’s interfaces are also enabled for OSPF. We can do that very simply with a show IP OSPF interface brief. And that’ll give us a Reader’s Digest overview of which interfaces are enabled for OSPF. And Houston, we have a problem.

00:06:25
We have loopback 0, which is fine. So we have the process ID for OSPF is 11, the loopback is in area 1. However, this shows us that the interface gig 1/0 on R1 is currently not enabled for OSPF. Now, it could be enabled a couple different ways. We could enable it in interface configuration mode or in router config we could add the appropriate network statement.

00:06:48
So let’s validate. There’s a couple ways of verifying this. Let’s do a show IP protocols. And that’ll show us the network statements currently in OSPF. So here we have a network statement of 1.1.1.1. And that’s in area 1, so that’s good for loopback 0. But we’ve got nothing for interface gig 1/0. And also just for grins, let’s take a look at the running config, but just the router protocol section for OSPF.

00:07:11
And that’s yet another way to validate the network statements inside of router OSPF. So here we have our network statement, which is including loopback 0. We also have area 1 as a stub. And that’s not a problem, as long as router 2 also agrees that area 1 is a stub area. That shouldn’t pose a problem.

00:07:29
So let’s go in and we’ll simply include gig 1/0 into OSPF. Now, we could use an interface command. Or we could use a network statement. And for this demonstration, let’s go ahead and simply use a network statement. So we’ll go into configuration mode, into router OSPF 11, based on what we looked at earlier.

00:07:45
And then, add network 10.12.0.0 with a one-octet wildcard mask, putting it in area 1. A common mistake is this: if in practice we always use, for example, OSPF process ID 1, and we go to make a change to an existing environment, typing router OSPF 1 would create a new process rather than modify the existing process.

00:08:06
So just pay attention to what the current OSPF process ID is. And then make sure on that router that you’re modifying the correct router process. So now that we’ve made a change, let’s do two things. Let’s validate whether or not that interface gig 1/0 made it into OSPF.

00:08:21
And if it did, let’s also verify and see if we have any OSPF neighbors. So we have some good news. We have the interface now included in OSPF, however, we have no OSPF neighbors showing right here. And now we’re one step closer to a working fully adjacent OSPF neighborship.
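As a side note on why that network statement picked up gig 1/0, here is a minimal Python sketch of wildcard matching. The function name is made up, and the wildcard of 0.0.0.255 is an assumption about what "one octet" meant in the demonstration. A 0 bit in the wildcard means "this bit must match" and a 1 bit means "don't care".

```python
import ipaddress


def network_statement_matches(interface_ip: str, network: str, wildcard: str) -> bool:
    """Return True if the interface address falls under a network/wildcard pair."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    wc = int(ipaddress.IPv4Address(wildcard))
    care_bits = ~wc & 0xFFFFFFFF  # invert the wildcard to get the bits that must match
    return (ip & care_bits) == (net & care_bits)


# gig 1/0 on R1 (10.12.0.1) against the new statement; wildcard value is assumed.
print(network_statement_matches("10.12.0.1", "10.12.0.0", "0.0.0.255"))

# The loopback does not match the new statement; it needs its own.
print(network_statement_matches("1.1.1.1", "10.12.0.0", "0.0.0.255"))
```

Note that the wildcard in a network statement only selects which interfaces join the process. It says nothing about the mask actually configured on the interface.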

00:08:38
We’re not there yet, but we’re well on our way. Another item that has to be agreed to by the two OSPF routers that want to become fully adjacent neighbors is the type of area. So we already saw in the configuration here on R1 that area 1 is a stub area. There are a couple of different ways of verifying that.

00:08:56
So this command, show IP OSPF would give us the output to validate the type of areas. And all I did is I used the pipe symbol. And I said, please begin the output where the line says area 1. So here’s area 1 and there’s the output from that point down. So I’ve got two interfaces in that area.

00:09:12
It says it’s a stub area. And currently it has no authentication. So let’s do this, let’s go over to R2. We’ll make a little road trip over to R2. And see what type of area 1 R2 believes it is. So let’s validate area 1 on R2, to make sure that the area type is exactly the same.

00:09:29
And we do that with the same command. Show IP OSPF. And again, I’m just going to restrict the output to where it begins with the word area 1. So here’s what I don’t see, what’s missing here is it doesn’t say that area 1 is a stub. So that’s a problem. I also see that this area on R2 says we’re using simple password authentication.

00:09:48
And over on R1 it says area 1 is not using any type of authentication whatsoever. So that my friend is two additional things that we are going to have to correct for R1 and R2 to become fully adjacent. Now, before I rush in and we start making changes to the OSPF process.

00:10:04
I just want to validate what the OSPF process that’s currently running on R2 is. So what I’m going to do is show IP OSPF interface brief. And this will validate the interfaces that are currently enabled for OSPF. In addition to what process ID I’m using and the area those interfaces are assigned to.

00:10:20
So on router 2, we’re going to pay attention to make sure we modify the correct OSPF process of 22 as we make these changes. To validate the OSPF process ID, we also could have done show IP OSPF; that also would have shown us what the process ID is. Now, I suppose the question is, who’s right? Is R1 right that it should be a stub area and R2 is wrong? Or is R2 right that it should not be a stub area and R1 is wrong? So based on the scenario that we’re given and the details that we’re provided or a diagram that’s given, we would want to flesh that out and verify who’s right.

00:10:54
For the purpose of this demonstration, let’s say that area 1 really should be a stub area. So that case, R2 is going to need to change. So we’ll go into configuration mode on R2. We’ll go into router OSPF 22. And we’ll say, area 1 stub and make that change. Now, area one is a stub on R2, as well as on R1. It’s usually a great idea to validate that the changes you made actually took place in the configuration.

00:11:20
So let’s do this. Let’s do a show IP OSPF again just to validate that area 1 is now a stub area in the mind of R2. And this output confirms that that’s the case. So if that was the last weak link that was preventing our fully adjacent OSPF neighborship, we should now have neighbors.

00:11:39
Now, I didn’t see any console messages pop up that indicate a neighborship came up. But what we do to validate that is to do a show IP OSPF neighbor and hope for the best. And survey says, no neighbors. So we still have some type of a problem with OSPF adjacencies.

00:11:56
One of the items that we mentioned that has to match for OSPF neighbors to become fully adjacent is that the network and mask both have to be in agreement on both sides. Now, we already know that we could traceroute from one device to the other. So we’ve established that there’s basic connectivity.

00:12:11
However, we haven’t validated what the actual mask lengths are on both devices. So let’s do a show IP OSPF interface brief on R2 and take a look. So on gig 1/0 it has the IP address of 10.12.0.2 with a mask of 24 bits. So let’s go over to R1 and take a look at that same information.

00:12:32
Just to validate the network and the mask on R1 as well. And what this says is gig 1/0 is 10.12.0.1 with a 23-bit mask. And my goodness, what a difference one bit in that mask length can make. Because that’s a requirement for OSPF, if one has a mask of 23 like R1, and the other has a mask of 24, they will not neighbor up.
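To see the mismatch concretely, here is a short Python sketch using the standard ipaddress module with the two addresses from the demonstration. The helper name is made up for illustration.

```python
import ipaddress


def same_subnet(a: ipaddress.IPv4Interface, b: ipaddress.IPv4Interface) -> bool:
    """Both ends must agree on the network address and the mask length."""
    return a.network == b.network


r1 = ipaddress.ip_interface("10.12.0.1/23")  # gig 1/0 on R1 as found
r2 = ipaddress.ip_interface("10.12.0.2/24")  # gig 1/0 on R2

print(r1.network, r2.network, same_subnet(r1, r2))

# R2's address still falls inside R1's larger subnet, which is why basic
# reachability tests such as traceroute worked despite the mismatch.
print(r2.ip in r1.network)

# After correcting R1 to a 24-bit mask, the two ends agree.
r1_fixed = ipaddress.ip_interface("10.12.0.1/24")
print(same_subnet(r1_fixed, r2))
```

This is the reason a successful ping or traceroute does not prove the masks agree: IP connectivity only needs each side to see the other as on-link, while OSPF compares the mask itself.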

00:12:56
So let’s go ahead and correct that here on R1. I also want to point out, it’s really important before we start making changes we might want to have a backup copy of the configuration before we started working with it. That way if we need to restore something or see how it was originally, we can always go back and take a look at it.

00:13:12
Because we saved it for example, in Notepad or something similar. So now it has a 24-bit mask. Let’s go ahead and just validate that. We’ll do a show IP OSPF interface brief, just to validate the slash 24. We’ll also check and see whether or not we have the new OSPF neighbors.

00:13:28
Now, in the absence of a console message indicating that the neighborship came up, it’s very doubtful that we have one. And based on the results of the command, show IP OSPF neighbor, we don’t have that neighborship yet between R1 and R2. So we already have an indication that there might be some type of authentication problem or issue.

00:13:46
Let’s just validate that here on R1 and R2. We’ll do a show IP OSPF and we’ll focus on just the contents where it says area 1 and below. So we’ll do a show IP OSPF. And so this says for area 1, that it’s a stub area, and has no authentication. Let’s take a look at those same results over on R2. So over on R2 we’ll do a show IP OSPF again, just focusing on area 1 and the output below that. And that indicates that this is a stub area.

00:14:14
And that it’s using simple password authentication. So 0 represents null authentication, which means it’s not required in the area. 1 equals simple, which is plain text. And 2 is using MD5. And the reality is even if we don’t actually configure any passwords whatsoever, if they don’t agree on the type of authentication null, simple, or MD5, they won’t become neighbors.

00:14:38
And they won’t become adjacent with each other if they can’t agree for that area whether or not authentication should be set. So we have a couple of choices. Number 1 we could go ahead and add it to R1. Or two, we could remove it from R2. And based on the scenario that we’re in, maybe there’s a topology diagram or instructions.

00:14:56
We’d go ahead and follow whichever direction that was leading us. For this demonstration, let’s take off the requirement for a simple authentication off of area 1 on router 2, and that’ll make it match up with R1. So we’ll go into configuration mode. Router OSPF 22, which is the process ID we’re running on R2. And we’ll say, no area 1 authentication. And let’s see if there’s any neighbors.

00:15:17
I didn’t see a message pop up on the console indicating that a neighborship came up. But just to be sure, we’ll do a show IP OSPF neighbor. So it appears my friend, that you and I have at least one more issue to deal with before these guys will become adjacent with each other.

00:15:34
Another item on our list of things that have to agree is the MTU. So let’s validate what the MTU is on R2 and also R1 for the gig 1/0 interface. Now, on Ethernet, the MTU is set by default to 1,500 bytes. And it appears here that we are at 1504. Now, is that a show stopper? The answer is no, not necessarily.

00:15:57
If R1 is also using 1504, the MTUs would match and we would be in good shape. So let’s go over to R1. We’ll issue that same exact command of show interface gig 1/0. And I’ve piped it to only include the line that shows MTU. And sure as shooting, this says 1500. It is the default.

00:16:16
So R2 has a different MTU. And because it doesn’t agree with R1, that would prevent R1 and R2 from becoming neighbors. So let’s go over to R2 and let’s remove any manually configured MTU on interface gig 1/0. And that would set it back to the default. And hopefully it’ll match.

00:16:35
We also want to validate what that MTU is. So we’ll do a show, interface gig 1/0. And simply double check that now the MTU does say 1,500 bytes for that interface. The other option we could have done, we also could have told OSPF to not care about the MTU being a little different.

00:16:52
And that would be another option for dealing with MTUs that didn’t exactly match between would-be OSPF neighbors. So I still don’t see the message indicating that an OSPF neighborship is being formed between R1 and R2. So I’m presuming it’s not happening yet.

00:17:08
Let’s take a look at yet another parameter that needs to match. And that is the timers on the interfaces of gig 1/0 on each of the routers. So let’s do a show IP OSPF interface for gig 1/0 right here on R2 to see what the details of that are. Now, from this output it’s showing us lots of information, including the router ID, the IP address, the network type, and the intervals.

00:17:32
So here’s the hello interval of 11. And the dead interval is four times the hello interval by default. And so it appears that someone has changed or modified the default behavior. Because the default hello timer on an Ethernet broadcast network should be 10. So is this a problem? Well it could be if R1 is using the default and here on R2 we’re not using a default. Yes, that would be a problem.

00:17:55
So let’s make a road trip over to R1 and take a look at what the timers are on R1. We’ll do a show IP OSPF interface for gig 1/0. And sure enough, we have a hello interval of 10 with a dead interval of 40. And those are not the same as R2. And that is yet another problem causing our OSPF neighborship not to be formed.
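The arithmetic behind that mismatch is simple enough to sketch in Python. The function names are made up for illustration. The hello values of 11 and 10 come from the demonstration, and the dead interval is taken as four times the hello when it has not been set separately.

```python
from typing import Optional, Tuple


def effective_timers(hello: int, dead: Optional[int] = None) -> Tuple[int, int]:
    """Return (hello, dead); the dead interval defaults to four times the hello."""
    return (hello, dead if dead is not None else 4 * hello)


def timers_match(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Both the hello and the dead interval must be identical on the two ends."""
    return a == b


r2 = effective_timers(hello=11)  # someone changed the hello on R2
r1 = effective_timers(hello=10)  # R1 is still at the Ethernet default

print(r2, r1, timers_match(r2, r1))

# Removing the configured hello on R2 puts it back at the default of 10.
print(timers_match(effective_timers(hello=10), r1))
```

A one-second difference in the hello is all it takes, because the comparison is an exact match rather than a tolerance check.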

00:18:17
So let’s go over to R2 and we’ll remove any administratively configured timers on gig 1/0. So we’ll go into interface config for gig 1/0. And we’ll simply say no IP OSPF hello interval. So that’ll remove the administratively configured ones and allow the default ones to go back in their place.

00:18:36
And look at that, we have a neighborship. Or at least what appears to be a neighborship. So let’s validate we have the neighborship with show IP OSPF neighbor. That looks great. So from the output here, it says we have a full adjacency with that neighbor.

00:18:50
And that neighbor, which is R1 is a BDR. He is acting as the backup designated router. And because there’s only two routers on the segment, that implies that we ourselves are the designated router for this segment. And one quick way to verify that is with a show IP OSPF interface brief.

00:19:07
And that will show us each of our interfaces, that they’re enabled with OSPF. And for our purposes right here, what our current state is. So gig 1/0 is currently the designated router. A full output of show IP OSPF interface would also reveal that same information.

00:19:23
Now, just to validate that we really are receiving link-state advertisements from this neighbor, we can do a show IP route. Just the OSPF learned routes, please. And if we have any OSPF routes learned from that neighbor, they’re going to show up right here.

00:19:37
So that’s great. That’s the loopback address that’s in R1. We’re learning it via OSPF. It’s on to troubleshooting the relationship between R2 and R3. So on R2, we already know that serial 2/0, the interface that’s heading through the network over to R3, is enabled for OSPF.

00:19:56
There’s the actual IP address and the mask. And the network type is point to point. And currently we have absolutely zero neighbors. What I’d also like to do is take a look at the details for that interface with show interface serial 2/0. And I just want to take a quick look at what type of encapsulation we are using.

00:20:13
Is it PPP, or HDLC, or something else? And it appears based on results, it is something else. We’re on a frame relay network. And that’s important to be aware of. Because there’s lots of variables that come into play with frame relay. For example, with frame relay we can have subinterfaces that could be point to multipoint, or point to point.

00:20:33
And we’re going to need frame relay mappings if it’s not point to point. And so all of that kind of comes into play. So what I’d like to do just to make sure now that we know it’s frame relay, let’s make sure we have a current and working frame relay map that tells the local router how to reach the IP address of R3. So to verify the frame relay mapping, whether or not it was dynamically created, and whether broadcast is supported on that mapping, we’ll do a show frame relay map.

00:21:00
And this output shows us that it was dynamically built. So that was done very likely through inverse ARP. We didn’t have to manually put it in. It supports broadcasts. So if we have a routing protocol like OSPF, or EIGRP, or RIP version 2 that uses multicast, those would be allowed over this circuit.

00:21:18
We also know that the local PVC, the local DLCI identifying our end of the PVC, is DLCI number 23. And if we use DLCI 23, that PVC, we can use it to reach the IP address of 10.23.0.3 at the far side. So using local DLCI 23, we can reach the remote IP address of 10.23.0.3. Let’s also just validate that we can reach that peer with basic IP.

00:21:44
If layers one and two aren’t happy and we don’t have reachability, there’s no way that OSPF neighborships can form. So we’ll do a simple ping to that IP address and that looks 100% successful. And that’s a great start in building an OSPF relationship: having IP connectivity between the two devices.

00:22:00
So next, let’s take a closer look at the OSPF specifics on interface 2/0. And that’s the interface connecting out to the frame relay cloud. Now, let’s take a look at the details of what this is revealing to us. It shows us the OSPF network type is point to point.

00:22:15
Now, what that boils down to is a couple things. Number one, the timers. We have some default timers for a point to point network. As well as point to point says, we are not going to use a designated router for this network segment. Another network type that also does not use a designated router is a point to multipoint.

00:22:33
And in our checklist, I mentioned that we have to have compatible network types. As long as both routers are not expecting to use a designated router. And the reality is, we can use slightly different network types. For example, on one end we could use point to point.

00:22:47
On the other end, we can use point to multipoint. Because they both don’t want to use a designated router. However, we’d also want to make sure that the timers are identical on those two neighbors. Otherwise, if the timers are different, they will not neighbor up.

00:23:02
So let’s make a road trip over to R3. And take a look at R3. Let’s do a show IP OSPF interface brief. Just so we can verify that OSPF is enabled on its interfaces. And it is. So we have serial 1/0 and loopback 0, which is great. And what I notice over here is we have P2MP. So that might be OK.

00:23:22
Point to multipoint also is not looking to have a designated router on the network segment. And if we want to take a closer look at that interface, we’ll do a show IP OSPF interface for serial 1/0. And sure enough just as a confirmation, the network type is point to multipoint, which also does not require or is not looking for a designated router on the segment.

00:23:44
However, check this out, the timers are significantly different than point to point. So we have a couple options of solving this. Number one, we could go to this router interface and say this is an IP OSPF network type of point to point just like R2. And that would also modify the timers and that would be great.

00:24:02
Or we could leave this as point to multipoint and just adjust the timers to match the other side. And the secret is because both network types– point to point and point to multipoint– because they both do not use a designated router, as long as we get the timers matched up it’s possible to make that work.

00:24:21
Another really cool option to help validate what’s going on with timers, is a debug IP OSPF hello. And what that’ll do is it will actually give us the details of what’s really going on inside of those hello messages. So let me go ahead and turn those debugs off for a moment.

00:24:35
I’ll do an undebug all. And right here what it’s saying is that we have a configured value of 30, but we received a value of 10 from the other side. And for the dead interval, we have a configured value of 120 and we’re receiving a value of 40. And because those timers don’t match up, that would prevent that OSPF adjacency from forming.
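Those numbers line up with the default timers that go with each network type. Here is a small Python sketch of the idea, offered as a simplified model and not the protocol's actual logic: the table lists the usual Cisco defaults, the function name is made up, and "compatible" here only means what this video means by it, that both ends agree on whether a designated router is used.

```python
from typing import List, Optional, Tuple

# network type -> (default hello, default dead, uses a designated router)
NETWORK_TYPE_DEFAULTS = {
    "broadcast": (10, 40, True),
    "non-broadcast": (30, 120, True),
    "point-to-point": (10, 40, False),
    "point-to-multipoint": (30, 120, False),
}


def adjacency_problems(
    local_type: str,
    remote_type: str,
    local_timers: Optional[Tuple[int, int]] = None,
    remote_timers: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Compare DR expectation and timers; explicit timers override the defaults."""
    l_hello, l_dead, l_dr = NETWORK_TYPE_DEFAULTS[local_type]
    r_hello, r_dead, r_dr = NETWORK_TYPE_DEFAULTS[remote_type]
    l_timers = local_timers or (l_hello, l_dead)
    r_timers = remote_timers or (r_hello, r_dead)
    problems = []
    if l_dr != r_dr:
        problems.append("DR expectation")
    if l_timers != r_timers:
        problems.append("timers")
    return problems


# R3 (point to multipoint, 30/120) facing R2 (point to point, 10/40).
print(adjacency_problems("point-to-multipoint", "point-to-point"))

# Leaving the network type alone and setting R3's timers to 10/40 by hand.
print(adjacency_problems("point-to-multipoint", "point-to-point", local_timers=(10, 40)))
```

Either fix clears the list: changing R3's network type to point to point brings the 10/40 defaults with it, while keeping point to multipoint means the timers have to be set by hand.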

00:24:54
So now our options include either setting the network type appropriately on the serial interface. Or we can just manipulate the timers. And I think just for fun, let’s go ahead and manipulate the timers. So for example, there might be a scenario where it’s saying, you must have this network type.

00:25:09
This is yet another way to get around some kind of wacky requirement like that. Just by making sure the network types are compatible. And then making sure the timers match exactly on each of them. So if this works, it should bring that neighborship up, which it does.

00:25:24
And now we can verify that we do have a full adjacency. Now, how do you do that? One way would be to take a look at the OSPF routes that we’re learning. If we are learning routes via an OSPF neighbor, we know we have a full adjacency. So let’s do a quick verification of that with a show IP route via OSPF.

00:25:42
And we do have some routes that are learned. We’ve got the one network that’s the loopback on R1, we’ve got the loopback on R2. And we have the network between R1 and R2, all learned via OSPF over our serial 1/0 interface from our neighbor R2. Another way to verify that our entire OSPF network has converged, or at least between R1 all the way through R3, is to do a ping of the loopback on R1 and source it from our loopback on R3. That will also validate that R1 has a route back to 3.3.3.3. I have had a great time.

00:26:17
I really appreciate you joining me. Isn’t it amazing how many different ways OSPF can be broken so that OSPF neighborships will not form? In this video, we’ve taken a look at most of the common reasons why OSPF neighborships won’t form. And we’ve worked our way through them together.

00:26:34
The actual topology diagram used in this video, along with the commands for troubleshooting, are both included inside the NuggetLab files for this video. On behalf of Anthony and myself, we hope this has been informative for you. And we’d like to thank you for viewing.

Where are My OSPF Routes?!?!

00:00:00
Troubleshooting Missing OSPF Routes. Let’s begin. Our objective for you and I in this Nugget is really simple– we want to identify some reasons why a perfectly healthy OSPF route isn’t showing up in the routing table on various OSPF speakers. And then secondly, we’ll go to the command line, and walk through a couple troubleshooting steps to help reinforce those concepts.

00:00:23
To get the ball rolling, let’s start by drawing a ball. There we go. And let’s say that that ball is on a hill. Now let’s say it’s also on planet Earth with normal gravitational forces. What is going to happen with this ball? Now the answer you might say is, well, Keith, it’s going to roll downhill.

00:00:39
And that’s what we’d expect. However, what would happen if we never had the ball to begin with. It’s not going to roll downhill. And that’s one reason, by the way, that OSPF routes aren’t showing up. They might never have made it into OSPF in the first place.

00:00:55
For example, we could have interfaces that are shut down. So you have OSPF enabled for an interface, but the interface is down, so the network that the interface is connected to isn’t going to make it into OSPF. We might have missing network statements, or if you’re using interface configuration commands to bring interfaces into OSPF, perhaps they are not present.

00:01:14
That would be another reason why that ball never got rolling with that network. Or perhaps we’re doing some type of redistribution. We’re taking routes from another source. For example, maybe we’re taking static routes or a different routing protocol on an autonomous system boundary router.

00:01:29
We are redistributing those into OSPF. For example, like right here. Let’s say we have the 7.7.7.7 network. It’s in EIGRP, and we’re redistributing it. If we don’t do that correctly– again, that’s another reason why the route may never get into OSPF. And if it never gets into OSPF, it’s not going to make it too far across the whole OSPF network.

00:01:50
So now let’s turn our attention to a route that does make it into OSPF, but isn’t correctly being shown on all the routers. Why isn’t it making it into the routing table? What could cause that? We can use distribute lists, very much like on other routing protocols, where we have an access list and we apply it as a filter. In OSPF, because we have link-state advertisements that get propagated– for example, here is area zero, here’s area zero, here’s area zero– the same LSAs are going to be flooded to all the routers that are connected to area zero.

00:02:19
They’re all going to see it. However, if we have a distribute list in OSPF, it can prevent a route from making it into the local routing table on the local router. So as an example, even though R4 has seen the link-state advertisements regarding the network 1111, a distribute list could prevent that 1111 network from making it into the routing table on the local router.

00:02:42
Another crazy thing that can happen– and don’t shoot the messenger, but it’s quite possible– is that we could have incompatible network types between neighbors, and– this is the bad part– it appears that we have a full adjacency with our normal show commands, but in reality, for example, let’s say that one side is a point-to-multipoint and the other side is set to, for example, non-broadcast.

00:03:05
If that’s the case, on the non-broadcast side, we might use a neighbor statement that’s pointing to this neighbor. And this guy believes that we should be using a designated router. On a point-to-multipoint, he thinks that we should not use a designated router. Another challenge could be simply an improper OSPF network design.

00:03:22
What do you mean, Keith, an improper network design? Well, for example, in OSPF, we have area 0, which is great. And then we can have other areas, as well. So let’s say you have area 1 and area 2 and area 3, no problem. And let’s say we have also area 4 right here. There’s nothing wrong with an area 4, but the reality is that every single area must have an area border router to get to the backbone.

00:03:48
So if we put some routers here, for example, router 1 and router 2 and router 3 and router 4, it’s easy to see the area border router for area 1 is R1, the area border router for area 3 is R2, the area border router for area 2 is R3. And let’s answer the question, who is the area border router for area 4? The answer is, there’s only one router there.

00:04:15
It’s R4, and he is not connected to the backbone. So area 4 does not have an area border router. And for that reason, the networks inside of area 4 would not be successfully propagated across this OSPF network. Another reason that we might have some OSPF routes that don’t show up is perhaps on purpose or by design.

00:04:33
For example, we could have summarization. You could manually summarize at boundaries to intentionally not allow routes to show up. For example, this area right here, we’re using it as a totally stubby area, and that means that inter-area routes will not show up in this area.

00:04:48
So for example, R5, who’s connected to area 45, if this is totally stubby, he would not be seeing the backbone routes or these other non-backbone routes, because those LSAs don’t make it into a totally stubby area. Now to compensate for that, R5 would be receiving a default route from the area border router.

00:05:09
So he’d still have the ability to forward even though he wouldn’t know the details of the non-area 45 networks. Now that we’ve taken a look at the common reasons why OSPF routes don’t show up in a topology, let’s go ahead and apply what we’ve learned to a live network.

00:05:23
So let’s begin here in R1 by simply verifying that we have interfaces that are up, and that they are enabled for OSPF. And we can do that very simply with a show ip ospf interface brief. And that’ll give us the information on the process ID, the interfaces that are enabled for OSPF, and what area those interfaces are in.

00:05:43
So this is great. We have three interfaces. We have two loopback interfaces. They’re both in area 1. They’re both 32 bits. One is 1.1.1.1, the other is 11.11.11.11. And the gig 1/0 interface is in area 0. Currently, we’re acting as a backup designated router, which implies that our neighbor is the designated router.

00:06:03
So let’s go ahead and see what routes that we’ve learned right here on R1 via OSPF. We’ll do it with a show ip route via OSPF. And survey says, well, we have one. We have an inter-area route of 2.2.2.2. If we look at the topology, that’s hanging off of a loopback on R2. However, we are missing a whole bunch of routes.

00:06:26
We should have loopbacks from R3 and R4 and R5. We should also have network 7.7.7.7, which is being injected or redistributed into OSPF from EIGRP on R2. So we have a ton of missing routes. Let’s do a quick check, and see if there’s any type of inbound filter right here on R1. We can do it with a show ip protocols.

00:06:49
And that will reveal if there’s any inbound filtering list applied. And it says right here, incoming filter is not set. So there’s no access list on R1 that is preventing routes from making it to the routing table. Now if we need to, we can also take a closer look at the OSPF database.

00:07:07
But for right now, I want to take a closer look at this 7.7.7.7, and find out why it’s not appearing on R1. We have a neighborship with R2. We’re receiving OSPF-learned routes. Why aren’t we getting 7.7.7.7? So let’s make a road trip over to R2, and just verify that R2 actually has that route– getting the ball rolling, per se.

00:07:29
So we’ll do a show ip route for 7.7.7.7. I just want to validate that he has that at all. So it currently shows that he’s connected, so he’s got the route. And this is confirming for us that it’s directly connected to loopback 7.7.7.7. If the redistribution is working correctly on R2, what should be happening is we should be taking any interfaces and networks that are in EIGRP, redistributing them into OSPF, and they should be advertised as LSA type 5’s, as external OSPF routes.

00:08:01
So let’s take a quick look. Let’s do a show ip ospf database. And I’m going to say, let’s just see the self-originated LSAs. So this is a little heart to heart with R2, saying, hey, what LSAs are you personally generating? And then I’m going to hit the Space bar to go to the very bottom.

00:08:17
And what I don’t see anywhere is any LSA type 5’s. So it appears that R2 is not getting the ball rolling. That route of 7.7.7.7 is not in the database. And if it doesn’t make it to the database on R2, it doesn’t have a chance of being advertised to any of the neighbors of R2. So let’s validate our redistribution.

00:08:42
It’s a really simple command. You go into OSPF and say redistribute EIGRP. Let’s do a show ip protocols, focusing on the OSPF section. And let’s just validate whether or not that redistribution of EIGRP into OSPF is happening. And sure enough, right here, it says that we’re redistributing external routes from EIGRP autonomous system 1. Now the question is, why in the world aren’t those showing up in the database, and subsequently being advertised to our neighbors? And what’s important to note right here is what’s actually missing.

00:09:15
By default, when we redistribute into OSPF, it’ll only redistribute classful networks. So because 7.7.7.7 is a 32-bit network, the redistribute command is not paying attention to that specific network because it’s not on a classful boundary. So because this is a Class A address, if it was a /8, that would be no problem.

00:09:39
It would come right in. But because it’s a /32, that’s the problem. So let’s do this. Let me walk you through how to do the redistribution without the subnets option, so you can see the look and feel of the message that we’ll get that’s kind of like a big warning, saying, hey, Will Robinson, you’re not going to get your subnets here.
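
To make the classful boundary idea concrete, here is a minimal Python sketch. The function names are invented for this example, and it is a simplified model of the behavior described above (only prefixes whose mask equals the classful mask get through without the subnets option), not a copy of what IOS does internally.

```python
import ipaddress


def classful_prefix_len(address):
    """Classful mask length (8, 16, or 24) implied by the first octet."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return 8   # Class A
    if first_octet < 192:
        return 16  # Class B
    if first_octet < 224:
        return 24  # Class C
    raise ValueError("Class D/E addresses have no classful network mask")


def is_classful(prefix):
    """True when the prefix length matches its classful mask."""
    net = ipaddress.ip_network(prefix, strict=False)
    return net.prefixlen == classful_prefix_len(str(net.network_address))


def redistributed(prefixes, subnets=False):
    """Prefixes that would get through, with or without the subnets option."""
    return [p for p in prefixes if subnets or is_classful(p)]


candidates = ["7.7.7.7/32", "7.0.0.0/8"]
print(redistributed(candidates))                # only the /8 survives
print(redistributed(candidates, subnets=True))  # both survive
```

The 7.7.7.7/32 loopback is a Class A address with a 32-bit mask, so it is dropped until the subnets option is turned on.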

00:09:57
And then we’ll go ahead and correct it. So right here, by default, it says, only classful networks will be redistributed. So if we want to include the subnets, our redistribute command should include the “subnets” keyword, as shown right here. So now, let’s go back, and let’s take a look at the show ip protocols one more time.

00:10:17
And this time, we’re going to see in the details– hopefully, if it took– the fact that we are not just getting classful networks, we’re also getting subnets. So this one little added statement right here is our indication that the subnets are now being included.

00:10:31
So let’s do this. Let’s go ahead and do a show ip ospf database again. And we’ll tell it we only want to see the information that has been self-originated by R2. And what we should see, if it is now correctly making it into OSPF, is we should see the LSA type 5’s. So I’ll hit the Space bar one time, and sure enough, right here, we have this network 7.7.7.7. That is the external network that is being advertised by R2 who’s acting as the autonomous system boundary router, bringing in non-OSPF routes and putting them into OSPF.

00:11:04
And if we want to see a more detailed picture of that external route, we could do a show ip ospf database external, and that would give us the nitty-gritty on that 7.7.7.7 network. So let’s go ahead and do that right now, the show ip ospf database external.

00:11:21
So there’s our network, 7.7.7.7. There’s the advertising router. And now what should be happening is R1 should now be able to see this route, which was previously missing because it had never been correctly redistributed into OSPF by R2. But now that we’ve corrected it, let’s go take a look at R1. So let’s make a road trip over to R1. It’s not too far away, just a click of the mouse.

00:11:44
And let’s do a show ip route, and we’ll limit the output for just OSPF-learned routes. And we’re looking for that network, 7.7.7.7. Hopefully, it will be there. Survey says, yes, we have an external type two route of 7.7.7.7. So now that we have this route in the routing table on R1, is our work done? Well, the answer is no.

00:12:05
We’re missing a boatload of other routes. Look at the topology. We have the area 3 routes, the area 4 routes, the area 45 routes. We’re not seeing any of those in the routing table. We know we have a good relationship between R1 and R2 because they’re exchanging information.

00:12:21
Let’s go over to R2, just do a quick reality check. Let’s show ip route ospf, and see which OSPF-learned routes we can see here. So here we have one network that we’ve learned via OSPF, and that is from R1. However, if we take a look at the topology, there’s also an 11.11.11.11 network that we’re not yet seeing. Now earlier, when we looked at R1, we identified the OSPF-enabled interfaces.

00:12:47
And I know that both loopbacks, including the one that had the 11.11.11.11 address, were included in OSPF. So why isn’t R2 putting it in its routing table? Let’s do this. Let’s take a quick look at R2 and say, dear Mr. R2, is that network, 11.11.11.11, do you have an LSA for that that you’ve received, and is it in your database? Perhaps it’s in the OSPF database, but for some reason never made it to the routing table on R2. Now because the 11 network is in a different area– it’s in area 1– in area 0, it would be a summary advertisement coming into the backbone from R1. So the command show ip ospf database summary is referring to summary LSAs, and I’m asking the output to focus on 11.11.11.11. So down here, it says, yup, there’s 11.11.11.11. The advertising router is router 1, 1.1.1.1. Why in the world is this network not showing up in the routing table on R2? It’s in the database.

00:13:48
We learned about it. How come we’re not using it in the routing table? So if we look for a filter, check this out. Show ip protocols reveals that we have an incoming update filter set using access list number 1. So that’s very likely our culprit right here.

00:14:03
So our next question is, I wonder what’s in access list 1. So we’ll ask R2, hey, tell us the contents of access list one with a show access list one. And that will show us exactly what’s going on with that. And that should reveal quite a bit. What this is saying is that access list 1 is denying 11.11.11.11 and permitting everything else. And that’s exactly why that 11 network is not making it to the routing table on R2. Even though it’s still in the database, and all the other routers in area 0 are all going to know about that, as well, it’s simply that R2 is not going to put it in its routing table.
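
Here is a small Python sketch of that first-match filtering logic. The list format and function name are assumptions made for the example. It only models the idea that the distribute list decides what reaches the local routing table while the database is left alone.

```python
import ipaddress

# Simplified stand-in for access list 1 on R2: deny the 11 network, permit the rest.
ACCESS_LIST_1 = [
    ("deny", "11.11.11.11/32"),
    ("permit", "0.0.0.0/0"),
]


def acl_permits(acl, address):
    """First matching entry wins; no match means the implicit deny applies."""
    ip = ipaddress.ip_address(address)
    for action, network in acl:
        if ip in ipaddress.ip_network(network):
            return action == "permit"
    return False


# Networks R2 has LSAs for in its OSPF database.
database = ["1.1.1.1", "11.11.11.11"]

# Only the permitted ones are installed in R2's routing table.
routing_table = [net for net in database if acl_permits(ACCESS_LIST_1, net)]
print(routing_table)

# With the deny entry removed, nothing is filtered any more.
edited_acl = [entry for entry in ACCESS_LIST_1 if entry[0] != "deny"]
print([net for net in database if acl_permits(edited_acl, net)])
```

Notice that the database list never changes. The filter only affects what the local router installs.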

00:14:40
And that’s only a problem if somebody tries to reach the IP address of 11.11.11.11, because R2 is not going to know how to reach it. So let’s go ahead and correct this. Now there’s several options we could do. We could remove the list from the routing protocol, or we could simply edit it.

00:14:58
I’m a big one for editing it. If something was there in the first place, maybe just a tweak to allow our protocol to work would be better. So in access list configuration mode, I’m simply going to remove line 10, which is the deny statement, leaving only the permit.

00:15:12
And let’s take a look at the results of that. So just to confirm, we have one line left in that access list, which says permit any, which now should not be filtering anything whatsoever. And let’s do a show ip route for OSPF-learned routes, and see whether or not that route shows up.

00:15:28
And sure enough, there’s our 11 network in the routing table, learned via OSPF. So now our challenge is this– we’re missing a boatload of routes from R3 and R4 and R5, so let’s check the neighborship. Now in a previous video on troubleshooting OSPF adjacencies, we looked at troubleshooting adjacencies.

00:15:48
I just want to validate that R2 and R3 are indeed OSPF neighbors. And based on the output here of show ip ospf neighbor, it appears that we are. So we’ve got our neighbor at 3.3.3.3, which is R3. It shows the state as full, and that our neighbor is the DR. And there’s the IP address of our peer just to confirm who that device is.

00:16:10
So what I think we ought to do is go over to R3, but before we do that, let’s just validate by doing a quick peek at the running config for OSPF on R2, that there’s no other type of monkey business happening with that router process configuration. So as we take a look at these, there’s the router ID, a couple of good network statements, a neighbor– which is appropriate, by the way, on a non-broadcast network like a serial interface over Frame Relay.

00:16:38
The neighbor statement pointing to R3 is perfectly acceptable. So we’re doing unicast updates over to R3. And then we have our distribute list, which we basically neutered by removing the deny statement from it, leaving only the permit statement. So that, my friend, all looks really good.

00:16:53
So over on R3, we’ll make that road trip right now. Let’s do a quick check and see the OSPF neighborships there. I want to focus just on the neighborship between R3 over its serial interface going over to R2. And as we look at this– so there’s our neighbor, there’s R2’s address to be sure. And it says the neighbor state is full.

00:17:12
And this dash symbol– oh, that is bad news. That is done by either a point-to-point or point-to-multipoint network type that doesn’t believe that we should be using a designated router. So R3 is saying no to DR, and R2 says yes to DR. In fact, R2 thought that R3, based on the output on the previous screen, he thought that R3 was the DR. Something is terribly, horribly wrong.

00:17:43
And as a result, these two routers don’t see the common network segment that connects them as the same network type. And unfortunately, although the LSAs are going to flow across that and be shared, because they do not agree on that network type, neither of these routers is going to buy in to the LSAs that they see.

00:18:05
So they will see the LSAs, and they’ll be in the database, but they won’t use the routes and the information from those LSAs in the routing table, because they disagree on this common network segment between them. For example, right here on R3, let’s do a show ip ospf database.

00:18:22
And the output of this is going to reveal that R3 knows. There’s the router LSAs, there’s the type 2 LSAs for the designated routers on the respective segments. Here’s the summary LSAs. And if I hit the Space bar, this guy even knows about network 7.7.7.7 with an advertising router of 2.2.2.2. All the information is there, except we’re not going to believe in any of it enough to put it into our routing table.

00:18:49
So if we do a show ip route ospf, all that really cool information that’s coming from R2 and R1 is not in the routing table. And check this out– because R3 believes that network segment connecting to R2 is a point-to-multipoint, if we do a show ip ospf database for router 2.2.2.2 to see what R2’s opinion is of that network, look what he says. So regarding the LSAs from router 2, check this out– R2 believes that 10.23.0.3, which is R3, is the designated router for that network segment, and that it is a transit network.

00:19:26
If we went over to R2 and said, hey, Mr. R2, can you tell me about the LSAs from R3? And we do that with a show ip ospf database router 3.3.3.3 from R2’s perspective. Let’s compare the output. So I’m going to hit the Enter key a few times so we can see it.

00:19:42
So what R2 thinks is that R3 believes that this network is a point-to-point. That’s its interface on the network segment. And we also have this nastygram, saying the advertising router– so this is R2 referring to R3– the advertising router is not reachable in the topology.

00:19:58
And it all boils down to them disagreeing on the network type that connects them together. So we need to find a happy medium. We need to tell both routers that there either is or is not expected to be a designated router, and make sure the timers are correct.

00:20:11
The best and easiest way to do that is use the same network type on both sides, and that would correct the problem. On a physical serial interface with frame relay, the default network type is non-broadcast. So let’s go back over to R3, and simply take off any administratively configured OSPF network type.

00:20:31
So to do that, we’ll go into interface configuration mode on R3 for serial 1/0, and just say no. We’ll say, no ip ospf network type. So it’ll default back to the non-broadcast. And because R2 already has a neighbor statement pointing to R3, we don’t have to add a neighbor statement on R3 pointing back to R2, although we could. All it really takes is one of the two sides to initiate the conversation, and that will start the OSPF.

00:20:59
Let’s do a quick check of the interface. We’ll do a show ip ospf interface for serial 1/0. And I also noticed that we had a console message that just came up right here that snuck in, indicating that a neighborship came up. So there’s the default network type.

00:21:16
Also, the timers– these timers are not the default for non-broadcast. The default timers are 30 seconds for hellos, and 120 seconds for the dead timers. I adjusted those personally on both sides to make them match, and I made them shorter so it wouldn’t take a full 120 seconds for a neighborship to form if I brought the two routers up together at the same time.

00:21:39
So as many people do, if you’re going to lab this up, if you’re using the default timers, it could take a full 120 seconds after you bring the interface up for the actual neighborship to form. So just be prepared for that. So we had a network type that, even though it looked like we had full adjacencies, was causing a problem with OSPF routes not showing up.

00:22:00
Let’s do this. Let’s go back to R1. And on R1, let’s do a show ip route for OSPF-learned routes just to see what is present. Hopefully, we’ll see all the routes from router two and router three and router four and router five. And there’s a whole bunch of them.

00:22:16
And let’s do a ping, as well. So we have the external route for 7.7.7.7 coming from R2. We have a whole bunch of inter-area routes. And we can successfully ping 3.3.3.3. That’s a very good sign. Let’s also try to ping 4.4.4.4. And we’re also sourcing these from our loopback 0, which helps to validate that that far destination has a route back to our loopback 0, which is being advertised in OSPF. And that looks successful, as well.

00:22:43
But I have a question for you. As we look at this routing table, what’s missing? I don’t see network 5.5.5.5. It’s not in there, and it should be. So what is it that’s causing 5.5.5.5 not to show up? Well, let’s do this. Let’s go to the source, and ask R5 if he even has that route to begin with, just to verify that the proverbial ball has started rolling on R5. So to do that, let’s just do a quick show ip route for 5.5.5.5, whether he learned it or is directly connected, a show ip route for that address should reveal whether or not router 5 even knows about it. So that’s a huge problem.

00:23:25
We need to make sure that that network on R5 is available so that we can start the ball rolling, and start getting it advertised into OSPF. So let’s find out whether or not that IP address exists, or should exist, on this router. And sure enough, it’s loopback 0. What’s the problem? So loopback 0, 5.5.5.5, yes. And it’s administratively down.

00:23:46
OK, so that’s a huge problem. A down interface obviously won’t get the ball rolling as far as that network is concerned in OSPF. So we can fix that. We have the technology. We’ll go into configuration mode, we’ll go into interface loopback 0, and we’ll say please come up with a beautiful no shutdown command.

00:24:04
And now that it’s up, let’s go back and validate whether or not that interface has made it into OSPF. And we can do that with a command show ip ospf interface brief. And if it shows up, it’ll not only tell us that it’s involved in OSPF, it’ll also show us the area that that interface is participating in from an OSPF perspective.

00:24:26
And it is not there. So we discussed some reasons why a route wouldn’t show up. One was that we’re missing a network statement or an interface command to enable it for OSPF. And that appears to be what’s happening here. Because if there was a network statement that would catch that 5.5.5.5 address, it should show up as an interface enabled for OSPF right here.

00:24:48
So how exactly are we going to look at the network statement? So there’s a couple ways of doing it. One, we could do a show ip protocols. And sure enough, our network statement is just 192.168.45, with the last octet being a wildcard, putting it into area 45. And as a result, that’s why our 5.5.5.5 network isn’t making it in.

00:25:09
Again, that’s no problem for us. We know how to bring in a new interface. We’ll simply make another network statement that includes the IP address on that interface. And we’ll make it very specific. We’ll say, if you have any interfaces that have 5.5.5.5 exactly, we want to put that in area 5. Then let’s go ahead and verify that.
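
The wildcard matching behind a network statement can be sketched in a few lines of Python. The function names and the 192.168.45.5 interface address are assumptions for illustration only, and the model simply returns the first matching statement.

```python
import ipaddress


def wildcard_match(interface_ip, base, wildcard):
    """Compare only the bits where the wildcard mask has a 0."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (ip & care) == (net & care)


def ospf_area_for(interface_ip, statements):
    """Area of the first matching network statement, or None if nothing matches."""
    for base, wildcard, area in statements:
        if wildcard_match(interface_ip, base, wildcard):
            return area
    return None


# R5 as found: one statement where only the last octet is a wildcard.
statements = [("192.168.45.0", "0.0.0.255", 45)]
print(ospf_area_for("192.168.45.5", statements))  # hypothetical link address
print(ospf_area_for("5.5.5.5", statements))       # loopback 0 is not matched

# Add a very specific statement for exactly 5.5.5.5 in area 5.
statements.append(("5.5.5.5", "0.0.0.0", 5))
print(ospf_area_for("5.5.5.5", statements))
```

With only the original statement the loopback comes back as None, which lines up with it being absent from show ip ospf interface brief.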

00:25:28
We’ll do a do show ip ospf interface brief just to make sure it’s now showing up as being enabled for OSPF. And it appears that it is. That’s fantastic. We may be done. Our troubleshooting may be complete. Let’s just do a couple things. Let’s verify that we’re neighbors with R4. I see from the show ip ospf interface brief that we do have a neighbor.

00:25:51
It’s very likely R4, but we’ll confirm that with a show ip ospf neighbor. That looks great. And just to verify that we have a full adjacency that’s actually working, and doesn’t just appear as full, let’s do a show ip route ospf. And if we have routes that we’ve learned from OSPF, that would also indicate that we really do have a functioning full adjacency.

00:26:13
And look at this. We have a default route that we’re learning, and we’re learning that from R4. And based on our topology, this is a totally stubby area, so that is perfectly acceptable. R4 is suppressing and not sending all the summary routes from other areas of the OSPF network.

00:26:29
The only summary advertising it’s sending in is this default route that we see right here. And that would be a good explanation of why we’re not seeing the more detailed routes. If we wanted to not make this a totally stubby area, we could reconfigure R4, and make it just a stub area as opposed to making it a totally stubby area.

00:26:49
One way to validate that right here on R5 is just do a show ip ospf, and take a look at the area configuration. And that will reveal the details regarding this area, including the fact that it is a stub. And that just confirms why we’re not seeing the other summary advertisements coming from that area border router.

00:27:07
So let’s do a test, and let’s ping across the entire network. Over on R1, if we look at that topology, R1 has the network 11 in its area 1, and we should have full reachability to that. So a quick test is in order. We’ll do a ping over to 11.11.11.11. And if that works, it validates that not only does the network function that direction over to R1, it also indicates that R1 has reachability back to us. And just for grins, let’s also ping 7.7.7.7. That’s the EIGRP network that has been redistributed by the autonomous system boundary router, R2, into OSPF. And that looks successful, as well.

00:27:47
Now before we sit on the beach and say, yes, we’re done, OSPF is perfectly working, we really ought to do at least one more thing, and that’s ping over to 11.11.11.11, but let’s source it from loopback 0. That would also verify that R1 has reachability back to 5.5.5.5, which we just added into OSPF. So if this works, I think we are done, and our network’s going to be functioning.

00:28:13
So what this implies is that R1 does not have a route back to 5.5.5.5. And we know it’s enabled in OSPF. We just looked at that on R5. Now for this challenge, let’s talk about this topology for a moment. We mentioned that improper network design also could be a reason that some routes don’t propagate and don’t show up, and that’s what’s happening right here.

00:28:35
If we look at this OSPF network for 5.5.5.5 that we just enabled, it’s in area 5. Now the question is, who is the area border router that can support area 5? And the answer is, R5 is not connected to the backbone. It’s not an ABR. And that is the only router connected to this area.
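
That design rule is easy to model. Here is a minimal Python sketch that flags any area with no router touching the backbone. The router-to-area mapping is a simplified assumption based on what we have seen so far, not the full lab topology.

```python
BACKBONE = 0


def areas_without_abr(router_areas):
    """Return the areas that no backbone-connected router has an interface in."""
    all_areas = set().union(*router_areas.values())
    reachable = set()
    for areas in router_areas.values():
        if BACKBONE in areas:
            # This router touches area 0, so it can act as the ABR
            # for every other area it is attached to.
            reachable |= areas
    return sorted(all_areas - reachable)


# Assumed, simplified view: R5 sits in area 45 and area 5 but not the backbone.
topology = {
    "R1": {0, 1},
    "R4": {0, 45},
    "R5": {45, 5},
}
print(areas_without_abr(topology))

# Moving R5's loopback into area 45 leaves no orphaned area.
topology["R5"] = {45}
print(areas_without_abr(topology))
```

The first call flags area 5, and the second returns an empty list once every area in use has a router connected to area 0.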

00:28:55
So what’s happening is we’ve broken the rules of OSPF, which say that every area has to have at least one ABR to connect it to the rest of the infrastructure. So some solutions would include creating a virtual link between R5 and R4 right here. However, you can’t do that over a stubby area, so that’s out.

00:29:12
We could create a GRE tunnel, and add that to area 0. And that would logically connect R5 to the backbone area, but then we have to add additional addresses and everything else. The best solution, if it’s allowed, is to simply change this loopback interface, and take it out of area 5, and simply make it in area 45, because area 45 does have an ABR– that’s R4. So by making that change, and following the rules for OSPF, that should allow the 5.5.5.5 network to be propagated and advertised through the OSPF infrastructure.

00:29:47
So here’s what we’re going to do. Let’s go ahead and remove the 5.5.5.5 network from area 5, and we’ll go ahead and put it as part of area 45. And because area 45 does have an ABR, that network should then begin to advertise throughout the OSPF network. So we’ll do a network 5.5.5.5 area 45. And with that done, we should now give it maybe a few seconds for convergence to happen, and then we should try our ping as we did previously.

00:30:17
So I’m going to use the up arrow key a few times. We’ll ping 11.11.11.11. We’ll source it from loopback 0. And if this does work, it implies also that R1 has a route back to R5. And there we have it. I have had a lot of fun. I appreciate you joining me. I hope this has been informative for you, and I’d like to thank you for viewing.

Surviving RIP

00:00:00
Keith Barker created a pretty famous saying now in the IT industry. It’s friends don’t let friends run RIP. But we might find ourselves working in an environment that does include the routing information protocol, and this would indeed include many levels of Cisco certification including CCENT through CCIE.

00:00:23
In this Nugget, let’s fix problems with RIP, and no, no I don’t mean ripping it out of your infrastructure. One of the things that’s really going to jump out at you when you start troubleshooting RIP– and it’s annoying, it really is– is that we have a technology here that does not form adjacencies.

00:00:42
So we’re really going to need an alternative mechanism when it comes to ensuring that these devices are exchanging information. This is why it’s one of the rare cases that we turn to a debug. Debugs are never really our primary method of troubleshooting and you haven’t seen us do all that much of it in this particular course, but with RIP, we’re forced to oftentimes.

00:01:10
So we do debug IP RIP. Debug IP RIP is going to really replace a series of show commands that we might be able to use for adjacencies because there’s no adjacencies here. So debug IP RIP is going to be our lifeblood to ensure that a particular device is sending the information that we expect and receiving the information that we expect.
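For reference, the debug mentioned here is typically turned on and off from privileged EXEC mode as sketched below. The hostname in the prompt is generic, not one from the lab.

```text
! Show RIP updates as they are sent and received by this router
Router# debug ip rip
! Turn all debugging off once the expected updates have been seen
Router# undebug all
```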

00:01:38
Now something else that we need to be aware of when it comes to troubleshooting RIP is that there are indeed two versions– version one and version two. Both are distance vector routing protocols but the big difference is version one is indeed classful and version two is our classless version of RIP.

00:01:59
Obviously, classless is very preferable, and that’s what we would typically be utilizing today if we’re utilizing RIP at all. So each of the interfaces on your router has the ability to be configured to send or receive specific versions, and this obviously becomes a big area of our troubleshooting efforts.
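A minimal sketch of that per-interface version control follows. The interface name is an assumption chosen to match the lab later in this Nugget.

```text
! Assumed interface; pick the one facing the RIP neighbor
Router(config)# interface GigabitEthernet1/0
! Send only version 2 updates out of this interface
Router(config-if)# ip rip send version 2
! Accept both version 1 and version 2 updates on this interface
Router(config-if)# ip rip receive version 1 2
```

These interface-level settings override the version set under router rip, which is why a mismatch here is easy to overlook.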

00:02:23
Now another key difference between RIP version one and RIP version two is that RIP version one will broadcast its updates where version two will multicast its updates. Now right away an interesting area of troubleshooting here involves, does your link support multicasting? Like we saw with EIGRP, you can indeed utilize the neighbor command in order to send unicast updates instead of multicasting them.

00:02:57
But what is very interesting about RIP is that you use this in conjunction with the passive interface command. Yeah, it’s pretty amazing. The passive interface command is needed to suppress the multicasting and then the neighbor command is utilized to actually send the unicast updates.
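The combination described above could be sketched like this. The interface and the neighbor address are hypothetical values used only for illustration.

```text
Router(config)# router rip
! Stop sending broadcast/multicast updates out of this interface
Router(config-router)# passive-interface GigabitEthernet1/0
! Send unicast updates to this specific neighbor instead (hypothetical address)
Router(config-router)# neighbor 172.16.12.2
```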

00:03:19
Now what about filtering? Well one of the interesting filter techniques that exists with RIP and certainly could be a reason why you’re not seeing particular RIP prefixes is an offset list. An offset list allows you to add to the metric of particular prefixes.

00:03:38
We remember with RIP we have a very small maximum metric. The maximum number of allowable hops is 15 with this particular protocol. Sixteen is considered unreachable and the update will be trashed. So keep this in mind. An offset list can increase this metric, and an offset list could even kill off the particular update by hitting that maximum metric amount.
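To make the offset list idea concrete, here is a hedged sketch. The access list number, the matched prefix, the offset of 14, and the interface are all assumptions for illustration, not values from the lab.

```text
! Assumed ACL 10 matching one prefix
Router(config)# access-list 10 permit 10.33.33.0 0.0.0.255
Router(config)# router rip
! Add 14 hops to matching routes received on this interface.
! A route arriving with a metric of 2 would become 16, which is unreachable.
Router(config-router)# offset-list 10 in 14 GigabitEthernet1/0
```

When a prefix is missing from the routing table, checking the RIP configuration for an offset list like this is a quick way to rule it in or out.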

00:04:06
So all of this adds up to quite a bit to actually worry about and troubleshoot with this particular protocol. I want to emphasize, and I’m sure Keith will do this too, I want to emphasize practice. You know, what definitely happens is students will tend to forget, they’ll tend to leave off their practicing with RIP both its configuration, its verification, and its troubleshooting because they say to themselves, well it’s such a ridiculously simple protocol, I don’t need to worry about that.

00:04:39
Well, the fact that we elaborated on, that it does not form adjacencies like the other more robust and scalable routing protocols we’re used to, really does make troubleshooting a bit more challenging than you would expect. So please don’t fall into that category of certification students, at least, who find themselves losing points because of the Routing Information Protocol.

00:05:08
They don’t realize just how tricky it can indeed be to troubleshoot. In production environments, I suppose people avoid problems with RIP by simply not running it. Well Keith, let’s have some fun at the command line troubleshooting this not so exciting protocol.

00:05:25
All right, Anthony, thank you for the console. Anytime we get in a little bit of fun at the expense of RIP, I am all in. Let’s start off on R2 and just verify that RIP is running. The topology is in the upper right hand corner. Just to make sure that RIP is running, if RIP is not enabled on the router that would be a great reason why it’s not working correctly.

00:05:43
And so here we see that RIP is enabled. There’s currently no filter set inbound or outbound. I also notice that the timers are not the default. The default update interval is 30 seconds on RIP and this is set to 10. I also note that the invalid timer and also the flush timer are not set to the default.

00:06:02
So timers in RIP don’t have to be identical to each other but we certainly want to make sure that we have updates that are happening before they get flushed by another router. And check this out, we’ve got it enabled on these two interfaces due to this network statement of 172.16, but we’re only sending RIP version one.

00:06:19
So unless we had some crazy reason or requirement to only send version one updates that do not include a mask, we want to go ahead and correct that. So let’s correct that version one issue right off the bat, and then we’ll verify that the changes we make by changing into version two of RIP, that they also are reflected in the show IP protocols output just to make sure that it’s doing what we think it’s doing.
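The verify, fix, and re-verify sequence described here would look roughly like the following on R2. This is a reconstruction from the narration, not a capture of the actual session.

```text
! Check whether RIP is running, and which versions are sent and received
R2# show ip protocols
R2# configure terminal
R2(config)# router rip
! Send and receive only version 2, so updates carry subnet masks
R2(config-router)# version 2
R2(config-router)# end
! Confirm the change is reflected in the output
R2# show ip protocols
```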

00:06:43
So based on the new output, we are sending and receiving version two. And I suppose the only great reason to send version one would be to some legacy old device that didn’t know how to process or work with version two. Otherwise, if all the RIP speakers on your network, all the routers doing RIP, understand version two, setting it to version two would be a great step administratively just to make sure we’re not missing any masks.

00:07:06
Now regarding those timers which are absolutely not the default, let’s take a look at the configuration of RIP on R2 just to see what they have the timer set for. So we’ll do a show run, and we’ll say, I just want to see section router RIP. So this is our update timer.
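The filtered view of the running configuration mentioned here is usually typed as:

```text
! Show only the RIP portion of the running configuration
R2# show running-config | section router rip
```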

00:07:21
This is the invalid timer. That’s the hold down and that’s the flush timer. So those are all reflected up here as well. Here’s 10, 15, zero, and flush. So you know what we could do is we could make all the timers on all three routers identical to each other.

00:07:36
And again, they don’t have to be identical like to the second, however, we have to make them tolerant of each other. For example right now, if this router does not see an update within 15 seconds, it’s going to consider the route invalid. A neighboring router with the default set, for example, like R1, if he only sends updates every 30 seconds, that’s a problem.

00:07:56
Another concern that I hear all the time is OK, how does the hold down timer work? How does this work and how does that work? And I always say, lab it up. Lab it up. And the response is, I need to bring a lunch or something if I lab it up because it just takes so long for things to happen like the default flush timer, for example, being 240 seconds. Who’s got that much time to wait? So one of the recommendations for studying and practicing with RIP is go ahead and set your timers down intentionally low, make them the same across all your routers, and then you don’t have to wait, for example, 240 seconds for a route to be flushed from the routing table.

00:08:33
In this example, all we have to do is wait for 20 seconds after that route has no longer been seen and it’ll be flushed. So let’s do this, let’s set the timers to some very low values in all three routers including R2. So let’s use an update timer of five seconds, an invalid timer of 10, zero for hold down, and a flush timer of 15 seconds. And we’re also going to set version two on all the routers just to make sure that everybody is running version two for both sending and receiving.
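A sketch of the lab-friendly settings described above, as they might be entered on each of the three routers. The values come from the narration; the order of the arguments is update, invalid, hold down, flush.

```text
R2(config)# router rip
! update 5s, invalid 10s, hold down 0s, flush 15s (lab values, not for production)
R2(config-router)# timers basic 5 10 0 15
! Make sure version 2 is used for both sending and receiving
R2(config-router)# version 2
```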

00:09:02
Let’s do that same treatment over at R1. So at R1, we’ll simply go into configuration mode, router RIP, set the timers for RIP, make sure it’s set for version two. If it was already set for version two, it’s not going to harm anything. I just want to validate that version two is being used on this router as well.

00:09:18
And we’ll do the same treatment over on router three. So over on router three, again, into configuration mode we go, router RIP, and set the timers exactly the same there as well. And now as things change in our network, we won’t have to wait. For example, if we do a debug, we’ll only have to wait like five seconds to see the updates as opposed to doing a debug and waiting potentially 30 seconds to see an update. So let’s see if we have some routes that we’re even learning.

00:09:43
Let’s go to the left side of our network over to R1. And on R1, we’ll do a show IP route. Please show us the RIP learned routes if there are any, and sure enough, we’ve got one network, that’s the 172.16.23 network which is the network connecting between R2 and R3. We’ve learned it via RIP.
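The command for showing only the RIP learned routes is typically:

```text
! Display only routes learned through RIP
R1# show ip route rip
```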

00:10:00
That’s a good first step. However, what I don’t see here is I don’t see the loopback, the 10.33.33 network that R3 has. We should also be learning that via RIP. And I don’t see it here. So let’s make a road trip over to R3. Let’s just make sure that R3 has that network directly connected and that it’s being advertised as part of RIP.

00:10:22
So we’ll do a show IP interface brief for just the interfaces that have IP addresses assigned. And sure enough, there’s the IP address 10.33.33.3. I wonder why it’s not advertising it. Let’s take a look. We’ll do a show IP protocol so that’ll show us a lot of information about RIP including which interfaces are participating in RIP.

00:10:41
And this shows us that the only interface that is participating is loopback zero. It’s also showing down here that we’ve got a passive interface for gig 1/0. So let’s do this, let’s go ahead and remove the passive interface command on R3 regarding gig 1/0, and take one step closer to having this work for us.

00:11:02
So to do that, we’ll go into configuration mode for RIP. And we’ll simply say no, no passive interface for gigabit 1/0 and see if that makes a difference. Now one thing that Anthony mentioned was that because we don’t have formal adjacencies with RIP, we might use some debugging tools to help see what’s really going on as far as sending and receiving of updates.
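Reconstructed from the narration, the change on R3 would be along these lines:

```text
R3(config)# router rip
! Allow RIP updates to be sent out of GigabitEthernet1/0 again
R3(config-router)# no passive-interface GigabitEthernet1/0
```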

00:11:22
So let’s do that now and see whether or not R3 is advertising that 10.33.33 subnet. So with debugging on, the cool thing is it won’t take very long because our update timer is set for every five seconds. So I’m going to go ahead and turn off debugging with an un-debug all.

00:11:40
And all I see going on here is I see updates via the loopback, but I don’t see any updates being sent through the gig 1/0 interface. Now a moment ago, we validated that it was a passive interface which would prevent the sending of updates on that interface.

00:11:55
However, we removed that. So let’s take a closer look at the output of show IP protocols and see if there’s something else that perhaps we missed. And sure enough, where it says routing for networks right here, this reflects the network statements in our router configuration mode.

00:12:11
And unfortunately, it’s only including interfaces that begin with 10, the class A address space of 10. It’s not including gig 1/0, which has an IP address of 172.16, et cetera. So what you and I get to do is go back into RIP and simply add that network. So we’ll do a network 172.16.0.0. Because of the legacy nature of RIP, when we enter a network statement, it’s only going to pay attention to the classful boundary.

00:12:37
So for example, even though we’re running RIP version two, if we want any interfaces that start with 172.16 anything, even if they’re 24-bit or 28-bit or whatever networks, we still only care about the first two octets for a class B. And if you put in extra characters, it’s simply going

00:12:54
to zero out the extra characters beyond that 16-bit default mask for a class B in the running config. So now let’s do this, let’s turn on a debug and see whether or not we are now advertising that 10.33 network. And I’m going to go ahead and turn off debugging because it appears that we are advertising that.
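The network statement and the classful behavior described above can be sketched as follows. The second form is only there to illustrate the zeroing out; either entry should end up as the same line in the running config.

```text
R3(config)# router rip
! Enable RIP on every interface whose address falls in class B network 172.16.0.0
R3(config-router)# network 172.16.0.0
! Typing a subnet such as "network 172.16.23.0" instead would still be
! stored as "network 172.16.0.0" in the running config
R3(config-router)# end
R3# show running-config | section router rip
```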

00:13:12
So with debugging off, it now says out of the gig 1/0 interface, we are now advertising the 10 network. OK, so we’re close. We’re closer than we were. It’s now advertising the 10 network but it appears, based on this advertisement, that auto summary is still on.

00:13:31
And auto summary is on by default with RIP. So let’s do a show IP protocols just to verify that once again. So with the command show IP protocols, sure enough, the auto summarization to classful boundaries is on. We want to go ahead and disable that. So we’ll go back in the configuration mode and under RIP, we’ll say no auto summary please.
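As a reconstruction of the step just described on R3:

```text
R3(config)# router rip
! Stop summarizing to the classful boundary so the real /24 is advertised
R3(config-router)# no auto-summary
R3(config-router)# end
! The output should no longer report automatic summarization as in effect
R3# show ip protocols
```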

00:13:52
And then, if we do another debug, we should see the 24-bit, the real network length of the address being advertised. So with our debug IP RIP in place, and again, it shouldn’t take long because our updates are every five seconds. I’ll go ahead and turn off the debugging as well because we’ve seen all the information that we need to, and that is that we are now sending updates for the 10.33 network with a 24-bit mask. So now, if we go over to R1, we should be able to see that in the routing table.

00:14:24
So we’ll do a show IP route based on RIP learned routes, and sure enough, there’s our 10.33.33 network right there. That is fantastic. So now that we have the route in our routing table, let’s verify that we have reachability. It’s one thing for a router to believe that it has reachability, it’s another to go ahead and do it.

00:14:42
So we’ll do a ping to the actual loopback on R3 which is 10.33.33.3 just to verify it works. We’re also going to source that from our loopback zero which will validate that R3 has a path back to our 10.11.11 network. And bummer, that is timing out. So who has the problem? Is it us who really can’t get to 10.33.33? Or is it the far side who can’t get back to our 10.11.11? One way of validating that is we could source the ping from the normal closest interface gig 1/0 instead of from our loopback.

00:15:19
And if that ping works, that implies that R3 has reachability back to 172.16.12 but not to 10.11.11. And that ping did work. Now what’s so funny about that is right there. Sometimes on ethernet we would expect to have a lost packet due to ARP somewhere in the path, but I wouldn’t expect it to be the last packet that doesn’t make it.
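The two pings being compared here would look something like this on R1. The idea is to change only the source address, so the difference in results points at the return path.

```text
! Sourced from loopback 0: R3 must have a route back to the 10.11.11 network
R1# ping 10.33.33.3 source loopback 0
! Default source (the outgoing interface, gig 1/0): R3 only needs a route
! back to the 172.16.12 network
R1# ping 10.33.33.3
```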

00:15:43
So let’s go ahead and do that ping one more time and just verify that all five pings went through. OK, that’s great. So what we can deduce from this is that R3 doesn’t have reachability back to the 10.11.11 network. And why is that? Maybe R1’s not advertising it. Let’s take a look at the show IP protocols on R1 just to validate if R1 is advertising that network. So we don’t have an outgoing filter, great.

00:16:07
So we’re not stopping it. Our timers are set just like everybody else’s. Our gigabit interface is participating in RIP, but I do not see the loopback zero interface which is the 10.11.11 network. And that is very likely due to this network statement that says only include interfaces that begin with 172.16. We’re simply missing a network statement for the 10 network on R1. And you know what, while we’re at it, we also may want to fix this auto-summary setting, because if we don’t disable that, we’re going to be advertising a 10/8 network instead of the full 24-bit network. So let’s take care of both of those issues right now.

00:16:45
So we’re going to go into configuration mode, into router configuration mode specifically, take off the auto summary with a no auto summary. And then we’ll also add a network statement for the 10 network which is on loopback zero. And having done so, what we could then do is try once more an end to end ping.

00:17:03
And if this ping works, sourced from the loopback on R1 all the way to the loopback on R3, it verifies that both R1 and R3 have correctly learned the routes. And the only routing protocol that we’re running is RIP. And with that success, we have an end to end home run.
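Pulling the last fix together, the changes on R1 and the final test could be sketched as below. This is a reconstruction from the narration rather than the exact keystrokes.

```text
R1(config)# router rip
! Advertise the real /24 instead of the classful 10.0.0.0/8
R1(config-router)# no auto-summary
! Include loopback 0 (10.11.11.x) in RIP
R1(config-router)# network 10.0.0.0
R1(config-router)# end
! End to end test, loopback to loopback
R1# ping 10.33.33.3 source loopback 0
```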

BGP Refuses to Neighbor!!!

00:00:00
Hey, in this CBT Nugget, we are going to examine, and by we, I mean myself and my dear friend, Keith Barker, we are going to examine what could go wrong in a BGP adjacency. We’ll have to look at iBGP and eBGP in this Nugget. Let’s jump in. (SINGING) Why won’t you be, why won’t you be, my neighbor? In this Nugget, we’re going to take a look at iBGP and eBGP adjacencies.

00:00:26
I want to make sure that we thoroughly review both of those with you. Remember, there’s going to be nuances to an eBGP peering compared to that of an iBGP peering. If you can’t keep these straight, there’s a good chance you’re going to have issues. And those issues, of course, will lead to the non-proliferation of those prefixes.

00:00:49
We’ll examine those possible issues in great detail. And as always, we’ll ask our dear friend, Keith Barker, to do some devilish demos for us at the CLI. By the way, this particular Nugget is relevant for CCNP and CCIE certification levels. If you’re interested in CCENT or CCNA certification, I certainly encourage you to watch this Nugget.

00:01:16
But please be aware, the information contained herein will not be tested at your certification levels. So of course, we recall when it comes to border gateway protocol, there’s actually really two categories we need to master, right? There’s the iBGP category of BGP’s operation.

00:01:38
And there’s the eBGP. iBGP involves peering between members of the same autonomous system. So here in our illustration, we have autonomous system 65120. And we have autonomous system 65300. R1 and R2 are both members of AS 65120, and therefore, they form an iBGP peering.

00:02:04
R2 and R3 are members of different autonomous systems, and as such, they form an eBGP peering. Now you probably recall that border gateway protocol is going to use open messages in order to initiate this adjacency formation. And remember, BGP is going to use transmission control protocol for its transport, specifically TCP port 179. Now if we look at R1 and R2, for example, one is going to play the role of a client, if you will, in this relationship.

00:02:41
The other is going to play the role of a server. The client will attempt to initiate the adjacency first, using a destination of TCP port 179. The device that’s playing the role of server might also attempt to initiate the connection. So there’s always going to be some question as to whether it was R1 that successfully initiated the relationship or R2. There are commands in BGP that you can use to manipulate this, forcing the TCP communications in a certain direction, if you will, for the formation of an adjacency.

00:03:19
And this is obviously an important process that you would want to consider when you’re dealing with a firewall that might be in place between these two devices. Obviously, one technique with your firewall would be to open up TCP port 179 in both directions, and that would permit for either side to initiate the adjacency formation.
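As a hedged illustration of both ideas, the sketch below forces the TCP direction with the transport connection-mode option and shows an access list that allows BGP sessions initiated from either side. The addresses 1.1.1.1 and 2.2.2.2 are the loopbacks used in the demo later in this Nugget, and the ACL name is hypothetical.

```text
! On R1: always act as the client and initiate the TCP session
R1(config)# router bgp 65120
R1(config-router)# neighbor 2.2.2.2 transport connection-mode active
! On R2: only accept the session, never initiate it
R2(config)# router bgp 65120
R2(config-router)# neighbor 1.1.1.1 transport connection-mode passive

! On a filtering device in the path, for traffic flowing from R1 toward R2
! (hypothetical ACL name)
ip access-list extended PERMIT-BGP
 ! R1 as client connecting to R2 on TCP 179
 permit tcp host 1.1.1.1 host 2.2.2.2 eq 179
 ! R1 as server replying from TCP 179
 permit tcp host 1.1.1.1 eq 179 host 2.2.2.2
```

A matching pair of entries would be needed for traffic flowing in the other direction.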

00:03:42
Now something else to remember as we review the basics when it comes to adjacencies with our border gateway protocol is the fact that we are going to want to form the adjacencies between loopback addresses. Why we love loopback interfaces for the adjacency formation is the fact that if there are multiple paths between our devices, the border gateway protocol can take advantage of those multiple paths from a peering perspective.

00:04:16
Another great reason, of course, is that the virtual loopback interfaces aren’t going to be subject to the same problems of physical interfaces. So we have some more reliability built into this adjacency process. The utilization of loopbacks isn’t the biggest deal for iBGP peerings.

00:04:37
But for eBGP peerings, it can present a particular issue. And that issue, of course, is the fact that eBGP peerings are, indeed, expected to be directly connected. It’s assumed that an eBGP speaker will, indeed, be directly connected to its other eBGP speaker.

00:04:59
Since loopback interfaces don’t provide us with the opportunity to be directly connected, we have a particular issue when it comes to loopbacks and eBGP that is easily solved. We just need to make darn sure that we remember this potential problem in our BGP configurations.

00:05:21
Hey, by the way, when we use the loopbacks for the peerings, even in an iBGP relationship, we do need to remember the update source configuration command, right? We need R1 and R2, in this example, to know that the updates should be sourced from those loopback interfaces instead of the physical interfaces.
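A minimal sketch of an iBGP peering between loopbacks on R1, assuming R2's loopback is 2.2.2.2 as in the demo later on:

```text
R1(config)# router bgp 65120
! Same AS number as our own, so this is an iBGP peer
R1(config-router)# neighbor 2.2.2.2 remote-as 65120
! Source the BGP session from loopback 0 instead of the physical interface
R1(config-router)# neighbor 2.2.2.2 update-source Loopback0
```

R2 needs the mirror image of this configuration pointing at R1's loopback, otherwise the source address it sees will not match the neighbor it has configured.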

00:05:47
Well that’s enough review of BGP adjacency-type situations. From this discussion, let’s go ahead and begin our rather lengthy list of things that could go wrong when these devices are trying to form their neighborships. So as we discussed in our review, one of the first things we want to think about is are these BGP open messages making it through? Is TCP port 179 punched open in access control lists or in firewalls? And is the directionality of that particular access control correct? And this actually leads me to something else that I love to check, by the way, and it can be an easy thing to forget.

00:06:37
Do you have reachability? Yeah, that’s right. Can you actually get to the peer address that you have configured in border gateway protocol? Great example is with our loopback peerings, right? So we have a loopback 0 on R1, and we want to peer with the loopback 0 of R2. Probably a pretty good idea to test if you have basic TCP/IP reachability between those loopbacks, because if you do not, of course, border gateway protocol will not meet with a joyous relationship.

00:07:13
Now something else that we want to give close inspection to, pretty easy to mess it up, believe me, especially if we are in an exam type of an environment and we must configure border gateway protocol very quickly, more quickly than we might do in the real world.

00:07:30
Really take a look at your neighbor statement, right? Remember, your neighbor statement is going to include the IP address that you are wanting to peer with. And your neighbor statement is also going to include the AS number that you are attempting to peer with.

00:07:50
Really pause for a moment as you enter these numbers and ensure that both the peer IP address is correct, as well as the peer AS number. Obviously, in the case of an iBGP peering, this number is going to match your autonomous system, locally, on the device.

00:08:11
In the case of an eBGP peering, it is going to differ. And something else that we need to be aware of is timers. That’s right. We can specify a particular interval that your BGP speaker will send its keepalives. And as you might guess, there could be an interval that is configured for the hold time, how long your BGP speaker cannot hear a keepalive from its neighbor before it declares that neighbor dead.

00:08:42
You can have varying values here between two BGP speakers. And of course, if they agree on some value that’s too low for some maybe unstable environment that you’re in, obviously that could destroy a neighborship when maybe you wanted a more liberal value, a less conservative value for the devices for their hold time.

00:09:09
Something else to be aware of is as of 12.x of the IOS code, we can put in a minimum hold time. And this can be done on a neighbor by neighbor basis. The idea behind the minimum hold time is your router won’t form an adjacency with a particular speaker out there if they do not feature a hold time of at least some minimum value.

00:09:36
You could see Cisco having a lot of fun in a certification exam with this particular setting. Now, by the way, if this were to be set and your speaker doesn’t feature the minimum hold time configuration required, there would be a console message indicating this fact.

00:09:55
And one game that Cisco loves to play in a certification environment is they love to suppress those messages from showing up. So that would be a tricky one to track down, certainly worth you looking at timer values that are set both globally under the process as well as on a neighbor by neighbor basis.
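The timer settings discussed above can be sketched as follows. The values (10, 30, 20) and the neighbor address are assumptions chosen for illustration; the argument order is keepalive, hold time, then the optional minimum hold time.

```text
R1(config)# router bgp 65120
! Process-wide defaults: keepalive 10s, hold time 30s
R1(config-router)# timers bgp 10 30
! Per neighbor: keepalive 10s, hold time 30s, and refuse the peering
! if the neighbor proposes a hold time below 20s
R1(config-router)# neighbor 2.2.2.2 timers 10 30 20
```

Since per-neighbor values can differ from the process-wide ones, both places are worth checking when a peering will not come up.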

00:10:16
Now as we discussed in our review, watch out for the update-source command. Remember, the update-source command is going to need to be configured when we are peering from those loopback addresses. Because a packet would never naturally be sourced from a loopback address.

00:10:37
So update-source tells our neighbor, look, have no fear, expect packets to come from me, from a loopback address, and go ahead and form the peering. In the case of eBGP, as we discussed, there’s that assumption of direct connectivity. In order to get around that with peering from the loopbacks, we use the command ebgp-multihop.

00:11:05
Yeah, the ebgp-multihop command allows you to put in like three or four or 10. It allows you to put in a value that indicates to eBGP that there will, indeed, be multiple hops, and that’s OK in order to get to my eBGP peer. A lot of students will get lazy with this, and they’ll set it to its maximum value of 254. I think this is dangerous.

00:11:37
I think that the ebgp-multihop setting should be set to the realistic number of multiple hops that it’s going to require to get to your peer. And finally, we remember that border gateway protocol does, indeed, support MD5 authentication. And this configuration is amazingly simple, especially if you compare it to something like the key chain approach that would be used in EIGRP.
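For the multihop recommendation above, a hedged sketch of an eBGP peering between loopbacks on R2. The hop count of 2 is an assumption that fits two routers on a shared segment peering loopback to loopback; the addresses and AS numbers are the ones used in this Nugget's topology.

```text
R2(config)# router bgp 65120
R2(config-router)# neighbor 3.3.3.3 remote-as 65300
R2(config-router)# neighbor 3.3.3.3 update-source Loopback0
! Allow the eBGP peer to be up to 2 hops away instead of directly connected
R2(config-router)# neighbor 3.3.3.3 ebgp-multihop 2
```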

00:12:09
In border gateway protocol, the MD5 authentication is done very simply with a neighbor password type of command. And obviously, this is quite simple to configure. But something to watch out for, in fact, Keith will demonstrate this for us, is that Cisco’s adaptive security appliances will not permit this particular option, by default.

00:12:36
So when we have an ASA between our border gateway protocol speakers that want to do authentication, the ASA, in its default configuration, will indeed kill this neighborship. Speaking of Keith, and speaking of his devilishly fun demonstrations, Keith, please take over right now, and bring this to life at the command line for us.

00:13:00
Hey, thanks Anthony. I would love to. One of the challenges of troubleshooting is sometimes we didn’t put the networks in place. So what I thought would be fun for us, today, would be to put in the BGP configurations from scratch on each of these routers R1, R2, and R3. And then we can’t blame the BGP config on anybody else, because we’ll have done the whole thing together.

00:13:22
So here on R1, we’re simply going to make one neighborship over with R2. I’m including the update source as loopback 0, which is 1.1.1.1. And this configuration on R1, I think, is complete. Let’s head over to R2 and configure it. Now here on R2, we’re going to have two peers. We’re going to have an internal peer with R1, and we’re going to have an external peer with R3. So we’re going to use autonomous system 65120. I’m going to go ahead and say redistribute connected so we have something to share with both of our neighbors.

00:13:53
And then I’m going to specify neighbor 1.1.1.1. I’ll include their same autonomous system as we’re using, because he’s the same AS as us. And we’ll also specify that we’re going to update source, meaning we’re going to go ahead and send those BGP requests for connection from our source IP address of loopback 0. Then we’ll set the same treatment for R3. Except for this time, because it’s an external BGP neighbor, we’re going to include the appropriate AS for that remote system.

00:14:18
And I think we’re done with R2. So we make a road trip over to R3, and on R3, we simply configure router BGP, the autonomous system number. We’ll throw in all the connected networks, so BGP has something to play with. And then we’ll specify the neighbor over at 2.2.2.2. We’ll specify the remote neighbor’s autonomous system number, which is different than ours, which makes it an external BGP neighborship.

00:14:41
And we’re also going to set up authentication with a password of Cisco. You might notice that I’ve got the console logging disabled on each of these routers. And that gives you and I the opportunity to troubleshoot by focusing on the output of the commands that we issue.
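Based only on the narration, the starting configurations might look roughly like the sketch below. Anything not spoken aloud is an assumption and is marked as such in the comments, and the deliberate problems that the rest of the demo uncovers are left in place.

```text
! --- R1 (AS 65120) ---
router bgp 65120
 ! Assumed: R1 peers with R2's loopback, 2.2.2.2
 neighbor 2.2.2.2 remote-as 65120
 neighbor 2.2.2.2 update-source Loopback0

! --- R2 (AS 65120) ---
router bgp 65120
 redistribute connected
 neighbor 1.1.1.1 remote-as 65120
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 3.3.3.3 remote-as 65300
 neighbor 3.3.3.3 update-source Loopback0

! --- R3 (AS 65300) ---
router bgp 65300
 redistribute connected
 neighbor 2.2.2.2 remote-as 65120
 ! Password as spoken in the video; the exact case is an assumption
 neighbor 2.2.2.2 password cisco
 ! No update-source is mentioned for R3 in the narration
```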

00:14:58
So let’s, you and I, head over to router two. It’s in the middle. I mean, that’s a great place to start troubleshooting, divide and conquer, and all that. So let’s go ahead and just do a quick, basic verification regarding our BGP neighbors with show IP BGP summary.

00:15:11
And this is saying that the current state is idle. You know what, that doesn’t sound too good. Now since this says this connection has never been up, this isn’t due to a keepalive being missed or anything like that. Because we’ve never seen that neighbor before, based on this output, we’re in the idle state, because we cannot establish a TCP connection with that peer.

00:15:34
So I’m thinking about something that Anthony said regarding reachability. We want to make sure that we have basic IP connectivity and routes to a given destination. That’s really, really important. And TCP makes its initial connection from a client to a server to the destination port of TCP 179. So let’s do this.

00:15:52
Here on R2, let’s just verify that we have a route to the IP address of 1.1.1.1. We’ll start with that neighborship first, and then we’ll work our way over to R3. And this says, hey, I have no idea how to get to 1.1.1.1. There’s not a default route, anything else on this router that’s telling me how to get there.

00:16:11
That’s what this output means to us. Now fortunately, you and I know how to add a static route. So in this case, we need to make sure we have a route to that 1.1.1.1 address, so let’s go ahead and put one in. So we’ll simply do an IP route in configuration mode to get to 1.1.1.1. We’ll put a 32-bit mask on that. The next hop is 10.0.0.1, which is R1’s ethernet address on a directly connected network.

00:16:37
Now that we have this static route in place, let’s go ahead and do a couple things. I’m going to do a clear IP BGP asterisk. I mean, we don’t have any neighbors or didn’t a moment ago, so we don’t need to worry about a whole bunch of neighborships. And we’ll do a ping of 1.1.1.1, source it from our loopback 0, which is 2.2.2.2, and that is successful. So now, at least we have IP connectivity.
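Reconstructed from the narration, the sequence on R2 for the R1 peering is approximately:

```text
! Confirm there is no route to the peer address
R2# show ip route 1.1.1.1
R2# configure terminal
! Host route to R1's loopback via R1's directly connected Ethernet address
R2(config)# ip route 1.1.1.1 255.255.255.255 10.0.0.1
R2(config)# end
! Reset all BGP sessions (disruptive on a production router)
R2# clear ip bgp *
! Test reachability loopback to loopback
R2# ping 1.1.1.1 source loopback 0
```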

00:16:59
Let’s go ahead and verify whether or not the BGP neighborship is now coming up. To verify that, we’ll do a show IP BGP summary. And if that looks good, which it does– so we have two prefixes that we’ve received from neighbor 1.1.1.1. Somebody told me once that whenever you have a success in troubleshooting, you may not want to lay out on the beach of that success for too long, because there’s other problems that we need to deal with.

00:17:24
So we will say that this, right here, the fact that we have two prefixes we’ve received from R1, is a great sign that it’s a great neighborship. Let’s move on to troubleshooting R2 to the external BGP neighbor of R3. So on R2, let’s just verify that we have a route to R3, as well, because we didn’t have a route to R1. If that’s the case with R3’s 3.3.3.3 network address, that’s a problem.

00:17:50
So we don’t have a route there. We need to go ahead and add one. So let’s go ahead and add a static route to reach 3.3.3.3. The next hop is 23.0.0.3. And that’s the ethernet address on R3. Let’s do a quick ping just to verify that we can reach that IP address.

00:18:07
So we’ll do a ping of 3.3.3.3. We’ll source it from our loopback 0 interface, which is 2.2.2.2. And wow, that’s great. We have success. We have connectivity. And that’s a huge first step. So let’s take a look and see if our BGP is doing OK between R2 and R3. So we’ll do a show IP BGP summary.

00:18:26
Now check this out. We were a neighbor with R1, and we had prefixes that we learned, and they are currently gone. And it looks like we don’t have success with R3, as well. So my friend, our troubleshooting work is not quite done. Now, one of the temptations that I have is to jump back to R1 and find out what happened to him. However, since we’re working between R2 and R3, let’s finish that train of thought first.

00:18:50
And then we’ll go back and take a look and see what’s up between R1 and R2. So on R2, let’s do a show IP BGP neighbor for neighbor 3.3.3.3 to get a better look at the details of that neighborship or what we hope to be a neighborship. So I’m going to use the space bar and advance the page so we can see the rest of the output.

00:19:10
And I’m noticing some interesting details. First of all, it says right here that we do have a route to 3.3.3.3. That’s great. It’s also showing that the external BGP neighbor, because that neighbor is not the same AS as we are, it says it’s not directly connected, which BGP, by default, is expecting it to be.

00:19:29
So what that’s screaming out is that we need to add that eBGP multihop option. Otherwise, these external BGP neighbors trying to peer over interfaces that are not directly connected to each other is not going to fly. Also, they’re seeing each other as their loopback 0 addresses. That’s how they’re configured.

00:19:46
So R2 is pointing to R3’s loopback, and R3’s pointing to R2’s loopback. By default, they’re sourcing the packets from their physical interfaces, on the edge. So to make all this work, we’d want to include the multihop, because they are not directly connected.

00:20:02
And secondly, we need to specify the update-source as loopback 0’s, so when the requests come in from one side to the other, the receiving side can say oh, here’s an inbound connection coming from, and let’s say it’s R2 making the initial request, R3 can say, oh here comes a request from 2.2.2.2. Oh, that IP address is listed as a neighbor, I’ll go ahead and become a neighbor with him, or at least attempt to become a neighbor with him via BGP.

00:20:30
So let’s fix those two items, the external BGP multihop option, as well as the loopback update-source on both routers, just to make sure they’re set. So here on R2, we’ll go ahead and add both those items. Now, one of the things that people will often say is, Keith, you know with an external BGP neighbor, we may or may not be actually peering with the loopback of that neighbor.

00:20:50
And the reality is, so what? In an environment where you’re being tested on troubleshooting or validating BGP, if we’re given a scenario and we need to make it work, we need to make sure we understand all the tools and all the pieces to make those pieces work together.

00:21:05
So in this case, on R2, I’m going to do the eBGP multihop option saying it’s OK to be two hops away. And I’m specifying the update-source as loopback 0 on R2, as we peer with a neighbor, R3. We’ll also go over to R3 and apply the equivalent commands from R3’s perspective. So here, on R3, we’ll go into configuration mode for BGP 65300. And we’ll say hey, for that neighborship over with R2, we need to do multihop.

00:21:33
And we’re also going to specify an update-source of our loopback 0, as we communicate with the peer or the neighbor, 2.2.2.2. And that way if R3 initiates the connections, the source IP address will be 3.3.3.3, which will match the neighbor statement inside of R2, which again, should make for a happy scenario.
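
A minimal sketch of the two changes on each router follows, assuming standard IOS syntax. R2's local AS number is not stated in this part of the transcript, so it appears as a placeholder, and the hop count of 2 on R3 is an assumption that mirrors what was described for R2.

```
! R2 (replace <R2-AS> with R2's actual AS number, which is not given here)
router bgp <R2-AS>
 ! Allow the eBGP peer to be up to two hops away
 neighbor 3.3.3.3 ebgp-multihop 2
 ! Source the TCP session from 2.2.2.2 so it matches R3's neighbor statement
 neighbor 3.3.3.3 update-source Loopback0
!
! R3
router bgp 65300
 neighbor 2.2.2.2 ebgp-multihop 2
 ! Source the TCP session from 3.3.3.3 so it matches R2's neighbor statement
 neighbor 2.2.2.2 update-source Loopback0
```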

00:21:52
So next, let’s just go ahead, and I’m going to turn on logging. Because I want to share with you something that might be tricky to identify without logging enabled. And it would be helpful if I had turned on that logging to the console from configuration mode, so it works.

00:22:09
So now we’re logging to the console. And I want you to notice what’s coming in. Earlier, we configured authentication using the password Cisco on both sides. However, R3 seems to be complaining about no MD5 digest coming in from R2. And let’s do this. Let me go ahead and do a no logging console, now that we see that message.

00:22:29
And this also indicates something interesting. This is showing us that 2.2.2.2, which is R2, is using this TCP port. And R3 is using this TCP port. And what that implies is that R2 initiated this connection. It grabbed a high numbered port that wasn’t in use and sent a connection request over to 3.3.3.3 at the well known port of 179. So in this example, R3 is acting as the BGP server, and R2 is acting as the client. And you know what? I have a little secret to share with you.

00:23:01
I could have predicted that, because in this topology, there is an ASA sitting between R2 and R3, and it’s in transparent mode. So it looks and feels like a bridge with an attitude. This is the inside interface, and this is the outside interface. Now one of the things that Anthony mentioned is that inside of BGP, if you’re going through an adaptive security appliance, and you’re doing authentication, the ASA has a little problem with option 19, specifically TCP option 19. And that’s what’s being used as part of the BGP authentication method.

00:23:38
So as the ASA strips that out, that MD5 authentication is not going to work successfully between R2 and R3. So to fix that, we go to the ASA or whatever your firewall is and say, stop it! And to do that on an ASA, we’re going to create a class map and a TCP map and then apply it to a global policy.

00:23:57
But the key element I want you to know is that the ASA eats TCP option 19 for lunch. It does not let it pass by default. We’ll tell it that option 19 is perfectly fine. Do not strip it out. Let it go. So let’s make a road trip over to the adaptive security appliance, where logging also is not enabled.

00:24:17
And let’s go into configuration mode, and I’m going to create an access list. In effect what I’m going to do is make an access list that says, for any TCP traffic that is destined to or from BGP– so in a nutshell, the access list is looking for and trying to match on BGP traffic.

00:24:34
The class map says, you know what, I want to use that access list to identify traffic. The TCP map is saying, you know what, option range 19 through 19– that’s going to be perfectly OK. That’s what the TCP map is doing. And then we apply that to the global policy by simply saying for this class, BGP MD5 class map, we want to go ahead, and we want to disable the random sequence number for initial TCP sessions.

00:24:58
And we want to go ahead and apply the TCP map of BGP MD5 option allow, that we just created a moment ago. Now, having done all that, what that should do is allow the BGP authentication to work. In troubleshooting, what we could also have done is temporarily disable the BGP authentication between R2 and R3. And if that was the problem, the neighborship should come right up without the authentication applied and just as a temporary test.
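
The ASA pieces described above (access list, class map, TCP map, global policy) can be sketched roughly as follows. This is a sketch under assumptions: the access-list name is hypothetical, the class-map and TCP-map names are written the way they are spoken in the video, and the `tcp-options range` syntax is the older form, so newer ASA software may use different keywords.

```
! ASA (sketch). BGP-MD5-ACL is a hypothetical name.
! Match TCP traffic to or from the BGP port (179)
access-list BGP-MD5-ACL extended permit tcp any any eq bgp
access-list BGP-MD5-ACL extended permit tcp any eq bgp any
!
! Allow TCP option 19 (used for BGP MD5 authentication) instead of clearing it
tcp-map BGP-MD5-OPTION-ALLOW
 tcp-options range 19 19 allow
!
! Use the access list to identify the traffic
class-map BGP-MD5-CLASSMAP
 match access-list BGP-MD5-ACL
!
! Apply both settings in the global policy
policy-map global_policy
 class BGP-MD5-CLASSMAP
  set connection random-sequence-number disable
  set connection advanced-options BGP-MD5-OPTION-ALLOW
```

Sequence number randomization is disabled because the MD5 digest covers the TCP header, so a rewritten sequence number would invalidate it.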

00:25:26
Another thing that’ll kill the BGP authentication is network address translation. If we’re doing NAT between R2 and R3, and they see each other as different IP addresses because of it, that also will break the MD5 authentication. So let’s make a road trip over to R2, just for a moment. Let’s do a clear IP BGP for that peer of 3.3.3.3. And let’s do another show command just to see whether or not the changes we’ve made are now causing it to come up.

00:25:53
So it appears, check this out, we have two prefixes from both peers. [LAUGHTER] I honestly did nothing between R2 and R1. I have not touched it. I’ve stayed with you between R2 and R3 the entire process. But it appears that whatever it was that was working its way through between R1 and R2 has resolved itself.

00:26:16
So perhaps we just needed to give it a moment. So our neighborship with R1 and R2, based on the prefixes that we’ve received– and again, even if this had said zero prefixes received, that’s a very good sign. Because it means we have an active, working BGP relationship with that neighbor.

00:26:32
So in this Nugget, we’ve taken a look at the reachability, making sure we had connectivity between the peers, making sure also we had the neighbor statements correct, along with the update-source. We had to manipulate the external BGP multihop option between R2 and R3. And we had an MD5 authentication challenge because of a firewall that was hiding out in transparent mode between R2 and R3. I have had a blast spending this time with you in troubleshooting BGP neighborships.

Where are My BGP Routes?!?!

00:00:00
You’ve ensured that your BGP neighborship’s working great. You’ve used the network command, you believe, appropriately. Yet, prefixes just aren’t showing up in the IP routing or forwarding table. In this CBT Nugget, Keith Barker and myself will walk you through the step by step procedure that’s so critical for solving what can be a very tricky issue.

00:00:29
In this Nugget, we’re going to, specifically, take guidance from Cisco Systems on troubleshooting missing BGP prefixes from the routing table. And Keith and I are going to improve upon this step by step. As we go through our new and improved step by step, we’ll review the key concepts of BGP’s operation.

00:00:51
And, of course, a demo is worth a thousand words, as we know. And we will examine these different scenarios, live, at the command line. Now, before we dig in too deep here, a quick reminder. Remember, there is a distinction in BGP between the BGP table and the routing table.

00:01:12
That’s right. So our inspection– our step by step of this problem– begins with the BGP table. Are the particular prefixes that we expect in the BGP table? If they’re not in the BGP table, there is absolutely no way they’re going to make it to that important routing table and actually have packets forwarded based on their information.

00:01:39
Now, again, by way of review, what is the BGP table? Well, it is a table that consists of all of the BGP prefixes that have been learned from all of our BGP neighbors. Yeah, it’s like the topology table that we have in EIGRP. It’s like the OSPF database that we have in, of course, OSPF.

00:02:02
It’s only prefixes that meet certain criteria that can be moved from the BGP table to the route table. It’s only those BGP prefixes that are marked as best routes that make it over to that routing table. And we’ll review some of this key criteria as we move throughout this Nugget.

00:02:23
So our very first point here in our step by step troubleshooting process for missing routes in our routing table that are from BGP is going to be an inspection of the BGP table. How do we do this? The command show IP BGP. That’s right. Show IP BGP is going to show us our table.

00:02:51
And this is our first decision point because how we progress in our troubleshooting is going to be determined by whether the routes ever made it into this table to begin with. Let’s go ahead and begin our discussion by making the assumption that we run show IP BGP and they are not there.

00:03:15
The prefixes that we would expect to see are not in that BGP table. No need to panic. In this scenario, you’ve just gained valuable information. And you can begin these three important troubleshooting steps in order. Step one, you’re going to do show IP BGP summary.

00:03:37
You know what you’re checking here because you’ve covered our Nugget already on troubleshooting BGP neighbor relationships. Yeah, we are checking to see if we have a neighbor. It’s been scientifically proven that if you don’t have a BGP neighbor, you won’t get prefixes from that neighbor.

00:03:56
So let’s ensure we have our neighborship. If we do not, you want to revisit our Nugget on troubleshooting those particular neighborships. OK, our neighbor’s there. Great. Step two, we’re going to do show IP BGP neighbor– the IP address of our neighbor– and then the routes keyword.

00:04:17
We want to see if we are getting any prefixes from that particular neighbor. And what are those prefixes? This is information for us to go ahead and do step three. We’re going to go ahead and check any filtering. Seeing what we’re getting from our neighbor will help us determine if there’s filtering outbound and filtering inbound on either of these devices.

00:04:46
It will help us determine what may have gone wrong with the filter. Something else that we can check, based on this important information, is the use of the network command on the appropriate advertising device. We remember that the network command can be somewhat tricky.

00:05:06
Especially if you’re not used to it. The network command must reference the actual prefix that is in the IP routing table of the device that intends to advertise that prefix. For instance, if we had a prefix in there of 192.168.17.0 with a 24-bit mask, we need to make sure the network statement reflects that appropriately.

00:05:35
I suppose a better example would be a case where we’re, actually, doing a crazy mask, right? Like 10.10.10.0/24. This mask is not the classful mask for this particular address space. And the network statement must reflect this. So if we were lazy and we went in and did a network statement of just network 10.0.0.0, this is not going to, actually, advertise the prefix.
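
As a small illustration of that point, assuming standard IOS syntax and a placeholder AS number, the network statement needs the explicit mask to match the /24 that is actually in the routing table:

```
router bgp <local-AS>
 ! Matches only if 10.10.10.0/24 is present in the IP routing table
 network 10.10.10.0 mask 255.255.255.0
 !
 ! The "lazy" form below would look for the classful 10.0.0.0/8 instead,
 ! so it would not advertise 10.10.10.0/24:
 !   network 10.0.0.0
```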

00:06:09
So here is our simple step by step process for what appears to be a nightmare of a situation. We have no prefixes in the BGP table. Look, this is not a terrible situation because we have this step by step for you that can easily troubleshoot that scenario.
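
The checks for the "not in the BGP table" case can be sketched as the following command sequence. The neighbor address is a placeholder, and using `show running-config` to look for filters and network statements is one reasonable option rather than the only one.

```
! Initial check: is the prefix in the BGP table at all?
show ip bgp
! Step 1: do we have a working neighbor?
show ip bgp summary
! Step 2: what are we learning from that neighbor?
show ip bgp neighbors <neighbor-ip> routes
! Step 3: look for filters (and, on the advertising router, the network statements)
show running-config | section router bgp
```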

00:06:31
And now it’s time to turn our attention to our initial check. And we learn that the prefix is in the BGP table. Hmm. I think there was a ’90s song, “Things That Make You Go Hmmm.” This is definitely one of those moments, right? And this seems like it would be much, much more complex.

00:06:53
But it’s actually not. Join me as we turn to the internet for assistance with this particular situation. Our Google search, here, is a simple one. We use site:cisco.com in order to restrict our search to just cisco.com. And we do BGP best path. I remember when I was studying for my CCIE lab practical exam in the area of routing and switching.

00:07:20
And BGP was a real weakness of mine. This was a document I spent a lot of time with. It, literally, spells out the very complex BGP Best Path Selection Algorithm. It’s interesting in our case here though because we’re not really going to be interested in that Best Path Selection Algorithm.

00:07:40
We’re interested in a section that comes before that. Let’s choose the very first link. The BGP Best Path selection algorithm document. And right here, why routers ignore paths. What does that mean? Why routers would have a prefix in the BGP table and ignore it from a routing table scenario.

00:08:07
My goodness, that’s exactly what we’re interested in. These are the last steps in our troubleshooting process, here, for this category. They’re in the BGP table. But they’re being ignored from a routing table perspective. First up, if it’s an iBGP-learned path, which you can easily see in the show IP BGP results, and we are not synchronized, then it’s going to be ignored.

00:08:35
Now, synchronization has been disabled by default for a long time in IOS software. But realize that someone could’ve inadvertently, or maliciously, turned it on. A great example of someone maliciously turning it on is an exam author. Yeah, sure. So paths are not synchronized.

00:08:57
Remember what synchronization means. The BGP process will ignore the prefix if it doesn’t also exist in the underlying IGP. So this could be our issue. How about the Next Hop is inaccessible? Sure. If BGP sees that it can’t get to the Next Hop reported for that particular prefix, it will ignore it.

00:09:24
It will not insert it into the routing table. Paths from an EBGP neighbor that have a local autonomous system in the AS path. Ah ha. We know this is an important check. So this is, in fact, a loop prevention mechanism. And that might be the reason why it’s being ignored.

00:09:45
Next, if you enabled BGP enforce-first-AS and the update does not contain the AS of the neighbor as the first AS number in the AS sequence, the path is ignored. This is, obviously, a rare corner case type scenario. But these are the things that exam authors love to bury in the configuration to really try and mess you up.

00:10:11
And finally, paths that are marked as received-only in the show IP BGP longer-prefixes output. What’s happening here is that you have a filtering policy that’s rejected the paths. But the router still stored them, so they show up in the show IP BGP output, because you’ve configured soft reconfiguration inbound for the neighbor that’s sending the path.

00:10:37
So there we have it, folks. From the, kind of, obscure to the much more common. Next Hop inaccessible, right? A loop prevention problem. A synchronization problem. This is a comprehensive list for us of why the paths are ignored. And, of course, once paths meet all those criteria and they’re no longer ignored, then the best path algorithm can work its magic.
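
One way to work through that list at the command line is sketched below. The prefix, mask, and next-hop values are placeholders, and which command to use for each check is a suggestion rather than something taken from the Cisco document.

```
! Per-prefix detail: lists each path, its next hop, and whether it was chosen as best
show ip bgp <prefix>
! Is the next hop for that path reachable in the routing table?
show ip route <next-hop>
! Look for synchronization, bgp enforce-first-as, and soft-reconfiguration inbound
show running-config | section router bgp
! Paths stored only because of soft reconfiguration appear here as received-only
show ip bgp <prefix> <mask> longer-prefixes
```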

00:11:08
So, Keith, it was my intent to go ahead and make the seemingly complex very simple for us in this Nugget. And I wonder if you wouldn’t mind bringing us to the command line to show us this particular process live with actual routers, actual BGP processes, and, yes, actual prefixes.

00:11:30
Hey, thanks Anthony. I would love to. As far as the actual routers we’re going to be using, let’s use these four right here. Each router has a loopback zero interface. And that interface address is the router number. So R1 is 1.1.1.1, and R2 is 2.2.2.2 and so forth. Also, the last octet on these Ethernet interfaces also matches the router.

00:11:50
So on the 10 network, R1 is going to be .1, R2 is going to be .2 and so forth. So our mission in this troubleshooting exercise is to go through and identify what the problems are and correct them. And as we do, we’ll reinforce all those concepts that Anthony just shared with us.

00:12:05
So let’s start off on R1 and just validate what’s in its BGP table. So we can really easily do that on any of these routers. The simple command show IP BGP just to see what’s in there. Now, at the end of the day when everything’s working, the BGP table should have every single loopback in it.

00:12:23
And the routing table should also have every single loopback in it. Now, unfortunately, as we look at R1, there’s something missing. Now, what’s missing on R1 in the BGP table is that there is no 1.1.1.1. Now because Mr. R1 owns that route or is supposed to own that route, if it doesn’t make it into BGP, we are definitely not going to be sharing it via BGP with anybody else.

00:12:45
So, together, let’s just validate that that IP address really does exist on R1 to begin with. And then we can start investigating why it’s not making it into BGP. I’m going to do a show IP interface brief. And I’ll say exclude unassigned. And that way, we don’t have to look at any other interfaces that don’t have IP addresses.

00:13:03
And what this says is that loopback0 does indeed have an IP address of 1.1.1.1. But what I think we ought to do is we ought to also verify what the actual mask is. That way we know what we’re dealing with. Is it a slash 24? Or is it a slash 32? And we can easily see that with the show IP interface.

00:13:20
And we’ll lock it down to loopback zero. And we’ll restrict the output so we’re only seeing lines that include the word internet address. So there’s the address. So it’s 1.1.1.1/32. Now, the challenge is how do we get that into BGP because, right now, it’s not there.
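
The two verification commands described here would look roughly like this on R1, assuming standard IOS output filtering:

```
! Show only interfaces that have an IP address assigned
show ip interface brief | exclude unassigned
! Show just the address and mask line for loopback 0
show ip interface loopback 0 | include Internet address
```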

00:13:37
So there’s lots of different ways of getting a route into BGP. We could do a redistribute command. We can use a network statement that includes it as long as we get the mask correct with that network statement. So let’s just take a look at the router configuration section of R1 regarding BGP. And see if the network statement is correct to include the network 1.1.1.1. And it appears, based on results, that we have no network statements here, nor do we have any redistribute statements.

00:14:06
And that would explain why the poor network 1.1.1.1 is not making it into BGP. So it looks like a misconfiguration of BGP. So we’ll, simply, add that network in. So we’ll go into configuration mode. We’ll go into router BGP for autonomous system 12. And we’ll put a network statement of 1.1.1.1. We’ll put in the appropriate mask of 32 bits. And that should, in just a moment, add that into BGP because we already saw that we have that IP address with that mask.

00:14:39
This network statement should bring it in. So how would we validate that? What I like to do is, when we do a show command and then make a change, we go back to that same show command looking for the difference. So let’s do a show IP BGP. And that does look a lot better.
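
A minimal sketch of the check, the fix, and the re-check on R1, assuming standard IOS syntax:

```
! Before: confirm there are no network or redistribute statements
show running-config | section router bgp
!
configure terminal
 router bgp 12
  ! The mask must match the /32 seen on loopback 0
  network 1.1.1.1 mask 255.255.255.255
  end
!
! After: repeat the same show command and look for the difference
show ip bgp
```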

00:14:54
So now we’ve got this network 1.1.1.1. We’ll notice that greater than symbol that indicates that’s the best route and usable, which is fantastic. The Next Hop is 0.0.0.0, which means it’s our self. The next challenge we have is what about the other prefixes? We should have 3.3.3.3 and 4.4.4.4 all in the BGP table. And we don’t have it all currently.

00:15:16
So while we’re here at R1, let’s just take a quick look and see if there’s any hard coded filters that would be preventing us from receiving those routes that R2 may be passing along to us. So to look for filtering, we could do a show IP protocols. And I’m going to filter the output so we only see the part of show IP protocols related to BGP and any other IGPs that might be running.

00:15:39
So as we look at this, as far as the neighbor, we have 10.0.0.2. That’s R2’s address. And regarding all of these, it appears that nothing is set. And no filter lists, inbound or outbound, are set. So the question here about the missing routes is this: is R2 not sending them to us? Or is he sending them to us and we’re filtering them? It doesn’t appear that we’re doing any filtering inbound.

00:16:00
Let’s ask R1, hey, can you tell me about the routes you’re receiving from R2? And he’ll tell us. Let’s do a show IP BGP neighbors, the IP address for our peer, and the keyword received-routes. And it’s telling us, hey, you don’t have soft reconfiguration configured for the peer 10.0.0.2. Well, that’s no problem.

00:16:18
We can add it. So let’s go ahead, just for a moment, and add that feature. By default, you can see advertised routes. But in order to see received routes, you have to enable soft reconfiguration for the peer that you want to see them from. So we’ll go into router configuration mode.

00:16:33
We’ll specify, for that peer, we want to enable soft reconfiguration inbound. And now, that same command that we used a moment ago should now give us better results. So we’re just going to ask you, hey, what routes are you receiving from good old R2? Maybe he doesn’t have them to begin with.
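
Sketched as commands on R1, assuming standard IOS syntax. The final soft clear is an assumption that is not shown in the video; it may be needed on some software if the stored copy of received routes does not populate on its own.

```
! First attempt: IOS reports that inbound soft reconfiguration is not enabled
show ip bgp neighbors 10.0.0.2 received-routes
!
configure terminal
 router bgp 12
  ! Keep an unmodified copy of everything this peer sends us
  neighbor 10.0.0.2 soft-reconfiguration inbound
  end
!
! Second attempt: should now list the prefixes received from R2
show ip bgp neighbors 10.0.0.2 received-routes
!
! Assumption: only if the list stays empty
clear ip bgp 10.0.0.2 soft in
```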

00:16:47
Or maybe he’s sending them to us and we’re not accepting them for some reason. And it says the total number of prefixes is just one. I’m just learning 2.2.2.2 from R2. And that’s it. That’s all I’ve learned. So we could say, hey, we might be halfway done with our troubleshooting because R1 knows about 1.1.1.1 and 2.2.2.2. There’s only two more to go.

00:17:06
And maybe it really will be that easy. However, I don’t think so. Let’s continue. It doesn’t appear that we have any inbound filtering here on R1. So let’s move over to R2. And we’ll take a look at him next. So we’ll make a road trip over to R2. And on R2, let’s do a quick show IP BGP. And it appears that we have a couple of routes there in BGP.

00:17:27
We have the 1.1.1.1 and the 2.2.2.2. So we look at this guy right here, the 1.1.1.1. You notice this little greater than symbol for 2.2.2.2? It says it’s the best route. And notice how the 1.1.1.1 doesn’t have it. So even though R2 can see 1.1.1.1 is being advertised from R1, for some reason, R2 doesn’t like it. R2 knows how to get to the Next Hop. He’s directly connected to that network.

00:17:54
So we can rule out, for that one, a non-reachable Next Hop. So we’ll make a mental note that this network, right here, has some type of a problem in BGP where R2 really doesn’t like it. So we get to cherry pick, you and I, as we troubleshoot. Let’s do this.

00:18:08
Let’s focus on why we’re not seeing the prefixes of 3.3.3.3 and the 4.4.4.4. We’ll follow that to completion first. And then we’ll come back, and we’ll take a closer look at this 1.1.1.1. So regarding R2 and its neighborship over with our R3, let’s just go ahead, real quick, and check for any filters in the BGP process.

00:18:27
We’ll do a show IP protocols. Please show us just BGP. And here, I don’t see any filters that are set. And I don’t see any kind of filtering here either. So in this scenario, we could have a situation where R2 was getting the routes but filtering them out based on some type of filter that was applied.

00:18:45
Or it could be that R3 is not sending the routes to begin with. We really need to dig in to find who’s at fault here. Is it R2 that’s filtering, which it doesn’t look like based on this. Or is it R3 who’s not sending? We can use that same trick of asking, hey, please show me the received routes from your peer.

00:19:03
Like we did over on R1, we could do that here on R2 referring to the peer R3. So let’s do a show IP BGP neighbor, the IP address of R3, and the keyword received-routes, and see what we have. And true to form, R2 is saying, hey, you know my peer 23.0.0.3. Inbound soft reconfiguration is not enabled.

00:19:22
And if you want to see the received routes, we need to enable it. That’s no problem. We can do that again too here. So we’ll go into router configuration mode for BGP, and we’ll enable it for the peer 23.0.0.3. And then we should be able to issue the command show IP BGP neighbors looking for received routes.

00:19:40
So now as we issue that command, hopefully, we’ll see some routes that are being received from R3. Now, if we don’t see any received routes, huh and look at that. It says total number of prefixes received from that neighbor is 0. So we aren’t learning any routes.

00:19:55
So our next step is to go over to R3 and find out what’s going on. Is R3 not sending them? Does he have a filter set up of some type? So we’ll go ask him. So as we make our trip over to R3 and we consider what’s going on here, it could be several things. One is maybe he doesn’t even have the routes in the BGP table to share.

00:20:13
Or maybe he has them and doesn’t want to share. Or maybe he’s sharing them and, for some reason, R2, who says I’m not receiving any routes, isn’t correctly accepting them. So let’s take a look on R3. So it’s show IP BGP. And this says that he’s got the two, and three, and four.

00:20:30
They all have the greater than symbol next to them, which is a good sign. This little r right here is because R3 also learned 4.4.4.4 through an IGP. And the IGP has a better administrative distance. So in the routing table, we would see 4.4.4.4 through the IGP. In this case, on R3 and R4, they’re running IS-IS. And it wouldn’t be showing up as a BGP learned route.

00:20:52
But that’s perfectly fine. So let’s go ahead on R3 and ask him what are you sharing or sending over, as far as prefixes, over to R2? Are you really sending the 3.3.3.3 and the 4.4.4.4 networks? Please tell us. And to do that, we do a show IP BGP neighbors, the peer address of R2, and then say advertised-routes. And look at this.

00:21:15
Wow. So what this says is, yeah, I’m advertising these two prefixes over to R2. And his attitude is I sent them. What else do you want me to do? I’m sending them over. And for some reason, R2 is not picking up on those. Now, what we could do is we could take a look at show IP protocols.

00:21:32
But it doesn’t look like we’re filtering them because up here it said, hey, I’m sending them. So now the question is if R3 is sending them to R2 and R2 is not accepting them, why is that? Now, sometimes we have to dig into the nitty gritty detail in the output of a show IP BGP neighbor command.

00:21:51
And what I’d like to do on this one is we’re going to do a show IP BGP neighbors for R2. However, I don’t want to include all of the output. I’m going to say please only include lines that have the word map in it. And what we’d see in that output is that there is a route map applied for outgoing advertisements.

00:22:08
And the name of that route map is called tag-1-on. So even though we’re not doing filtering, per se, we are manipulating some traffic as it’s going out with this route map. Now, the question is what is going on in that route map? And that’s a great question.

00:22:24
Let’s take a look at that. To see the details of that specific route map called tag-1-on, we just do a show route-map, the name of that route map, and then we can look at the details. So any routes and any prefixes that are being advertised from R3 to R2 are going to be manipulated, or potentially modified, by this route map before they’re sent.
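
The three inspection commands on R3 can be sketched as follows. The peer address 23.0.0.2 is inferred from the topology description (the last octet matches the router number) and is not stated explicitly in the transcript.

```
! What is R3 sending to R2?
show ip bgp neighbors 23.0.0.2 advertised-routes
! Is any route map attached to this neighbor?
show ip bgp neighbors 23.0.0.2 | include map
! What does that route map do?
show route-map tag-1-on
```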

00:22:44
So in this route map, sequence number 10, there’s no match clause, which means everything that I send is going to have this applied to it. We are doing as-path prepending. We’re adding the AS numbers 12, 34, and 34. [LAUGHING] Oh, that’s so mean. Now, that could be done to help make the return path through that autonomous system not look as good.

00:23:06
And that would certainly be OK, normally. However, look at one of the autonomous systems that we’re prepending. It’s 12. So what is R2– just think about this– going to see when it receives a prefix, looks at the autonomous system path, and says, oh, in the path it says autonomous system 12? That’s us.

00:23:23
Anthony told us that one of the rules about receiving routes and accepting them is that we cannot see our own autonomous system number, by default, in that path and accept it. So poor R2. Those prefixes come in with its own AS number in the path, and for that reason, he is rejecting those prefixes.

00:23:42
So if we look at the running config for the BGP, it’s going to show us that, sure enough, that route map called tag-1-on is applied outbound to the neighbor R2. And because the route map is prepending R2’s own AS in the path, R2 is kicking those prefixes out. R2 is not happy about those prefixes and is rejecting them.
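
Based on that description, the relevant configuration on R3 would look roughly like this. It is a reconstruction, not a capture from the video: the `permit` action and the neighbor address 23.0.0.2 are assumptions, and R3's AS number is not stated in this Nugget, so the `router bgp` line is left as a placeholder.

```
! Sequence 10 has no match clause, so it applies to every advertised prefix
route-map tag-1-on permit 10
 ! AS 12 is R2's own AS, which triggers R2's loop prevention
 set as-path prepend 12 34 34
!
router bgp <R3-AS>
 neighbor 23.0.0.2 route-map tag-1-on out
```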

00:24:03
So we have a couple of options here. One, we could just remove the route map completely and take it out of the BGP configuration. Or we can modify the route map. Let’s go ahead and do that one. Let’s just tweak the route map. Let’s take out the autonomous system number 12, which was in sequence number 10 in that route map. And we’ll just put a sequence 5 in that says we want to go ahead and prepend 34 and 34. We should have a route map that has one set statement.

00:24:28
And that set statement should not prepend 12 as part of the AS path. And to verify it, let’s take a look. So here’s our sequence 5. We’re going to match everything because we didn’t put a match statement in. And we are now prepending just with 34 34. Now because we weren’t sharing anything of importance with R2 and this is a lab environment, I’m going to do something.

00:24:49
But in production, this could be fatal. I’m going to clear that BGP neighborship completely with R2. And that’ll force everything to come brand new. [LAUGHING] In a service provider environment, if you did this with a production peer, there could be a whole lot of rumbling going on.
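
A sketch of the fix on R3, assuming standard IOS syntax and the inferred peer address 23.0.0.2. The outbound soft clear in the final comment is an alternative that the video does not use; it is mentioned only because it avoids tearing down the session.

```
configure terminal
 ! New entry that prepends only AS 34
 route-map tag-1-on permit 5
  set as-path prepend 34 34
  exit
 ! Remove the old entry that prepended AS 12
 no route-map tag-1-on permit 10
 end
!
! Verify that only sequence 5 remains, with a single set statement
show route-map tag-1-on
!
! Hard reset of the session with R2 (lab only)
clear ip bgp 23.0.0.2
! Less disruptive alternative, not used in the video:
!   clear ip bgp 23.0.0.2 soft out
```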

00:25:06
So now that that’s done, let’s go over to R2. And let’s see if the prefixes of 3.3.3.3 and 4.4.4.4 correctly show up. So we’ll do a show IP BGP. Wow, that looks mostly good. So we have these guys. I still notice that we’re missing the greater than symbol there for 1.1.1.1. But we are getting 3.3.3.3 and 4.4.4.4 now that the prepended AS path no longer includes AS 12.

00:25:31
So let’s do this. Let’s ping– right here from R2– the loopback of R4 from our loopback. So if everybody’s got those loopbacks in their BGP and corresponding routing tables, that should have success. And look at that. The ping that didn’t make it– normally that’s the first one because of an ARP issue.

00:25:51
Let’s do that ping one more time. OK, so we have five out of five that are making it through. I’m just going to do that one more time just to be sure. All right. So it looks like we have really good connectivity between the loopback on R2 and the loopback on R4. And that is most of the way.

00:26:05
I mean, two to four? That’s almost the whole way from end to end. However, let’s go ahead and do a ping from R1 all the way to R4 and see if that’s going to fly. So we make our journey over to R1. And on R1 we’ll do a ping of 4.4.4.4. We’ll source it from our loopback zero.

00:26:21
And I’m going to put a repeat count of two. That way we don’t have to do a control shift six to break the sequence, or wait for five pings to time out. Neither one of those pings made it. So let’s just take a look at the routing table on R1 and see if we have the route to 4.4.4.4. That would be a good thing.

00:26:40
And we don’t. We don’t have routes to 3.3.3.3 or 4.4.4.4. It’s no wonder our ping didn’t make it. So let’s take a look at the BGP table. Anthony told us about the process about prefixes having to be in the BGP table before having any chance of making it to the routing table.

00:26:57
So as we look at this BGP table, it looks pretty exciting, at first, until we notice some of the details here that we don’t have this greater than symbol. Now, in a large BGP environment, we might have multiple possible paths to get to a destination. So maybe not every single one of those prefixes we learned is going to be considered the best.

00:27:15
But I’ll tell you what, one of them needs to be the best if we want a possibility of that making it into the routing table. Another really important aspect that Anthony told us is that a route will not be considered usable if we cannot reach the Next Hop.

00:27:29
So these two routes, the 3.3.3.3 and the 4.4.4.4, it shows the Next Hop as 23.0.0.3, which is the IP address of R3. As prefixes are advertised into a different autonomous system, by default, the Next Hop for that prefix will be the IP address of the advertising EBGP router, which was R3, who’s passing along these two routes. So our question is, does R1 know how to reach that Next Hop? So let’s do a show IP route just to see whether or not we can reach 23.0.0.3. And we could have looked up here as well and said, oops, it’s not there.

00:28:07
And this one right here just confirms it that R1 has no clue how to reach 23.0.0.3. And that is a very good reason why that wouldn’t be considered a best route in the BGP table. So here’s what we’re going to do. We’re going to go have a conversation with R2 and say, hey, you know what R2? Here’s what we want you to do.

00:28:25
Any time you’re going to send a prefix over to R1, we want to do a little manipulation first regarding the attribute of Next Hop. Instead of passing along R3’s IP address as the Next Hop, we want you to go ahead and change it to your own because good old R1 knows how to reach you. It’ll be great.

00:28:43
So that’s what we’re doing right here. We used the option Next Hop self, which manipulates the Next Hop attribute in the prefixes being advertised. And also because you and I are in a lab environment right now not in production, I’m going to clear that neighborship just to force everything to be fresh.
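The next hop check described above can be sketched in a few lines of Python. This is an illustrative model only: the routing table contents and the /24 mask on the R1 to R2 link are assumptions, and the function name is hypothetical.

```python
import ipaddress


def next_hop_reachable(next_hop: str, routing_table: list[str]) -> bool:
    """Return True if some prefix in the routing table covers the next hop address."""
    hop = ipaddress.ip_address(next_hop)
    return any(hop in ipaddress.ip_network(prefix) for prefix in routing_table)


# Hypothetical view of R1's routing table: its own loopback plus the link to R2.
r1_routes = ["1.1.1.1/32", "10.0.0.0/24"]

print(next_hop_reachable("23.0.0.3", r1_routes))  # False: R3's address is unknown to R1
print(next_hop_reachable("10.0.0.2", r1_routes))  # True: R2's address is directly connected
```

With the original next hop of 23.0.0.3 the check fails, so the prefix cannot be marked best. Once R2 rewrites the next hop to its own address, the same check passes.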

00:29:00
So now let’s take a road trip over to R1. It’s been a few moments. And let’s just validate whether or not that makes a difference. We’ll do a show IP BGP. We’ve got greater than symbols next to every single one of those routes in the BGP table, which is perfect.

00:29:14
Now, what we can do is let’s just go for a home run ping. So we’ll do a ping to the loopback of R4. Four. And let’s see if this one flies. And survey says, no. It’s not working. Let’s take a look at the routing table on R1 just to validate that it has a route to 4.4.4.4 because if he doesn’t, he wouldn’t be forwarding.

00:29:37
So let’s do a show IP route. And sure enough, right there, he’s got a route to the 4.4.4 network. Next Hop being 10.0.0.2, which is R2. So from a routing perspective, it looks like all the BGP networks are currently in the routing table of R1. However, I think somebody doesn’t have all the routes yet in the routing table.

00:29:58
One quick way of finding out who is failing is to go and do a trace route. Let’s just go ahead and do a trace route saying I want a trace route over to 4.4.4.4. We can source it from loopback 0. And we’ll find out who drops the ball. Is it R2? Is it R3? Is it R4? And in this case– [LAUGHING]– the very first Hop, R2, is not replying back to us. So I’m going to go ahead and hit Control Shift six to stop that madness.

00:30:28
So when we source that from 1.1.1.1, which we did, and it gets sent, the Next Hop is R2. The TTL was equal to one. When R2 gets it, it says, oh, sorry. So what R2 would do is kill it and, normally, would send a message back to the source saying I killed your packet.

00:30:45
We’re not getting that feedback here, which implies that R2 does not know how to reach 1.1.1.1. Or at least, that’s a possibility. Let’s go visit R2 and have a heart to heart chat. So a road trip over to R2. Let me give us a little visual separation here.

00:31:02
And let’s do a show IP BGP. And you’ll notice we have that same problem with the greater than symbol missing from the 1.1.1.1. Now, why is that? It sees it, but it’s not considering it to be a good BGP route or the best BGP route. Why is that? What would cause that? Let’s also just do a show IP route from the routing table on R2 to see what’s there. And I see something noticeably absent.

00:31:28
See, IOS 15, which is one of the features that I love, sorts routing tables numerically, which makes it a lot easier to find stuff. In IOS 12, that wasn’t the case. So at the top of the list, we do not have 1.1.1.1 in our routing table. Anthony mentioned that a long, long time ago in a galaxy far away, synchronization was on by default.

00:31:53
Unlike some very old versions of 12.x, the current versions of 12.x and 15 have synchronization off. But he did mention that somebody could turn it on. And from a BGP perspective, if synchronization is on and we don’t have that route that we’ve learned through our IGP as well, it’s not going to get the two thumbs up from BGP.
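As a rough Python sketch of the synchronization rule just described: this assumes the 1.1.1.1 prefix reaches R2 over an IBGP session from R1, and the IGP route set shown is invented for illustration. It models only this one condition, not the full best path algorithm.

```python
def passes_synchronization(prefix: str, via_ibgp: bool,
                           synchronization: bool, igp_routes: set[str]) -> bool:
    """With synchronization on, an IBGP-learned prefix must also be known through the IGP."""
    if synchronization and via_ibgp:
        return prefix in igp_routes
    return True


# Hypothetical IGP-learned routes on R2: the 1.1.1.1 loopback is not among them.
r2_igp_routes = {"10.0.0.0/24"}

print(passes_synchronization("1.1.1.1/32", True, True, r2_igp_routes))   # False: not in the IGP
print(passes_synchronization("1.1.1.1/32", True, False, r2_igp_routes))  # True: rule switched off
```

Both fixes mentioned in this video map onto the sketch: either add the prefix to the IGP so the membership test passes, or turn synchronization off so the test is never applied.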

00:32:14
And that appears to be exactly what’s happening here. So we could validate that. Let’s do a show IP protocols. We’ll focus just on BGP. And in the output here, let’s just take a look at synchronization. And sure enough, look at that. Someone– I have no idea who it was; could have been me, maybe– has turned on synchronization.

00:32:33
And that’s why that prefix in BGP is not considered to be the best path. And that’s causing grief in BGP for that 1.1.1.1 prefix. So we have a couple of options here. Number one, we could go ahead and make it part of our IGP. Maybe add that loopback in OSPF or whatever routing process is running.

00:32:51
Or we could go into BGP configuration mode and turn off the synchronization rule. And again, because we have the luxury in this lab environment of resetting a peer without consequences, we’re going to do that, as well. So let’s give that just a moment. And let’s see if that makes a difference.

00:33:08
We’ll do a show IP BGP once more just to see whether or not that prefix looks happy. And it does with that greater than symbol. So the next question is– yes or no– did it make it to the routing table? And let’s just do a quick check. We’ll do a show IP route just to validate whether or not it made it.

00:33:28
And I’m thinking that, if all the other conditions are met, it’s going to be there. Sure enough, there’s the 1.1.1.1 learned via BGP. So let’s do this, let’s make a road trip back over to R1. Let’s go ahead and clear off some space. And let’s do a trace route one more time.

00:33:44
And if there’s another problem, we’ll go ahead and attack it, as well. So we’ll do a trace route to 4.4.4.4 sourcing it from loopback 0. And sure enough, we have connectivity between the loopbacks of R1 and R4. We’d also want to take a moment and validate on all the routers that all the loopbacks have successfully propagated into the routing table and are reachable from each of the routers.

Foolproof Policy-Based Routing (PBR)

00:00:00
Ever found yourself wanting to bypass the normal way in which a router would route your traffic? There are many ways to do this in Cisco networking, and one of the primary methods is called policy-based routing. But what if you set it up and it doesn’t work as you would expect? Well, in this Nugget, Keith Barker and myself, Anthony Sequeira, we’re going to show you how to take this policy-based routing feature and make sure you get it right the first time.

00:00:27
Do you ever think about how strict and inflexible one of your Cisco routers is? Yeah, it’s going to have a routing table. And various routing protocols will put prefixes in this table. You, as an administrator, might even put static entries into this particular table.

00:00:46
And then the router will take a packet that comes in, it will look at the destination IP address, and it will do a scan of this routing table. We remember one of the things that it’s doing is trying to find a longest match entry in that routing table and then following the instructions in that routing table entry.

00:01:07
Either to send it to a specific next hop or just to go ahead and send the traffic out a specific interface. All of this is quite inflexible. Sure, we can vary whether it’s OSPF that’s believed, or RIP that’s believed, using administrative distance. And we can manipulate how the prefixes get in the routing table.

00:01:29
But once they’re in there, things get very inflexible. The router does its process of longest match and then finding that longest match. Obviously, if there is no specific match in the routing table, it will look for a default route. And it will go ahead and send the traffic per the default route to our gateway of last resort.
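The longest match lookup with a default route fallback can be sketched like this in Python. The table entries and next hop labels are hypothetical, and real routers use far more efficient data structures than a linear scan.

```python
import ipaddress
from typing import Optional


def longest_match(destination: str, table: dict[str, str]) -> Optional[str]:
    """Return the action of the most specific prefix covering the destination, if any."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), action)
        for prefix, action in table.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no route at all, not even a default: the packet is dropped
    # The entry with the longest prefix length wins.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


# Hypothetical routing table: a default route plus two overlapping prefixes.
table = {
    "0.0.0.0/0": "gateway of last resort",
    "172.16.0.0/16": "next hop A",
    "172.16.0.0/24": "next hop B",
}

print(longest_match("172.16.0.50", table))  # next hop B
print(longest_match("192.0.2.1", table))    # gateway of last resort
```

Notice that the source address never appears in this function. That is exactly the inflexibility policy-based routing is meant to work around.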

00:01:53
Always sounds like something out of a Stephen King novel to me. But anyways, we have this inflexible process. Policy-based routing allows you to turn this on its head. Yeah, it allows you to completely bypass the logic, the algorithm that is used in this routing table process.

00:02:13
And it allows you to send packets out of the router any particular way you choose. We have a lot of flexibility here based on route-maps that we can tie into the policy-based routing. Let’s go ahead and take a look at a sample configuration and this is going to bring to light what could go wrong.

00:02:39
So what we do when we engage in policy-based routing is we go to the ingress interface. So let’s say we were interested in controlling what happens when traffic comes from R2 to R1, destined for remote destinations. We go ahead and we set on ingress our policy-based routing statement.

00:03:02
OK, so there’s a policy-based routing statement that we’re going to put here that links to a particular route-map that is going to have the definition of what’s going to occur with the traffic. For instance, we could easily say if the traffic is sourced from the address A, or it is sourced from address B, then go ahead and send that to different IP next hops.

00:03:33
Interesting. So we’re actually able to route based on source address here instead of destination address. Wow. This demonstrates the power of policy-based routing. So a couple of things here, right, from a troubleshooting perspective that are real obvious for us to look at.

00:03:55
Are we setting up PBR? Are we doing that under the correct interface? Do we have it set up where the traffic is coming ingress to the router? Do we have the appropriate route-map reference in our policy-based routing statement on that interface? Have we constructed the route-map properly? Maybe the route-map references access lists for the identification of these different source addresses.

00:04:29
Are we linking all of these names correctly, right? So is the PBR statement referencing the correct route-map? Is the route-map referencing the correct access control lists? These are all areas where potentially we could have a breakdown in the referencing of the different components.

00:04:50
And of course, that’s going to have undesirable results. Something else that’s worth noting here, and this is the most common mistake I see with learners when it comes to policy-based routing, is they will configure the PBR here. And they will do their testing from here.

00:05:10
And that’s not going to work because, by default, policy-based routing will not function with packets generated by that router. This is a feature for transit packets, and we want to make sure that we test it appropriately from, in our case, an R2 device. Now, Cisco did build in support for something that’s called local policy-based routing.

00:05:40
Now, if we engage in local policy-based routing, as you might guess, this is indeed a feature where we can enact policy on locally generated packets. So this would be an exception. But typically, we see learners enact non-local PBR and then test from the wrong device.

00:06:02
Now, the last area that I really want to have you watch out for is, are you setting the correct commands on your policy-based routing? For instance, in your route-map, you can say set ip next-hop. And what this will do is set the next hop, not based on the routing table, but based on the logic that you’ve constructed in your route-map.

00:06:27
This is, obviously, a wonderful command for initiating traffic to take some non-routing table next hop. But notice this is very different from set ip default next-hop. So we’ve got to make sure we’re very careful when we’re selecting the syntax in our route-maps for policy-based routing.

00:06:49
What does set ip default next-hop do? Well, it says if there is no match in the routing table, and we’re about to use a default route type action, then in that case go ahead and set the next hop for the traffic per the policy-based routing. So notice, big difference between these two very similar commands.
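The difference between the two set commands can be sketched in Python as follows. This is a simplified model under stated assumptions: the routing table and addresses are invented, the function name is hypothetical, and the sketch ignores whether the policy next hop is itself reachable.

```python
import ipaddress


def choose_next_hop(destination: str, routes: dict[str, str],
                    policy_hop: str, default_variant: bool) -> str:
    """Model 'set ip next-hop' (default_variant=False) versus 'set ip default next-hop' (True)."""
    dest = ipaddress.ip_address(destination)
    # Explicit routes are matching entries other than the default route 0.0.0.0/0.
    explicit = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes.items()
        if ipaddress.ip_network(prefix).prefixlen > 0
        and dest in ipaddress.ip_network(prefix)
    ]
    if default_variant and explicit:
        # The routing table has a specific answer, so the policy next hop is not used.
        return max(explicit, key=lambda entry: entry[0].prefixlen)[1]
    return policy_hop


# Hypothetical routing table on the router doing PBR.
routes = {"0.0.0.0/0": "10.13.0.3", "172.16.0.0/24": "10.13.0.3"}

print(choose_next_hop("172.16.0.50", routes, "10.14.0.4", False))  # 10.14.0.4
print(choose_next_hop("172.16.0.50", routes, "10.14.0.4", True))   # 10.13.0.3
print(choose_next_hop("192.0.2.1", routes, "10.14.0.4", True))     # 10.14.0.4
```

The plain form overrides the routing table for every matched packet. The default form only steps in when the routing table would otherwise fall back to its default route.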

00:07:14
Well, Keith, with these main troubleshooting areas that I’ve highlighted in mind and, of course, any that you have through your many, many years of experience working with these Cisco routers and policy-based routing, let’s head to the command line and see this in action.

00:07:29
Hey, thanks, Anthony. I would love to. One of my fondest memories of PBR troubleshooting specifically was when I was at a customer site on a non-PBR related project. And on a break, they came up and said, Keith, we have a challenge. I wonder if you could help us with it.

00:07:43
I said, absolutely, yes. What can I do for you? They said, well, we bought this content filtering server. And what we want to do is we want to have some of our customers, before they actually go out to the internet, we want to send their traffic to this content filtering server.

00:07:57
We’ve been trying to use PBR. We can’t quite get it to work. Can you help us? And it was a blast. Because with just a few simple commands and a couple tweaks, we were able to get it up and working. So as traffic went in on this interface, based on the conditions set, it could then conditionally route the traffic down to the content filtering engine.

00:08:14
Another fun thing for me is that just the acronym PBR reminds me of peanut butter and jelly sandwiches and just brings a smile to my face. So we’re going to use the topology that Anthony provided for us. I’ve added this network over here, the 172.16.0 subnet. And I’ve got a host .50 that we’re going to use as a target.

00:08:32
We’re also going to use R2 as a launching point. It’s got a few addresses on. It has 10.12.0.2 right here on this interface. And I’ve got a couple of loopbacks. Loopback 1 is on 2.2.2.2, and loopback 2 is 22.22.22.22. So what you and I are walking into is this.

00:08:48
They’ve set it up with the hopes that any traffic from loopback 1 will take this top path through R3, and any traffic that’s sourced from loopback 2 will take the bottom path through R4. So that’s the intent of what they’ve set up with PBR. However, it’s not working.

00:09:04
And it’s the job of you and me to go through it step by step and resolve any issues that we find. To make it a little easier for us, I created some host entries on R2. So I created one for R3, R4, and R1. And that’ll just be a little easier as we do trace routes and such so we can put logical names with the IP addresses that we’re seeing.

00:09:25
So I’ve got the server’s address. That’s the .50 on the far left. R1 is at 10.12.0.1. R3 is 10.13.0.3, and R4 is 10.14.0.4. And before we test the connectivity, let’s just validate together here on R2 that we have some basic routes in place. I want to make sure that we can reach the 172.16.0 network. And it appears that we do indeed have a route.

00:09:48
So that looks pretty good. Let’s do a basic ping from our IP address all the way to the server at 172.16.0.50. And that appears to be working as well. So because I did the ping without any qualifiers regarding the source address, we’re sourcing this ping from 10.12.0.2, which is our interface that’s closest to R1. So another question is what path does that packet take when it’s just being sent? Is it going over the R3 path or the R4 path? Well, we can use some tools to verify that.

00:10:18
Let’s go ahead and use trace route. Let’s do a trace route to server. And again, that server is the IP address 172.16.0.50. And this shows us the path is going from R1 to R3 and then finally to the server. Now, the intention was to not have any kind of specialized policy-based routing except for the source address of 2.2.2.2 and 22.22.22.22. We haven’t tried those yet.

00:10:41
I just wanted to validate we have basic connectivity and the path that we’re currently using. So regarding the PBR that this customer has put in place that we’re about to troubleshoot, the objectives are that packets sourced from 2.2.2.2, which is loopback 1 on R2, should go through the path on R3 while packets from 22.22.22.22 should go through the path through R4. So that’s the objective, and we need to make sure that’s happening.

00:11:05
And if it’s not, we’ll troubleshoot it. I suppose also it’d be great to validate that we really do have those loopbacks here on R2. So let’s just do a quick show IP interface brief. I’ll exclude the output if there’s a line that has the word unassigned in it.

00:11:19
We don’t need to worry about those interfaces. And sure enough, loopback 1 and loopback 2 both have those respective addresses. So we’re good to go. So for our first test, let’s do this. Let’s do a trace route to that server at 172.16.0.50. We’ll source it from loopback 1, which is 2.2.2.2. And then we’ll specify some TTLs.

00:11:38
I’m going to specify a minimum of TTL of 1 and a max TTL of 10. That way if it gets too crazy or if we have a loop condition, it won’t run rampant. So what the output is showing us here is that our first hop was R1 from the trace route. The second hop was R4, and then it finally hit the server.

00:11:54
And that’s very interesting to me because that’s different from sourcing it from 10.12.0.2 that we did earlier. So we went through R3 for a normal trace route, and we’re going through R4 for the trace route sourced from 2.2.2.2. Now, there’s a couple of possibilities here.

00:12:10
Number one is just how the flows are being managed to R1 and maybe PBR isn’t working at all. Or perhaps PBR is configured, but it’s configured incorrectly and putting us across the wrong path. So let’s try a trace and we’ll source this one from the loopback 2, which is 22.22.22.22. And that should be going over the path over R4 if the policy-based routing is working.

00:12:32
And let’s just take a look at the results of that as well. Now, I’m thinking that it could be a crap shoot. We might not have PBR running at all, and we might just have the router forwarding at will. But it appears that both of these are taking the path through R4 regardless of whether we source it from 2.2.2.2 or sourcing it from 22.22.22.22. So let’s do this.

00:12:52
Let’s go over to R1 and let’s turn on a debug just to validate if there’s any PBR happening at all. And the syntax for that is debug ip policy. So if there’s any traffic that matches the route-maps, and it’s doing policy-based routing, we should get some output from the debugs regarding that activity.

00:13:09
So let’s go back to R2. Now there’s a debug running on R1. And we’ll simply do a trace, again, sourcing it from loopback 1. So trace route to the server’s IP address, sourcing it from 2.2.2.2. And the minimum and maximum TTLs, we’ll set as well. Just in case there’s a crazy condition where we have a loop or something else, it won’t go on indefinitely.

00:13:28
And it’s still going over R4. Now check this out. In SecureCRT right here, I have mine set up so that if there are any console changes, it changes from a check mark to a blue circle, indicating that, hey, something changed on the screen. Because this did not change, there is no new debug output on R1. So let’s go just take a look to verify that.

00:13:50
And sure enough this indicates that there’s been no PBR activity whatsoever. So let’s take a look at some route-maps on R1 just to see what’s present. We’ll do a show route-map and just to make sure we have the route-maps in place on this router that are being used for PBR.

00:14:07
So here’s a route-map called SPLIT-Them-UP. It has sequence number 10. And that says if traffic matches access list number 10, then we’ll go ahead and set the egress interface. And then we have a second sequence here, sequence 20. This is if traffic matches access list 20, then we’re going to set the next hop.

00:14:25
So one thing that would cause our PBR not to trigger is perhaps we have an access list at 10 or 20 that isn’t being matched. So let’s take a close look and see what’s inside those two access lists. So we could do a show access list 20 or show access list 10. But let’s just do a show access list and take a look at all of the access lists that are on this router just to validate the contents of access list 10 and access list 20. So here’s access list 10 right here. And it’s saying permit 22.22.22.22. It’s also saying permit 2.2.12.2. And so that’s a problem.

00:15:00
So for our sequence 10, we want to use access list 10. And that should match 2.2.2.2. And for sequence 20, it’s very likely we want to have 22.22.22.22. So it looks like we have some bungled access list commands that we need to fix first. So let’s go ahead and clean those up to make access list 10 match on 2.2.2.2 and access list 20 match on the 22 address. So to do that, we’ll just do a little cleanup.

00:15:24
We’ll go into configuration mode, and we’ll rework access list 10. We’ll just remove both entries, line 10 and 20, and we’ll say permit 2.2.2.2 for that one. And then once we’ve cleaned up and verified that access list 10 is OK, we’ll need to create access list 20 because it currently doesn’t exist based on the show access list we did just a moment ago.

00:15:47
It is always a good idea to look at access lists before you make any modifications just to make sure you know what currently is there. So we’ll create access list 20. It didn’t exist previously. It’ll permit the 22 address. And then before we leave configuration mode, let’s just do a show access list just to make sure that access list 10 is matching on the 2.2.2.2 and access list 20 is matching on the 22 address. And based on this output, that looks absolutely correct for our purposes.

00:16:17
So let’s exit configuration mode. I also want to just validate that I still have our debug running. So we’ll do a show debug. And sure enough, policy-based routing debugging is currently on. So now that we’ve corrected our access list, let’s go back to R2. And on R2, let’s do that same– let me give us a little space here.

00:16:34
Let’s go ahead and do that same trace route. And what we’re hoping for is some debug output over on R1 that’s going to indicate that policy-based routing is currently active. So a couple of pieces of bad news. Number one, we’re still going through the path through R4 while we should be going through the R3 path. And secondly, we don’t have any change in this icon here, which means that R1, on that console screen, there’s no new debugging messages.

00:16:58
So it appears that PBR is not actively functioning yet on R1. You know, another thing that can happen to us, and it’s happened to me more than once, is I could go ahead and be connected like, for example, via SSH on a VTY line. And as a result, you wouldn’t, by default, be seeing debug messages.

00:17:15
So also just validate that if you are remotely connected, and you want to see those console messages coming at you, we want to make sure that we do a terminal monitor command so that they’ll be shown to us. Otherwise, you may be doing tons of debugs, and there’s lots of output being thrown at the console but not to your VTY session.

00:17:33
So that’s a good thing to check as well. Because I am connected logically to the console of this router, I am seeing all the debugs. So that’s currently not the reason why there’s no debugs. In our case, there’s no debugs because there simply is no PBR currently active.

00:17:48
So let’s verify that we have the policy applied to the interface. Let’s do a show run for interface serial 1/0. That’s the interface that’s on R1 connecting over to R2, just to validate that the policy is in place. So show run interface serial 1/0 reveals that there’s the IP address and here’s our policy, ip policy route-map SPLIT-UM-UP.

00:18:09
So the policy is there. And let’s together just do another quick check of the contents of that route-map. So to see the contents of the route-map, we’ll do a show route-map and let’s take a look. So here’s sequence 10. And it’s using ACL 10 for a match and sequence 20, which is using access list 20 for a match. And what I noticed is this.

00:18:29
This beautiful route-map called SPLIT-Them-UP is not the same name as this guy. And Anthony told us that that could be one of the things that could cause PBR to fail is that we’ve applied it with an incorrect name. So we have this beautiful route-map right here, but it’s not the same route-map that we’re trying to apply to the interface.

00:18:47
Effectively, because this route-map does not exist and we’ve applied it to the interface, we’re not doing any type of PBR. So let’s go in and correct that problem. We’ll go into configuration mode. We’ll go into interface serial 1/0, and we’ll simply apply the new and correct route-map called SPLIT-Them-UP.

00:19:07
And that should overwrite the previous one. Now, one thing I’d like to encourage you do often is that, when you make a change, to go back in and validate that what we think we put in is what really got into the configuration. So let’s do this. We’re going to show run interface serial 1/0, and I just want to validate that the route-map we applied really is the route-map that’s sitting there.

00:19:28
And there it is. So that was definitely a problem that we were experiencing. Now that we’ve made that change, we have the correct, real route-map applied to the interface. Let’s go back to R2 and we’ll try it again. So we’re going to do the trace, and this is sourcing from 2.2.2.2. So what should happen, if the PBR is working, we should not be going over the R4 path. We should be going over the R3 path. So that doesn’t look good.

00:19:53
So one of the benefits of capping off the max TTL for a trace route is that it stops on its own if we’re in a loop condition, which it appears we are– so we sent it to R1, R1 sends it back to us. 10.12.0.2, that’s our address. We send it back to R1, and it’s just looping, looping, looping.

00:20:07
So that absolutely is not a good sign, but I’ll tell you what is a good sign. Check this out right here. See that little icon? That indicates that there’s been some change on the other router. So let’s go back to R1. And on R1, check this out. We’ve got some debug regarding policy-based routing.

00:20:22
So when there’s a match like this– so we have a policy match, and then we have some policy routing going on as a result of that. Now, unfortunately, we’re not doing correct policy routing because it appears that R1 is shipping it out back to R2. So let’s take a closer look at this route-map on R1. I’m going to turn off debugging as well.

00:20:42
We’ll do a show route-map. And sure enough, in sequence number 10, it says if traffic matches ACL 10, which is a source IP address of 2.2.2.2, this is saying we want our action to be set, the exit interface to be serial 1/0. Well, that’s the interface it came in on.

00:21:00
And unfortunately, this interface, if it’s used, is going to send it back to R2. R2 is going to look at it. It’s a router. It’s going to forward it back this way based on what it thinks its routing table says. And then it just loops over and over again. So if this was a ping or something like that, based on the TTL, that’s how long it would loop, with every router decrementing the TTL until it kills the packet.
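Here is a tiny Python sketch of that loop, just to show why the TTL bounds it. It is a toy model in which R1 always hands the packet to R2 and R2 always hands it back; the function name is made up.

```python
def routers_visited(ttl: int) -> list[str]:
    """Bounce a packet between R1 and R2, spending one TTL unit at every router it reaches."""
    visited = []
    router = "R1"
    while ttl > 0:
        visited.append(router)
        ttl -= 1  # each router decrements the TTL before forwarding or dropping
        router = "R2" if router == "R1" else "R1"
    return visited


print(routers_visited(4))         # ['R1', 'R2', 'R1', 'R2']
print(len(routers_visited(255)))  # 255
```

The loop is never infinite for a single packet, but every packet caught in it burns its entire TTL on the same link.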

00:21:20
So let’s you and I correct that on R1. We’ll go into configuration mode. We’ll remove completely sequence number 10. Then we’ll recreate sequence number 10. And we’ll say if traffic matches access list 10, then we want to set the egress interface to be serial 1/1, which is the path over to R3. We also could have set the IP next hop to the IP address of R3, which is 10.13.0.3. Either one of those options would be perfectly fine.
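The corrected policy can be modeled in Python as an ordered list of route-map sequences, each pointing at an access list and an action. This is a sketch of the matching logic only; the data structures and function name are hypothetical, and the values come from the fixes made in this video.

```python
from typing import Optional

# Standard access lists modeled as sets of permitted source addresses.
ACLS = {
    10: {"2.2.2.2"},
    20: {"22.22.22.22"},
}

# Route-map SPLIT-Them-UP modeled as (sequence, access list, action) entries.
ROUTE_MAP = [
    (10, 10, ("interface", "Serial1/1")),  # path toward R3
    (20, 20, ("next-hop", "10.14.0.4")),   # path toward R4
]


def policy_route(source: str) -> Optional[tuple[str, str]]:
    """Return the set action of the first sequence whose access list permits the source."""
    for _seq, acl, action in sorted(ROUTE_MAP):
        if source in ACLS[acl]:
            return action
    return None  # no sequence matched: fall back to normal destination-based routing


print(policy_route("2.2.2.2"))      # ('interface', 'Serial1/1')
print(policy_route("22.22.22.22"))  # ('next-hop', '10.14.0.4')
print(policy_route("10.12.0.2"))    # None
```

Traffic sourced from 10.12.0.2 matches neither access list, which is why the unqualified ping and trace from R2 follow the routing table rather than the policy.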

00:21:48
And let’s just verify by looking at the route-map that it is correct. And if it is, we’ll exit configuration mode. And then we’ll do some testing. So our exit interface is now serial 1/1, which is heading over to R3. So to test this, let’s make a road trip over to R2. And on R2, we’ll do that same exact trace. I’ll just use the Up Arrow key.

00:22:07
Press Enter. And that looks a lot better. You notice the absence of the looping. That’s terrific. So our trace is successfully going through the path that R3 is on, and that’s because of the policy-based routing. Another thing we should definitely try is doing a trace route, sourcing it from the 22 address. And that should take the path over R4. So let’s validate that as well, sourcing it from 22.22.22.22. And survey says, absolutely, yes.

00:22:35
That is going over the path on R4. So the results are if our source address is 2.2.2.2, it’s going to go ahead and use the path that includes R3 based on setting the exit interface to be serial 1/1. And if we use the source IP address of 22.22.22.22, PBR on R1 is saying use the next hop of 10.14.0.4, which causes it to go over the R4 path. So as we review what we had to correct, we had access control lists that were incorrectly identifying the traffic we want to do PBR on.

00:23:05
Secondly, we had the incorrect name applied to the interface. Also in our route-map, we had an incorrect egress interface for the policy-based routing. And once we corrected all three of those items, we now have successful PBR. I have had a lot of fun in this video.

Solving Generic Routing Encapsulation (GRE)

00:00:00
Name Brand Troubleshooting of Generic Routing Encapsulation. Let’s begin. I’d like you to imagine that you and I have been called over to one of our customers to assist them in troubleshooting a GRE problem. And here’s what we’re walking into. They set up a GRE tunnel.

00:00:16
They used logical tunnel interface zero on R1 and R4, and they also sourced and terminated the tunnel at loopback zero on both R1 and R4. So the tunnel is between these two points in the network. Now when we asked them why they implemented this tunnel, they said they wanted to use a routing protocol on the logical tunnel to go ahead and share information about reachability for the 44 network and the 11 network. So R2 and R3 and the rest of the infrastructure don’t know anything about the 11 network or the 44 network. But we want to make sure R1 and R4 have reachability and use the GRE tunnel.

00:00:53
So that’s what we’re walking into. This topology right here is in the Nuggetlab files. I’ll also have it present during our troubleshooting exercise together. So our customer has set this up. They’ve tested it, but it’s not working. And they haven’t told us exactly what they think the problem is.

00:01:08
They just said it’s not working. So here’s what you and I do. We go off into a room for a moment and we just go over some details about what could possibly go wrong with GRE. And here’s the list that we come up with. Number one, they have the incorrect configuration.

00:01:22
Maybe R1 is not pointing to the right source address, or the right destination address, for the actual tunnel interface. Or maybe the configuration of the IP addresses is correct, but there’s no route. So maybe OSPF isn’t advertising the 4.4.4 network or the 1.1.1 network, and as a result R1 and R4 can’t even establish the tunnel.

00:01:41
The source interface of a tunnel could be down. So if this is loopback zero that we’re sourcing the tunnel from on R1’s side, and that interface is down, that would also cause a problem. But out of all the problems with GRE that can come up, here is one of my favorites. Say your router– for example, R1– thinks that the destination IP address of the tunnel is 4.4.4.4. If, because of a bad routing day, R1 believes that to reach 4.4.4.4 it should go through the tunnel itself to get there, that will cause the tunnel to fail.

00:02:12
And it’s kind of fun to watch. It’s not fun to have, but it’s kind of fun to watch. Because, as the 4.4.4 network is advertised into OSPF, the tunnel comes up. These guys become neighbors, and then all of a sudden R1 says, to reach 4.4.4.4 I can use my tunnel zero interface, and tries to use it, and effectively that breaks the tunnel.

00:02:31
The tunnel fails. Then OSPF converges again, and it just cycles like that, over and over and over again. The short of it is, if your router believes that the path to the other side of the tunnel is through the tunnel, it’s game over. What you and I get to do is troubleshoot this customer’s GRE problems as they exist.

00:02:50
We’ll get it working, but I also want to make you aware of some other challenges in a production network that we’re very likely to face. Things like MTU sizes. Let’s say you have a computer over here on the left– computer A– and a computer on the right– computer B– and they negotiate a maximum segment size of x.

00:03:07
And so they assume, because they negotiate it, they can go ahead and use it. However, when they start sending traffic, if that traffic begins to go over this GRE tunnel, the GRE header– the overhead of GRE– is going to take some space. For example, we’re going to have at a bare minimum 20 bytes for the new outside IP header, plus 4 bytes for the GRE header itself.

00:03:24
And if we’re using a key or other details that would make that header bigger, that additional overhead is going to leave less room to go ahead and take the user’s data. So it might turn out that even though our network can support 1,500 byte frames of data, because of the additional overhead from GRE, we have to fragment it.

00:03:40
You might say, well Keith, that’s no big deal. We’ll fragment it and send it. And that’s absolutely true unless you have devices that are setting the do not fragment bit to on, which tells the router to not fragment. And that should trigger another series of events where the router then sends ICMP messages back to the client saying, I can’t fragment.

00:03:58
It’s too big. So fragmentation can become an issue inside of GRE tunnels. And there’s more than one way of dealing with that. We could, for example, on the router, on the inbound interface, where the clients are coming in on, we can simply set a policy that specifies on the interface to remove the do not fragment bit.

00:04:15
We could also configure all of our devices on the network to use a manually smaller maximum segment size. On the tunnel interface itself we could specify the TCP maximum segment size as well. So that when they’re negotiating, if they’re negotiating across this GRE tunnel, we can set that upper limit of what the TCP maximum segment size is allowed to be.
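The overhead arithmetic here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 1,500-byte link MTU, IPv4 and TCP headers without options (20 bytes each), a 4-byte base GRE header, and an optional 4-byte GRE key. The function names are made up for this sketch.

```python
# Header sizes in bytes. These assume IPv4 and TCP without options.
OUTER_IP_HEADER = 20   # new outer IPv4 header added by the tunnel
GRE_BASE_HEADER = 4    # minimal GRE header
GRE_KEY_FIELD = 4      # optional GRE key, when one is configured
INNER_IP_HEADER = 20   # the client's original IPv4 header
TCP_HEADER = 20        # the client's TCP header


def tunnel_ip_mtu(link_mtu=1500, use_key=False):
    """Largest passenger IP packet that fits without fragmenting the GRE packet."""
    overhead = OUTER_IP_HEADER + GRE_BASE_HEADER
    if use_key:
        overhead += GRE_KEY_FIELD
    return link_mtu - overhead


def max_tcp_mss(link_mtu=1500, use_key=False):
    """Largest TCP segment payload that still fits inside the tunnel."""
    return tunnel_ip_mtu(link_mtu, use_key) - INNER_IP_HEADER - TCP_HEADER


print(tunnel_ip_mtu())            # 1476
print(max_tcp_mss())              # 1436
print(max_tcp_mss(use_key=True))  # 1432
```

So under these assumptions, clients that negotiated a 1,460-byte segment size on a plain 1,500-byte path would be sending packets that no longer fit once GRE wraps them, which is why capping the maximum segment size is one of the options.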

00:04:37
And one other factor I want to share with you is Quality of Service. If you have packets that are entering a GRE tunnel and they’re marked a certain way, we’re going to want to make sure we have our policies in place on these routers that are doing the tunnels so we don’t negate any of our Quality of Service that we have in place, and lose it as we send traffic through the tunnel.

00:04:55
So let’s begin on R1. Let’s just do a quick look at what interfaces are currently up or down, as well as the IP address associated with those interfaces. And we do that with the show IP interface brief. So there’s our tunnel interface right there. It has the IP address of 10.77.0.1, and it shows a status of up and down, which is not good.

00:05:16
For functionality, we need to be up and up. Now check this out. Without even leaving this output, the source of the tunnel– based on the diagram we were given– is loopback zero. So the source of tunnel is loopback zero, and the source is administratively shut down.

00:05:32
And that, my friends, absolutely would cause a failure of this tunnel. So we absolutely need to fix that. The source of this tunnel is loopback zero. That interface on R1 needs to be in an up state. Another thing we could also do is just validate that loopback zero really is the source IP address or the source interface for the tunnel.

00:05:51
And to do that, we’ll show interface tunnel zero, which confirms, by the way, that it is not happy at the moment. And here we have the tunnel source indeed of 1.1.1.1 loopback zero, with a destination of R4 loopback. So let’s take a moment and bring that interface up.

00:06:07
So off to configuration mode we go. We’ll go into interface loopback zero. And we’ll use the infamous dog trick that I used on my dog to get him to stand up, and that is the no sit command. In a Cisco router of course, that’s no shutdown in the interface.

00:06:22
And that brings that loopback interface up. I’m also noticing right here that tunnel zero also changed its state to up, which is a very, very good indication. So router one’s tunnel interface appears to be up. Let’s go over to R4– the other end of the tunnel– and let’s just verify the detail there as well.

00:06:40
We’ll do a show interface tunnel zero. So the tunnel source is 4.4.4.4. That’s loopback zero on good ole R4, and the destination is 1.1.1.1, which is the loopback interface on R1. What I should have done was the full output, because we also want to know whether or not that interface is up or down.

00:06:58
So let’s go ahead and do a show IP interface brief. I’ll hide all the interfaces that don’t have IP addresses on them. And let’s take a look and see if our tunnel zero interface is up up. And survey says, tunnel zero. There’s the IP address. And it is not up.

00:07:17
So here, our loopback, which is the source, is up. That was a problem that R1 had. But we still show the interface, tunnel zero, as being up down. One thing we want to verify is that we absolutely have reachability to the far end of the tunnel. If that is not in the routing table,

00:07:32
that will also be one of the reasons why the tunnel isn’t up. So let’s verify that. And we can easily verify that with a show IP route. Or we can get very specific with a show IP route 1.1.1.1, just to validate whether or not R4 believes he can reach that network.

00:07:48
And based on results, we can’t. And that’s a huge problem. If we don’t have reachability to the far side, we won’t show the local tunnel as being available. And for good reason. We can’t reach the far side of the tunnel. Now based on what the customer told us, they’re going to be using OSPF in the core of their network, but they also have EIGRP configured for the tunnel interfaces.

00:08:10
Let’s just validate and see if we have any neighbors with EIGRP. And we don’t. And let’s also just validate that we’ve got some OSPF neighbors. And we should have a neighborship between R4 and R3. And we do. That looks great. Let’s also verify that we are learning some routes.

00:08:27
So we’ll do a show IP route OSPF. And indeed we have learned some routes. We’ve learned the network between R2 and R3, and the network between R1 and R2, but there is no 1.1.1 network. And that is causing our tunnel interface to not come up. So where, oh, where is our missing route? It’s always a good thing, if possible, to go to the source.

00:08:47
Because R1 owns that network, and it’s directly connected, let’s go over to R1 and have a little conversation with R1 about that network. We’ll do a show IP interface brief. We’ll say exclude unassigned, just to validate, once again, that it really does have that IP address.

00:09:02
Which it does, right there. So loopback zero is up. It’s 1.1.1.1. The next question is, did that network and that interface make it into OSPF? And there’s a couple ways we can check that. We can show IP protocols. Or more simply, just do a show IP OSPF interface brief.

00:09:19
And if it shows loopback zero, we know it made it in. However, the only OSPF enabled interface that I see here is serial 1/0 itself. And that would explain why that 1.1.1 network is not in OSPF. It’s because router one, who’s directly connected, never put it into OSPF.

00:09:38
And as I recall, we had a Nugget on why OSPF routes might go missing, and this was one of those reasons. And now it’s affecting our GRE tunnel. So let’s correct that. We’ll go ahead and go into OSPF configuration mode. And we’ll simply add that network. And there’s lots of different ways we could add a network into OSPF.

00:09:55
We could do an interface command for it, or a network statement, but this time let’s just do a specific network statement for that single IP address, which is 1.1.1.1. And now the interface should be enabled in OSPF. So the fact that we have an EIGRP message across that tunnel– that might be a good sign.

00:10:14
But let’s take one step at a time. Let’s just validate that the loopback interface made it into OSPF. And sure enough, it did. So there’s loopback zero, and it’s in area zero of process ID one of OSPF on this router. So a couple things are going on in my mind at the moment.

00:10:32
Number one, I have a strong reason to believe that the 1.1.1 network made it into OSPF and was propagated, and R4 knows about it because they actually brought up the tunnel. So we wouldn’t be able to bring up the tunnel unless we had reachability. So just to validate that– I know we have a lot of really cool console messages coming up.

00:10:48
Not all positive, I might add. Let’s go over to R4. And on R4, let’s just validate that we have a route to 1.1.1.1, and we’ll see how we learned it. And sure enough, it says we know how to get to 1.1.1.1. We learned it via OSPF. And our next hop we’d use to reach it is 10.34.0.3, which is R3’s serial interface. So now we have these errors that are popping up.

00:11:11
And we ought to probably read one or two of them because the router is trying to tell us something. It’s saying tunnel zero is temporarily disabled due to recursive routing. So what exactly is recursive routing as it applies to GRE tunnels? And why is it causing our tunnel to flap? And if we watch this long enough, we’ll see the tunnel come up, EIGRP Adjacencies will form, and almost immediately the tunnel will go down.

00:11:36
And then a few moments later, it’ll come back up. And it’ll just flap back and forth, over and over again. And here’s a strong probability of what’s happening. These addresses, right here, 1.1.1.1 and 4.4.4.4, are both being learned via OSPF, so they’re being propagated across this entire OSPF domain.

00:11:53
However, once the tunnel comes up, right here, between R1 and R4, and they form an EIGRP adjacency, it appears that these addresses are also being advertised over the EIGRP adjacency. So now, when R4 is learning about the 1.1.1 network, it has a little problem.

00:12:11
It learns it from OSPF which has an administrative distance of what? 110. Exactly right for internal OSPF routes. It’s also learning it from EIGRP which has a lower administrative distance for internal routes. And that’s an administrative distance of 90. So R4 says, AD of 90, I’m picking you as the path. Logically it says I need to use tunnel zero as my route to get to 1.1.1.1, and that’s where the whole thing just breaks. So the tunnel comes down.
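The decision R4 is making can be modeled with a small Python sketch. It is a simplified illustration, not how IOS implements route selection: it only compares administrative distance for one destination, and the names Route, best_route, is_recursive, and the interface label "Serial-to-R3" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    destination: str
    protocol: str
    admin_distance: int
    exit_interface: str


def best_route(candidates):
    """Pick the candidate with the lowest administrative distance."""
    return min(candidates, key=lambda route: route.admin_distance)


def is_recursive(tunnel_destination, tunnel_interface, candidates):
    """True when the preferred path to the tunnel destination is the tunnel itself."""
    matching = [r for r in candidates if r.destination == tunnel_destination]
    if not matching:
        return False  # no route at all: the tunnel is down for a different reason
    return best_route(matching).exit_interface == tunnel_interface


# R4's view of R1's loopback. "Serial-to-R3" is a placeholder interface name.
via_ospf = Route("1.1.1.1", "ospf", 110, "Serial-to-R3")
via_eigrp = Route("1.1.1.1", "eigrp", 90, "Tunnel0")

print(is_recursive("1.1.1.1", "Tunnel0", [via_ospf]))             # False
print(is_recursive("1.1.1.1", "Tunnel0", [via_ospf, via_eigrp]))  # True
```

With only the OSPF route present the tunnel has a sane path to its destination. As soon as the EIGRP copy shows up with its lower distance, the preferred path points back into the tunnel.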

00:12:41
That’s no longer there. R1 and R4 both pick up the corresponding remote routes via OSPF. Once they do, the tunnel interface comes back up. The EIGRP Adjacency forms. And once again, they learn the destination of the tunnel address through the tunnel itself.

00:12:59
And that causes the tunnel to fail. And if you want to see the type of console message you’ll be getting over and over as a result– and this is the output of what we would expect to see when that’s flapping over and over and over again. And it reminds us of what’s exactly happening every single time.

00:13:15
We could also do debug of IP routing. And that would also give us visibility into these routes flapping back and forth. So the next logical question is, OK Keith, how do we fix something like this? Well the answer is, if the problem is that R4 and R1 are learning the respective other guy’s loopback over this tunnel interface, the answer is do not advertise those loopback addresses over the tunnel interface.

00:13:42
So find out if EIGRP– which I’m sure it is– is including that address space, and just don’t include it in a network statement for EIGRP. So if EIGRP doesn’t advertise, over the tunnel, the 4.4.4 and the 1.1.1, then R1 and R4 will never have the opportunity to learn the corresponding remote end of the tunnel over the actual tunnel interface.

00:14:03
And that’s what we need to do right here. Let’s start on R4. Let’s take a look at the routing protocol configuration on R4. And then together, we’ll modify the network statements in EIGRP so we are not including the 4.4.4.4 address. We’ll do the corresponding treatment over on R1 if it also has that same problem.

00:14:22
So on R4, let’s take a look and see what they have configured in the routing protocols. There’s OSPF and EIGRP running. We’ll do a show run and pipe with section router. So for router EIGRP 1, this is IOS 15.x. So network 0.0.0.0 says, any and all interfaces are participating in EIGRP.

00:14:42
And that would include our loopback zero which is causing the problem. We want to take that out and not advertise loopback zero’s address, which is the 4.4.4.4, in EIGRP. So let’s do this. Let’s go into configuration mode. We’re going to do a couple things. One, I’m just going to wipe out EIGRP autonomous system number one because there’s not much there.

00:15:03
I’m going to replace it with a couple network statements. One that will include the 10.77 network. That’s the GRE IP interface for the tunnel. And the second thing I’d like to include is network 44, because that’s the network that they wanted to actually have advertised and reachable over the tunnel between R1 and R4. Now we also want to make sure that R1 is not causing us a problem.

00:15:26
So we’ll go over to R1, take a look at what it has configured for EIGRP. We’ll do a show run, and just look at the router config portion of the running config. And he has the same thing going on. And that’s just a really quick and, in this case, dangerous way of bringing all of the interfaces into EIGRP.

00:15:47
We do not want loopback zero, which is directly connected to the 1.1.1 network, inside of EIGRP. So what we’ll do here is we’ll go into configuration mode. I’m also going to wipe out router EIGRP 1 here as well. We’ll put it back in.

00:16:03
The auto-summary feature is disabled by default with IOS 15.x and higher, so I don’t have to disable auto-summary. And I’m going to put in two network statements. One for the 11 network, which we want to advertise over that EIGRP tunnel connection, and also the 10.77, which actually enables EIGRP on the tunnel interface itself.
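To see why swapping the catch-all network statement for two specific ones keeps loopback zero out of EIGRP, here is a rough Python model of how a network statement with a wildcard mask selects interfaces. Several things are assumptions made for this sketch: the wildcard masks, the interface name Loopback11, and the serial address 10.12.0.1 do not come from the lab, and the catch-all statement is modeled as 0.0.0.0 with an all-ones wildcard.

```python
import ipaddress


def matches(interface_ip, network, wildcard):
    """Compare only the bits that the wildcard mask marks as significant (0 bits)."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (ip & care_bits) == (net & care_bits)


def enabled_interfaces(interfaces, statements):
    """Names of interfaces selected by at least one network statement."""
    return sorted(
        name
        for name, ip in interfaces.items()
        if any(matches(ip, network, wildcard) for network, wildcard in statements)
    )


# R1's interfaces. Loopback11 and the serial address are assumed for illustration.
r1_interfaces = {
    "Loopback0": "1.1.1.1",       # tunnel source, must stay out of EIGRP
    "Loopback11": "11.11.11.11",  # network the customer wants to share
    "Serial1/0": "10.12.0.1",
    "Tunnel0": "10.77.0.1",
}

catch_all = [("0.0.0.0", "255.255.255.255")]
specific = [("10.77.0.0", "0.0.255.255"), ("11.0.0.0", "0.255.255.255")]

print(enabled_interfaces(r1_interfaces, catch_all))
print(enabled_interfaces(r1_interfaces, specific))
```

In this model the catch-all statement pulls in every interface, including the tunnel source, while the two specific statements select only the tunnel and the 11 network.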

00:16:23
So it shows we have a new EIGRP Adjacency over the tunnel interface. And hopefully, it’s not going to start flapping. So let’s do a show IP route, and we’ll take a look at what we’ve learned via EIGRP and what we’ve learned via OSPF. So we have the remote end of the tunnel that we’re learning via OSPF, which is as it should be.

00:16:42
And we have the 44.44.44 network, which we’re learning dynamically via EIGRP. And the interface we’re learning it on is tunnel zero. So now if we do a ping of 44.44.44.44, and it really doesn’t matter where we source it from, that is going to be GRE encapsulated and logically sent over the tunnel interface.

00:17:02
So let’s go ahead and do that. A little ping– and just for grins, let’s also source it from 11.11.11.11. And that way the return traffic is also going to go through a GRE tunnel to get back to us. If we put a packet capture between R1 and R2– as we did that ping– which, by the way, I did.

00:17:18
And this capture file is available as part of the Nugget lab files for this video. Here’s what we would see. In packet six, if we look at the details of it, at Layer 3, what the network sees is a packet that is sourced from 1.1.1.1 destined to 4.4.4.4. And all the routers– Router two and Router three– are simply going to route that and forward that.

00:17:38
But when Router four receives that– because that’s Router four’s IP address– he is going to de-encapsulate that and take a closer look at what the payload is. As he opens that up, he’s going to see this is GRE traffic. And that is going to cause him to further de-encapsulate.

00:17:54
As he further de-encapsulates the traffic, he then has the original IP header, which is sourced from 11.11.11.11, and destined to 44.44.44.44. And if we open up the ICMP information, it would be an echo request. And packet seven in the capture would be the reply to that request.

00:18:13
Now, even though in this example, we are carrying IPv4 traffic, we could also be carrying other Layer 3 types of traffic such as IPX, IPv6, et cetera. And because the transport network only sees this, the transport network between R1 and R4 doesn’t have to know any of the details about the passenger protocol that we’re carrying.

Routing Redistribution

00:00:00
We don’t really want to engage in redistribution, taking information from one routing protocol and putting it into another. But in this video, as we’ll review, there are times when we’re stuck with it. And yes, disaster could happen. In this important Nugget, Keith and I are going to walk you through just how to make sure your redistribution, when you need to do it, is safe and error free.

00:00:26
From a Cisco certification standpoint, by the way, this Nugget is perfectly relevant for the professional level through the expert level. Now in the beginning of this Nugget, I stated, we don’t want to redistribute. Why is that? Well, when we redistribute, number one, it can be dangerous, as we’ll see in this Nugget.

00:00:48
But when we redistribute from one routing protocol, let’s say A, to another routing protocol, B, we inevitably lose some information. Think about EIGRP for a moment. EIGRP utilizes a compound metric made up of a bandwidth value and a delay value. Now let’s say we redistribute into OSPF.

00:01:16
OSPF utilizes a cost value. As you know, that is influenced by bandwidth, and sure enough, we are going to lose the delay information when we redistribute. So we lose details. We lose information. But by far the main reason we don’t want to engage in redistribution is that it can be dangerous.

00:01:41
That’s right. What can happen is a routing protocol can end up taking an inferior path, yeah, due to redistribution. So we could have suboptimal pathing result from redistribution. But in the worst case scenarios, scenarios like Keith and I will demonstrate in this particular Nugget, we can have routing loops result or routing, what we call, feedback.

00:02:13
And that’s actually what Keith is going to demonstrate– a situation where the routing table becomes unstable, thanks to the way in which redistribution is done. A routing loop would be literally where we have traffic being sent into this domain and this domain sending it back into that domain.

00:02:33
Routing feedback is a situation of database instability, which Keith will demonstrate in this Nugget. So all reasons we don’t like to engage in redistribution, but when we have to, maybe because company A is merging with company B, or maybe we have a situation where we want to source some prefixes in an IGP into BGP– I mean, there are going to be times when we need to redistribute.

00:03:05
And it is possible to do it safely. Now routing loops and routing feedback are certainly scary things, because they can bring your network to its knees. What’s not a scary thing is when we have protocol A, and we’re going to go into protocol B, and we are doing redistribution in one point and in one direction.

00:03:28
Yeah, this is pretty safe. One of the things we’re going to need to do, of course, is always make sure we are setting a seed metric. Some routing protocols will set a seed metric by default during redistribution. Other routing protocols won’t. So if you don’t set a seed metric, you will not have successful redistribution.

00:03:51
I don’t like to memorize which routing protocols do and don’t set seed metrics automatically. So I always set a seed metric during redistribution. Something else that I do is I poison that metric a little bit. I mean, after all, these prefixes are external from the perspective of routing domain B, so you certainly wouldn’t want those prefixes seen as preferential.

00:04:23
They’re from an external routing domain. So I’ll poison the seed metric, a bit, to ensure that those routes from the perspective of routing domain B don’t look preferential. Now be careful when you’re poisoning the seed metric. You don’t want to poison it so much, obviously, that it becomes unreachable to routing protocol domain B.

00:04:47
Now something that’s also pretty safe is when we go bi-directional, so we go two ways with our redistribution, but we do it in one point. This is also pretty safe. Where you want to pay particular attention and where you want to give caution is when you do bi-directional redistribution in two different locations.

00:05:12
This would look something like this, when we’re talking about these two routing protocol domains. So this is where we have to use caution when we’re doing dual redistribution in a couple of different spots. As Keith will demonstrate, there are many ways that you solve any route loops or route feedback issues during redistribution.

00:05:35
For instance, we can manipulate administrative distance. We could go ahead and apply many different types of filters. Oftentimes, individuals will like to tag routes and then filter, based on those tag values. It really doesn’t matter your particular approach, as long as your particular approach works, and when we say works here, we mean, of course, solves the loop or quiets the routing feedback.

00:06:06
Now let me let you in on a little secret, here. I never, I repeat, never fear redistribution. Even if I’m doing it in multiple spots and I’m doing bi-directional redistribution in those multiple spots– no problem, I never ever fear this. Why? Because I have a handy tool in my tool belt that I always, and I repeat, I always, use.

00:06:33
That tool is debug IP routing. That’s right. Debug IP routing is what I am going to place on every major router involved in the redistribution process. Debug IP routing is going to really demonstrate route feedback issues. Route loops would be easily demonstrated, of course, with ping and trace route tests, once redistribution has been performed.

00:07:05
So debug IP routing is really going to be our best friend. Turn it on before redistribution on all of your major devices. Ensure things are stable, and then you can go ahead and turn off debug IP routing. By the way, debug IP routing is pretty safe. Because in a stable routing environment, we aren’t going to be overwhelmed by the output of debug IP routing, because by definition, things are stable.

00:07:36
If we get into an unstable environment, then we would get lots of noise by debug IP routing. But we would welcome that noise, because it is tipping us off to the fact that our routing infrastructure is unstable. Well, let’s have Keith walk us through safe redistribution.

00:07:58
And by safe redistribution, I would imagine what Keith will demonstrate is redistribution that causes problems, how we can easily track down and solve those problems in redistribution. Thank you, Anthony. This is the topology that we get to play with in this video.

00:08:14
This diagram, by the way, is also available in the Nugget Lab files for this video. I’ve got OSPF represented in blue. I’ve got RIP represented in red, and we have EIGRP in green. We also have some very funky requirements regarding where, if we want to, we are allowed to do redistribution– which on R3 is redistributing OSPF into RIP, as indicated by this purple arrow.

00:08:39
Bi-directional redistribution between RIP and EIGRP, and then down at R5, bi-directional redistribution between OSPF and EIGRP. I’ve also got the loopback interfaces associated with each of the routers, based on their router number. And for this troubleshooting exercise, every single router who’s supposed to be introducing networks, for example, R4 with network 4.4.4.4, that is all set up correctly to get the network into the respective routing protocol as a starting point.

00:09:10
Our focus, here, is redistribution on R3 and on R5. So our customer, bless their heart, has configured this redistribution. It’s not working correctly. They’ve asked us to come in and take a look at it. We asked them, what exactly are the problems? And they say, well, not all the routes are showing up, and that’s all the detail they give us.

00:09:28
So what we might do is start off just on R1 and say, OK, what do we see in the routing table on R1? And we have the loopbacks for R1, which is directly connected. And we’re learning about the loopback of R2, R3, R4, R5, and R6, and that looks like a very good sign. And so we might be scratching our head.

00:09:46
Well, exactly which routes are missing? Now check this out. My hands will never leave my arms. When I press the Up Arrow key and do that show IP route one more time, look at what’s not there. We do not have a 4.4.4.4 network. It was there a moment ago, and now it’s not.

00:10:05
So one of the things I love, that Anthony mentioned, is that he has his favorite debug for troubleshooting route redistribution. It’s called debug IP routing. Now we might put that debug just on our redistribution devices. But the reality is we can pick almost any router in our environment that should be learning routes, and if there’s some kind of a route flapping going on, where our route is being introduced and pulled away and introduced and pulled away, it can also indicate that, as well, so we can see it.

00:10:32
So that’s a good little sampling. Let me go ahead and turn off debugging, just for a moment. And let’s take a look at what’s going on. This debug output says, OK, I learned about the 4.4.4 network, great. My next hop is 10.12.0.2, which is R2’s address. And then a moment later, it deletes it, and it gets rid of it.

00:10:49
And it’s very likely, if we have route flapping like this that’s going, and it’s due to some kind of an incorrect route redistribution, that situation is going to occur over and over and over again. So let’s take a look at our topology. Who is responsible for bringing that route into OSPF? I mean, R3 is taking RIP routes like 4.4.4.4 and putting them into EIGRP.

00:11:12
It’s R5’s job to take the routes from the EIGRP and put them into OSPF. And that is the method that R1 would learn about the route 4.4.4.4. So let’s go up to R3 and ask R3 whether or not it’s doing redistribution correctly regarding the routes from RIP into EIGRP.

00:11:32
And we’ll do that with the show IP protocols. And it says regarding redistributing that, indeed, it is taking RIP and redistributing it into EIGRP. Another item that Anthony told us about was the fact that sometimes we have to have default metrics in place when we’re doing redistribution.

00:11:49
Well, EIGRP is one of those times. RIP uses hop counts. EIGRP is using bandwidth and delay. So if we take a look at the running config, and I focus the output just on the router sections, in the EIGRP process, we’re doing a redistribution of RIP routes, and we’re giving them a default metric of 1 1 1 1 1, which is really, really bad, by the way.

00:12:10
Because one of those values represents bandwidth, and with a bandwidth of one, which goes into the calculation, it’s going to be a terrible metric. Now it’ll still be reachable, but it won’t be anything close to the better metrics of the other native EIGRP routes.
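To put a number on how bad that seed metric is, here is a small Python sketch of the classic EIGRP composite metric with the default K values, where only bandwidth and delay count. The comparison link (1,544 kbps with 20,000 microseconds of delay, typical defaults for a T1 serial interface) is an assumption chosen for illustration and is not taken from the lab.

```python
def eigrp_classic_metric(min_bandwidth_kbps, total_delay_tens_of_usec):
    """Classic EIGRP metric with default K values (K1 = K3 = 1, the rest 0)."""
    scaled_bandwidth = 10**7 // min_bandwidth_kbps  # IOS truncates this division
    return 256 * (scaled_bandwidth + total_delay_tens_of_usec)


# Seed metric of 1 1 1 1 1: bandwidth 1 kbps and delay 1 (in tens of microseconds).
# The reliability, load, and MTU values are ignored with default K values.
seed = eigrp_classic_metric(1, 1)

# Assumed comparison: a T1 serial link, 1544 kbps and 20,000 usec of delay.
t1_serial = eigrp_classic_metric(1544, 2000)

print(seed)                # 2560000256
print(t1_serial)           # 2169856
print(seed < 2**32 - 1)    # True: still below the 32-bit maximum
```

Under these assumptions the redistributed routes come in roughly a thousand times worse than a native route over a slow serial link, yet the value still fits in the 32-bit metric, which matches the point that they stay reachable.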

00:12:24
Also as we bring these routes into EIGRP, they will be considered external routes from an EIGRP perspective. And instead of having the internal administrative distance of 90, they’ll get the external administrative distance of 170. So they’re going to look bad from a couple levels, from the administrative distance level, as well as the metric is going to be terrible.

00:12:45
Another thing that we should really understand about that is that when we bring RIP routes into EIGRP, what do we really mean? When we do this redistribution command, what we’re talking about from R3’s perspective are any interfaces that are currently directly connected and enabled for RIP.

00:13:00
Those networks will make it in. And any RIP-learned routes that are currently in our routing table will also make it in. So if we look at the diagram up here, the directly connected network of 10.34 would qualify. And also the loopback interface of 4.4.4.4 would qualify, as long as they’re both in the routing table of R3. Because one thing that routers don’t like doing is they don’t want to advertise a route that they do not have.

00:13:27
So another thing we could check on is we could go down to R5 and say OK, dear Mr. R5, your responsibility is to take EIGRP routes, including those that have been redistributed into EIGRP and advertise them into OSPF. Are you doing your job? We can ask the question with show run pipe section router, take a look at the OSPF section, and see if we’re doing a redistribution of EIGRP into OSPF.

00:13:52
And what this says is absolutely, yes, we are taking EIGRP routes, and we’re introducing them into OSPF. Now OSPF, when we do redistribution into it, is one of those routing protocols that does set a default cost. And the default cost is 20. So even though we don’t have a cost set in this configuration, there is a default cost.

00:14:14
So now the question is, why did poor, old R1, if we go back there– how come he learned the route then deleted the route? And if we left debugging on, that would continue to happen until we correct this problem. So for grins, let’s go ahead and turn debugging of IP routing back on in R1, and let’s also go up to R3, who is doing a lot of our redistribution.

00:14:34
And let’s also do debug IP routing, right here. We’re looking for more clues, as far as what is causing this route, the 4.4.4.4 network, to show up, magically, on R1 and then to disappear just as quickly. So it looks like R3 is having a similar experience as R1 was. It had a route via OSPF, and now it’s deleting it.

00:14:57
And we’re going to let this debug run here, just for a few moments, so we can get a nice, clear picture of exactly what’s happening. And that is great. And that’ll be enough for a sample. Let’s go ahead and turn off debugging. And here’s the first part of this debug output that’s very, very concerning.

00:15:14
It says, I’m deleting the route to 4.4.4.4 via R2 that I learned via OSPF. And if we look at the topology, 4.4.4.4 is not an OSPF native route. It is a RIP native route. And then what’s happening a short while later is we are now learning an updated RIP route, the 4.4.4.4. It gets added to the routing table for just a moment.

00:15:40
And then we update that with an OSPF route for 4.4.4.4. Why? Because OSPF, it says, has a better administrative distance for 4.4.4.4. So because RIP has an administrative distance of 120 and OSPF has 110, that is the reason it thought it was a closer, or a better administrative distance.
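
The comparison the router is making can be sketched in a few lines. This is only an illustration of "lowest administrative distance wins" using the well-known Cisco default values; the function name and table layout are made up for the example.

```python
# Default Cisco administrative distances for the sources discussed here.
DEFAULT_AD = {
    "connected": 0,
    "static": 1,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
    "eigrp_external": 170,
}


def best_source(candidates, ad_table=DEFAULT_AD):
    """Return the route source with the lowest administrative distance."""
    return min(candidates, key=lambda source: ad_table[source])


# R3 hears about 4.4.4.4 from both RIP and OSPF: OSPF (110) beats RIP (120).
print(best_source(["rip", "ospf"]))  # ospf
```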

00:16:01
To really see what’s happening here, let’s go back to our diagram and talk about it for just a moment. One of the secrets that I’ve discovered in troubleshooting route redistribution is to focus on one single route at a time. And just pick one. For example, in this case, we have network 4.4.4.4 that is being implemented and removed, implemented and removed.

00:16:21
So it starts off as a RIP learned route. It goes into R3, it’s advertised. R3 learns it via RIP. It’s in the routing table as a RIP route, and it redistributes that into EIGRP. No problem so far. Now this EIGRP external route is going to be redistributed into OSPF.

00:16:39
That’s also not a problem. So 4.4.4.4 shows up, R1 knows about it, all the OSPF speakers know about it, and that is a problem. As that network is advertised in OSPF, not only is R1 going to learn about it, but R3 is also going to learn about the 4.4.4.4. But this time, it’s an OSPF learned route, and because, by default, the administrative distance for OSPF is 110 and RIP is 120, it chooses to go ahead and start to use the OSPF learned route in the routing table on R3. Now at first glance, we might think, OK Keith, what’s the problem with that? It has an OSPF learned route.

00:17:16
How come the route disappears? Well, the moment that R3 starts learning this network as an OSPF learned route, it is no longer a RIP learned route, which causes it to no longer be redistributed into EIGRP, which causes it also to no longer be redistributed into OSPF, because it never made it into EIGRP.

00:17:36
Then R3 removes that route, and now we’re at the mercy of RIP to start that ball rolling again. The next time R4 advertises the 4.4.4 network, it gets installed as a RIP learned route that starts being redistributed back into EIGRP. OSPF learns about it, and the cycle continues.

00:17:54
That route flapping, by the way, won’t solve itself. So the question is, how do we go about solving this problem? And it really is a problem about administrative distance of 110 for OSPF versus 120 for RIP for the 4.4.4 network. So the solution to this, one solution, is to go ahead and simply tell R3, listen, Mr. R3, we want you to go ahead and consider that for OSPF, the administrative distance for all OSPF routes is 121. And that would cause R3, when it learns that route via OSPF to say, oh, I’m not going to install that route in my routing table, which would continue to allow the RIP learned route for 4.4.4.4 to be redistributed and advertised into EIGRP.
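
The cycle, and the effect of raising the OSPF distance, can be walked through with a toy model of R3's routing table entry for 4.4.4.4. This is a simplified sketch, not real router behavior: each loop step stands for one round of updates, and the function and variable names are invented for the illustration.

```python
def simulate_r3(ospf_ad: int, rip_ad: int = 120, steps: int = 6) -> list:
    """Track which source owns the 4.4.4.4 route on R3 over several update rounds."""
    history = []
    installed = None     # source of the route currently in R3's table
    ospf_offer = False   # is OSPF currently advertising 4.4.4.4 back to R3?
    for _ in range(steps):
        if installed is None:
            # The next RIP update from R4 installs the route.
            installed = "rip"
        elif installed == "rip":
            # The redistributed copy returns via OSPF; the lower distance wins.
            if ospf_offer and ospf_ad < rip_ad:
                installed = "ospf"
        elif installed == "ospf":
            # The OSPF copy has been withdrawn, so R3 deletes the route.
            if not ospf_offer:
                installed = None
        # OSPF only carries the prefix while R3 holds it as a RIP route,
        # because that is what feeds RIP -> EIGRP -> OSPF redistribution.
        ospf_offer = installed == "rip"
        history.append(installed)
    return history


print(simulate_r3(ospf_ad=110))  # ['rip', 'ospf', None, 'rip', 'ospf', None]
print(simulate_r3(ospf_ad=121))  # ['rip', 'rip', 'rip', 'rip', 'rip', 'rip']
```

With the default distance of 110 the entry keeps cycling; with 121 the RIP route stays put, which is the behavior the fix is after.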

00:18:38
That is probably the simplest and easiest solution. Another possibility, though, is we could do a distribute list in OSPF. We could tell R3 that based on a specific access list that is denying the 4.4.4.4 network to go ahead and not install that route in its own routing table.

00:18:56
And that would also prevent the loss of the RIP learned route. And with that RIP learned route still in place, it would still be advertised. Using a distribute list with OSPF is a Band-Aid. Because the only person we’re really faking out is R3 regarding that specific route.

00:19:12
It would be much better to do the administrative distance technique on R3. So on R3– in fact, let’s do this. Let’s go back to R1 just for a moment. And so that debug has been running for quite a while, and that route’s flapping. Let’s go back to R3, and on R3, we’ll go into router configuration mode for OSPF process number one.

00:19:31
And we’ll simply say distance 121, so that this router, router three, will think, OK, any OSPF learned routes have an administrative distance of 121. And that will cause it to continue to believe that the 4.4.4 network learned from RIP should stay in the routing table, which is exactly what we need to have happen.

00:19:50
So now that we’ve done that, let’s go back to R1 for a moment. And what we should have now is we should have no more flapping of that route. And we can verify that it’s present by simply doing a show IP route, right here on R1 and just validate that we have the 4.4.4 network in place. And there it is.

00:20:09
We could also validate our path to get to that 4.4.4 address by doing a trace. So before we do this trace, let’s just think about the path we should be taking. R1 is going to have to forward to R2. And because the route was introduced into OSPF through R5, who is the autonomous system boundary router, the next hop would be R5, then it should be R3. And then finally, the last hop or the last destination should be R4. Let’s go ahead and test that.

00:20:36
And we’ll do that with a simple trace to 4.4.4. We’ll also source it from our loopback address, which is 1.1.1.1. It also validates that R4 has a route back to 1.1.1.1. And that is the path that we expected. So let’s do another spot check. Let’s go up to R4. This is the RIP router, and let’s do a show IP route, just to see if we can see all of the loop backs that are in the topology.

00:21:02
So we have R1’s, R2’s, R3’s, R4’s. But we don’t have R5, nor do we have R6’s loopback addresses. Having loopback IP addresses that match the router number is very, very convenient. However, in a network where we don’t have a conveniently numbered system, we can just pick specific networks from various points of the network just to validate if those routes show up in the routing table on remote routers.

00:21:27
So because we’re missing EIGRP routes, let’s just go take a look at R3 and ask him if he’s doing redistribution into RIP from EIGRP. And a show run pipe section router should show us that information. So under RIP, right here, we’re doing a redistribution of OSPF into RIP, but I do not see any redistribution of EIGRP into RIP.

00:21:50
So to do that very, very simply, we’ll go into configuration mode, we’ll go into router configuration for RIP, and we’ll simply say, please bring in those EIGRP routes. And we’ll go ahead and give it a seed hop metric of five. This just helps to ensure that from a RIP perspective, these EIGRP routes are not going to look really, really great, but they should still be reachable.

00:22:13
So now that we’ve done that redistribution, let’s go back up to R4 and ask him one more time. Please show me your routing table. Do you have networks 5.5.5.5 and 6.6.6.6? And the answer is, sure enough, right there. We also have the 10.56 network, which is the serial link between R5 and R6, which we didn’t have previously. It’s missing right off the bottom of that list.

00:22:37
So now we can do some basic testing, for example. Let’s do a trace route from R4 to the loopback of R6. We’ll source it from our own loopback 0. And that will verify that R6 also has reachability back to 4.4.4.4. And the path was router three, router five, then router six.

00:22:54
We could do that same test over from R1, just to validate from OSPF over to EIGRP– it’s working, and back. So we’ll do that same trace to 6.6.6.6, source it from loopback 0 on R1. Of course, loopback 0 would be 1.1.1.1. And it looks like we have a little problem.

00:23:11
I’m going to do a Control-Shift-6 to break that sequence. And let’s take a look at this together. So the packet went to 10.12.0.2, which is great and then went to R5. And then it, very likely, made it to R6, and R6 does not have a route back to our loopback address of 1.1.1.1. That would be my guess.

00:23:33
Let’s go take a look at R6 and ask him. So over on R6, we’ll simply do a show IP route. We could also say, show IP route for 1.1.1.1. I just want to take a look at the entire routing table. And look at this, I’ve got 4.4.4.4, because it’s an external EIGRP route.

00:23:49
It’s got an administrative distance of 170. I know the loopback of R5. But what I don’t see here is I don’t see any of the OSPF routes, which would be the loopbacks of R1, R2, R3, or any of the serial links in the OSPF domain. So on R5, we’re allowed to do bi-directional redistribution. Let’s just verify that it’s happening.

00:24:11
We already have the output, right here on the screen from a previous show command. And right here, in router EIGRP, we are not doing any redistribution of OSPF into EIGRP. And that would explain the missing routes on R6. See, R5 had no problem, because he had access directly to the OSPF domain, but EIGRP only speakers, who didn’t have direct access to OSPF, are not going to have those routes.

00:24:37
So on R5, we can fix it. We’ll go into configuration mode for router EIGRP, autonomous system number one. We’ll say, please bring in the OSPF routes. And we’ll set a metric that’s really, really poor, but acceptable. So now if we go back to R6, and we use a simple up arrow key, we should have all of the routes.

00:24:56
If we go back to R1 and do that same test that we did previously, this trace should be successful, from router two to router five to router six, who now has a route back to 1.1.1.1. I had a great time. I appreciate you spending it with us. On behalf of Anthony and myself, we hope this has been informative for you.

Troubleshooting in R&S Cert Exams

00:00:00
Now to conclude this Nugget course, Keith and I wanted to spend a moment here and really get specific on troubleshooting in Cisco certification. You know throughout this course we have alluded to how the course maps to CCENT, CCNA, CCNP, and CCIE for route switch.

00:00:20
So we wanted to wrap things up here by presenting to you how, specifically, you’re going to find troubleshooting approached in these various certification exams. So you’re probably wondering, what is the specific body of knowledge, if you will, that you will be tested with and that this particular Nugget series addressed.

00:00:43
Well here they are. Yeah, here they are. 100-101 ICND1, that’s the version 2 exam that of course maps directly to your CCENT certification. There’s 200-101 ICND2, version 2 that maps to what? That is your CCNA at that point. By the way, we are also going to cover, in this Nugget series, 200-120, that’s the composite exam, if you wanted to jump right to CCNA with one exam.

00:01:17
Then we’ve got route, switch, and TSHOOT. These exams make up the CCNP and then two exams for CCIE, the written and the practical. So this is the body of exams that Keith and I have been thinking about and have been addressing as we’ve moved throughout this course chock-full of exciting Nuggets.

00:01:46
So let’s start in a logical place. Exam 100-101 ICND1, it’s version 2 and this is for CCENT certification. Ironically enough, even though this is our entry level certification, troubleshooting questions can take a very wide variety of forms. The first type you need to be ready for is the standard multiple choice, where one option is going to be correct out of four options.

00:02:13
A perfect example here might be something along the lines of, how would we check what routes are populated in our routing table. And of course one of the correct responses would be: show IP route. So standard multiple choice is our first option. Another option that you need to be ready for is what I call multiple multiple choice, or as Cisco would call multiple choice multiple answer.

00:02:39
Notice, visually, you’re going to get a clue of this question type because the radio buttons that you had before are indeed going to be replaced with check boxes. Always make sure to check the rules of the question. Notice in this particular case, we’re going to choose three, that’s very important.

00:02:57
There is no partial credit in these multiple choice questions, so you would have to get all three of the responses that are correct. Again, you can imagine the troubleshooting type questions here would involve maybe something like what could go wrong with the particular protocol and there are several options that are correct.

00:03:16
You would need to choose all of the correct options. Now, by far, the most exciting type of question you could receive in this particular exam, based on troubleshooting, is a router simulation question. In this question you’re going to read the scenario up here and learn what you are supposed to troubleshoot.

00:03:38
These over here are simply instructions on how to conduct this lab, which you don’t need to read now because I’m going to give you those. So we read the instructions on top on what we are supposed to accomplish. Notice the device with a dotted line to a router is the device that’s going to present us with a console to that router.

00:03:58
Here you can see that we have a console connection to the router Lab A. If we are to configure Lab B or Lab C routers, obviously those would need to be accessed from Lab A, using something like telnet or secure shell. Because we only can make a connection here in this case to the Lab A device.

00:04:17
We click on the host that has the terminal software to make that connection, we press return to get started, and we are in the router simulator. Now notice this is actually just a demo, so it won’t have a lot of commands, but to answer some frequent questions that students have we can use things like tab autocomplete, context sensitive help, shortcuts in the particular lab simulation.

00:04:41
So it’s probably unlikely that at the ICND1 level you would get troubleshooting sims, but it is a possibility, so we include it here in the discussion for completeness’ sake. Now of all the topics that you have to worry about for ICND1 version 2, what would be the main areas of focus from a troubleshooting perspective?

00:05:06
Notice I said the main areas of focus, not the only areas. But really it’s going to be Layer 2 concerns. I would really want you to focus on VLAN and Trunk troubleshooting for this particular certification level. Now how about 200-101 ICND2, the exam that provides our CCNA certification.

00:05:30
Well, we’re certainly going to have the exact same question types that we might face, multiple choice related to troubleshooting, multiple answer multiple choice, we could have our simulations regarding troubleshooting in this particular exam. What are our major topics? Well certainly from a Layer 2 perspective, now we have to pick up Spanning Tree Protocol.

00:05:54
And we really want to focus on two routing protocols and their troubleshooting, and that would be EIGRP and OSPF in this particular exam. Then we have our route and our switch exams for CCNP. Notice we leave TSHOOT for discussion next in this particular Nugget.

00:06:18
But when it comes to route and switch once again we have our multiple choice, our multiple answer multiple choice, and potential simulations that involve troubleshooting. From a content perspective, it’s no surprise that with route we are really going to be intensely concentrating on our OSPF protocol, our EIGRP protocol, don’t forget about Border Gateway Protocol at this level, policy based routing, generic routing encapsulation, and our discussions on redistribution.

00:06:54
Switch– well, obviously we’re going to be looking at all of our Layer 2 topics that we explored intensely in this particular course on routing switching troubleshooting mastery. Then we get to the star of the show, the TSHOOT exam. This particular exam really, really encompasses everything that we took a look at in this particular route switch troubleshooting mastery course.

00:07:24
Now there’s going to be several multiple choice, I’d say five to six multiple choice and potentially multiple answer multiple choice, to start this exam. Just five or six of them and then you move to a completely simulation-based test. So a bulk of this test, 85% of this exam, is indeed SIM testing.

00:07:50
Now we have a three-step process for you to utilize in order to get prepared for this particular exam. Step one is for you to go ahead and know exactly what’s going to be on this test. This is more so than ever something that is delivered to you by Cisco Systems.

00:08:14
Let me show you. So you remember how we get information about our certification exams, right? We go to cisco.com, we go to training and events, we go under training and certifications, and then we choose what we’re interested in, like CCNP route switch in this case, and then there’s our TSHOOT exam.

00:08:33
But actually this is going to link us to another area that’s very important that we visit to get the resources that I want you to utilize for this test. Notice up here in the top right it says Cisco learning network, go now. Yeah we want to head over to the Cisco learning network, and we’re going to go to the professional category and then the CCNP routing and switching.

00:08:59
Here for the TSHOOT exam, just go to the overview and these are the resources that we want. Look at this, it says, “exam demo and tutorial” “review the TSHOOT demo and tutorial.” This is absolutely remarkable. Let’s start with the TSHOOT exam instructions.

00:09:27
It says, do you want to download these instructions? I’ll say, yes absolutely, in fact, let me just go ahead and open them. Let me resize this, alright perfect. So here are the TSHOOT exam demonstration instructions, and they walk you through how this particular simulation is going to work in the exam.

00:09:51
So you’re going to read this closely. You’re going to have a simulation interface in the exam with tickets, over here on the right hand side, and then exhibits you can access. So when you click on a ticket, you’re going to get a question on what is going wrong, and then you’re going to get a question on how would you fix it.

00:10:16
Now to find all of this information you’re going to go in and explore on the particular devices. Please note that you are just exploring. OK? You’re not making configuration changes to the devices so while it’s a SIM, it’s a SIM for you to just explore the configuration of these various devices.

00:10:41
So the first document that we want to access there is the TSHOOT exam instructions. OK? Now you have the TSHOOT exam topology, and this is really unbelievable, folks. Cisco gives you the exact, that’s right I said exact, troubleshooting topology that you are going to find in the TSHOOT exam, the exact topology, letter for letter.

00:11:16
So can you memorize this layout of the Layer 3 topology? Can you memorize the RIP next generation and OSPF version 3 IPv6 topology? Can you memorize the Layer 2/3 topology? Absolutely, you can. If you wanted to ensure your preparedness, you would have as much of this committed to memory as you absolutely could.

00:11:43
These are the exact diagrams that you will see in the exam. Also, what is this doing for you? It is demonstrating your exact scope. Are you going to get EIGRP? Absolutely. Are you going to get OSPF? Absolutely. Are you going to get network address translation? Yes you will.

00:12:03
Will you have a DHCP server set up in your exam to give clients their information? Yes you will. So you can see, from this topology, a look at the exact resources that you are going to need to troubleshoot. Absolutely amazing that we’re getting this level of detail from Cisco Systems.

00:12:26
And then, we have an exam demo link. This is so amazing. So we’re going to go in and we’re going to get to click on a sample ticket. And it’s going to say, OK, on which device is the fault condition located, based on some information. You can minimize that question, and then you can go into your topology, and you can start running your various show commands in order to determine what is wrong.

00:12:58
OK? So what you do is you go in and you take your first ticket, you respond to that first question, and then you’re going to say, next question I’m done– you’re not done yet you have three questions to answer in each of these testlets. It then says, what’s the fault condition related to.

00:13:20
And let’s say you discover that it was related to EIGRP routing, then you go to your next question, and then it says, what is the solution. And this is what each ticket is going to follow: where is the problem, what is the problem, and how would you fix it? Every single ticket is going to follow that exact same methodology.

00:13:42
So let’s say the fix was we need to delete the passive interface default command under EIGRP. We would then say, done, and we are done with that ticket. Notice that ticket is now highlighted in red because we have answered all three of the questions. So what is your responsibility? What is your job? Well step one here is to go through, carefully, all of those particular resources that I just demonstrated for you, available at the Cisco learning network.

00:14:17
You’re going to know exactly how the exam is structured, you’re going to know the exact technologies on your exam, as well as the actual diagrams that they are going to be utilizing. Now your second step is to pick an overall troubleshooting methodology. I would highly recommend you use trace the path, and I would go ahead and combine that with the bottom-up approach.

00:14:50
If I could spell bottom– there we go– the bottom-up approach. So in other words, you’re going to literally– let’s say the problem involves some client not being able to connect to a particular resource out there. And let’s say the resource is some server, Server A, and this is Client B, this is Switch 1, Switch 2, and Router 3. So if that’s the path, you’re going to want to go ahead and start your troubleshooting right here and I would go bottom-up.

00:15:25
So check Layer 1, Layer 2, then move to Layer 3, then move to this device. Do your bottom-up troubleshooting. Then move to this device, do your bottom-up troubleshooting. Then this device. And then finally that device. So pick a strategy and then utilize that strategy very, very quickly in your exam.
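
Written out as a checklist, the combined trace-the-path and bottom-up approach looks something like the sketch below. The device order (Client B first, Server A last) is an assumption about the whiteboard drawing, and the `start_at` parameter is a hypothetical convenience for beginning partway along the path.

```python
from typing import Iterator, List, Tuple

# Bottom-up order of checks performed on each device.
LAYERS = ["Layer 1", "Layer 2", "Layer 3"]


def check_order(path: List[str], start_at: int = 0) -> Iterator[Tuple[str, str]]:
    """Yield (device, layer) pairs: walk the path hop by hop, bottom-up at each hop."""
    for device in path[start_at:]:
        for layer in LAYERS:
            yield device, layer


# Assumed path from the example: client, two switches, a router, then the server.
path = ["Client B", "Switch 1", "Switch 2", "Router 3", "Server A"]
plan = list(check_order(path))

for device, layer in plan[:4]:
    print(device, layer)
```

The first three checks are all on Client B, and only then does the plan move on to Switch 1, which mirrors the "finish one device, then move along the path" habit described here.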

00:15:47
By the way, one way you can save a lot of time is you could do a ping or a trace, right? In this case Client B is supposed to be able to get to Server A. Why not ping or trace and see how far you can get? Maybe you can get to this second switch so now you’ve just saved yourself some time.

00:16:08
You don’t need to engage in troubleshooting in those first two hops. So pick a strategy, and then go for it. Now our third step is to take advantage of something that is there to be taken advantage of in the exam interface. Let me demonstrate. What you have the ability to do is really take advantage of the exam interface that they presented to you.

00:16:35
You see, we have these base topologies, right? We have these base topologies, they’re identical for all of the tickets. And guess what, you have base configurations that are all identical. That’s right. So what you have the ability to do is if you were to see a configuration, let’s say it’s an EIGRP configuration, and notice here in fact that auto-summarization is indeed on, and you’re suspecting that might be a problem when you’re in a particular trouble ticket, and you’re working in this trouble ticket you can choose this abort button.

00:17:17
And you can leave that ticket. You can go to another ticket and you can go in, and you can do a comparison on the show run output. So you could say, show run and you could go down and you could scrutinize the EIGRP configuration in this ticket. So jumping around, leaving a ticket with the abort feature, and going to another ticket will allow you to compare these baseline configurations to help zero in on a particular configuration that you might think is a problem.
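
In the exam this comparison is done by eye, but the idea can be sketched with Python's standard `difflib` module. Both configuration snippets below are hypothetical and exist only to show what "compare the show run output from two tickets" boils down to.

```python
import difflib


def config_differences(reference: str, suspect: str) -> list:
    """Return only the lines that differ between two configuration dumps."""
    diff = difflib.unified_diff(
        reference.splitlines(), suspect.splitlines(), lineterm="", n=0
    )
    # Keep added/removed lines, skipping the '---' and '+++' file headers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical EIGRP section as seen in another ticket.
other_ticket = "router eigrp 1\n network 10.0.0.0\n no auto-summary"

# Hypothetical EIGRP section in the ticket being worked on.
current_ticket = "router eigrp 1\n network 10.0.0.0\n auto-summary"

changes = config_differences(other_ticket, current_ticket)
print(changes)  # ['- no auto-summary', '+ auto-summary']
```

Anything that shows up in the difference is a candidate for the fault in the current ticket; anything identical across tickets is probably part of the baseline.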

00:17:59
Again, what’s great is these three steps can be thoroughly practiced with all those great materials that Cisco provides at the Cisco learning network. Again, we can go in and memorize the topologies and the technologies. We can go ahead and practice with a tracing the path and bottom-up troubleshooting approach, and we can practice comparing the different configurations that are available in our ticket baselines, all with those great resources.

00:18:33
And that brings our discussion to exam 350-001, this is the CCIE written qualification exam. This qualifies you to take the practical lab exam or this would recertify your CCIE. Now please understand, once again, we have multiple choice, our multiple answer multiple choice, and the potential for simulations involving troubleshooting, and, as you might guess, the potential scope is every single Nugget of this series.

00:19:09
So all of the Layer 2, all the Layer 3 Nuggets that we explored, in this particular course, are all fair game for this certification exam. And that brings us to the CCIE Routing and Switching Practical Lab Exam, version 4. In this particular exam, troubleshooting plays a massive part– you could almost argue that it’s too big a role that it plays, but I’ll leave that discussion for maybe some future Nugget.

00:19:39
But anyways, here is exactly what’s going to happen. When you go and you sit down and you begin your practical exam, your very first section, two hours in length, is the troubleshooting section. You must pass this in order to pass the CCIE Routing and Switching Lab Exam.

00:19:58
By the way, if you should finish before two hours you can start the next six-hour section of configuration and you will have extra time. I only recommend doing this if you are 110% sure that you passed the troubleshooting section. Don’t make the mistake of leaving it early, giving yourself extra time in config and failing it, because now you’ve failed the entire exam.

00:20:26
By the way, the configuration section could also have troubleshooting in it, and that’s why I’ve indicated that the troubleshooting in this test might be a little bit on the heavy side. Now what is your two-hour troubleshooting section like? Well you’re going to have about 30 devices, so about 30 routers and switches that you have.

00:20:49
But don’t let that intimidate you because Cisco will make it very clear to you, of the 30 devices, which are involved in your trouble tickets. You’re going to have 10 trouble tickets. So you’re going to have trouble tickets numbered 1-10 and for each of the trouble tickets it’s going to make it clear what is the subset of those devices that apply to the ticket.

00:21:16
The tickets are not interdependent. I repeat, the tickets are not interdependent. They’re not interrelated. That’s why Cisco needed so many devices. They needed so many devices to create the 10 non-interdependent tickets. So you can do these in any particular order that you deem appropriate.

00:21:39
Here’s what I would do if I were you. I would immediately, when starting this section, I would make a tracker. That’s right, I would go ahead and make a numbered tracker for all 10 of my trouble tickets. By the way, each of the trouble tickets will be worth two or three points.

00:22:01
They’re trying to give you an indication of the level of difficulty by varying, slightly, the point value on each of the tickets. Three points are going to be slightly more difficult, or require more configuration, than two point tasks. So what I would do is I would go in, and then create a tracker like this.

00:22:21
And I would begin with the very first ticket and I would give myself three minutes. Yep, I would give myself three minutes to try and identify what’s wrong. Then I would go ahead and give myself three minutes to try and fix it. If at any point I go over these three minute timers, boom, I’m going to record observations and then move on to the next one.

00:22:45
Again three minutes to isolate, three minutes to fix, then move on to the next one. Be very, very precise with these timers because you do not want to run out of time in this section and, as you might guess, that’s what happens to a lot of candidates. They might do them in order, and they might get number one, get number two, get number three, and they’re doing great time-wise, let’s say they got this one in five minutes, this one in four minutes, this one in eight minutes.

00:23:11
So they’re doing pretty good time wise and then they spend 20 minutes on ticket four. That is an absolute disaster and that is a surefire way to fail this. So I recommend three minutes for problem isolation. If you can’t do it, move on to the next one. Record your observations.
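
The arithmetic behind this timing advice is easy to check. The sketch below uses the numbers from the discussion (10 tickets, a two-hour section, three minutes to isolate and three to fix, and the 5/4/8/20 minute example); the function names and the dictionary tracker are invented for the illustration.

```python
ISOLATE_LIMIT = 3      # minutes allowed to identify what is wrong
FIX_LIMIT = 3          # minutes allowed to fix it
SECTION_MINUTES = 120  # the two-hour troubleshooting section
TICKETS = 10


def first_pass_budget(tickets: int = TICKETS) -> int:
    """Minutes used if every ticket takes the full isolate and fix timers."""
    return tickets * (ISOLATE_LIMIT + FIX_LIMIT)


def over_budget(minutes_spent: dict) -> list:
    """Return the ticket numbers that ran past the combined six-minute timer."""
    limit = ISOLATE_LIMIT + FIX_LIMIT
    return [ticket for ticket, minutes in minutes_spent.items() if minutes > limit]


# The candidate from the example: 5, 4, and 8 minutes, then 20 on ticket four.
spent = {1: 5, 2: 4, 3: 8, 4: 20}

print(first_pass_budget())                    # 60
print(SECTION_MINUTES - first_pass_budget())  # 60
print(sum(spent.values()))                    # 37
print(over_budget(spent))                     # [3, 4]
```

Sticking to the timers means a full first pass through all 10 tickets costs at most an hour, leaving an hour to return to the ones that were skipped. The candidate in the example has already burned 37 minutes on four tickets.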

00:23:29
When you do isolate what’s wrong, then give yourself the three minutes to fix it. And again, if you can’t do it, jump around, recording as many observations as you deem helpful in your tracker, as you are moving through the tickets. So now I believe you realize that this particular course, that Keith Barker and myself put together for you, really does go through and challenge you, from a troubleshooting perspective, in a wide variety of important and real world relevant technologies.

00:24:04
But it also really gives you the foundation for success in all of the exams that we have here listed on screen. For those of you that might be joining us that are just starting your journey at the CCENT level, I hope you’ve enjoyed this particular Nugget getting a preview of the types of route switch troubleshooting that we’ll need to do in the various certifications that you have to look forward to.