MPLS for a BGP-free core

Here we are with my second post on my new blog. This time I’ll talk about one of the benefits of implementing MPLS within the backbone: it allows you to run a BGP-free core. We’ll start with the following scenario:

[Figure: starting configuration]

We implement OSPF as the IGP, and every router (the Provider Edge routers and the inner Provider routers) has full knowledge of the point-to-point links and the loopbacks shown in blue in the scenario above. Let’s have a look, for example, at the routing table of PE 1 and at its OSPF configuration (the rest of the configuration is trivial: just assign IP addresses to the interfaces):

PE1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
 D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
 N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
 E1 - OSPF external type 1, E2 - OSPF external type 2
 i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
 ia - IS-IS inter area, * - candidate default, U - per-user static route
 o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

 1.0.0.0/32 is subnetted, 1 subnets
O 1.1.1.1 [110/11] via 172.16.0.2, 00:13:58, FastEthernet0/0
 2.0.0.0/32 is subnetted, 1 subnets
O 2.2.2.2 [110/21] via 172.16.0.2, 00:12:31, FastEthernet0/0
 172.16.0.0/30 is subnetted, 3 subnets
O 172.16.0.8 [110/30] via 172.16.0.2, 00:12:31, FastEthernet0/0
O 172.16.0.4 [110/20] via 172.16.0.2, 00:13:48, FastEthernet0/0
C 172.16.0.0 is directly connected, FastEthernet0/0
 22.0.0.0/32 is subnetted, 1 subnets
O 22.22.22.22 [110/31] via 172.16.0.2, 00:10:27, FastEthernet0/0
 10.0.0.0/24 is subnetted, 1 subnets
C 10.1.0.0 is directly connected, FastEthernet0/1
 11.0.0.0/32 is subnetted, 1 subnets
C 11.11.11.11 is directly connected, Loopback0

PE1#sh run | sec router
router ospf 1
 router-id 11.11.11.11
 log-adjacency-changes
 network 11.11.11.11 0.0.0.0 area 0
 network 172.16.0.1 0.0.0.0 area 0

Requirement: enable communication between the two LAN segments in red, distributing the corresponding networks with iBGP (internal BGP).

1. iBGP connection between the PE routers: good starting point, but it doesn’t work

Let’s configure an iBGP connection between PE 1 and PE 2:

[Figure: iBGP session between the PE routers]

PE1#sh run | sec router bgp
router bgp 65000
 no synchronization
 bgp log-neighbor-changes
 network 10.1.0.0 mask 255.255.255.0
 neighbor 22.22.22.22 remote-as 65000
 neighbor 22.22.22.22 update-source Loopback0
 no auto-summary

PE2#sh run | sec router bgp
router bgp 65000
 no synchronization
 bgp log-neighbor-changes
 network 10.2.0.0 mask 255.255.255.0
 neighbor 11.11.11.11 remote-as 65000
 neighbor 11.11.11.11 update-source Loopback0
 no auto-summary

This is sufficient to let PE 1 know about the LAN connected to PE 2 (and vice versa):

PE1#sh ip route 10.2.0.0 255.255.255.0
Routing entry for 10.2.0.0/24
 Known via "bgp 65000", distance 200, metric 0, type internal
 Last update from 22.22.22.22 00:04:10 ago
 Routing Descriptor Blocks:
 * 22.22.22.22, from 22.22.22.22, 00:04:10 ago
 Route metric is 0, traffic share count is 1
 AS Hops 0

So let’s try to ping 10.2.0.1 from 10.1.0.1:

PE1#ping 10.2.0.1 source 10.1.0.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.2.0.1, timeout is 2 seconds:
Packet sent with a source address of 10.1.0.1
.....
Success rate is 0 percent (0/5)

It doesn’t work… why? Let’s investigate: the routing table of PE 1 says that 10.2.0.0/24 is reachable through 22.22.22.22, which is not directly connected, so a recursive lookup for 22.22.22.22 is needed. That lookup succeeds and tells PE 1 that it must forward the packet to 172.16.0.2. The recursive lookup is done automatically and the result is programmed into CEF:

PE1#sh ip cef exact-route 10.1.0.1 10.2.0.1
10.1.0.1 -> 10.2.0.1 : FastEthernet0/0 (next hop 172.16.0.2)
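
To make the recursive lookup concrete, here is a minimal Python sketch of what PE 1 does conceptually. The table holds only the three entries relevant to this packet, transcribed from the outputs above. The names PE1_RIB, longest_match and resolve are made up for illustration, and this is of course not how IOS implements CEF internally:

```python
import ipaddress

# Relevant part of PE 1's routing table, transcribed from the outputs above.
# An entry with interface=None (the iBGP route) needs a further lookup on its next hop.
PE1_RIB = {
    "10.2.0.0/24": {"next_hop": "22.22.22.22", "interface": None},                 # iBGP
    "22.22.22.22/32": {"next_hop": "172.16.0.2", "interface": "FastEthernet0/0"},  # OSPF
    "172.16.0.0/30": {"next_hop": None, "interface": "FastEthernet0/0"},           # connected
}


def longest_match(rib, address):
    """Return the (network, entry) pair with the longest prefix covering address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, entry in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best


def resolve(rib, destination, max_depth=8):
    """Follow next hops until an entry with an outgoing interface is found."""
    target = destination
    for _ in range(max_depth):
        match = longest_match(rib, target)
        if match is None:
            return None  # no route: the packet is unroutable
        entry = match[1]
        if entry["interface"] is not None:
            # For a connected network the next hop is the target itself.
            return entry["interface"], entry["next_hop"] or target
        target = entry["next_hop"]  # recurse on the BGP next hop
    return None


print(resolve(PE1_RIB, "10.2.0.1"))  # ('FastEthernet0/0', '172.16.0.2')
```

The result matches the exact-route output above: the iBGP next hop 22.22.22.22 is resolved through the OSPF route to 172.16.0.2 on FastEthernet0/0.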

We can therefore assume that the ping is forwarded out of FastEthernet 0/0 of PE 1 and that it should be received by P 1. Let’s use the follow-the-path troubleshooting technique and look at what happens on P 1 when the echo request is received on FastEthernet 0/0. We’ll use debug ip packet and disable ip route-cache on the FastEthernet 0/0 interface of P 1, because otherwise the debug shows only locally generated or locally processed IP packets. This is the modified configuration and what we see as soon as we send out an ICMP echo request:

P1#sh run | sec access-list
access-list 100 permit ip host 10.1.0.1 host 10.2.0.1
P1#sh run int fa 0/0
!
interface FastEthernet0/0
 ip address 172.16.0.2 255.255.255.252
 no ip route-cache cef
 no ip route-cache
end

P1#debug ip packet 100 detail
IP packet debugging is on (detailed) for access list 100
P1#
*Mar 1 00:20:30.083: IP: s=10.1.0.1 (FastEthernet0/0), d=10.2.0.1, len 100, unroutable
*Mar 1 00:20:30.083: ICMP type=8, code=0

It says unroutable, so let’s have a look at the routing table of P 1:

P1#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
 D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
 N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
 E1 - OSPF external type 1, E2 - OSPF external type 2
 i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
 ia - IS-IS inter area, * - candidate default, U - per-user static route
 o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

 1.0.0.0/32 is subnetted, 1 subnets
C 1.1.1.1 is directly connected, Loopback0
 2.0.0.0/32 is subnetted, 1 subnets
O 2.2.2.2 [110/11] via 172.16.0.6, 00:22:47, FastEthernet0/1
 172.16.0.0/30 is subnetted, 3 subnets
O 172.16.0.8 [110/20] via 172.16.0.6, 00:22:47, FastEthernet0/1
C 172.16.0.4 is directly connected, FastEthernet0/1
C 172.16.0.0 is directly connected, FastEthernet0/0
 22.0.0.0/32 is subnetted, 1 subnets
O 22.22.22.22 [110/21] via 172.16.0.6, 00:22:48, FastEthernet0/1
 11.0.0.0/32 is subnetted, 1 subnets
O 11.11.11.11 [110/11] via 172.16.0.1, 00:22:58, FastEthernet0/0

The packet is discarded, because P 1 doesn’t have a route toward 10.2.0.1!
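
A quick way to double-check this is to test the two addresses against the prefixes in the P 1 table above. The following is just an illustrative Python sketch (P1_ROUTES and covering_routes are names I made up), using the prefixes transcribed from the output:

```python
import ipaddress

# Prefixes present in P 1's routing table, transcribed from the output above.
P1_ROUTES = [
    "1.1.1.1/32",
    "2.2.2.2/32",
    "172.16.0.8/30",
    "172.16.0.4/30",
    "172.16.0.0/30",
    "22.22.22.22/32",
    "11.11.11.11/32",
]


def covering_routes(routes, address):
    """Return every prefix in routes that contains address."""
    addr = ipaddress.ip_address(address)
    return [r for r in routes if addr in ipaddress.ip_network(r)]


print(covering_routes(P1_ROUTES, "10.2.0.1"))     # [] -> unroutable on P 1
print(covering_routes(P1_ROUTES, "22.22.22.22"))  # ['22.22.22.22/32']
```

P 1 knows how to reach the BGP next hop 22.22.22.22, but it knows nothing about the LAN behind it, and there is no default route to fall back on.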

2. Implement a full mesh of iBGP connections

We can solve the issue of the previous section by also implementing iBGP on the two P routers. The problem is that it is not sufficient to implement iBGP between PE 1 and P 1 and between PE 2 and P 2: one of the rules of BGP is that a route received from an iBGP peer is not propagated to another iBGP peer, so we must implement a full mesh of iBGP relationships. This means we must add 5 iBGP peerings (two between PE 1 and the two P routers, one between P 1 and P 2 and two between PE 2 and the two P routers). This would solve our problem, but we want to avoid this solution: a full mesh of n routers needs n(n-1)/2 sessions, so the configuration quickly becomes too expensive as P routers are added (we could also implement BGP Route Reflectors or other mechanisms, but these are outside the scope of this post).
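
To see where the number 5 comes from, and how quickly it grows, here is a small Python sketch. The router names come from the scenario; the function names are just illustrative:

```python
from itertools import combinations

ROUTERS = ["PE1", "P1", "P2", "PE2"]
# The only iBGP session configured so far (section 1).
EXISTING = [("PE1", "PE2")]


def full_mesh(routers):
    """Every unordered pair of routers, i.e. every iBGP session of a full mesh."""
    return list(combinations(routers, 2))


def missing_sessions(routers, existing):
    """Sessions of the full mesh that are not configured yet."""
    done = {frozenset(pair) for pair in existing}
    return [pair for pair in full_mesh(routers) if frozenset(pair) not in done]


def session_count(n):
    """Number of sessions in a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


print(len(full_mesh(ROUTERS)))            # 6
print(missing_sessions(ROUTERS, EXISTING))
print(session_count(10))                  # 45
```

With our four routers the full mesh has 6 sessions, one of which already exists. With ten routers it would already be 45.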

3. Enable MPLS in the backbone

We can easily solve our problem by enabling MPLS (Multiprotocol Label Switching) on the routers. We can think of MPLS as a Layer 2.5 protocol: when a packet must be sent out of an MPLS-enabled interface, a label is inserted between the Ethernet header and the IP header, and the packet is sent to the usual next hop. The router that receives the packet can forward it without looking at the destination IP address: it simply looks at the label, swaps it with the proper outgoing label and forwards the packet out of another interface. Each router uses LDP (Label Distribution Protocol) to tell its neighbors which labels to use to reach, through itself, the networks it knows from the IGP.

We enable MPLS at the global level and under the point-to-point interfaces on all the routers. I’ll show the P 1 configuration as an example:

! MPLS needs ip CEF to be enabled in order to work
ip cef 
!
mpls label protocol ldp
mpls ldp router-id Loopback0
!
interface FastEthernet 0/0
 mpls ip
interface FastEthernet 0/1
 mpls ip

Now let’s ping again:

PE1#ping 10.2.0.1 source 10.1.0.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.2.0.1, timeout is 2 seconds:
Packet sent with a source address of 10.1.0.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/41/56 ms

It works! The IP packet debug on P 1 doesn’t show anything, because P 1 never processes the IP packet: the labeled packet is received on Fa 0/0, the MPLS label is swapped, and the packet is forwarded on Fa 0/1.

Now we’ll have a look at how the label for 22.22.22.22, i.e. the next hop on PE 1 to 10.2.0.0/24, is received by PE 1 and how the label is swapped when it goes through the network.

Label propagation for 22.22.22.22/32 from PE 2 to PE 1

PE 2

Here we see the Penultimate Hop Popping (PHP) feature of MPLS. If a router is the final destination for a prefix, it advertises the implicit-null label (imp-null) for that prefix. This tells its neighbors to remove the MPLS label before sending it a packet for that prefix, because the label would be useless there:

PE2#show mpls ldp bindings 22.22.22.22 32
 tib entry: 22.22.22.22/32, rev 12
 local binding: tag: imp-null
 remote binding: tsr: 2.2.2.2:0, tag: 18

So, PE 2 tells P 2 (the only LDP neighbor it has) not to use an MPLS label, and receives the instruction to use label 18 to reach 22.22.22.22/32 through P 2 (2.2.2.2). PE 2 never uses that remote binding, since 22.22.22.22 is its own loopback.

P 2

Let’s have a look at the LDP bindings on P 2:

P2#show mpls ldp bindings 22.22.22.22 32
 tib entry: 22.22.22.22/32, rev 12
 local binding: tag: 18
 remote binding: tsr: 1.1.1.1:0, tag: 18
 remote binding: tsr: 22.22.22.22:0, tag: imp-null

P 2 tells its neighbors that its label for 22.22.22.22/32 is 18. We can see that LDP neighbor 22.22.22.22 (PE 2) says to use no label to reach 22.22.22.22/32 through it, while neighbor 1.1.1.1 (P 1) says to use label 18 (the same value P 2 chose, but that is a coincidence: each router assigns its labels independently).

P 1

Now it’s time to look at P 1:

P1#show mpls ldp bindings 22.22.22.22 32
 tib entry: 22.22.22.22/32, rev 12
 local binding: tag: 18
 remote binding: tsr: 11.11.11.11:0, tag: 20
 remote binding: tsr: 2.2.2.2:0, tag: 18

The local label for 22.22.22.22/32 is 18, the same value received from P 2. Then there is the label received from PE 1, which is 20.

PE 1

Finally, we arrive at PE 1:

PE1#show mpls ldp bindings 22.22.22.22 32
 tib entry: 22.22.22.22/32, rev 12
 local binding: tag: 20
 remote binding: tsr: 1.1.1.1:0, tag: 18

It receives the instruction to use label 18 from neighbor 1.1.1.1 (P 1), and its local label for 22.22.22.22/32 is 20.
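
The bindings collected so far are enough to predict what each router will do with a packet toward 22.22.22.22: it uses the label advertised by its IGP next hop, and pops the label if that neighbor advertised imp-null. Here is a minimal Python sketch of that rule. The labels are the ones from the outputs above, the next hops follow the PE 1, P 1, P 2, PE 2 chain of the scenario, and the names are mine:

```python
IMPLICIT_NULL = "imp-null"

# Local binding each router advertises for 22.22.22.22/32 (from the outputs above).
LOCAL_LABEL = {"PE1": 20, "P1": 18, "P2": 18, "PE2": IMPLICIT_NULL}

# IGP next hop toward 22.22.22.22 on each router of the chain.
NEXT_HOP = {"PE1": "P1", "P1": "P2", "P2": "PE2"}


def outgoing_action(router):
    """Outgoing label for 22.22.22.22/32 on router, or 'pop' for implicit null."""
    downstream = NEXT_HOP[router]
    label = LOCAL_LABEL[downstream]
    return "pop" if label == IMPLICIT_NULL else label


for router in ("PE1", "P1", "P2"):
    print(router, outgoing_action(router))
```

Note that the local label of PE 1 (20) is never used on this path: it would only matter for labeled traffic arriving at PE 1 from P 1 for this prefix.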

Label-switched forwarding of a packet from 10.1.0.1 to 10.2.0.1

We now go in the opposite direction, following the journey of an ICMP Echo Request from 10.1.0.1 to 10.2.0.1.

PE 1

As we’ve seen at the beginning, a packet to 10.2.0.1 has 22.22.22.22 as next hop on PE 1, and a recursive lookup tells PE 1 to forward it to 172.16.0.2, i.e. P 1. This time, however, the packet is sent out with an MPLS label:

PE1#show mpls forwarding-table 10.2.0.1
Local Outgoing   Prefix        Bytes tag  Outgoing   Next Hop
tag   tag or VC  or Tunnel Id  switched   interface
20    18         10.2.0.0/24   0          Fa0/0      172.16.0.2

PE1#show ip cef 10.2.0.1
10.2.0.0/24, version 18, epoch 0, cached adjacency 172.16.0.2
0 packets, 0 bytes
 tag information from 22.22.22.22/32, shared
 local tag: 20
 fast tag rewrite with Fa0/0, 172.16.0.2, tags imposed: {18}
 via 22.22.22.22, 0 dependencies, recursive
 next hop 172.16.0.2, FastEthernet0/0 via 22.22.22.22/32
 valid cached adjacency
 tag rewrite with Fa0/0, 172.16.0.2, tags imposed: {18}

PE 1 does not have a label of its own for 10.2.0.0/24, since this prefix is neither learned through the IGP (OSPF) nor locally connected. So show mpls forwarding-table 10.2.0.1 shows the label that is imposed to reach 22.22.22.22, which is the next hop for 10.2.0.0/24.

P 1

P1#show mpls forwarding-table
Local Outgoing   Prefix         Bytes tag Outgoing  Next Hop
tag   tag or VC  or Tunnel Id   switched  interface
16    Pop tag    2.2.2.2/32     0         Fa0/1     172.16.0.6
17    Pop tag    172.16.0.8/30  0         Fa0/1     172.16.0.6
18    18         22.22.22.22/32 12944     Fa0/1     172.16.0.6
19    Pop tag    11.11.11.11/32 7489      Fa0/0     172.16.0.1

P 1 swaps tag 18 with… tag 18 and sends the packet out onto FastEthernet 0/1 toward P 2.

This is the packet forwarded from P 1 to P 2, with the MPLS encapsulation header with label 18:

[Figure: MPLS packet]
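
For reference, each MPLS label stack entry is a 32-bit shim: a 20-bit label, 3 traffic class bits (formerly EXP), a bottom-of-stack bit and an 8-bit TTL. On Ethernet the frame carries EtherType 0x8847 instead of 0x0800. The Python sketch below packs and unpacks such an entry for label 18. The function names are mine, and the TTL value is an assumption, not taken from the capture:

```python
import struct

MPLS_UNICAST_ETHERTYPE = 0x8847


def pack_label_entry(label, tc=0, bottom_of_stack=True, ttl=255):
    """Encode one MPLS label stack entry as 4 bytes in network byte order."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def unpack_label_entry(data):
    """Decode the first 4 bytes of data as an MPLS label stack entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


# TTL 254 is an assumed value, chosen only for illustration.
shim = pack_label_entry(18, ttl=254)
print(shim.hex())                 # 000121fe
print(unpack_label_entry(shim))
```

Since there is a single label in our case, the bottom-of-stack bit is set and the IP header follows immediately after these 4 bytes.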

P 2

P2#show mpls forwarding-table
Local Outgoing   Prefix         Bytes tag Outgoing  Next Hop
tag   tag or VC  or Tunnel Id   switched  interface
16    Pop tag    1.1.1.1/32     0         Fa0/1     172.16.0.5
17    Pop tag    172.16.0.0/30  0         Fa0/1     172.16.0.5
18    Pop tag    22.22.22.22/32 12911     Fa0/0     172.16.0.10
19    19         11.11.11.11/32 8494      Fa0/1     172.16.0.5

When P 2 receives the packet with label 18, it removes the label (due to the PHP feature) and sends it out on FastEthernet 0/0.

PE 2

PE2#show ip cef 10.2.0.1
10.2.0.1/32, version 4, epoch 0, receive

PE 2 receives a plain IP packet with 10.2.0.1 as destination. The CEF entry is of type receive, meaning the address is local, so PE 2 processes the packet itself.
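
The whole journey can be summarized with a tiny Python sketch that walks the packet through the forwarding entries shown above. Only the entries for 22.22.22.22/32 are transcribed, and the data structures and names are made up for illustration:

```python
# Label imposition on the ingress router: (action, outgoing label, next router).
INGRESS = {"PE1": ("push", 18, "P1")}

# (router, incoming label) -> (action, outgoing label, next router),
# transcribed from the forwarding tables of P 1 and P 2 above.
LFIB = {
    ("P1", 18): ("swap", 18, "P2"),
    ("P2", 18): ("pop", None, "PE2"),
}


def trace(ingress):
    """Return the list of (router, action, outgoing label) along the path."""
    action, label, router = INGRESS[ingress]
    hops = [(ingress, action, label)]
    while label is not None:
        action, label_out, next_router = LFIB[(router, label)]
        hops.append((router, action, label_out))
        router, label = next_router, label_out
    # The packet arrives unlabeled, so the last router does a normal IP lookup.
    hops.append((router, "ip receive", None))
    return hops


for hop in trace("PE1"):
    print(hop)
```

Only PE 1 and PE 2 ever look at the destination IP address; P 1 and P 2 forward based on the label alone.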

Conclusion

In this post, we’ve seen how MPLS, which is easily and quickly configured on the backbone, saves us from implementing an iBGP full mesh. It works between Layer 2 and Layer 3, spares the internal routers from examining the IP packets that transit between the PE routers, and lets them forward IP packets destined to IP addresses they don’t even know.

I hope you enjoyed this post, feel free to post comments or contact me.
