Category: EIGRP

Graceful Restart, NSF, NSR and LFA

Key Concepts

  1. Graceful Restart (GR) and Non-Stop Forwarding (NSF) refer to the same technology
  2. Non-Stop Routing (NSR) is NOT the same as GR (NSF)
  3. When you have both LFA and GR, which one do you prefer? LFA is preferred. If there is a pre-calculated alternative path, we can switch to that path instead of continuing to route through the current interface.
  4. What GR is trying to do is PREVENT convergence by keeping the FIB. That is completely different from what LFA is trying to do, which is to converge as FAST as possible using a pre-calculated alternate path, normally within 50ms.
  5. ISSU requires GR and SSO. GR keeps the routing protocol state, while SSO keeps the state of everything else (such as router config sync and CEF tables).
  6. GR (NSF) is NOT supported on sham links and virtual links (OSPF)
  7. GR requires SSO support on the restarting router, so it is only for platforms with dual RPs.
  8. Graceful Restart restarts the routing process, while NSR refreshes the routing process
  9. Graceful Restart is an IETF standard
  10. Graceful Restart MUST be supported on both peers, while NSR needs no support from the peer
  11. GR is supported for OSPF, IS-IS, EIGRP, BGP and LDP. NSR is supported for IS-IS and BGP
  12. GR uses less memory than NSR, because NSR needs to transfer all state to the standby RP, while GR actually restarts the routing process
  13. Both GR and NSR achieve the same result, but GR is preferred because it uses less memory. However, if the peer does not support GR, use NSR.
  14. Use GR between devices you control, for example between RR and PE. Use NSR towards devices you do not control, for example PE to CE.
  15. Careful thought is needed when tuning IGP fast hellos in conjunction with GR/NSF/SSO/ISSU. If the dead timer is shorter than the time it takes to perform a stateful failover, GR is not as useful, because the IGP neighbor has already detected the change and triggered an LSA/LSP update. That negates the reason for using GR: its purpose is to avoid letting the routing protocol know there was a change on the link, and so to prevent an SPF calculation.
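
The timer trade-off in point 15 comes down to one comparison. Below is a minimal Python sketch of it. The function name and the 3-second switchover time are made up for illustration; real switchover times depend on the platform.

```python
def gr_survives_switchover(dead_interval_s: float, switchover_time_s: float) -> bool:
    """Return True if the neighbor's dead timer outlasts the RP switchover.

    If the dead timer expires first, the neighbor declares the adjacency
    down and floods an update, which is exactly what GR tries to avoid.
    """
    return dead_interval_s > switchover_time_s


# Hypothetical numbers, for illustration only.
SWITCHOVER_TIME_S = 3.0

for label, dead_interval_s in [
    ("default OSPF dead timer", 40.0),
    ("fast hello dead timer", 1.0),
]:
    ok = gr_survives_switchover(dead_interval_s, SWITCHOVER_TIME_S)
    print(f"{label} ({dead_interval_s}s): GR {'helps' if ok else 'is defeated'}")
```

With these assumed numbers the default 40-second dead timer leaves room for the switchover, while the 1-second fast hello dead timer expires before the standby RP takes over.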

External Links:

  1. http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6550/solution_overview_c22-487228.html
  2. http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/gr_ospf.html

Key Concepts to Understand for Routing Protocols

Protocols We Care About:

  1. OSPF
  2. IS-IS
  3. EIGRP
  4. BGP
  5. PIM (Sparse/Bidir)

Protocol Theory

  1. How neighbors are formed and maintained
  2. How the best path is calculated
  3. How aggregation is configured and deployed
  4. How external routing information is handled
  5. How protocols interact

BGP PIC

There are three types of failure scenarios in an SP network:

  1. P router failure
  2. PE router failure (or link to PE failure)
  3. CE failure (or PE to CE link failure)

P and PE failures can be detected by the IGP. Tuned OSPF and IS-IS can both converge within 1 second, and both have FRR LFA capability, which enables a local repair within 50ms.

The PE-CE link is typically not carried in the SP IGP, so convergence depends on MP-BGP. MP-BGP convergence is slow, and the convergence time increases as the number of prefixes increases.

One of the reasons BGP is slow is that router vendors have built the FIB to associate a BGP prefix directly with an outgoing interface. That is what CEF does. CEF’s job is to improve route lookup time by performing the recursive lookup on the BGP prefix ahead of time and storing the directly connected next hop in the FIB. This, however, creates an issue when we have a lot of BGP prefixes, because if there is a failure and the IGP converges, CEF needs to update the connected next hop for ALL BGP prefixes.

The solution is BGP PIC. BGP PIC enables the router to update the BGP next hop in the FIB by using a hierarchical FIB structure. It is a very simple solution. All BGP prefixes that have the same connected next hop point to ONE shared next-hop entry. So now, instead of changing thousands of connected next hops, we change just ONE.

The end result is that the BGP FIB update time is independent of the number of prefixes, so BGP convergence time remains the same regardless of how many prefixes it carries.
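
To make the flat versus hierarchical difference concrete, here is a small Python sketch that counts how many FIB writes a reroute costs in each model. The class names, addresses and the choice to key the shared entry by BGP next hop are my own assumptions for illustration. This is not how CEF is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class SharedNextHop:
    """One shared object holding the resolved (connected) next hop."""
    connected_next_hop: str


class FlatFib:
    """Each prefix stores its own copy of the connected next hop."""

    def __init__(self):
        self.entries = {}  # prefix -> connected next hop

    def add(self, prefix, connected_next_hop):
        self.entries[prefix] = connected_next_hop

    def reroute(self, old, new):
        """Rewrite every prefix that used `old`; return the number of writes."""
        writes = 0
        for prefix, next_hop in self.entries.items():
            if next_hop == old:
                self.entries[prefix] = new
                writes += 1
        return writes


class HierarchicalFib:
    """Prefixes point at a shared object, one per BGP next hop."""

    def __init__(self):
        self.shared = {}   # BGP next hop -> SharedNextHop
        self.entries = {}  # prefix -> SharedNextHop

    def add(self, prefix, bgp_next_hop, connected_next_hop):
        obj = self.shared.setdefault(bgp_next_hop, SharedNextHop(connected_next_hop))
        self.entries[prefix] = obj

    def reroute(self, bgp_next_hop, new_connected_next_hop):
        """Update the single shared object; return the number of writes."""
        self.shared[bgp_next_hop].connected_next_hop = new_connected_next_hop
        return 1

    def lookup(self, prefix):
        return self.entries[prefix].connected_next_hop


flat, hier = FlatFib(), HierarchicalFib()
for i in range(1000):
    prefix = f"10.{i // 256}.{i % 256}.0/24"
    flat.add(prefix, "192.0.2.1")
    hier.add(prefix, "203.0.113.1", "192.0.2.1")

flat_writes = flat.reroute("192.0.2.1", "192.0.2.5")
hier_writes = hier.reroute("203.0.113.1", "192.0.2.5")
print(flat_writes, hier_writes)
```

The flat table needs one write per prefix, while the hierarchical table needs a single write no matter how many prefixes share that next hop.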

What is BGP PIC (BGP FRR)

The BGP PIC Edge for IP and MPLS-VPN feature improves BGP convergence after a network failure. This convergence is applicable to both core and edge failures and can be used in both IP and MPLS networks. The BGP PIC Edge for IP and MPLS-VPN feature creates and stores a backup/alternate path in the routing information base (RIB), forwarding information base (FIB), and Cisco Express Forwarding and LFIB so that when a failure is detected, the backup/alternate path can immediately take over, thus enabling fast failover.

BGP PIC is essentially the BGP equivalent of FRR, plus RIB/FIB/LFIB optimization using a hierarchy on the next hop.
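
The backup path idea can be sketched the same way. In the Python below, a path list shared by several prefixes carries a primary and a pre-installed backup, and failover is a single flag flip. All names here (PathList, PE1, PE2) are hypothetical, and the structure is only a rough model of what the feature description says.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathList:
    """Primary path plus a backup that is installed before any failure."""
    primary: str
    backup: Optional[str] = None
    primary_up: bool = True

    def active(self) -> Optional[str]:
        # The backup is already in place, so failover needs no recomputation.
        return self.primary if self.primary_up else self.backup


# Two prefixes share one path list (hypothetical egress PEs).
paths = PathList(primary="PE1", backup="PE2")
fib = {"192.0.2.0/24": paths, "198.51.100.0/24": paths}

before = [fib[p].active() for p in fib]
paths.primary_up = False  # failure detected on the primary path
after = [fib[p].active() for p in fib]
print(before, after)
```

Every prefix that shares the path list moves to the backup at the same moment, which is where the constant switching time comes from.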

Benefits of the BGP PIC Edge for IP and MPLS-VPN Feature

  • An additional path for failover allows faster restoration of connectivity if a primary path is invalid or withdrawn.
  • Reduction of traffic loss.
  • Constant convergence time so that the switching time is the same for all prefixes.

Bidirectional Forwarding Detection (BFD)

Overview

  • BFD packets are carried as the payload of whatever encapsulation suits the protocol/connection you want to monitor (IPv4, IPv6, 802.3, etc)
  • BFD control packets are encapsulated in UDP datagrams (dst port 3784)
  • BFD can run over pretty much any transport. RFC 5880 does not define the transport protocol, so it can run over layer 2 links or layer 3 links.
  • RFC 5881, however, defines BFD for IPv4 and IPv6. BFD control packets for IP run on UDP 3784. BFD Echo packets run on UDP 3785
  • BFD packets are unicast packets, even on a shared medium like Ethernet. BFD has no neighbor discovery feature.
  • BFD is especially useful for link-down detection on Ethernet, since Ethernet has no link-down detection mechanism for nodes on the same Ethernet segment
  • BFD supports MD5 and SHA-1 authentication
  • BFD enables fast neighbor failure detection. It supports all the main routing protocols: EIGRP, OSPF, IS-IS, BGP
  • It can also be used to check an MPLS LSP. It is more lightweight than LSP Ping
  • BFD can also be used to detect the liveness of a GRE tunnel
  • RFC 5880 defines two operating modes, Asynchronous Mode and Demand Mode, plus an Echo function that can be used with either. Echo is optional and negotiable.
  • The typical detection time is 50ms. But it is not a hard number. The detection time is based on implementation and negotiation between BFD neighbors.
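
As a rough illustration of how the detection time falls out of the negotiation, here is a Python sketch of the Asynchronous mode calculation as I read it in RFC 5880: the remote Detect Mult multiplied by the greater of the local Required Min RX Interval and the remote Desired Min TX Interval. The function and parameter names are my own, and the timer values are just examples.

```python
def detection_time_ms(local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int,
                      remote_detect_mult: int) -> int:
    """Detection time seen by the local system in Asynchronous mode."""
    # The remote side transmits no faster than we are willing to receive.
    agreed_remote_tx_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_remote_tx_ms


# Both sides ask for 50 ms intervals, multiplier 3.
print(detection_time_ms(50, 50, 3))
# The remote side can only send every 100 ms, so detection slows down.
print(detection_time_ms(50, 100, 3))
```

The slower of the two intervals wins, which is why the detection time is a negotiated value rather than a fixed number.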

OSPF and EIGRP comparison

Features                                | OSPF                                   | EIGRP
Hello                                   | Must have same interval as neighbor    | Can have different intervals
Summarization                           | Only at ABR and ASBR                   | Anywhere
Default Hello                           | 10 seconds, with 40 seconds dead timer | 5 seconds, with 15 seconds hold timer
Concept of DR in a Multi-Access Network | Yes                                    | No
Hello Handshake                         | 3-way                                  | 3-way

External Routes Preference for IGP and BGP

  • In general, an EGP prefers external routes and an IGP prefers internal routes.
  • BGP prefers eBGP routes (AD 20) over iBGP routes (AD 200)
  • EIGRP prefers internal routes (AD 90) over external routes (AD 170)
  • OSPF prefers intra-area routes over external routes (E1 and E2). By default an OSPF ASBR redistributes external routes as E2 routes (internal costs are not considered). OSPF uses the same AD (110) for internal and external routes. OSPF prefers E1 over E2 routes regardless of the metrics
  • IS-IS prefers internal routes over external routes. IS-IS uses the same AD (115) for internal and external routes
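
The preferences above can be summarized in a short Python sketch. The AD values are the ones listed in the bullets. The dictionary keys and function names are hypothetical, and the inter-area position in the OSPF ordering is my addition (intra-area, then inter-area, then E1, then E2).

```python
# Administrative distances as listed in the notes above.
ADMIN_DISTANCE = {
    "ebgp": 20,
    "eigrp_internal": 90,
    "ospf": 110,
    "isis": 115,
    "eigrp_external": 170,
    "ibgp": 200,
}

# OSPF picks by route type first and compares metrics only within a type.
OSPF_TYPE_RANK = {"intra-area": 0, "inter-area": 1, "E1": 2, "E2": 3}


def best_source(sources):
    """Pick the route source with the lowest administrative distance."""
    return min(sources, key=lambda s: ADMIN_DISTANCE[s])


def best_ospf_route(routes):
    """Pick from (route_type, metric) tuples: type rank first, then metric."""
    return min(routes, key=lambda r: (OSPF_TYPE_RANK[r[0]], r[1]))


print(best_source(["ibgp", "ospf", "ebgp"]))
print(best_source(["eigrp_external", "ospf", "isis"]))
print(best_ospf_route([("E2", 20), ("E1", 500)]))
```

The last call shows the E1 route winning even though its metric is much higher than the E2 route's.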