Category: OSPF

OSPF as PE-CE routing protocol in MPLS

  1. What feature can we use to prevent routing loops for a dual-homed CE?
    1. The PE sets the “down bit” on type 3 LSAs it advertises to the CE, and ignores type 3 LSAs received from a CE with the “down bit” set.
  2. What feature can we use to keep external routes as LSA type 5 or type 7 in a network with dual OSPF processes when MPLS L3VPN is the inter-site connection?
    1. Configure a domain ID on the PEs. PEs with the same domain ID advertise LSA types 1/2/3 to the CE as LSA type 3. PEs with different domain IDs advertise all LSAs as LSA type 5.
  3. What feature can we use to prefer the MPLS backbone link over an intra-area backdoor link?
    1. Run an “OSPF sham-link” between the PEs.
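The down-bit rule from question 1 can be sketched in a few lines of Python. This is only an illustrative model, not router code: `SummaryLsa`, `advertise_to_ce` and `accept_from_ce` are hypothetical names, and the prefixes are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryLsa:
    """Minimal stand-in for a type 3 LSA (hypothetical model)."""
    prefix: str
    down_bit: bool = False


def advertise_to_ce(prefix: str) -> SummaryLsa:
    """PE redistributes a VPN route into OSPF toward the CE: set the down bit."""
    return SummaryLsa(prefix=prefix, down_bit=True)


def accept_from_ce(lsa: SummaryLsa) -> bool:
    """PE receives a type 3 LSA from the CE side: ignore it if the down bit is set."""
    return not lsa.down_bit


# An LSA originated by PE1 and flooded through the dual-homed CE to PE2
# still carries the down bit, so PE2 does not send it back into the backbone.
looped = advertise_to_ce("10.1.0.0/16")
site_local = SummaryLsa("10.2.0.0/16")
```

Here `accept_from_ce(looped)` is false and `accept_from_ce(site_local)` is true, which is the loop-prevention behavior described above.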

Graceful Restart, NSF, NSR and LFA

Key Concepts

  1. Graceful Restart and Non-Stop Forwarding refer to the same technology
  2. Non-stop routing is NOT the same as GR (NSF)
  3. When you have both LFA and GR, which one do you prefer? LFA is preferred. If there is a pre-calculated alternative path, we can switch to that path instead of continuing to route through the current interface.
  4. GR tries to PREVENT convergence by keeping the FIB. That is completely different from LFA, which tries to converge as FAST as possible using a pre-calculated loop-free alternate, normally within 50 ms.
  5. ISSU requires GR and SSO. GR keeps routing protocol state, while SSO keeps everything else in sync (such as router config and CEF tables).
  6. GR (NSF) is NOT supported on sham-links and virtual-links (OSPF)
  7. GR requires SSO support, so GR is only for platforms with dual RPs.
  8. Graceful Restart restarts the routing process, while NSR refreshes the routing process
  9. Graceful Restart is an IETF standard
  10. Graceful Restart MUST be supported on both peers, while NSR needs no support from the peer
  11. GR is supported for OSPF, IS-IS, EIGRP, BGP and LDP. NSR is supported for IS-IS and BGP
  12. GR uses less memory than NSR, because NSR needs to transfer all state to the standby RP, while GR actually restarts the routing process
  13. Both GR and NSR achieve the same result, but GR is preferred because it uses less memory. However, if the peer does not support GR, use NSR.
  14. Use GR for devices you control, for example between RR and PE. Use NSR toward devices you do not control, for example PE to CE.
  15. Think carefully before tuning IGP fast hellos in conjunction with GR/NSF/SSO/ISSU. If the dead timer is shorter than the time it takes to perform a stateful failover, GR is not as useful, because the IGP has already detected the link change and triggered an LSA/LSP update. That negates the reason for using GR: the purpose of GR is to avoid letting routing protocols know there is a change on the link, and so to prevent an SPF calculation.
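Points 13 to 15 boil down to two small checks, sketched here in Python. The function names and the timer values are assumptions for illustration only; real failover times depend on the platform.

```python
def gr_masks_failover(dead_interval_s: float, failover_time_s: float) -> bool:
    """True if the RP switchover completes before neighbors declare the router dead.

    If this returns False, the neighbors time out first and flood an update,
    so GR no longer prevents the SPF run it was meant to avoid.
    """
    return failover_time_s < dead_interval_s


def pick_restart_mechanism(peer_supports_gr: bool) -> str:
    """Prefer GR (less memory); fall back to NSR when the peer cannot assist."""
    return "GR" if peer_supports_gr else "NSR"
```

For example, with a hypothetical 5-second stateful failover, a default 40-second dead timer leaves GR effective, while fast hellos with a 1-second dead timer do not.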

There are three types of failure scenarios in an SP network:

  1. P router failure
  2. PE router failure (or link to PE failure)
  3. CE failure (or PE to CE link failure)

P and PE failures can be detected by the IGP. Tuned OSPF and IS-IS can both converge within 1 second, and both have FRR LFA capability, which enables local repair within 50 ms.

The PE-CE link is typically not carried in the SP IGP, so convergence depends on MP-BGP. MP-BGP convergence is slow, and its convergence time grows with the number of prefixes.

One reason BGP is slow is that router vendors build the FIB so that each BGP prefix is associated directly with an outgoing interface. That is what CEF does. CEF improves route lookup time by performing the recursive lookup on the BGP prefix in advance and storing the directly connected next hop in the FIB. This creates a problem when there are many BGP prefixes: if there is a failure and the IGP converges, CEF needs to update the connected next hop for ALL BGP prefixes.

The solution is BGP PIC. BGP PIC enables the router to update the BGP next hop in the FIB by using a hierarchical FIB structure. It is a very simple solution. All BGP prefixes that share the same connected next hop point to ONE next-hop entry. So instead of changing thousands of connected next hops, we change just ONE.

The end result is that BGP FIB update time is independent of the number of prefixes, so BGP convergence time stays the same regardless of how many prefixes are carried.
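The difference between the flat and the hierarchical FIB can be illustrated with a toy model in Python. This is a sketch of the idea only, not how CEF is implemented; the class names, interface names and prefixes are all hypothetical.

```python
class FlatFib:
    """Each BGP prefix stores its resolved outgoing interface directly."""

    def __init__(self):
        self.entries = {}  # prefix -> outgoing interface

    def install(self, prefix, interface):
        self.entries[prefix] = interface

    def lookup(self, prefix):
        return self.entries[prefix]

    def reroute(self, old_interface, new_interface):
        """Rewrite every affected prefix; returns the number of FIB writes."""
        writes = 0
        for prefix, interface in list(self.entries.items()):
            if interface == old_interface:
                self.entries[prefix] = new_interface
                writes += 1
        return writes


class HierarchicalFib:
    """Prefixes point to a shared BGP next hop, which resolves to an interface."""

    def __init__(self):
        self.prefix_to_next_hop = {}     # prefix -> BGP next hop
        self.next_hop_to_interface = {}  # BGP next hop -> outgoing interface

    def install(self, prefix, bgp_next_hop, interface):
        self.prefix_to_next_hop[prefix] = bgp_next_hop
        self.next_hop_to_interface[bgp_next_hop] = interface

    def lookup(self, prefix):
        return self.next_hop_to_interface[self.prefix_to_next_hop[prefix]]

    def reroute(self, bgp_next_hop, new_interface):
        """Repoint the single shared entry; always one FIB write."""
        self.next_hop_to_interface[bgp_next_hop] = new_interface
        return 1


flat, hier = FlatFib(), HierarchicalFib()
for i in range(1000):
    prefix = f"10.{i // 256}.{i % 256}.0/24"
    flat.install(prefix, "Gi0/0")
    hier.install(prefix, "192.0.2.1", "Gi0/0")

flat_writes = flat.reroute("Gi0/0", "Gi0/1")      # one write per prefix
hier_writes = hier.reroute("192.0.2.1", "Gi0/1")  # one write in total
```

Both tables end up forwarding every prefix out of `Gi0/1`, but the flat table needed a write per prefix while the hierarchical one needed a single write regardless of table size.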


The BGP PIC Edge for IP and MPLS-VPN feature improves BGP convergence after a network failure. This convergence is applicable to both core and edge failures and can be used in both IP and MPLS networks. The BGP PIC Edge for IP and MPLS-VPN feature creates and stores a backup/alternate path in the routing information base (RIB), forwarding information base (FIB), and Cisco Express Forwarding and LFIB so that when a failure is detected, the backup/alternate path can immediately take over, thus enabling fast failover.

BGP PIC is essentially the BGP equivalent of FRR, plus RIB/FIB/LFIB optimization using hierarchy on the next hop.
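A rough way to picture the pre-installed backup path is a path list shared by many prefixes, where failover is a single flag flip. The Python below is a minimal sketch under that assumption; `PathList`, `build_fib` and the PE names are hypothetical and do not correspond to any vendor data structure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PathList:
    """Primary and pre-computed backup next hop, shared by many prefixes."""
    primary: str
    backup: str
    primary_up: bool = True

    def active(self) -> str:
        return self.primary if self.primary_up else self.backup


def build_fib(prefixes: Iterable[str], path_list: PathList) -> Dict[str, PathList]:
    """Every prefix references the same PathList object (no per-prefix copy)."""
    return {prefix: path_list for prefix in prefixes}


shared = PathList(primary="PE1", backup="PE2")
fib = build_fib((f"172.16.{i}.0/24" for i in range(200)), shared)

before = fib["172.16.7.0/24"].active()
shared.primary_up = False  # failure detected: one change, all prefixes fail over
after = fib["172.16.7.0/24"].active()
```

Because the backup is already installed, the switch does not wait for BGP to recompute a best path, and the cost of the switch does not depend on how many prefixes share the path list.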

Benefits of the BGP PIC Edge for IP and MPLS-VPN Feature

  • An additional path for failover allows faster restoration of connectivity if a primary path is invalid or withdrawn.
  • Reduction of traffic loss.
  • Constant convergence time so that the switching time is the same for all prefixes.

OSPF and IS-IS comparison

| Feature | OSPF | IS-IS |
|---|---|---|
| area on router | ABR can be in different areas | Each router can only be in ONE area. The border between areas is on the link that connects two routers that are in different areas. The reason for this difference is that an IS-IS router generally has one network service access point (NSAP) address, and an IP router generally has multiple IP addresses |
| router ID | router ID | system ID |
| disconnected backbone | virtual-link | not supported |
| area separation | on the ABR, area is per interface | on the link, area is per router |
| transport protocol | IP | datalink |
| neighbor discovery | IP multicast | Layer 2 multicast |
| IPv6 support | OSPFv3 | IS-IS supports IPv6 with a TLV extension |
| dual topology on single process | not supported | supported: dual topology, one for IPv4 and one for IPv6 |
| totally stubby area | option to configure | the only option for a level 1 area |
| SPF leaf | IP subnets can be branch or leaf on the SPF tree | IP subnets are leaves on the SPF tree |
| easily extensible | not easy | easy with TLVs; extended to support TRILL and OTV to carry MAC, and IPv6 |
| default route to totally stubby area | ABR advertises default route | attached bit from L1/L2 router |
| summarization points | ABR (summarization can be applied in both directions); ASBR | L1/L2 router (L1 routes are summarized into L2 routes); on redistribution (ASBR), external routes are summarized into L1 or L2 routes |
| only advertise routes installed in RIB | ? | Yes |
| advertise routes from another area into stub area | regular area, stub area | route leaking |
| prevent routing loops between areas with multiple ABRs | intra-area routes are preferred over inter-area routes | Up/Down bit; L1 route is preferred over L2 route |
| LSA types | many types per router (1, 2, 3, 4, 5, 7) | just one LSP per router |
| multi-access network optimization | DR, BDR, LSA type 2 | DIS, one LSP per multi-access network; no backup DIS |
| LSA acknowledgement | LS Request and Update | CSNP contains full LSP info; PSNP is the acknowledgement and request |
| LSA max timer | Max Age timer (3600 s default) | Remaining Lifetime (1200 s default) |
| authentication | MD5 (IPsec for OSPFv3) | MD5 |
| MPLS TE support | Yes | Yes |
| route tag | only external routes | yes, all routes |
| network type | P2P; broadcast; non-broadcast; point-to-multipoint | P2P; broadcast |
| BGP slow convergence support | not supported (no overload bit; the max-metric router LSA is the closest equivalent) | overload bit |
| designated router preemption | DR is not preemptive | DIS is preemptive; a new router with higher priority takes over and becomes the new active DIS |
| unrecognized LSA | not flooded | ignored but flooded |
| summarization | at ASBR and ABR | at ASBR and L1/L2 router |
| authentication | MD5 (IPsec AH, ESP for OSPFv3) | cleartext and MD5, separate authentication for Hello and LSP |
| metrics | default metric uses a reference bandwidth of 100 Mbps; the reference can be changed | all interfaces default to 10, configurable up to a max of 63 with narrow metrics; max total path metric is 1023. For larger networks, use wide metrics. Route leaking between areas requires narrow and wide metric; MPLS-TE supports wide metric only |
| redistribution | anywhere except in a stub area. Defaults to O2 external route (IGP metric is ignored), can be changed to O1; O1 is preferred over O2 | any area, on level 1, level 2 or level 1-2 routers. Defaults to internal route, can be changed to external route; a metric of 64 is added to external routes. Internal route is recommended. Wide metric does not use external routes |
| network types | P2P, broadcast, P2MP, P2MP non-broadcast, NBMA | P2P, broadcast |
| link state packet types | many, LSA 1, 2, 3, 4, 5, 7 | There are four types of LSPs: Level 1 pseudonode, Level 1 nonpseudonode, Level 2 pseudonode, and Level 2 nonpseudonode |
| backbone area | area 0 | The IS-IS backbone is a contiguous collection of Level 2-capable routers, each of which can be in a different area |
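The metric rules in the comparison can be made concrete with a short Python sketch. It assumes the common cost formula (reference bandwidth divided by interface bandwidth, truncated, minimum 1) and treats the IS-IS narrow metric as a 6-bit per-interface value (max 63) with a 1023 path limit; the function names are hypothetical.

```python
from typing import Iterable


def ospf_cost(interface_bw_mbps: float, reference_bw_mbps: float = 100.0) -> int:
    """OSPF interface cost = reference bandwidth / interface bandwidth.

    The result is truncated, never below 1, and capped at 65535.
    """
    cost = int(reference_bw_mbps / interface_bw_mbps)
    return max(1, min(cost, 65535))


def fits_narrow_metric(link_metrics: Iterable[int]) -> bool:
    """Check an IS-IS path against narrow-metric limits (assumed 63 per link, 1023 per path)."""
    metrics = list(link_metrics)
    return all(m <= 63 for m in metrics) and sum(metrics) <= 1023
```

With the default 100 Mbps reference, a 100 Mbps and a 1 Gbps interface both get cost 1, which is why the reference is usually raised on modern networks. On the IS-IS side, a path of 103 hops at the default metric of 10 already exceeds the narrow-metric limit, which is the case where wide metrics are needed.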

OSPF Hub and Spoke Common Design Issues

  • By default, OSPF is not optimized for hub-and-spoke topology. Tuning is required.
  • Classic OSPF issue: an intra-area route is preferred over an inter-area route, which results in sub-optimal routing. To correct this, typically use a virtual-link to connect ABR to ABR and bring the virtual-link into area 0. The best practice is to have two links between the two ABRs: one link in area 0 and one link in area x. If the ABRs carry more than one area, we want a link for each additional area to ensure the optimum path for that area. The additional links can be VLANs.
  • In an NBMA network (FR), the best OSPF design is a single point-to-multipoint interface at the hub.
    • the advantage is
      • single IP subnet
      • no config per spoke
      • smaller database compared to point-to-point interfaces
    • the disadvantage is
      • additional host routes inserted in the routing table
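The "classic OSPF issue" above comes from the fact that path type is compared before cost. A minimal Python sketch of that ordering, with hypothetical route tuples of `(path_type, cost)`:

```python
# OSPF compares path type first and cost second.
PATH_TYPE_RANK = {
    "intra-area": 0,
    "inter-area": 1,
    "external-1": 2,
    "external-2": 3,
}


def best_route(candidates):
    """Pick the preferred (path_type, cost) tuple the way OSPF orders path types."""
    return min(candidates, key=lambda route: (PATH_TYPE_RANK[route[0]], route[1]))


# A slow intra-area path still beats a much cheaper inter-area path.
chosen = best_route([("inter-area", 20), ("intra-area", 500)])
```

Here `chosen` is the intra-area route with cost 500, even though the inter-area path costs only 20. That is the sub-optimal routing the extra ABR-to-ABR link per area is meant to fix.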