Design Scenario 1 will cover a redundant OSPF Point-to-Multipoint WAN
Introduction
The "Design Scenario" is a series where I plan to showcase different network or security designs. Generally I will list a few requirements, go over the designs, and discuss the pros and cons. I will try to steer away from specific configurations, as the goal is always to be vendor neutral. However, there will be some scenarios where I go over implementations, so in some cases I will provide specific configuration. I hope this gives you an alternate way to look at a network engineering challenge, and feel free to share your thoughts or your own designs!
Above is the overall topology for this scenario. The focus is the WAN side, which uses Point-to-Multipoint (P2MP) OSPF for routing. The main requirement is to load balance traffic across both WAN circuits: the remote sites are split into two groups, and each group has its own primary path and backup path at the same time. Secondary requirements are simplicity and scalability. By using both WAN circuits we gain bandwidth, failover redundancy, and better use of the money invested in the circuits. Below is the traffic flow diagram.
Design 1 - Totally Stub Areas with ABR cost adjustments
The first design takes a more vendor-neutral approach; I verified that this configuration is supported by multiple vendors. The design calls for two totally stub areas (aka stub no-summary). Each group of remote sites gets its own area depending on which path is desired, so here I put remote site Group A into Area 11 and Group B into Area 22. Each area border router (ABR) has an adjacency with every remote site, but each WAN interface is in either Area 11 or Area 22. Because both areas are totally stubby, the only route from outside the area is the default 0.0.0.0 generated by the ABRs. That makes it easy to control which path is preferred inside each area by setting the default-cost on each ABR.
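As a rough sketch, the ABR side of this could look something like the following in Cisco IOS syntax (other vendors have equivalent knobs). The process ID, the cost values, and the choice of Agg1 as the primary for Area 11 are my assumptions for illustration, not values taken from the diagram.

```
! Agg1: assumed primary for Area 11, backup for Area 22
router ospf 1
 area 11 stub no-summary
 area 11 default-cost 10
 area 22 stub no-summary
 area 22 default-cost 100
!
! Agg2: mirror image, assumed primary for Area 22
router ospf 1
 area 11 stub no-summary
 area 11 default-cost 100
 area 22 stub no-summary
 area 22 default-cost 10
```

The remote routers only need their area flagged as a stub (for example `area 11 stub`); the no-summary keyword is only required on the ABRs.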
For return traffic (traffic originating in the data center and destined for a remote site), I chose to advertise summary routes from the ABRs into the backbone, Area 0. We can then set a cost on each summary at the WAN boundary so that one ABR's summary is preferred over the other's, which avoids cost adjustments on individual interfaces or circuits. Also, because both ABRs advertise the summaries, the backup route is already present if one router suddenly becomes unavailable, which minimizes failover time.
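A minimal sketch of the summarization in Cisco IOS syntax might look like this. The 10.11.0.0/16 and 10.22.0.0/16 prefixes, the process ID, and the cost values are hypothetical placeholders, and the availability of the cost keyword on the range command depends on platform and software version.

```
! Agg1: preferred return path for Area 11, backup for Area 22
router ospf 1
 area 11 range 10.11.0.0 255.255.0.0 cost 10
 area 22 range 10.22.0.0 255.255.0.0 cost 100
!
! Agg2: preferred return path for Area 22, backup for Area 11
router ospf 1
 area 11 range 10.11.0.0 255.255.0.0 cost 100
 area 22 range 10.22.0.0 255.255.0.0 cost 10
```

Because both ABRs advertise the same summaries with different costs, the backbone routers always hold both paths and simply pick the lower-cost one.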
Pros:
Easily Scales
Vendor neutral configuration
Meets traffic flow requirements (both WAN circuits utilized)
Cons:
Somewhat complex ABR configurations
Can be difficult to summarize in a brownfield deployment
Some vendors might require area filtering at the ABR if totally stub isn't supported
Design 2 - Multi-Area with cost adjustments on remote routers
From my research, this option likely only works with Cisco routers, which have a handy feature perfectly suited to this situation. Because all the WAN interfaces are in the same subnet to achieve the P2MP behavior, we can adjust cost on a per-neighbor basis. On each remote router we simply assign a lower cost to one aggregation router and a higher cost to the other. Also, because of the interface behavior mentioned above, each interface is seen as "directly connected"; consequently we are unable to apply cost to the aggregation router's sub-interfaces for each VLAN (more on that later).
Configuration looks something like this:
Rmt1(config)# router ospf 1
Rmt1(config-router)# neighbor 10.1.1.1 cost 10
Rmt1(config-router)# neighbor 10.1.1.2 cost 20
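For context, the per-neighbor cost keyword applies to interfaces running the point-to-multipoint network type, so the remote router's WAN interface needs that set as well. A fuller sketch might look like the following; the interface name, the remote router's own address and mask, and the area number are assumptions for illustration.

```
! Rmt1 WAN-facing interface (name, address, and area are hypothetical)
interface GigabitEthernet0/0
 ip address 10.1.1.11 255.255.255.0
 ip ospf network point-to-multipoint
 ip ospf 1 area 22
!
router ospf 1
 ! lower cost = preferred aggregation router
 neighbor 10.1.1.1 cost 10
 neighbor 10.1.1.2 cost 20
```

A remote site in the other group would simply swap the two cost values.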
Moving on, you will see there is only one non-backbone area here. There are a couple of reasons this works: 1. Even though the aggregation routers will have an adjacency over the L2 WAN, we avoid any transit behavior because of the default OSPF rule that inter-area routes are never preferred over intra-area routes. So if Agg1 learns a default route from Agg2 in Area 22, it will not prefer it over a default route learned in Area 0 (or vice versa). 2. In addition, Agg1 (or Agg2) will not advertise itself back to the remote sites as a path for that default route, because the type 3 LSA was learned inside the non-backbone area the remote sites are a part of (vs. learned from Area 0). Make sense?
We round out this design with the same type of summarization as design 1 into the backbone for the same purpose of controlling return traffic.
Pros:
Easily scales
Simpler than design 1
Meets traffic flow requirements (both WAN circuits utilized)
Cons:
Likely only possible with Cisco routers
Can be difficult to summarize in a brownfield deployment
Design 3 - Totally Stub area with no path load balancing
This is probably a common setup in some networks. From what I've seen, per-interface cost is generally taught as the way to control traffic, so we would see one primary path and one backup path. I will outline in a minute why this likely isn't possible with the Layer 2 WAN.
Here I went with a similar setup as before with the totally stub area; however, there is only one area for the WAN, which means we can only provide a single primary/backup path pair (vs. design 1). We control it the same way, with the default-cost on the default route generated for the totally stub area. You can see I also elected to summarize on only one ABR into Area 0, using /16 summaries as an example. The longer /23 or /24 prefixes that Agg1 advertises into the backbone will therefore be preferred for return traffic.
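A rough Cisco IOS sketch of this single-area variant is below. The area number, process ID, cost values, and the 10.11.0.0/16 summary are hypothetical; the key point is that only the backup ABR summarizes, so longest-prefix match steers return traffic to Agg1.

```
! Agg1: primary; no range configured, so the specific /23 and /24
! prefixes are advertised into Area 0
router ospf 1
 area 11 stub no-summary
 area 11 default-cost 10
!
! Agg2: backup; advertises only the /16 summary into Area 0
router ospf 1
 area 11 stub no-summary
 area 11 default-cost 100
 area 11 range 10.11.0.0 255.255.0.0
```

If Agg1 fails, its specific prefixes are withdrawn and the backbone falls back to the /16 from Agg2.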
Pros:
Easily scales
Generally simple design
Vendor neutral style configuration
Cons:
Does not entirely meet traffic flow requirements
Likely the least cost-effective design because one circuit is under-utilized (although there would be a way around this by going with a regular stub area)
Possible bandwidth congestion issues by using only 1 circuit
Issues with Area 0 only design
I'm sure you've read about Area 0-only OSPF designs, which are a fine choice in a number of situations. However, in this particular case there are a few issues with not using one or more non-backbone areas.
First, what I noticed when running this in the lab is that the WAN links are seen as directly connected, so you cannot apply cost directly to each sub-interface to control traffic. When you attempt to apply cost to any of the WAN interfaces facing the cloud, it doesn't actually apply to the route. This is how I discovered the per-neighbor cost command in Cisco IOS. Second, as I mentioned in design 2, there are issues with how routes are advertised depending on whether they were learned in the backbone or in a non-backbone area. Lastly, by using the multi-area setup you can keep the WAN cloud from being used as transit, because of the OSPF rules for inter-area vs. intra-area routes.
I believe having the totally stub area with the default-cost on the ABRs is logically just as simple as applying cost on a per-interface basis (if that were possible).
Conclusion:
I hope you enjoyed this design scenario and look forward to any questions, comments, or concerns. I believe this is a scalable solution that can be utilized to load-balance traffic on a Layer 2 style WAN. You can see there are certain choices that can be made based on which vendor equipment you have or by how much circuit utilization is needed to justify a purchase for a secondary path.