Data Center Topologies

A typical data center has end hosts connecting to top-of-rack (ToR) switches, typically over 1GigE or 10GigE links.  These ToR / access switches contain a number of end-host ports, usually 48 GigE ports, to physically connect the end stations.  Because this layer has a high number of ports, its configuration aims for simplicity and ease of management.  The ToR switches also have a number of 10GigE or 40GigE uplink ports to connect to an upstream device.  Depending on the size of the data center, these ToR switches sometimes connect to one or more end-of-row (EoR) switches. The data center topology is designed to provide rich connectivity among the ToR switches so that all application and end-user requirements are satisfied. The diagrams below display the ToR and EoR server connectivity models.

Server Connection Model

Fabric Extenders – FEX

Cisco has introduced the concept of Fabric Extenders (FEX). These devices are not Ethernet switches but remote line cards of a virtualized modular chassis (the parent switch).  This allows scalable topologies that were not previously possible with traditional Ethernet switches in the access layer.  You can think of a FEX as a remote line card that attaches to a parent switch: all the configuration is done on the parent switch, yet physically the fabric extender can be in a different location.  The mapping between the parent switch and the FEX (fabric extender) is done via a special VN-Link.  FEXs come with a range of connectivity options, including 100 Megabit Ethernet, 1 Gigabit Ethernet, 10 Gigabit Ethernet (copper and fiber), and 40 Gigabit Ethernet, and can be paired with the following parent switch models: Nexus 5000, Nexus 6000, Nexus 7000, and Nexus 9000, as well as the Cisco UCS Fabric Interconnect.  Because of their simplicity, FEXs have very low latency (as low as 500 nanoseconds) compared to traditional Ethernet switches.

Fabric Extender Topology

Traditional switches like the 3750 are still common in the data center access layer and can be used with stacking technology, which combines multiple physical switches into one logical switch.  This stacking technology allows you to build a highly resilient switching system, one switch at a time. If you are looking at a standard access layer switch like the 3750, you should consider the next-generation Catalyst 3850 series.  The 3850 supports BYOD/mobility and offers a variety of performance and security enhancements over previous models.

 

Layer 2 and Layer 3

Depending on the topology deployed, packet forwarding at the access layer can be either Layer 2 or Layer 3.  A Layer 3 approach involves additional management and the configuration of host IP addresses in a hierarchical fashion that matches the addressing assigned to the switches.  An alternative approach is Layer 2, which has less overhead because Layer 2 MAC addresses do not need specific configuration, but it has drawbacks in scalability and performance. Generally, access switches are focused on communication between servers in the same IP subnet, allowing any type of traffic: unicast, multicast, or broadcast. You can, however, use filtering devices such as a Virtual Security Gateway (VSG) to permit traffic between servers, but that is generally reserved for inter-POD (Platform Optimized Design) traffic.

 

Traffic Flow

The types of traffic in data center topologies can be North-South or East-West.  North-South (up/down) traffic flows between the servers and the external world (outside the data center).  East-West traffic is internal communication between servers, i.e. traffic that does not leave the data center.  Determining the dominant type of traffic up front is important, as it influences the type of topology used in the data center.
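As a rough illustration of the distinction, the sketch below classifies a flow as east-west or north-south simply by checking whether both endpoints fall inside a data center prefix. The prefix and addresses are made-up example values, not taken from any particular design.

```python
from ipaddress import ip_address, ip_network

# Hypothetical aggregate prefix covering all servers in the data center.
DC_PREFIX = ip_network("10.0.0.0/8")

def classify_flow(src: str, dst: str) -> str:
    """East-west if both endpoints live inside the data center,
    north-south if one side is in the outside world."""
    inside = [ip_address(ip) in DC_PREFIX for ip in (src, dst)]
    return "east-west" if all(inside) else "north-south"

print(classify_flow("10.1.4.20", "10.2.9.7"))      # east-west
print(classify_flow("10.1.4.20", "203.0.113.50"))  # north-south
```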

Traffic Flow

For example, you may have a pair of iSCSI switches where all traffic is internal between the servers. In this case, you would need high-bandwidth inter-switch links, usually an EtherChannel, to support all the cross-server talk, and the only north-south traffic would be management traffic. In another part of the data center, you may have data server farm switches with only HSRP heartbeat traffic across the inter-switch links and large bundled uplinks for a high volume of north-south traffic. Whether the application is outward facing or performs internal computation will influence the type of traffic that is dominant.

 

Network virtualization and VXLAN

Network virtualization, the ability for a physical server to host many VMs, and the ability for those VMs to move are used extensively in data centers, either for workload distribution or business continuity. This also affects the type of design you have at the access layer.  In a Layer 3 fabric, migration of a VM across that boundary changes its IP address, resulting in a reset of the TCP sessions because, unlike SCTP, TCP does not support dynamic address configuration.  In a Layer 2 fabric, migrating a VM incurs ARP overhead and requires forwarding on millions of flat MAC addresses, which leads to MAC scalability and performance problems.

Network virtualization plays an important role in the data center, and technologies like VXLAN attempt to move the control plane from the core to the edge, stabilizing the core so that it only has to deal with a handful of addresses for each ToR switch.
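To make the encapsulation concrete, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348; the 24-bit VNI is what lifts the segment count from the 4,096 VLANs of 802.1Q to roughly 16 million. This is an illustrative packing only, not a full VTEP implementation.

```python
import struct

VLAN_IDS = 2 ** 12    # 4,096 segments with a 12-bit 802.1Q VLAN ID
VXLAN_VNIS = 2 ** 24  # ~16.7 million segments with a 24-bit VNI

def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header: flags byte 0x08 (valid VNI),
    24 reserved bits, the 24-bit VNI, then 8 reserved bits."""
    if not 0 <= vni < VXLAN_VNIS:
        raise ValueError("VNI must fit in 24 bits")
    flags_and_reserved = 0x08 << 24          # I flag set, rest zero
    return struct.pack("!II", flags_and_reserved, vni << 8)

print(len(vxlan_header(5000)))  # 8 bytes prepended to the original frame
```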

VXLAN

Regardless of whether you build L2 or L3 in the access layer and use VXLAN or some other overlay to stabilize the core, you need to modularize the data center. The first approach is to build each POD or rack as a complete unit.  Each of these PODs is able to perform all of its functions within that POD.

POD: a design methodology that aims to simplify and speed deployment, optimize the utilization of resources, and drive interoperability of the three core data center components: servers, storage, and networks.

For example, one POD might be dedicated to specific human resources systems.  The second approach is modularity based on the type of resources offered; for example, a storage POD or bare-metal compute may be housed in separate PODs.  Applying these two types of modularization allows designers to easily control inter-POD traffic with a predefined set of policies.  It also allows operators to upgrade PODs, or a specific type of service, at once without affecting other PODs.  This type of segmentation does not, however, address the scale requirements of the data center.  Even when the data center is properly modularized into specific portions, the MAC table size on each switch still grows with every server and VM added to the data center.
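A rough back-of-the-envelope calculation shows why. The sketch below compares the MAC entries a flat Layer 2 fabric must hold against the table size a ToR ASIC might support; all the figures are assumed, illustrative values rather than numbers from any specific platform.

```python
# Hypothetical scale figures for illustration only.
racks = 200
servers_per_rack = 40
vms_per_server = 20          # each VM has at least one MAC address

total_macs = racks * servers_per_rack * vms_per_server
tor_mac_table = 128_000      # assumed ToR MAC table capacity

print(f"MAC addresses in a flat L2 fabric: {total_macs:,}")   # 160,000
print(f"Fits in a {tor_mac_table:,}-entry table: {total_macs <= tor_mac_table}")
```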

 

Current and future design factors

New technologies with scalable control planes must be introduced for a cloud-enabled data center, and these new control planes should offer:

a) The ability to scale MAC addresses

b) First-Hop Redundancy Protocol (FHRP) multipathing and Anycast HSRP

c) Equal-cost multipathing (ECMP)

d) MAC learning optimizations

There are a number of design factors you need to take into account when designing a data center. What is the growth rate in terms of servers, switch ports, and customers of the data center?  Knowing this up front helps you avoid part of the network topology becoming a bottleneck or individual links becoming congested.

What is the application bandwidth demand? This demand is usually translated into what we call oversubscription. In data center networking, oversubscription refers to the ratio of bandwidth offered to downstream devices versus the bandwidth available towards the upstream layer.  Oversubscription is normal in a data center design; you limit it to the ToR and edge of the network, which gives you a single place to start when you experience performance problems.
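As a minimal sketch of the arithmetic, assume a ToR of the kind described earlier, with 48 x 10GigE server-facing ports and four 40GigE uplinks; the port counts are example values, not a recommendation.

```python
def oversubscription(downlink_gbps: float, uplink_gbps: float) -> float:
    """Ratio of bandwidth offered to downstream devices versus
    bandwidth available towards the upstream layer."""
    return downlink_gbps / uplink_gbps

# Hypothetical ToR: 48 x 10GigE down, 4 x 40GigE up.
down = 48 * 10
up = 4 * 40
print(f"{oversubscription(down, up):.1f}:1 oversubscription")  # 3.0:1
```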

In tandem with IP, Ethernet forms the basis of data center networking, and since its inception 40 years ago, Ethernet frames have been transmitted over a variety of physical media, even barbed wire. Ethernet's 6-byte MAC addressing is flat, and the address is typically assigned by the manufacturer without any consideration of where the device will be located.  Ethernet-switched networks have no explicit routing protocols to spread reachability information about the flat addresses of the servers' NICs.  Instead, flooding and address learning are used to create forwarding table entries.

IP addressing, on the other hand, is hierarchical, meaning that an address is assigned by the network operator based on its location in the network. The advantage of a hierarchical address space is that forwarding tables can be aggregated, and if summarization or other routing techniques are employed, changes in one part of the network will not necessarily affect other areas. This makes IP-routed networks more scalable than Ethernet-switched networks.  IP-routed networks also offer Equal-Cost Multipath (ECMP) techniques, which enable networks to use parallel links between nodes without spanning tree disabling one of those links.  ECMP hashes packet headers before one of the bundled links is selected, avoiding out-of-sequence packets within individual flows.
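The sketch below shows the idea behind flow-based hashing: the flow 5-tuple is hashed so that every packet of the same flow takes the same member link. The CRC32 hash and field layout here are illustrative assumptions; real switch ASICs use their own vendor-specific hash functions and seed values.

```python
import zlib

def ecmp_pick(src_ip: str, dst_ip: str, proto: int,
              src_port: int, dst_port: int, n_links: int) -> int:
    """Hash the flow 5-tuple so every packet of a flow maps to the
    same uplink, avoiding out-of-order delivery within the flow."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links

# Two packets of the same flow always choose the same of 4 uplinks.
assert ecmp_pick("10.0.1.5", "10.0.2.9", 6, 49152, 443, 4) == \
       ecmp_pick("10.0.1.5", "10.0.2.9", 6, 49152, 443, 4)
```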

Given the above considerations, there is a need for a solution that combines the benefits of L2's plug-and-play flat addressing with the scalability of IP.  The Locator/ID Separation Protocol (LISP) offers a set of solutions that use hierarchical addresses as locators in the core and flat addresses as identifiers at the edges.  Related approaches such as TRILL and FabricPath are being used to create massively scalable L2 multipath networks with equidistant endpoints. Tunneling is also being used to extend down to the server and access layer to overcome the 4K limitation of traditional VLANs.
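As a toy illustration of the locator/identifier split, the sketch below keeps flat endpoint identifiers (EIDs) at the edge and looks up a hierarchical routing locator (RLOC) only when a packet needs to cross the core. The table contents are invented for the example, and this is a conceptual sketch rather than the LISP protocol itself.

```python
# Hypothetical EID-to-RLOC mapping: flat identifiers at the edge,
# hierarchical locators (one per ToR/edge device) in the core.
eid_to_rloc = {
    "10.1.1.10": "192.0.2.1",   # server A behind edge device 1
    "10.1.1.11": "192.0.2.1",   # server B behind the same edge device
    "10.2.5.20": "192.0.2.7",   # server C behind edge device 7
}

def forward(dst_eid: str) -> str:
    """Return the core locator to tunnel towards for a destination EID."""
    rloc = eid_to_rloc.get(dst_eid)
    if rloc is None:
        raise LookupError("no mapping; would query the mapping system")
    return f"encapsulate towards {rloc}"

print(forward("10.2.5.20"))  # encapsulate towards 192.0.2.7
```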

 

 
