data center topology

Merchant Silicon

 


In the ever-evolving landscape of technology, innovation continues to shape how we live, work, and connect. One such groundbreaking development that has caught the attention of experts and enthusiasts alike is merchant silicon. In this blog post, we will explore the remarkable capabilities of merchant silicon and its far-reaching impact across various industries.

Merchant silicon refers to off-the-shelf silicon chips designed and manufactured by third-party companies. These versatile chips can be used in various applications, offering cost-effective solutions for businesses.

Highlights: Merchant Silicon

  • Landscape Changes

Some data center vendors offer a Debian-based operating system for network equipment. Their philosophy is that engineers should manage switches just as they manage servers, using existing server administration tools. They want networking to work like a server application. For example, Cumulus has created the first full-featured Linux distribution for network hardware. It allows designers to break free from proprietary networking equipment and utilize the advantages of the SDN data center.

  • Issues with Traditional Networking

Cloud computing, distributed storage, and virtualization technologies are changing the operational landscape. Traditional networking concepts do not align with new requirements and often act as blockers rather than business enablers. Decoupling hardware from software is required to keep pace with the innovation needed to meet the speed and agility of cloud deployments and emerging technologies.

 

Before you proceed, you may find the following helpful:

  1. LISP Hybrid Cloud
  2. Modular Building Blocks
  3. Virtual Switch
  4. Overlay Virtual Networks
  5. Virtual Data Center Design

 




Key Data Center Topology Discussion Points:


  • Introduction to data center topology and what is involved.

  • Highlighting the disaggregation model that can be used in data centers.

  • Critical points on Merchant Silicon.

  • Technical details on design best practices.

  • Technical details on MLAG implementation and FHRP.

 

Back to basics with merchant silicon

Merchant silicon is a term used to describe chips, usually ASICs (Application-Specific Integrated Circuits), that are developed by an entity other than the company selling the switches. Custom silicon is the opposite: chips, usually ASICs, that are custom designed and traditionally built by the company selling the switches in which they are used.

 

Benefits of Merchant Silicon:

1. Cost-Effectiveness: One of the primary advantages of merchant silicon is its cost-effectiveness. Since these chips are mass-produced, they are available at a lower cost than custom chips. This allows networking equipment manufacturers to deliver high-performance solutions at a more affordable price, making networking technology more accessible to a broader audience.

2. Flexibility and Innovation: Merchant silicon allows network equipment manufacturers to choose the best chipset. They can select chips from various vendors, offering different features and capabilities. This enables manufacturers to innovate and differentiate their products, creating a more diverse and competitive networking landscape.

3. Time-to-Market: Developing custom chips can be a time-consuming process. By leveraging merchant silicon, networking equipment manufacturers can significantly reduce their time-to-market, as they can quickly integrate pre-existing, tested chips into their products. This allows them to bring new networking solutions to market faster, meeting the ever-increasing demands of the industry.

Impact on the Networking Industry:

Merchant silicon has profoundly impacted the networking industry, transforming how networks are built and operated. Here are some key areas where merchant silicon has made a difference:

1. Performance and Scalability: With the advancements in merchant silicon, networking equipment manufacturers can now deliver higher performance and scalability in their products. These chips offer greater processing power, faster data rates, and improved packet forwarding capabilities, enabling networks to handle more traffic and meet the growing demands of bandwidth-intensive applications.

2. Openness and Interoperability: Merchant silicon promotes openness and interoperability in networking. Since network equipment manufacturers are not tied to proprietary chipsets, they can build solutions that adhere to industry standards and work seamlessly with equipment from different vendors. This fosters a more open, collaborative networking ecosystem where interoperability and compatibility are prioritized.

3. Innovation and Differentiation: By leveraging merchant silicon, networking equipment manufacturers can focus on developing innovative software solutions and features that differentiate their products in the market. This has led to new technologies, such as software-defined networking (SDN) and network function virtualization (NFV), revolutionizing how networks are designed, managed, and optimized.

 

Disaggregation Model

Disaggregation is the next logical evolution in data center topologies. Cumulus does not reinvent the wheel; they believe that routing and bridging work well, with no reason to change them. Instead, they use existing protocols to build on the original networking concept base. The technologies they offer are based on well-designed, existing feature sets. Their operating system brings the server world's hardware/software disaggregation model to switching.

Disaggregation decouples hardware and software on individual network elements. Much of today's networking equipment is proprietary and vertically integrated, which makes it expensive and hard to manage. Disaggregation allows designers to break free from vertically integrated networking gear. It also allows you to separate the procurement decisions for hardware and software.

 

Data Center Topology Types
Diagram: Data Center Topology Types.

 

Data center topology types and merchant silicon

Previously, we needed proprietary hardware to provide networking functionality. Now, many of those functions are available in “merchant silicon.” In the last ten years, we have seen a massive increase in the production of merchant silicon. Merchant silicon is a term used to describe the use of “off-the-shelf” chip components to create a network product, enabling open networking. Currently, the three major players for 10GbE and 40GbE switch ASICs are Broadcom, Fulcrum, and Fujitsu.

In addition, Cumulus supports the Broadcom Trident II switch ASIC, which is also used in the Cisco Nexus 9000 series. Merchant silicon’s price/performance ratio is far better than that of proprietary ASICs.

 

Routing isn’t broken – Simple building blocks.

To disaggregate networking, we must first simplify it. Networking is complicated, and sometimes less is more. It is possible to build powerful ecosystems using simple building blocks with existing Layer 2 and Layer 3 protocols. Internet Protocol (IP) is the underlying base technology and the basis for every large data center. MPLS is an attractive alternative, but IP is the mature building block today. IP is standards-based, unlike Multichassis Link Aggregation (MLAG), which is vendor-specific.

 

Multichassis Link Aggregation (MLAG) implementation

Each vendor has its own MLAG variation; some operate with a unified control plane and others with separate control planes. MLAG implementations with a unified control plane include Juniper Virtual Chassis, HP Intelligent Resilient Framework (IRF), Cisco Virtual Switching System (VSS), and cross-stack EtherChannel. MLAG implementations with separate control planes include Cisco Virtual Port-Channel (vPC) and Arista MLAG.

With all the vendors out there, we have no standard for MLAG. Where specific VLANs can be isolated to particular ToRs, Layer 3 is the preferred alternative. The Cumulus Multichassis Link Aggregation (MLAG) implementation is an MLAG daemon written in Python.

How MLAG gets translated to the hardware is ASIC-independent, so in theory, you could run MLAG between two boxes that are not running the same chipset. As with other vendors' MLAG implementations, it is limited to two spine switches. If you need to scale beyond that, move to IP. The beauty of IP is that you can do a great deal without relying on proprietary technologies.

 

Data center topology types: A design for simple failures

Everyone building networks at scale is building them as simple, loosely coupled systems. People are not trying to over-engineer and build exact systems. High-performance clusters are special applications and must be built in a certain way. A general-purpose cloud is not built that way. Operators build “generic” applications over “generic” infrastructure. Designing and engineering networks with simple building blocks leads to simpler designs with simple failures. Over-engineered networks experience complex failures that are time-consuming to troubleshoot. When things fail, they should fail in simple ways.

Building blocks should be constructed with straightforward rules. Designers understand that you can build extensive networks with simple rules and building blocks. For example, a spine-leaf architecture can look complicated at first. But in terms of the networking fabric, the Cumulus ecosystem is made of one straightforward building block: fixed form-factor switches. This makes failures very simple.

On the other hand, if a chassis-based switch fails, you need to troubleshoot many aspects. Did the line card not connect to the backplane? Is the backplane failing? All these troubleshooting steps add complexity. With the disaggregated model, when networks fail, they fail in simple ways. Nobody wants to troubleshoot a complex failure while the network is down. Cumulus tries to keep the base infrastructure simple and not implement every tool and technology.

For example, if you use Layer 2, MLAG is your only topology. STP is simply a fail-stop mechanism and is not used as a fast-convergence mechanism. Rapid Spanning Tree Protocol (RSTP) and Bridge Protocol Data Units (BPDU) are all you need; you can build straightforward networks with these.

 

Virtual router redundancy

First Hop Redundancy Protocol (FHRP) design now becomes trivial. Cumulus uses an anycast virtual IP/MAC, eliminating complex FHRP protocols. You do not need a protocol in your MLAG topology to keep your network running. They support a variation of the Virtual Router Redundancy Protocol (VRRP) known as Virtual Router Redundancy (VRR). It’s like VRRP without the protocol and supports an active-active setup. It allows hosts to communicate with redundant routers without running dynamic routing or router redundancy protocols.

Merchant silicon has emerged as a driving force in the networking industry, offering cost-effectiveness, flexibility, and faster time-to-market. This technology has enabled networking equipment manufacturers to deliver high-performance solutions, promote interoperability, and drive innovation. As the demand for faster, more reliable networks continues to grow, merchant silicon will play a pivotal role in shaping the future of networking technology.

 

Data Center Topology


In the world of technology, data centers play a crucial role in storing, managing, and processing vast amounts of digital information. However, behind the scenes, a complex infrastructure known as data center topology enables seamless data flow and optimal performance. In this blog post, we will delve into the intricacies of data center topology, its different types, and how it impacts the efficiency and reliability of data centers.

Data center topology refers to a data center's physical and logical layout. It encompasses the arrangement and interconnection of various components like servers, storage devices, networking equipment, and power sources. A well-designed topology ensures high availability, scalability, and fault tolerance while minimizing latency and downtime.


Highlights: Data Center Topology

Choosing a topology

Data centers are the backbone of many businesses, providing the necessary infrastructure to store and manage data and access applications and services. As such, it is essential to understand the different types of available data center topologies. When choosing a topology for a data center, it is necessary to consider the organization’s specific needs and requirements. Each topology offers its advantages and disadvantages, so it is crucial to understand the pros and cons of each before making a decision.

Scalability of the topology

Additionally, it is essential to consider the scalability of the topology, as a data center may need to accommodate future growth. By understanding the different topologies and their respective strengths and weaknesses, organizations can make the best decision for their data center.

A data center topology refers to the physical layout and interconnection of network devices within a data center. It determines how servers, switches, routers, and other networking equipment are connected, ensuring efficient and reliable data transmission. Topologies are chosen based on scalability, fault tolerance, performance, and cost.

Typical data center topologies

Typical data center topologies connect end hosts to top-of-rack (ToR) switches, typically using 1GigE or 10GigE links. These ToR/access switches contain several end-host ports, usually 48 GigE ports, to physically connect the end stations.

Because this layer has many ports, its configuration aims for simplicity and ease of management. The ToR also has several 10GigE or 40GigE uplink ports to connect to an upstream device. Depending on the data center network topology, these ToR switches sometimes connect to one or more end-of-row (EoR) switches, resulting in different data center topology types.

The design of the data center topology is to provide rich connectivity among the ToR switches so that all application and end-user requirements are satisfied. The diagrams below display the ToR and EoR server connectivity models commonly seen in the SDN data center.

Related: Before you proceed, you may find the following posts helpful:

  1. ACI Cisco
  2. Virtual Switch
  3. Ansible Architecture
  4. Overlay Virtual Networks



Data Center Network Topology

Key Data Center Topologies Discussion Points:


  • End of Row and Top of Rack designs.

  • The use of Fabric Extenders.

  • Layer 2 or Layer 3 to the Core.

  • The rise of Network Virtualization.

  • VXLAN transports.

  • The Cisco ACI and ACI Network.

Back to Basics: Data Center Network Topology

A data center is a physical facility that houses critical applications and data for an organization. A data center consists of a network of computing and storage resources that support shared applications and data delivery. A data center’s components are routers, switches, firewalls, storage systems, servers, and application delivery controllers.

Enterprise IT data centers support the following business applications and activities:

  • Email and file sharing
  • Productivity applications
  • Customer relationship management (CRM)
  • Enterprise resource planning (ERP) and databases
  • Big data, artificial intelligence, and machine learning
  • Virtual desktops, communications, and collaboration services

A data center consists of the following core infrastructure components:

  • Network infrastructure: Connects physical and virtual servers, data center services, storage, and external connections to end users.
  • Storage infrastructure: Storage systems hold the data that modern data centers depend on to power their operations.
  • Computing infrastructure: Runs the data center’s applications. It comprises servers that provide processors, memory, local storage, and network connectivity for applications. In the last 65 years, computing infrastructure has undergone three major waves:
    • In the first wave, proprietary mainframes were replaced by x86-based servers installed on-premises and managed by internal IT teams.
    • In the second wave, application infrastructure was widely virtualized. The result was improved resource utilization and workload mobility across physical infrastructure pools.
    • The third wave finds us in the present, where we see the move to the cloud, hybrid cloud, and cloud-native (that is, applications born in the cloud).

Common Types of Data Center Topologies:

a) Bus Topology: In this traditional topology, all devices are connected linearly to a common backbone, resembling a bus. While it is simple and cost-effective, a single point of failure can disrupt the entire network.

b) Star Topology: Each device is connected directly to a central switch or hub in a star topology. This design offers centralized control and easy troubleshooting, but it can be expensive due to the requirement of additional cabling.

c) Mesh Topology: A mesh topology provides redundant connections between devices, forming a network where every device is connected to every other device. This design ensures high fault tolerance and scalability but can be complex and costly.

d) Hybrid Topology: As the name suggests, a hybrid topology combines elements of different topologies to meet specific requirements. It offers flexibility and allows organizations to optimize their infrastructure based on their unique needs.
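
To make the cost difference between these layouts concrete, the short sketch below counts the links each one needs. It is only an illustration: the function names are made up for this post, and the star count assumes every device has a single link to one central switch.

```python
def star_links(n_devices: int) -> int:
    """Links in a star: one link from each device to the central switch."""
    return n_devices


def full_mesh_links(n_devices: int) -> int:
    """Links in a full mesh: one link for every unordered pair of devices."""
    return n_devices * (n_devices - 1) // 2


if __name__ == "__main__":
    # Compare how quickly the link count grows as devices are added.
    for n in (4, 16, 48):
        print(f"{n} devices: star={star_links(n)} links, "
              f"full mesh={full_mesh_links(n)} links")
```

The mesh count grows with the square of the number of devices, which is why a full mesh becomes complex and costly long before a star does.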

Considerations in Data Center Topology Design:

a) Redundancy: Redundancy is essential to ensure continuous operation even during component failures. By implementing redundant paths, power sources, and network links, data centers can minimize the risk of downtime and data loss.

b) Scalability: As the data center’s requirements grow, the topology should be able to accommodate additional devices and increased data traffic. Scalability can be achieved through modular designs, virtualization, and flexible network architectures.

c) Performance and Latency: The distance between devices, the quality of network connections, and the efficiency of routing protocols significantly impact the performance and latency in data centers. Optimal topology design considers these factors to minimize delays and ensure smooth data transmission.

Impact of Data Center Topology:

Efficient data center topology directly influences the entire infrastructure’s reliability, availability, and performance. A well-designed topology reduces single points of failure, enables load balancing, enhances fault tolerance, and optimizes data flow. It directly impacts the user experience, especially for cloud-based services, where data centers simultaneously cater to many users.


Main Data Center Topology Components

  • You need to understand the different topologies and their respective strengths and weaknesses.

  • Rich connectivity among the ToR switches so that all application and end-user requirements are satisfied

  • A well-designed topology reduces single points of failure.

  • Example: Bus, star, mesh, and hybrid topologies

The Role of Networks

A network lives to serve the connectivity requirements of applications. We build networks by designing and implementing data centers. A common trend is that data center topologies are much bigger than a decade ago, with application requirements considerably different from those of traditional client-server applications and with deployment speeds measured in seconds instead of days. This changes how networks and your chosen data center topology are designed and deployed.

Traditional network designs scaled to support more devices by deploying larger switches (and routers). This is the scale-in model of scaling. However, these large switches are expensive and primarily designed to support only two-way redundancy.

Today, data center topologies are built to scale out. They must satisfy three main requirements: increasing server-to-server traffic, scale (scale on demand), and resilience. The following diagram shows the ToR design we discussed at the start of the blog.

Top of Rack (ToR)
Diagram: Data center network topology. Top of Rack (ToR).

The Role of The ToR

Top of rack (ToR) is a term used to describe a data center architecture in which servers and the access switch that connects them are mounted in the same rack. This allows for efficient use of space, since the equipment is all within arm’s reach.

ToR is also the most efficient way to manage power and cooling since the equipment is all in the same area. ToR also allows faster access times since all the equipment is close together. This architecture can also be utilized in other areas, such as telecommunications, security, and surveillance.

ToR is a great way to maximize efficiency in any data center and is becoming increasingly popular. In contrast to the ToR data center design, the following diagram shows an EoR switch design.

End of Row (EoR)
Diagram: Data center network topology. End of Row (EoR).

The Role of The EoR

The term End of Row (EoR) design is derived from a dedicated networking rack or cabinet placed at either end of a row of servers to provide network connectivity to the servers within that row. In EoR network design, each server in the rack has a direct connection with the end-of-row aggregation switch. This eliminates the need to connect servers directly with the in-rack switch.

Racks are usually arranged to form a row; a cabinet or rack is positioned at the end of this row. This rack has a row aggregation switch, which provides network connectivity to servers mounted in individual racks. This switch, a modular chassis-based platform, sometimes supports hundreds of server connections. However, a large amount of cabling is required to support this architecture.

Data center topology types
Diagram: ToR and EoR. Source. FS Community.

A ToR configuration requires one switch per rack, resulting in higher power consumption and operational costs. Moreover, the number of unused ports is often higher in this scenario than with an EoR arrangement.

On the other hand, ToR’s cabling requirements are much lower than those of EoR, and faults are primarily isolated to a particular rack, thus improving the data center’s fault tolerance.

If fault tolerance is the ultimate goal, ToR is the better choice, but EoR configuration is better if an organization wants to save on operational costs. The following table lists the differences between a ToR and an EoR data center design.
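
The trade-off between switch count and cabling can be sketched with some simple arithmetic. The example below is a rough model, not a sizing tool: the Row class, the function names, the single network connection per server, and the two uplinks per ToR switch are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """A hypothetical row of racks; one network connection per server is assumed."""
    racks: int
    servers_per_rack: int


def tor_design(row: Row, uplinks_per_tor: int = 2) -> dict:
    """ToR: one access switch per rack; only its uplinks leave the rack."""
    return {
        "access_switches": row.racks,
        "cables_leaving_racks": row.racks * uplinks_per_tor,
    }


def eor_design(row: Row, eor_switches: int = 1) -> dict:
    """EoR: every server is cabled directly to the end-of-row switch."""
    return {
        "access_switches": eor_switches,
        "cables_leaving_racks": row.racks * row.servers_per_rack,
    }


if __name__ == "__main__":
    row = Row(racks=10, servers_per_rack=40)
    print("ToR:", tor_design(row))
    print("EoR:", eor_design(row))
```

With these assumed numbers, ToR needs ten switches but only a handful of cables leave each rack, while EoR needs a single switch and hundreds of long cable runs.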

data center network topology
Diagram: Data center network topology. The differences. Source FS Community

 

Data Center Topology Types:

Fabric extenders – FEX

Cisco has introduced the concept of the Fabric Extender; these devices are not Ethernet switches but remote line cards of a virtualized modular chassis (the parent switch). This allows scalable topologies that were previously impossible with traditional Ethernet switches in the access layer.

You can think of a FEX device as a remote line card attached to a parent switch. All the configuration is done on the parent switch, yet physically, the fabric extender could be in a different location. The mapping between the parent switch and the FEX (fabric extender) is done via a special VN-Link.

The following diagram shows an example of a FEX in a standard data center network topology. More specifically, we are looking at the Nexus 2000 FEX Series. Cisco Nexus 2000 Series Fabric Extenders (FEX) are based on the standard IEEE 802.1BR. They deliver fabric extensibility with a single point of management.

Cisco FEX
Diagram: Cisco FEX design. Source Cisco.

Different types of FEX solutions

FEXs come with various connectivity solutions, including 100 Megabit Ethernet, 1 Gigabit Ethernet, 10 Gigabit Ethernet (copper and fiber), and 40 Gigabit Ethernet. They can be paired with the following parent switch models: Nexus 5000, Nexus 6000, Nexus 7000, Nexus 9000, and Cisco UCS Fabric Interconnect.

In addition, because of the simplicity of FEX, they have very low latency (as low as 500 nanoseconds) compared to traditional Ethernet switches.

Data Center design
Diagram: Data center fabric extenders.

Some network switches can be connected to others and operate as a single unit. These configurations are called “stacks” and are helpful for quickly increasing the capacity of a network. A stack is a network solution composed of two or more stackable switches. Switches that are part of a stack behave as one single device.

Traditional switches like the 3750s still have a place in the access layer of the data center network topology and can be used with stacking technology, which combines two or more physical switches into one logical switch.

This stacking technology allows you to build a highly resilient switching system, one switch at a time. If you are looking at a standard access layer switch like the 3750s, consider the next-generation Catalyst 3850 series.

The 3850 supports BYOD/mobility and offers a variety of performance and security enhancements over previous models. The drawback of stacking is that you can only stack a limited number of switches. So, if you want additional throughput, you should aim for a different design type.

Data Center Design: Layer 2 and Layer 3 Solutions

Traditional views of data center design

Depending on the data center network topology deployed, packet forwarding at the access layer can be either Layer 2 or Layer 3. A Layer 3 approach involves additional management and configuring IP addresses on hosts in a hierarchical fashion that matches the switch’s assigned IP address.

An alternative approach is to use Layer 2, which has less overhead because Layer 2 MAC addresses do not need specific configuration, but it has drawbacks in scalability and performance.

Generally, access switches focus on communication between servers in the same IP subnet, allowing any type of traffic: unicast, multicast, or broadcast. You can, however, use filtering devices such as a Virtual Security Gateway (VSG) to control traffic between servers, but that is generally reserved for inter-POD (Platform Optimized Design) traffic.

Lab Guide: IGMPv1

In the following example, we have a lab guide on IGMPv1. IGMPv1, or Internet Group Management Protocol Version 1, is a network-layer protocol that hosts use to report their multicast group memberships to neighboring multicast routers.

It enables hosts to join multicast groups so they can receive IP multicast traffic sent to those groups. IGMPv1 has no explicit leave message; a host leaves a group by no longer answering queries for it. Notice the output from the packet captures below for the Membership Query and the Membership Report.
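
To help read the packet captures, here is a minimal Python sketch of the 8-byte IGMPv1 message as laid out in RFC 1112: a 4-bit version, a 4-bit type (1 for a Membership Query, 2 for a Membership Report), an unused byte, a 16-bit checksum, and the 32-bit group address. The function names and the 239.1.1.1 group are illustrative assumptions and are not taken from the lab.

```python
import socket
import struct

QUERY, REPORT = 1, 2  # IGMPv1 message types from RFC 1112


def checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv1(msg_type: int, group: str) -> bytes:
    """Build an 8-byte IGMPv1 message for the given type and group address."""
    first = (1 << 4) | msg_type  # version 1 in the high nibble, type in the low
    addr = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", first, 0, 0, addr)
    return struct.pack("!BBH4s", first, 0, checksum(unsummed), addr)


def parse_igmpv1(packet: bytes) -> dict:
    """Decode the fields of an IGMPv1 message and verify its checksum."""
    first, _unused, _csum, addr = struct.unpack("!BBH4s", packet[:8])
    return {
        "version": first >> 4,
        "type": first & 0x0F,
        "checksum_ok": checksum(packet[:8]) == 0,
        "group": socket.inet_ntoa(addr),
    }


if __name__ == "__main__":
    query = build_igmpv1(QUERY, "0.0.0.0")     # general query carries group 0.0.0.0
    report = build_igmpv1(REPORT, "239.1.1.1")  # hypothetical group for illustration
    print(query.hex(), parse_igmpv1(query))
    print(report.hex(), parse_igmpv1(report))
```

In a capture, the first byte is what distinguishes the two messages: 0x11 for the query and 0x12 for the report.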

IGMPv1
Diagram: IGMPv1

Leaf and Spine With Layer 3

We use a leaf and spine data center design with Layer 3 everywhere with overlay networking. The leaf and spine data center is a modern, robust architecture that provides a high-performance, highly-available network. With this architecture, data center networks are composed of leaf switches that connect to one or more spine switches.

The leaf switches are connected to end devices such as servers, storage devices, and other networking equipment. The spine switches, meanwhile, act as the network’s backbone, connecting the multiple leaf switches.

The leaf and spine architecture provides several advantages over traditional data center networks. It allows for greater scalability, as additional leaf switches can be easily added to the network. It also provides better fault tolerance, as the network can operate even if one of the spine switches fails.

Furthermore, it enables faster traffic flows, as the spine switches can route traffic between the leaf switches faster than a traditional flat network.
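
The scalability and fault-tolerance claims above can be expressed as simple arithmetic. The sketch below assumes a two-tier fabric in which every leaf has one uplink to every spine; the function names and the port counts in the example are hypothetical.

```python
def equal_cost_paths(spines: int) -> int:
    """Leaf-to-leaf paths: one two-hop path through each spine."""
    return spines


def remaining_capacity(spines: int, failed_spines: int) -> float:
    """Fraction of leaf uplink bandwidth left after some spines fail."""
    if not 0 <= failed_spines <= spines:
        raise ValueError("failed_spines must be between 0 and spines")
    return (spines - failed_spines) / spines


def max_server_ports(spine_ports: int, leaf_downlinks: int) -> int:
    """Each spine port connects one leaf, so spine port count caps the leaf count."""
    max_leaves = spine_ports
    return max_leaves * leaf_downlinks


if __name__ == "__main__":
    # Assumed example: 4 spines with 32 ports each, leaves with 48 server ports.
    print("paths between two leaves:", equal_cost_paths(4))
    print("capacity after one spine fails:", remaining_capacity(4, 1))
    print("maximum server ports:", max_server_ports(32, 48))
```

Losing one of four spines in this model removes a quarter of the fabric bandwidth rather than taking the network down, and adding leaves grows the fabric until the spine ports run out.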

leaf and spine

Data Center Traffic Flow

The types of traffic in data center topologies can be north-south or east-west. North-south (up/down) corresponds to traffic between the servers and the external world (outside the data center). East-west corresponds to internal server communication, i.e., traffic that does not leave the data center.

Therefore, determining the type of traffic upfront is essential as it influences the type of topology used in the data center.
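
One rough way to determine the traffic mix is to classify flows by whether both endpoints sit inside the data center. The sketch below uses Python's standard ipaddress module; the 10.0.0.0/8 prefix and the sample flows are assumptions for illustration, so substitute your own address plan.

```python
import ipaddress

# Assumption: all data center hosts are addressed out of this prefix.
DC_PREFIXES = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal(addr: str) -> bool:
    """True if the address falls inside one of the data center prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DC_PREFIXES)


def classify(src: str, dst: str) -> str:
    """East-west if both ends are inside the data center, else north-south."""
    return "east-west" if is_internal(src) and is_internal(dst) else "north-south"


if __name__ == "__main__":
    flows = [("10.1.1.10", "10.2.5.20"), ("10.1.1.10", "203.0.113.7")]
    for src, dst in flows:
        print(src, "->", dst, classify(src, dst))
```

Running this over flow records gives a first estimate of which direction dominates before committing to a topology.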

data center traffic flow
Diagram: Data center traffic flow.

For example, you may have a pair of iSCSI switches, and all traffic is internal between the servers. In this case, you would need high-bandwidth inter-switch links. Usually, an EtherChannel supports all the cross-server talk; the only north-south traffic would be management traffic.

In another part of the data center, you may have server farm switches with only HSRP heartbeat traffic across the inter-switch links and large bundled uplinks for a high volume of north-south traffic. The type of application, which can be either outward-facing or internal, influences which type of traffic is dominant.

Video: Leaf and Spine.

This quick education tutorial will examine the leaf and spine data center architecture. We know this design is a considerable step from traditional DC design. As a use case, we will focus on how Cisco has adopted the leaf and spine design with its Cisco ACI product. We will address the components and how they form the Cisco ACI fabric.

Spine and Leaf Design: Cisco ACI

Virtual Machines and Containers

The growth in east-west traffic was driven by virtualization, virtual machines, and container technologies. Many organizations are moving to a leaf and spine data center design if they have a lot of east-west traffic and want better performance.

Network Virtualization and VXLAN

Network virtualization and the ability of a physical server to host many VMs and move those VMs are also used extensively in data centers, either for workload distribution or business continuity. This will also affect the design you have at the access layer.

For example, in a Layer 3 fabric, migration of a VM across that boundary changes its IP address, resulting in a reset of the TCP sessions because, unlike SCTP, TCP does not support dynamic address configuration. In a Layer 2 fabric, migrating a VM incurs ARP overhead and requires forwarding on millions of flat MAC addresses, which leads to MAC scalability and performance problems.

Lab Guide: VXLAN

The following lab guide displays a VXLAN network. We are running VXLAN in unicast mode. VXLAN can also be configured to run in multicast mode. In the screenshot below, we have created a Layer 2 overlay across a routed Layer 3 core. The command show nve interface nve 1 displays an operational tunnel with the encapsulation set to VXLAN.

The ping test in the screenshot is from the desktops that connect to a Layer 3 port on the Leafs.

VXLAN overlay
Diagram: VXLAN Overlay

VXLAN: stability over Layer 3 core

Network virtualization plays a vital role in the data center. Technologies like VXLAN attempt to move the control plane from the core to the edge and stabilize the core to have only a handful of addresses for each ToR switch. The following diagram shows the ACI networks with VXLAN as the overlay that operates over a spine leaf architecture.

Layer 2 and 3 traffic is mapped to VXLAN VNIs that run over a Layer 3 core. The Bridge Domain is for layer 2, and the VRF is for layer 3 traffic. Now, we have the separation of layer 2 and 3 traffic based on the VNI in the VXLAN header.  

So, one of the first notable differences between VXLAN and VLAN is scale. A VLAN has a 12-bit identifier called the VLAN ID (VID), while VXLAN has a 24-bit identifier called the VXLAN network identifier (VNI). This means that with VLANs, you can create only 4094 networks over Ethernet, while with VXLAN, you can create up to 16 million.
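
The numbers above follow directly from the identifier widths. The sketch below works them out and also packs the 8-byte VXLAN header described in RFC 7348 (a flags byte with the I bit set, 24 reserved bits, the 24-bit identifier, and 8 reserved bits). The function names and the example value 5000 are illustrative assumptions.

```python
import struct

VLAN_ID_BITS = 12
VXLAN_ID_BITS = 24

# VLAN IDs 0 and 4095 are reserved, which leaves 4094 usable VLANs.
USABLE_VLANS = 2 ** VLAN_ID_BITS - 2
# The 24-bit VXLAN identifier space holds roughly 16 million values.
VXLAN_SEGMENTS = 2 ** VXLAN_ID_BITS


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags, reserved, 24-bit ID, reserved."""
    if not 0 <= vni < VXLAN_SEGMENTS:
        raise ValueError("identifier must fit in 24 bits")
    flags_word = 0x08 << 24  # I flag set: the identifier field is valid
    return struct.pack("!II", flags_word, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the 24-bit identifier from a VXLAN header."""
    _flags_word, id_word = struct.unpack("!II", header[:8])
    return id_word >> 8


if __name__ == "__main__":
    print("usable VLANs:", USABLE_VLANS)
    print("VXLAN segments:", VXLAN_SEGMENTS)
    header = pack_vxlan_header(5000)
    print(header.hex(), unpack_vni(header))
```

The extra 12 bits multiply the available segment count by 4096, which is where the jump from thousands to millions comes from.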

ACI network
Diagram: ACI network.

Whether you can build layer 2 or layer 3 in the access and use VXLAN or some other overlay to stabilize the core, it would help if you modularized the data center. The first is to build each POD or rack as a complete unit. Each of these PODs will be able to perform all its functions within that POD.

  • A key point: A POD data center design

POD: It is a design methodology that aims to simplify and speed up deployment, optimize the utilization of resources, and drive interoperability of the three or more data center components: server, storage, and networks.

A POD example: Data center modularity

For example, one POD might be a specific human resources system. The second approach is modularity based on the type of resources offered. For example, storage or bare metal compute may be housed in separate PODs.

These two modularization types allow designers to control inter-POD traffic with predefined policies easily. Operators can also upgrade PODs and a specific type of service at once without affecting other PODs.

However, this type of segmentation does not address the scale requirements of the data center. Even when we have adequately modularized the data center into specific portions, the MAC table sizes on each switch still grow with every host added to the data center.

Current and Future Design Factors

New technologies with scalable control planes must be introduced for a cloud-enabled data center, and these new control planes should offer the following:

Option: Data Center Feature

  • Data center feature 1: The ability to scale MAC addresses

  • Data center feature 2: First-Hop Redundancy Protocol ( FHRP ) multipathing and Anycast HSRP

  • Data center feature 3: Equal-Cost multipathing

  • Data center feature 4: MAC learning optimizations

There are several design factors you need to take into account when designing a data center. First, what is the growth rate for servers, switch ports, and data center customers? Knowing this prevents part of the network topology from becoming a bottleneck or links from becoming congested.

Application bandwidth demand?

This demand is usually translated into oversubscription. In data center networking, oversubscription refers to the ratio between the bandwidth a switch offers to downstream devices and the bandwidth it has available upstream at each layer.

Oversubscription is expected in a data center design. Limiting oversubscription to the ToR and the edge of the network gives you a single place to start when you experience performance problems.

A data center with no oversubscription will be costly, especially with a low-latency network design. So, it’s best to determine what oversubscription ratio your applications can tolerate and still work well. Optimizing your switch buffers to improve performance is recommended before you decide on a 1:1 oversubscription ratio.
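The ratio itself is simple arithmetic. The sketch below assumes a hypothetical ToR switch with 48 x 10G server ports; the port counts and speeds are made up for illustration and do not describe any specific product.

```python
def oversubscription_ratio(downlink_count, downlink_gbps, uplink_count, uplink_gbps):
    """Return downstream bandwidth divided by upstream bandwidth for one switch."""
    downstream = downlink_count * downlink_gbps
    upstream = uplink_count * uplink_gbps
    if upstream <= 0:
        raise ValueError("a switch needs at least one uplink")
    return downstream / upstream


# Hypothetical ToR: 48 x 10G server ports and 4 x 40G uplinks (480G down, 160G up).
tor_ratio = oversubscription_ratio(48, 10, 4, 40)

# The same switch with 12 x 40G uplinks (480G down, 480G up) is 1:1.
nonblocking_ratio = oversubscription_ratio(48, 10, 12, 40)
```

Here `tor_ratio` is 3.0, usually written as 3:1, while `nonblocking_ratio` is 1.0. Tripling the uplinks to reach 1:1 is where the extra cost comes from.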

Ethernet 6-byte MAC addressing is flat.

In tandem with IP, Ethernet forms the basis of data center networking. Since its inception 40 years ago, Ethernet frames have been transmitted over various physical media, even barbed wire. Ethernet 6-byte MAC addressing is flat; the manufacturer typically assigns the address without considering its location.

Ethernet-switched networks have no explicit routing protocols to spread reachability information about the flat addresses of the server’s NICs. Instead, flooding and address learning are used to create forwarding table entries.
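Flooding and address learning can be sketched in a few lines. This is a simplified model of a single learning switch, assuming no VLANs and no entry aging; the class and method names are illustrative only.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy Ethernet switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Address learning: remember which port the sender lives behind.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Flooding: send out of every port except the one it arrived on.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            # Destination is on the same port as the sender: nothing to forward.
            return []
        return [out_port]
```

Every host adds one entry to `mac_table`, and because the addresses are flat, those entries cannot be aggregated.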

IP addressing is a hierarchy.

On the other hand, IP addressing is hierarchical, meaning that an address is assigned by the network operator based on its location in the network. An advantage of a hierarchical address space is that forwarding tables can be aggregated. If summarization or other routing techniques are employed, changes in one side of the network will not necessarily affect other areas.
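Aggregation is easy to demonstrate with Python's standard `ipaddress` module. The subnets below are hypothetical rack subnets chosen for illustration.

```python
import ipaddress

# Four contiguous rack subnets (hypothetical addressing plan).
rack_subnets = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(4)]

# collapse_addresses merges contiguous networks into the fewest possible prefixes.
summary = list(ipaddress.collapse_addresses(rack_subnets))

# A host in any of the four racks is still covered by the single summary route.
host = ipaddress.ip_address("10.1.2.25")
covered = any(host in network for network in summary)
```

The four /24 prefixes collapse into the single prefix 10.1.0.0/22, so the rest of the network carries one route instead of four. There is no equivalent operation for flat MAC addresses.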

This makes IP-routed networks more scalable than Ethernet-switched networks. IP-routed networks also offer ECMP techniques that enable networks to use parallel links between nodes without spanning tree disabling one of those links. The ECMP method hashes packet headers to select one of the parallel links, so all packets of an individual flow take the same path and do not arrive out of sequence.
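The hashing step can be sketched as follows. Real switches use hardware hash functions; SHA-256 is only a stand-in here, and the spine names and flow values are assumptions for illustration.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop from a 5-tuple (src IP, dst IP, protocol, src port, dst port)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # The same 5-tuple always yields the same index, so a flow never changes path.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


spines = ["spine-a", "spine-b", "spine-c", "spine-d"]
flow = ("10.1.0.10", "10.2.0.20", 6, 49152, 443)

first = ecmp_next_hop(flow, spines)
second = ecmp_next_hop(flow, spines)

# Many flows that differ only in source port spread across the available paths.
used = {
    ecmp_next_hop(("10.1.0.10", "10.2.0.20", 6, port, 443), spines)
    for port in range(49152, 50152)
}
```

Because the choice depends only on the header fields, per-flow ordering is preserved, while different flows are spread across the parallel links.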

Equal Cost Load Balancing

Equal cost load balancing is a method for distributing network traffic among multiple paths of equal cost. It is a way to provide redundancy and increase throughput. Sending traffic over multiple paths avoids congestion on any single link. In addition, the load is equally distributed across the paths, meaning that each path carries roughly the same total traffic.

This allows for using multiple paths of lower cost, providing an efficient way to increase throughput.

The idea behind equal cost load balancing is to use multiple paths of equal cost to balance the load on each path. The algorithm considers the number of paths, each path’s weight, and each path’s capacity. It may also consider the number of packets that must be sent and the delay allowed for each packet.

Considering these factors, it can calculate the best way to distribute the load among the paths.

Equal cost load balancing can be implemented using a variety of methods. One method is to use the Link Aggregation Control Protocol (LACP), which bundles multiple links and distributes the traffic among them in a balanced way.

ecmp
Diagram: ECMP 5 Tuple hash. Source: Keysight
  • A keynote: Data center topologies. The move to VXLAN.

Given the above considerations, a solution that encompasses the benefits of L2’s plug-and-play flat addressing and the scalability of IP is needed. The Locator/ID Separation Protocol ( LISP ) has a set of solutions that use hierarchical addresses as locators in the core and flat addresses as identifiers in the edges. However, it is not widely deployed these days.

Equivalent approaches such as TRILL and Cisco FabricPath create massively scalable L2 multipath networks with equidistant endpoints. Tunneling is also being extended down to the server and access layer to overcome the 4K limitation of traditional VLANs. Tunneling with VXLAN is now the standard design in most data center topologies with leaf-spine designs. The following video provides VXLAN guidance.

Video: VXLAN

In this video, we will discuss how to find the destination VTEP. The big decision is how you discover the destination VTEP IP address. The destination VTEP IP address needs to be mapped to the end host destination MAC address. The mechanism used to do this affects the scalability and functionality of the VXLAN domain.

Technology Brief : VXLAN – VXLAN Operations
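The mapping between a destination MAC address and its VTEP can be pictured as a lookup table. The sketch below assumes a flood-and-learn model in which unknown destinations are replicated to a static list of remote VTEPs; the class, the VNI, and the addresses are hypothetical and do not represent any vendor's implementation.

```python
class VtepTable:
    """Maps (VNI, destination MAC) to the IP address of the remote VTEP."""

    def __init__(self, flood_list):
        # Remote VTEPs that receive traffic for unknown destinations (assumption).
        self.flood_list = list(flood_list)
        self.entries = {}

    def learn(self, vni, mac, vtep_ip):
        """Record which VTEP a MAC address was seen behind."""
        self.entries[(vni, mac)] = vtep_ip

    def lookup(self, vni, dst_mac):
        """Return the VTEP IPs an encapsulated frame should be sent to."""
        vtep_ip = self.entries.get((vni, dst_mac))
        if vtep_ip is None:
            # Unknown destination: replicate to every VTEP in the flood list.
            return list(self.flood_list)
        return [vtep_ip]


table = VtepTable(flood_list=["10.0.0.2", "10.0.0.3"])
before = table.lookup(10010, "00:50:56:aa:bb:cc")
table.learn(10010, "00:50:56:aa:bb:cc", "10.0.0.3")
after = table.lookup(10010, "00:50:56:aa:bb:cc")
```

How the table gets populated, whether by data-plane learning as here or by a control plane, is the design choice that drives scalability.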

Data Center Network Topology

Leaf and spine data center topology types

VXLAN is commonly seen in a leaf and spine design. For example, in a leaf-spine fabric, we have a Layer 3 IP fabric that supports equal-cost multi-path (ECMP) routing between any two endpoints in the network. Then, on top of the Layer 3 fabric, runs an overlay protocol, commonly VXLAN.

A spine-leaf architecture consists of a data center network topology of two switching layers—a spine and a leaf. The leaf layer comprises access switches that aggregate traffic from endpoints such as the servers and connect directly to the spine or network core.

Spine switches interconnect all leaf switches in a full-mesh topology. The leaf switches do not directly connect. The Cisco ACI is a data center topology that utilizes the leaf and spine.
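The wiring rule described above, where every leaf connects to every spine and never to another leaf, can be modeled directly. This is a small sketch with made-up switch names and counts.

```python
from itertools import product


def build_leaf_spine(spine_count, leaf_count):
    """Return spines, leaves, and the set of (leaf, spine) links in the fabric."""
    spines = [f"spine{i}" for i in range(1, spine_count + 1)]
    leaves = [f"leaf{i}" for i in range(1, leaf_count + 1)]
    # Every leaf has exactly one link to every spine.
    links = set(product(leaves, spines))
    return spines, leaves, links


def equal_cost_paths(src_leaf, dst_leaf, spines, links):
    """List the two-hop paths between two leaves, one per shared spine."""
    return [
        (src_leaf, spine, dst_leaf)
        for spine in spines
        if (src_leaf, spine) in links and (dst_leaf, spine) in links
    ]


spines, leaves, links = build_leaf_spine(spine_count=4, leaf_count=8)
paths = equal_cost_paths("leaf1", "leaf8", spines, links)
```

With 4 spines and 8 leaves there are 32 links, and any pair of leaves has 4 equal-cost paths of the same length. Adding a spine adds a path between every pair of leaves, which is how the design scales out.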

The ACI network’s physical topology is a leaf and spine, while the logical topology is formed with VXLAN. From a protocol standpoint, VXLAN is the overlay network, and BGP and IS-IS provide the Layer 3 routing for the underlay network that allows the overlay network to function.

As a result, the nonblocking architecture performs much better than the traditional data center design based on access, distribution, and core designs.

Cisco ACI
Diagram: Data center topology types and the leaf and spine with Cisco ACI

Closing Points: Data Center Topologies

A data center topology refers to the physical layout and interconnection of network devices within a data center. It determines how servers, switches, routers, and other networking equipment are connected, ensuring efficient and reliable data transmission. Topologies are chosen based on scalability, fault tolerance, performance, and cost.

  • Hierarchical Data Center Topology:

The hierarchical or tree topology is one of the most commonly used data center topologies. This design consists of multiple core, distribution, and access layers. The core layer connects all the distribution layers, while the distribution layer connects to the access layer. This structure enables better management, scalability, and fault tolerance by segregating traffic and minimizing network congestion.

  • Mesh Data Center Topology:

Every network device is interlinked in a mesh topology, forming a fully connected network with multiple paths for data transmission. This redundancy ensures high availability and fault tolerance. However, this topology can be cost-prohibitive and complex, especially in large-scale data centers.

  • Leaf-Spine Data Center Topology:

The leaf-spine topology is gaining popularity due to its scalability and simplicity. It consists of interconnected leaf switches at the access layer and spine switches at the core layer. This design allows for non-blocking, low-latency communication between any leaf switch and spine switch, making it suitable for modern data center requirements.

  • Full-Mesh Data Center Topology:

As the name suggests, the full-mesh topology connects every network device to every other device, creating an extensive web of connections. This topology offers maximum redundancy and fault tolerance. However, it can be expensive to implement and maintain, making it more suitable for critical applications with stringent uptime requirements.

Conclusions:

Data center topologies are crucial for modern data centers’ efficient and reliable operation. The choice of topology depends on various factors such as scalability, fault tolerance, performance, and cost. Hierarchical, mesh, leaf-spine, and full-mesh topologies are commonly used designs, each offering advantages and trade-offs. As technology evolves, data center topologies will continue to adapt to meet the ever-increasing demands of the digital world.

Multicast VXLAN

Data Center Network Design


Data centers are crucial in today’s digital landscape, serving as the backbone of numerous businesses and organizations. A well-designed data center network ensures optimal performance, scalability, and reliability. This blog post will explore the critical aspects of data center network design and its significance in modern IT infrastructure.

Efficient data center network design is critical for meeting the growing demands of complex applications, high data traffic, and rapid data processing. It enables seamless connectivity, improves application performance, and enhances user experience. A well-designed network also ensures data security, disaster recovery, and efficient resource utilization.


Highlights: Data Center Network Design

The goal of data center design and interconnection network is to transport end-user traffic from A to B without any packet drops, yet the metrics we use to achieve this goal can be very different. The data center is evolving and progressing through various topology and technology changes, resulting in various data center network designs.

The new data center control planes we are seeing today, such as FabricPath, LISP, TRILL, and VXLAN, are being driven by a change in the end user’s requirements; the application has changed.

These new technologies may address new challenges, yet the fundamental question of where to create the Layer 2/Layer 3 boundary and the need for Layer 2 in the access layer remains the same. The question stays the same, yet the technologies available to address this challenge have evolved.

The use of Open Networking

We also have the Open Networking Foundation ( ONF ) with open networking. Open networking describes a network that uses open standards and commodity hardware. So, consider open networking in terms of hardware and software. Unlike a vendor approach like Cisco, this gives you much more choice with what hardware and software you use to make up and design your network.

Related: Before you proceed, you may find the following useful:

  1. ACI Networks
  2. IPv6 Attacks
  3. SDN Data Center
  4. Active Active Data Center Design
  5. Virtual Switch

Data Center Control Plane

Key Data Center Network Design Discussion Points:


  • Introduction to data center network design and what is involved.

  • Highlighting the details of VLANs and virtualization.

  • Technical details on the issues of Layer 2 in data centers. 

  • Scenario: Cisco FabricPath and DFA.

  • Details on overlay networking and Cisco OTV.

The Rise of Overlay Networking

What has the industry introduced to overcome these limitations and address the new challenges? – Network virtualization and overlay networking. In its simplest form, an overlay is a dynamic tunnel between two endpoints that enables Layer 2 frames to be transported between them. In addition, these overlay-based technologies provide a level of indirection that enables switch table sizes to not increase in the order of the number of supported end-hosts.

Today’s overlays are Cisco FabricPath, TRILL, LISP, VXLAN, NVGRE, OTV, PBB, and Shortest Path Bridging. They are essentially virtual networks that sit on top of a physical network, often without the physical network being aware of the virtual layer above it.

Lab Guide: VXLAN

The following lab guide displays a VXLAN network. We are running VXLAN in multicast mode. Multicast VXLAN is a variant of VXLAN that uses IP multicast to transmit overlay network traffic. VXLAN is an encapsulation protocol that extends Layer 2 Ethernet networks over Layer 3 IP networks.

Using multicast enables efficient and scalable communication within the overlay network. Notice the multicast group of 239.0.0.10 and the route of 239.0.0.10 forwarding out the tunnel interface. We have multicast enabled on all Layer 3 interfaces, including the core that consists of Spine A and Spine B.
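In multicast mode, broadcast, unknown unicast, and multicast frames for a VNI are sent to the multicast group associated with that VNI. The sketch below checks the group used in the lab with Python's `ipaddress` module; the VNI value 10010 is an assumption, since the lab's VNI is not stated here.

```python
import ipaddress

group = ipaddress.ip_address("239.0.0.10")

# 239.0.0.0/8 is the administratively scoped IPv4 multicast range.
admin_scoped_range = ipaddress.ip_network("239.0.0.0/8")

is_multicast = group.is_multicast
is_admin_scoped = group in admin_scoped_range

# Hypothetical VNI-to-group mapping; the VNI number is made up for illustration.
vni_to_group = {10010: group}


def flood_destination(vni):
    """Return the outer destination IP for flooded traffic in this VNI."""
    return vni_to_group[vni]
```

Each VTEP joins the group for the VNIs it serves, so the underlay delivers flooded frames only to the VTEPs that need them.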

Multicast VXLAN
Diagram: Multicast VXLAN

Traditional Data Center Network Design

How do routers create a broadcast domain boundary? Firstly, using the traditional core, distribution, and access model, the access layer is Layer 2, and servers that communicate with each other in the access layer are in the same IP subnet and VLAN. The same access VLAN spans the access layer switches for east-to-west traffic, and any outbound traffic is via a First Hop Redundancy Protocol ( FHRP ) like Hot Standby Router Protocol ( HSRP ).

Servers in different VLANs are isolated from each other and cannot communicate directly; inter-VLAN communications require a Layer 3 device. Virtualization’s humble beginnings started with VLANs, which were used to segment traffic at Layer 2. It was common to find single VLANs spanning an entire data center fabric.

 

VLAN and Virtualization

The virtualization side of VLANs comes from two servers physically connected to different switches. Assuming the VLAN spans both switches, the two servers can communicate as if they were on the same segment. Each VLAN can be defined as a broadcast domain in a single Ethernet switch or shared among connected switches.

Whenever a switch interface belonging to a VLAN receives a broadcast frame ( destination MAC is ffff.ffff.ffff), the device must forward this frame to all other ports defined in the same VLAN.

This approach is straightforward in design and is almost like a plug-and-play network. The first question is, why not connect everything in the data center into one large Layer 2 broadcast domain? Layer 2 is a plug-and-play network, so why not?

 

The issues of Layer 2

The reason is that there are many scaling issues in large Layer 2 networks. Layer 2 networks don’t have controlled or efficient network discovery protocols. Address Resolution Protocol ( ARP ) is used to locate end hosts and uses broadcast requests and unicast replies. A single host might not generate much traffic, but imagine what would happen if 10,000 hosts were connected to the same broadcast domain. VLANs that span an entire data center fabric can bring a lot of instability due to loops and broadcast storms.
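A rough calculation shows why the 10,000-host case is a problem. The request rate of one ARP broadcast per host per second is purely an assumption for illustration; real rates depend on cache timers and traffic patterns.

```python
def arp_broadcast_load(hosts, requests_per_host_per_sec):
    """Estimate ARP broadcast load in a single flat broadcast domain."""
    # Every broadcast is delivered to all other hosts in the domain.
    received_per_host = (hosts - 1) * requests_per_host_per_sec
    delivered_in_total = hosts * received_per_host
    return received_per_host, delivered_in_total


# Assumed rate: 1 ARP request per host per second.
per_host, total = arp_broadcast_load(hosts=10_000, requests_per_host_per_sec=1)
```

Under that assumption, each host must process 9,999 ARP frames per second and the domain delivers close to 100 million frames per second. The load grows with the square of the host count, which is why broadcast domains are kept small.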

 

No hierarchy in MAC addresses

There is also no hierarchy in MAC addressing. Unlike Layer 3 networks, where you can have summarization and hierarchical addressing, MAC addresses are flat. Connecting several thousand hosts to a single broadcast domain will create large forwarding information tables.

Because end hosts are potentially not static, they are likely to be attached and removed from the network at regular intervals, creating a high rate of change in the control plane. You can, of course, have a large Layer 2 data center with multiple tenants if they don’t need to communicate with each other.

The shared services requirements, such as WAAS or load balancing, can be solved by spinning up the service VM in the tenant’s Layer 2 broadcast domain. This design will hit scaling and management issues. There is a consensus to move from a Layer 2 design to a more robust and scalable Layer 3 design.

But why is there still a need for Layer 2 in the data center topologies? One solution is Layer 2 VPN with EVPN. But first, let us have a look at Cisco DFA.

The Requirement for Layer 2 in Data Center Network Design

  • Servers that perform the same function might need to communicate with each other due to a clustering protocol or simply as part of the application’s inner functions. If the communication consists of clustering protocol heartbeats or server-to-server application packets that are not routable and don’t understand the IP layer, then these servers need to be on the same VLAN, i.e., the same Layer 2 domain.

  • Stateful devices such as firewalls and load balancers need Layer 2 adjacency as they constantly exchange connection and session state information.

  • Dual-homed servers: Single server with two server NICs and one NIC to each switch will require a layer 2 adjacency if the adapter has a standby interface that uses the same MAC and IP addresses after a failure. In this situation, the active and standby interfaces must be on the same VLAN and use the same default gateway.

  • Suppose your virtualization solutions cannot handle Layer 3 VM mobility. In that case, you may need to stretch VLANs between PODS / Virtual Resource Pools or even data centers so you can move VMs around the data center at Layer 2 ( without changing their IP address ).

 

Data Center Design and Cisco DFA

Cisco went one giant step further and recently introduced a data center fabric with Dynamic Fabric Automation ( DFA ), similar to Juniper QFabric, which offers you both Layer 2 switching and Layer 3 routing at the access layer / ToR. Firstly, they have FabricPath ( IS-IS for Layer 2 connectivity ) in the core, which gives you optimal Layer 2 forwarding between all the edges.

Then they configure the same Layer 3 address on all the edges, which gives you optimal Layer 3 forwarding across the whole Fabric.

At the edge, you can have Layer 3 Leaf switches, for example, the Nexus 6000 series, or integrate with Layer 2-only devices like the Nexus 5500 series or the Nexus 1000v. You can also connect external routers, UCS, or FEX to the Fabric. In addition to running IS-IS as the data center control plane, DFA uses MP-iBGP, with some Spine nodes acting as Route Reflectors to exchange IP forwarding information.

Cisco FabricPath

DFA also employs a Cisco FabricPath technique called “Conversational Learning.” The first packet triggers a full RIB lookup, and the subsequent packets are switched in the hardware-implemented switching cache.

This technology provides Layer 2 mobility throughout the data center while providing optimal traffic flow using Layer 3 routing. Cisco commented, “DFA provides a scale-out architecture without congestion points in the network while providing optimized forwarding for all applications.”

Terminating Layer 3 at the access / ToR has clear advantages and disadvantages. One benefit is a smaller broadcast domain, which comes at the cost of reducing the mobility domain across which VMs can be moved.

Terminating Layer 3 at the access layer can also result in sub-optimal routing because there will be hairpinning, or traffic tromboning, of cross-subnet traffic, which takes multiple and unnecessary hops across the data center fabric.

 

The role of Cisco FabricPath

Cisco FabricPath is a Layer 2 technology that brings Layer 3 benefits, such as multipathing, to classical Layer 2 networks by using IS-IS at Layer 2. This eliminates the need for the Spanning Tree Protocol, avoiding the pitfalls of having large Layer 2 networks. As a result, FabricPath enables a massive Layer 2 network that supports multipath ( ECMP ). TRILL is an IETF standard that, like FabricPath, is a Layer 2 technology that provides the same Layer 3 benefits to Layer 2 networks using IS-IS.

LISP is popular in Active / Active data centers for DCI route optimization and mobility. It separates the host’s location from its identifier ( EID ), allowing VMs to move across subnet boundaries while keeping their endpoint identification. LISP is often referred to as an Internet locator.

That can enable some designs of triangular routing. Popular encapsulation formats include VXLAN ( proposed by Cisco and VMware ) and STT (created by Nicira but will be deprecated over time as VXLAN comes to dominate ).
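The split between identifier and location described for LISP can be sketched as a simple mapping. In LISP terms the location is a routing locator (RLOC); the class below is a toy model with documentation-range addresses, not the actual LISP mapping system protocol.

```python
class MappingSystem:
    """Toy EID-to-RLOC database: the identifier stays, the locator can change."""

    def __init__(self):
        self.eid_to_rloc = {}

    def register(self, eid, rloc):
        """Record (or update) the locator behind which an endpoint currently sits."""
        self.eid_to_rloc[eid] = rloc

    def resolve(self, eid):
        """Return the current locator for an endpoint, or None if unknown."""
        return self.eid_to_rloc.get(eid)


mapping = MappingSystem()
vm_eid = "10.10.10.5"                      # the VM's address never changes

mapping.register(vm_eid, "192.0.2.1")      # VM starts behind a router in site 1
rloc_before_move = mapping.resolve(vm_eid)

mapping.register(vm_eid, "198.51.100.1")   # VM migrates to site 2
rloc_after_move = mapping.resolve(vm_eid)
```

Only the mapping entry changes when the VM moves, so sessions addressed to the EID are not tied to the subnet the VM happens to sit in.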

 

Video: LISP networking

In the following video, we will demonstrate the use of LISP in networking. It’s a hands-on demonstration that goes through the various components of a LISP network and how each component operates.

Hands on Video Series – Enterprise Networking | LISP Configuration Intro

The role of OTV

OTV is a data center interconnect ( DCI ) technology enabling Layer 2 extension across data center sites. While Fabric Path can be a DCI technology over short distances with dark fiber, OTV has been explicitly designed for DCI. In contrast, the Fabric Path data center control plane is primarily used for intra-DC communications.

Failure boundary and site independence are preserved in OTV networks because OTV uses a data center control plane protocol to sync MAC addresses between sites and prevent unknown unicast floods. In addition, recent IOS versions can allow unknown unicast floods for certain VLANs, an option that is unavailable if you use FabricPath as the DCI technology.

 

The Role of Software-defined Networking (SDN)

Another potential trade-off between data center control plane scaling, Layer 2 VM mobility, and optimal ingress/egress traffic flow would be software-defined networking ( SDN ). At a basic level, SDN can create direct paths through the network fabric to isolate private networks effectively.

An SDN network allows you to choose the correct forwarding information on a per-flow basis. This per-flow optimization eliminates VLAN separation in the data center fabric. Instead of using VLANs to enforce traffic separation, the SDN controller has a set of policies allowing traffic to be forwarded from a particular source to a destination.
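Policy-based separation can be pictured as an allow-list held by the controller. The sketch below is a toy model with hypothetical group names and ports; it is not the API of any real SDN controller.

```python
class PolicyController:
    """Toy controller: traffic is dropped unless a policy explicitly allows it."""

    def __init__(self):
        self.allowed = set()

    def allow(self, src_group, dst_group, dst_port):
        """Add a policy permitting src_group to reach dst_group on dst_port."""
        self.allowed.add((src_group, dst_group, dst_port))

    def permits(self, src_group, dst_group, dst_port):
        """Decide whether a new flow may be forwarded."""
        return (src_group, dst_group, dst_port) in self.allowed


controller = PolicyController()
controller.allow("web", "app", 8080)
controller.allow("app", "db", 5432)

web_to_app = controller.permits("web", "app", 8080)
web_to_db = controller.permits("web", "db", 5432)
```

Here the web tier can reach the application tier, but it cannot reach the database directly, even though no VLAN separates them.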

Cisco ACI brings SDN concepts to the data center. It operates over a leaf and spine design and uses traditional routing protocols such as BGP and IS-IS. However, it brings a new way to manage the data center with new constructs such as Endpoint Groups (EPGs). In addition, no more VLANs are needed in the data center as everything is routed over a Layer 3 core, with VXLAN as the overlay protocol.

Summary: Recap on Data Center Design

Data centers are the backbone of modern technology infrastructure, providing the foundation for storing, processing, and transmitting vast amounts of data. A critical aspect of data center design is the network architecture, which ensures efficient and reliable data transmission within and outside the facility.

  • Scalability and Flexibility

One of the primary goals of data center network design is to accommodate the ever-increasing demand for data processing and storage. Scalability ensures the network can grow seamlessly as the data center expands. This involves designing a network supporting many devices, servers, and users without compromising performance or reliability. Additionally, flexibility is essential to adapt to changing business requirements and technological advancements.

  • Redundancy and High Availability

Data centers must ensure uninterrupted access to data and services, making redundancy and high availability critical for network design. Redundancy involves duplicating essential components, such as switches, routers, and links, to eliminate single points of failure. This ensures that if one component fails, there are alternative paths for data transmission, minimizing downtime and maintaining uninterrupted operations. High availability further enhances reliability by providing automatic failover mechanisms and real-time monitoring to detect and address network issues promptly.

  • Traffic Optimization and Load Balancing

Efficient data flow within a data center is vital to prevent network congestion and bottlenecks. Traffic optimization techniques, such as Quality of Service (QoS) and traffic prioritization, can be implemented to ensure that critical applications and services receive the necessary bandwidth and resources. Load balancing is crucial in distributing network traffic evenly across multiple servers or paths, preventing overutilization of specific resources and optimizing performance.

  • Security and Data Protection

Data centers house sensitive information and mission-critical applications, making security a top priority. The network design should incorporate robust security measures, including firewalls, intrusion detection systems, and encryption protocols, to safeguard data from unauthorized access and cyber threats. Data protection mechanisms, such as backups, replication, and disaster recovery plans, should also be integrated into the network design to ensure data integrity and availability.

  • Monitoring and Management

Proactive monitoring and effective management are essential for maintaining optimal network performance and addressing potential issues promptly. The network design should include comprehensive monitoring tools and centralized management systems that provide real-time visibility into network traffic, performance metrics, and security events. This enables administrators to promptly identify and resolve network bottlenecks, security breaches, and performance degradation.

Data center network design is critical in ensuring efficient, reliable, and secure data transmission within and outside the facility. Scalability, redundancy, traffic optimization, security, and monitoring are key considerations for designing a robust, high-performance network. By implementing best practices and staying abreast of emerging technologies, data centers can build networks that meet the growing demands of the digital age while maintaining the highest levels of performance, availability, and security.