Data Center – Site Selection | Content routing

Data Center Site Selection

In today's interconnected world, data centers play a crucial role in ensuring the smooth functioning of the internet. Behind the scenes, intricate routing mechanisms are in place to efficiently transfer data between different locations. In this blog post, we will delve into the fascinating world of data center routing locations and discover how they contribute to the seamless browsing experience we enjoy daily.

Data centers are the backbone of our digital infrastructure, housing vast amounts of data and serving as hubs for internet traffic. One crucial aspect of data center operations is routing, which determines the path that data takes from its source to the intended destination. Understanding the fundamentals of data center routing is essential to grasp the significance of routing locations.

When it comes to selecting routing locations for data centers, several factors come into play. Proximity to major internet exchange points, network latency considerations, and redundancy requirements all influence the decision-making process. We will explore these factors in detail and shed light on the complex considerations involved in determining optimal routing locations.

Data center routing locations are strategically distributed across the globe to ensure efficient data transfer and minimize latency. We will take a virtual trip around the world, uncovering key regions where routing locations are concentrated. From the bustling connectivity hubs of North America and Europe to emerging markets in Asia and South America, we'll explore the diverse geography of data center routing.

Content Delivery Networks (CDNs) play a vital role in optimizing the delivery of web content by caching and distributing it across multiple data centers. CDNs strategically position their servers in various routing locations to minimize latency and ensure rapid content delivery to end-users. We will examine the symbiotic relationship between data center routing and CDNs, highlighting their collaborative efforts to enhance web browsing experiences.

Highlights: Data Center Site Selection

Google Cloud CDN

A CDN is a globally distributed network of servers that stores and delivers website content to users based on their geographic location. By caching and serving content from servers closest to the end users, CDNs significantly reduce latency and enhance the overall user experience.

Google Cloud CDN is a robust and scalable CDN solution offered by Google Cloud Platform. Built on Google’s global network infrastructure, it seamlessly integrates with other Google Cloud services, providing high-performance content delivery worldwide. With its vast network of edge locations, Google Cloud CDN ensures low-latency access to content, regardless of the user’s location.

– Global Edge Caching: Google Cloud CDN caches content at edge locations worldwide, ensuring faster retrieval and reduced latency for end-users.

– Security and Scalability: With built-in DDoS protection and automatic scaling, Google Cloud CDN guarantees the availability and security of your content, even during traffic spikes.

– Intelligent Caching: Leveraging machine learning algorithms, Google Cloud CDN intelligently caches frequently accessed content, further optimizing delivery and reducing origin server load.

– Real-time Analytics: Google Cloud CDN provides comprehensive analytics and monitoring tools to help you gain insights into your content delivery performance.

Routing IP addresses: The Process

In IP routing, routers must make packet-forwarding decisions independently of each other. Therefore, IP routers are only concerned with finding the next hop to a packet’s final destination. The IP routing protocol is myopic in this sense. IP’s myopia allows it to route around failures easily, but it is also a weakness. In most cases, the packet will be routed to its destination via another router unless the router is on the same subnet (more on this later).

To determine the next hop, a router looks up the packet’s destination IP address in its routing table. The router then forwards the packet out of the network interface returned by this lookup.
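To make the lookup concrete, here is a minimal Python sketch of a next-hop lookup. It assumes the usual longest-prefix-match rule (the most specific matching route wins); the prefixes, next hops, and interface names are hypothetical values chosen for illustration.

```python
import ipaddress

# Hypothetical routing table: prefix -> (next hop, outgoing interface).
ROUTES = {
    "0.0.0.0/0": ("203.0.113.1", "eth0"),
    "10.0.0.0/8": ("10.255.0.1", "eth1"),
    "10.1.2.0/24": ("10.1.2.254", "eth2"),
}


def lookup(destination, routes=ROUTES):
    """Return (prefix, next_hop, interface) for the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, (next_hop, interface) in routes.items():
        network = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix length (most specific route).
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop, interface)
    if best is None:
        return None
    return (str(best[0]), best[1], best[2])


print(lookup("10.1.2.7"))
print(lookup("8.8.8.8"))
```

Only the next hop is decided here; the router at that next hop repeats the same lookup independently, which is the hop-by-hop behavior described above.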

RIB and the FIB

All the information learned from the different sources (connected, static, and routing protocols) is stored in the RIB. A software component called the RIB manager selects among these sources. Every routing protocol has a unique number called the administrative distance. If more than one protocol offers the same prefix, the RIB manager picks the protocol with the lowest distance, and the winning route is installed in the FIB for forwarding. Connected routes have the lowest distance, followed by static routes; routes obtained via a routing protocol have a greater distance than static routes.
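The selection step can be sketched in a few lines of Python. The distance values below are the commonly cited Cisco defaults and are used here as an assumption; other vendors use different numbers, and the function name is hypothetical.

```python
# Assumed distances (commonly cited Cisco defaults); other vendors differ.
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20, "ospf": 110, "rip": 120}


def select_for_fib(candidates):
    """Pick one route per prefix: the source with the lowest distance wins.

    candidates: list of (prefix, source, next_hop) tuples held in the RIB.
    Returns a dict mapping prefix -> (source, next_hop).
    """
    fib = {}
    for prefix, source, next_hop in candidates:
        current = fib.get(prefix)
        if current is None or ADMIN_DISTANCE[source] < ADMIN_DISTANCE[current[0]]:
            fib[prefix] = (source, next_hop)
    return fib


rib = [
    ("10.1.0.0/16", "ospf", "192.0.2.1"),
    ("10.1.0.0/16", "static", "192.0.2.2"),
    ("10.2.0.0/16", "rip", "192.0.2.3"),
]
print(select_for_fib(rib))
```

In this example the static route wins for 10.1.0.0/16 because its distance is lower than that of OSPF.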

Routing to a data center

Let us address how users are routed to a data center. Well, there are several data center site selection criteria or even checklists that you can follow to ensure your users follow the most optimal path and limit sub-optimal routing. Distributed workloads with multi-site architecture open up several questions regarding the methods for site selection, path optimization for ingress/egress flows, and data replication (synchronous/asynchronous) for storage. 

BGP AS Prepending

AS Path prepending is a simple yet powerful technique for manipulating BGP route selection. By adding additional AS numbers to the AS Path attribute, network administrators can influence the inbound traffic flow to their network. Essentially, the longer the AS Path, the less attractive the route appears to neighboring ASes, leading to traffic routed through alternate paths.

AS Path prepending offers several benefits for network administrators. Firstly, it provides a cost-effective way to balance inbound traffic across multiple links, thereby preventing congestion on a single path. Secondly, it enhances network resilience by providing redundancy and alternate paths in case of link failures. Lastly, AS Path prepending can be used strategically to steer inbound traffic toward preferred links and improve network performance.

In my example, AS 1 wants to ensure traffic enters the autonomous system through R2. We can add our autonomous system number multiple times so that the AS path becomes longer. Since BGP prefers a shorter AS path, we can influence our routing. This is called AS path prepending. Below, the default behavior is shown without prepending configured.

BGP Configuration

First, create a route map and use set as-path prepend to add your own AS number multiple times. Don’t forget to add the route map to your BGP neighbor configuration. It should be outbound since you are sending this to your remote neighbor! Let’s check the BGP table! Now we see that 192.168.23.3 is our next-hop IP address. The AS Path for the second entry has also become longer. That’s it!
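The effect of prepending can be modeled with a small Python sketch. This is not router configuration; it only models the AS path length comparison and assumes every earlier best-path attribute (such as local preference) is equal. The router names mirror the example, and the function names are hypothetical.

```python
def prepend(as_path, local_as, count):
    """Return a copy of as_path with local_as added `count` extra times at the front."""
    return [local_as] * count + list(as_path)


def preferred_entry_point(advertisements):
    """Pick the entry router whose advertisement has the shortest AS path.

    advertisements: dict mapping entry router name -> AS path (list of AS numbers).
    Ties are broken by router name only to keep this sketch deterministic.
    """
    return min(sorted(advertisements), key=lambda router: len(advertisements[router]))


base_path = [1]  # AS 1 originates the prefix

# Default behavior: both entry points advertise the same AS path length.
default = {"R1": base_path, "R2": base_path}

# AS 1 prepends its own AS number three times on the advertisement sent via R1.
with_prepend = {"R1": prepend(base_path, 1, 3), "R2": base_path}

print(preferred_entry_point(default))
print(preferred_entry_point(with_prepend))
```

With the longer path advertised via R1, the remote AS prefers the advertisement received via R2, which is the inbound behavior the example is after.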

BGP AS Prepend

Distributing the load

Furthermore, once the content is distributed to multiple data centers, you need to manage the requests for the distributed content and the load by routing users’ requests to the appropriate data center. Routing requests to a data center in this way is known as content routing. Content routing takes a user’s request and sends it to the relevant data center.

Before you proceed, you may find the following posts helpful:

  1. DNS Security Solutions
  2. BGP FlowSpec
  3. DNS Reflection Attack
  4. DNS Security Designs
  5. Data Center Topologies
  6. WAN Virtualization

Data Center Site Selection

Data Center Interconnect (DCI)

Before we get started with a data center site selection checklist, it may be helpful to know how data centers interconnect. Data Center Interconnect (DCI) solutions have been around for quite some time; they are mainly used to connect geographically separated data centers.

Layer 2 extensions might be required at different layers in the data center to enable the resiliency and clustering mechanisms offered by the other applications. For example, Cisco’s OTV can be used as a DCI solution.

OTV provides Layer 2 extension between remote data centers using MAC address routing. A control plane protocol exchanges MAC address reachability information between network devices, providing the LAN extension functionality. This has a tremendous advantage over traditional data center interconnect solutions, which generally depend on data plane learning and flooding across the transport to learn reachability information.

Data Center Site Selection Criteria

Proximity-based site selection

Different data center site selection criteria can route users to the most optimal data center. For example, proximity-based site selection involves selecting a geographically closer data center, which generally improves response time. Additionally, you can route requests based on the data center’s load or application availability.

Things become interesting when workloads want to move across geographically dispersed data centers while maintaining active connections to front-end users and back-end systems. All these elements put increasing pressure on the data center interconnect (DCI) and the technology used to support workload mobility.

Multi-site load distribution & site-to-site recovery

Data center site selection can be used for site-to-site recovery and multi-site load distribution. Multi-site load distribution requires a mechanism that enables the same application to be accessed by both data centers, i.e., an active/active setup.

For site-to-site load balancing, you must use an active/active scenario in which both data centers host the same active application. Logically active/standby means that some applications will be active on one site while others will be on standby at the other sites.

Diagram: Data center site selection. Dual data centers.

Data center site selection is vital, and determining which data center should receive a request can be based on several factors, such as proximity and load. Different applications will prefer different site selection mechanisms. For example, video streaming will choose the closest data center (proximity selection). Other types of applications would prefer the least-loaded data center, and others work efficiently with the standard round-robin metric. The three traditional ingress site selection methods are DNS-based request routing, HTTP redirection, and Route Health Injection.
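The three selection criteria mentioned above (proximity, load, and round-robin) can be sketched as small Python functions. The site names, RTT values, and load figures are hypothetical, and RTT is used as a stand-in for proximity.

```python
from itertools import cycle

# Hypothetical site metrics as a site selector might see them.
SITES = [
    {"name": "dc-east", "rtt_ms": 12, "load": 0.70},
    {"name": "dc-west", "rtt_ms": 48, "load": 0.35},
]


def by_proximity(sites):
    """Proximity selection, approximated here by the lowest round-trip time."""
    return min(sites, key=lambda s: s["rtt_ms"])["name"]


def by_least_load(sites):
    """Least-loaded selection."""
    return min(sites, key=lambda s: s["load"])["name"]


def round_robin(sites):
    """Return an iterator that hands out sites in turn."""
    return cycle([s["name"] for s in sites])


print(by_proximity(SITES))
print(by_least_load(SITES))
rr = round_robin(SITES)
print([next(rr) for _ in range(3)])
```

A video streaming service would likely call `by_proximity`, while a batch-style application might prefer `by_least_load`.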

Data Center Site Selection Checklist

Hypertext Transfer Protocol ( HTTP ) redirection

Many applications rely on HTTP redirection, which browsers support natively. This enables the client to communicate with a secondary server if the primary server is not available. When redirection is required, the server sends an HTTP Redirect (307) to the client, pointing it to the correct site with the required content. One advantage of this mechanism is that you have visibility into the requested content, but as you have probably already guessed, it only works with HTTP traffic.

Diagram: HTTP redirect.
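As a rough illustration of the redirect flow, the sketch below uses Python's standard `http.server` module. The content-to-site mapping, the host names, and the `redirect_target` helper are assumptions made up for this example; a real deployment would base the decision on health and content location data.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping of content paths to the site that hosts them.
CONTENT_SITES = {
    "/video": "https://dc-west.example.com",
    "/docs": "https://dc-east.example.com",
}
LOCAL_SITE = "https://dc-east.example.com"


def redirect_target(path, local_site=LOCAL_SITE, content_sites=CONTENT_SITES):
    """Return the URL to redirect to, or None if this site should serve the request."""
    for prefix, site in content_sites.items():
        if path.startswith(prefix) and site != local_site:
            return site + path
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 307 tells the client to repeat the same request at the other site.
            self.send_response(307)
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"served locally\n")


print(redirect_target("/video/intro.mp4"))
```

To try it, the handler can be served with `HTTPServer(("", 8080), RedirectHandler).serve_forever()` from the same module. Because the decision is made on the request path, the server has visibility into the requested content, which is the advantage noted above.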

DNS-based request routing

DNS-based request routing, or DNS load balancing, distributes incoming network traffic across multiple servers or locations based on the DNS responses. Traditionally, DNS has been primarily used to translate human-readable domain names into IP addresses. However, DNS-based request routing can now be vital in optimizing network traffic flow.

How does it work?

When a user initiates a request to access a website or application, their device sends a DNS query to a DNS resolver. Instead of providing a single IP address in response, the DNS resolver returns a list of IP addresses associated with the requested domain. Each IP address corresponds to a different server or location that can handle the request.

The control point for geographic load distribution in DNS-based request routing resides within DNS. DNS-based request routing uses DNS for both site-to-site recovery and multi-site load distribution. A DNS request from the client, either recursive or iterative, is accepted, and the client is directed to a data center based on configurable parameters. This provides the ability to distribute the load among multiple data centers with an active/active design based on criteria such as least loaded, proximity, round-robin, and round-trip time (RTT).
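A minimal sketch of how a DNS-based selector might build its answer is shown below. It assumes the selector tracks site health and an RTT estimate per resolver region; the VIPs, regions, TTL, and the `answer` function are all hypothetical.

```python
# Hypothetical per-site state that a DNS-based site selector might track.
SITES = {
    "dc-east": {"vip": "192.0.2.10", "healthy": True, "rtt_ms": {"eu": 90, "us": 15}},
    "dc-west": {"vip": "198.51.100.10", "healthy": True, "rtt_ms": {"eu": 140, "us": 40}},
}


def answer(resolver_region, sites=SITES, ttl=30):
    """Return a DNS-style answer: healthy VIPs ordered by RTT to the resolver's region."""
    healthy = [s for s in sites.values() if s["healthy"]]
    ordered = sorted(healthy, key=lambda s: s["rtt_ms"][resolver_region])
    return {"ttl": ttl, "addresses": [s["vip"] for s in ordered]}


print(answer("us"))
```

The short TTL is a design choice in this sketch: it limits how long resolvers cache an answer, at the cost of more DNS queries.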

The support for legacy applications

DNS-based request routing becomes challenging if you have to support legacy applications without DNS name resolution. These applications have hard-coded IP addresses used to communicate with other servers. When there is a combination of legacy and non-legacy applications, the solution might be to use DNS-based request routing and IGP/BGP.

Another caveat for this approach is that the refresh rate for the DNS cache may impact the convergence time. Once a VM moves to the secondary site, there will also be increased traffic flow on the data center interconnect link—previously established connections are hairpinned.

Route Health Injection ( RHI )

Route Health Injection (RHI) is a method for improving network resilience by dynamically injecting alternative routes. It involves monitoring network devices and routing protocols to identify potential failures or performance degradation. By preemptively injecting alternative routes, RHI enables networks to reroute traffic and maintain optimal connectivity quickly.

How does Route Health Injection work?

Route Health Injection operates by continuously monitoring the health of network devices and analyzing routing protocol information. It leverages various metrics such as latency, packet loss, and link utilization to assess the overall health of network paths. When a potential issue is detected, RHI dynamically injects alternative routes to bypass the affected network segment, allowing traffic to flow seamlessly.

RHI is implemented in front of the application and, depending on its implementation, allows the same address or a different address to be advertised. It’s a route injected by a local load balancer that influences the ingress traffic path. RHI injects a static route when the VIP ( Virtual IP address ) becomes available and withdraws the static route when the VIP is no longer active. The VIP is used to represent an application.
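The inject-and-withdraw behavior can be sketched as a small state machine in Python. The class name and the returned action strings are hypothetical; a real load balancer would program a static host route and let the routing protocol redistribute it.

```python
class RouteHealthInjector:
    """Track which VIP host routes are currently advertised."""

    def __init__(self):
        self.advertised = set()

    def update(self, vip, healthy):
        """Inject a /32 host route when the VIP is up; withdraw it when the VIP is down."""
        route = f"{vip}/32"
        if healthy and route not in self.advertised:
            self.advertised.add(route)
            return f"inject {route}"
        if not healthy and route in self.advertised:
            self.advertised.remove(route)
            return f"withdraw {route}"
        return "no change"


rhi = RouteHealthInjector()
print(rhi.update("192.0.2.10", healthy=True))
print(rhi.update("192.0.2.10", healthy=False))
```

Each application VIP adds one host route, so the number of routes tracked grows with the number of applications.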

  • A key point: Data center active-active scenario

Route Health Injection can be used for an active/active scenario as both data centers can use the same VIP to represent the server cluster for each application. RHI can create a lot of churn as routes are constantly being added and removed. As the number of supported applications grows, the number of host routes in the network grows linearly. The decision to use RHI should come down to the scale and size of the data center’s application footprint.

RHI is commonly used on Intranets as the propagation of more specifics is not permitted on the Default Free Zone ( DFZ ). Specific requirements require RHI to be used with BGP/IGP for external-facing clients. Due to the drawbacks of DNS caching, RHI is often preferred over DNS solutions for Internet-facing applications.

  • A quick point: Ansible Automation

Ansible could be a good tool for bringing automation into the data center. You can run Ansible from the CLI with Ansible Core or take a platform approach with Ansible Tower. Can these automation tools assist in our data center operations? Ansible variables can be used to remove site-specific information to make your playbooks more flexible.

For data center configuration or simply checking routing tables, you can have a single playbook that uses Ansible variables to perform operations on both data centers. I use this to check the routing tables of each data center: one playbook, using Ansible variables, run against one inventory for all my data centers. This can quickly help you when troubleshooting data center site selection.

BGP AS prepending

BGP AS prepending can be used for active/standby site selection; it is not a multi-site load distribution method. BGP uses the best-path algorithm to determine the best path to a specific destination. One step in that algorithm, widely implemented by router manufacturers, is AS Path length: the fewer ASs in the path list, the better the route.

Specific routes are advertised from both data centers, with additional AS numbers prepended to the secondary site’s routes. When BGP goes through its best-path selection process, it will choose the path with the shortest AS Path, i.e., the primary site without AS prepending.

 

BGP conditional advertisements

BGP conditional advertisements are helpful when you are concerned that some manufacturers’ implementations may explicitly remove the prepended AS Path. With conditional route advertisement, a condition must be met before an advertisement occurs. The routers on the secondary site monitor a set of prefixes located on the primary site, and when those prefixes are no longer reachable at the primary site, the secondary site begins to advertise.

Its configuration is based on the “no-export” community and iBGP between the sites. If routes were redistributed from BGP into the IGP and advertised to the iBGP peer, the secondary site would advertise those routes, defeating the purpose of a conditional advertisement.
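The condition itself is simple enough to sketch in Python. This models only the decision logic, assuming the secondary site advertises when none of the watched primary-site prefixes are present in its BGP table; the function name and prefixes are hypothetical.

```python
def secondary_should_advertise(watched_prefixes, bgp_table):
    """Advertise from the secondary site only when no watched prefix is reachable.

    watched_prefixes: prefixes that live at the primary site.
    bgp_table: set of prefixes the secondary site currently learns via BGP.
    """
    return not any(prefix in bgp_table for prefix in watched_prefixes)


watched = ["203.0.113.0/24"]
print(secondary_should_advertise(watched, {"203.0.113.0/24", "198.51.100.0/24"}))
print(secondary_should_advertise(watched, {"198.51.100.0/24"}))
```

While the primary site's prefix is still being learned, the secondary site stays silent; once it disappears, the condition is met and the secondary site starts advertising.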

Diagram: How do users get routed to a data center?

The RHI method, used internally or externally with BGP, is appropriate when using IP as the site selection method. For example, this may be the case when the application uses hard-coded IP addresses, as is common with legacy applications, or when you are concerned about DNS caching issues. Site selection based on RHI and BGP requires no changes to DNS.

However, its main drawback is that it cannot be used for active/active data centers and is primarily positioned as an active/standby method. This is because there is only ever one entry for the prefix in the routing table.

Finally, to complete the data center site selection checklist, there are designs where you can use IP Anycast in conjunction with BGP, IGP, and RHI to achieve an active/active scenario, and I will discuss this later. With this setup, there is no need for BGP conditional route advertisement or AS Path prepending.

Spine Leaf Architecture

In today's interconnected world, where data traffic is growing exponentially, having a robust and scalable network architecture is crucial for businesses and organizations. One such architecture that has gained popularity in recent years is the Spine Leaf architecture. This blog post will explore Spine Leaf architecture, its benefits, and how it can revolutionize network design.

Spine Leaf architecture, also known as Clos architecture, is a network design approach that offers high bandwidth, low latency, and scalability. It is commonly used in data centers and large-scale enterprise networks. The architecture is based on a two-tier design consisting of spine switches and leaf switches.

The spine-leaf architecture is a data center design that provides a scalable and low-latency network fabric. Unlike traditional three-tiered designs, the spine-leaf architecture eliminates the need for complex hierarchical structures, allowing for faster and more efficient communication between devices. With a non-blocking fabric and equal-length paths, data can travel seamlessly from one leaf switch to another, enhancing overall network performance.

The spine-leaf data center offers a myriad of benefits for organizations. Firstly, it provides predictable and consistent low-latency connectivity, ensuring optimal performance for mission-critical applications. Additionally, the architecture allows for easy scalability, enabling seamless expansion as network demands grow. Furthermore, the simplified design reduces complexity, making deployment and management more efficient. Overall, the spine-leaf data center empowers organizations with a robust and agile network infrastructure.

One of the key advantages of the spine-leaf data center is its ability to enhance network flexibility and resilience. By utilizing equal-length paths, traffic can be distributed evenly, preventing bottlenecks and maximizing network capacity. Moreover, the architecture allows for the implementation of link aggregation techniques, increasing bandwidth and redundancy. These features not only improve network performance but also provide built-in fault tolerance, ensuring high availability for critical applications.

The spine-leaf architecture represents a significant shift in the evolution of data center networks. Traditional designs often faced challenges in adapting to the increasing demands of virtualization, cloud computing, and big data analytics. The spine-leaf data center addresses these challenges by providing a scalable, high-performance, and flexible network design that can meet the requirements of modern applications and workloads. It sets the stage for the future of data center networking.

Highlights: Spine Leaf Architecture

At its most straightforward, a data center is a physical facility that houses applications and data. Such a design is based on a computing and storage resources network that enables the delivery of shared applications and data. The critical elements of a data center design include routers, switches, firewalls, storage systems, servers, and application-delivery controllers.

The data center should be flexible in quickly deploying and supporting new services. Such a design needs substantial initial planning and consideration of port density, access layer uplink bandwidth, actual server capacity, and oversubscription, to name a few.

Traditional Tree-Based Topologies

Tree-based topologies sit on the opposite side of a spine-leaf switch design. They have been the mainstay of data center networks. Traditionally, Cisco has recommended a multi-tier tree-based data center topology, as depicted in the diagram below.

These networks are characterized by aggregation pairs ( AGGs ) that aggregate through many network points. Hosts connect to access or edge switches, which connect to distribution, and distribution connects to the core.

The core should offer no services ( firewall, load balancing, or WAAS ), and its central role is to forward packets as quickly as possible. The aggregation switches define the boundary for the Layer 2 domain, and to contain broadcast traffic to individual domains, VLANs are used to further subdivide traffic into segmented groups. This style of design operates very differently from spine-leaf architecture.

1st Lab Guide: Leaf and Spine with Cisco ACI 

The following lab guide addresses the leaf and spine with Cisco ACI. The screenshot below shows a small topology that is fine for demonstration purposes. The leaf and spine are based on the Cisco Nexus 9000 series. The ACI has an automated fabric discovery process, and as you can see, we have successfully registered all fabric members.

Diagram: Cisco ACI fabric checking.

The traditional three-tier model was based on the following design principles:

  1. The access switch connects to endpoints, e.g., servers.
  2. The aggregation or distribution switches provide redundant connections to access switches.
  3. The core switches provide fast transport between aggregation switches, typically connected in a redundant pair for high availability.
  4. Networking and security services such as load balancing or firewalling were typically connected to the distribution layers.

Diagram: The traditional data center design. Non-spine-leaf architecture.

The focus of the design

The design focused on fault avoidance, and the strategy for implementing this principle was to take each switch and its connected links and build redundancy into it. This led to the introduction of port channels and devices deployed in pairs. In addition, servers pointed to a First Hop Redundancy Protocol such as HSRP (Hot Standby Router Protocol) or VRRP (Virtual Router Redundancy Protocol). Unfortunately, this steady-state type of network design led to many inefficiencies:

  1. Inefficient use of bandwidth via a single-rooted core.
  2. Operational and configuration complexity.
  3. The cost of having redundant hardware.
  4. It is not optimized for small flows.

Recent changes to application and user requirements have changed the functions of data centers, which in turn has altered the topology and design of the data center to a spine-leaf switch topology. For example, the traditional aggregation point design style was inefficient, and recent changes in end-user requirements are driving architects to design around the following key elements.

Spine Leaf Architecture: Requirements

At the most basic level, a spine-leaf architecture collapses one of these tiers, as depicted in the diagram below. It is built around the following design principles:

  1. The removal of the Spanning Tree Protocol (STP)
  2. Increased use of fixed-port switches over modular models for the network backbone
  3. More cabling to purchase and manage, given the higher interconnection count
  4. A scale-out vs. scale-up of infrastructure.

Diagram: What is spine and leaf architecture? 2-Tier Spine Leaf Design

Leaf and Spine Main Points

With the introduction of the cloud and containerized infrastructure, there was an increase in east-west traffic. East-west traffic differs from north-south traffic in that it moves laterally from server to server. Generally, this type of traffic flow stays internal to the data center.

With the change in traffic patterns, we must design our data centers to have low latency and optimized traffic flows, especially for time-sensitive or data-intensive applications. A spine-leaf data center design aids this by ensuring traffic always has the same number of hops to its destination, so latency is low and predictable.

STP has always been problematic in the data center. Now, the capacity improves with a leaf and spine because STP is no longer required. In the past, STP blocked redundant paths between two switches, where only one could be active at any time.

As a result, the remaining paths were often oversubscribed. Spine-leaf architectures instead rely on protocols such as Equal-Cost Multipath (ECMP) routing to load balance traffic across all available paths while still preventing network loops. So, instead of running STP to the spine layer, we can run routing protocols.
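To illustrate how ECMP spreads flows, here is a minimal Python sketch of per-flow hashing. Real switches use their own hardware hash functions; SHA-256 is an assumption made only to keep the example deterministic, and the spine names and flow values are hypothetical.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick one equal-cost next hop by hashing the flow's 5-tuple.

    flow: (src_ip, dst_ip, protocol, src_port, dst_port).
    The same flow always hashes to the same path, which keeps its packets in order.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


spines = ["spine1", "spine2", "spine3", "spine4"]
flow = ("10.0.1.5", "10.0.2.9", 6, 49152, 443)
print(ecmp_next_hop(flow, spines))
```

Because the choice is made per flow rather than per packet, different flows between the same pair of leaf switches can use different spines while each individual flow stays on one path.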

We also have better scalability. We can add additional spine switches, and leaf switches can be seamlessly inserted when port density becomes problematic. There is no need to take down the core layer for upgrades.

Diagram: STP Blocking. Source Cisco Press free chapter.

Data Center Requirements

  • 1) Equidistant endpoints with non-blocking network core.

Equidistant endpoints mean that every device is a maximum of one hop away from the other, resulting in consistent latency in the data center. The term “non-blocking” refers to the internal forwarding performance of the switch.

Non-blocking is the ability to forward at line rate Tx/Rx: sender X can send to receiver Y and not be blocked by a simultaneous sender. A blocking architecture cannot deliver the total bandwidth even if the individual switching modules are not oversubscribed or if not all ports are transmitting simultaneously.

  • 2) Unlimited workload placement and mobility.

The application team wants to place the application at any point in the network and communicate with existing services like storage. This usually means that VLANs need to sprawl for vMotion to work. The main question is, where do we need large Layer 2 domains? Bridging doesn’t scale, and that’s not just because of spanning tree issues; it’s because MAC addresses are not hierarchical and cannot be summarized. There is also a limit of roughly 4,000 VLANs (the 12-bit 802.1Q VLAN ID allows 4,094 usable values).

  • 3) Lossless transport for storage and other elephant flows.

To support this type of traffic, data centers require not only conventional QoS tools but also Data Center Bridging ( DCB ) tools such as Priority flow control ( PFC ), Enhanced transmission selection ( ETS ), and Data Center Bridging Exchange ( DCBX ) to be applied throughout their designs. These standards are enhancements that allow lossless transport and congestion notification over full-duplex 10 Gigabit Ethernet networks.

Feature: Benefit

– Priority-based Flow Control ( PFC ): Manages a bursty single traffic source on a multiprotocol link.

– Enhanced Transmission Selection ( ETS ): Enables bandwidth management between traffic types for multiprotocol links.

– Congestion notification: Addresses the problems of sustained congestion by moving corrective action to the edge of the network.

– Data Center Bridging Exchange Protocol ( DCBX ): Allows the exchange of enhanced Ethernet parameters.

  • 4) Simplified provisioning and management.

Simplified provisioning and management are critical to operational efficiency. However, the ability to auto-provision and for the users to manage their networks is challenging for future networks.

  • 5) High server-to-access layer transmission rate at Gigabit and 10 Gigabit Ethernet.

Before the advent of virtualization, servers transitioned from 100Mbps to 1GbE as processor performance increased. With the introduction of high-performance multicore processors and each physical server hosting multiple VMs, the processor-to-network connection bandwidth requirements increased dramatically, making 10 Gigabit Ethernet the most common network access option for servers.

In addition, the popularization of 10 Gigabit Ethernet for server access has provided a straightforward approach to group/bundle multiple Gigabit Ethernet interfaces into a single connection, making Ethernet an extremely viable technology for future-proof I/O consolidation.

In addition, to reduce networking costs, data centers are now carrying data and storage traffic over Ethernet using protocols such as iSCSI ( Internet Small Computer System Interface ) and FCoE ( Fibre Channel over Ethernet ). FCoE allows the transport of Fibre Channel frames over a lossless Ethernet network.

Diagram: FCoE frame format.

Although there has been some talk of introducing 25 Gigabit Ethernet due to the excessive price of 40 Gigabit Ethernet, the two main speeds on the market are Gigabit and 10 Gigabit Ethernet. The following is a comparison table between Gigabit and 10 Gigabit Ethernet:

Gigabit Ethernet

+ Well known and field-tested
+ Standard and cheap copper cabling
+ NIC on the motherboard
– Numerous NICs per hypervisor host, possibly up to 6 NICs ( user data, vMotion, storage )
– No storage/networking convergence; unable to combine networking and storage onto one NIC
– No lossless transport for storage and elephant flows

10 Gigabit Ethernet

+ Much faster vMotion
+ Converged storage & network ( FCoE or lossless iSCSI/NFS )
+ Reduces the number of NICs per server
+ Built-in QoS with ETS and PFC
+ Uses fiber cabling, which has lower energy consumption and error rate
– More expensive NICs
– Usually requires new cabling to be laid, which in turn could mean more structured panels
– SFPs for either single-mode or multimode fiber can cost up to $4000 list each

Spine-Leaf Switch Design

The critical difference between traditional aggregation layers/points and fabric networks is that fabric doesn’t aggregate. If we want every edge router to be able to send 10 Gbps to every other edge router, we must add bandwidth between routers A and B, i.e., if we have three hosts sending at 10 Gbps each, we need a core that supports 30 Gbps.

We must add bandwidth at the core because what if two routers wanted to send 2 x 10 Gbps of data, and the core only supports a maximum of 10 Gbps (a 10 Gbps link between routers A and B)? Both data streams must be interleaved onto the oversubscribed link so that both senders get equal bandwidth.

You get blocked and oversubscribed when more bandwidth comes into the core than the core can accommodate. Blocking and oversubscription cause delay and jitter, which is bad for some applications, so we must find a way to provide total bandwidth between each end host.

Oversubscription is expressed as the ratio of inputs to outputs (ex. 3:1) or as a percent that is calculated (1 – (# outputs / # inputs)). For example, (1 – (1 output / 3 inputs)) = 67% oversubscribed). There will always be some oversubscription on the network, and there is nothing we can do to get away from that, but as a general rule of thumb, an oversubscription value of 3:1 is best practice.

Some applications will operate fine when oversubscription occurs. It is up to the architect to thoroughly understand application traffic patterns, bursting needs, and baseline states to define the oversubscription limits a system can tolerate accurately.
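To make the two ways of expressing oversubscription concrete, here is a minimal Python sketch. The function names and the 30 Gbps / 10 Gbps figures are illustrative assumptions taken from the three-host example above, not values from any vendor tool.

```python
def oversubscription_ratio(input_gbps, output_gbps):
    """Ratio of bandwidth entering a layer to bandwidth leaving it (3.0 means 3:1)."""
    return input_gbps / output_gbps


def oversubscription_percent(input_gbps, output_gbps):
    """Percentage form: (1 - outputs / inputs) * 100."""
    return (1 - output_gbps / input_gbps) * 100


# Three hosts sending 10 Gbps each into a single 10 Gbps core link.
ratio = oversubscription_ratio(30, 10)
percent = oversubscription_percent(30, 10)
print(f"{ratio:.0f}:1, {percent:.0f}% oversubscribed")  # 3:1, 67% oversubscribed
```

The same functions can be applied per layer (server to leaf, leaf to spine) to check each stage against the limit the applications can tolerate.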

The simplest solution to the oversubscription and blocking problems is to increase the bandwidth between Router A and B, as shown in the diagram labeled “Traditional Aggregation Topology.” This is feasible only up to a point. The links between Router A and B must grow from 10 Gbps to 30 Gbps and beyond as the number of edge hosts grows, and data center links and the optics used to connect them are expensive.

Spine-Leaf Switch Design

The solution is to divide the core devices into several spine devices, which expose the internal fabric, enabling a spine-leaf architecture similar to what you see with ACI networks. This is achieved by spreading the fabric across multiple devices ( leaf and spine ).

The spreading of the fabric results in every leaf edge switch connecting to every spine core switch, so every edge device has access to the total bandwidth of the fabric. This places multiple traffic streams in parallel, unlike the traditional multitier design that stacks multiple streams onto a single link.

In addition, the higher degree of equal-cost multi-path routing ( ECMP ) found with leaf and spine architectures allows for greater cross-sectional bandwidth between layers, thus greater east-west bandwidth. There is also a reduction in the fault domain compared to traditional access, distribution, and core designs.

A failure of a single device only reduces the available bandwidth by a fraction, and only traffic in transit is lost when a link fails. ECMP reduces exposure to any single fault and keeps the fault domain small.
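The "reduces the available bandwidth by a fraction" point can be sketched numerically. This is a simplified model that assumes every leaf has one equal-speed uplink to each spine; the four-spine, 40 Gbps figures are hypothetical.

```python
def leaf_uplink_capacity(spines, link_gbps, failed_spines=0):
    """Uplink bandwidth left on a leaf when some spines are down."""
    healthy = spines - failed_spines
    if healthy < 0:
        raise ValueError("more failed spines than spines")
    return healthy * link_gbps


def capacity_lost_fraction(spines, failed_spines=1):
    """Share of cross-sectional bandwidth lost, assuming equal-speed uplinks."""
    return failed_spines / spines


# Hypothetical fabric: 4 spines, one 40 Gbps uplink from each leaf to each spine.
print(leaf_uplink_capacity(4, 40))                   # 160
print(leaf_uplink_capacity(4, 40, failed_spines=1))  # 120
print(capacity_lost_fraction(4))                     # 0.25
```

Adding spines shrinks the fraction lost per failure, which is one reason wider fabrics have a smaller fault domain.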

Origination of the spine and leaf design

Charles Clos first described the Clos network in the early 1950s as a multi-stage circuit-switched interconnection network that provides a scalable approach to building large-scale voice switches. It is built from constrained, high-speed switching fabrics and requires low-latency, non-blocking switching elements.

There has been an increase in the deployment of Clos-based models in data centers. Usually, the Clos network is folded around the middle to form a “folded-Clos” network, referred to as a spine-leaf architecture. The spine-leaf switch design consists of three types of switches:

  • Servers connect directly to ToR ( top of rack ) switches.
  • ToR connects to aggregation switches.
  • Intermediate switches connect to aggregation switches. 

The spine is responsible for interconnecting all leaves and allows hosts in one rack to talk to hosts in another. The leaves are responsible for physically connecting the servers and distributing traffic via ECMP across all spine nodes.

Leaf and Spine: Folded 3-Stage Clos fabric

Spine-leaf switch deployment considerations:

A. Spine-leaf switch: Fixed or modular switches

Fixed switches

+ Cheaper
+ Lower power consumption
+ Require less space
+ More ports per RU
– Hard to manage
– Difficult to expand
– More cabling due to an increase in device numbers

Modular switches

+ Gradual growth
+ Larger fabrics with leaf/spine topologies
+ Built-in redundancy with redundant SUPs and SSO/NSF
+ In-service software redundancy
+ Easier to manage
– More expensive

The leaf layer determines the size of the spine and the oversubscription ratios. It is responsible for advertising subnets into the network fabric. An example of a leaf device would be a Nexus 3064, which provides the following:

  1. Line rate for Layer 2 and Layer 3 on all ports.
  2. Shared memory buffer space.
  3. Throughput of approximately 1.2 terabits per second ( Tbps ) and 950 million packets per second ( Mpps )
  4. 64-way ECMP
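How the leaf determines the spine size and the oversubscription ratio can be sketched as follows. The port counts are assumptions for illustration (a leaf with 48 x 10G server ports and 4 x 40G uplinks, and a hypothetical 32-port spine), and the model assumes one uplink from every leaf to every spine.

```python
from dataclasses import dataclass


@dataclass
class LeafModel:
    downlink_ports: int  # server-facing ports
    downlink_gbps: int
    uplink_ports: int    # one uplink per spine switch
    uplink_gbps: int


def fabric_size(leaf, spine_ports):
    """Size a two-tier fabric where every leaf connects once to every spine."""
    downlink_bw = leaf.downlink_ports * leaf.downlink_gbps
    uplink_bw = leaf.uplink_ports * leaf.uplink_gbps
    return {
        "spines": leaf.uplink_ports,       # leaf uplink count fixes the spine count
        "max_leaves": spine_ports,         # each spine port terminates one leaf
        "max_servers": spine_ports * leaf.downlink_ports,
        "oversubscription": downlink_bw / uplink_bw,
    }


plan = fabric_size(LeafModel(48, 10, 4, 40), spine_ports=32)
print(plan)
```

With these assumed numbers the fabric has 4 spines, up to 32 leaves and 1536 server ports, at a 3:1 leaf oversubscription ratio.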

The spine layer is responsible for learning infrastructure routes and physically interconnecting all leaf nodes. The Nexus 7K is one platform suited to the spine layer. Its F2 series line cards can provide 48 x 10G line-rate ports and fit the requirements for a spine architecture very well.

The following are the types of implementations you could have with this topology:

  1. Layer 3 fabric with standard routing.
  2. Large-scale bridging ( FabricPath, TRILL, or SPB ).
  3. Multi-chassis MLAG ( Cisco VSS ).

This article will focus on Layer 3 fabrics with standard routing.

B. Spine-leaf switch: Non-redundant layer 3 design

Spine-leaf switch: Design Summary

  1. Layer 3 directly to the access layer. Layer 2 VLANs do not span the spine layer.
  2. Servers are connected to single switches. Servers are not dual connected to two switches, i.e., there is no server to switch redundancy or MLAG.
  3. All connections between the switches will be pure routed point-to-point layer 3 links.
  4. There are no inter-switch VLANs, so no VLAN will ever go beyond one switch.

When the spine switches advertise only a default route to the leaf switches, the leaf switches lose visibility of the rest of the network, and additional intra-spine links are needed to recover from some link failures. However, intra-spine links should not be used for data plane traffic in a leaf-spine architecture.

Spine-leaf switch: Design assumptions

The spine layer passes a default route to the Leaf. The link between the Leaf connecting to Host 1 and Spine Z fails. In the diagram, the link is marked with a red “X.” Host 4 sends traffic to the fabric destined for Host 1.

This traffic is spread ( ECMP ) across all links connecting that Leaf to the Spine layer. Some of it reaches the spine switch whose link to the Leaf connecting Host 1 has failed. Because that spine no longer has a direct path to the destination Leaf, some traffic may be dropped, while other traffic takes a sub-optimal path. To overcome this, you would have to add inter-switch links between the spine switches, which is not recommended.
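The failure scenario above can be modeled in a few lines of Python. This is a toy model, not a routing protocol implementation: the switch names are hypothetical, and it assumes that with specific prefixes a spine stops advertising a leaf's subnet once its link to that leaf is down.

```python
def ecmp_next_hops(src_leaf, dst_leaf, spines, up_links, default_route_only):
    """Spines the source leaf will load-balance across for traffic to dst_leaf."""
    reachable = [s for s in spines if (src_leaf, s) in up_links]
    if default_route_only:
        # The leaf only knows "send everything to any spine".
        return reachable
    # With specific prefixes, a spine that lost its link to dst_leaf
    # no longer advertises that prefix, so the leaf stops using it.
    return [s for s in reachable if (dst_leaf, s) in up_links]


def stranded_paths(src_leaf, dst_leaf, spines, up_links, default_route_only):
    """Next hops that receive traffic but cannot reach dst_leaf directly."""
    hops = ecmp_next_hops(src_leaf, dst_leaf, spines, up_links, default_route_only)
    return [s for s in hops if (dst_leaf, s) not in up_links]


spines = ["spine-1", "spine-2", "spine-3"]
leaves = ["leaf-host1", "leaf-host4"]
up_links = {(leaf, spine) for leaf in leaves for spine in spines}
up_links.discard(("leaf-host1", "spine-3"))  # the failed link

print(stranded_paths("leaf-host4", "leaf-host1", spines, up_links, True))   # ['spine-3']
print(stranded_paths("leaf-host4", "leaf-host1", spines, up_links, False))  # []
```

With only a default route, one third of the ECMP paths end at a spine that cannot deliver the traffic. With specific prefixes, the leaf simply stops using that spine.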

Spine-leaf switch: Recommendations

  1. Buy Leaf switches that can support enough IP prefixes and don’t use summarization from Spine to Leaf.
  2. Always use 40G links instead of channels of 4 x 10G links because link aggregation bandwidth does not affect routing costs. If you lose a member link in the port channel, the cost of the port channel does not change, which could result in congestion on the remaining links. You could use Embedded Event Manager ( EEM ) scripting to change the OSPF cost after a port-channel member link fails, but this adds complexity because the routes are no longer equal cost. That would push you toward the Cisco proprietary protocol EIGRP, which supports unequal-cost routing. If you do not want to depend on a Cisco proprietary protocol, you could implement MPLS TE between the ToR switches, but first check that the DC switches support MPLS label switching.
  3. Use QSFP optics as they are more robust than SFP optics. This will lower the likelihood of one of the parallel links failing.
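The congestion risk described in recommendation 2 can be illustrated with a short calculation. The sketch assumes ECMP splits traffic equally across equal-cost uplinks regardless of their real capacity, and the 140 Gbps offered load is a hypothetical figure.

```python
def uplink_utilization(offered_gbps, uplink_capacities_gbps):
    """Utilization per uplink when ECMP splits the offered load equally."""
    share = offered_gbps / len(uplink_capacities_gbps)
    return [share / capacity for capacity in uplink_capacities_gbps]


# Four uplinks, each a 4 x 10G port channel (40 Gbps when healthy).
healthy = uplink_utilization(140, [40, 40, 40, 40])
# One member link fails on the last channel, but its routing cost is unchanged,
# so it still receives a full quarter of the traffic.
degraded = uplink_utilization(140, [40, 40, 40, 30])

print(healthy)   # [0.875, 0.875, 0.875, 0.875]
print(degraded)  # last entry is above 1.0, meaning the channel is congested
```

A value above 1.0 means the link is offered more traffic than it can carry, which is the situation a native 40G link avoids.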

 

C. Spine-leaf switch: Redundant layer 3 design

Spine-leaf switch: Design Summary

  1. The servers are dual-homed to two different switches.
  2. Servers have one IP address due to the restriction of TCP applications. Ideally, use LACP ( Link Aggregation Control Protocol ) between the servers and the switches.
  3. Layer 2 trunk links between the Leaf switches are needed to carry VLANs that span both switches. This restricts VLANs from spanning the core, thus avoiding a sizeable L2 fabric based on STP.
  4. ToR switches must be in the same subnets ( share the server’s subnet) and advertise this subnet into the fabric. Again, the servers are dual-homed to 2 switches with one IP address.

Spine-leaf switch: The challenges

The leaf switches both advertise the same subnet to the spine switches, so the spine switches think they have two paths to reach each host. A spine switch will therefore spread traffic destined for Host 1 across both Leaf switches connecting Host 1 and Host 2. In specific scenarios, this could result in traffic to the hosts traversing the inter-switch link between the leaf nodes. This may not be a problem if most traffic leaves the servers northbound ( traffic leaving the data center ), as in a web hosting server farm where most traffic flows out to external users. However, if there is a lot of inbound traffic, this link could become a bottleneck and congestion point.

Spine-leaf switch: Recommendation

  1. If there is a lot of east-to-west traffic ( 80 % ), using LAG ( Link Aggregation Group ) between the servers and ToR Leaf switches is mandatory.
  2. The two Leaf switches must support MLAG ( Multichassis Link Aggregation ). With MLAG on the Leaf switches, when either Leaf receives traffic destined for host X, it knows it can reach the host directly through its own connected link, resulting in optimal southbound traffic flow.
  3. Most LAG solutions place traffic generated from a single TCP session onto a single uplink, limiting the TCP session throughput to the bandwidth of a single uplink interface. However, Dynamic NIC teaming is available in Windows Server 2012 R2 which can split a single TCP session into multiple flows and distribute them across all uplinks.
  4. Use dynamic link aggregation – LACP and not static port channels. The LAGs between servers and switches should use LACP to prevent traffic blackholing.
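Recommendation 3 follows from how flow hashing usually works. The sketch below is a simplified stand-in: real NICs and switches use vendor-specific hash functions, and CRC32 over the 5-tuple is only an assumption chosen because it is deterministic and easy to follow.

```python
import zlib


def pick_uplink(src_ip, dst_ip, src_port, dst_port, protocol, uplinks):
    """Map a flow's 5-tuple to one uplink; the same flow always maps to the same link."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    return uplinks[zlib.crc32(key) % len(uplinks)]


uplinks = ["uplink-1", "uplink-2"]

# Every packet of one TCP session carries the same 5-tuple, so the session is
# pinned to a single uplink and cannot exceed that link's bandwidth.
first = pick_uplink("10.0.0.5", "10.0.1.9", 49152, 443, "tcp", uplinks)
again = pick_uplink("10.0.0.5", "10.0.1.9", 49152, 443, "tcp", uplinks)
print(first == again)  # True
```

Different sessions can land on different uplinks, so aggregate throughput still benefits from the LAG even though any single session does not.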




Key Spine Leaf Architecture Summary Points:

Main Checklist Points To Consider

  • The spine leaf architecture consists of a leaf layer and a spine layer. Endpoints connect to the leaf layer, and the spine switches act as the core.

  • This layout of the leaf and spine gives you optimal load balancing and ECMP for any endpoint in any location.

  • The traditional tree-based topologies are not suited for virtualization, and you will always be limited by the core port count.

  • The spine and leaf can build massive data centers with, for example, a folded 3-stage Clos design.

  • Cisco ACI is an example of a leaf and spine design. VXLAN is the most common overlay protocol that works over what is known as the underlay.

Recap on Spine and Leaf Architecture

Spine Switches:

Spine switches form the backbone of the network in a Spine Leaf architecture. They are high-performance switches that connect to every leaf switch in the network. The spine switches provide a non-blocking, high-bandwidth fabric for data transfer between leaf switches. They ensure data traffic flows seamlessly across the network, avoiding bottlenecks and congestion.

Leaf Switches:

Leaf switches are connected to the spine switches and act as the access layer in a Spine Leaf architecture. They connect end-user devices, servers, or other network devices to the spine switches. Leaf switches are responsible for forwarding traffic between devices within the same leaf and between different leaf switches. They offer a high degree of network flexibility and redundancy.

Benefits of Spine Leaf Architecture:

1. Scalability: Spine Leaf architecture allows for easy scalability as new leaf switches can be added without affecting the existing network. This scalability makes it ideal for growing businesses and organizations with expanding network requirements.

2. High Bandwidth: The architecture provides high bandwidth capacity by leveraging multiple spine switches. This efficiently handles heavy data traffic and ensures optimal network performance even during peak usage.

3. Low Latency: Spine Leaf architecture minimizes latency by eliminating multiple layers of network hierarchy. With fewer hops and shorter paths, data packets can be transmitted quickly, improving application response times.

4. Redundancy and Resilience: The architecture offers built-in redundancy and resilience. If a link or a switch fails, traffic can be automatically rerouted through alternate paths, ensuring uninterrupted network connectivity and minimizing downtime.

5. Enhanced Performance: Spine Leaf architecture improves overall network performance by evenly distributing traffic across multiple paths. This load-balancing capability optimizes resource utilization and prevents any single point of failure.

Spine Leaf Architecture

The spine-leaf architecture has only two layers of switches: spines and leaves. Spine switches form the spine layer, which performs routing and works as the network’s core. Access switches form the leaf layer and connect servers, storage devices, and other endpoints. A data center network with this structure has a lower hop count and lower network latency. Every leaf switch is connected to every spine switch, so there is never more than one intermediate switch between any two leaf switches, and any server can communicate with any other server.

Why Use Spine-leaf Architecture?

The spine-leaf architecture has become a popular data center architecture, bringing many advantages, including scalability and network performance. In five points, we summarize the benefits of spine-leaf architecture in modern networks.

– Enhanced redundancy: The spine-leaf architecture connects the servers with the core network, providing greater flexibility in hyperscale data centers. As a result, the leaf switch can serve as a bridge between the server and the core network. A sizeable non-blocking fabric is formed by connecting leaf switches to spine switches, increasing redundancy and reducing traffic bottlenecks.

– Enhanced bandwidth: The spine-leaf architecture can effectively avoid traffic congestion through protocols such as Transparent Interconnection of Lots of Links (TRILL) and Shortest Path Bridging (SPB). Adding uplinks to the spine switches increases inter-layer bandwidth and reduces oversubscription, which helps keep the network stable.

– Enhanced scalability: Multiple links carry traffic in the spine-leaf architecture, and capacity grows by adding switches rather than replacing them, which helps enterprises expand their networks in the future.

– Reduced expenses: Because spine-leaf architecture allows switches to handle more connections, data centers deploy fewer devices. A spine-leaf architecture minimizes costs in many data center networks.

– Increased Performance: With a maximum of only two hops between any two leaf switches (leaf to spine, then spine to leaf), traffic takes a more direct path, enhancing overall performance and reducing bottlenecks. The only exception is when the destination is on the same leaf switch, in which case traffic does not need to cross the spine at all.

leaf and spine

Spine and Leaf Popularity

Because of cloud computing and containerized infrastructure, east-west traffic is increasing in modern data centers. East-west traffic moves from server to server in a lateral fashion. Modern applications have components distributed across multiple servers or virtual machines, which partly explains this shift. When it comes to east-west traffic, low-latency, optimized flows are critical for applications that are time-sensitive or data-intensive. Spine-leaf architectures keep latency low and predictable by ensuring the hop count between any two endpoints is the same.

STP is also no longer needed, which increases usable capacity. Although STP can provide redundant paths between two switches, only one path can be active at a time, so the active paths become oversubscribed. Spine-leaf architectures instead use protocols such as Equal-Cost Multipath (ECMP) to load-balance traffic across all available paths. Topologies with spines and leaves improve scalability and performance. Capacity can be increased by adding spine switches and connecting them to each leaf. Likewise, new leaf switches can be seamlessly inserted if port density becomes an issue. “Scaling out” the infrastructure in this way does not require changing the existing design.

stp port states

Charles Clos – large-scale switching fabrics

Using Edson Erwin’s concept of building large-scale switching fabrics for telephone systems, Charles Clos (pronounced Klo) developed the Clos network, published in the Bell System Technical Journal in 1953. The original paper, “A Study of Non-Blocking Switching Networks,” is cited in hundreds of subsequent documents. In telephony systems, a Clos network consists of three stages, each with several crossbar switches. To reduce complexity and cost, stages were introduced in place of a single large crossbar, cutting the number of crosspoint interconnections needed to build large-scale crossbar-like functionality.

Crossbar switches are strictly non-blocking switches with n inputs and n outputs and interconnecting lines connecting inputs and outputs. For idle input and output lines, non-blocking means that connections can be made without interrupting other connected lines. A crossbar is fundamentally designed to accomplish this. In this case, the complexity of the crossbar switch is O(n²).
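To see why stages reduce the crosspoint count, here is a small worked comparison. It uses the standard result for a symmetric three-stage Clos network with r ingress switches of n inputs each, which is strictly non-blocking when the number of middle switches m is at least 2n − 1. The function names and the 36-port and 100-port examples are illustrative choices.

```python
def crossbar_crosspoints(ports):
    """A single N x N crossbar needs N * N crosspoints."""
    return ports * ports


def clos_crosspoints(n, r, m=None):
    """Crosspoints in a symmetric 3-stage Clos with r ingress/egress switches of n ports.

    Ingress: r switches of n x m, middle: m switches of r x r, egress: r switches of m x n.
    Defaults to the strictly non-blocking minimum m = 2n - 1.
    """
    if m is None:
        m = 2 * n - 1
    return 2 * r * n * m + m * r * r


# N = n * r total inputs.
print(crossbar_crosspoints(36), clos_crosspoints(n=6, r=6))      # 1296 1188
print(crossbar_crosspoints(100), clos_crosspoints(n=10, r=10))   # 10000 5700
```

The saving is modest at 36 ports and grows quickly with size, which is why the multi-stage approach pays off for large fabrics.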

Clos. Non Blocking

Data center topology

A spine-leaf architecture is a data center topology that consists of two switching layers. The leaf layer consists of access switches that aggregate traffic from endpoints, which could be traditional servers or containers, and connect directly to the spine, which is the network core. There are usually at least two spine switches for redundancy, and they interconnect all leaf switches in a full-mesh leaf and spine topology. With a spine and leaf data center network design, the leaf switches do not connect directly to each other.

The underlay and the overlay

eBGP, in this case, is used to exchange routing information between the nodes of the fabric through the underlay, which provides point-to-point Layer 3 interfaces between leafs and spines. Using eBGP to advertise the loopback addresses of VTEPs in the fabric (typically leaves), the underlay offers connectivity between the loopbacks.

In the overlay layer, packets are encapsulated in an outer IP header and transported from one VTEP to another using the data plane encapsulation layer. Source IP addresses are the loopbacks of the originating VTEPs, and destination IP addresses are the loopbacks of the terminating VTEPs.
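The encapsulation step can be sketched with the 8-byte VXLAN header defined in RFC 7348: one flags byte with the "VNI present" bit set, 24 reserved bits, a 24-bit VNI, and 8 more reserved bits. The class name, loopback addresses, and VNI below are hypothetical, and the sketch does not build the outer Ethernet, IP, or UDP headers.

```python
import struct
from dataclasses import dataclass

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # I flag: the VNI field is valid
    return struct.pack("!II", flags << 24, vni << 8)


@dataclass
class VxlanPacket:
    outer_src_ip: str  # loopback of the originating VTEP
    outer_dst_ip: str  # loopback of the terminating VTEP
    vni: int
    inner_frame: bytes  # the original Ethernet frame from the endpoint

    def udp_payload(self):
        """VXLAN header followed by the untouched inner frame."""
        return vxlan_header(self.vni) + self.inner_frame


pkt = VxlanPacket("10.255.0.1", "10.255.0.2", vni=5000, inner_frame=b"\x00" * 14)
print(vxlan_header(5000).hex())  # 0800000000138800
print(len(pkt.udp_payload()))    # 22
```

The underlay only ever routes on the outer addresses, which is why advertising the VTEP loopbacks through eBGP is enough to connect every overlay segment.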

Diagram: Multicast VXLAN

Example: Cisco ACI

In this design, all connectivity goes through the core, and the physical and logical layout is generally the same, based on network overlay protocols, more than likely VXLAN. An example of a data center fabric that uses such a design is Cisco ACI. Cisco ACI consists of three main components: the Application Policy Infrastructure Controller (APIC), the spine switches, and the leaf switches.

 

Summary: Spine Leaf Architecture

In data centers, efficiency and scalability are critical to ensuring optimal performance. One architectural design that has gained significant attention is the leaf and spine architecture. This blog post delved into the intricacies of leaf and spine architecture, exploring its benefits, components, and the future it holds for data centers.

Understanding Leaf and Spine Architecture

Leaf and spine architecture, also known as Clos architecture, is a network design approach that eliminates bottlenecks and enhances data center scalability. The architecture consists of two main components: leaf switches and spine switches. Leaf switches act as the access layer, connecting directly to servers, while spine switches serve as the backbone, interconnecting the leaf switches.

Benefits of Leaf and Spine Architecture

The leaf and spine architecture offers several advantages over traditional network designs. Firstly, it provides high bandwidth and low latency due to the non-blocking nature of the spine switches. This ensures smooth and efficient communication between servers. Additionally, the architecture allows for easy scalability, as new leaf switches can be added without impacting the existing network. This modular approach enables data centers to adapt to growing demands seamlessly.

Components of Leaf and Spine Architecture

To grasp the essence of leaf and spine architecture, it’s essential to understand its main components. Leaf switches connect servers within a rack, offering multiple high-speed ports. Spine switches, conversely, provide the interconnectivity between leaf switches, forming a fabric network. Leaf switches are commonly deployed as top-of-rack (ToR) switches, which adds flexibility and redundancy.

Future Trends and Innovations

As technology continues to evolve, leaf and spine architecture is poised to witness further advancements. With the rise of software-defined networking (SDN), data centers can achieve greater control and programmability in managing their network infrastructure. Moreover, integrating artificial intelligence (AI) and machine learning (ML) algorithms can optimize traffic flow and improve overall network performance.

Conclusion:

In conclusion, leaf and spine architecture has revolutionized how data centers are designed and operated. Its scalable and efficient nature brings numerous benefits, including high bandwidth, low latency, and easy expansion. As technology progresses, we can expect further innovations in this architectural approach, ensuring that data centers can meet the ever-growing digital age demands.

SDN Data Center

The world of technology consists of data centers that play a crucial role in storing and managing vast amounts of information. Traditional data centers, however, have faced challenges in terms of scalability, flexibility, and efficiency. Enter Software-Defined Networking (SDN), a groundbreaking approach reshaping the landscape of data centers. In this blog post, we will explore the concept of SDN, its benefits, and its potential to revolutionize data centers as we know them.

In SDN, the functions of network nodes (switches, routers, bare metal servers, etc.) are abstracted so they can be managed globally and coherently. A single controller, the SDN controller, manages the whole entity coherently by detaching the network device's decision-making part (control plane) from its operational part (data plane).

The name "Software Defined" comes from this controller, which enables "network programmability." In 2009, Stanford University (US) and its research center (ONRC) published the first OpenFlow specifications, one of the protocols used by SDN controllers. The Open Networking Foundation (ONF) was founded in March 2011 to promote the concept and the development of OpenFlow.

- Traditional data center networks often face challenges such as complex configurations, limited scalability, and lack of agility. SDN technology addresses these issues by introducing a software-based approach to network management. With SDN, data center operators can automate network provisioning, streamline operations, and achieve greater scalability. Moreover, SDN enables network virtualization, allowing multiple virtual networks to coexist on a shared physical infrastructure, leading to improved resource utilization.

- Security is a top priority for data centers, and SDN brings notable advancements in this domain. With its centralized control, SDN provides a holistic view of the network, enabling enhanced security policies and threat detection mechanisms. By dynamically allocating resources and isolating traffic, SDN mitigates potential security breaches. Additionally, SDN facilitates network resilience through features like automatic traffic rerouting, load balancing, and real-time network monitoring.

- The applications of SDN in data centers are vast and varied. One notable use case is network virtualization, which allows data center operators to create isolated virtual networks for different tenants or applications. This enhances resource allocation and provides better network performance. SDN also enables efficient load balancing across servers, optimizing resource utilization and improving application delivery. Furthermore, SDN facilitates the deployment of network services, such as firewalls and intrusion detection systems, in a more agile and scalable manner.

Highlights: SDN Data Center

What is SDN:

With SDN, the functions of network nodes (switches, routers, bare-metal servers, etc.) are abstracted, which allows them to be managed globally and coherently. An SDN controller manages the entire system by separating each device’s decision-making part (the control plane) from its forwarding part (the data plane). “Network programmability” is enabled by the SDN controller. Stanford University and its research center (ONRC) produced the first OpenFlow specifications in 2009 as a protocol for SDN controllers, and March 2011 saw the founding of the Open Networking Foundation (ONF), a non-profit organization dedicated to promoting and developing OpenFlow.

Why do we need it?

IT teams are responsible for building and managing IT infrastructure and applications, but they should also serve key business drivers for their organization, such as these:

  1. Affordability
  2. Growth
  3. Adaptability
  4. Ability to scale
  5. A secure environment. 

As we know, non-SDN networks in the data center space have many drawbacks and present many operational challenges to modern IT infrastructures. In addition to these challenges, organisations from diverse industries raised new demands for SDN.

Google Cloud Data Centers

### What is Google Network Connectivity Center?

Google Network Connectivity Center (NCC) is a comprehensive network management solution designed to unify and simplify the connectivity experience. It serves as a centralized hub for managing and orchestrating network connectivity, providing a holistic view of an organization’s network. By leveraging NCC, businesses can ensure efficient and secure data flow between their on-premises infrastructure, cloud environments, and remote locations.

### Key Features of NCC

#### Centralized Management

One of the standout features of NCC is its centralized management capability. It allows network administrators to monitor and control multiple network connections from a single interface. This centralization reduces complexity and enhances operational efficiency, making it easier to identify and resolve connectivity issues swiftly.

#### Automation and Orchestration

NCC integrates powerful automation and orchestration tools, which streamline network operations. Automated workflows can be configured to handle routine tasks, reducing the manual effort required and minimizing the risk of human error. This ensures that network operations remain consistent and reliable.

#### Enhanced Security

Security is a top priority for any network management solution, and NCC is no exception. It offers robust security features such as encryption, access control, and threat detection. These features help safeguard the integrity and confidentiality of data as it moves across different network segments.

**What Are Managed Instance Groups?**

Managed Instance Groups are a powerful feature of Google Cloud that allows you to manage a group of identical virtual machine (VM) instances. These groups are designed to provide automated, scalable, and resilient VM operations. By using templates, you can define configurations for your instances, ensuring consistency and control across your infrastructure. Whether you’re running a web application or a large-scale computational workload, MIGs can help you maintain optimal performance and availability.

**The Benefits of Using Managed Instance Groups**

One of the primary benefits of Managed Instance Groups is their ability to automatically scale your infrastructure based on demand. This means you can dynamically add or remove instances in response to traffic patterns, reducing costs during low-demand periods and ensuring capacity during peak times. Additionally, MIGs work with Google Cloud load balancing to distribute incoming traffic evenly across your instances, which enhances application reliability and performance.

**How to Set Up Managed Instance Groups on Google Cloud**

Setting up a Managed Instance Group in Google Cloud is straightforward. First, you’ll need to create an instance template, which specifies the machine type, image, and other instance properties. Then, you can create a Managed Instance Group using this template, defining parameters such as the number of instances and the scaling policy. Google Cloud provides an intuitive interface and comprehensive documentation to guide you through this process, making it accessible even for those new to cloud computing.

**Best Practices for Optimizing Managed Instance Groups**

To get the most out of your Managed Instance Groups, it’s essential to follow best practices. Start by defining clear scaling policies that align with your application’s needs. Regularly update your instance templates to incorporate the latest software updates and patches. Additionally, monitor your instance group’s performance using Google Cloud’s monitoring tools, allowing you to make data-driven decisions and optimize resource allocation.

Managed Instance Group

ESXi Host Client

### Getting Started with ESXi Host Client

Before diving into advanced functionalities, it’s essential to understand the basics. To access the ESXi Host Client, simply open a web browser and enter the IP address or hostname of your ESXi host. You’ll be prompted to log in with your administrative credentials. Once logged in, you’ll be greeted with a dashboard that provides an overview of your host’s status, including resource usage, active VMs, and important notifications.

### Key Features and Functionalities

The ESXi Host Client is packed with features designed to streamline your workflow. Some of the most notable functionalities include:

– **Virtual Machine Management**: Easily create, configure, and manage virtual machines directly from the host client. You can start, stop, and restart VMs, as well as monitor their performance and resource allocation.

– **Storage Management**: The client allows you to manage datastores, browse datastore files, and even upload ISO images needed for VM installations.

– **Networking Configuration**: Configure network settings such as vSwitches, port groups, and VLANs. This ensures your VMs have the necessary connectivity while maintaining network security.

– **Host Health Monitoring**: Keep an eye on your host’s health with real-time monitoring of CPU, memory, and storage usage. Receive alerts for any issues that might require your attention.

### Advanced Tips and Tricks

While the ESXi Host Client is user-friendly, there are several advanced tips and tricks that can enhance your experience:

– **Custom Dashboards**: Create custom dashboards tailored to your specific needs. This can help you monitor key metrics at a glance and quickly identify any potential issues.

– **Scripting and Automation**: Use the power of VMware’s APIs to automate routine tasks. This can save you time and reduce the risk of human error.

– **Snapshot Management**: Efficiently manage VM snapshots to ensure you always have a backup before making significant changes. This is particularly useful during software updates or configuration changes.

Understanding Container Networking Fundamentals

Container networking revolves around enabling communication between containers, as well as establishing connections with external networks. It involves various components such as virtual bridges, network namespaces, and IP routing. By understanding these fundamentals, developers and system administrators can harness the full potential of container networking to create robust and scalable applications.

Example IPv6: SDN Data Center 

OSPFv3, which stands for Open Shortest Path First version 3, is an enhanced version of OSPF designed specifically for IPv6 networks. It serves as a dynamic routing protocol that enables routers to exchange information and determine the most efficient paths for packet forwarding. Unlike its predecessor, OSPFv2, OSPFv3 fully supports the IPv6 addressing scheme, making it an essential component of modern network infrastructures.

One notable feature of OSPFv3 is its support for multiple address families, allowing for the simultaneous routing of IPv6, IPv4, and other address families. This flexibility is crucial in transitioning networks from IPv4 to IPv6 while ensuring backward compatibility. Furthermore, OSPFv3 utilizes link-local IPv6 addresses for neighbor discovery and communication, simplifying configuration and improving network scalability.
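As a small illustration of the link-local point, Python's standard `ipaddress` module can classify the addresses involved. This is only a sketch: the neighbor address is made up for the example, and `ff02::5` is the AllSPFRouters multicast group that OSPFv3 routers send hellos to.

```python
import ipaddress

def describe(addr):
    """Classify an IPv6 address into the categories that matter for OSPFv3."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        return "link-local"
    if ip.is_multicast:
        return "multicast"
    return "other"

print(describe("fe80::1"))      # hypothetical neighbor address
print(describe("ff02::5"))      # AllSPFRouters group
print(describe("2001:db8::1"))  # documentation prefix, a routable-style address
```

Because neighbors talk to each other over link-local addresses, the routable prefixes on a link can be renumbered without touching the adjacency itself.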

The Value of SDN

Alongside OpenFlow, software-defined networking (SDN) represents another paradigm shift. In the last few years, the idea of separating the data plane, which runs in hardware ASICs on network switches, from the control plane, which runs on a central controller, has gained traction. The aim is to standardize APIs such as OpenFlow that expose rich hardware functionality to the controller. For an entire data center cluster made up of different types of switches to be programmed uniformly to enforce a specific policy, SDN needs programmatic interfaces that all switch vendors support. At its simplest, the data plane becomes a set of “dumb” devices that merely program their hardware according to the controller’s directions.

SDN and OpenFlow

SDN Controllers

SDN controllers serve as the brains of an SDN data center. They are responsible for managing and orchestrating network traffic flow. Through a centralized control plane, SDN controllers provide a unified network view, allowing administrators to implement policies, configure devices, and monitor traffic. These controllers are the driving force behind the agility and programmability offered by SDN data centers.

OpenFlow Protocol

The OpenFlow protocol is at the heart of SDN data centers. It enables communication between the SDN controller and network devices such as switches and routers. By separating the control plane from the data plane, OpenFlow allows administrators to control network traffic flow directly, making it easier to implement dynamic and granular network policies. The protocol facilitates the flexibility and adaptability of SDN data centers.

SDN Switches

SDN switches play a crucial role in SDN data centers by forwarding network packets based on instructions received from the SDN controller. These switches are programmable and provide a level of intelligence that traditional switches lack. SDN switches can implement traffic engineering, Quality of Service (QoS) policies, and security measures. Their programmability and centralized management make SDN switches an integral part of SDN data centers.
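To make the controller and switch split more concrete, here is a minimal sketch of a match-action flow table in Python. It is a toy model, not the OpenFlow wire protocol: the `FlowRule` and `Switch` classes, the field names, and the choice to send table misses to the controller are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class FlowRule:
    priority: int
    match: dict   # header field -> required value; fields left out are wildcards
    action: str   # for example "output:2" or "drop"

@dataclass
class Switch:
    table: list = field(default_factory=list)

    def install(self, rule):
        """Called by the controller to program the switch."""
        self.table.append(rule)
        self.table.sort(key=lambda r: r.priority, reverse=True)

    def forward(self, packet):
        """Apply the highest-priority matching rule; on a miss, ask the controller."""
        for rule in self.table:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.action
        return "send_to_controller"

sw = Switch()
sw.install(FlowRule(10, {"eth_type": 0x0800}, "output:1"))
sw.install(FlowRule(100, {"eth_type": 0x0800, "ip_dst": "10.0.0.5"}, "output:2"))
sw.install(FlowRule(200, {"ip_src": "192.0.2.66"}, "drop"))

web = {"eth_type": 0x0800, "ip_src": "10.0.0.1", "ip_dst": "10.0.0.5"}
bad = {"eth_type": 0x0800, "ip_src": "192.0.2.66", "ip_dst": "10.0.0.5"}
arp = {"eth_type": 0x0806}
print(sw.forward(web), sw.forward(bad), sw.forward(arp))
```

The switch holds no policy of its own. Everything it does comes from rules the controller installed, which is the sense in which the data plane is "programmed" centrally.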

Network Virtualization

One of the critical advantages of SDN data centers is network virtualization. By abstracting the underlying physical network infrastructure, SDN enables the creation of virtual networks. These virtual networks can be customized, isolated, and securely provisioned, providing flexibility and scalability to meet the dynamic demands of modern applications. Network virtualization is a game-changer for SDN data centers, offering enhanced resource utilization and simplified network management.

Scalability

As server port density increased, data centers grew faster than traditional network designs could keep up. Limited MAC address table sizes, links left inactive, and poor support for transporting multicast streams all became constraints. Infrastructure growth became more than a “nice to have” as needs evolved. With SDN controllers and standardized off-the-shelf switches, adding new switches and pushing their configuration became quick and easy.

To maximize throughput, all links on a switch must be utilized. Local area networks have long relied on spanning tree, which disables some links to prevent loops. As server density grew, multipathing techniques such as Multi-Chassis EtherChannel (MEC) and ECMP (Equal Cost Multi-Path) with CLOS architectures were adopted so that every link can carry traffic.
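ECMP is easier to picture with a short sketch. The Python below hashes a flow's 5-tuple to pick one of several equal-cost next hops. The spine names, the flow tuples, and the use of SHA-256 are assumptions for illustration; real switches use vendor-specific hardware hash functions.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the flow's 5-tuple.

    Every packet of one flow hashes to the same path (no reordering),
    while different flows spread across all equal-cost links.
    """
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

spines = ["spine1", "spine2", "spine3", "spine4"]  # hypothetical uplinks from one leaf
# (src_ip, dst_ip, protocol, src_port, dst_port)
flows = [("10.0.0.1", "10.0.1.9", 6, 40000 + i, 443) for i in range(1000)]

usage = {spine: 0 for spine in spines}
for flow in flows:
    usage[ecmp_next_hop(flow, spines)] += 1
print(usage)
```

Unlike spanning tree, nothing is blocked here: all four uplinks carry traffic, and the split is per flow rather than per packet.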

Virtualization is one of the abstraction capabilities brought by SDN. Servers already used virtualization for compute and storage, and a similar virtualization movement took hold in the network industry, where multiple isolated virtual networks share the same physical infrastructure. SDN has since been developed in several variants at different layers.

Diagram: STP port states

CLOS-based architectures

In recent years, high-speed network switches have made CLOS-based architectures extremely popular. The CLOS topology has a simple rule: switches at tier x should only be connected to switches at tier x-1 and x+1 and never to other switches at the same tier. In this topology, redundancy provides high resilience, fault tolerance, and traffic load sharing. Because there are many redundant paths between any two switches, network resources can be utilized efficiently. CLOS-based architectures can be built with little or no oversubscription, and the resulting large bisection bandwidth benefits some applications. Additionally, the relatively simple topology removes the burden of maintaining separate core and aggregation layers inherent in traditional three-tier architectures, which makes traffic easier to troubleshoot.
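The tier rule lends itself to a quick check. The sketch below, with made-up switch names and tier numbers, flags any link that does not connect adjacent tiers.

```python
def clos_violations(tiers, links):
    """Return the links that break the CLOS rule.

    tiers: dict mapping switch name -> tier number
    links: iterable of (switch_a, switch_b) pairs
    A link is valid only when its two switches sit in adjacent tiers.
    """
    return [(a, b) for a, b in links if abs(tiers[a] - tiers[b]) != 1]

# Hypothetical two-tier leaf and spine fabric
tiers = {"leaf1": 1, "leaf2": 1, "spine1": 2, "spine2": 2}
links = [
    ("leaf1", "spine1"), ("leaf1", "spine2"),
    ("leaf2", "spine1"), ("leaf2", "spine2"),
]

print(clos_violations(tiers, links))                         # no violations
print(clos_violations(tiers, links + [("leaf1", "leaf2")]))  # leaf-to-leaf link is flagged
```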

Diagram: Spine and leaf architecture

Example Technology: Nexus and VPC

Understanding Nexus Virtual Port Channel

At its core, Nexus vPC is a feature that allows two Nexus switches to appear as a single logical entity. This logical entity enables the creation of redundancy, load balancing, and seamless failover mechanisms. Linking the switches together through a virtual port channel allows them to share the traffic load and act as a unified system. This technology eliminates the traditional limitations of spanning tree protocol and unlocks new levels of performance and resiliency.

The benefits of deploying Nexus vPC are manifold. First and foremost, it enhances network availability by providing active-active links between switches. In the event of a link failure, traffic seamlessly fails over to the remaining links, minimizing downtime. Additionally, vPC enables load balancing across the links, optimizing bandwidth utilization and improving overall network performance. This feature is particularly valuable in data centers with high traffic demands.

What problems do we have, and what are we doing about them? Ask yourself: Are data centers ready and available for today’s applications and tomorrow’s emerging data center applications? Businesses and applications are putting pressure on networks to change, ushering in a new era of data center design. From 1960 to 1985, we started with mainframes and supported a customer base of about one million users.

Example: ACI Cisco

Cisco ACI, short for Application Centric Infrastructure, is a software-defined networking (SDN) solution developed by Cisco Systems. It provides a holistic approach to managing and automating network infrastructure, allowing organizations to achieve agility, scalability, and security all in one framework.

Cisco ACI is a software-defined networking (SDN) solution that brings automation, scalability, and agility to network infrastructure. It combines physical and virtual elements, creating a unified and programmable network fabric that simplifies operations and accelerates application deployment. By abstracting network policies from the underlying infrastructure, Cisco ACI enables organizations to achieve policy-driven automation and policy-based security across the entire network.

Example Technology: BGP in the data center

Understanding BGP Multipath

BGP Multipath is a feature that enables the installation of multiple paths for the same destination prefix in the BGP routing table. Unlike traditional BGP, which only selects a single best path, BGP Multipath allows for the utilization of multiple paths simultaneously. This feature significantly enhances network resiliency, load balancing, and routing efficiency.

Load Balancing: BGP Multipath distributes traffic across multiple paths, preventing congestion on a single path and optimizing bandwidth utilization. This load-balancing mechanism enhances network performance and reduces bottlenecks.

Fault Tolerance: BGP Multipath increases network resilience and fault tolerance by providing redundancy. In a link failure or congestion, traffic can be seamlessly rerouted through alternative paths, ensuring uninterrupted connectivity.

Improved Convergence: BGP Multipath reduces convergence time by incorporating multiple paths into the routing decision process. This results in faster route selection and improved network responsiveness.
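The difference between single best path and multipath can be sketched in a few lines of Python. This is a deliberately simplified model: it compares only local preference and AS path length, whereas real BGP implementations evaluate more attributes before treating paths as equal. The `Path` class, the addresses, and the `max_paths` parameter are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int

def select_paths(candidates, max_paths=1):
    """Return the paths to install for one prefix.

    max_paths=1 models classic single best path selection;
    a larger value installs every path that ties with the best one.
    """
    def rank(path):
        return (-path.local_pref, path.as_path_len)  # lower tuple is better

    ordered = sorted(candidates, key=rank)
    best = rank(ordered[0])
    return [p for p in ordered if rank(p) == best][:max_paths]

candidates = [
    Path("10.1.1.1", 100, 2),
    Path("10.1.2.1", 100, 2),
    Path("10.1.3.1", 100, 3),  # longer AS path, never equal-cost with the others
]
print([p.next_hop for p in select_paths(candidates)])
print([p.next_hop for p in select_paths(candidates, max_paths=4)])
```

With multipath enabled, losing one next hop still leaves the other already installed in the forwarding table, which is where the fault tolerance and convergence benefits come from.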

Security in SDN Data Centers

Example Technology: Nexus and MAC ACLs

Understanding MAC ACLs

MAC ACLs, or Media Access Control Access Control Lists, are powerful tools that allow network administrators to filter traffic based on source or destination MAC addresses. By defining specific rules, administrators can permit or deny traffic at Layer 2 and enhance network security and performance.

Nexus 9000 MAC ACLs offer several advantages over traditional access control methods. Firstly, they provide granular control at the MAC address level, enabling administrators to restrict or allow access to specific devices. Additionally, MAC ACLs can be dynamically applied to VLANs, making them highly scalable and adaptable to evolving network environments.

Configuring MAC ACLs on the Nexus 9000 is straightforward. Administrators can define ACL rules using the command-line interface (CLI) or the graphical user interface (GUI). By specifying the MAC addresses, action (permit/deny), and optional parameters, administrators can create custom access control policies tailored to their network requirements.
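The logic of a MAC ACL, as opposed to its CLI syntax, can be sketched as a first-match rule list with an implicit deny at the end. The Python below is an illustrative model only: the rule format and the MAC addresses are made up, and it does not reproduce actual Nexus configuration commands.

```python
def normalize(mac):
    """Accept aa:bb:cc:dd:ee:ff, aa-bb-cc-dd-ee-ff, or aabb.ccdd.eeff notation."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")

def evaluate(acl, src, dst):
    """Return the action of the first matching rule; deny if nothing matches."""
    src, dst = normalize(src), normalize(dst)
    for action, rule_src, rule_dst in acl:
        src_ok = rule_src == "any" or normalize(rule_src) == src
        dst_ok = rule_dst == "any" or normalize(rule_dst) == dst
        if src_ok and dst_ok:
            return action
    return "deny"  # implicit deny at the end of every list

# Hypothetical policy: block one device, allow everything else
acl = [
    ("deny", "0050.56ab.0001", "any"),
    ("permit", "any", "any"),
]

print(evaluate(acl, "00:50:56:AB:00:01", "ff:ff:ff:ff:ff:ff"))
print(evaluate(acl, "00:50:56:ab:00:02", "ff:ff:ff:ff:ff:ff"))
```

Rule order matters: if the `permit any any` entry came first, the deny rule would never be reached.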

VXLAN Overlays

Scalability and Agility:

With the increasing demands of modern business applications, scalability and agility are paramount. Cisco ACI offers a highly scalable architecture that can adapt to changing network requirements. By leveraging a spine-leaf topology and VXLAN overlays, Cisco ACI provides a flexible and scalable foundation that can seamlessly grow to accommodate evolving business needs.

VXLAN, at its core, is an encapsulation protocol that enables the creation of virtualized networks over existing Layer 3 infrastructure. It extends Layer 2 segments over Layer 3 networks, facilitating scalable and flexible network virtualization. By using 24-bit VXLAN network identifiers (VNIs), it overcomes the limitations of traditional VLANs and allows a significantly larger number of virtual networks to coexist.

Benefits of VXLAN

-Enhanced Scalability and Flexibility: VXLAN addresses the limitations of VLANs, whose 12-bit ID field allows at most 4096 IDs. With VXLAN’s 24-bit network identifier (VNI), the pool of available IDs expands to roughly 16 million virtual networks. This scalability empowers organizations to meet the demands of modern applications and dynamic workloads.

-Improved Network Segmentation: VXLAN enables efficient network segmentation by isolating traffic within virtual networks. This segmentation enhances security, simplifies network management, and provides a more robust framework for multi-tenancy environments. By leveraging VXLAN, organizations can better control and isolate their network traffic.

-Seamless Network Extension and Migration: VXLAN facilitates seamless network extension and migration across data centers, campuses, or cloud environments. By encapsulating Layer 2 frames within Layer 3 packets, VXLAN enables the creation of virtual networks that span geographically dispersed locations. This capability simplifies workload mobility, disaster recovery, and data center consolidation efforts.
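To see where the extra ID space comes from, here is a short Python sketch that builds and parses the 8-byte VXLAN header as laid out in RFC 7348: 8 flag bits, 24 reserved bits, the 24-bit VNI, and 8 more reserved bits. The function names are made up for this example, and the sketch covers only the VXLAN header, not the outer UDP/IP encapsulation.

```python
import struct

VLAN_IDS = 2 ** 12     # 12-bit VLAN ID field
VXLAN_VNIS = 2 ** 24   # 24-bit VXLAN network identifier

def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < VXLAN_VNIS:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # "I" flag: the VNI field is valid
    return struct.pack("!II", flags << 24, vni << 8)

def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8

header = vxlan_header(5000)
print(VLAN_IDS, VXLAN_VNIS)
print(header.hex(), parse_vni(header))
```

The original Layer 2 frame follows this header inside a UDP datagram, which is how a segment can be stretched across a routed network.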

Example Technology: VXLAN Flood and Learn

The Basics of Flood and Learn

As the name suggests, VXLAN Flood and Learn involves flooding network traffic to learn MAC (Media Access Control) addresses. In traditional Ethernet networks, switches use MAC address tables to determine where to send incoming frames. In VXLAN environments, however, the MAC addresses of virtual machines and hosts keep moving between locations because of mobility and dynamic provisioning. Flood and Learn addresses this challenge by flooding frames with unknown destinations to every VXLAN tunnel endpoint (VTEP) in the segment, allowing each endpoint to learn from the data plane which MAC addresses sit behind which VTEP in each VXLAN.

VXLAN Flood and Learn offers several benefits and finds applications in various scenarios. One such application is in data center environments with virtualized networks. It enables seamless communication between virtual machines across different hosts without requiring manual MAC address configuration. VXLAN Flood and Learn also facilitates network mobility, making it suitable for dynamic workloads and cloud environments.
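The behavior is easier to follow in a toy simulation. The Python sketch below models three tunnel endpoints that flood frames for unknown destinations and learn source MAC addresses from whatever they receive. The `Vtep` class, the shortened MAC strings, and the VNI value are assumptions for illustration; a real deployment floods via multicast or head-end replication in the underlay.

```python
class Vtep:
    """Toy VXLAN tunnel endpoint that learns MAC locations from received traffic."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # flood list: every other endpoint in the segment
        self.mac_table = {}  # (vni, mac) -> endpoint that the MAC sits behind

    def send(self, vni, src_mac, dst_mac):
        """Forward a frame and return the names of the endpoints it was sent to."""
        known = self.mac_table.get((vni, dst_mac))
        targets = [known] if known is not None else list(self.peers)
        for vtep in targets:
            vtep.receive(vni, src_mac, sender=self)
        return [vtep.name for vtep in targets]

    def receive(self, vni, src_mac, sender):
        # Data-plane learning: remember where the source MAC came from.
        self.mac_table[(vni, src_mac)] = sender

a, b, c = Vtep("A"), Vtep("B"), Vtep("C")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]

first = a.send(100, "aa:aa", "bb:bb")   # destination unknown, so the frame is flooded
reply = b.send(100, "bb:bb", "aa:aa")   # B learned aa:aa from the flood
second = a.send(100, "aa:aa", "bb:bb")  # A learned bb:bb from the reply
print(first, reply, second)
```

Only the first frame is flooded. Once both sides have learned each other's location, traffic goes point to point, with no manual MAC configuration involved.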

Example: Software-defined data centers

To offer computing and network services to many clients, software-defined data centers (SDDCs) use virtualization technologies to separate hardware infrastructure into virtual machines. All computing, storage, and networking resources can be abstracted and represented as software in a virtualized data center. Anybody could access the data center resources if sold as a service.

SDDCs include software-defined networking (SDN) and virtual machines. In addition to Citrix, KVM, OpenDaylight, OpenStack, OpenFlow, Red Hat, and VMware, many other open and proprietary software platforms exist for virtualizing computing resources.

The advantage of SDDC is that clients do not have to build their infrastructure. They can meet their computing, networking, and storage needs by renting resources from the cloud. It is advantageous for software companies or service providers to have centralized data centers because they can serve many clients simultaneously. Hardware and storage costs are plummeting, a significant factor driving SDDC and cloud computing. Infrastructure as a Service (IaaS) becomes more economical as these resources become cheaper, making it more advantageous to build large data centers on a large scale.

Example: Open Networking Foundation

We also have the Open Networking Foundation (ONF), which leverages SDN principles, employs open-source platforms, and defines standards to build and operate open networking. The ONF’s portfolio includes several areas, such as mobile, broadband, and data centers running on white box hardware.

Recap on SDN Principles

SDN Defined:

SDN is an innovative approach to networking that separates the control plane from the data plane, providing a centralized and programmable network architecture. SDN enables dynamic and agile network management by decoupling network control and forwarding functions.

1. Centralized Control:

SDN leverages a central controller that acts as the brain of the network, making intelligent decisions about traffic forwarding, network policies, and resource allocation. This centralized control enhances network visibility and simplifies management tasks.

At its core, SDN centralized control refers to a network architecture in which a central controller governs the behavior of the entire network. Unlike traditional networking models, where intelligence is distributed across different network devices, SDN Centralized Control consolidates control into a single entity. This central controller acts as the brain of the network, making global decisions and orchestrating network flows.

SDN Centralized Control offers many advantages. First, it gives network administrators a holistic view of the entire network, simplifying management and troubleshooting processes. With a centralized controller, administrators can configure and monitor network devices from a single control point, saving time and effort.

2. Programmability:

One of the critical principles of SDN is its programmability. Network administrators can dynamically control and configure the network behavior by utilizing open interfaces and standard protocols like OpenFlow. This programmability empowers network operators to tailor the network to specific needs and applications.

SDN programmability is the ability to control and manipulate network behavior through software-based programming interfaces. It allows network administrators to dynamically configure and manage network resources, making networks more adaptable and responsive to changing business needs. By separating the control plane from the data plane, SDN programmability enables centralized management and control of network infrastructure, leading to simplified operations and increased efficiency.

SDN programmability empowers network administrators to respond to changing demands and quickly adapt network configurations. It allows for the creation of virtual networks, enabling the seamless segmentation and isolation of network traffic. This flexibility allows organizations to optimize network resources and support diverse applications and services.

Traditionally, scaling network infrastructure has been a complex and time-consuming task. SDN programmability simplifies the scaling process by automating the provisioning and deployment of network resources. This scalability ensures that network performance remains optimal even during peak usage periods.

3. Abstraction:

SDN abstracts the underlying network infrastructure, providing a simplified and logical view of the network. By abstracting complex network details, SDN enables higher-level automation, easier troubleshooting, and more efficient resource utilization.

SDN abstraction is the process of separating the underlying network infrastructure from the control logic that governs it. By abstracting the network resources, administrators can interact with the network at a higher level of abstraction, making it easier to manage and automate complex tasks. This abstraction layer provides a simplified, centralized network view independent of the underlying hardware and protocols.

SDN abstraction offers unprecedented flexibility by decoupling network control from the underlying infrastructure. It enables dynamic control and reconfiguration of network resources, allowing for rapid adaptation to changing requirements.

With SDN abstraction, complex network configurations can be managed through a single, intuitive interface. Administrators can define network policies and services without getting involved in the low-level details of network devices.

Abstraction simplifies network management, making it easier to scale the network infrastructure. By automating tasks and reducing the manual effort required, SDN abstraction improves operational efficiency and reduces the risk of human errors.

Google Cloud Data Centers

Understanding Network Tiers

Network tiers, in simple terms, are a hierarchical structure that categorizes the quality, performance, and cost of network connections. Google Cloud offers two main tiers: Premium Tier and Standard Tier. Let’s explore each tier in detail.

The Premium Tier is designed for businesses that demand the utmost in performance, reliability, and low latency. Leveraging Google’s vast global network infrastructure, the Premium Tier ensures optimized routing, reduced congestion, and enhanced end-user experience. Whether your application requires lightning-fast response times or handles mission-critical workloads, the Premium Tier is tailored to meet your needs.

For organizations seeking a cost-effective network solution without compromising on quality, the Standard Tier is an excellent choice. With competitive pricing, this tier offers reliable connectivity while prioritizing affordability. It serves as a viable option for applications that are less latency-sensitive or require less bandwidth.

Understanding VPC Peerings

VPC Peerings serve as a bridge between two VPC networks, allowing them to communicate as if they were part of the same network. It establishes a private and encrypted connection between VPC networks, ensuring data privacy and security. With VPC Peerings, you can extend your network’s reach, enabling collaboration and data sharing across different VPCs.

Enhanced Security: By utilizing VPC Peerings, you can establish secure connections between VPC networks without exposing your services to the public internet. This helps mitigate potential security risks and ensures your data remains protected.

Improved Performance: VPC Peerings enable low-latency and high-throughput communication between VPC networks. This allows for faster data transfer and reduces network bottlenecks, enhancing overall application performance.

Simplified Network Architecture: VPC Peerings eliminate the need for complex VPN configurations or costly dedicated connections. They simplify your network architecture by providing seamless connections and communication between VPC networks.

vCenter Server

**Seamless Management of Virtual Environments**

One of the most compelling features of vCenter Server is its ability to provide a single pane of glass for managing your entire virtual environment. This centralized control allows administrators to monitor resource allocation, optimize performance, and ensure high availability across multiple virtual machines (VMs). With vCenter Server, you can easily create, configure, and manage VMs, clusters, and data stores, ensuring that your infrastructure is always running smoothly.

**Enhanced Security and Compliance**

In today’s digital age, security is more critical than ever. vCenter Server includes robust security features designed to protect your virtual environment. From role-based access control (RBAC) to secure boot and encrypted vMotion, vCenter Server ensures that your data remains protected. Additionally, it offers compliance tools that help you adhere to industry standards and regulations, making it easier to pass audits and avoid potential fines.

**Automation and Orchestration**

Why spend countless hours on repetitive tasks when you can automate them? vCenter Server supports a variety of automation tools, including vRealize Orchestrator and PowerCLI, which allow you to script and automate routine operations. This not only saves time but also reduces the risk of human error, improving overall efficiency. With built-in automation features, you can schedule tasks such as VM provisioning, backups, and updates, freeing up your IT team to focus on more strategic initiatives.

**Scalability and Flexibility**

As your business grows, so does your need for a scalable and flexible IT infrastructure. vCenter Server is designed to scale seamlessly with your organization. Whether you’re managing a small cluster of VMs or an extensive data center, vCenter Server can handle it all. Its flexible architecture supports hybrid cloud environments, allowing you to extend your on-premises infrastructure to the cloud effortlessly. This scalability ensures that you can meet changing business demands without significant disruptions.

Related: Before you proceed, you may find the following post helpful:

  1. DNS Structure
  2. Data Center Network Design
  3. Software Defined Perimeter
  4. ACI Networks
  5. Layer 3 Data Center

SDN Data Center

The Future of Data Centers 

Exploring Software-Defined Networking (SDN)

In recent years, the rapid advancement of technology has given rise to various innovative solutions transforming how data centers operate. One such revolutionary technology is Software-Defined Networking (SDN), which has garnered significant attention and is set to reshape the landscape of data centers as we know them. In this blog post, we will delve into the fundamentals of SDN and explore its potential to revolutionize data center architecture.

SDN is a networking paradigm that separates the control plane from the data plane, enabling centralized control and programmability of network infrastructure. Unlike traditional network architectures, where network devices make independent decisions, SDN offers a centralized management approach, providing administrators with a holistic view and control over the entire network.

The Benefits of SDN in Data Centers

Enhanced Network Flexibility and Scalability:

SDN allows data center administrators to allocate network resources dynamically based on real-time demands. Scaling up or down becomes seamless with SDN, resulting in improved flexibility and agility. This capability is crucial in today’s data-driven environment, where rapid scalability is essential to meeting growing business demands.

Simplified Network Management:

SDN abstracts the complexity of network management by centralizing control and offering a unified view of the network. This simplification enables more efficient troubleshooting, faster service provisioning, and streamlined network management, ultimately reducing operational costs and increasing overall efficiency.

Increased Network Security:

By offering a centralized control plane, SDN enables administrators to implement stringent security policies consistently across the entire data center network. SDN’s programmability allows for dynamic security measures, such as traffic isolation and malware detection, making it easier to respond to emerging threats.

SDN and Network Virtualization:

SDN and network virtualization are closely intertwined, as SDN provides the foundation for implementing network virtualization in data centers. By decoupling network services from physical infrastructure, virtualization enables the creation of virtual networks that can be customized and provisioned on demand. SDN’s programmability further enhances network virtualization by allowing the rapid deployment and management of virtual networks.

Back to Basics: SDN Data Center

From 1985 to 2009, we moved to the personal computer, the client/server model, and the LAN/Internet model, supporting a customer base of hundreds of millions. From 2009 to 2020 and beyond, the industry has completely changed. We have various platforms (mobile, social, big data, and cloud) with billions of users, and it is estimated that the new IT industry will be worth 4.8T. All of these are forcing us to examine the existing data center topology.

SDN data center architecture is a type of architectural model that adds a level of abstraction to the functions of network nodes (switches, routers, bare metal servers, etc.) so that they can be managed globally and coherently. So, with an SDN topology, we have a central place to manage a disparate network of various devices and device types.

We will discuss the SDN topology in more detail shortly. At its core, SDN enables the entire network to be centrally controlled, or ‘programmed,’ using a software SDN application layer. The significant advantage of SDN is that it allows operators to manage the whole network consistently, regardless of the underlying network technology.

SDN Data Center

Statistics don’t lie.

The customer has changed and is making us change our data center topology. Content is expected to double over the next two years, and emerging markets may overtake mature markets. We expected 5,200 GB of data per person to be created in 2020. These new demands and trends put a lot of pressure on networks: the volume of content being created, and how we serve and control it, poses new challenges to data networks.

Knowledge check for the software-defined data center market

The software-defined data center market is considerable. In terms of revenue, it was estimated at $43.178 billion in 2020 and is projected to grow to $120.3 billion by 2025, representing a CAGR of 22.4%.

Knowledge Check for SDN data center architecture and SDN Topology.

Software Defined Networking (SDN) simplifies computer network management and operation. It is an approach to network management and architecture that enables administrators to manage network services centrally using software-defined policies. In addition, the SDN data center architecture enables greater visibility and control over the network by separating the control plane from the data plane. By managing networks centrally, administrators can control routing, traffic management, and security. With global visibility, they can create and manage network policies in one place and quickly apply them to all devices.

The Value: SDN Topology

An SDN topology separates the control plane from the data plane that runs on the physical network devices. This allows for better network management and configuration flexibility, and a centrally configured control plane can make the network more efficient and scalable.

The SDN topology has three layers: the control plane, the data plane, and the physical network. The control plane controls the data plane, which carries the data packets. It is also responsible for setting up virtual networks, configuring network devices, and managing the overall SDN topology.

A personal network impact assessment report

I recently approved a network impact assessment for various data center network topologies. One of my customers was looking at rate-limiting data transfer over the WAN (Wide Area Network) to 9.5 Mbps, moving 34 GB of data within a 10-hour off-peak window. Due to application and service changes, this customer plans to triple that volume over the next 12 months.

This results in a WAN upgrade and a change in the scope of DR (Disaster Recovery). Big Data, Applications, Social Media, and Mobility force architects to rethink how they engineer networks. We should concentrate more on scale, agility, analytics, and management.
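A quick back-of-the-envelope calculation shows why tripling the volume forces the upgrade. The sketch below uses the figures quoted in the assessment (34 GB, a 9.5 Mbps rate limit, a 10-hour window) and assumes decimal units (1 GB = 8,000 megabits) with no protocol overhead, so the real numbers would be somewhat worse.

```python
def transfer_hours(gigabytes, mbps):
    """Hours needed to move a volume of data at a given rate limit."""
    megabits = gigabytes * 1000 * 8  # decimal units, no protocol overhead
    return megabits / mbps / 3600

def required_mbps(gigabytes, window_hours):
    """Rate needed to move a volume of data inside a fixed time window."""
    return gigabytes * 1000 * 8 / (window_hours * 3600)

today = transfer_hours(34, 9.5)        # current volume at the current rate limit
tripled = transfer_hours(34 * 3, 9.5)  # planned volume at the current rate limit
needed = required_mbps(34 * 3, 10)     # rate needed to stay inside the 10-hour window

print(f"today: {today:.2f} h, tripled: {tripled:.2f} h, needed: {needed:.2f} Mbps")
```

Today's transfer fits in roughly 8 hours, but the tripled volume would take almost a full day at the same rate, so staying inside the 10-hour window needs well over twice the current bandwidth.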

SDN Data Center Architecture: The 80/20 traffic rule

The data center design was based on the 80/20 traffic pattern rule with Spanning Tree Protocol (802.1D), where we have a root and all bridges build a loop-free path to that root. This leaves half of the ports forwarding and half in a blocking state, which wastes bandwidth, even though we can load balance by forwarding one set of VLANs on one uplink and another set of VLANs on the secondary uplink.

We still face the problems and scalability limits of having large Layer 2 domains in the data center design. Spanning tree is not a routing protocol; it’s a loop prevention protocol, and because its failures can have disastrous consequences, it should be limited to small data center segments.

SDN Data Center

Data Center Stability


  • Layer 2 to the Core layer
  • STP blocks redundant links
  • Manual pruning of VLANs for redundancy design
  • Rely on STP convergence for topology changes
  • Efficient and stable design

Data Center Topology: The Shifting Traffic Patterns

The traffic patterns have shifted, and the architecture needs to adapt. Before, we focused on the 80% of traffic leaving the DC, while now much of the traffic flows east to west and stays within the DC. The original traffic pattern led us to a typical data center design with access, distribution, and core layers, based on Layer 2 leading to Layer 3 transport. The “route where you can” approach was adopted because Layer 3 adds stability to Layer 2 by controlling broadcast and flooding domains.

The most popular data center architecture in deployment today is based on very different requirements, and the business is looking for large Layer 2 domains to support functions such as vMotion. We need to meet the challenge of future data center applications, and as new apps come out with unique requirements, it isn’t easy to make adequate changes to the network due to the protocol stack used. One way to overcome this is with overlay networking and VXLAN.

Overlay networking
Diagram: Overlay Networking with VXLAN

The Issues with Spanning Tree

The problem is that we rely on spanning tree, which was useful in its day but is now past its prime. The original author of spanning tree is now an author of TRILL (a replacement for STP). STP (Spanning Tree Protocol) was never a routing protocol that determines the best path; it was designed to provide a loop-free path. STP is also a fail-open protocol (as opposed to a Layer 3 protocol, which fails closed).

STP Path distribution

One of spanning tree's most significant weaknesses is that it fails open. If a port does not receive a BPDU (Bridge Protocol Data Unit), the switch assumes it is not connected to another switch and starts forwarding on that port. Combining a fail-open paradigm with a flooding paradigm can be disastrous.

STP vs. Routing: Blocking Links

Next, let's look at the Spanning Tree Protocol on a network of three switches. STP is there to help, but it blocks specific ports, either based on the default configuration or because the administrator forces traffic to go a certain way. Either way, you lose bandwidth. It is easy to demonstrate this with the three switches in the diagram. You would want all of these links in a forwarding state, but with STP, one of the links is blocked to prevent loops.

Since spanning tree is enabled, all our switches will send a special frame to each other called a BPDU (Bridge Protocol Data Unit). Spanning tree needs two pieces of information from this BPDU: the MAC address and the priority. Together, the MAC address and priority make up the bridge ID.

The spanning tree requires the bridge ID for its calculation. Let me explain how it works:

  • First, a spanning tree will elect a root bridge; this root bridge will have the best “bridge ID.”
  • The switch with the lowest bridge ID is the best one.
  • The priority is 32768 by default, but we can change this value.
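The election rules above can be sketched in a few lines of Python. This is a minimal illustration, not a protocol implementation: the `BridgeID` type, the `elect_root` helper, and the MAC addresses are all made up for the example. The only facts taken from the text are that the bridge ID combines priority and MAC address, that the lowest bridge ID wins, and that the default priority is 32768.

```python
from typing import NamedTuple


class BridgeID(NamedTuple):
    priority: int  # 32768 by default unless the administrator changes it
    mac: str       # switch MAC address as colon-separated hex (hypothetical values below)

    def key(self) -> tuple[int, int]:
        # Priority is compared first; the MAC address is the tiebreaker.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges: dict[str, BridgeID]) -> str:
    """Return the name of the switch with the lowest (best) bridge ID."""
    return min(bridges, key=lambda name: bridges[name].key())


# Three switches with the default priority; SW1 has the lowest MAC address.
switches = {
    "SW1": BridgeID(32768, "00:00:00:00:00:0a"),
    "SW2": BridgeID(32768, "00:00:00:00:00:0b"),
    "SW3": BridgeID(32768, "00:00:00:00:00:0c"),
}
print(elect_root(switches))  # SW1
```

If an administrator lowers the priority on another switch (for example, setting SW3 to 4096), that switch wins the election regardless of its MAC address, because priority is compared before the MAC.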

Spanning Tree Root Switch

So, who will become the root bridge? In our example, SW1 will become the root bridge! The bridge ID is made up of priority and MAC address. Since all switches have the same priority, the MAC address will be the tiebreaker. SW1 has the lowest MAC address, thus the best bridge ID, and will become the root bridge. The ports on our root bridge are always designated, which means they are forwarding. 

Above, you see that SW1 has been elected as the root bridge, and the “D” on the interfaces stands for designated.

Now that we have agreed on the root bridge, the next step for all our “non-root” bridges (that is, every switch that is not the root) is to find the shortest path to the root bridge. The port on each non-root bridge with the shortest path to the root bridge is called the “root port.” Take a look at my example:

stp port states

VPC for Nexus Data Centers

Port States:

 If you have played with some Cisco switches before, you might have noticed that every time you plugged in a cable, the LED above the interface was orange and, after a while, became green. What is happening at this moment is that the spanning tree is determining the state of the interface; this is what happens as soon as you plug in a cable:

  • The port is in listening mode for 15 seconds. In this phase, it will receive and send BPDUs but not learn MAC addresses or transmit data.
  • The port is in learning mode for 15 seconds.  We are still sending and receiving BPDUs, but now the switch will also learn MAC addresses. There is still no data transmission, though.
  • Now we go into forwarding mode, and finally, we can transmit data!
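The three phases above can be modeled as a small timeline. The sketch below follows only the states and the 15-second timers described in this section; other STP states and timers are left out, and the names `PortState`, `state_at`, and `time_to_forwarding` are invented for the illustration.

```python
from typing import NamedTuple, Optional


class PortState(NamedTuple):
    name: str
    duration: Optional[int]  # seconds; None means the port stays in this state
    handles_bpdus: bool
    learns_macs: bool
    forwards_data: bool


# The sequence described above, after a cable is plugged in.
STATES = [
    PortState("listening", 15, True, False, False),
    PortState("learning", 15, True, True, False),
    PortState("forwarding", None, True, True, True),
]


def state_at(seconds_since_link_up: float) -> PortState:
    """Return the state the port is in a given number of seconds after link-up."""
    elapsed = 0
    for state in STATES:
        if state.duration is None or seconds_since_link_up < elapsed + state.duration:
            return state
        elapsed += state.duration
    return STATES[-1]


def time_to_forwarding() -> int:
    """Seconds before the port can transmit data."""
    return sum(s.duration for s in STATES if s.duration is not None)


print(state_at(20).name)     # learning
print(time_to_forwarding())  # 30
```

The 30-second total is the time the LED stays orange in the description above: no user data flows until both timers have expired.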

How does this compare to routing? With layer 3, we have a TTL, meaning we can stop loops as long as there is no complicated route redistribution at different points in the network topology. Let’s look at the following example, which uses RIP.

RIP is a distance vector routing protocol and the simplest one. We’ll start by paying attention to the distance vector class. What does the name distance vector mean?

    • Distance: How far away? In the routing world, we use metrics.
    • Vector: Which direction? In the routing world, we care about which interface and the next router’s IP address to send the packet to.

Notice below we are not blocking ports. Instead, we are load balancing.

RIP load balancing
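To make the contrast with STP concrete, here is a minimal sketch of how a distance vector table can keep more than one path. It assumes hop count as the metric (as RIP uses) and keeps every route that ties for the lowest metric. The `Route` type, the `best_paths` helper, and all addresses and interface names are hypothetical.

```python
from typing import NamedTuple


class Route(NamedTuple):
    prefix: str     # destination network
    next_hop: str   # vector: next router's IP address
    interface: str  # vector: outgoing interface
    hops: int       # distance: the metric (hop count)


def best_paths(routes: list[Route], prefix: str) -> list[Route]:
    """Return every route to the prefix that ties for the lowest hop count."""
    candidates = [r for r in routes if r.prefix == prefix]
    if not candidates:
        return []
    best = min(r.hops for r in candidates)
    return [r for r in candidates if r.hops == best]


# Hypothetical table: two equal-cost paths and one longer path.
table = [
    Route("10.0.3.0/24", "192.168.12.2", "Gi0/1", 2),
    Route("10.0.3.0/24", "192.168.13.3", "Gi0/2", 2),
    Route("10.0.3.0/24", "192.168.14.4", "Gi0/3", 3),
]
for route in best_paths(table, "10.0.3.0/24"):
    print(route.interface, route.next_hop, route.hops)
```

Both two-hop routes are installed and used, so neither link sits idle. The three-hop route is simply not selected; nothing is put into a blocking state.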

Analysis:

Load-sharing between packets or destinations (actually source/destination IP address pairs) is supported by Cisco Express Forwarding (CEF) without performance degradation (without CEF, per-packet load-sharing requires process switching). Even though there is no performance impact on the router, per-packet load sharing almost always results in out-of-order packets. Packet reordering can reduce TCP throughput in high-speed environments (although per-packet load-sharing improves per-flow throughput in low-speed/few-flow scenarios), and applications that cannot survive out-of-order packet delivery, for example Fast Sequenced Transport for SNA over IP or voice/video streams, may suffer.

Use the ip load-sharing per-packet interface configuration command to configure per-packet load-sharing (the default is per destination). This command must be configured on all outgoing interfaces where traffic is load-shared.
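The difference between the two modes can be sketched as follows. This is an illustration of the idea only: it does not reproduce CEF's actual hashing, and the link names, the CRC32 hash, and both helper functions are assumptions made for the example.

```python
import itertools
import zlib
from typing import Iterator

LINKS = ["uplink-A", "uplink-B"]  # hypothetical equal-cost outgoing links


def per_destination(src: str, dst: str, links: list[str] = LINKS) -> str:
    """Pick a link from a hash of the source/destination pair.

    Every packet of the same pair takes the same link, so ordering is kept.
    """
    key = f"{src}->{dst}".encode()
    return links[zlib.crc32(key) % len(links)]


def per_packet(links: list[str] = LINKS) -> Iterator[str]:
    """Rotate through the links packet by packet (round robin).

    Consecutive packets of one flow use different links and may arrive
    out of order.
    """
    return itertools.cycle(links)


# Five packets of one flow, per destination: always the same link.
print({per_destination("10.0.1.10", "10.0.3.20") for _ in range(5)})

# Four packets, per packet: the links alternate.
sharer = per_packet()
print([next(sharer) for _ in range(4)])
```

Per-destination sharing spreads load only when there are many address pairs; per-packet sharing spreads it evenly but at the cost of the reordering described above.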

STP has a bad reputation

STP, in theory, prevents bridging loops. Many reasons contribute to STP’s lousy reputation in practice.

If you prefer plug-and-pray networking over proper routing protocols, you must accept blocked links as a design choice. There is little we can do in this situation. To use alternate paths, you need an appropriate routing protocol, regardless of whether you’re routing on layer 2 (TRILL, SPB) or layer 3 (IP). Forward-by-default behavior is one of the main problems with STP: all links forward traffic until BPDUs block some of them.

A forwarding loop is almost certain to occur if a device drops BPDUs or if a switch loses its control plane (for example, due to a memory leak).

Design a Scalable Data Center Topology

To overcome the limitation, some are now trying to route (Layer 3) all the way to the access layer. This has its own problems, as some applications require L2 to function, e.g., clustering and stateful devices. However, people still like Layer 3 because routing brings stability. You have an actual path-based routing protocol managing the network, not a loop prevention protocol like STP. Routing also doesn’t fail open, and it limits loops with the TTL (Time to Live) field in the IP header.
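The effect of the TTL field can be shown with a toy simulation. The sketch below assumes a forwarding loop already exists and counts how many times a packet is forwarded before it is dropped. Passing `ttl=None` models an Ethernet frame, which has no hop limit. The function name and the `max_steps` safety cap are invented for the example.

```python
from typing import Optional


def forwards_in_loop(ttl: Optional[int], max_steps: int = 1000) -> int:
    """Count how often a packet is forwarded while stuck in a loop.

    Each router decrements the TTL and drops the packet when it reaches zero.
    With ttl=None (a Layer 2 frame) nothing ever stops the loop, so the
    simulation only ends at the max_steps safety cap.
    """
    forwards = 0
    while forwards < max_steps:
        if ttl is not None:
            ttl -= 1
            if ttl <= 0:
                return forwards  # packet discarded, loop ends
        forwards += 1
    return forwards


print(forwards_in_loop(ttl=64))    # 63
print(forwards_in_loop(ttl=None))  # 1000 (stopped only by the cap)
```

A routed packet dies after a bounded number of hops, while a looping frame keeps circulating, and with flooding it is also copied at every switch.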

Convergence around a failure is quick, which improves stability. We also have ECMP (Equal Cost Multi-Path) to help with scaling, which translates well to scale-out topologies. This allows the network to grow at a lower cost. Scale-out is better than scale-up.

Whether your network is small or large, a routed network has clear advantages over a Layer 2 network. However, how we interface with the network is also cumbersome, and it is estimated that 70% of network failures are due to human error. The risk of changing the production network leads to cautious changes, slowing processes to a crawl.

In summary, the problems we have faced so far:

STP-based Layer 2 has stability challenges; it fails open. Traditional bridging is controlled flooding, not forwarding, so it shouldn’t be considered as stable as a routing protocol. Some applications require Layer 2, but people still prefer Layer 3. The network infrastructure must be flexible enough to adapt to new applications/services, legacy applications/services, and organizational structures.

There is never enough bandwidth, and we cannot predict future application-driven requirements, so a better solution would be to have a flexible network infrastructure. Inflexibility slows down the deployment of new services and applications and restricts innovation.

The infrastructure needs to be flexible for the data center applications, not the other way around. It must also be agile enough not to be a bottleneck or barrier to deployment and innovation.

What are the new options moving forward?

Layer 2 fabrics (the open standard TRILL) change how the network works and enable a large routed Layer 2 network. A Layer 2 fabric, for example Cisco FabricPath, is Layer 2, but it acts more like Layer 3 because the topology is managed by a routing protocol. As a result, there is improved stability and faster convergence. It can also support massive scale-out (up to 32 load-balanced forwarding paths versus a single forwarding path with Spanning Tree).

VXLAN: Overlay networking

What is VXLAN?

Suppose you already have a Layer 3 core and must support Layer 2 end to end. In that case, you could go for an encapsulated overlay (VXLAN, NVGRE, STT, or a design with generic routing encapsulation). You have the stability of a Layer 3 core and the familiarity of Layer 2, and you can service Layer 2 end to end. VXLAN, for example, uses the outer UDP source port as entropy so that the core can load balance flows. Depending on the design option, it builds an L2 tunnel over an L3 core.
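As a rough illustration of the VXLAN case, the sketch below builds the 8-byte VXLAN header and derives an outer UDP source port from the inner flow. The header layout (a flags byte with the VNI-valid bit, a 24-bit VNI, reserved fields) and destination port 4789 follow RFC 7348. The CRC32 hash and the function names are assumptions for the example; real implementations choose their own hash.

```python
import struct
import zlib

VXLAN_UDP_DST_PORT = 4789    # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def entropy_source_port(inner_flow: bytes) -> int:
    """Derive the outer UDP source port from a hash of the inner flow.

    Different inner flows get different source ports, so ECMP hashing in the
    Layer 3 core can spread them over several paths. The result stays in the
    dynamic/private port range 49152-65535.
    """
    return 49152 + zlib.crc32(inner_flow) % 16384


header = vxlan_header(5000)
port = entropy_source_port(b"00:50:56:aa:bb:01->00:50:56:aa:bb:02")
print(header.hex(), port, VXLAN_UDP_DST_PORT)
```

Because the source port is computed from the inner frame, all packets of one inner flow follow one path through the core, while separate flows between the same pair of tunnel endpoints can take different paths.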

Example: Encrypted GRE with IPsec

Understanding Encrypted GRE

GRE, or Generic Routing Encapsulation, is a network protocol commonly used to encapsulate and transport different network layer protocols over an IP network. It provides a virtual point-to-point connection, allowing the transmission of data between different sites or networks. However, without encryption, the data transmitted through GRE is vulnerable to interception and unauthorized access. This is where encrypted GRE with IPSec comes into play.

IPSec, or Internet Protocol Security, is a suite of protocols used to secure IP communications by authenticating and encrypting the data packets. It provides a secure tunnel between two endpoints, ensuring the transmitted data’s confidentiality, integrity, and authenticity. By combining IPSec with GRE, organizations can create a safe and private communication channel over an untrusted network.
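Encapsulation has a cost in packet size, which the following sketch makes explicit. It assumes a basic 4-byte GRE header (no checksum, key, or sequence number) carried in an outer IPv4 header without options. The IPsec overhead depends on the mode and algorithms in use, so it is left as a caller-supplied parameter rather than a fixed number. The function names are invented for the example.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
OUTER_IPV4_HEADER = 20  # bytes, assuming no IP options
GRE_BASE_HEADER = 4     # bytes, assuming no optional GRE fields


def gre_header(protocol_type: int = ETHERTYPE_IPV4) -> bytes:
    """Build a basic GRE header: 16 bits of flags/version (all zero)
    followed by the EtherType of the encapsulated protocol."""
    return struct.pack("!HH", 0, protocol_type)


def tunnel_payload_mtu(link_mtu: int, ipsec_overhead: int = 0) -> int:
    """Largest inner packet that fits without fragmentation.

    ipsec_overhead is an assumption the caller must supply; it varies with
    the IPsec mode and the encryption and authentication algorithms chosen.
    """
    return link_mtu - OUTER_IPV4_HEADER - GRE_BASE_HEADER - ipsec_overhead


print(gre_header().hex())        # 00000800
print(tunnel_payload_mtu(1500))  # 1476
```

Adding IPsec on top of GRE reduces the usable payload further, which is why tunnel MTU is usually tuned when the two are combined.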

a. Enhanced Data Privacy: With encrypted GRE and IPSec, organizations can ensure the privacy of their data while transmitting it over public or untrusted networks. The encryption algorithms used in IPSec provide high security, making it extremely difficult for unauthorized parties to decipher the transmitted information.

b. Secure Communication: Encrypted GRE with IPSec establishes a secure tunnel between endpoints, protecting the integrity of the data. It prevents tampering, replay attacks, and other malicious activities, ensuring the information reaches its destination without any unauthorized modifications.

c. Flexibility and Compatibility: Encrypted GRE with IPSec can be implemented across various network environments, making it a versatile solution. It is compatible with different operating systems, routers, and firewalls, allowing organizations to integrate it seamlessly into their existing network infrastructure.

Diagram: GRE with IPsec

Back to VXLAN

A use case for this is two devices that need to exchange state at L2 or that require VMotion. VMs cannot migrate across L3 boundaries, as they need to stay in the same VLAN to keep their TCP sessions intact. Software-defined networking is changing the way we interact with the network.

It provides faster deployment, improved control, and more direct application and service integration. With a centralized controller, you can view this as a policy-focused network.

Many prominent vendors will push converged infrastructure (server, storage, networking, and centralized management), all from one vendor, closely linking hardware and software (HP, Dell, Oracle). Other vendors will offer a software-defined data center in which physical hardware is virtualized, centrally managed, and treated as abstracted resource pools that can be dynamically provisioned and configured (Microsoft).

Summary: SDN Data Center

In the dynamic landscape of technology, data centers play a crucial role in storing, processing, and delivering digital information. Traditional data centers have limitations, but the emergence of Software-Defined Networking (SDN) has revolutionized how data centers operate. In this blog post, we delved into the world of SDN data centers, exploring their benefits, key components, and potential implications.

Understanding SDN

SDN, in essence, separates the control plane from the data plane, enabling centralized network management through software. Unlike traditional networks, where network devices make individual decisions, SDN allows for a more programmable and flexible infrastructure. By abstracting the network’s control, SDN empowers administrators to manage and orchestrate their data centers dynamically.
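The separation of control plane and data plane can be pictured with a toy model. In the sketch below, the switches only look up entries in a flow table, while a central controller decides what those entries are and pushes them out. The class names, the policy format, and the "send-to-controller" default are all invented for the illustration and do not correspond to any particular SDN product or protocol.

```python
class Switch:
    """Data plane only: forwards according to whatever flow table it was given."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flow_table: dict[str, str] = {}  # destination -> output port

    def forward(self, destination: str) -> str:
        # Unknown destinations are handed to the controller for a decision.
        return self.flow_table.get(destination, "send-to-controller")


class Controller:
    """Control plane: holds the network-wide policy and programs every switch."""

    def __init__(self) -> None:
        self.switches: dict[str, Switch] = {}

    def register(self, switch: Switch) -> None:
        self.switches[switch.name] = switch

    def push_policy(self, policy: dict[str, dict[str, str]]) -> None:
        # policy maps switch name -> {destination: output port}
        for name, flows in policy.items():
            self.switches[name].flow_table.update(flows)


controller = Controller()
leaf1, leaf2 = Switch("leaf1"), Switch("leaf2")
controller.register(leaf1)
controller.register(leaf2)

controller.push_policy({
    "leaf1": {"10.0.2.0/24": "port-2"},
    "leaf2": {"10.0.1.0/24": "port-1"},
})
print(leaf1.forward("10.0.2.0/24"))  # port-2
print(leaf1.forward("10.0.9.0/24"))  # send-to-controller
```

A policy change is made once, at the controller, instead of being configured device by device. The same design also shows why the controller's availability matters.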

Key Components of SDN Data Centers

It is crucial to grasp the critical components of SDN data centers to comprehend their inner workings. The SDN architecture comprises three fundamental elements: the Application Layer, Control Layer, and Infrastructure Layer. The Application Layer houses the software applications that utilize the network services, while the Control Layer handles network-wide decisions and policies. Lastly, the Infrastructure Layer comprises the physical and virtual network devices that forward data packets.

Advantages of SDN Data Centers

The adoption of SDN in data centers brings forth a myriad of advantages. Firstly, SDN enables network programmability, allowing administrators to configure and manage their networks through software interfaces. This flexibility reduces manual configuration efforts and enhances overall efficiency. Secondly, SDN data centers boast improved scalability, as the centralized control plane simplifies network expansion and resource allocation. Additionally, SDN enhances network security by enabling fine-grained control and real-time threat detection.

Potential Implications and Challenges

While SDN data centers offer numerous benefits, addressing potential implications and challenges is crucial. One concern is the potential risk of a single point of failure in the centralized control plane. Network disruptions or software vulnerabilities could significantly impact the entire data center. Moreover, transitioning from traditional networks to SDN requires careful planning, as it involves reconfiguring the existing infrastructure and training network administrators to adapt to the new paradigm.

Conclusion:

In conclusion, Software-Defined Networking (SDN) has paved the way for a new era of data centers. By separating the control and data planes, SDN empowers administrators to manage their networks programmatically, leading to enhanced flexibility, scalability, and security. Despite the challenges and potential implications, SDN data centers hold immense potential for transforming the way we architect and operate modern data centers.