data center

Data Center – Site Selection | Content routing

 

 

Data Center Site Selection

In today’s digital age, data centers play a vital role in supporting the ever-growing demand for data storage and processing. As businesses increasingly rely on technology to operate efficiently, the importance of selecting the right location for a data center cannot be overstated. Data center site selection involves careful consideration of various factors to ensure optimal performance, reliability, and cost-effectiveness.

 

  • Routing to a data center

Let us address how users get routed to a data center. There are several data center site selection criteria, or even a data center site selection checklist, that you can follow to make sure your users take the most optimal path and to limit sub-optimal routing. Distributed workloads with a multi-site architecture raise several questions regarding the methods for site selection, path optimization for ingress/egress flows, and data replication (synchronous/asynchronous) for storage.

  • Distributing the load

Furthermore, once the content is distributed to multiple data centers, you need to manage the requests for the distributed content and the load by routing users’ requests to the appropriate data center. Routing users to a data center in this way is known as content routing. Content routing takes a user’s request and sends it to the appropriate data center.

 

Before you proceed, you may find the following posts helpful:

  1. DNS Security Solutions
  2. BGP FlowSpec
  3. DNS Reflection Attack
  4. DNS Security Designs
  5. Data Center Topologies
  6. WAN Virtualization

 

Data Center Site Selection Checklist

Key Data Center Site Selection Discussion Points:


  • Introduction to data center site selection and what is involved.

  • Highlighting the details of data center site selection criteria.

  • Technical details on load distribution and recovery.

  • Scenario: HTTP redirection and RHI.

  • BGP configurations for site selection.

 

Back to basics with Data Center Interconnect (DCI)

Data Center Interconnect

Before we get started on your journey with a data center site selection checklist, it may be helpful to know how data centers interconnect. Data Center Interconnect (DCI) solutions have been around for quite some time; they are mainly used to connect geographically separated data centers.

Layer 2 extensions might be required at different layers in the data center to enable the resiliency and clustering mechanisms offered by different applications. For example, Cisco’s OTV can be used as a DCI solution.

OTV provides Layer 2 extension between remote data centers using MAC address routing. A control plane protocol exchanges MAC address reachability information between network devices providing the LAN extension functionality. This has a tremendous advantage over traditional data center interconnect solutions, which generally depend on data plane learning and flooding across the transport to learn reachability information.

 

Data Center Site Selection Criteria

Proximity-based site selection

Different data center site selection criteria can be used to route users to the optimal data center. For example, proximity-based site selection chooses the geographically closest data center. Generally, this will improve response time. Additionally, you can route requests based on the load of the data center or the application availability.
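As a rough illustration of proximity-based selection, the sketch below picks the site with the smallest great-circle distance to the client. The site names and coordinates are hypothetical, and real deployments usually estimate proximity from network measurements rather than pure geography.

```python
import math

# Hypothetical site coordinates (latitude, longitude), for illustration only.
SITES = {
    "dc-london": (51.5074, -0.1278),
    "dc-new-york": (40.7128, -74.0060),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_site(client, sites=SITES):
    """Return the name of the site geographically nearest to the client."""
    return min(sites, key=lambda name: haversine_km(client, sites[name]))
```

For example, a client located in Paris would be sent to the hypothetical dc-london site, while a client in Boston would be sent to dc-new-york.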

Things become interesting when workloads move across geographically dispersed data centers while maintaining active connections to front-end users and back-end systems. All these elements put increasing pressure on the data center interconnect (DCI) and the technology used to support workload mobility.

 

Data Center Site Selection

Multi-site load distribution & site-to-site recovery

Data center site selection can be used for site-to-site recovery and multi-site load distribution. Multi-site load distribution requires a mechanism that enables the same application to be accessed by both data centers, i.e., an active/active setup.

For site-to-site load balancing, you must use an active/active scenario in which both data centers host active applications. The applications can be active at both sites concurrently, or they can run in a logical active/standby mode, in which some applications are active at one site while their copies at the other site are kept on standby, ready to take over (hot standby).

Diagram: Data center site selection with dual data centers.

 

Data center site selection is vital, and the decision about which data center should receive a request can be based on several factors, such as proximity and load. Different applications will prefer different site selection mechanisms. For example, video streaming will prefer the closest data center (proximity selection). Other types of applications prefer the least loaded data center, and others work efficiently with standard round-robin. The three traditional methods for ingress site selection are DNS-based request routing, HTTP redirection, and Route Health Injection (RHI).

 

Data Center Site Selection Checklist

Hypertext Transfer Protocol ( HTTP ) redirection

Applications with a browser front end can rely on the HTTP redirection support built into browsers. This enables them to communicate with a secondary server if the primary server is not available. When redirection is required, the server sends an HTTP Redirect (307) to the client, pointing it to the correct site with the required content. One advantage of this mechanism is that you have visibility into the requested content, but as you have probably already guessed, it only works with HTTP traffic.
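A minimal sketch of the redirect decision, using Python's standard http.server module, might look like the following. The secondary site URL and the rule that only paths under /local/ are served at this site are assumptions made purely for illustration.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical URL of the site that holds the requested content.
SECONDARY_SITE = "https://dc2.example.com"


def build_redirect(path, content_is_local):
    """Return (status, headers) for a request arriving at this site."""
    if content_is_local:
        return HTTPStatus.OK, {}
    # 307 tells the client to repeat the same request at the other site.
    return HTTPStatus.TEMPORARY_REDIRECT, {"Location": SECONDARY_SITE + path}


class SiteHandler(BaseHTTPRequestHandler):
    """Serves /local/ paths here and redirects everything else (assumed rule)."""

    def do_GET(self):
        status, headers = build_redirect(self.path, self.path.startswith("/local/"))
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", "0")  # no body in this sketch
        self.end_headers()
```

To try it, SiteHandler can be passed to http.server.HTTPServer. Because the server sees the requested path before deciding, this is where the content visibility mentioned above comes from.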

 

Diagram: HTTP redirect.

 

DNS-based request routing

DNS-based request routing, also known as DNS load balancing, is a method of distributing incoming network traffic across multiple servers or locations based on the DNS responses. Traditionally, DNS has been primarily used to translate human-readable domain names into IP addresses. However, with DNS-based request routing, it can now play a vital role in optimizing network traffic flow.

How does it work?

When a user initiates a request to access a website or application, their device sends a DNS query to a DNS resolver. Instead of providing a single IP address in response, the DNS resolver returns a list of IP addresses associated with the requested domain. Each IP address corresponds to a different server or location that can handle the request.

With DNS-based request routing, the point of control for geographic load distribution resides within DNS. It uses DNS for both site-to-site recovery and multi-site load distribution. A DNS request from the client, either recursive or iterative, is answered with the address of a data center chosen according to configurable parameters. This provides the ability to distribute the load among multiple data centers with an active/active design based on criteria such as least loaded, proximity, round-robin, and round-trip time (RTT).
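The selection criteria can be sketched as a small function that decides which VIP a DNS answer should carry. The site names, load figures, RTT values, and policy names below are hypothetical, and the addresses come from documentation ranges; a real global load balancer would gather these metrics from probes.

```python
import itertools

# Hypothetical per-site state; the VIPs use documentation address ranges.
SITES = [
    {"name": "dc1", "vip": "192.0.2.10", "load": 0.72, "rtt_ms": 18},
    {"name": "dc2", "vip": "198.51.100.10", "load": 0.35, "rtt_ms": 41},
]
_round_robin = itertools.cycle(range(len(SITES)))


def resolve(policy):
    """Return the VIP that the DNS answer would carry under the given policy."""
    if policy == "least-loaded":
        site = min(SITES, key=lambda s: s["load"])
    elif policy == "rtt":
        site = min(SITES, key=lambda s: s["rtt_ms"])
    elif policy == "round-robin":
        site = SITES[next(_round_robin)]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return site["vip"]
```

With these sample values, the least-loaded policy answers with the dc2 VIP, the RTT policy answers with the dc1 VIP, and round-robin alternates between the two.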

 

  • The support for legacy applications

DNS-based request routing becomes challenging if you have to support legacy applications without DNS name resolution. These applications have hard-coded IP addresses used to communicate with other servers. When there is a combination of legacy and non-legacy applications, the solution might be to use DNS-based request routing and IGP/BGP.

Another caveat of this approach is that the refresh rate of the DNS cache may impact the convergence time. There will also be increased traffic on the data center interconnect link once a VM moves to the secondary site, because previously established connections are hairpinned.

 

Route Health Injection ( RHI )

Route Health Injection is a method used to improve network resilience by dynamically injecting alternative routes into the network. It involves the monitoring of network devices and routing protocols to identify potential failures or performance degradation. By preemptively injecting alternative routes, RHI enables networks to quickly reroute traffic and maintain optimal connectivity.

 

How does Route Health Injection work?

Route Health Injection operates by continuously monitoring the health of network devices and analyzing routing protocol information. It leverages various metrics such as latency, packet loss, and link utilization to assess the overall health of network paths. When a potential issue is detected, RHI dynamically injects alternative routes to bypass the affected network segment, allowing traffic to flow seamlessly.

RHI is implemented in front of the application and, depending on its implementation, allows the same address or a different address to be advertised. The route is injected by a local load balancer and influences the ingress traffic path. RHI injects a static route when the VIP (Virtual IP address) becomes available and withdraws the static route when the VIP is no longer active. The VIP is used to represent an application.
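The inject-and-withdraw behavior can be modeled with a few lines of Python. This is a toy model, not a load balancer API: the class name is hypothetical, and it assumes the VIP counts as available whenever at least one real server behind it passes its health check.

```python
class RouteHealthInjector:
    """Toy model of a load balancer that advertises a VIP only while it is healthy."""

    def __init__(self, vip):
        self.vip = vip
        self.routing_table = set()  # host routes currently injected

    def on_health_check(self, healthy_servers):
        """Inject the /32 host route if any server is up, otherwise withdraw it."""
        route = f"{self.vip}/32"
        if healthy_servers > 0:
            self.routing_table.add(route)      # VIP available: inject
        else:
            self.routing_table.discard(route)  # VIP down: withdraw
        return sorted(self.routing_table)
```

Calling on_health_check(2) on an injector for 192.0.2.10 leaves the host route 192.0.2.10/32 in the table, and a later on_health_check(0) removes it again, so upstream routers stop sending traffic for that VIP to this site.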

 

  • A key point: Data center active-active scenario

Route Health Injection can be used for an active/active scenario, as both data centers can use the same VIP to represent the server cluster for each application. RHI can create a lot of churn as routes are constantly being added and removed. As the number of supported applications grows, the number of host routes in the network grows linearly. The decision to use RHI should come down to the scale and size of the data center’s application footprint.

RHI is commonly used on intranets, as the propagation of more specifics is not permitted in the Default Free Zone (DFZ). For external-facing clients, RHI must be used together with BGP/IGP. Due to the drawbacks of DNS caching, RHI is often preferred to a DNS solution for Internet-facing applications.

 

  • A quick point: Ansible Automation

When you are looking to bring automation into the data center, Ansible could be a good automation tool. Ansible can be used from the CLI with Ansible Core, or as a platform with Ansible Tower. Either of these tools can assist with data center operations. Ansible variables can be used to remove site-specific information and make your playbooks more flexible.

For data center configuration or simply checking routing tables, you can have a single playbook that uses Ansible variables to perform operations on both data centers. I use this for checking the routing tables of each data center: one playbook using Ansible variables, run against one inventory for all my data centers. This can quickly help you when troubleshooting data center site selection.

 

BGP AS prepending

AS prepending can be used for active/standby site selection; it is not a multi-site load distribution method. BGP uses the best path algorithm to determine the best path to a specific destination. One step in that algorithm that router manufacturers widely implement is AS path length: the fewer ASs in the path list, the better the route.

Specific routes are advertised from both data centers, with additional AS numbers prepended to the AS path of the secondary site’s routes. When BGP goes through its best path selection process, it will choose the path with the shortest AS path, i.e., the primary site without AS prepending.
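The effect of prepending on path selection can be sketched as follows. The AS number is taken from the documentation range and the route dictionaries are hypothetical; the sketch also assumes every tie-breaker that BGP evaluates before AS path length (such as local preference) is equal.

```python
# Hypothetical routes for the same prefix; 64500 is a documentation AS number.
PRIMARY = {"site": "primary", "as_path": [64500]}
SECONDARY = {"site": "secondary", "as_path": [64500, 64500, 64500]}  # prepended twice


def best_path(routes):
    """Pick the route with the shortest AS path (earlier tie-breakers assumed equal)."""
    return min(routes, key=lambda r: len(r["as_path"]))


def select_site(routes):
    """Return the site that receives traffic for the prefix."""
    return best_path(routes)["site"]
```

While both routes are present, select_site([PRIMARY, SECONDARY]) returns the primary site. If the primary route is withdrawn, only the prepended route remains and the secondary site takes over, which is the active/standby behavior described above.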

 

BGP conditional advertisements

BGP conditional advertisements are helpful when you are concerned that some manufacturers’ equipment may explicitly remove the prepended AS path. With conditional route advertisement, a condition must be met before an advertisement occurs. The routers at the secondary site monitor a set of prefixes located at the first site, and when those prefixes are no longer reachable at the first site, the secondary site begins to advertise.

Its configuration is based on the “no-export” community and iBGP between the sites. If routes were redistributed from BGP into the IGP and advertised to the iBGP peer, the secondary site would advertise those routes, defeating the purpose of a conditional advertisement.
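The condition itself can be expressed in a few lines. This is a sketch of the logic only, not router configuration: the function name and prefixes are hypothetical, and it assumes the secondary site advertises only when none of the watched prefixes from the first site is present in its BGP table.

```python
def secondary_advertisements(bgp_table, watched, backup):
    """Return the prefixes the secondary site advertises.

    bgp_table: prefixes currently learned over iBGP from the first site.
    watched:   prefixes whose presence indicates the first site is alive.
    backup:    prefixes to advertise once the first site is unreachable.
    """
    primary_alive = any(prefix in bgp_table for prefix in watched)
    return [] if primary_alive else list(backup)
```

For instance, with watched set to ["192.0.2.0/24"] and backup set to ["198.51.100.0/24"], the function returns an empty list while 192.0.2.0/24 is in the table and returns the backup prefix once it disappears.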

Diagram: How do users get routed to a data center?

 

The RHI method, used internally or externally with BGP, is appropriate when you are forced to use IP as the site selection method. For example, this may be the case when you have hard-coded IP addresses in the application, as is common with legacy applications, or when you are concerned about DNS caching issues. Site selection based on RHI and BGP requires no changes to DNS.

However, when RHI and BGP are combined with AS path prepending or conditional advertisements, the main drawback is that the design cannot be used for active/active data centers; it is primarily positioned as an active/standby method. This is because only one best path for the prefix is ever installed in the routing table.

As a final item for the data center site selection checklist, there are designs where you can use IP Anycast in conjunction with BGP, IGP, and RHI to achieve an active/active scenario, and I will discuss this later. With this setup, there is no need for BGP conditional route advertisement or AS path prepending.