This blog is the first of a series discussing the tale of active-active data centers and data center failover. This first post focuses on GTM DNS-based load balancing and introduces the failover challenges. The second post addresses database and storage best practices, and the third will focus on ingress and egress traffic flows. F5’s Global Traffic Manager (GTM) offers intelligent Domain Name System (DNS) resolution to resolve queries from different sources to different data center locations. It load balances DNS queries to existing recursive DNS servers and caches the responses, or it processes the resolution itself, acting as the authoritative DNS server or a secondary authoritative DNS server. It implements a number of security services with DNSSEC, enabling it to protect against DNS-based DDoS attacks. Keep in mind that DNS relies on UDP for transport, so it is also subject to UDP control-plane attacks.
The GTM, in combination with the Local Traffic Manager (LTM), provides load balancing services to physically dispersed endpoints. The endpoints sit in separate locations but, in the eyes of the GTM, are logically grouped. For data center failover events, DNS is far more graceful than Anycast. With GTM DNS failover, end nodes are restarted (a cold move) in the secondary data center with a different IP address. As long as the DNS FQDN remains the same, new client connections are directed to the restarted hosts in the new data center. The failover is performed with a DNS change, making it a viable option for disaster recovery, disaster avoidance, and data center migration. Stretch clusters and active-active data centers pose a separate set of challenges. In those cases, other mechanisms, such as FHRP localization and LISP, are combined with the GTM to influence ingress and egress traffic flows.
DNS Namespace Basics
Packets that traverse the Internet use numeric IP addresses, not names, to identify communicating devices. To make numeric IP addresses memorable and user-friendly, DNS was developed to map IP addresses to human-readable names. Employing memorable names instead of numerical IP addresses dates back to the early 1980s and the ARPANET days. A local file called HOSTS.TXT mapped IPs to names on all the ARPANET computers. The resolution was local, and any changes had to be applied on every computer. This was sufficient for small networks, but with the rapid growth of networking, a hierarchical distributed model, known as the DNS namespace, was introduced. The database is distributed around the world on what are known as DNS name servers. It looks like an inverted tree, with branches representing domains, zones, and subzones. At the very top of the tree is the “root” domain, and further down we have Top-Level Domains (TLDs), such as .com or .net, and Second-Level Domains (SLDs), such as network-insight.net. Management of the TLDs is delegated by IANA to other organizations, such as Verisign for .COM and .NET. Authoritative DNS name servers exist for each zone. They hold information about the domain tree structure; essentially, the name server stores the DNS records for that domain.
You interact with the DNS infrastructure through a process known as resolution. End stations make a DNS request to their local DNS server (LDNS). If the LDNS supports caching and has a cached response for the query, it responds to the client itself. DNS caching stores DNS responses for a period of time, specified by the DNS TTL. Caching improves DNS efficiency by reducing DNS traffic on the Internet. If the LDNS doesn’t have a cached response, it triggers what is known as the recursive resolution process. The LDNS first queries an authoritative DNS server in the “root” zone. These name servers will not have the mapping in their database but will refer the request to the appropriate TLD. The process continues, and the LDNS then queries the authoritative DNS server in the appropriate .COM, .NET, or .ORG zone. The entire process has many steps and is referred to as “walking the tree”. However, it runs over a quick transport protocol (UDP) and takes only a few milliseconds to complete.
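The referral chain described above can be sketched with a toy in-memory namespace. This is purely illustrative: the zone names, server labels, and the `resolve` helper are hypothetical stand-ins for real root, TLD, and authoritative name servers.

```python
# Toy sketch of "walking the tree": each server either answers
# authoritatively or refers the resolver one level further down.
TOY_NAMESPACE = {
    ".": {"net.": "tld-net"},                            # root refers .net queries onward
    "tld-net": {"network-insight.net.": "auth-ni"},      # TLD refers to the authoritative server
    "auth-ni": {"www.network-insight.net.": "192.0.2.10"},  # authoritative holds the A record
}

def resolve(fqdn, server="."):
    """Follow referrals from the root down until an address is found."""
    zone = TOY_NAMESPACE[server]
    for name, target in zone.items():
        if fqdn == name:             # exact match: authoritative answer
            return target
        if fqdn.endswith(name):      # partial match: referral, descend one level
            return resolve(fqdn, target)
    raise KeyError(f"NXDOMAIN: {fqdn}")

print(resolve("www.network-insight.net."))  # → 192.0.2.10
```

In a real resolution, each hop is a separate UDP exchange, which is why the whole walk still completes in milliseconds.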
Once the LDNS gets a positive result, it caches the response for a period of time, set by the DNS TTL. The DNS TTL is specified in the DNS response by the authoritative name server for that domain. Previously, a common TTL value for DNS was 86,400 seconds (24 hours). This meant that if a record changed on the authoritative DNS server, DNS servers around the globe would not register that change for up to 86,400 seconds. This was later commonly lowered to 5 minutes for more accurate DNS results. The effective TTL in some end hosts’ browsers is 30 minutes, which means that during a data center failover event, when traffic needs to move from DC1 to DC2, some ingress traffic takes its time switching to the other DC, causing long tails.
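The caching behavior that produces these long tails can be sketched as a minimal TTL-aware cache. The class and method names here are illustrative, not a real resolver API; the point is simply that a record keeps answering from cache until its TTL expires, regardless of what has changed at the authoritative server.

```python
import time

# Minimal sketch of an LDNS-style cache that honors the authoritative TTL.
class DnsCache:
    def __init__(self):
        self._store = {}  # fqdn -> (address, absolute expiry timestamp)

    def put(self, fqdn, address, ttl):
        self._store[fqdn] = (address, time.monotonic() + ttl)

    def get(self, fqdn):
        entry = self._store.get(fqdn)
        if entry is None:
            return None              # cache miss: recursive resolution needed
        address, expiry = entry
        if time.monotonic() >= expiry:
            del self._store[fqdn]    # record aged out; must re-query
            return None
        return address

cache = DnsCache()
cache.put("www.network-insight.net", "192.0.2.10", ttl=300)  # 5-minute TTL
print(cache.get("www.network-insight.net"))  # served from cache until expiry
```

During a failover, every cache like this keeps steering clients at the old data center until its own copy of the record expires.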
Web browsers implement a security mechanism known as DNS pinning, in which they refuse to honor low TTLs because of the security concerns around low TTL settings, such as cache poisoning. Every time you read from the DNS namespace, there is potential for DNS cache poisoning. Because of this, browser vendors decided to ignore low TTLs and implement their own aging mechanism, which is about 10 minutes. There are also embedded applications that carry out a DNS lookup only once, when the application starts, for example, a Facebook client on a phone. During data center failover events, these behaviors can cause a very long tail, and some sessions may time out.
The first step is to configure GTM listeners. A listener is a DNS object that processes DNS queries. It is configured with an IP address and listens for traffic destined for that address on port 53, the well-known DNS port. It can respond to DNS queries with accelerated DNS resolution or with GTM intelligent DNS resolution. GTM intelligent resolution, also known as Global Server Load Balancing (GSLB), is just one of the ways you can get the GTM to resolve DNS queries. It monitors a number of conditions to determine the best response. The GTM monitors LTMs and other GTMs with a proprietary protocol called iQuery. iQuery is set up with the bigip_add utility, a script that exchanges SSL certificates with remote BIG-IP systems; both systems must be configured to allow port 22 on their respective self IPs. The GTM allows you to group virtual servers, one from each data center, into a pool. These pools are then grouped into a larger object known as a Wide IP, which maps the FQDN to a set of virtual servers. The Wide IP may contain wildcards.
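The Wide IP object can be modeled as a pattern-to-pools mapping. The sketch below is a hypothetical simplification of that hierarchy, not the BIG-IP configuration schema: an FQDN pattern (wildcards allowed) maps to named pools, one per data center, each holding virtual-server addresses.

```python
import fnmatch

# Hypothetical model of a Wide IP: FQDN pattern -> pools of virtual servers.
wide_ips = {
    "*.network-insight.net": {
        "dc1-pool": ["10.1.1.10", "10.1.1.11"],
        "dc2-pool": ["10.2.2.10"],
    },
}

def match_wide_ip(fqdn):
    """Return the pools of the first Wide IP whose pattern matches the query."""
    for pattern, pools in wide_ips.items():
        if fnmatch.fnmatch(fqdn, pattern):   # shell-style wildcard match
            return pools
    return None                              # no Wide IP matches this FQDN

pools = match_wide_ip("www.network-insight.net")
print(sorted(pools))  # the listener would then load-balance across these pools
```

A query that matches no Wide IP would fall through to standard DNS resolution instead.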
Load Balancing Methods
When the GTM receives a DNS query that matches a Wide IP, it selects a virtual server and sends back the response. There are several load balancing methods (static and dynamic) used to select the pool; the default is round-robin. Static load balancing methods include round-robin, ratio, global availability, static persist, drop packet, topology, fallback IP, and return to DNS. Dynamic load balancing methods include round trip time, completion rate, hops, least connections, packet rate, QoS, and kilobytes per second. Both involve predefined configuration, but dynamic methods also take real-time events into consideration. For example, topology load balancing allows you to select a DNS query response based on geolocation information. Queries are resolved based on the physical proximity of the resource, such as the LDNS country, continent, or user-defined fields. It uses an IP geolocation database to help make the decision, which is useful for serving the correct weather and news to users based on their location. All of this is configured with Topology Records (TRs).
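Two of these selection methods can be sketched side by side: static round-robin rotating through pool members, and a topology decision keyed on the querying LDNS's country. The member names, country codes, and `topology_select` helper are illustrative assumptions, not GTM configuration.

```python
import itertools

# Static round-robin: rotate through the pool members in order.
members = ["vs-dc1", "vs-dc2", "vs-dc3"]
rr = itertools.cycle(members)

# Topology: map the LDNS's country to a pool, like a Topology Record would.
topology_records = {
    "IE": "dublin-pool",
    "US": "ashburn-pool",
}

def topology_select(ldns_country, default="ashburn-pool"):
    """Pick a pool by LDNS geolocation, falling back to a default pool."""
    return topology_records.get(ldns_country, default)

print(next(rr), next(rr), next(rr), next(rr))  # fourth pick wraps to vs-dc1
print(topology_select("IE"))                   # Irish LDNS -> dublin-pool
```

The dynamic methods differ only in where the ordering comes from: instead of a fixed rotation, the member list would be re-ranked from live metrics such as round trip time or packet rate.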
Anycast and GTM DNS for DC failover
Anycast means you advertise the same address from multiple locations. It is a viable option when data centers are geographically far apart. Anycast solves the DNS problem, but there is also the routing plane to consider. It can be difficult to steer users to another DC with Anycast: it’s hard to get someone to go to data center A when the routing table says to go to data center B. The clean approach is to change the actual routing. As a failover mechanism, Anycast is not as graceful as DNS migration with the F5 GTM. Generally, if session disruption is acceptable, go for Anycast. Web applications are usually fine with some session disruption; HTTP is stateless, and clients will simply resend. Other types of applications might not be so tolerant. If session disruption is not an option and a graceful shutdown is needed, you have to use DNS-based load balancing. Keep in mind that, due to DNS pinning in browsers, you will always have long tails, and eventually some sessions will be disrupted.
The best approach is a proper scale-out application architecture. Begin with parallel application stacks in both data centers and implement global load balancing based on DNS. Start migrating users to the other data center, and once all users have moved, shut down the instance in the first data center. It is much cleaner and safer to do cold migrations. Live migrations and hot moves (keeping sessions intact) are challenging over Layer 2 links; you really want a different IP address, and you don’t want stretched VLANs across data centers. It is much easier to do a cold move, change the IP, and then use DNS. The load balancer configuration can be synchronized with vCenter, so the load balancer definitions are updated based on vCenter VM groups.
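The gradual user migration between the parallel stacks can be sketched as DNS answers weighted toward the new data center, with the weight stepped up over time. The weight schedule and `pick_datacenter` helper are illustrative assumptions; in practice the ratio would be a GTM pool setting rather than code.

```python
import random

# Hedged sketch of a DNS-weighted cold migration: shift the share of DNS
# answers from DC1 to DC2 in steps, then retire DC1 at weight 1.0.
def pick_datacenter(dc2_weight):
    """Return the DC whose virtual-server IP goes in the next DNS response."""
    return "dc2" if random.random() < dc2_weight else "dc1"

for weight in (0.0, 0.25, 0.75, 1.0):   # example migration schedule
    sample = [pick_datacenter(weight) for _ in range(1000)]
    print(f"dc2 weight {weight:.2f}: ~{sample.count('dc2') / 10:.0f}% of answers to DC2")
```

Because cached records and pinned browsers keep resolving to DC1 for a while, the observed traffic split always lags the configured weight, which is the long tail described above.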
The main challenge with active-active data centers and failover events is your actual data and databases. If data center A fails, how accurate will your data be? If you are running a transactional database, you cannot afford to lose any data. Resilience is achieved by storage-level or database-level replication that employs log shipping or distribution between the two data centers with a two-phase commit. Log shipping has a non-zero RPO, as transactions could have happened a minute before the failure. A two-phase commit fully synchronizes multiple copies of the database but can slow transactions down due to the latency between data centers.
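The two-phase commit trade-off can be sketched with a toy coordinator and two simulated replicas, one per data center. This is a deliberately simplified illustration under assumed names; a real implementation also needs durable logging and timeout handling, and every round trip in phase 1 and phase 2 pays the inter-DC latency mentioned above.

```python
# Toy two-phase commit across two data-center replicas (simulated, not a
# real database protocol implementation).
class Replica:
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.committed = name, healthy, False

    def prepare(self, txn):   # phase 1: vote yes/no on the transaction
        return self.healthy

    def commit(self, txn):    # phase 2: apply the transaction
        self.committed = True

    def abort(self, txn):
        self.committed = False

def two_phase_commit(txn, replicas):
    if all(r.prepare(txn) for r in replicas):  # every copy must vote yes
        for r in replicas:
            r.commit(txn)
        return True
    for r in replicas:                         # any "no" vote aborts everywhere
        r.abort(txn)
    return False

dc1, dc2 = Replica("dc1"), Replica("dc2", healthy=False)
print(two_phase_commit("txn-42", [dc1, dc2]))  # aborts: DC2 cannot prepare
```

Log shipping avoids these synchronous round trips, which is exactly why its RPO is non-zero: anything not yet shipped at failure time is lost.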