Tech Brief Video Series – Enterprise Networking

Hello,

I have created an “Enterprise Networking Tech Brief” series. Kindly click on a link to view the video. I’m trying out a few video styles.

Enterprise Networking A –  LISP Components & DEMO – > https://youtu.be/PBYvIhxwrSc

Enterprise Networking B – SD-Access & Intent-based networking – > https://youtu.be/WKoGSBw5_tc

“In campus networking, there are a number of different trends that are impacting the way networks will be built in the future. Mobility: pretty much every user getting onto the campus is on a mobile device. It used to be only company-owned devices; now it is about BYOD and wearables. It is believed that the average user will bring about 2.7 devices to the workplace, such as a watch and intelligent wearables. These users expect access to printers or collaboration systems. They also expect the same type of access to cloud workloads and application workloads in the private DC.

All this has to be seamless across all devices. IoT: the corporate IoT within a campus network includes connected lights, card readers, all the things you would expect to find in an office building. How do you make sure these cannot compromise your networks? Every attack we have seen in 12 – 19 has involved an insecure IoT device that is not managed or produced by IT. In some cases, this IoT device has access to both the Internet and the company network, causing issues with malware and hacks.” Source: Matt Conran, Network World

Enterprise Networking C – Hands-on configuration for LISP introduction – > https://youtu.be/T1AZKK5p9PY

Enterprise Networking D – Introducing load balancing – > https://youtu.be/znhdUOFzEoM

“Load balancers operate at different Open Systems Interconnection (OSI) layers from one data center to another; common operation is between Layer 4 and Layer 7. This is because each data center hosts unique applications with different requirements. Every application is unique with respect to the number of sockets, TCP connections (short-lived or long-lived), idle time-out, and activities in each session in terms of packets per second. One of the most important elements of designing a load-balancing solution is to understand fully the application structure and protocols.”

Enterprise Networking E – Hands-on configuration for LISP Debugging – > https://youtu.be/h7axIhyu1Bs

Enterprise Networking F – Types of load balancing – > https://youtu.be/ThCX03JYoL8

“Application-Level Load Balancing: Load balancing is implemented between tiers in the application stack and is carried out within the application. It is used in scenarios where applications are coded correctly, making it possible to configure load balancing in the application. Designers can use open source tools with DNS or some other method to track flows between tiers of the application stack. Network-Level Load Balancing: Network-level load balancing includes DNS round-robin, Anycast, and L4 – L7 load balancers. Web browser clients do not usually have built-in application layer redundancy, which pushes designers to look at the network layer for load balancing services. If applications were designed correctly, load balancing would not be a network-layer function.”

Enterprise Networking H – Introducing application performance and buffer sizes – > https://youtu.be/d36fPso1rZg

“Today’s data centers have a mixture of applications and workloads, all with different consistency requirements. Some applications require predictable latency, while others require sustained throughput. It’s usually the case that the slowest flow is the ultimate determining factor affecting the end-to-end performance. So, to satisfy varied conditions and achieve predictable application performance, we must focus on consistent bandwidth and unified latency for all flow types and workloads.”

Enterprise Networking I – Application performance: small vs large buffer sizes – > https://youtu.be/JJxjlWTJbQU

“Both small and large buffer sizes have different effects on application flow types. Some sources claim that small buffer sizes optimize performance, while others claim that larger buffers are better. Many of the web giants, including Facebook, Amazon, and Microsoft, use small-buffer switches. It depends on your environment. Understanding your application traffic pattern and testing optimization techniques are essential to finding the sweet spot. Most out-of-the-box applications are not going to be fine-tuned for your environment, and the only rule of thumb is to lab test.

Complications arise when the congestion control behavior of TCP interacts with the network device buffer. The two have different purposes. TCP congestion control continuously monitors available network bandwidth by using packet drops as the metric. Buffering, on the other hand, is used to avoid packet loss. In a congestion scenario, the TCP traffic is buffered, but the sender and receiver have no way of knowing that there is congestion, so the TCP congestion behavior is never initiated. So the two mechanisms that are used to improve application performance don’t complement each other and require careful testing for your environment.”

Enterprise Networking J – TCP Congestion Control – > https://youtu.be/ycPTlTksszs

“The discrepancy and uneven bandwidth allocation for flows boil down to the natural behavior of how TCP reacts and interacts with insufficient packet buffers and the resulting packet drops. The behavior is known as the TCP/IP bandwidth capture effect. The TCP/IP bandwidth capture effect does not affect the overall bandwidth so much as the individual Query Completion Times (QCT) and Flow Completion Times (FCT) for applications. The QCT and FCT are prime metrics for measuring TCP-based application performance. A TCP stream’s pace of transmission is based on a built-in feedback mechanism. The ACK packets from the receiver adjust the sender’s bandwidth to match the available network bandwidth. With each ACK received, the sender’s TCP incrementally increases the pace of sending packets to use all available bandwidth. On the other hand, it takes 3 duplicate ACK messages for TCP to conclude packet loss on the connection and start the process of retransmission.”

Enterprise Networking K – Mice and Elephant flows – > https://youtu.be/vCB_JH2o1nk

“There are two types of flows in data center environments: large elephant flows and smaller mice flows. Elephant flows might only represent a low proportion of the number of flows but consume most of the total data volume. Mice flows are, for example, control and alarm messages and are usually pretty significant. As a result, they should be given priority over larger elephant flows, but this is sometimes not the case with simple buffer types that don’t distinguish between flow types. Priority can be given by regulating the elephant flows with intelligent switch buffers. Mice flows are often bursty flows where one query is sent to many servers. This results in many small replies getting sent back to the single originating host. These messages are often small, only requiring 3 to 5 TCP packets. As a result, the TCP congestion control mechanism may not even be invoked, as it takes 3 duplicate ACK messages to trigger it. Due to their size, elephant flows will invoke the TCP congestion control mechanism (mice flows don’t, as they are too small).”

Enterprise Networking L – Multipath TCP – > https://youtu.be/Dfykc40oWzI

“Transmission Control Protocol (TCP) applications offer a reliable byte stream with congestion control mechanisms adjusting flows to current network load. Designed in the 70s, TCP is the most widely used protocol and remains largely unchanged, unlike the networks it operates within. Back in those days the designers understood there could be link failure and decided to decouple the network layer (IP) from the transport layer (TCP). This enables routing with IP around link failures without breaking the end-to-end TCP connection. Dynamic routing protocols do this automatically without the need for transport layer knowledge. Even though TCP has wide adoption, it does not fully align with the multipath characteristics of today’s networks. TCP’s main drawback is that it’s a single path per connection protocol. A single path means once the stream is placed on a path (the endpoints of the connection), it cannot be moved to another path even though multiple paths may exist between peers. This characteristic is suboptimal, as the majority of today’s networks and end hosts have multipath characteristics for better performance and robustness.”

Enterprise Networking M – Multipath TCP use cases – > https://youtu.be/KkL_yLNhK_E

“Multipath TCP is particularly useful in the multipath data center and mobile phone environments. All mobiles allow you to connect via WiFi and a 3G network. MPTCP enables both the combined throughput and the switching of interfaces (WiFi / 3G) without disrupting the end-to-end TCP connection. For example, if you are currently on a 3G network with an active TCP stream, the TCP stream is bound to that interface. If you want to move to the WiFi network, you need to reset the connection, and all ongoing TCP connections will therefore get reset. With MPTCP, the swapping of interfaces is transparent. Next-generation leaf and spine data center networks are built with Equal-Cost Multipath (ECMP). Within the data center, any two endpoints are equidistant. For one endpoint to communicate with another, a TCP flow is placed on a single link, not spread over multiple links. As a result, single-path TCP collisions may occur, reducing the throughput available to that flow. This is commonly seen for large flows and not small mice flows.”

Enterprise Networking N – Multipath TCP connection setup – > https://youtu.be/ALAPKcOouAA

“The aim of the connection is to have a single TCP connection with many subflows. The two endpoints using MPTCP are synchronized and have connection identifiers for each of the subflows. MPTCP starts the same as regular TCP. If additional paths are available, additional TCP subflow sessions are combined into the existing TCP session. The original TCP session and other subflow sessions appear as one to the application, and the main Multipath TCP connection seems like a regular TCP connection. The identification of additional paths boils down to the number of IP addresses on the hosts. The TCP handshake starts as normal, but within the first SYN, there is a new MP_CAPABLE option (value 0x0) and a unique connection identifier. This allows the client to indicate that it wants to do MPTCP. At this stage, the application layer just creates a standard TCP socket with additional variables indicating that it wants to do MPTCP. If the receiving server end is MP_CAPABLE, it will reply with the SYN/ACK MP_CAPABLE along with its connection identifier. Once the connection is agreed, the client and server will set up state. Inside the kernel, this creates a Meta socket acting as the layer between the application and all the TCP subflows.”

More Videos to come!

Additional Enterprise Networking information can be found at the following:

multipath tcp

Multipath TCP

In today's interconnected world, a seamless and reliable internet connection is paramount. Traditional TCP/IP protocols have served us well, but they face challenges in handling modern network demands. Enter Multipath TCP (MPTCP), a groundbreaking technology that has the potential to revolutionize internet connections. In this blog post, we will explore the intricacies of MPTCP, its benefits, and its implications for the future of networking.

MPTCP, as the name suggests, allows a single data stream to be transmitted across multiple paths simultaneously. Unlike traditional TCP, which relies on a single path, MPTCP splits the data into subflows, distributing them across different routes. This enables the utilization of multiple network interfaces, such as Wi-Fi, cellular, or wired connections, to enhance performance, resilience, and overall user experience.

One of the key advantages of MPTCP lies in its ability to provide robustness and resilience. By utilizing multiple paths, MPTCP ensures that data transmission remains uninterrupted even if one path fails or experiences congestion. This redundancy significantly improves the reliability of connections, making it particularly valuable for critical applications such as real-time streaming and online gaming.

Implementing MPTCP requires both client and server-side support. Fortunately, MPTCP has gained significant traction, and numerous operating systems and network devices have begun incorporating native support for this protocol. From Linux to iOS, MPTCP is gradually becoming a standardized feature, empowering users with enhanced connectivity options.

The versatility of MPTCP opens up a plethora of possibilities for various applications. For instance, in the context of mobile devices, MPTCP can seamlessly switch between Wi-Fi and cellular networks, optimizing connectivity and ensuring uninterrupted service. Additionally, MPTCP holds promise for cloud computing, content delivery networks, and distributed systems, where the concurrent utilization of multiple paths can significantly improve performance and efficiency.

Highlights: Multipath TCP

At its core, MPTCP is an extension of the conventional Transmission Control Protocol (TCP), designed to improve network resource utilization and increase resilience by allowing multiple paths for data transmission. This means data packets can be sent over several network interfaces simultaneously, optimizing speed and reliability.

The traditional TCP protocol uses a single network path for communication, which can lead to bottlenecks and inefficiencies, especially in environments with fluctuating network conditions. Multipath TCP, on the other hand, breaks away from this limitation by enabling a session to split across multiple paths. This is particularly useful in mobile environments, where devices often switch between different networks such as Wi-Fi and cellular data. By leveraging multiple paths, MPTCP ensures that if one path fails or becomes congested, others can take over seamlessly, maintaining a stable connection.

Adopting MPTCP

– MPTCP is an extension of the traditional Transmission Control Protocol (TCP) that enables the establishment of multiple sub-flows within a single TCP connection. This means that data can be simultaneously transmitted over different network paths, such as Wi-Fi and cellular networks, providing increased throughput and improved resilience against network failures.

– One of the primary advantages of MPTCP is its ability to utilize the combined bandwidth of multiple network paths, resulting in faster data transfer rates. Additionally, MPTCP offers enhanced reliability by dynamically adapting to network conditions and rerouting data if a path becomes congested or fails. This makes it particularly useful in scenarios where a stable, high-bandwidth connection is crucial, such as streaming multimedia content or real-time applications.

How Multipath TCP Works:

A) MPTCP operates by establishing a regular TCP connection between two endpoints, known as the initial subflow. It then negotiates with the remote endpoint to create additional subflows over different network paths.

B) These subflows are managed by a central entity called the MPTCP scheduler, which ensures efficient data distribution across the available paths. By splitting the data into smaller chunks and assigning them to different subflows, MPTCP enables parallel transmission and optimal resource utilization.

C) The versatility of MPTCP opens up exciting possibilities for various applications. MPTCP can seamlessly switch between different networks in mobile devices, providing uninterrupted connectivity and improved user experience. It also holds great potential in cloud computing, where it can enable efficient data transfers across multiple data centers, reducing latency and enhancing overall performance.
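The scheduler role described in B) can be illustrated with a minimal sketch. It assumes a lowest-RTT-first policy, which is one common choice and not the only one. The `Subflow` class, its fields, and the path names are hypothetical and exist only for this illustration; a real stack tracks far more state.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str          # hypothetical label, e.g. "wifi" or "cellular"
    srtt_ms: float     # smoothed round-trip time estimate for this path
    cwnd_free: int     # chunks the congestion window still allows
    up: bool = True    # set to False once the path has failed


def pick_subflow(subflows):
    """Return the usable subflow with the lowest RTT, or None if none can send."""
    usable = [s for s in subflows if s.up and s.cwnd_free > 0]
    return min(usable, key=lambda s: s.srtt_ms) if usable else None


def schedule(chunks, subflows):
    """Assign each chunk to a subflow; returns a list of (chunk, subflow name)."""
    plan = []
    for chunk in chunks:
        sf = pick_subflow(subflows)
        if sf is None:
            break  # every window is full; a real stack would wait for ACKs
        sf.cwnd_free -= 1
        plan.append((chunk, sf.name))
    return plan


if __name__ == "__main__":
    flows = [Subflow("wifi", 20.0, 2), Subflow("cellular", 60.0, 3)]
    for chunk, name in schedule(["c0", "c1", "c2", "c3"], flows):
        print(chunk, "->", name)
```

In this sketch the faster path is filled first and the slower path takes the overflow. Marking a subflow as down (`up=False`) moves all traffic to the remaining paths, which mirrors the failover behavior described above.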

Critical Benefits of Multipath TCP:

1. Improved Performance: MPTCP can distribute the data traffic using multiple paths, enabling faster transmission rates and reducing latency. This enhanced performance is particularly beneficial for bandwidth-intensive applications such as streaming, file transfers, and video conferencing, where higher throughput and reduced latency are crucial.

2. Increased Resilience: MPTCP enhances network resilience by providing seamless failover capabilities. In traditional TCP, if a network path fails, the connection is disrupted, resulting in a delay or even a complete loss of service. However, with MPTCP, if one path becomes unavailable, the connection can automatically switch to an alternative path, ensuring uninterrupted communication.

3. Efficient Resource Utilization: MPTCP allows for better utilization of available network resources. Distributing traffic across multiple paths prevents congestion on a single path and optimizes the usage of available bandwidth. This results in more efficient utilization of network resources and improved overall performance.

4. Seamless Transition between Networks: MPTCP is particularly useful in scenarios where devices need to switch between different networks seamlessly. For example, when a mobile device moves from a Wi-Fi network to a cellular network, MPTCP can maintain the connection and seamlessly transfer the ongoing data traffic to the new network without interruption.

5. Compatibility with Existing Infrastructure: MPTCP is designed to be backward compatible with traditional TCP, making it easy to deploy and integrate into existing network infrastructure. It can coexist with legacy TCP connections and gradually adapt to MPTCP capabilities as more devices and networks support the protocol.

**TCP restricts communication**

Multiple paths connect hosts, but TCP restricts communications to a single path per transport connection. Multiple paths could be used concurrently within the network to maximize resource usage. Improved resilience to network failures and higher throughput should enhance the user experience.

Due to protocol constraints both on the end systems and within the network, Internet resources (particularly bandwidth) are often not fully utilized as the Internet evolves. The end-user experience could be significantly improved if these resources were used simultaneously.

A similar improvement in user experience could also be achieved without as much expenditure on network infrastructure. In resource pooling, these available resources are ‘pooled’ into one logical resource for the user.

**The goal of resource pooling**

As part of multipath transport, disjoint (or partially disjoint) paths across a network are simultaneously used to achieve some of the goals of resource pooling. In addition to increasing resilience, multipath transport protects end hosts from failures on one path. As a result, network capacity can be increased by improving resource utilization efficiency. In a multipath TCP connection, multiple paths are pooled transparently within a transport connection to achieve multipath TCP goals.

When one or both hosts are multihomed, multipath TCP uses multiple paths end-to-end. A host can also manipulate the network path by changing port numbers with Equal Cost MultiPath (ECMP), for example, to create multiple paths within the network.

**Multipath TCP and TCP**

Multipath TCP (MPTCP) is a protocol extension that allows for the simultaneous use of multiple network paths between two endpoints. Traditionally, TCP (Transmission Control Protocol) relies on a single path for data transmission, which can limit performance and reliability. With MPTCP, multiple paths can be established between the sender and receiver, enabling the distribution of traffic across these paths. This offers several advantages, including increased throughput, better load balancing, and improved resilience against network failures.

**Automatically Set Up Multiple Paths**

MPTCP is designed to automatically set up multiple paths between two endpoints and use those paths to send and receive data efficiently. It also provides a mechanism for detecting and recovering from packet loss and for providing low-latency communication. MPTCP is used in applications that require high throughput and low latency, such as streaming media, virtual private networks (VPNs), and networked gaming. MPTCP is an extension to the standard TCP protocol and is supported by several modern operating systems, including Linux, iOS, and macOS.

**High Throughput & Low Latency**

MPTCP is an attractive option for applications that require high throughput and low latency, as it can provide both. Additionally, it can provide fault tolerance and redundancy, allowing an application to remain operational even if one or more of its paths fail. This makes it useful for applications such as streaming media, where high throughput and low latency are essential, and reliability is critical.

Understanding TCP Performance Parameters

TCP performance parameters are settings that can be tweaked to fine-tune the behavior of the TCP protocol. These parameters dictate how TCP handles congestion control, window size, retransmission, and more. By understanding the impact of each parameter, network administrators can optimize TCP for specific use cases and network conditions.

Congestion Control and Window Size: Congestion control is vital to TCP performance. It ensures that the network does not become overwhelmed with excessive data. Network engineers can balance throughput and network utilization by adjusting TCP congestion control algorithms and window sizes. 

Retransmission and Timeout: TCP retransmission and timeout mechanisms ensure reliable data delivery. By fine-tuning retransmission parameters such as RTO (Retransmission Timeout), SRTT (Smoothed Round Trip Time), and RTO Min/Max, network administrators can optimize TCP’s ability to recover from lost packets efficiently. There are trade-offs between aggressive and conservative retransmission strategies, and different timeout values affect overall performance.
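To make the relationship between SRTT and RTO concrete, here is a minimal sketch of the standard estimator from RFC 6298. The class name and the `rto_min`/`rto_max` defaults are assumptions for illustration (the RFC recommends a 1-second minimum; operating systems often use lower values), and the clock-granularity term from the RFC is left out for brevity.

```python
class RtoEstimator:
    """Sketch of the RFC 6298 retransmission timeout calculation."""

    ALPHA = 1 / 8  # gain applied to new RTT samples when updating SRTT
    BETA = 1 / 4   # gain applied when updating the RTT variance

    def __init__(self, rto_min=1.0, rto_max=60.0):
        self.srtt = None      # smoothed RTT, seconds
        self.rttvar = None    # RTT variance estimate, seconds
        self.rto_min = rto_min
        self.rto_max = rto_max
        self.rto = rto_min    # assumption: start at the configured minimum

    def update(self, rtt):
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Variance is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        raw = self.srtt + 4 * self.rttvar
        self.rto = min(self.rto_max, max(self.rto_min, raw))
        return self.rto


if __name__ == "__main__":
    est = RtoEstimator(rto_min=0.2)
    for sample in (0.100, 0.100, 0.300):
        print(f"rtt={sample:.3f}s rto={est.update(sample):.3f}s")
```

A lower `rto_min` makes recovery from loss faster but raises the risk of spurious retransmissions when the RTT jumps; a higher value is the conservative end of the same trade-off.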

Buffer Sizes and Burstiness: Buffers are temporary storage areas that hold data during transmission. Properly sizing TCP buffers is essential for maximizing performance. Oversized buffers can lead to increased latency and bufferbloat, while undersized buffers can cause packet loss. Buffer sizing has a number of intricacies, and techniques like Active Queue Management (AQM) can mitigate buffer-related issues.
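One common starting point for sizing a TCP buffer is the bandwidth-delay product (BDP): the amount of data that can be in flight on a path at once. This rule of thumb is not taken from the text above and is offered only as a worked example; the function name and the link figures are illustrative assumptions.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes for a link speed (bit/s) and RTT (ms)."""
    # bits/s * ms = millibits; divide by 1000 for bits and by 8 for bytes.
    return bandwidth_bps * rtt_ms // 8000


if __name__ == "__main__":
    # Illustrative path: 1 Gbit/s with a 10 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 10), "bytes in flight")
```

A buffer much smaller than this figure cannot keep the path full, while one much larger mostly adds queueing delay, which is why lab testing remains the deciding step.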

What is TCP MSS?

TCP MSS, or Maximum Segment Size, refers to the maximum amount of data encapsulated in a single TCP segment. It determines the payload size sent within a TCP packet, excluding the TCP header. Each endpoint advertises its MSS during the TCP handshake, so the sender knows the largest segment the receiver is prepared to accept.

The TCP MSS value directly affects network performance and efficiency. It impacts the amount of data that can be transmitted in a single packet, affecting a network connection’s overall throughput and latency. Understanding and optimizing the TCP MSS value can significantly improve application performance and reduce unnecessary overhead.

Several factors can influence the TCP MSS value used in a network connection. One primary factor is the underlying network infrastructure’s Maximum Transmission Unit (MTU). The TCP MSS is typically derived from the MTU by subtracting the IP and TCP header sizes (for example, a 1500-byte MTU gives a 1460-byte MSS over IPv4) to avoid fragmentation and improve efficiency. Additionally, network devices, such as routers and firewalls, can enforce specific MSS values, leading to further variations.

It is crucial to consider the network environment and characteristics to optimize TCP MSS for better performance. Understanding the MTU limitations and adjusting the MSS value accordingly can prevent fragmentation and enhance data transmission. Network administrators can also employ Path MTU Discovery techniques to dynamically adjust the MSS value based on the path characteristics between the communicating devices.
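The arithmetic behind the MTU and MSS relationship is short enough to show directly. This is a minimal sketch assuming minimum-size IP and TCP headers with no IP options or extension headers; the function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, without IP options
IPV6_HEADER = 40  # bytes, without extension headers
TCP_HEADER = 20   # bytes, without TCP options


def mss_for_mtu(mtu: int, ipv6: bool = False) -> int:
    """MSS that fits in one unfragmented packet for the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    mss = mtu - ip_header - TCP_HEADER
    if mss <= 0:
        raise ValueError("MTU too small to carry a TCP segment")
    return mss


def payload_per_segment(mss: int, tcp_option_bytes: int = 0) -> int:
    """Application bytes per segment once TCP options take their share."""
    return mss - tcp_option_bytes


if __name__ == "__main__":
    print(mss_for_mtu(1500))             # Ethernet, IPv4
    print(mss_for_mtu(1500, ipv6=True))  # Ethernet, IPv6
    print(payload_per_segment(1460, 12)) # e.g. with a 12-byte timestamp option
```

When a tunnel or VPN lowers the path MTU, the same subtraction applies to the reduced value, which is why such devices often clamp the MSS they let through.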

Before you proceed, you may find the following helpful:

  1. Software Defined Perimeter
  2. Event Stream Processing
  3. Application Aware Networking

Multipath TCP

Reliable byte streams

To start the discussion on multipath TCP, we must understand the basics of Transmission Control Protocol (TCP) and its effect on IP forwarding. TCP applications offer reliable byte streams with congestion control mechanisms that adjust flows to the current network load. Designed in the 1970s, TCP is the most widely used protocol and remains largely unchanged, unlike the networks it operates within. In those days, the designers understood there could be link failure and decided to decouple the network layer (IP) from the transport layer (TCP).

Required: Multipath Routing

This enables routing with IP around link failures without breaking the end-to-end TCP connection. Dynamic routing protocols, including those with multipath features such as BGP Multipath, do this automatically without the need for transport layer knowledge. Even though TCP has wide adoption, it does not fully align with the multipath networking requirements of today’s networks, driving the need for MP-TCP.

Challenge: Default TCP Operation

TCP delivers reliability using a combination of techniques, including sequence numbers, acknowledgments, and retransmission. Because it provides a byte stream interface, TCP must convert a sending application’s stream of bytes into a set of packets that IP can carry. This is called packetization. These packets contain sequence numbers, which in TCP represent the byte offset of the first byte in each packet in the overall data stream rather than packet numbers. This allows packets to be of variable size during a transfer and also allows them to be combined, which is called repacketization.
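The byte-offset numbering described above can be sketched in a few lines. This is an illustration only: the function names and the `first_seq` parameter are assumptions, segments are modeled as `(sequence number, payload)` tuples, and `repacketize` assumes its input is contiguous and in order.

```python
def packetize(data: bytes, mss: int, first_seq: int = 0):
    """Split a byte stream into (seq, payload) segments of at most mss bytes."""
    segments = []
    for offset in range(0, len(data), mss):
        # The sequence number is the offset of the segment's first byte,
        # wrapped to 32 bits as in the TCP header.
        seq = (first_seq + offset) % 2**32
        segments.append((seq, data[offset:offset + mss]))
    return segments


def repacketize(segments, mss: int):
    """Merge contiguous, in-order segments and re-split them at a new size."""
    if not segments:
        return []
    stream = b"".join(payload for _, payload in segments)
    return packetize(stream, mss, first_seq=segments[0][0])


if __name__ == "__main__":
    segs = packetize(b"abcdefghij", mss=4, first_seq=1000)
    print(segs)
    print(repacketize(segs, mss=6))
```

Because each segment is identified by a byte offset and not a packet count, the receiver can reassemble the stream correctly regardless of how the sender chose to cut it.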

Challenge: Single Path Connection Protocol

TCP’s main drawback is that it’s a single path per connection protocol. A single path means once the stream is placed on a path (the endpoints of the connection), it cannot be moved to another path even though multiple paths may exist between peers. This characteristic is suboptimal, as most of today’s networks and end hosts have multipath characteristics for better performance and robustness.

What is Multipath TCP?

A) Multipath TCP, also known as MPTCP, is an extension to the traditional TCP protocol that allows a single TCP connection to utilize multiple network paths simultaneously. Unlike conventional TCP, which operates on a single path, MPTCP offers the ability to distribute the traffic across multiple paths, enabling more efficient resource utilization and increased overall network capacity.

Multiple Paths for a Single TCP Session

B) Using multiple paths for a single TCP session increases resource usage and resilience for TCP optimization. Extensions added to regular TCP enable a connection to be transported across multiple links simultaneously.

C) The core aim of Multipath TCP (MP-TCP) is to allow a single TCP connection to use multiple paths simultaneously by using abstractions at the transport layer. As it operates at the transport layer, its operation is transparent to the upper and lower layers. No network or link-layer modifications are needed.

D) There is no need to change the network, although both end hosts need an MPTCP-capable stack. The end hosts use the same socket API calls, and the network continues to operate as before. No unique configurations are required as it’s a capability exchange between hosts. Multipath TCP is backward compatible with regular TCP: if the peer does not support it, the connection falls back to regular TCP.
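As an illustration of how little the application side changes, here is a minimal sketch based on Linux with upstream MPTCP support, where an application opts in by requesting the MPTCP protocol when it creates the socket and then uses the ordinary socket calls. The helper name `open_stream_socket` is hypothetical. The constant `socket.IPPROTO_MPTCP` exists in Python 3.10+ on Linux; the fallback value 262 is the Linux protocol number.

```python
import socket

# Python 3.10+ exposes this constant on Linux; 262 is the Linux protocol number.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET):
    """Return (socket, used_mptcp); fall back to plain TCP if MPTCP is unavailable."""
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, True
    except OSError:
        # Kernel without MPTCP, MPTCP disabled, or a non-Linux platform.
        sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
        return sock, False


if __name__ == "__main__":
    sock, used_mptcp = open_stream_socket()
    print("MPTCP socket" if used_mptcp else "regular TCP socket")
    sock.close()
```

From this point on, `connect`, `send`, and `recv` are used exactly as with regular TCP. Whether subflows are actually created depends on the peer and on the path configuration of the host.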

Diagram: Multipath TCP. Source is Cisco

TCP sub flows

MPTCP achieves its goals through sub-flows of individual TCP connections forming an MPTCP session. These sub-flows can be established over different network paths, allowing for parallel data transmission. MPTCP also includes mechanisms for congestion control and data sequencing across the sub-flows, ensuring reliable packet delivery.

Unlike regular TCP, MP-TCP binds a connection between two hosts, not two interfaces. Regular TCP connects two IP endpoints, identified by source and destination IP address and port number, so the application has to choose a single link for the connection. However, MPTCP creates additional TCP connections known as sub-flows, allowing a different link to be used for each sub-flow.

Subflows are set up the same as regular TCP connections. They consist of a flow of TCP segments operating over individual paths but are still part of the overall MPTCP connection. Subflows are never fixed and may fluctuate in number during the lifetime of the parent Multipath TCP connection.

Multipath TCP use cases

The deployment of MPTCP has the potential to benefit various applications and use cases. For example, MPTCP can enable seamless handovers between cellular towers or Wi-Fi access points in mobile networks, providing uninterrupted connectivity. MPTCP can improve server-to-server communications in data centers by utilizing multiple links and avoiding congestion.

Multipath TCP is beneficial in multipath data centers and mobile phone environments. All mobiles allow you to connect via Wi-Fi and a 3G network. MP-TCP enables the combined throughput and the switching of interfaces (Wi-Fi / 3G) without disrupting the end-to-end TCP connection.

For example, if you are currently on a 3G network with an active TCP stream, the TCP stream is bound to that interface. If you want to move to the Wi-Fi network, you need to reset the connection, and all ongoing TCP connections will reset. With MP-TCP, the swapping of interfaces is transparent.

Multipath networking: leaf-spine data center

Leaf and spine is a networking architecture that has reshaped connectivity in modern data centers. Unlike traditional hierarchical designs, leaf and spine networks are based on a non-blocking structure in which every leaf switch connects to every spine switch. The leaf switches act as access points for servers and connect directly to the spine switches, creating a flat network topology.

Critical Characteristics of Leaf and Spine Data Centers

One key characteristic of leaf and spine data centers is their scalability. With their non-blocking architecture, leaf and spine networks can easily accommodate the increasing demands of modern data centers without sacrificing performance. Additionally, they offer low latency, high bandwidth, and improved resiliency compared to traditional designs.

Next-generation leaf and spine data center networks are built with Equal-Cost Multipath (ECMP). Within the data center, any two endpoints are equidistant. For one endpoint to communicate with another, a TCP flow is placed on a single link, not spread over multiple links. As a result, single-path TCP collisions may occur, reducing the throughput available to that flow.

This is commonly seen for large flows and not small mice flows. When a server starts a TCP connection in a data center, it gets placed on a path and stays there. With MP-TCP, you could use many sub-flows per connection instead of a single path per connection. Then, if some of those sub-flows get congested, you don’t send over that subflow, improving traffic fairness and bandwidth optimizations.

Hash-based distribution

The default behavior of spreading traffic through a LAG or ECMP next hops is based on the hash-based distribution of packets. First, an array of buckets is created, and each outbound link is assigned to one or more. Next, fields such as source-destination IP address / MAC address are taken from the outgoing packet header and hashed based on this endpoint identification. Finally, the hash selects a bucket, and the packet is queued to the interface assigned to that bucket. 
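The bucket selection described above can be sketched as follows. This is an illustration under stated assumptions: SHA-256 stands in for the vendor-specific hash a real switch would use, the hash key is the TCP 5-tuple, and the link names and function names are hypothetical.

```python
import hashlib
import ipaddress
import struct


def flow_hash(src_ip, dst_ip, src_port, dst_port, proto=6):
    """Hash the 5-tuple of a flow into a 32-bit integer."""
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!HHB", src_port, dst_port, proto)
    )
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def build_buckets(links, n_buckets=16):
    """Create the bucket array, assigning each outbound link to one or more buckets."""
    return [links[i % len(links)] for i in range(n_buckets)]


def select_link(buckets, src_ip, dst_ip, src_port, dst_port, proto=6):
    """Pick the outbound link for a flow; congestion plays no part in the choice."""
    return buckets[flow_hash(src_ip, dst_ip, src_port, dst_port, proto) % len(buckets)]


if __name__ == "__main__":
    buckets = build_buckets(["spine1", "spine2", "spine3", "spine4"])
    for port in (40000, 40001, 40002):
        print(port, "->", select_link(buckets, "10.0.0.1", "10.0.1.1", port, 443))
```

Every packet of a given flow hashes to the same bucket, so a flow stays on one link for its lifetime. Changing only the source port produces a different hash input, which is how separate subflows between the same two hosts can end up on different links.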

Diagram: Redundant links with EtherChannel. Source is jmcritobal

The issue is that the load-balancing algorithm does not consider interface congestion or packet drops. With all mice flows, this is fine, but once you mix mice and elephant flows together, your performance will suffer. An algorithm is needed to identify congested links and then reshuffle the traffic.

A good use for MPTCP is a mix of mice and elephant flows. Generally, MP-TCP does not improve performance for environments with only mice flows.

Small files, say 50 KB, perform similarly to regular TCP. As the file size increases, multipath networking usually delivers results similar to link bonding. The benefits of MP-TCP come into play when files are larger still (around 300 KB). At this level, MP-TCP outperforms link bonding because its congestion control can better balance the load over the links.

MP-TCP connection setup

The connection aims to have a single TCP connection with many sub-flows. The two endpoints using MPTCP are synchronized and have connection identifiers for each sub-flow. MPTCP starts the same as regular TCP. Additional TCP subflow sessions are combined into the existing TCP session if different paths are available. The original TCP and other subflow sessions appear as one to the application, and the primary Multipath TCP connection seems like a regular TCP connection. Identifying additional paths boils down to the number of IP addresses on the hosts. 

The TCP handshake starts as expected, but the first SYN carries a new MP_CAPABLE option (value 0x0) and a unique connection identifier. This allows the client to indicate that it wants to use MPTCP. At this stage, the application layer creates a standard TCP socket with additional variables indicating that it intends to use MPTCP.

If the receiving server is MP_CAPABLE, it replies with a SYN/ACK carrying MP_CAPABLE and its own connection identifier. Once the connection is agreed upon, the client and server set up the connection state. Inside the kernel, a meta socket is the layer between the application and all the TCP sub-flows.

Under a multipath condition and when multiple paths are detected (based on IP addresses), the client starts a regular TCP handshake with the MP_JOIN option (value 0x1) and uses the connection identifier for the server. The server then replies with a subflow setup. New sub-flows are created, and the scheduler will schedule over each sub-flow as the data is sent from the application to the meta socket. 
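From the application side, asking for MPTCP can look very much like opening a normal TCP socket. The sketch below assumes a Linux host with MPTCP enabled in the kernel and Python 3.10 or later, where the `socket.IPPROTO_MPTCP` constant is available; on anything else it falls back to regular TCP. The helper name `open_stream_socket` is made up for this example.

```python
import socket

def open_stream_socket():
    """Return (sock, using_mptcp). Tries MPTCP first, then falls back to TCP."""
    # IPPROTO_MPTCP only exists on Linux builds of Python 3.10+.
    proto = getattr(socket, "IPPROTO_MPTCP", None)
    if proto is not None:
        try:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
            return sock, True
        except OSError:
            # The kernel does not support MPTCP or has it disabled.
            pass
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM), False

sock, using_mptcp = open_stream_socket()
print("MPTCP socket" if using_mptcp else "regular TCP socket")
sock.close()
```

The rest of the application code stays the same either way, which matches the point above: to the application, the Multipath TCP connection looks like a regular TCP connection.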

TCP sequence numbers 

Regular TCP uses sequence numbers, enabling the receiving side to put packets back in the correct order before passing them to the application. The sender can determine which packets are lost by looking at the ACKs.

For MP-TCP, packets must travel multiple paths, so sequence numbers are needed to restore packet order before they are passed to the application. The sequence numbers also inform the sender of any packet loss on a path. When an application sends a packet, the segment is assigned a data sequence number.

TCP looks at the sub-flows to see where to send this segment. When the segment ships on a subflow, that subflow’s own sequence number goes in the TCP header, and the data sequence number is carried in the TCP options.

The sequence number in the TCP header informs the sender of any packet loss on that subflow. The recipient uses the data sequence number to reorder packets before sending them to the application.
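To make the two levels of sequence numbers concrete, here is a small Python sketch of the receiving side. It is a simplified model, not the kernel's implementation: the tuple layout, the subflow names, and the function `reorder_by_dsn` are assumptions for illustration. Each arriving segment carries a subflow sequence number (used per path) and a data sequence number (used to rebuild the byte stream).

```python
def reorder_by_dsn(segments, next_dsn=0):
    """Deliver payloads in data-sequence order, buffering early arrivals.

    segments: iterable of (subflow_id, subflow_seq, dsn, payload) in arrival order.
    """
    pending = {}      # out-of-order payloads keyed by data sequence number
    delivered = []
    for _subflow_id, _subflow_seq, dsn, payload in segments:
        pending[dsn] = payload
        # Hand over everything that is now contiguous with the stream.
        while next_dsn in pending:
            data = pending.pop(next_dsn)
            delivered.append(data)
            next_dsn += len(data)
    return b"".join(delivered)

arrivals = [
    ("wifi", 0, 0, b"HEL"),
    ("lte", 0, 5, b"WOR"),   # arrives early over the faster path
    ("wifi", 3, 3, b"LO"),
    ("lte", 3, 8, b"LD"),
]
message = reorder_by_dsn(arrivals)
print(message)
```

Note that each subflow's own sequence numbers (0, then 3) are in order on their path, while the data sequence numbers are what restore the overall order across paths.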

Congestion control

Congestion control was never a problem in circuit switching. Resources are reserved at call setup to prevent congestion during data transfer, resulting in a lot of bandwidth underutilization due to the reservation of circuits. We then moved to packet switching, where we had a single link with no reservation, but the flows could use as much of the link as they wanted. This increases the utilization of the link and also the possibility of congestion.

To help this situation, congestion control mechanisms were added to TCP. Similar TCP congestion control mechanisms are employed for MP-TCP. Standard TCP congestion control maintains a congestion window for each connection and increases the window size on each ACK. With a drop, it halves the window.

MP-TCP operates similarly. It maintains one congestion window for each subflow path. As with standard TCP, a drop on a subflow halves the window for that subflow. However, the increase rules differ from standard TCP behavior.

Sub-flows with a larger window receive more of the increase. A larger window indicates a path with lower loss. As a result, traffic moves from congested to uncongested links dynamically.
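The window rules above can be sketched as follows. This is a rough model loosely based on the linked-increases algorithm from RFC 6356, simplified by assuming every subflow has the same round-trip time and by measuring windows in packets. The function names and the example window sizes are assumptions for illustration only.

```python
def on_loss(cwnd, subflow):
    """Halve the window of the subflow that saw the drop (never below 1 packet)."""
    cwnd[subflow] = max(1.0, cwnd[subflow] / 2)

def increase_per_rtt(cwnd, subflow):
    """Approximate window growth for one subflow over one round trip."""
    total = sum(cwnd.values())
    # Coupling factor; this form assumes equal RTTs on all subflows.
    alpha = max(cwnd.values()) / total
    # Per-ACK increase is capped so a subflow is never more aggressive than TCP.
    per_ack = min(alpha / total, 1.0 / cwnd[subflow])
    # A subflow receives roughly one ACK per packet in its window each RTT.
    return cwnd[subflow] * per_ack

cwnd = {"path_a": 20.0, "path_b": 5.0}   # path_b has seen more loss
growth_a = increase_per_rtt(cwnd, "path_a")
growth_b = increase_per_rtt(cwnd, "path_b")
print(growth_a, growth_b)

on_loss(cwnd, "path_a")
print(cwnd)
```

In this model the subflow with the larger window grows faster per round trip, while the combined growth across both subflows stays at or below what a single TCP connection would get.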

Closing Points on MP-TCP

At its core, Multipath TCP operates by distributing data packets across multiple paths between a client and server. This allows for simultaneous use of various network interfaces, such as Wi-Fi and cellular data. By dynamically managing these paths, MPTCP can shift traffic in response to network congestion or failures, keeping the connection up without interruption.

The adoption of Multipath TCP benefits both consumers and enterprises. For end-users, it can mean faster download speeds and a more stable internet connection, even in challenging environments. Businesses can leverage MPTCP to enhance the performance of their network applications, ensuring high availability and improved user experiences. Additionally, MPTCP contributes to better load balancing and resource utilization across network infrastructures.

Multipath TCP is not just a theoretical concept; it has practical applications in various fields. In mobile devices, MPTCP allows for the simultaneous use of Wi-Fi and cellular networks, providing a more robust connection. In cloud computing, it facilitates efficient resource distribution and redundancy.

Summary: Multipath TCP

The networking world is constantly evolving, with new technologies and protocols being developed to meet the growing demands of our interconnected world. One such protocol that has gained significant attention recently is Multipath TCP (MPTCP). In this blog post, we dived into the fascinating world of MPTCP, its benefits, and its potential applications.

Understanding Multipath TCP

Multipath TCP, often called MPTCP, is an extension of the traditional TCP protocol that allows for simultaneous data transmission across multiple paths. Unlike conventional TCP, which operates on a single path, MPTCP leverages multiple network interfaces, such as Wi-Fi and cellular connections, to improve overall network performance and reliability.

Benefits of Multipath TCP

By utilizing multiple paths, MPTCP offers several key advantages. Firstly, it enhances throughput by aggregating the bandwidth of multiple network interfaces, resulting in faster data transfer speeds. Additionally, MPTCP improves resilience by providing seamless failover between different paths, ensuring uninterrupted connectivity even if one path becomes congested or unavailable.

Applications of Multipath TCP

The versatility of MPTCP opens the door to a wide range of applications. One notable application is in mobile devices, where MPTCP can intelligently combine Wi-Fi and cellular connections to provide users with a more stable and faster internet experience. MPTCP also finds utility in data centers, enabling efficient load balancing and reducing network congestion by distributing traffic across multiple paths.

Challenges and Future Developments

While MPTCP brings many benefits, it also presents challenges. One such challenge is ensuring compatibility with existing infrastructure and devices that may not support MPTCP. Additionally, optimizing MPTCP’s congestion control mechanisms and addressing security concerns are ongoing research and development areas.

Conclusion:

Multipath TCP is a groundbreaking protocol that has the potential to revolutionize the way we experience network connectivity. With its ability to enhance throughput, improve resilience, and enable new applications, MPTCP holds great promise for the future of networking. As researchers continue to address challenges and refine the protocol, we can expect even greater advancements in this exciting field.

Dropped Packet Test

In the world of networking, maintaining optimal performance is crucial. One of the key challenges that network administrators face is identifying and addressing packet loss issues. Dropped packets can lead to significant disruptions, sluggish connectivity, and even compromised data integrity. To shed light on this matter, we delve into the dropped packet test—a powerful tool for diagnosing and improving network performance.

Packet loss occurs when data packets traveling across a network fail to reach their destination. This can happen due to various reasons, including network congestion, faulty hardware, or software glitches. Regardless of the cause, packet loss can have detrimental effects on network reliability and user experience.

The dropped packet test is a method used to measure the rate of packet loss in a network. It involves sending a series of test packets from a source to a destination and monitoring the percentage of packets that fail to reach their intended endpoint. This test provides valuable insights into the health of a network and helps identify areas that require attention.

To perform a dropped packet test, network administrators can utilize specialized network diagnostic tools or command-line utilities. These tools allow them to generate test packets and analyze the results. By configuring parameters such as packet size, frequency, and destination, administrators can simulate real-world network conditions and assess the impact on packet loss.

Once the dropped packet test is complete, administrators need to interpret the results accurately. A high packet loss percentage indicates potential issues that require investigation. It is crucial to analyze the test data in conjunction with other network performance metrics to pinpoint the root cause of packet loss and devise effective solutions.

To address packet loss, network administrators can employ a range of strategies. These may include optimizing network infrastructure, upgrading hardware components, fine-tuning routing configurations, or implementing traffic prioritization techniques. The insights gained from the dropped packet test serve as a foundation for implementing targeted solutions and improving overall network performance.

Highlights: Dropped Packet Test

**Understanding Packet Loss**

Packet loss occurs when data packets traveling across a network fail to reach their destination. This can happen due to network congestion, hardware failures, software bugs, or even environmental factors. The consequences of packet loss can be severe, leading to slower data transmission, corrupted files, and degraded voice and video communication. For businesses and individuals relying on robust internet connections, minimizing packet loss is essential for maintaining productivity and satisfaction.

**Tools and Techniques for Testing Packet Loss**

Testing for packet loss is a proactive step toward maintaining network health. Several tools and techniques can help identify and quantify packet loss. Ping tests, for example, are a straightforward method where small data packets are sent to a target and any loss is recorded. More advanced tools like Wireshark offer deeper insights by capturing and analyzing network traffic in real-time. These tools help network administrators pinpoint issues, understand the scope of packet loss, and devise strategies for mitigation.

**Interpreting Test Results**

Once testing is complete, interpreting the results is crucial for taking corrective action. A small percentage of packet loss over a short period might be negligible, but sustained or high levels of loss demand attention. Understanding the pattern and frequency of packet loss can guide troubleshooting efforts, whether it’s adjusting network configurations, upgrading hardware, or addressing software issues. Recognizing the symptoms early allows for quicker resolution and less impact on network performance.

Understanding Dropped Packets

1. Dropped packets occur when data fails to reach its intended destination within a network. These packets carry vital information, and any loss or delay can hamper the overall performance of a network. Understanding the causes and consequences of dropped packets is fundamental to optimizing network performance.

2. The dropped packet test is an essential diagnostic tool that network administrators and engineers employ to assess the health and efficiency of a network. By intentionally creating scenarios where packets are dropped, it becomes possible to measure the impact on network performance and identify potential weaknesses or areas for improvement.

3. To perform the dropped packet test, various methodologies and tools are available. One commonly used approach involves using network traffic generators to simulate network traffic and intentionally dropping packets at specific points. This allows administrators to evaluate how different network components and configurations handle packet loss and its subsequent impact on overall performance.

4. Once the dropped packet test is completed, it is crucial to analyze the results effectively. Network monitoring tools and packet analyzers can provide detailed insights into packet loss, latency, and other performance metrics. By carefully examining these results, administrators can pinpoint potential bottlenecks, identify problematic network segments, and make informed decisions to optimize performance.

Knowledge Check: Understanding Traceroute

Traceroute basics:

Traceroute, also known as tracert on Windows, is a command-line tool that tracks the route data packets take from one point to another. By sending a series of packets with increasing Time to Live (TTL) values, traceroute reveals the path these packets follow, hopping from one network node to another.

TTL and ICMP:

Traceroute exploits the Time to Live (TTL) field in IP packets and utilizes Internet Control Message Protocol (ICMP) to gather information about the network path. As each packet encounters a node in its journey, if the TTL expires, an ICMP time exceeded message is sent back to the traceroute tool, allowing it to determine the IP address and round-trip time to reach each node.
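The TTL mechanism is easy to model without touching the network. The Python sketch below simulates it over a made-up path; the addresses come from documentation ranges, and the function names `probe` and `trace` are assumptions for this example. It does not send real packets.

```python
def probe(path, ttl):
    """Simulate one probe. path is a list of hop addresses ending at the destination."""
    for hop in path[:-1]:
        ttl -= 1                      # every router decrements the TTL
        if ttl == 0:
            # TTL expired in transit: this router reports back via ICMP.
            return hop, "time-exceeded"
    return path[-1], "destination-reached"

def trace(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        address, reply = probe(path, ttl)
        hops.append((ttl, address, reply))
        if reply == "destination-reached":
            break
    return hops

path = ["192.0.2.1", "198.51.100.1", "203.0.113.10"]
for hop in trace(path):
    print(hop)
```

Each probe gets one hop further than the last, so the list of replying addresses is the path. A real traceroute also times each probe to get the per-hop round-trip time.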

Network troubleshooting: Traceroute is an invaluable tool for administrators to diagnose and troubleshoot connectivity issues. By identifying the exact network hop where a delay or failure occurs, administrators can pinpoint the problem and take appropriate action to resolve it swiftly.

Identifying potential bottlenecks: Traceroute assists in identifying potential bottlenecks and points of congestion in network paths. This information allows network administrators to optimize their infrastructure, reroute traffic, or negotiate better peering agreements to improve network performance.

Example Product: Cisco ThousandEyes

**Why Network Visibility Matters**

Network visibility is the cornerstone of efficient IT operations. With the proliferation of cloud services, remote work, and global connectivity, traditional network monitoring tools often fall short. They lack the insights needed to understand the entire digital supply chain. This is where ThousandEyes excels. By offering a comprehensive view of network paths, internet health, and application performance, it helps organizations identify and resolve issues before they impact end-users. This proactive approach not only enhances user satisfaction but also ensures business continuity.

**Key Features of Cisco ThousandEyes**

1. **Global Monitoring**: ThousandEyes leverages a vast network of vantage points across the globe to monitor the performance of internet-dependent services. This global reach ensures that businesses can track performance from virtually anywhere in the world.

2. **End-to-End Visibility**: By providing insights into every segment of the network path—from the data center to the end user—ThousandEyes enables businesses to pinpoint where issues occur, whether it’s within their own infrastructure or an external service provider.

3. **Cloud and SaaS Monitoring**: As more businesses migrate to cloud-based services, monitoring these environments becomes crucial. ThousandEyes offers robust monitoring capabilities for cloud platforms and SaaS applications, ensuring they perform optimally.

4. **BGP Route Visualization**: Understanding the routing of data across the internet is essential for troubleshooting and optimizing network performance. ThousandEyes provides detailed BGP route visualizations, helping businesses understand how their data travels and where potential issues might arise.

5. **Alerting and Reporting**: ThousandEyes offers customizable alerting and reporting features, allowing businesses to stay informed about performance issues and trends. This ensures that IT teams can respond quickly to potential problems and maintain high service levels.

**Use Cases and Benefits**

ThousandEyes caters to a wide range of use cases, each offering distinct benefits:

– **Optimizing User Experience**: By monitoring user interactions with applications, businesses can ensure a smooth and responsive experience, which is crucial for customer satisfaction and retention.

– **Enhancing Cloud Performance**: With detailed insights into cloud service performance, organizations can optimize their cloud environments, ensuring reliable and efficient service delivery.

– **Troubleshooting Network Issues**: ThousandEyes’ comprehensive visibility allows IT teams to quickly identify and resolve network issues, minimizing downtime and maintaining productivity.

– **Supporting Remote Work**: As remote work becomes the norm, ThousandEyes helps businesses monitor and optimize the remote user experience, ensuring employees can work efficiently from any location.

Testing For Packet Loss

How do you test for packet loss on a network? This post covers packet loss testing and the tests you can run across the network. Today’s data center performance has to factor in various applications and workloads with different consistency requirements.

Understanding what is best per application/workload requires a dropped packet test from different parts of the network. Some applications require predictable latency, while others need sustained throughput. The slowest flow is usually the ultimate determining factor affecting end-to-end performance.

The consequences of packet loss can be far-reaching. In real-time communication applications, even a slight loss of packets can lead to distorted audio, pixelated video, or delayed responses. In data-intensive tasks such as cloud computing or online backups, packet loss can result in corrupted files or incomplete transfers. Businesses relying on efficient data transmission can suffer from reduced productivity and customer dissatisfaction.

What is Network Monitoring?

Network monitoring refers to the continuous surveillance and analysis of network infrastructure, devices, and traffic. It involves observing network performance, identifying issues or anomalies, and proactively addressing them to minimize downtime and optimize network efficiency. By monitoring various parameters such as bandwidth usage, latency, packet loss, and device health, organizations can detect and resolve potential problems before they escalate.

a) Proactive Issue Detection: Network monitoring allows IT teams to identify and address potential problems before they impact users or cause significant disruptions. By setting up alerts and notifications, administrators can stay informed about network issues and take prompt action to mitigate them.

b) Improved Network Performance: Continuous monitoring provides valuable insights into network performance metrics, allowing organizations to identify bottlenecks, optimize resource allocation, and ensure smooth data flow. This leads to enhanced user experience and increased productivity.

Understanding Network Scanning

A: Network scanning is a proactive security measure that identifies vulnerabilities, discovers active hosts, and assesses a network’s overall health. It plays a vital role in fortifying the digital fortress by systematically examining the network infrastructure, including computers, servers, and connected devices.

B: Various techniques are employed in network scanning, each serving a unique purpose. Port scanning, for instance, involves scanning for open ports on target devices, providing insights into potential entry points for malicious actors. On the other hand, vulnerability scanning focuses on identifying weaknesses in software, operating systems, or configurations that may be exploited. Other techniques, such as ping scanning, OS fingerprinting, and service enumeration, further enhance the scanning process.

C: Network scanning offers numerous benefits contributing to an organization’s or individual’s security posture. First, it enables proactive identification of vulnerabilities, allowing for timely patching and mitigation. Moreover, regular network scanning aids in compliance with security standards and regulations. Network scanning helps maintain a controlled and secure environment by identifying unauthorized devices or rogue access points.

What is ICMP?

ICMP is a network-layer protocol that operates on top of the Internet Protocol (IP). It is primarily designed to report errors, exchange control information, and provide feedback about network conditions. ICMP messages are encapsulated within IP packets, allowing them to traverse the network and reach their destinations.

ICMP encompasses a range of message types, each serving a specific purpose. Some common message types include Echo Request (Ping), Echo Reply, Destination Unreachable, Time Exceeded, Redirect, and Address Mask Request/Reply. Each message type carries valuable information about the network and aids in network troubleshooting and management.

Ping and Echo Requests

One of the most well-known uses of ICMP is the Ping utility, which is based on the Echo Request and Echo Reply message types. Ping allows us to test a host’s reachability and round-trip time on the network. Network administrators can assess network connectivity and measure response times by sending an Echo Request and waiting for an Echo Reply.
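For a feel of what Ping actually puts on the wire, the sketch below builds an ICMP Echo Request in Python using only the standard library. It constructs the bytes and verifies the checksum but does not send anything, since raw sockets usually need elevated privileges. The identifier, sequence number, and payload values are arbitrary choices for the example.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"               # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP Echo Request: type 8, code 0, checksum, id, sequence, payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=0x1234, sequence=1)
print(packet.hex())
```

The Echo Reply mirrors the identifier, sequence number, and payload, which is how the sender matches replies to requests and notices the ones that never return.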

ICMP plays a vital role in network troubleshooting and diagnostics. When a packet encounters an issue across the network, ICMP helps identify and report the problem. Destination Unreachable messages, for example, inform the sender that the intended destination cannot be reached for reasons such as a missing route, a closed port, or a firewall rule.

Testing Methods for Packet Loss

To ensure a robust and reliable network infrastructure, it is vital to perform regular testing for packet loss. Here are some effective methods to carry out such tests:

1. Ping and Traceroute: These command-line utilities can provide valuable insights into network connectivity and latency. Packet loss can be detected by sending test packets and counting the replies that never come back, while round-trip times reveal latency along the path.

2. Network Monitoring Tools: Specialized network monitoring software can offer comprehensive visibility into network performance. These tools can monitor packet loss in real-time, provide detailed reports, and even alert administrators of potential issues.

3. Load Testing: Simulating heavy network traffic can help identify how the network handles data under stress. By monitoring packet loss during these tests, administrators can pinpoint weak spots and take necessary measures to mitigate the impact.

Example: Identifying and Mapping Networks

To troubleshoot the network effectively, you can use a range of tools. Some are built into the operating system, while others must be downloaded and run. Depending on your experience, you may choose a top-down or a bottom-up approach.

Required: Consistent Bandwidth & Unified Latency

We must focus on consistent bandwidth and unified latency for ALL flow types and workloads to satisfy varied conditions and achieve predictable application performance for a low latency network design. Poor performance is due to many factors that can be controlled. 

Bandwidth refers to the maximum data transfer rate of an internet connection. It determines how quickly information can be sent and received. Consistent bandwidth ensures that data flows seamlessly, minimizing interruptions and delays. It is the foundation for an enjoyable and productive online experience.

To identify bandwidth limitations, it is crucial to conduct thorough testing. Bandwidth tests measure the speed and stability of your internet connection. They provide valuable insights into potential bottlenecks or network issues affecting browsing, streaming, or downloading activities. By knowing your bandwidth limitations, you can take appropriate steps to optimize your online experience.

Choosing the Right Bandwidth Testing Tools

Several bandwidth testing tools are available, both online and offline. These tools accurately measure your internet speed and provide detailed reports on download and upload speeds, latency, and packet loss. Some popular options include Ookla’s Speedtest, Fast.com, and DSLReports. Choose a tool that suits your needs and provides comprehensive results.

**Start With A Baseline**

Start by establishing a baseline. Baseline engineering is a critical approach to determining the definitive performance of software and hardware. Once you have a baseline, you can test for packet loss against it.

Depending on your environment, such tests may include chaos engineering (for example, on Kubernetes), which intentionally breaks systems in a controlled environment to learn and optimize performance. To fully understand a system or service, you must deliberately break it. Examples of chaos engineering tests include:

Example – Chaos Engineering:

  • Simulating the failure of a micro-component and dependency.
  • Simulating a high CPU load and sudden increase in traffic.
  • Simulating failure of the entire AZ ( Availability Zone ) or region.
  • Injecting latency and Byzantine failures in services.

Related: Before you proceed, you may find the following helpful:

  1. Multipath TCP
  2. BGP FlowSpec
  3. SASE Visibility
  4. TCP Optimization Mobile Networks
  5. Cisco Umbrella CASB
  6. IPv6 Attacks

Dropped Packet Test

Reasons for packet loss

Packet loss can occur for several reasons. It describes lost packets of data that do not reach their destination after being transmitted across a network. Packet loss occurs when network congestion, hardware issues, software bugs, and other factors cause dropped packets during data transmission.

The best way to measure packet loss using ping is to send a series of pings to the destination and look for failed responses. For instance, if you ping something 50 times and get only 49 ICMP replies, packet loss is 2%. There is no specific value at which loss becomes a concern; it depends on the application. For example, voice applications are sensitive to latency and loss, while many web-based applications are far more tolerant.
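The arithmetic behind that estimate is simple enough to script. Here is a small Python sketch that takes the number of pings sent and the sequence numbers that came back, then reports which ones were lost. The function name and the choice of which ping goes missing are made up for the example.

```python
def summarize_ping(sent, reply_seqs):
    """Summarize loss from the sequence numbers of the replies received."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    received = set(reply_seqs)
    lost = [seq for seq in range(1, sent + 1) if seq not in received]
    return {
        "sent": sent,
        "received": sent - len(lost),
        "lost": lost,
        "loss_percent": 100.0 * len(lost) / sent,
    }

# 50 pings sent; the reply to ping 17 never arrives.
replies = [seq for seq in range(1, 51) if seq != 17]
summary = summarize_ping(50, replies)
print(summary["loss_percent"])
```

Keeping the list of lost sequence numbers is useful in practice: isolated gaps point to random loss, while runs of consecutive gaps suggest a brief outage or a congested queue.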

However, if I were going to put my finger in the air with some packet loss guidelines, a packet loss rate of 1 to 2.5 percent is generally acceptable. Bear in mind that packet loss rates are typically higher with WiFi networks than with wired systems.

Significance of Dropped Packet Test:

1. Identifying Network Issues: By intentionally dropping packets, network administrators can identify potential bottlenecks, congestion, or misconfigurations in the network infrastructure. This test helps pinpoint the specific areas where packet loss occurs, enabling targeted troubleshooting and optimization.

2. Evaluating Network Performance: The Dropped Packet Test provides valuable insights into the network’s performance by measuring the packet loss rate. Network administrators can use this information to analyze the impact of packet loss on application performance, user experience, and overall network efficiency.

3. Testing Network Resilience: By intentionally creating packet loss scenarios, network administrators can assess the resilience of their network infrastructure. This test helps determine if the network can handle packet loss without significant degradation and whether backup mechanisms, such as redundant links or failover systems, function as intended.

Network administrators utilize specialized tools or software to conduct the Dropped Packet Test. These tools generate artificial packet loss by dropping a certain percentage of packets during data transmission. The test can be performed on specific network segments, individual devices, or the entire network infrastructure.
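As a toy model of what such a tool does, the Python sketch below drops a configured percentage of simulated packets and reports the loss that was actually measured. It is purely a simulation with assumed names and parameters; real tools apply the drops to live traffic on an interface or link.

```python
import random

def run_dropped_packet_test(packet_count, drop_rate, seed=42):
    """Simulate sending packets through a link that drops a fraction of them."""
    rng = random.Random(seed)          # fixed seed keeps test runs repeatable
    delivered = sum(1 for _ in range(packet_count) if rng.random() >= drop_rate)
    dropped = packet_count - delivered
    return {
        "sent": packet_count,
        "delivered": delivered,
        "dropped": dropped,
        "measured_loss": dropped / packet_count,
    }

result = run_dropped_packet_test(packet_count=10_000, drop_rate=0.05)
print(result)
```

The measured loss will hover around the configured rate rather than match it exactly, which is one reason to run the test long enough to get a stable figure.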

Best Practices for Dropped Packet Testing:

1. Define Test Parameters: Before conducting the test, it is crucial to define the desired packet loss rate, test duration, and the specific network segments or devices to be tested. Having clear objectives ensures that the test yields accurate and actionable results.

2. Conduct Regular Testing: Regularly performing the Dropped Packet Test allows network administrators to detect and resolve network issues before they impact critical operations. It also helps monitor the effectiveness of implemented solutions over time.

3. Analyze Test Results: After completing the test, careful analysis of the test results is essential. Network administrators should examine the impact of packet loss on latency, throughput, and overall network performance. This analysis will guide them in making informed decisions to optimize the network infrastructure.

**General performance and packet loss testing**

The following screenshot is taken from a Cisco ISR router. Several IOS commands can be used to check essential performance. The command show interface gi1 stats displays generic packet in and out information. I would also monitor input and output errors with the command: show interface gi1. Finally, for additional packet loss testing, you can opt for an extended ping, which gives you more options than a standard ping. It is helpful to test from different source interfaces or vary the datagram size to uncover any MTU issues causing packet loss.

Diagram: Packet loss testing.

What Is Packet Loss? Testing Packet Loss

Packet loss results from a packet being sent and somehow lost before it reaches its intended destination. This can happen because of several reasons. Sometimes, a router, switch, or firewall has more traffic coming at it than it can handle and becomes overloaded.

This is known as congestion, and one way to deal with it is to drop packets so you can focus capacity on the rest of the traffic. Here is a quick tip before we get into the details: Keep an eye on buffers!

So, to start testing packet loss, one factor that can be monitored is buffer sizes in the network devices that interconnect source and destination points. Poorly sized buffers cause bandwidth to be unfairly allocated among different types of flows. If some flows do not receive adequate bandwidth, they will exhibit long-tailed completion times, degrading performance and resulting in packet drops in the network.
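To see why buffers matter, here is a very simplified Python model of a single output queue that tail-drops when full. The burst pattern, buffer sizes, and drain rate are invented numbers for illustration, and real devices use far more sophisticated queueing.

```python
def tail_drop(arrivals_per_tick, buffer_size, drain_per_tick):
    """Count packets dropped by a FIFO queue that discards arrivals when full."""
    queue = 0
    dropped = 0
    for arriving in arrivals_per_tick:
        space = buffer_size - queue
        accepted = min(arriving, space)
        dropped += arriving - accepted        # no room left: tail drop
        queue += accepted
        queue -= min(queue, drain_per_tick)   # the link drains a fixed amount
    return dropped

bursts = [10, 0, 0, 10, 10]                   # packets arriving in each tick
small = tail_drop(bursts, buffer_size=8, drain_per_tick=4)
large = tail_drop(bursts, buffer_size=16, drain_per_tick=4)
print(small, large)
```

With this particular burst pattern the smaller buffer drops packets and the larger one absorbs them, but the larger buffer also holds packets longer, which is the latency side of the trade-off.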

Application Performance

The speed of a network is all about how fast you can move and complete a data file from one location to another. Some factors are easy to influence, and others are impossible, such as the physical distance from one point to the next.

This is why we see a lot of content distributed closer to the source, with intelligent caching, for example, improving user response latency and reducing the cost of data transmission. The TCP’s connection-oriented procedure will affect application performance for different distance endpoints than for source-destination pairs internal to the data center.

We can’t change the laws of physics, and distance will always be a factor, but there are ways to optimize networking devices to improve application performance. One way is to optimize the buffer sizes and select the right architecture to support applications that send bursty traffic. There is considerable debate about whether big or small buffers are best and whether we need lossless transport or should simply drop packets.

TCP congestion control

TCP congestion control and network device buffers significantly affect the time it takes for a flow to complete. TCP, invented over 35 years ago, ensures that sent data blocks are received intact. It also creates a logical connection between source and destination endpoints on top of the lower IP layer.

The congestion control element was added later to ensure that data transfers can be accelerated or slowed down based on current network conditions. Congestion control is a mechanism that prevents congestion from occurring or relieves it once it appears. For example, the TCP congestion window limits how much data a sender can send into a network before receiving an acknowledgment.
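
The way the congestion window speeds up and slows down can be sketched in a few lines of Python. This is a simplified, Reno-style teaching model (slow start, congestion avoidance, and halving on loss), counted in whole segments per round trip. It is not a full TCP stack, and the starting values of `cwnd` and `ssthresh` are assumptions chosen for illustration.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """One round-trip update of a simplified congestion window (in segments).

    Returns the new (cwnd, ssthresh).
    """
    if loss:
        ssthresh = max(cwnd // 2, 2)                # multiplicative decrease
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh    # slow start: double per RTT
    return cwnd + 1, ssthresh                       # congestion avoidance: +1 per RTT


def trace(rounds, loss_rounds, cwnd=1, ssthresh=16):
    """Record the window at the start of each round trip."""
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, rtt in loss_rounds)
    return history


if __name__ == "__main__":
    # A single loss is detected during round trip 6.
    print(trace(8, {6}))  # [1, 2, 4, 8, 16, 17, 18, 9]
```

The window doubles until it reaches the threshold, creeps up by one segment per round trip after that, and is cut in half when a loss is detected.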

In the following lab guide, I have attached a host and a web server with a packet sniffer. All ports are in the default VLAN, and the server runs the HTTP service. Once we open the web browser on the host to access the server, we can see the operations of TCP with the 3-way handshake. We have captured the traffic between the client PC and a web server.

TCP uses a three-way handshake to connect the client and server (SYN, SYN-ACK, ACK). First things first: Why is a three-way handshake called a three-way handshake? Three segments are exchanged between the client and server to establish a TCP connection.
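
If you don’t have a lab to hand, you can trigger the same handshake on your own machine. The Python sketch below opens a listener on the loopback address and connects to it with the standard `socket` module; the function name `handshake_demo` is my own. Run a packet sniffer on the loopback interface while it executes and you should see the SYN, SYN-ACK, and ACK segments.

```python
import socket


def handshake_demo():
    """Open a TCP connection over loopback and return the client's peer address."""
    # Listening socket: bind to an ephemeral port chosen by the OS.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]

        # create_connection() returns once the three-way handshake
        # (SYN, SYN-ACK, ACK) has completed.
        with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
            conn, _ = server.accept()
            with conn:
                return client.getpeername()


if __name__ == "__main__":
    print(handshake_demo())
```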

Big buffers vs. small buffers

Both small and large buffer sizes have different effects on application flow types. Some sources claim that small buffer sizes optimize performance, while others claim that larger buffers are better.

Many web giants, including Facebook, Amazon, and Microsoft, use small buffer switches. It depends on your environment. Understanding your application traffic pattern and testing optimization techniques are essential to finding the sweet spot. Most out-of-the-box applications will not be fine-tuned for your environment; the only rule of thumb is to lab test.  

**TCP interaction**

Complications arise when TCP congestion control interacts with the network device buffer. The two have different purposes. TCP congestion control continuously monitors network bandwidth using packet drops as a metric, while buffering is used to avoid packet loss.

In a congestion scenario, the TCP traffic is buffered, so the sender and receiver cannot know there is congestion, and the TCP congestion behavior is never initiated. So, the two mechanisms used to improve application performance don’t complement each other and require careful packet loss testing for your environment.

Dropped Packet Test: The Approach

Ping and Traceroute: Where is the packet loss?

At a fundamental level, we have ping and traceroute. Ping measures round-trip times between your computer and an internet destination. Traceroute measures the routers’ response times along the path between your computer and an internet destination.

These will tell you where the packet loss occurs and how severe it is. The next step with the dropped packet test is to find your network’s threshold for packet drop. Here, we have more advanced tools to understand protocol behavior, which we will discuss now.
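
When you run ping from a script, you usually want the loss figure as a number. The sketch below pulls the percentage out of the summary line that common Linux and macOS ping builds print. The sample text is illustrative rather than captured output, and other platforms (Windows and Cisco IOS, for example) word their summaries differently, so treat the regular expression as an assumption to adapt.

```python
import re

# Matches summaries such as "10 packets transmitted, 9 received, 10% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output: str) -> float:
    """Extract the packet loss percentage from ping's summary text."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet loss summary found")
    return float(match.group(1))


if __name__ == "__main__":
    sample = "10 packets transmitted, 9 received, 10% packet loss, time 9012ms"
    print(packet_loss_percent(sample))
```

Collecting this value at regular intervals, and from more than one source, gives you a baseline against which to judge your network’s threshold for packet drop.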

iperf3, tcpdump, tcpprobe: Understanding protocol behavior

A: Tools such as iperf3, tcpdump, and tcpprobe can be used to test and understand the effects of TCP. There is no point looking at a vendor’s reports and concluding that their “real-world” testing characteristics fit your environment. They are only guides, and “real-world” traffic tests can be misleading. Usually, no standard RFC is used for vendor testing, and vendors will always try to make their products appear better by tailoring the test (packet size, etc.) to suit their environment.

B: As an engineer, you must understand the scenarios you anticipate. Be careful of what you read. Recently, there were conflicting buffer testing results from Arista 7124S and Cisco Nexus 5000.

C: The Nexus 5000 works best when most ports are congested simultaneously, while the Arista 7100 performs best when some ports are congested but not all. These platforms have different buffer architectures regarding buffer sizes, disciplines, and management, influencing how you test.

TCP congestion control: Bandwidth capture effect

The discrepancy and uneven bandwidth allocation among flows boil down to the natural behavior of how TCP reacts and interacts with insufficient packet buffers and the resulting packet drops. This behavior is known as the TCP/IP bandwidth capture effect.

The TCP/IP bandwidth capture effect does not affect the overall bandwidth so much as the individual Query Completion Times (QCT) and Flow Completion Times (FCT) of applications. Therefore, QCT and FCT are prime metrics for measuring TCP-based application performance.
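
Because the capture effect hits individual flows rather than total throughput, averages can hide it. The Python sketch below summarizes a list of measured flow completion times with the mean, the median, and the 99th percentile, using a simple nearest-rank percentile. The sample values are invented purely to show the shape of the result.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (0 < pct <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize_fct(fct_ms):
    """Mean, median, and tail of flow completion times in milliseconds."""
    return {
        "mean": sum(fct_ms) / len(fct_ms),
        "p50": percentile(fct_ms, 50),
        "p99": percentile(fct_ms, 99),
    }


if __name__ == "__main__":
    # Invented sample: 98 flows finish in 10 ms, two are starved of bandwidth.
    samples = [10] * 98 + [200, 400]
    print(summarize_fct(samples))
```

In this made-up sample the median stays at 10 ms while the 99th percentile jumps to 200 ms, which is the kind of long tail that unfair bandwidth allocation produces.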

Diagram: Packet Drop Test and TCP.

A TCP stream’s transmission pace is based on a built-in feedback mechanism. The ACK packets from the receiver adjust the sender’s bandwidth to match the available network bandwidth. With each ACK received, the sender’s TCP increases the pace of sending packets to use all available bandwidth. On the other hand, TCP takes three duplicate ACK messages to conclude packet loss on the connection and start the retransmission process.
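
The three-duplicate-ACK rule is easy to express in code. The Python sketch below scans the cumulative ACK numbers a sender sees and reports where a retransmission would be triggered. It is a minimal model of the counting logic only; the function name is my own, and real stacks also consider details such as window updates and selective acknowledgments.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which `threshold` duplicate ACKs were seen.

    acks: sequence of cumulative ACK numbers as seen by the sender.
    A duplicate ACK is one that repeats the previous ACK number.
    """
    triggers = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                triggers.append(ack)  # sender would retransmit from here
        else:
            last_ack, dup_count = ack, 0
    return triggers


if __name__ == "__main__":
    # The segment starting at 2000 is lost; later segments produce duplicates.
    print(fast_retransmit_points([1000, 2000, 2000, 2000, 2000, 6000]))
```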

TCP-dropped flows

So, in the case of inadequate buffers, packets are dropped to signal the sender to ease the transmission rate. TCP-dropped flows start to back off and naturally receive less bandwidth than the other flows that do not back off.

The flows that don’t back off get hungry and take more bandwidth. This causes some flows to get more bandwidth than others unfairly. By default, the decision to drop some flows and leave others alone is not controlled and is made purely by chance.

This conceptually resembles the Ethernet CSMA/CD bandwidth capture effect in shared Ethernet. Stations colliding with other stations on a shared LAN would back off and receive less bandwidth. This is no longer much of a problem because modern switches support full-duplex.

DCTCP & Explicit Congestion Notification (ECN)

There is a newer variant of TCP called Data Center TCP (DCTCP), which improves congestion behavior. When used with commodity, shallow-buffered switches, DCTCP uses 90% less buffer space within the network infrastructure than TCP. Unlike standard TCP, the protocol is also burst-tolerant and provides low short-flow latency. DCTCP relies on ECN to enhance the TCP congestion control algorithms.

DCTCP measures how often congestion is experienced and uses that to determine how quickly it should reduce or increase its offered load. DCTCP certainly reduces latency and provides more appropriate behavior between streams. The recommended approach is to use DCTCP with both ECN and Priority Flow Control (pause environments).
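
The "how often you experience congestion" part can be sketched as follows. This Python function follows the update described in RFC 8257: the sender keeps a running estimate, alpha, of the fraction of ECN-marked packets, and cuts its window in proportion to that estimate instead of always halving it. The function and parameter names are my own, and details such as minimum window sizes and when exactly the update runs are left out.

```python
def dctcp_update(alpha, cwnd, marked, total, g=1 / 16):
    """One observation-window update of DCTCP's congestion estimate.

    alpha: running estimate of the fraction of ECN-marked packets (0..1).
    marked, total: ECN-marked and total ACKed packets in the last window.
    g: estimation gain; 1/16 is the value recommended in RFC 8257.
    Returns (new_alpha, new_cwnd).
    """
    fraction = marked / total if total else 0.0
    alpha = (1 - g) * alpha + g * fraction
    if marked:
        cwnd = cwnd * (1 - alpha / 2)  # reduce in proportion to congestion
    return alpha, cwnd


if __name__ == "__main__":
    alpha, cwnd = 0.0, 100.0
    # Ten windows in which 2 out of every 10 packets carry an ECN mark.
    for _ in range(10):
        alpha, cwnd = dctcp_update(alpha, cwnd, marked=2, total=10)
    print(round(alpha, 3), round(cwnd, 1))
```

With light marking, alpha stays small and the window is only trimmed slightly; if every packet is marked, alpha approaches 1 and the reduction approaches the classic halving.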

Dropped packet test with Microbursts

Microbursts are a type of small bursty traffic pattern lasting only for a few microseconds, commonly seen in Web 2 environments. This traffic is the opposite of what we see with storage traffic, which always has large bursts.

Bursts only become a problem and cause packet loss when there is oversubscription; many communicate with one. This results in what is known as fan-in, which causes packet loss. Fan-in could be a communication pattern consisting of 23-to-1 or 47-to-1, n-to-many unicast, or multicast. All these sources send packets to one destination, causing congestion and packet drops. One way to overcome this is to have sufficient buffering.

Network devices need sufficient packet memory bandwidth to handle these types of bursts. If the devices don’t have the required buffers, fan-in increases end-to-end latency and hurts latency-critical application performance. Of course, latency is never good for application performance, but it’s still not as bad as packet loss. When the switch can buffer traffic correctly, packet loss is eliminated, and the TCP window can scale to its maximum size.
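
A quick back-of-the-envelope calculation shows how much buffering a fan-in burst can demand. The Python sketch below assumes a simple fluid model in which every sender starts its burst at the same instant and all links run at the same speed. While the bursts are arriving, the egress port receives N bytes for every byte it can transmit, so the queue peaks at (N - 1) times the per-sender burst. Real traffic is rarely this synchronized, so read the result as a worst case under those assumptions.

```python
def incast_peak_queue(senders, burst_bytes):
    """Peak egress queue depth in bytes for a synchronized fan-in burst.

    Assumes `senders` hosts each send `burst_bytes` at the same time toward
    one port, and that all ingress links and the egress link run at the
    same speed.
    """
    return (senders - 1) * burst_bytes


def drops_for_buffer(senders, burst_bytes, buffer_bytes):
    """Bytes lost if the port can only buffer `buffer_bytes`."""
    return max(0, incast_peak_queue(senders, burst_bytes) - buffer_bytes)


if __name__ == "__main__":
    # 23-to-1 fan-in with a 64 KB burst from each sender.
    print(incast_peak_queue(23, 64 * 1024))
    print(drops_for_buffer(23, 64 * 1024, 1024 * 1024))
```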

Mice & elephant flows

For the dropped packet test, you must consider two flow types in data center environments: large elephant flows and smaller mice flows. Elephant flows might only represent a low proportion of the total number of flows but consume most of the total data volume.

Mice flows, for example, control and alarm messages, are usually quite important. As a result, they should be given priority over the larger elephant flows, but this is sometimes not the case with simple buffer types that don’t distinguish between flow types.

With intelligent switch buffers, elephant flows can be properly regulated and mice flows given priority. Mice flows are often bursty, with one query sent to many servers.

Many small responses are then sent back to the single originating host. These messages are often small, only requiring 3 to 5 TCP packets. As a result, the TCP congestion control mechanism may not even be invoked, as it takes three duplicate ACK messages to trigger it. Elephant flows, however, are large enough to invoke the TCP congestion control mechanism.
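
To check how your own traffic splits between the two flow types, you can classify flow records by size. The Python sketch below takes a list of per-flow byte counts and reports what share of the flows are elephants and what share of the bytes they carry. The 1 MB threshold is an illustrative assumption, not a standard value, and the sample numbers are invented.

```python
def classify_flows(flow_bytes, elephant_threshold=1_000_000):
    """Measure how much of the traffic comes from elephant flows.

    flow_bytes: total bytes transferred by each flow.
    elephant_threshold: flows at or above this size count as elephants
    (the default is an illustrative assumption).
    Returns (share_of_flows_that_are_elephants, share_of_bytes_from_elephants).
    """
    elephants = [b for b in flow_bytes if b >= elephant_threshold]
    flow_share = len(elephants) / len(flow_bytes)
    byte_share = sum(elephants) / sum(flow_bytes)
    return flow_share, byte_share


if __name__ == "__main__":
    # Invented sample: 95 small query/response flows and 5 bulk transfers.
    flows = [5_000] * 95 + [50_000_000] * 5
    flow_share, byte_share = classify_flows(flows)
    print(f"{flow_share:.0%} of flows carry {byte_share:.1%} of bytes")
```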

Testing packet loss: Reactions to buffer sizes

When combined in a shared buffer, mice and elephant flows react differently. Simple, deep buffers operate on a first-come, first-served basis and do not distinguish between different flow sizes; everyone is treated equally. Elephants can fill up the buffers and starve mice flows, adding to their latency.

Bandwidth-aggressive elephant flows can quickly fill up the buffer and impact sensitive mice flows. Longer latency results in longer flow completion time, a prime metric for measuring application performance.

On the other hand, intelligent buffers understand the types of flows and schedule accordingly. With intelligent buffers, elephant flows are given early congestion notification, and the mice flow is expedited under stress. This offers a better living arrangement for both mice and elephant flows.

First, you need to be able to measure your application performance and understand your scenarios. Small buffer switches are used for the most critical applications and do very well. You are unlikely to make a wrong decision with small buffers, so it’s better to start by tuning your application. Out-of-the-box behavior is generic and doesn’t consider failures or packet drops.

The way forward is understanding the application and tuning the host and network devices in an optimized leaf and spine fabric. If you have a lot of incast traffic, having large buffers on the leaf will benefit you more than having large buffers on the spine.

Closing Points on Dropped Packet Test

Understanding the reasons behind packet drops is essential for any network troubleshooting endeavor. Common causes include network congestion, hardware issues, or misconfigured network devices. Congestion happens when the network is overwhelmed by too much traffic, leading to packet loss. Faulty hardware, such as a bad router or switch, can also contribute to this problem. Additionally, incorrect network settings or protocols can lead to packets being misrouted or lost altogether.

**The Dropped Packet Test: A Diagnostic Tool**

The dropped packet test is a fundamental diagnostic approach used by network administrators to identify and address packet loss issues. This test typically involves using tools like ping or traceroute to send packets to a specific destination and measure the percentage of packets that successfully make the trip. Through this process, administrators can pinpoint where in the network the packets are being dropped and take corrective actions.

Once the dropped packet test is performed, the results need to be carefully analyzed. A high percentage of packet loss indicates a significant problem that requires immediate attention. It could be a sign of a failing network component or a need for increased bandwidth. On the other hand, occasional packet drops might be normal in a busy network but should be monitored to ensure they do not escalate.

Addressing packet loss involves a combination of hardware upgrades, network optimization, and configuration adjustments. Replacing outdated or faulty equipment, increasing bandwidth, and optimizing network settings can significantly reduce packet drops. Regular network monitoring and maintenance are also critical in preventing packet loss from becoming a recurring issue.

Summary: Dropped Packet Test

Ensuring smooth and uninterrupted data transmission in the fast-paced networking world is vital. One of the challenges that network administrators often encounter is dealing with dropped packets. These packets, which fail to reach their intended destination, can cause latency, data loss, and overall network performance issues. This blog post delved into the dropped packet test, its importance, and how it can help identify and resolve network problems effectively.

Understanding Dropped Packets

Dropped packets are packets of data that are discarded or lost during transmission. This can occur for various reasons, such as network congestion, hardware failures, misconfigurations, or insufficient bandwidth. When packets are dropped, it disrupts the data flow, leading to delays or complete data loss. Understanding the impact of dropped packets is crucial for maintaining a reliable and efficient network.

Conducting the Dropped Packet Test

The dropped packet test is a diagnostic technique used to assess the quality and performance of a network. It involves sending test packets through the network and monitoring their successful delivery. By analyzing the results, network administrators can identify potential bottlenecks, misconfigurations, or faulty equipment that may be causing packet loss. This test can be performed using various tools and utilities available in the market, such as network analyzers or packet sniffers.

Interpreting Test Results

Once the dropped packet test is conducted, the next step is to interpret the test results. The results will typically provide information about the percentage of dropped packets, the specific network segments or devices where the drops occur, and the potential causes behind the packet loss. This valuable data allows administrators to promptly pinpoint and address the root causes of network performance issues.

Troubleshooting and Resolving Packet Loss

Network administrators can delve into troubleshooting and resolving the issue upon identifying the sources of packet loss. This may involve adjusting network configurations, upgrading hardware components, optimizing bandwidth allocation, or implementing Quality of Service (QoS) mechanisms. The specific actions will depend on the nature of the problem and the network infrastructure in place. Administrators can significantly improve network performance and maintain smooth data transmission with a systematic approach based on the dropped packet test results.

Conclusion

The dropped packet test is an indispensable tool in the arsenal of network administrators. Administrators can proactively address network issues and optimize performance by understanding the concept of dropped packets, conducting the test, and effectively interpreting the results. Organizations can ensure seamless communication, efficient data transfer, and enhanced productivity with a robust and reliable network.