
Security Automation

In today's fast-paced digital landscape, ensuring robust security measures is paramount. Organizations increasingly turn to automation to bolster their security efforts as threats become more sophisticated. In this blog post, we will explore the world of security automation, its benefits, and how it can revolutionize how we safeguard our digital assets.

Security automation is the use of technology to streamline and enhance security processes. It uses advanced algorithms, machine learning, and artificial intelligence to automate security tasks such as threat detection, incident response, and vulnerability management. By harnessing the power of automation, organizations can significantly reduce human errors, improve response times, and gain valuable insights into potential risks.

Highlights: Security Automation

The Role of an Integrated Platform

Scripting alone will only get you so far in the world of security automation. Eventually, you will need a platform that is fully integrated with your security and network infrastructure. For secure automation, there are different types of platforms you can use. This post will address two of them.

Firstly, Red Hat Ansible Tower, which can integrate with and configure network and security devices. Secondly, Splunk SOAR, which is about abstracting complexity away with security-focused playbooks. This reduces repetitive work and enables a standardized response to security events.

Platform Examples

Backing up configs and collecting logs is only a tiny part of automation. Red Hat Ansible Tower and Splunk SOAR have new ways to reach the most advanced use cases. For security automation, Splunk Security with Splunk SOAR has a security-focused application consisting of specially crafted playbooks for every security requirement.

For example, you can check domain and file reputation or create your own playbooks. On the other hand, the Red Hat Ansible Tower architecture allows you to securely reach and support even the most remote edge use cases, with increased portability using execution environments and automation mesh. In addition, you can securely bring automation to the edge.

Related: For additional pre-information, you may find the following posts helpful:

  1. Cloud Native meaning
  2. SASE Definition



Security Automation

Key Secure Automation points:


  • No longer rely on scripting.

  • Red Hat Tower and Splunk SOAR.

  • Automation and Orchestration.

  • Ansible Tower security integrations.

  • Splunk SOAR security-focused applications.


Back to Basics: Security Automation

We can apply our knowledge of automation to different scenarios and workloads that revolve around security. For example, when tedious and everyday tasks are automated, individuals doing those tasks can focus on solving the security problems they are dealing with. This enables a whole new way of looking at how we learn about security, how much we can store, process, and analyze log data (DFIR), and how we can keep applying security updates without interruptions (security operations).

Understanding Security Automation

At its core, security automation involves using advanced technologies and intelligent systems to automate various security processes. It enables organizations to streamline security operations, detect threats in real time, and respond swiftly and effectively. From threat intelligence gathering to incident response and recovery, automation is pivotal in strengthening an organization’s security posture.

The Role of Automation

Security Automation Main Components

  • By deploying intelligent monitoring systems, security automation can swiftly identify and respond to potential threats in real-time.

  • With security automation, incidents can be detected, analyzed, and remediated swiftly and accurately.

  • Security automation tools can continuously scan networks, applications, and systems, providing organizations with real-time vulnerability assessments.

♦ Key Benefits of Security Automation

a) Enhanced Threat Detection: By deploying intelligent monitoring systems, security automation can swiftly identify and respond to potential threats in real time. This proactive approach minimizes the risk of breaches and allows security teams to stay one step ahead of malicious actors.

b) Accelerated Incident Response: Manual incident response can be time-consuming and prone to delays. However, with security automation, incidents can be detected, analyzed, and remediated swiftly and accurately. Automated incident response workflows can help contain and mitigate security breaches before they escalate, reducing the impact on the organization.

c) Efficient Vulnerability Management: Identifying and patching vulnerabilities is critical to maintaining a secure infrastructure. Security automation tools can continuously scan networks, applications, and systems, providing organizations with real-time vulnerability assessments. This enables security teams to prioritize and address vulnerabilities promptly, reducing the window of opportunity for potential attackers.

Challenges and Implementation Considerations

While security automation offers numerous advantages, there are some important considerations. Organizations must carefully evaluate their existing security infrastructure, define clear objectives, and select the appropriate automation tools and technologies. Additionally, ensuring adequate training and collaboration between security teams and automation systems is essential to maximize the effectiveness of the automation process.

Continuous Adaptation and Updates

As cyber threats evolve, security automation solutions must stay up-to-date to counter new attack vectors effectively. Regular updates and continuous monitoring are necessary to ensure that automation systems are equipped to handle emerging threats.

Balancing Automation and Human Expertise

While automation brings numerous benefits, balancing automated security processes and human expertise is crucial. Human intervention is still essential for critical decision-making, advanced analysis, and addressing complex security challenges that may require contextual knowledge.

Security Automation: The World of Scripting

In the traditional world of security automation, custom in-house scripts were common. As a result, we have a variety of self-written scripting methods that solve specific short-term security problems. For example, for secure automation, you may need to collect logs from several devices. However, this is far from a scalable and sustainable long-term approach to an enterprise’s automation strategy.

With more self-maintained scripting tools working in silos, you create more security blind spots. Each additional point tool creates more silos and potential blind spots, which may in turn trigger the adoption of even more narrowly focused tools. The more tools you have, the less control you have over your environment, which can easily open the door to lateral movement.

♦ The need for a security platform

For example, look at lateral movement in an Active Directory (AD) network. Lateral movement is a real problem, with advanced lateral movement techniques being performed using tools such as Metasploit, Impacket, and PurpleSharp. However, it can be hard to tell whether the activity belongs to a bad actor or a sysadmin carrying out daily tasks.

Once the bad actor stealthily navigates the network with lateral movements, they can compromise accounts, find valuable assets, and gradually exfiltrate data, all of which can go unnoticed with below-the-radar attacks. A favored vector is using DNS as a channel to exfiltrate data, so DNS traffic should be monitored.

Diagram: Secure automation and the issue of lateral movements.

SOAR meaning: A quick point.

In this case, you should integrate Splunk SOAR with User Behavior Analytics (UBA) to detect deviations from the baseline. UBA works with unsupervised machine learning and builds profiles of entities on the network. Today’s attacks are distributed, and multiple entities are used to stage an attack.

An anomaly is raised once there is a significant deviation from normal entity behavior. Of course, an anomaly does not necessarily mean a threat. However, the anomaly can be combined with other network and infrastructure signals to determine if a bad actor is present. For example, we would look at the time of day, the frequency, or other unusual activity, such as privilege escalation techniques.

Video: SOAR and SIEM from Splunk

In this product demonstration, we are going to address Splunk Security. Specifically, we will look at the Splunk SIEM and Splunk SOAR. Both products are well integrated and abstract away a lot of the complexity of security. We will first look at the challenging landscape that security teams face today.

Then we will see how you can use Splunk products to overcome these challenges. In today's infrastructure, we have a lot of tools spread around that are not well integrated, which decreases your security posture.

Introducing Splunk Security

Lack of Speed

Without security tools integrated through security automation, and without automated and orchestrated processes, manual response slows the mean time to respond (MTTR) and increases the possibility of a successful attack. Bad actors can breach and exfiltrate data when the mean time to detect (MTTD) is too long.

The manual approach to detecting, triaging, and responding to threats is simply too slow. For example, Ransomware is quick; once the binaries are executed, it’s game over. You should focus your efforts on the detection phase of the kill chain and catch lateral movements even when attackers pivot to valuable assets.

Diagram: Ransomware is quick—the need for SOAR and Tower.

The Need for Security Automation

To address this challenge, you need a security solution that integrates with your existing security products to reduce the response and remediation gap. In addition, these automation and orchestration events must be carried out across all security vendors to consolidate response and remediation.

For secure automation, a unified and standard response to security can be made using pre-approved policies, consistently configuring resources according to pre-approved guidelines, and proactively maintaining them in a repeatable fashion.

Security-focused content collection

This provides a faster, more efficient, and streamlined way to automate the identification, triage, and response processes to security events. In addition, we can use security-focused content. In the case of Red Hat Tower, this comes in the form of collections of roles and modules dedicated to security teams.

Splunk SOAR also has security-focused applications and content ready to use on Splunkbase. The pre-approved policies and playbooks of Ansible Tower and Splunk SOAR will reduce the chances of misconfiguration and speed up all aspects of a security investigation.

Diagram: SOAR meaning and security-focused application.

Secure Automation and Orchestration

When waves of malware, phishing, ransomware, and under-the-radar attacks target you, automation and orchestration are the only way to keep up. Security automation does most of the work, so you no longer have to weed through and manually address every alert as it comes in or process every security action or task.

Level of automation maturity

For example, the level of automation you want to adopt depends on the maturity level of the automation you already have in your environment. If you are new to automation, you can have SOAR or Tower playbooks simply send an alert for further investigation. In other words, you can start with a semi-automated approach.

However, if you are further in your automation strategy, you can have different playbooks chained together to carry out a coherent security detection and response. It’s easy to do this in SOAR with a playbook visualizer, and Ansible Tower has workflow templates that can be used with role-based access control.

Red Hat Tower: How to Start

In most organizations, we have an IT operations team and a security team. These teams traditionally have disjointed roles and responsibilities: IT operations hardens systems, manages the infrastructure, and deploys and maintains systems, while the security operations team tracks ongoing threats, runs intrusion detection/prevention, and performs firewall management activities.

Ansible as the common language

With these two disjointed teams, we can use Ansible as the common automation language for everyone across your organization. Specifically, Red Hat Tower can be the common language between security tools and can be used for various security use cases that can bring the two teams together.

Diagram: Red Hat Ansible Tower as the common language.

Red Hat Tower: Security Automation

Red Hat Tower can orchestrate security systems using a series of curated security collections of modules, roles, and playbooks to investigate and respond to threats using trusted content. This enables you to coordinate your enterprise security systems to perform several security duties, such as investigation enrichment, threat hunting, and incident response.

Here, you can integrate Red Hat Tower with your security infrastructure and have pre-approved playbooks ready to run upon threat detection. For example, a playbook can be triggered automatically based on the results of a security scan. The following lists some of the use cases for Ansible Tower playbooks.

Secure Automation: Security Patching

You could start with patching. Unpatched servers are one of the biggest causes of breaches. Automated patching boosts system security and stability, improving uptime, and the benefit will be noticed straight away.
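As a minimal sketch, automated security patching can look like the following Ansible playbook. It assumes RHEL-family hosts and an inventory group called servers (both illustrative):

```yaml
---
- name: Apply security updates only
  hosts: servers                # hypothetical inventory group
  become: true
  tasks:
    - name: Install all available security errata
      ansible.builtin.dnf:
        name: "*"
        state: latest
        security: true          # limit the update to security-flagged packages
```

Scheduled from Tower during a maintenance window, a playbook like this keeps the patch level consistent without manual effort.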

Secure Automation: System Hardening

Then, there are activities such as system hardening that everyone can apply to all systems. With automation, we can rapidly identify systems that require patches or reconfiguration. It is then easier to apply patches or change system settings consistently across a large number of systems according to defined baselines. For example, make changes to your SSH config.

Here, you can use automation to configure the SSH daemon so that it does not allow authentication with an empty password. You can run these playbooks in check mode, so those who don’t require full automation rights can run checks safely. Again, I would combine this with role-based access control.
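A minimal sketch of that SSH hardening task, using the standard lineinfile module (the file path and handler wiring are the common defaults; verify them for your distribution):

```yaml
---
- name: Harden the SSH daemon
  hosts: all
  become: true
  tasks:
    - name: Disallow authentication with empty passwords
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PermitEmptyPasswords'
        line: 'PermitEmptyPasswords no'
        validate: '/usr/sbin/sshd -t -f %s'   # sanity-check the config before writing it
      notify: Restart sshd

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted
```

Running it with ansible-playbook --check reports what would change without touching the hosts, which is exactly the safe check-mode workflow described above.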

Secure Automation: Network Configuration 

For network management, you can configure an ACL or filter to restrict management access so that the device can be reached only from the management network. You can also use automation to lock down who has management access to specific subnets.
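As an illustration, here is a sketch for Cisco IOS devices using the cisco.ios collection (the device group ios_routers and the 10.0.100.0/24 management subnet are assumptions):

```yaml
---
- name: Restrict management access to the management network
  hosts: ios_routers                       # hypothetical device group
  gather_facts: false
  connection: ansible.netcommon.network_cli
  tasks:
    - name: Define the management ACL
      cisco.ios.ios_config:
        parents: ip access-list standard MGMT-ONLY
        lines:
          - permit 10.0.100.0 0.0.0.255    # illustrative management subnet
          - deny any log

    - name: Apply the ACL to the vty lines
      cisco.ios.ios_config:
        parents: line vty 0 4
        lines:
          - access-class MGMT-ONLY in
```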

Diagram: Security integration with Red Hat Ansible Tower.

Secure Automation: Firewall Integration

If incorrect firewall rules are driving an increase in incident management tickets and change requests, aim to reduce them through automation. For our firewall integration, automation can speed up policy and log configuration changes.

For example, we can add an allowlist entry in the firewall configuration to allow traffic from a particular machine to another.

We can write a playbook that takes the source and destination IPs as variables. Then, once the source and destination objects are defined, the actual access rule between them is created.
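A minimal sketch of that variable-driven allowlist rule, shown here against a Linux firewalld host (the group name and example addresses are assumptions; enterprise firewalls have their own vendor modules that follow the same variable-driven pattern):

```yaml
---
- name: Add an allowlist entry between two machines
  hosts: linux_firewalls                   # hypothetical group
  become: true
  vars:
    source_ip: 192.0.2.10                  # illustrative addresses; in Tower these
    destination_ip: 192.0.2.20             # would typically come from a survey
  tasks:
    - name: Permit traffic from source to destination
      ansible.posix.firewalld:
        rich_rule: "rule family=ipv4 source address={{ source_ip }} destination address={{ destination_ip }} accept"
        zone: public
        permanent: true
        immediate: true
        state: enabled
```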

Secure Automation: Intrusion Detection and Prevention Systems

Tower can simplify the rule and log management for your intrusion detection and prevention systems. Automation can be used to manage IDPS rules, and IDPS roles are offered. These roles can work with multiple IDPS providers, so the corresponding playbook needs to have a variable stating the actual IDPS provider. 

Importing the role is the first step; the new IDPS rule is then handed over via defined variables:
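The exact variable names depend on the collection you use; the following is a sketch modeled on the community ids_rule role with Snort as the provider (host group, rule content, and file path are illustrative):

```yaml
---
- name: Add a new IDPS rule
  hosts: snort_sensors                 # hypothetical group of Snort hosts
  become: true
  vars:
    ids_provider: snort                # tells the role which IDPS backend to target
  tasks:
    - name: Hand the rule over to the provider-agnostic role
      ansible.builtin.include_role:
        name: ids_rule
      vars:
        ids_rule: 'alert tcp any any -> any any (msg:"Suspicious traffic"; sid:99000010; rev:1;)'
        ids_rules_file: /etc/snort/rules/local.rules
        ids_rule_state: present
```

Because the provider is just a variable, the same playbook structure can drive a different IDPS by swapping ids_provider and the rule syntax.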

Secure Automation: Privileged Access Management (PAM) Tools

Ansible Tower can streamline the rotation and management of privileged credentials, automating a preventive control that is hard to carry out manually.
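Dedicated PAM platforms ship their own Ansible modules, but as a minimal illustration of rotation, here is a sketch that resets a local break-glass account across all hosts (the account name and the vaulted variable are hypothetical; the secret would normally come from Tower's credential store or Ansible Vault):

```yaml
---
- name: Rotate a privileged local account password
  hosts: all
  become: true
  vars:
    new_admin_password: "{{ vault_admin_password }}"   # hypothetical vaulted secret
  tasks:
    - name: Set a fresh password on the break-glass account
      ansible.builtin.user:
        name: breakglass                               # illustrative account name
        password: "{{ new_admin_password | password_hash('sha512') }}"
        update_password: always
```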

Secure Automation: Endpoint Protection

Automation can simplify everyday endpoint management tasks, integrate into Endpoint Protection, and provide event-driven detection, quarantining, and remediation. 

Advanced Red Hat Tower Features

Job Templates vs. Workflow Templates

When creating a template, we choose between a job template and a workflow template. We choose the job template if we want to run simple jobs. However, creating more complex jobs composed of multiple job templates, with flow control between one job and the next, is possible with a workflow template. A workflow template can also be integrated into your CI/CD pipelines, such as Jenkins.
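Job templates themselves can be managed as code with the awx.awx collection, which talks to the Tower/Controller API (it expects connection details in environment variables such as CONTROLLER_HOST; the inventory, project, and credential names below are hypothetical):

```yaml
---
- name: Define a hardening job template in Tower as code
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create the job template
      awx.awx.job_template:
        name: Harden SSH
        job_type: run
        organization: Default
        inventory: Production             # hypothetical inventory name
        project: security-playbooks       # hypothetical project name
        playbook: harden_ssh.yml
        credentials:
          - Machine Credential            # hypothetical credential name
        state: present
```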

Diagram: Red Hat Tower with Templates.

Security Benefits

This makes it easier to combine playbooks and job templates from different teams. In large environments, multiple job templates are connected, and complex interactions between jobs can be defined in a workflow, with the next job starting depending on the outcome of the previous one. Any inventory and any credentials can be used, so it brings a lot of flexibility to automation.

In its multi-playbook workflows, the user can create pipelines of playbooks to be executed in sequence on any inventory using one or more users’ credentials. Security teams can configure a series of jobs that share inventory, playbooks, or permissions to automate investigations or remediations fully, bringing a lot of consistency and security benefits.

Ansible Tower and Scheduling

With Ansible Tower, we have templates with the Launch feature; think of this as an ad hoc way to run a playbook. However, if you are using Tower, you should use Schedules for better control of your automation. For example, you may have a maintenance window in which you apply changes. Here, we can set the times and frequency of playbook runs.

Scheduling this playbook in Tower will automatically refresh systems that have drifted significantly out of spec, including calling back into Tower to apply our baseline configuration once new instances are spun up, using the provisioning callback feature. I find this useful for dynamic cloud environments.
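Schedules can also be defined as code through the awx.awx collection; the schedule is expressed as an iCal recurrence rule (the job template name is hypothetical):

```yaml
---
- name: Schedule patching for the weekly maintenance window
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Run the job template every Sunday at 04:00 UTC
      awx.awx.schedule:
        name: Weekly security patching
        unified_job_template: Apply security updates   # hypothetical job template
        rrule: "DTSTART:20240107T040000Z RRULE:FREQ=WEEKLY;INTERVAL=1;BYDAY=SU"
        state: present
```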

Video: Ansible Tower For Beginners

In this product demonstration, we will review the critical components of Ansible Tower and its functionality. Ansible Tower is a considerable step up from the Ansible CLI you may have used with Ansible Core.

We will discuss the anatomy of an automation job, which shares similar objects with the CLI but has considerable differences, such as job templates, better credential management, and the inventory and project concepts you may have encountered with the Ansible CLI and Ansible Tower.

Ansible Tower for beginners

GitHub for Playbooks

GitHub is all about version control: multiple people can work on different pieces of code, then review and merge changes. In other words, it is all about managing change across your environments. When Red Hat Tower runs playbooks, it checks the repository URL specified in your project, and here we have multiple options that can enhance the GitHub integration, such as webhooks and personal access tokens.

Benefits: Removes Inconsistency of Playbooks

This option is important to enable. If you don’t have it checked, someone may notice a problem in a playbook and fix it, then run the playbook believing they are running the latest version; unless they remember to run the synchronization task first, they are effectively running the older version. Using this option removes that inconsistency in playbooks and so increases your security posture, which is very important: a lot of security breaches start with a simple misconfiguration.

SOAR for Automation: SOAR Meaning

SOAR Meaning

The difference between an attack being a routine annoyance and a catastrophic event comes down to the robustness of the products and technologies you choose to adopt. Splunk has several products that can help you here, ranging from the Splunk SIEM to Splunk SOAR. There are also several observability products, all of which are well integrated and can assist you with security automation.

Customers can solve their primary SIEM use cases using Splunk Enterprise and Splunk Cloud, core Splunk platforms that provide collection, indexing, search, and reporting capabilities. So, the Splunk SIEM collects or ingests the machine data and can make this available to the Splunk SOAR.

Splunk SOAR Meaning

Splunk SOAR drives accuracy and consistency in the incident response process. With SOAR, workflows can be orchestrated via integrations with other technologies and automated to achieve desired outcomes. Utilizing automation with Splunk SOAR can dramatically reduce the time to investigate malware alerts, driving accuracy and consistency across its incident response processes.

Diagram: SOAR and SIEM integrations.

SOAR and Phantom

SOAR is the rebranding of Phantom, but with multi-deployment options. Phantom was on-premises only; now we have both on-premises and cloud delivery. Consider SOAR a layer of connective tissue for all security operations.

It automates decision-making and action: SOAR can take processes and turn them into playbooks, so we can create complex security operation workflows.

So we have an extensive collection of security-focused SOAR applications that interact with the API of existing security and network infrastructure, such as your Firewalls, to support activities such as containment and recovery. We’ll talk about these in just a moment.

Automation Broker

We have an Automation Broker, a modified version of Splunk SOAR with reduced features that acts as a reverse proxy for automation actions. The Automation Broker is a Docker container that uses an encrypted, outbound connection between the customer premises and Splunk SOAR Cloud. There is no need to open inbound ports on the perimeter firewall, as the communication is initiated outbound.

SOAR Meaning: Security-Focused Playbooks

Instead of manually going into other security tools to inject data, enrich logs, and carry out actions such as blocking or manual analysis, SOAR playbooks can be used. You can have several security-focused playbooks that carry out these tasks automatically. SOAR playbooks can automate many repetitive duties, so you no longer have to respond manually to repetitive incidents. For example, you can have Splunk SOAR respond to malicious emails with playbooks.

Actions based on the Playbooks

Then, we could have a list of actions based on playbook results. This could include additional investigation tasks or notifying users. Finally, when you want to push the boundaries of automation, we could have several steps to isolate or quarantine hosts depending on the results of the previous playbooks, which would be integrated with multi-factor authentication to ensure the action is appropriately authorized. 

Additionally, there are over 800 other security-related apps on Splunkbase with pre-built searches, reports, and visualizations for specific third-party security vendors. These ready-to-use apps and add-ons add capabilities such as security monitoring, next-generation firewall integration, and advanced threat management. You can even build your own custom application, from monitoring and observability to security.

SOAR Meaning: SOAR Apps

You are using many tools from many vendors, and when you respond, each tool handles a different event and performs a different function. Splunk integrates with any device that has an API, and SOAR can directly integrate all of these tools to act in a specific sequence.

So it can coordinate all security actions. With SOAR, you don’t get rid of your existing tools; instead, SOAR can sit between them and abstract a lot of complexity.

Think of Splunk SOAR as the conductor: it supports over 350 apps and can perform over 2,000 actions. There are tools to build apps, so you can create your own for any product that has an API. SOAR apps are Python modules that collect events from any source, such as a SIEM, then normalize the information and make it available to playbooks.

SOAR Meaning: Example SOAR Playbooks

Say we have a network-based sandbox to detect malware that enters via email. An alert is received from the SIEM, sent to SOAR, and triggers a playbook. SOAR communicates back to the SIEM and queries Active Directory to identify the affected user and department, and based on that, SOAR can query Carbon Black to see how the threat is behaving on the endpoint.

Finally, SOAR can notify an analyst to intervene and manually double-check the results. This could take 30 minutes by hand, but SOAR can do it in 30 seconds.

Let’s look at another SOAR playbook in action. A Splunk SOAR playbook is triggered when an email malware alert is received. Due to the lack of context in these alerts, Splunk SOAR’s first order within the playbook is to query the security information and event management (SIEM) solution for all recipients, then Active Directory to collect context from all affected users’ profiles, business groups, titles, and locations.

  • A key point: SOAR workbooks and phases

Another name for a playbook is the SOAR workbook. Each workbook can have several phases, and each phase can have tasks that carry out our security actions. In this scenario, there will be one phase, with several playbooks in that single phase. Some playbooks can be triggered automatically, and some are invoked manually.

Others are run manually but will prompt for additional information. These tasks are semi-automatic because they can automatically import data for you and enrich events from several platforms.

Splunk and Lateral Movements

You can have playbooks to hunt for lateral movement. There are many ways to move laterally in Active Directory networks. For example, PsExec is a sysadmin tool that allows admins to connect to other machines and perform admin tasks remotely. However, what if PsExec is used to gain a remote shell or execute a PowerShell cradle on a remote device? When looking for lateral movement, we identify processes connecting remotely to a host.

To start a threat investigation, we could have a playbook to conduct an initial search for a known lateral movement activity. There is a wealth of information in Windows security logs. The playbook can look for authentication events over the network from rare or unusual hosts or users.

Diagram: SOAR and the hunt for bad actors.

Windows Event Codes

For example, in a Windows event log, you would see one event code for a successful login, another for a network connection, and another for privilege escalation. Each event doesn’t mean much by itself, but together they indicate a threat. For example, here you can see that someone has used an admin account to connect over the network from a particular host and gained command-line access to a victim host.

Splunk SOAR’s visual playbook editor

Splunk SOAR comes with 100 pre-made playbooks, so you can start automating security tasks immediately and hunt for lateral movements. To simplify life, we have a Splunk SOAR visual playbook editor that makes creating, editing, implementing, and scaling automated playbooks easier to help your business eliminate security analyst grunt work.  

SOAR Meaning: Splunk Intelligence Management (TruSTAR) Indicator Enrichment

Then, we have a Splunk Intelligence Management (TruSTAR) Indicator Enrichment. This playbook uses Splunk Intelligence Management normalized indicator enrichment, which is captured within the notes of a container, for an analyst to view details and specify subsequent actions directly within a single Splunk SOAR prompt for a manual response.

SOAR Meaning: Crowdstrike Malware Triage

There is a Crowdstrike Malware Triage playbook. This playbook walks through the steps performed automatically by Splunk SOAR to triage file hashes ingested from Crowdstrike and quarantine potentially infected devices.

SOAR Meaning: Finding and Disabling Inactive Users on AWS

Then, there are playbooks specific to cloud environments, such as Finding and Disabling Inactive Users on AWS. In addition, Splunk SOAR’s orchestration, automation, response, collaboration, and case management capabilities are available from your mobile device.


Summary: Security Automation

In today’s rapidly evolving digital landscape, ensuring the security of our online presence has become paramount. With the ever-increasing number of cyber threats, organizations and individuals alike are seeking efficient and effective ways to protect their sensitive information. This is where security automation steps in, revolutionizing the way we defend ourselves from potential breaches and attacks. In this blog post, we explored the concept of security automation, its benefits, and how it can fortify your digital world.

Section 1: Understanding Security Automation

Security automation refers to the process of automating security-related tasks and operations, reducing the need for manual intervention. It involves utilizing advanced technologies, such as artificial intelligence and machine learning, to streamline security processes, detect vulnerabilities, and respond to potential threats in real-time.

Section 2: Benefits of Security Automation

2.1 Enhanced Threat Detection and Response:

By leveraging automation, security systems can continuously monitor networks, applications, and user behavior, instantly detecting any suspicious activities. Automated threat response mechanisms allow for swift actions, minimizing the potential damage caused by cyber attacks.

2.2 Time and Cost Efficiency:

Automation eliminates the need for manual security tasks, freeing up valuable time for security teams to focus on more critical issues. Additionally, by reducing human intervention, organizations can achieve significant cost savings in terms of personnel and resources.

Section 3: Strengthening Security Measures

3.1 Proactive Vulnerability Management:

Security automation enables organizations to proactively identify and address vulnerabilities before they can be exploited by malicious actors. Automated vulnerability scanning, patch management, and configuration checks help maintain a robust security posture.

3.2 Continuous Compliance Monitoring:

Compliance with industry standards and regulations is crucial for organizations. Security automation can ensure continuous compliance by automating the monitoring and reporting of security controls, reducing the risk of non-compliance penalties.

Section 4: Integration and Scalability

4.1 Seamless Integration with Existing Systems:

Modern security automation solutions are designed to seamlessly integrate with a variety of existing security tools and systems. This allows organizations to build a comprehensive security ecosystem that works harmoniously to protect their digital assets.

4.2 Scalability for Growing Demands:

As organizations expand their digital footprint, the security landscape becomes more complex. Security automation provides the scalability required to handle growing demands efficiently, ensuring that security measures keep pace with rapid business growth.

Conclusion:

Security automation is a game-changer in the world of cybersecurity. By harnessing the power of automation, organizations can bolster their defenses, detect threats in real-time, and respond swiftly to potential breaches. The benefits of security automation extend beyond cost and time savings, providing a proactive and scalable approach to safeguarding our digital world.

Splunk Security

In today's digital landscape, organizations face increasing cybersecurity challenges. Top priorities are protecting sensitive data, detecting and responding to threats, and ensuring compliance. This is where Splunk Security comes into play. In this blog post, we will explore the capabilities and benefits of Splunk Security, showcasing how it can empower your organization to achieve robust cybersecurity.

Splunk Security is a comprehensive security platform that offers real-time monitoring, threat intelligence, incident response, and compliance management. With its advanced analytics and machine learning capabilities, Splunk Security provides organizations with deep insights into their security posture, enabling proactive detection and response to potential threats.

Summary: Splunk Security

The Role of Visibility and Analytics

Splunk Security is a powerful tool for monitoring the security of an organization’s network. Splunk Security provides real-time visibility and analytics into network traffic, helping organizations promptly detect and respond to security threats. It can identify malicious activity and vulnerabilities and help organizations protect their assets proactively.

Splunk Security is a comprehensive solution offering various security use cases, including threat detection, vulnerability management, incident response, and compliance reporting features. It is designed to be easy to use and secure, making it an ideal solution for any organization.

Example: Splunk Enterprise Security

The Splunk Security product set offers several well-integrated products: Splunk Enterprise Security (Splunk ES), which is the Splunk SIEM; Splunk SOAR; User Behavior Analytics (UBA); and a variety of observability tools at your disposal.

In addition, Splunk SOAR brings a lot of power, especially when you push the boundaries of automation to fully detect and respond to scenarios with multiple phases and tasks. Finally, consider Splunk the platform in the middle of your infrastructure that removes a lot of the complexity.

One significant benefit to using Splunk security is that it can ingest data from every source and combine it into one platform that will fully satisfy all of your security requirements.

Related: For pre-information, you may find the following helpful:

  1. Security Automation
  2. Observability vs. Monitoring
  3. Network Visibility
  4. Ansible Architecture
  5. Ansible Tower
  6. OpenStack Neutron
  7. OpenvSwitch Performance
  8. Event Stream Processing

Splunk Security Product Set

  • Splunk Enterprise Security (Splunk ES): the Splunk SIEM
  • Splunk SOAR: low-code playbooks
  • Observability tools: RUM and APM
  • Splunk Enterprise: search and ingest

Back to Basics: Splunk Security

Splunk Monitoring

Splunk is software for monitoring, searching, analyzing, and visualizing real-time machine-generated data. This tool can monitor and read several log files and store data as events in indexers. In addition, it uses dashboards to visualize data in various forms. Splunk is commonly thought of as “a Google for log files” because, like Google, it can be used to define the state of a network and the activities taking place within it. It is a centralized log management tool but works well with structured and unstructured data.

Real-time monitoring and Detection

One of Splunk Security’s critical strengths is its ability to monitor and analyze massive volumes of data in real-time. By aggregating data from various sources such as logs, network traffic, and security devices, Splunk Security provides a unified view of the entire IT environment. This enables the detection of anomalies, suspicious activities, and potential threats, empowering security teams to take immediate action and mitigate risks effectively.

The Role of Splunk Security

Splunk Security Main Components

  • Splunk is software for monitoring, searching, analyzing, and visualizing real-time machine-generated data.

  • By aggregating data from various sources such as logs, network traffic, and security devices, Splunk Security provides a unified view of the entire IT environment.

  • Splunk’s forensics capabilities enable detailed analysis and post-incident investigations, helping organizations learn from past incidents and improve their security posture.

Threat Intelligence Integration

Splunk Security integrates seamlessly with external threat intelligence feeds, enriching the analysis and detection capabilities. By leveraging threat intelligence data from trusted sources, organizations can stay ahead of emerging threats and proactively defend their infrastructure. Splunk’s threat intelligence integration empowers security teams to identify patterns, correlate events, and make well-informed real-time decisions.

Incident Response and Forensics

When a security incident occurs, time is of the essence—Splunk Security streamlines incident response by providing automated workflows, playbooks, and case management capabilities. Security teams can quickly investigate and triage alerts, gather evidence, and take necessary actions to contain and remediate the incident. Splunk’s forensics capabilities enable detailed analysis and post-incident investigations, helping organizations learn from past incidents and improve their security posture.

Common Security Challenges

Security Teams are under pressure.

Security teams face diverse challenges, from repetitive tasks to cumbersome processes. They often struggle with constant alerts, manual investigations, and the array of tools distributed throughout the organization.

Hundreds of security alerts arrive each day, overpowering analysts’ ability to fully investigate and resolve each one. As a result, security operations work is rife with monotonous, routine, and repetitive tasks and a complete lack of integration and process.

Lack of integration and process

Some security teams built their log analytics and incident response capabilities from the ground up. However, such a custom-made logging tool requires manually assembling correlated logs across too many custom-built and siloed point products.

Teams are expected to juggle disconnected security tools, consisting of static, independent controls with little or no integration.

In the current environment, many security teams have yet to establish workflows and standard operating procedures for different security events. As a result, analysts cannot act quickly and decisively when responding to an attack. The real problem is the manual process, especially manual scripting.

Issues of scripting

When using traditional scripting for automation, carrying out this capability across many security vendors will be challenging. In addition, each vendor may change the API for its product. As a result, the automation scripts must change, leading to management and maintenance challenges. Most teams will only be able to partially integrate tools and create automated workflows, and these difficult-to-maintain processes lead to a lack of context.

Diagram: Splunk Security.

Security Threats

Phishing, Ransomware, and Supply Chain

We have a rapidly changing threat landscape that includes everything from phishing to the proliferation of malware, supply chain attacks, and ransomware. Ransomware in particular has become pervasive and has grown considerably since early strains such as WannaCry. We now face a ransomware wave, with many ransomware families that encrypt in different ways.

Remember that ransomware applies malware to many endpoints simultaneously, so if you have a network design of extensive macro-segmentation with no intra-segment filtering, ransomware can compromise all hosts that hold valuable assets.

Below is an example of a phishing attack. I’m using the Credential Harvester to sniff credentials on a Google web template. The credential harvester, a credential stealer, is malicious software designed to steal sensitive login information from unsuspecting victims. Its primary targets are online banking platforms, email accounts, and social media platforms. By infiltrating a victim’s device, it quietly captures keystrokes, takes screenshots, or even intercepts network traffic to gather valuable login credentials.

♦ Safeguarding Against Credential Harvesters

Protecting oneself from the clutches of a credential harvester requires a proactive approach. Here are some essential tips to enhance your cybersecurity:

1. Strengthen Passwords: Use complex, unique passwords for each online account, incorporating a mix of uppercase and lowercase letters, numbers, and symbols.

2. Enable Two-Factor Authentication: Implement an additional layer of security by enabling two-factor authentication whenever available. This adds an extra step for authentication, making it harder for attackers to gain unauthorized access.

3. Exercise Caution with Emails and Links: Be vigilant when opening emails or clicking on links, especially from unknown senders. Avoid providing login credentials on suspicious websites or pop-up windows.

4. Keep Software Updated: Regularly update your operating system, antivirus software, and applications to ensure you have the latest security patches and protection against known vulnerabilities.

Malware will endeavor to destroy backups, perform data exfiltration, and then corrupt the data. Once the ransomware binaries have been executed and encryption starts, it’s game over.

How might the adversary hop from one machine to another without exploiting vulnerabilities? Some long-established tactics are known: remotely creating WMI processes, scheduling tasks, and building services. However, they often go unseen. You should focus on detection. For ransomware, we have roughly a five-day window; you will not catch the attackers with a manual process in such a short time.

Diagram: Splunk Enterprise Security. The threats.

Easy to evade: Malware is polymorphic

Despite innovations like next-generation anti-malware solutions, threat intelligence feeds, and government collaboration initiatives and mandates such as zero trust, many of these attack techniques evade even the most innovative security tools today. For example, malware is polymorphic and programmed to avoid common signatures and rules, and we know that the perimeter-based defense mechanisms have not worked for a while now.

It is hard to do things quickly and thoroughly

Detecting and responding to security events quickly takes a lot of work. A security analyst can spend hours on a single alert. Multiply that by the hundreds of security alerts they deal with daily. For example, it’s common for an analyst to spend 90 minutes on average investigating and containing a single phishing alert.

On top of that, a SOC could receive hundreds of phishing emails in a given day. Security analysts are overwhelmed with phishing alerts to investigate and respond to, and it takes too long to manually process each one before the potential threat causes damage. Phishing emails are a great starting point for Splunk SOAR to respond to automatically with low-code playbooks.

Diagram: Splunk ES.

Businesses also frequently add contractors and others with privileged access to their networks, so it becomes challenging to understand whether everyone complies with the security policies and best practices, or whether there are hidden risks in these activities. As a result, they face new challenges around secure configuration, software vulnerabilities, compliance, and maintaining an audit trail of access and training.

Splunk Security & Splunk ES: The Way Forward

Data Integration and Automated Response

So, you need to design security around data and build an approach to detect and respond to those risks. This requires a platform that can not only collect the data but also gain valuable insights from it. Of course, many platforms can collect data, but turning this data into valuable insights for security is an entirely different challenge.

Therefore, data integration and an automated response will play a more significant role in security. This is where Splunk Enterprise Security ( Splunk ES), Splunk SIEM, and Splunk SOAR products can assist.

We can’t stop all attacks; you will get breached even when adopting the most robust zero-trust principles. All we can do is find ways to mitigate risks promptly, and Splunk has a variety of security products that can help you do this.

One of the most critical ways to evolve and stay ahead is to look at data and derive helpful security insights that can help you detect and respond to known, unknown, and advanced threats and fully use automation and orchestration to improve your security posture.

Splunk Enterprise Security and Splunk SOAR

Automation is changing how teams traditionally use a Splunk SIEM. Splunk SOAR and Splunk Enterprise Security ( Splunk ES ) complement each other very well and allow us to improve security capabilities. So now we have a platform approach to security to fulfill diverse security use cases.

Introduction to Splunk SOAR

Splunk SOAR: Orchestration and automation  

The prime components of Splunk SOAR are automation and orchestration. With orchestration and automation, you can better support product-level workflows that allow security teams to automate complex processes across disparate products.

Introducing automation and orchestrating workflows and responses across your security stack will enable each previously siloed security product to participate more seamlessly in your defense strategy. So, we still have the unique tools, but Splunk SOAR is in the middle of orchestrating the events for each device with Playbooks.

A Splunk SOAR tool can easily thread the intelligence from multiple devices within the SOC, enriching alert data and surfacing it into a single interface. In addition, there is a playbook visualizer, so you can easily stick together security tasks.

Diagram: Splunk SOAR
  • A key point: Integrating existing security infrastructure

By automating the data collection and enrichment process from various sources, the analyst can see valuable details related to the alert as soon as it surfaces. This boosts your defenses by integrating existing security infrastructure, creating a mesh of more difficult-to-penetrate protection.

Splunk SOAR supports 350+ third-party tools and 2,400+ actions so that you can connect and coordinate workflows across teams and tools. This increases the speed of your investigation and response and unlocks value from previous investments. We will have a look at these playbooks in just a moment.

Introduction to Splunk Enterprise Security ( Splunk ES & Splunk SIEM )

Splunk Enterprise Security, the Splunk SIEM technology, is typically deployed to do the following security activities.

Splunk Enterprise Security 

  • Discover external and internal threats.

  • Monitor users’ activities

  • Monitor server and database resource access

  • Support compliance requirements 

  • Provide analytics and workflow

  1. Discover external and internal threats. This will help you detect compromised credentials and privileged attacks.
  2. Monitor users’ activities and specific types of users, such as those with privileged access and access to critical data assets. For example, this will help you see if users use the sysadmin tool Psexec or other means to move throughout the network laterally.
  3. Monitor server and database resource access and offer some data exfiltration monitoring capabilities. This can help you detect moments before Ransomware starts to encrypt your files.
  4. Support compliance requirements and provide compliance reporting.
  5. Provide analytics and workflow to support incident response, orchestrate and automate actions and workflows by integrating with other tools such as the SOAR.

Splunk ES & Splunk SIEM: The Value of Machine Data for Security

Splunk ES can carry out these activities by gathering unstructured data and turning it into valuable insight. For example, to understand the evidence of an attack and the movement of an attacker through an organization, we need to turn to machine data.

Armed with that data, security teams can remediate known threats better and proactively respond to new threats in real time to minimize any potential damage to the organization.

Machine data and monitoring

Data can come in many forms, such as standard logs. By ingesting your application logs into the Splunk SIEM, you can determine, for example, the latency in your application or the raw error rate of your web server. This can be carried out using a simple SPL query against the indexed events.

Then, we have a security use case, which is our main concern. Machine data can tell you where a specific attack is coming from or how many login attempts result from invalid user names.

Machine data is everywhere and flows from all the devices we interact with, making up around 90% of today’s data. And harnessing this data can give you powerful security insights. However, machine data can be in many formats, such as structured and unstructured. As a result, it can be challenging to predict and process.

Splunk SIEM: How Splunk Can Leverage Machine Data

This is where Splunk SIEM comes into play: it can take any data and create an intelligent, searchable index, adding structure to previously unstructured data. This will allow you to extract all sorts of insights, which can be helpful for security and user behavior monitoring. In the case of Splunk Enterprise Security (Splunk ES), it helps you get to know your data very quickly. Splunk is a big data platform for machine data. It collects raw, unstructured data and converts it into searchable events.

Splunk ES and Splunk SIEM Stage: Aggregates and Analyzes event data 

SIEM technology aggregates and analyzes the event data produced by networks, devices, systems, and applications, including security devices and network infrastructure. The primary data source has been time-series log data, but SIEM technology is evolving to process and leverage other forms of data.

  • Any source of data

At the heart of Splunk is the index, which collects data from virtually any source. As data enters Splunk Enterprise Security, Splunk examines it and determines how to process it; when it finds a match, it labels the data with a source type. The index contains your machine data from various servers, network devices, and web applications.

These events are then stored in the Splunk index. Once the events are in the Index, they can be searched. You can find events that contain values across multiple data sources so that you can run analysis and statistics on events using the Splunk search language.

Diagram: Splunk SOAR

Splunk ES and Splunk SIEM Stage: Searching and Analysis

Once data gets ingested into the index, it is available for searching and analysis. You can then save search results into reports that can be used to power dashboard panels. Value comes not just from tools that can sift through the volume of alerts and distractions: analysts must find the cause, impact, and best resolution across all infrastructure elements, including the applications, networks, devices, and human users.

Splunk ES and Splunk SIEM Stage: Notable Events and Incident Review

Splunk Enterprise Security allows you to streamline the incident management process. Consolidating incident management will enable effective lifecycle management of security incidents. This, in turn, enables rapid decision-making. Here, we automatically align all security contexts together for fast incident qualification. 

Splunk ES and Splunk SIEM Stage: Event Correlation Rule Management

With Splunk Security, we have a framework for rule management where we can manage all correlation rules across the system.

Detailed Information on Splunk SOAR 

Low-code playbooks

With automated playbooks orchestrating and executing actions across different point products, Splunk SOAR can automate repetitive tasks, investigation, and response. To carry out the automation, we have several playbooks that are considered low-code. Implementing low-code “playbooks” allows for the codification of processes, where automation can be applied to improve consistency and save time.

Phases and Tasks

So, we have seen how low-code playbooks can be used to automate tasks and integrate with security tools and other Splunk products. All of this is done with workbooks and phases. We can have a single workbook with several tasks to complete, and after executing these tasks, we can quickly start a separate phase or even a different workbook.

Diagram: Splunk SOAR

Splunk SOAR Integration with Other Products

Say you want to perform a containment action. This is where the SOAR platform can, for example, use Carbon Black in manual, semi-automatic, or fully automatic mode. Or you can use Zscaler for containment. There are several additional products that SOAR can integrate with.

In this scenario, there will be one phase, with several playbooks in that single phase. Some playbooks can be triggered automatically, and some are invoked manually. Others are run manually but will prompt for additional information.

These tasks are semi-automatic because they can automatically import data for you and enrich events from several platforms. This phase, which consists of a Risk Investigate workbook, is used as your initial triage.

Splunk SOAR Playbook Examples

Splunk SOAR Example: Phishing Investigation and Response

A typical phishing email investigation begins with analyzing the initial data and searching for artifacts. Some artifacts to investigate include attachments within the email, phishing links disguised as legitimate URLs, email headers, the sender’s email address, and even the entire email content.

Phishing Investigate and Respond Playbook 

In this use case, we will highlight the Phishing Investigate and Respond Playbook that automatically investigates and contains incoming phishing emails. The Playbook has a total of 15 actions available. Once Splunk SOAR receives a phishing email alert from a third-party source (e.g., fetching email directly from the mail server), it will automatically kick off the Playbook and begin analyzing the following artifacts: file reputation, URL reputation, and Domain Reputation.

If, during the investigation phase, the file, URL, IP address, or domain seems suspicious, the Playbook will use the predetermined parameters to decide to contain the threat by deleting the email from the user’s inbox.

  • Phishing Investigate and Respond Playbook

  • Crowdstrike Malware Triage Playbook

  • C2 Investigate and Contain Playbook

  • Recorded Future Indicator Enrichment Playbook

  • Recorded Future Correlation Response Playbook

Splunk SOAR Example: Endpoint Malware Triage

Although endpoint detection and response (EDR) or endpoint protection platform (EPP) tools can help monitor any suspicious activity within endpoints in your organization’s systems, these tools can generate many alerts — some of which could be false positives, while others are legitimate threats.

Fortunately, a SOAR tool can orchestrate decisions and actions to investigate, triage quickly, and respond to this high volume of alerts, filtering out the false positives, determining the risk level, and reacting accordingly.

Crowdstrike Malware Triage Playbook 

This playbook enriches the alert detected by Crowdstrike and provides additional context for determining the severity. Once all the information is collected, the analyst is prompted to review it. Based on the analyst’s choice, the file in question can be added to the custom indicators list in Crowdstrike with a detection policy of “detect” or “none,” and the endpoint can optionally be quarantined from the network by the analyst.

Splunk SOAR Example: Command and Control with Investigation and Containment

C2 Investigate and Contain Playbook

As soon as an alert for a command and control attack surfaces, Splunk SOAR will start the C2 Investigate and Contain Playbook. This Playbook is designed to perform the investigative and potential containment steps to handle a command-and-control attack scenario properly. It will extract file and connection information from a compromised virtual machine, enrich it, and then take containment actions depending on the significance of the data. Significant information includes files with threat scores greater than 50 and IP addresses with reputation status “MALICIOUS,” among other attributes.
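
The containment decision described above is essentially a threshold check. The sketch below models that logic in Python, reusing the significance values quoted in the text; the function name and artifact structure are hypothetical, for illustration only.

```python
def should_contain(artifact: dict) -> bool:
    """Decide whether a C2 artifact warrants containment.

    Mirrors the significance rules quoted above: files with a threat
    score greater than 50, or IPs with reputation status MALICIOUS.
    """
    if artifact.get("type") == "file":
        return artifact.get("threat_score", 0) > 50
    if artifact.get("type") == "ip":
        return artifact.get("reputation", "").upper() == "MALICIOUS"
    return False

# Example: both of these artifacts would trigger containment actions.
print(should_contain({"type": "file", "threat_score": 72}))       # True
print(should_contain({"type": "ip", "reputation": "MALICIOUS"}))  # True
```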

Splunk SOAR Example: Alert Enrichment

Indicators of Compromise

When investigating security alerts, you must first look at the indicators of compromise (IOCs), such as IP address, URL, user name, domain, hash, and other relevant criteria. This helps determine the severity of the alert. Many analysts manually dive into the data to search for additional context or hop between different threat intelligence platforms to gather more information.

Recorded Future Indicator Enrichment Playbook 

The Recorded Future Indicator Enrichment Playbook enriches ingested events that contain file hashes, IP addresses, domain names, or URLs. Contextualizing these details around relevant threat intelligence and IOC helps accelerate the investigation. Recorded Future is a security intelligence platform that provides additional context for analysts to respond to threats faster. 

Recorded Future Correlation Response Playbook 

The Recorded Future Correlation Response Playbook gathers more context about the relevant network indicators in response to a Splunk correlation search. Once there’s enough context, the Playbook automatically blocks access upon an analyst’s approval. By comparing traffic monitoring data with Recorded Future bulk threat feeds, Splunk identifies high-risk network connections and forwards them to Splunk SOAR. 

Splunk SOAR queries Recorded Future for details about why the network indicators are on the threat list and presents a decision to the analyst about whether the IP address and domain names should be blocked.

This example uses Layer 4 Traffic Monitoring by Cisco WSA as the network monitoring data source. Cisco Firepower NGFW and Cisco Umbrella can enforce blocking actions at the perimeter using DNS sinkholes. Once the analyst secures network access via the Recorded Future Correlation Response Playbook, Splunk SOAR can trigger a second playbook to investigate, hunt, and block a URL.

Zscaler Hunt and Block URL Playbook

When a suspicious URL is detected, the Zscaler Hunt and Block URL Playbook can identify internal devices that have accessed that URL and triage the organizational importance of those devices. 

Then, depending on the maliciousness of the URL and whether or not the affected device belongs to an executive in the organization, the URL will be blocked, and an appropriate ServiceNow ticket will be created. This Playbook integrates with VirusTotal, Zscaler, Microsoft Exchange, ServiceNow, Splunk, and Carbon Black. Use these pre-built playbooks to help your team save time tracking down malicious indicators so they can spend more time addressing critical tasks.

Highlights: Splunk Security

In today’s ever-evolving digital landscape, ensuring the security of your organization’s data and infrastructure has become paramount. One solution that has gained significant traction is Splunk Security. In this blog post, we will explore the capabilities and benefits of Splunk Security, and how it can empower your defense strategy.

Section 1: Understanding Splunk Security

Splunk Security is a comprehensive platform designed to help organizations monitor, detect, and respond to security threats effectively. By aggregating and analyzing data from various sources, it provides real-time insights into potential risks and vulnerabilities.

Section 2: Key Features and Functionality

Splunk Security offers a wide range of features that enable proactive threat hunting, incident response, and security analytics. From its powerful search and correlation capabilities to its customizable dashboards and visualizations, Splunk Security provides security teams with a holistic view of their environment.

Section 3: Threat Intelligence Integration

One of the key strengths of Splunk Security is its ability to integrate with external threat intelligence feeds. By leveraging up-to-date threat intelligence data, organizations can enhance their threat detection capabilities and stay ahead of emerging threats.

Section 4: Automation and Orchestration

To address the ever-increasing volume and complexity of security incidents, Splunk Security offers automation and orchestration capabilities. By automating repetitive tasks and orchestrating incident response workflows, security teams can streamline their processes and respond to threats more efficiently.

Section 5: Advanced Analytics and Machine Learning

Splunk Security leverages advanced analytics and machine learning algorithms to identify patterns, anomalies, and potential indicators of compromise. These capabilities enable early detection of threats and provide valuable insights for proactive mitigation strategies.

Conclusion:

In conclusion, Splunk Security is a powerful and versatile solution that can significantly enhance your organization’s defense strategy. By leveraging its comprehensive features, integrating threat intelligence, harnessing automation and orchestration, and utilizing advanced analytics, you can stay one step ahead of cyber threats. Embrace the power of Splunk Security and fortify your security posture today.

Cisco Umbrella CASB

In today’s digital landscape, the cloud has become an indispensable part of businesses of all sizes. However, with the increasing reliance on cloud services, ensuring the security of sensitive data and preventing unauthorized access has become a paramount concern. This is where Cisco Umbrella CASB (Cloud Access Security Broker) comes into play. In this blog post, we will explore the key features and benefits of Cisco Umbrella CASB and how it can help organizations fortify their cloud environment.

Cisco Umbrella CASB is a comprehensive cloud security solution that provides visibility, control, and protection across cloud applications and services. It acts as a gatekeeper, enabling organizations to enforce security policies, detect and prevent threats, and ensure compliance in the cloud.

Table of Contents

Highlights: Cisco Umbrella CASB


A Platform Approach

We must opt for a platform approach to visibility and control; more specifically, a platform that works in third-party environments. For cloud security, this is where secure access service edge (SASE) can assist. Cisco’s SASE offering includes Cisco Umbrella CASB, which comes in various versions depending on your needs.

The SASE Cisco umbrella CASB solution has a variety of CASB security functions and CASB tools, Data Loss Prevention (DLP), and Umbrella Remote Browser Isolation (RBI), which can help you better understand and control your environment.

Automatic Discovery and Risk Profiling

Manually investigating and mapping traffic patterns, data movement, and usage does not scale. For this, we need automatic discovery and risk profiling. You need visibility into the applications, files, and data you know about, but also into the ones you do not. You will be amazed by the number of malicious files and data already in sanctioned applications.


Related: For pre-information, you may find the following helpful:

  1. SD WAN SASE
  2. Cisco Secure Firewall
  3. SASE Model
  4. Cisco CloudLock


Back to Basics: Cisco Umbrella CASB

The Role of SASE

The Cisco Umbrella SASE solution offers other security functionality, such as a cloud-delivered Layer 7 firewall, Secure Web Gateways (SWG), DNS-layer security, SD-WAN, and ThousandEyes integration for monitoring and observability. So, we have the traditional security stack you are familiar with, plus enhancements that make the stack more cloud-friendly. These functionalities are part of a single SASE solution, managed from the Cisco Umbrella dashboard with API integrations.

Cisco Umbrella SASE Features

  • Cloud Access Security Broker and Data Loss Prevention (in-line)

  • DNS-Layer Security

  • Remote Browser Isolation

  • Secure Web Gateways (SWG)

  • Layer 7 Firewall

Key Features of Cisco Umbrella CASB

1. Cloud Application Discovery and Visibility: Cisco Umbrella CASB offers deep visibility into cloud applications and services being used within an organization. It helps identify shadow IT and provides insights into data usage and user behavior.

2. Data Protection and Compliance: With advanced data loss prevention (DLP) capabilities, Cisco Umbrella CASB helps organizations prevent the leakage of sensitive data in the cloud. It enables granular policy enforcement, encryption, and monitoring to ensure compliance with industry regulations.

3. Threat Detection and Response: Cisco Umbrella CASB employs powerful threat intelligence and machine learning algorithms to detect and mitigate cloud-based threats. It provides real-time alerts, anomaly detection, and proactive incident response capabilities to defend against cyber-attacks.

Benefits of Cisco Umbrella CASB

1. Enhanced Cloud Security: By integrating seamlessly with cloud platforms and applications, Cisco Umbrella CASB offers centralized security management and protects against data breaches, malware, and unauthorized access attempts.

2. Improved Visibility and Control: With comprehensive visibility into cloud activity, organizations can gain insights into user behavior, identify risky applications, and enforce policies to control their cloud environment.

3. Streamlined Compliance: Cisco Umbrella CASB helps organizations meet the stringent compliance requirements of various industries by offering robust data protection, encryption, and auditing capabilities.


Use Case: Cisco Umbrella CASB

The Cisco Umbrella CASB fulfills a variety of CASB security use cases. Which use case applies depends on where you are in your SASE and cloud security journey. For example, if you only need to block malware and filter content, Umbrella DNS filtering may suffice.

Umbrella Security Features

However, you may be looking for additional security requirements. For example, you will need Data Loss Prevention (DLP), Cloud Access Security Brokers (CASB), and Umbrella Remote Browser Isolation (RBI). In that case, we need to move toward Umbrella SIG, which includes Layer 7 Firewalls. Cisco Umbrella offers several packages ranging from DNS Security Essentials to SIG Advantage. More information can be found here: Cisco Umbrella Packages.

Along with these security features, Cisco Umbrella also provides continuous file monitoring: it scans data at rest in sanctioned applications for files that could be malicious. These tools will improve your security posture and protect organizations against cloud-specific risks.

This post will examine how to start discovering and controlling applications with Cisco Umbrella. The Cisco Umbrella CASB components take you from initial Discovery to understanding Risk to ongoing control, restricting access to specific applications for certain users and actions.

The Cisco Umbrella’s Data Loss Prevention (DLP), Cloud Access Security Brokers (CASB), and Remote Browser Isolation engines carry out these security activities.


Cisco Umbrella CASB
Diagram: Cisco Umbrella CASB.


Cloud security threats

Today’s shared challenge is that organizations often do not know what applications are in their environment, what to do with specific types of data, or how to find users and assign policies to them. These requirements must be met on someone else’s infrastructure: the cloud.

There are significant risks to working in cloud environments that differ considerably from on-premises ones. Consider storage, for example: unprotected storage environments pose a much greater security risk in the public cloud than in a private data center.

Within an on-premises private data center, firewall controls generally restrict direct access to storage, limiting the exposure of an unprotected file to users who already have access to data center systems. On the other hand, an improperly managed storage bucket in the public cloud may be exposed to the entire Internet after only a few clicks by a single person, or by an automated playbook lacking role-based access control (RBAC).

Umbrella Remote Browser Isolation

What is Remote Browser Isolation? Browsing the Internet is a dangerous activity. Unfortunately, threats abound: malicious JavaScript, malvertising, exploit kits, and drive-by downloads. All of these target users as they interact with web content via their browsers.

Typically, when a user’s browser is compromised, the attacker gains access to the machine the browser runs on. However, the bad actor’s target assets are rarely on the first machine compromised, so they will commonly proceed to move laterally throughout the network.

Lateral Movements

Unfortunately, the tooling used to move laterally is often a legitimate sysadmin tool, which makes it hard to detect. As a security best practice, it is much better to eliminate the opportunity for lateral movement altogether.

However, with Umbrella Remote Browser Isolation (RBI), the remote browser runs in an isolated container in the cloud, reducing the attack surface to an absolute minimum and removing the potential for lateral movement. The most sensible thing to do is to isolate the browsing function. With browser isolation technologies, malware is kept off the end user’s system, shifting the risk of attack to the server sessions, which can be reset to a known good state on every new browsing session, tab opened, or URL accessed.

Umbrella Remote Browser Isolation protects users from malware and threats by redirecting browsing to a cloud-based host, which for some vendors is based on containerized technology. Isolation is achieved by serving web content to users via a surrogate browser spun up remotely in the cloud.


Umbrella Remote Browser Isolation
Diagram: Umbrella Remote Browser Isolation.


Umbrella Remote Browser Isolation allows users to access whatever content they want, such as a web location or a document. The user is sent via an isolation engine that strips away anything potentially malicious, such as macros or malware, and is then given a fully rendered version of the content.

For example, this could be a web app or a website. So, with remote browser isolation, you scrub away anything that could be malicious and deliver a rendered, clean version.

To the user, it is fully transparent; they have no idea they are looking at a rendered version, but they get a clean and safe piece of content that will not introduce malware into the environment, and without a performance hit.


Cisco Umbrella CASB

You can use Cisco Umbrella CASB to discover your actual usage of cloud services through multiple means, such as network monitoring, integration with existing network gateways and monitoring tools, or even monitoring Domain Name System (DNS) queries. This is a form of discovery service that the CASB solution provides.

This is the first step in CASB security: understanding both sanctioned and shadow I.T. Once the different services are discovered, a CASB solution can monitor activity on approved services through two standard deployment options.

First, we have an API connection or inline (man-in-the-middle) interception; some vendors offer a multimode approach. Both deployment modes have their advantages and disadvantages.
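
To illustrate the DNS-based flavor of discovery mentioned above, here is a rough Python sketch that tallies cloud-app hits from a list of queried domain names. The domain-to-app catalog and log format are hypothetical; a real CASB consults a far larger catalog enriched with risk metadata.

```python
from collections import Counter

# Hypothetical catalog mapping DNS domains to the cloud apps behind them.
KNOWN_APPS = {
    "dropbox.com": "Dropbox",
    "slack.com": "Slack",
    "salesforce.com": "Salesforce",
}

def discover_apps(queried_names):
    """Tally cloud-app usage from DNS query logs (one domain per entry)."""
    seen = Counter()
    for name in queried_names:
        qname = name.strip().lower().rstrip(".")
        for domain, app in KNOWN_APPS.items():
            if qname == domain or qname.endswith("." + domain):
                seen[app] += 1
    return seen

print(discover_apps(["files.dropbox.com.", "login.salesforce.com", "intranet.local"]))
# Counter({'Dropbox': 1, 'Salesforce': 1}); any matched app not on the
# sanctioned list is a shadow-IT candidate worth investigating.
```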

CASB solution
Diagram: CASB solution


The CASB alone is far from a silver bullet and works in combination with other security functions. The power of Cisco Umbrella CASB depends on its Data Loss Prevention (DLP) capabilities, which can be either part of the CASB solution or an external service, depending on the CASB vendor’s capabilities. Cisco Umbrella has an inline DLP engine.

Data Loss Prevention

After Discovery is performed, CASB security can be used as a preventative control to block access to SaaS products. This functionality, however, is quickly being replaced through the integration of DLP. DLP systems inspect network traffic leaving your systems, looking for sensitive data. Traffic carrying unauthorized data is terminated to protect it from loss and leakage.

Through integration with a DLP service, you can continue to allow access to a SaaS product, but you can control what is being done within that SaaS product. So, for example, if somebody uses Twitter, you can restrict specific keywords or statements from being sent to the platform.

For example, if you use an application like Salesforce in the cloud and have a policy that customer databases must not be copied or downloaded from Salesforce, the CASB solution can enforce that policy as well as monitor whether someone attempts to download data or otherwise violate it.


Data Loss Prevention
Diagram: Data Loss Prevention.


Cisco Umbrella CASB: SASE Capabilities

Cisco Umbrella’s CASB, DLP, and Umbrella Remote Browser Isolation (RBI) offering is a core part of Cisco’s overall SASE strategy. The value of CASB security comes from its capability to give insight into cloud application use across cloud platforms and to identify unsanctioned use.

CASBs use auto-discovery to detect cloud applications and identify high-risk applications and users. In addition, they include DLP functionality and the capability to detect and provide alerts when abnormal user activity occurs to help stop internal and external threats. This enables Cisco Umbrella to expose shadow I.T. by providing the capability to detect and report on the cloud applications used across your environment.


Cisco Umbrella Visibility

App Discovery provides:

  • Extended visibility into cloud apps in use and traffic volume

  • App details and risk information

  • The capability to block or allow specific apps

Now, we have a central place for all applications. Cisco Umbrella CASB looks at all your cloud applications and brings them into a single pane of glass where you can manage them and see what is happening. So, instead of going to a hundred different applications and cloud providers, you go to one system: your CASB solution, which handles everything.

Pillar 1: Visibility

The CASB security should detect all cloud services, assign each a risk ranking, and identify all users and third-party apps able to log in. More often than not, there are many power users, such as in finance, who have access to large data sets. Files are shared and exposed, content within those files is used, and apps are installed.

Generally, a small minority of users controls most applications, and it is these few users who introduce a considerable amount of security risk. In addition, these users often collaborate with several external parties through cloud-based sharing, not to mention sharing with non-corporate email addresses.

CASB Security
Diagram: CASB Security.


  • A key point: Understanding risk.

The first thing you want to do is understand the risk. Here, you can identify risky applications by gaining visibility into any shadow I.T.: the apps that admins have no control over or visibility into, yet which are being used in the environment they need to protect.

You can also dig into which identities use these applications and why they are used. How do you gain this visibility? You may be wondering how you get all this data; a few sources can be used for the discovery we will discuss.

Applications in your environment can be displayed in different categories, with risk broken down by different criteria. For example, there is business risk, usage risk, and vendor compliance, and each risk category is made up of different factors. Cisco Umbrella CASB integrates with Cisco Talos, which provides reputation information by looking at the host, domain, and URL associated with an app, informing you whether the app has a good reputation.

Pillar 2: Discovery

To gain visibility, we have to perform Discovery. The discovery process involves pulling log data out of other security products and then analyzing the information. All of the capabilities to discover apps work out of the box; you only need to send user traffic to the Umbrella system. The first source is DNS; we can also discover through the Secure Web Gateway (SWG) proxy and the cloud-delivered firewall.

These SASE engines offer a unique view of sanctioned and unsanctioned applications. If you send traffic through one of these Cisco Umbrella engines, it can collect this data automatically. Cisco Umbrella also has a Layer 7 application firewall that can provide details such as application protocols, giving you the top-used protocols per application.

Umbrella Remote Browser Isolation
Diagram: Cisco Umbrella CASB and the Discovery process.


Umbrella has several engines that help with Discovery, such as the native proxy, the firewall, and DNS logs. The user can be determined whenever an engine picks up the traffic, whether at the DNS or firewall level. This gives you a holistic view of each application, such as its associated risk and identity on a per-app basis. We can then take a broader look at risk to understand cloud apps and traffic going to, for example, malware hosts and command-and-control (C2) servers, and whether any Tor endpoints are running on your network.

Pillar 3: Data Security and Control

When dealing with any systematic issue, prevention is critical, with a focus on data protection. A good start is to define which applications are risky. From there, you can build a workflow and identify the data sets you need to protect from, for example, data leakage. Once Discovery is performed along with risk assessment, you can block unwanted applications in your environment, which is the first step in enforcement.

The first component is CASB security, then DLP to enforce controls; we create DLP policies to prevent data leakage. The CASB should be able to identify and control sensitive information, so here we have DLP features and the capability to respond to classification labels on content.

There is also a component called granular control, in which you can allow access to specific applications while controlling different actions for particular applications and users. For example, you can enable access to an app but block uploads, and then tie this to an identity so that only your finance team can upload. You can allow, secure, and also isolate. The CASB DLP can operate natively and in conjunction with enterprise DLP products via Internet Content Adaptation Protocol (ICAP) or REST API integration.

A common DLP engine for both on-premises and cloud locations eliminates policy duplication. The Cisco Umbrella solution opts for an inline DLP engine without the need to service-chain to an additional appliance.

 

Inline Data Loss Prevention

The Data Loss Prevention policy monitors content classified as personally identifiable or sensitive information. When necessary, content is blocked from being uploaded or posted. With Cisco DLP, there is only one data loss prevention policy.

Rules are added to the policy to define what traffic to monitor (identities and destinations), the data classifications required, and whether content should be blocked or only monitored. For example, an office may want to monitor its network for file uploads that include credit card numbers because such uploads breach company privacy and security policies. A rule that scans uploads to domains across the network can block these files.

Cisco Umbrella: 80 pre-built data Identifiers

There are two primary functions of DLP: the first is identifying and classifying sensitive data; the second is the action to take. Cisco Umbrella has robust DLP classification with over 80 pre-built data identifiers, backed by detailed reporting on every DLP event. Working with DLP, you first have to select a data classification. This is where you start your DLP and choose identifiers for the data. If you are concerned with financial data sets and want to examine credit card numbers, you can choose from the list of pre-built identifiers and then add your own customizations.

Data Loss Prevention
Diagram: Data Loss Prevention.


The Cisco Umbrella DLP engine also supports regular expressions for pattern matching, allowing you to match any pattern. So, we have custom identifiers alongside the pre-built ones, which we then apply to a DLP policy. As noted, there is only one data loss prevention policy; rules are added to it to define what traffic to monitor (identities and destinations), the data classifications required, and whether content should be blocked or only monitored.
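
To illustrate what a custom, regex-based identifier might look like, here is a hedged Python sketch of a credit-card detector: a regular expression finds candidate digit runs, and a Luhn checksum filters out false positives. The pattern and the idea of pairing it with a checksum are illustrative; this is not Umbrella’s internal implementation.

```python
import re

# Candidate: 13-16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum used to validate card-number candidates."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list:
    matches = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_ok(digits):
            matches.append(digits)
    return matches

print(find_card_numbers("order ref 4111 1111 1111 1111, thanks"))  # ['4111111111111111']
```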

Deployment: CASB Solution

CASBs operate using two approaches. Inline CASB solutions reside in the connection path between users and services. They may do this through a hardware appliance or an endpoint agent that routes requests through the CASB. This approach requires configuring the network and endpoint devices; however, it provides the advantage of seeing requests before they are sent to the cloud service, allowing the CASB to block submissions that violate policy.

API-based CASB solutions do not interact directly with the user but rather with the cloud provider through the provider’s API. This approach provides direct access to the cloud service and does not require any user device configuration.

However, it also does not allow the CASB to block requests that violate policy. As a result, API-based CASBs are limited to monitoring user activity and reporting on or correcting policy violations after the fact.


Starting a SASE Project

DLP starting points

As a starting point, when considering DLP, there are a couple of best practices to follow. First, you must “train” a DLP to understand what is sensitive data and what is not. Early on, run the DLP in monitoring-only mode rather than aggressively blocking; you want to understand what is happening before you start to block.

Sometimes, you want to understand more about the data, its identifiers, and where it moves. Second, a DLP generally cannot inspect encrypted traffic; if it can, check the performance hit. Third, some cloud SDKs and APIs may encrypt portions of data and traffic, which will interfere with the success of a DLP implementation.

With Cisco Umbrella, as a best practice, you can start with the pre-built identifiers and create custom dictionaries to monitor your organization’s specific keywords and phrases. Then, you can create specific rules based on the users, groups, devices, and locations you want to watch data for. Finally, you can choose which destinations and apps you would like to monitor; many organizations choose only to monitor when creating DLP rules and then enable blocking over time.


CASB Solution

  • Discover all applications

  • Calculate risk

  • Apply controls to identities

  • Detect and mitigate threats

Data Loss Prevention

  • Train the DLP engine

  • Do not be aggressive

  • Check encrypted traffic support

  • Use the pre-built identifiers


Cisco Umbrella CASB starting points

Consider the following recommendations when starting a project that includes CASB functionality. First, discover sanctioned and unsanctioned cloud services, and then assess the cloud risk based on cloud service categories. This includes all cloud services and cloud plug-ins. Once this information has been gathered, it can be measured, along with risk, and compared to the organization’s risk tolerance.

Next, identify and protect sensitive information. Once you find all sensitive information in the cloud, you can classify it and then apply controls, such as DLP, to govern its movement. For example, additional protections can be applied if sensitive data is moved from cloud services to a local unmanaged laptop.


  • A final note: Detect and mitigate threats.

You can assess users’ behavior and any deviations that may signal out-of-normal activity. The CASB is one of many solutions that can be used here; more mature products, such as Splunk User Behavior Analytics (UBA), offer advanced detection. For example, once a significant deviation from the baseline is noticed, trust decreases, and you could implement step-down privileges or take more extreme measures, changing the level of access. In addition, it helps to track the movement of all data, detect and eliminate malware, and have an implementation strategy for remediation.


Summary: Cisco Umbrella CASB

In today’s digital landscape, businesses are rapidly adopting cloud technologies to drive innovation and enhance productivity. However, this shift towards the cloud also introduces new security challenges. Enter Cisco Umbrella CASB, a comprehensive cloud access security broker solution that empowers organizations to safely navigate their cloud journey while ensuring data protection and compliance.

Section 1: Understanding Cisco Umbrella CASB

Cisco Umbrella CASB is a robust platform that provides visibility, control, and protection across all cloud applications and services utilized by an organization. It offers a centralized console to manage cloud access, enforce security policies, and detect potential threats. With its advanced capabilities, Cisco Umbrella CASB enables businesses to embrace the cloud securely.

Section 2: Key Features and Benefits

a) Cloud Application Visibility: Cisco Umbrella CASB offers deep visibility into cloud applications and services being used within an organization. It provides valuable insights into user activities, data transfers, and potential risks, allowing administrators to make informed decisions.

b) Policy Enforcement: With granular policy controls, Cisco Umbrella CASB enables organizations to define and enforce security policies tailored to their specific needs. It ensures that data is accessed, shared, and stored within the cloud according to predefined guidelines, reducing the risk of data breaches or unauthorized access.

c) Threat Detection and Response: By leveraging advanced threat intelligence and machine learning, Cisco Umbrella CASB proactively identifies and mitigates potential threats within cloud environments. It alerts administrators about anomalous activities, suspicious behavior, or policy violations, enabling swift incident response.

Section 3: Seamless Integration and Scalability

Cisco Umbrella CASB seamlessly integrates with existing security infrastructure, including firewalls, proxies, and endpoint security solutions. This integration allows businesses to leverage their existing investments while extending comprehensive cloud security capabilities. Additionally, the solution scales effortlessly as organizations expand their cloud footprint, ensuring continuous protection.

Section 4: Real-World Use Cases

a) Data Loss Prevention: Cisco Umbrella CASB helps prevent sensitive data leakage by monitoring and controlling data transfers within cloud applications. It enables organizations to set up policies that restrict the sharing of confidential information or personally identifiable data, reducing the risk of data loss incidents.

b) Compliance and Governance: With its robust auditing and reporting capabilities, Cisco Umbrella CASB assists organizations in meeting regulatory compliance requirements. It provides detailed logs and insights into user activities, ensuring transparency and accountability in cloud usage.

Section 5: Conclusion

Cisco Umbrella CASB is a game-changer in the realm of cloud security. Its comprehensive feature set, seamless integration, and scalability make it an invaluable asset for organizations aiming to secure their cloud journey. By harnessing the power of Cisco Umbrella CASB, businesses can unlock the true potential of the cloud while safeguarding their critical assets and maintaining compliance.

DNS Security

DNS Security Solutions

In today's digital landscape, where cybersecurity threats loom large, it is crucial to fortify your online presence. DNS (Domain Name System) security is an often overlooked aspect of online security. In this blog post, we will delve into the world of DNS security solutions, exploring their significance and the measures you can take to protect your digital assets.

The Domain Name System is the backbone of the internet, responsible for translating user-friendly domain names into IP addresses that computers can understand. However, this critical function also makes DNS vulnerable to cyberattacks. This section will discuss the potential risks and consequences of DNS attacks, highlighting the need for robust security measures.

Table of Contents

Highlights: DNS Security Solutions

No Security By Default

This post will outline the domain name system: the DNS structure, the vulnerabilities and abuses of DNS security design, and guidance on implementing DNS protection, with examples of DNS security solutions from Cisco, such as Cisco Umbrella DNS. Unfortunately, like many Internet protocols, the DNS system was designed without security in mind and contains several security limitations regarding privacy, integrity, and authenticity.

Constant Security Threats

These security constraints, combined with bad actors’ technological advances, make DNS servers vulnerable to a broad spectrum of DNS attack vectors, including DNS reflection attacks, DNS tunneling, DoS (Denial of Service), and the interception of private personal information through data exfiltration over the DNS protocol. As you can presume, this makes the DNS layer an excellent avenue for bad actors when penetrating networks and exfiltrating data.

Related: For pre-information, you will find the following posts helpful:

  1. OpenShift SDN
  2. GTM Load Balancer
  3. Open Networking
  4. SASE Model



DNS Security Cisco

Key DNS Security Solutions Discussion points:


  • Numerous attacking DNS vectors.

  • Decentralized but not secure.

  • DNS queries are not encrypted.

  • Privacy, Integrity and Authenticity do not exist.

  • Issues with UDP as transport.

  • Cisco DNS Security with Cisco Umbrella DNS

  • DNS Solution and PKI.

Back to Basics: DNS Security Solutions

♦ DNS Caching

The whole resolution process may seem convoluted; however, it is usually relatively fast. One of the features that speeds it up considerably is caching. A nameserver processing a recursive query may have to send out several queries to find an answer. However, it discovers a lot of information about the domain namespace as it does so.

Each time it refers to another list of nameservers, it learns that those nameservers are authoritative for some zone, and it knows the addresses of those servers. At the end of the resolution process, when it finally finds the data the original querier sought, it can also store it for future reference.
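
You can watch caching in action from the client side. The sketch below, assuming the third-party dnspython library is installed, resolves a name twice and prints the record’s TTL; against a caching resolver, the TTL typically counts down between queries because the answer is served from cache.

```python
import time
import dns.resolver  # third-party: pip install dnspython

for attempt in range(2):
    answer = dns.resolver.resolve("www.example.com", "A")
    # The TTL tells the cache how long it may keep this answer.
    print(f"attempt {attempt + 1}: TTL = {answer.rrset.ttl}s, "
          f"IPs = {[r.address for r in answer]}")
    time.sleep(2)
```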

♦ Types of DNS Attacks

DNS attacks come in various forms, each with modus operandi and potential damage. From DDoS attacks that flood servers to cache poisoning that redirects users to malicious websites, understanding these attack vectors is crucial for implementing adequate security strategies. This section will shed light on some common types of DNS attacks.

♦ DNS Security Solutions

Thankfully, several DNS security solutions are available to safeguard your online presence. This section will explore some of the most effective and widely used security measures. From implementing DNSSEC (DNS Security Extensions) to deploying firewalls and intrusion detection systems, we will discuss how these solutions can help mitigate DNS-related threats.

♦ Best Practices for DNS Security

While deploying DNS security solutions is essential, following best practices to enhance your security posture is equally important. This section will outline some key best practices for DNS security, including regular patching and updates, monitoring DNS traffic, and employing multi-factor authentication. By adopting these practices, you can bolster your defenses against potential threats.

DNS Layer Security: Decentralized but not secure

The DNS protocol was developed to be decentralized and hierarchical, though not secure, and almost since its inception, there have been exploits. We must protect this critical network service, and several technologies have been implemented for DNS protection. These security technologies can be delivered through secure access service edge (SASE) products such as Cisco Umbrella DNS, which stops threats such as malware before the initial connection.

DNS Protection: Are DNS queries encrypted?

DNS queries are not encrypted. Even if users use a DNS resolver like 1.1.1.1 that does not track their activities, DNS queries travel over the Internet in plaintext, and anyone who intercepts a query can see which websites the user is visiting. This absence of privacy impacts security significantly: if DNS queries are not private, it becomes easier for governments to censor the Internet and for bad actors to snoop on users’ online behavior without their knowledge.

DNS Protection with Privacy, Integrity, and Authenticity

So, with DNS, the primary things we care about in security are missing. In security, we care about privacy, integrity, and authenticity. With DNS left at its defaults, there is no privacy: all DNS queries are visible in plain text. For integrity, we want to know whether someone has made changes between the query and response stages. Finally, for authenticity, we have no way of knowing that the DNS server that responded is the server we intended to talk to, rather than some man-in-the-middle snooping on and intercepting the DNS queries and forging responses, leading users to malicious websites.

These concerns have directed us to introduce technologies for DNS protection. Some DNS protection technologies include the DNS firewall, DNS as a security tool with DNS reputation and inspection, and secure the channel with DNS over TLS (DoT) and DoH (DNS over HTTPS), as well as security protocol implementations with DNSSEC. When implemented correctly, all of this helps restore the privacy, integrity, and authenticity security issues we have with the current implementation of the DNS protocol.

DNS Security Solutions
Diagram: DNS Security Solutions.

DNS Protection: Lack of DNS Security Solutions

Early days of DNS

In the early 1980s, the network was much smaller, with fewer relatively well-known and trusted participants. However, as the network scaled, DNS remained an insecure and unauthenticated protocol, even though the networks grew to have many relatively unknown and untrusted participants.

Since the early 1980s, we have been stuck with this protocol. At that time, around a hundred hosts across the USA communicated with each other, using protocols such as FTP and SMTP. You still needed to find a host’s IP address back then, so you looked it up in a hosts file. If you wanted to be added to this hosts file, you literally had to call Stanford and request it, and the entry was written in manually for you.

Diagram: DNS Protection.

Before the network could scale, something had to be created to replace the host file. This was when the Domain Name System was designed. So, instead of a host file that must be manually edited for new hosts, we have delegation with hierarchy.

With the Domain Name System, we have the concept of hierarchy. There is a Root at the very top, which is responsible for the IP addresses of the servers for the TLDs, such as .com and .org; there are thousands of TLDs now, and each is responsible only for the domains within it, not for domains belonging to any other TLD.

DNS protection: DNS creates blind spots

Organizations widely trust DNS. The concept of trust in public versus private IP addresses boils down to binary numbers; neither is inherently more trustworthy, despite the excessive trust often placed on private IP ranges.

DNS traffic is typically permitted to pass freely through network firewalls and other security infrastructure. However, it is attacked and abused by bad actors with malicious intent: DNS traffic can be manipulated through techniques such as DNS tunneling and DNS poisoning, all of which create blind spots in your security posture.

The issue with UDP

Let us start with the basics: clients ask DNS when they want to connect to an address such as ‘www.network-insight.com’ and need to know which IP address corresponds to it. Typically, all DNS messages are sent over UDP, and this is where the problems start.

The first issue is that UDP is a stateless protocol and that source IP addresses are blindly trusted. Each request and response described here is a single UDP message containing to and from IP addresses.

Any host can forge the source address on a UDP message, making it look like it came from the expected source. Therefore, a bad actor sitting on a network that does not filter outbound packets can construct a UDP message that says it’s from the authoritative server and send it to the recursive resolver.

Diagram: Attacking DNS.
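
The sketch below, again assuming dnspython, makes the UDP weakness tangible: it builds a query, prints the 16-bit transaction ID (one of only 65,536 possible values), and sends it over plain UDP to a public resolver. Nothing in this exchange is encrypted or authenticated; the resolver matches the response to the query mainly by this ID, the source port, and the question name.

```python
import dns.message
import dns.query  # third-party: pip install dnspython

# Build a standard A-record query; dnspython picks a random 16-bit ID.
query = dns.message.make_query("www.example.com", "A")
print(f"transaction ID: {query.id} (only 65,536 possible values)")

# Plain, stateless UDP to a public resolver - no session, no encryption.
response = dns.query.udp(query, "8.8.8.8", timeout=3)
for rrset in response.answer:
    print(rrset)
```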

DNS Security Cisco with DNS Security Solutions:

Neglected attack surface

Today’s bad actors use DNS’s often neglected attack surface to steal data, spread malware, perform data exfiltration, run command and control, and conduct network surveillance, along with the capability to perform social engineering.

DNS is a bi-directional, Internet-facing protocol that carries a tremendous amount of data, making it an adversary’s most excellent tool for carrying out attacks and causing damage. In addition, the combination of security teams failing to secure their DNS traffic and the ubiquity of DNS makes it a bad actor’s most potent yet often forgotten tool.

While organizations have solutions that inspect and secure areas like the network with a stateful firewall and web traffic with Secure Web Gateways (SWG), and even some of the newer zero trust technologies, these solutions cannot perform a deep inspection of DNS traffic, leaving them vulnerable to the many threats that abuse DNS today. This is because they are not designed to inspect DNS traffic. As a result, techniques such as DNS tunneling can go unnoticed.

In most instances, DNS packets – typically including IP address information – enter networks via unblocked ports without first being inspected by security systems. So, again, DNS activity in a network is rarely monitored. This makes the DNS layer the perfect blind spot for bad actors to manipulate.

Many of today’s sophisticated attacks depend on DNS activity. In addition, malware is on the rise; ransomware binaries, once executed, are quick to encrypt, and you cannot trust that an employee will never click on a phishing email. The result is an environment of low trust and high complexity.

Bad actors take advantage of this, manipulating DNS and staging internet infrastructure to support each stage of an attack and fully execute their kill chain. In many of today’s more sophisticated ransomware attacks, for example, bad actors will use DNS packets to upload malware to a device.

Common DNS attacks include:

  • Zero-day attacks

  • Cache poisoning

  • Denial of Service (DoS) and Distributed Denial of Service (DDoS)

  • DNS amplification

  • Fast-flux DNS

Introduction to attacking DNS

The vulnerabilities and abuses of this protocol are extensive, and there are several methods of attacking DNS: for example, DNS poisoning, denial of service, spoofing/hijacking, and DNS tunneling.

Unless you have DNS-layer security, the DNS packets typically used to communicate IP addresses will not be inspected as they move through your network. Additionally, most security solutions do not even register anomalous DNS activity, like DNS tunneling, a sure sign of an in-progress attack. DNS tunneling uses the DNS protocol to communicate non-DNS traffic over port 53, sending HTTP(S) and other protocol traffic over DNS.

With DNS tunneling, attackers establish DNS tunnels between their servers and victims’ machines. This connection between attacker and victim allows for the exfiltration of sensitive data and the execution of command and control operations.

DNS Poisoning 

DNS Poisoning, or DNS cache poisoning, is where forged DNS data is submitted into a DNS resolver’s cache. This results in the resolver returning an incorrect IP address for a domain. Therefore, rather than going to the intended website, the user’s traffic can unknowingly be redirected to a malicious machine. More often, this will be a replica of the original site used for malicious purposes, such as distributing malware or collecting login information.

DNS poisoning was first uncovered in 1998. When a recursive server sent a query out to the Root over UDP, there was no connection, and the only thing that identified the returning response as belonging to the query was a Query ID, which was short. This made it possible to trick a DNS recursive resolver into storing incorrect DNS records. Once the nameserver has stored the wrong response, it will return it to anyone who asks.

This “DNS poisoning” attack could allow random attackers to deceive DNS and redirect web browsers to false servers, hijacking traffic. Furthermore, the incorrectly stored entry will remain until the cache entry expires, based on the TTL, which could lead to weeks of compromise.

DNS poisoning
Diagram: DNS poisoning.

So, not very long ago, if you attacked the server with forged responses for a domain and tried to brute-force the Query ID, you could eventually guess it and insert your response into that recursive server’s cache.

And if you set the TTL to a long time, such as a week, then everyone who queries that recursive server will get your chosen IP address for that domain name. Today, there have been changes to mitigate DNS poisoning: the query identifiers have been made very long and hard to guess, so the attack is difficult, but it can still happen.

DNS Spoofing

Then we have DNS spoofing, or hijacking, which is very easy to do and difficult to detect. For example, suppose you mistype a domain name and try to go somewhere that does not exist, yet you are returned a search page full of ads. This is the ISP hijacking NXDOMAIN responses: when you query for a name that does not exist, your ISP sees this, crafts its own response, and sends you to a search page to sell you ads. This commonly happens on public Wi-Fi networks.

So, DNS spoofing and DNS poisoning are similar attacks, but they have distinguishable characteristics. Both DNS attacks attempt to trick users into revealing sensitive data and could result in a targeted user installing malicious software that can be used later in the kill chain. Poisoning the DNS cache changes entries on DNS resolvers or servers where IP addresses are stored.

DNS Amplification Attack (DNS Flood)

Then, we have the DNS amplification style of DNS attack. They are also known as DNS floods. A bad actor exploits vulnerabilities to initially turn small queries into much larger payloads, which are used to bring down the victim’s hosts.

We know that DNS uses UDP for transport, meaning a bad actor can spoof the source address of a DNS request and have the response sent to any IP address of their choosing. In this case, they can amplify DDoS attacks using DNS responses that are larger than the initial query packet. For example, fake DNS lookups to open recursive servers can achieve a 25x to 40x amplification factor. Because the source IP of the bogus lookups is spoofed to be the victim’s address, the amplified responses overwhelm the victim’s site.
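
You can estimate an amplification factor yourself by comparing wire sizes. This hedged sketch, assuming dnspython, sends a small query requesting DNSSEC records (which inflate responses) and divides the response size by the query size; actual ratios vary by domain and resolver.

```python
import dns.message
import dns.query  # third-party: pip install dnspython

# DNSSEC-enabled queries tend to produce much larger responses.
query = dns.message.make_query("example.com", "TXT", want_dnssec=True)
response = dns.query.udp(query, "8.8.8.8", timeout=3)

q_len, r_len = len(query.to_wire()), len(response.to_wire())
print(f"query: {q_len} bytes, response: {r_len} bytes, "
      f"amplification: {r_len / q_len:.1f}x")
```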

DNS Flood Attack

DNS flood targets one or more DNS servers belonging to a given zone, attempting to impede the resolution of resource records of that zone and its sub-zones. This attack overwhelms the network capacity that connects authoritative servers to the Internet.

Once the bandwidth is depleted by malicious traffic, legitimate traffic carrying DNS queries from legitimate sources cannot reach the authoritative servers. DNS flood attacks differ from DNS amplification attacks: unlike DNS floods, DNS amplification attacks reflect and amplify traffic off unsecured DNS servers to hide the attack’s origin and increase its effectiveness.

Diagram: DNS Security Solutions and flood attacks.

Random Subdomain Attack

Random subdomain DDoS attacks have become popular in recent incidents, such as the Mirai attack on Dyn. In these DNS attacks, many queries are sent for a single target domain (or a few), yet they include highly varying, randomly generated, nonexistent subdomains.

This denial-of-service attack hits a domain’s authoritative name servers with multiple requests for random, nonexistent subdomains. The name servers become bogged down replying to these phony requests and struggle to respond to legitimate queries. These attacks are also called NXDOMAIN attacks; they can result in denial of service at the recursive resolver level.

DNS Tunneling

Then, we have DNS tunneling, which we briefly mentioned. DNS tunneling is frequently used to deliver payloads encoded in DNS queries and responses, exfiltrate data, and execute command and control attacks as the attackers use SSH, TCP, or HTTP to pass, for example, Malware or stolen information into DNS queries undetected.

This allows the bad actor to exfiltrate sensitive data in small chunks within DNS requests to bypass security. With the amount of DNS traffic and requests a network typically sees, attackers can hide data theft easily.

The bad actor can use standard protocols like TCP or SSH, encoded within DNS protocol requests. While this is not an attack on DNS itself, this form of malicious activity uses DNS to exfiltrate data.

DNS Tunneling
Diagram: DNS Tunneling.
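
DNS tunneling often betrays itself through long, high-entropy query labels that carry encoded payloads. Below is a minimal heuristic detector in Python; the entropy and length thresholds are illustrative assumptions, and a production detector would also consider query rates, record types, and domain reputation.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character - encoded payloads score high."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_like_tunneling(qname: str, max_label=40, max_entropy=4.0) -> bool:
    # Inspect the leftmost label, where tunnels usually stuff their data.
    first_label = qname.split(".")[0]
    return len(first_label) > max_label or shannon_entropy(first_label) > max_entropy

print(looks_like_tunneling("www.example.com"))  # False
# A long, encoded-looking label is flagged as a tunneling candidate.
print(looks_like_tunneling("dGhpcyBpcyBhIGxvbmcgZXhmaWx0cmF0aW9uIGNodW5r.evil.example"))  # True
```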

DNS Security Cisco

Cisco Umbrella DNS: The DNS Firewall

There are several ways these attacks can be prevented. Firstly, the DNS firewall enables DNS layer security. DNS-layer security effectively prevents malicious activity at the earliest possible point and, in the case of Malware, contains callbacks to attackers. DNS security solutions can be accomplished with products such as Cisco Umbrella DNS.

DNS Security Cisco with DNS-layer security

Cisco Umbrella DNS uses DNS-layer security encompassing the Internet’s infrastructure to block malicious and unwanted domains before a connection is established as part of recursive DNS resolution. In addition, it utilizes a selective cloud proxy that redirects specific requests flagged as risky for a deeper, more thorough inspection.

Cisco Umbrella DNS accomplishes this process transparently through the DNS response without adding latency or degrading performance. Just as a standard firewall watches incoming and outgoing web traffic and blocks unsafe connections, a DNS firewall works the same way. The distinction is that DNS firewalls analyze and filter queries based on threat feeds and threat intelligence. There are two kinds of DNS Firewalls: those for recursive servers and those for authoritative servers.

A DNS firewall provides several security services for DNS servers. A DNS firewall sits between a user’s recursive resolver and the authoritative nameserver of the website or service they are trying to reach. This can help with reputation filtering and domain reputation.

Cisco Umbrella DNS: Secure the channel

We have DNS over TLS and DNS over HTTPS, two standards for encrypting DNS queries so that external parties cannot read them. DNS over TLS (DoT) and DoH (DNS over HTTPS) add a secure layer to an insecure protocol. By using DoH and DoT, users can ensure the privacy of DNS responses and block eavesdropping on their DNS requests (which would reveal the sites they are visiting).
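
DoH is easy to demonstrate: the query travels inside ordinary HTTPS. The sketch below uses Cloudflare’s public DoH JSON endpoint with the requests library; the endpoint and JSON fields follow Cloudflare’s published API, but treat the specifics as illustrative.

```python
import requests

resp = requests.get(
    "https://cloudflare-dns.com/dns-query",          # Cloudflare's public DoH endpoint
    params={"name": "www.example.com", "type": "A"},
    headers={"accept": "application/dns-json"},      # ask for the JSON wire format
    timeout=5,
)
resp.raise_for_status()
for answer in resp.json().get("Answer", []):
    print(answer["name"], answer["TTL"], answer["data"])
```

To an on-path observer, this lookup is indistinguishable from any other HTTPS request, which is exactly the privacy property DoH is after.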

Cisco Umbrella DNS: Secure the protocol

Application layers use security protocols such as HTTPS, DMARC, and others, so the DNS protocol should be no exception. DNS Security Extensions (DNSSEC) is a security protocol that defends against attacks by digitally signing data to help guarantee its validity. The signing must happen at every level in the DNS lookup process, which can make it a complicated setup.

DNSSEC was one of the first things the community started implementing, and it is much older than many assume; the first discussions about DNSSEC took place in the early 1990s. It is a way to ensure that a record you get back has not been tampered with and that the server you are talking to is the server you intend to talk to. All of this is done with PKI.

Cisco Umbrella DNS
Diagram: Cisco Umbrella DNS.

Public Key Infrastructure (PKI) 

The server has a public and private key pair: the private key signs the records, and the public key lets anyone verify the signatures. However, as DNS maintains a distributed hierarchy, we must guarantee that these keys are signed all the way up to the Root. DNSSEC implements a hierarchical digital signing policy across all layers of DNS.

For example, in the case of a ‘google.com’ lookup, a root DNS server would sign a key for the .COM nameserver, and the .COM nameserver would then sign a key for google.com’s authoritative nameserver. DNSSEC not only allows a DNS server to verify the authenticity of the records it returns, but it also enables the assertion of the “non-existence of records.”
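
You can inspect pieces of this chain yourself. The sketch below, assuming dnspython and a validating upstream resolver such as 1.1.1.1 or 8.8.8.8, fetches a zone’s DNSKEY records and checks whether the resolver set the AD (Authenticated Data) flag, which signals a successful DNSSEC validation.

```python
import dns.flags
import dns.resolver  # third-party: pip install dnspython

resolver = dns.resolver.Resolver()
# Set the DO bit so the upstream resolver performs DNSSEC processing.
resolver.use_edns(0, dns.flags.DO, 4096)

# The zone's public signing keys.
for key in resolver.resolve("example.com", "DNSKEY"):
    print(f"DNSKEY flags={key.flags} algorithm={key.algorithm}")

# AD flag set => the (validating) resolver verified the signature chain.
answer = resolver.resolve("example.com", "A")
print("validated:", bool(answer.response.flags & dns.flags.AD))
```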

DNS resolvers can also be configured to provide security solutions. For example, some DNS resolvers provide content filtering, which can block sites known to distribute malware and spam, and botnet protection, which blocks communication with known botnets. Many of these secure DNS resolvers are free to use.

Summary: DNS Security Solutions

This blog post delved into DNS security solutions, exploring the key concepts, benefits, and best practices for safeguarding one’s online activities.

Section 1: Understanding DNS Security

The DNS, often called the internet’s phonebook, translates domain names into IP addresses, allowing us to access websites by typing in familiar URLs. However, this critical system is susceptible to various security risks, such as DNS spoofing, cache poisoning, and DDoS attacks. Understanding these threats is crucial in comprehending the importance of DNS security solutions.

Section 2: DNS Security Solutions Explained

Several effective DNS security solutions are available to mitigate risks and fortify your online presence. Let’s explore a few key options:

  • DNS Filtering: This solution involves implementing content filtering policies to block access to malicious websites, reducing the likelihood of falling victim to phishing attempts and malware infections.
  • DNSSEC: Domain Name System Security Extensions (DNSSEC) provide cryptographic authentication and integrity verification of DNS data, preventing DNS spoofing and ensuring the authenticity of domain name resolutions.
  • Threat Intelligence Feeds: By subscribing to threat intelligence feeds, organizations can stay updated on emerging threats and proactively block access to malicious domains, bolstering their overall security posture.

Section 3: Benefits of DNS Security Solutions

Implementing robust DNS security solutions offers numerous benefits to individuals and organizations alike. Some notable advantages include:

– Enhanced Data Privacy: DNS security solutions protect sensitive user information, preventing unauthorized access or data breaches.

– Improved Network Performance: By filtering out malicious requests and blocking access to suspicious domains, DNS security solutions help optimize network performance and reduce potential downtime caused by cyberattacks.

– Mitigated Business Risks: By safeguarding your online infrastructure, DNS security solutions minimize the risk of reputational damage, financial loss, and legal repercussions due to cyber incidents.

Section 4: Best Practices for DNS Security

While investing in DNS security solutions is crucial, adopting best practices is equally important to maximize their effectiveness. Here are a few recommendations:

-Regularly update DNS software and firmware to ensure you benefit from the latest security patches and enhancements.

– Implement strong access controls and authentication mechanisms to prevent unauthorized access to DNS servers.

– Monitor DNS traffic for anomalies or suspicious activities, enabling prompt detection and response to potential security breaches.

Conclusion:

In an era where online threats continue to evolve, prioritizing DNS security is vital for individuals and organizations. By understanding the risks, exploring effective solutions, and implementing best practices, you can fortify your online security, safeguard your data, and confidently navigate the digital landscape.


Ansible Tower


In today's fast-paced world, businesses rely heavily on efficient IT operations to stay competitive and meet customer demands. Manual and repetitive tasks can slow the workflow, leading to inefficiencies and increased costs. This is where Ansible Tower comes in – a powerful automation platform that empowers organizations to streamline their IT operations and achieve greater productivity. In this blog post, we will explore the benefits and features of Ansible Tower and how it can revolutionize your IT infrastructure.

Ansible Tower is a web-based user interface and management platform for Ansible, an open-source automation engine. It provides a centralized hub for managing and orchestrating IT infrastructure, making automating complex tasks and workflows easier. With Ansible Tower, IT teams can automate the deployment, configuration, and management of applications and systems, increasing efficiency and reducing human error.

Table of Contents

Highlights: Ansible Tower

Ansible Automation Platform

To operationalize your environment and drive automation to production, you need to have everything centrally managed and better role-based access. So you understand who is automating and what they are doing, along with a good audit trail. This is where Red Hat Ansible and Ansible Tower can assist with several Ansible Tower features and Ansible Tower use cases. Red Hat Tower, also known as the Ansible Automation Platform, is a web-based UI and RESTful API for Ansible that allows users to manage the Ansible network in an easy and scalable way.

Big Step from the CLI

Ansible Tower is a big step up from using just the CLI for automation. Tower’s primary purpose is to make automation more accessible and safer to run at scale in the enterprise. It does this by presenting several Ansible Tower features from a web-based UI.

All the Ansible Tower features, such as Projects, Credentials, and Inventory, are isolated objects with different settings. However, once these components are combined or linked, they form an automation job within a Job Template. Therefore, consider the Job template, the Tower object that glues all other components together to create an automation journey.

Related: For additional pre-information, you may find the following posts helpful:

  1. Security Automation
  2. Network Configuration Automation
  3. NFV Use Cases
  4. SD WAN Security



Ansible Tower Use Cases

Key Ansible Tower Discussion points:


  • The need for a platform approach to Automation.

  • Ansible Tower vs Ansible Core.

  • Ansible Tower use cases such as security and edge networking.

  • Ansible Tower features.

  • Anatomy of an Automation job with Ansible Tower.

  • Automation requirements.

 

Back to Basics: Ansible Tower

The control plane for the Ansible Automation Platform is the automation controller. This is the platform that is replacing Ansible Tower. However, when discussing the Ansible Tower use cases throughout this post, we will refer to it as Ansible Tower.

For a quick recap, Ansible Tower provides several key components, such as a user interface (UI), role-based access control (RBAC), workflows, and continuous integration and continuous delivery (CI/CD) support, helping your team scale automation throughout the enterprise with more efficiency and flexibility.

Ansible Tower ( Ansible Automation Platform) helps formalize how automation is deployed, initiated, delegated, and audited, permitting enterprises to automate while reducing the risk of sprawl and variance. We can, for example, manage inventory, launch and schedule workflows, track changes, and incorporate them into reporting, all from a centralized user interface and RESTful API.

♦ Key Features and Benefits

Centralized Automation: Ansible Tower provides a single control point for managing automation across the entire infrastructure. It allows you to define and execute playbooks, schedule jobs, and monitor their progress, all from a user-friendly interface. This centralized approach saves time and effort and ensures consistency in automation processes.

Role-Based Access Control: Security is a top concern for any organization. Ansible Tower offers robust role-based access control (RBAC) mechanisms, allowing you to define granular permissions and access levels for different users and teams. This ensures that the right people have the right level of access, enhancing security and compliance.

Integration and Extensibility: Ansible Tower integrates with various tools and technologies, including cloud platforms, version control systems, and monitoring solutions. This enables you to leverage existing infrastructure investments and extend the capabilities of Ansible Tower to suit your specific needs.

♦ Ansible Tower Use Cases

Infrastructure Provisioning: With Ansible Tower, you can automate the provisioning of infrastructure resources, whether spinning up virtual machines in the cloud or configuring network devices. This eliminates manual errors, accelerates deployment times, and ensures consistent configurations across the infrastructure.

Application Deployment: Ansible Tower simplifies deploying and managing applications across different environments. Creating reusable playbooks allows you to automate the entire application lifecycle, from deployment to scaling and updates. This enables faster release cycles and reduces the risk of configuration drift.

Continuous Integration and Delivery: Ansible Tower integrates seamlessly with popular CI/CD tools, enabling you to automate the entire software development lifecycle. From building and testing to deploying and monitoring, Ansible Tower provides a unified platform for end-to-end automation, improving collaboration and accelerating time to market.

Ansible Red Hat: Ansible CLI

In smaller team environments where everyone is well-versed in Ansible, maintaining control over how the infrastructure is automated and adhering to Ansible’s best practices, in terms of using playbooks, meeting your security conditions, and delegating control, is manageable.

However, challenges emerge as teams start to scale and the use case of automation becomes diverse; many organizations now have team-based usage needs that stretch well beyond Ansible’s command line interface (CLI) with Ansible Core.

When moving automation into production with numerous teams using the CLI, the problem becomes governance and control. For example, various users will write their own Playbooks and store them locally on their laptops.

These Playbooks can be controlled, but the controlling factors may not be enforced. Consequently, potentially uncontrolled playbooks end up configuring your organization’s entire infrastructure. So, we need a way to extend automation throughout the enterprise in a more controlled, secure, and scalable way. This can only be done with a platform approach, not the CLI.

Red Hat Ansible
Diagram: Red Hat Ansible and the need for automation.

Red Hat Tower: Ansible Tower Use Cases

Nowadays, we are looking to expand automation to various Ansible Tower use cases, not just simple application deployments but also the ability to orchestrate multi-machine deployments in multi-vendor environments. The platform must support clustering and reach far-edge use cases, such as edge networking.

A variety of Ansible Tower use cases can be achieved with automation mesh. Every product out there needs automation tied in, even the Cisco ACI. In the Cisco ACI programmable network, using Endpoint Groups (EPGs) is a significant benefit, but you still need something to configure the endpoints within those Endpoint Groups.

Ansible Tower: Ansible Tower use cases

You need to shift towards a platform such as Ansible Red Hat Tower with a central point for handling automation that allows you to enforce standards with your automation from the top organizational level to the exact CLI arguments that can be run and by whom.

Ansible Tower goes beyond just running automated Playbooks; it helps you have better security, control, and visibility of your entire infrastructure. Ansible Tower can tie multiple processes and actions into a coherent workflow with a central control point. It has several features that make automation at scale safe.

For Ansible Tower use cases related to security, you can integrate Ansible Tower with an enterprise security system. For control, we can have role-based access control on all of the Ansible Tower objects using Teams and Groups. For visibility, you can integrate Tower with a central logging system, such as the ELK stack. And for metrics, Ansible Tower can be combined with Prometheus, which scrapes metrics from HTTP endpoints.

Ansible Tower can also be integrated with various open networking use cases. Open networking describes a network that uses open standards and commodity hardware. Ansible Tower here can perform on multi-vendor networking equipment. 

Ansible Tower Features

  • Workflow Designer

  • Job Scheduling

  • RBAC

  • Source Control

  • Credential Management

The Big Question: Why Automate?

When beginning automation, you must first figure out why you should automate. Ultimately, what matters is how quickly you can deploy the application. To answer this, you must consider how quickly you can move from Dev to Test to Production.

This is where the challenges are anchored, as different teams, such as the network, load balancing, security, storage, and virtualization teams, must get involved. What can you do to make this more efficient? We can test Ansible Tower against a staging environment before production deployment, all of which can be integrated into your CI/CD pipeline. This will help you better predict and manage your infrastructure.

Ansible Tower use cases open up further possibilities when integrated with Jenkins. Ansible Tower is a powerful tool in a CI/CD process since it takes responsibility for environment provisioning and inventory management, leaving Jenkins with only one job: orchestrating the process.

Video: Introducing Automation and Ansible CLI

In this tutorial, we discuss Ansible Automation, in particular Ansible Engine run from the CLI. We will discuss the challenging landscape forcing us to move to automation and introduce Ansible playbooks and the other main components.

Ansible Automation Explained

Multiple Inventories

The Ansible architecture, of course, supports multiple inventories, so creating parallel dev, test, and production inventories is straightforward. We create three inventories (‘dev,’ ‘test,’ and ‘prod’), each with identical sets of servers but with custom Ansible variables for their environment. This allows a single Playbook, with Ansible variables carrying the site-specific information, to run against many inventories, as the sketch below shows.
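As a minimal sketch, assuming a hypothetical directory layout of inventories/dev, inventories/test, and inventories/prod, one environment’s inventory file might look like this; the hostnames and the ntp_server variable are illustrative only.

```yaml
# inventories/dev/hosts.yml - one of three parallel inventories.
all:
  children:
    webservers:
      hosts:
        web01.dev.example.com:
        web02.dev.example.com:
      vars:
        env_name: dev                      # site-specific variable
        ntp_server: ntp.dev.example.com    # differs per environment
```

The same playbook can then be run against any environment simply by switching the -i flag, for example ansible-playbook site.yml -i inventories/dev.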

What to automate

Every task that you can describe and explain can be automated. This generally starts with device and service provisioning, such as ACL and firewall rules. You can also carry out consistency checks, continuously running automated checks against your environments. The Survey feature in Ansible Tower is one way to run consistency checks; it lets less experienced users run automated checks without needing to understand the full automation behind them. A minimal check playbook is sketched below.
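This is a hedged consistency-check sketch, assuming a hypothetical inventory group of Cisco IOS routers and an ACL named MGMT-ONLY; it only reads state and asserts on it, changing nothing.

```yaml
---
- name: Consistency check - verify the management ACL
  hosts: routers                # assumption: Cisco IOS inventory group
  gather_facts: false
  tasks:
    - name: Read the ACL from the device
      cisco.ios.ios_command:
        commands:
          - show ip access-lists MGMT-ONLY
      register: acl_out

    - name: Report an inconsistency if the expected entry is missing
      ansible.builtin.assert:
        that:
          - "'permit tcp 10.0.0.0' in acl_out.stdout[0]"
        fail_msg: "MGMT-ONLY ACL drifted from the expected state"
```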

Ansible Tower use cases: Starting advice

Imagine that the developers of a Playbook are not the same people as the infrastructure owners. Who can run what against which Inventory becomes essential as we begin to scale out automation in the enterprise. At a fundamental level, playbooks manage configurations and deployments to remote machines. At a more advanced level, they can sequence multi-tier rollouts involving rolling updates and delegate actions to other hosts.

You can run continuous tests, which report an inconsistency when something goes wrong. This could be as simple as checking a VRRP neighbor and determining if you can see the neighbor. Or you could pull more detailed information, such as from a stateful inspection firewall, and examine the contents to ensure your firewall works as expected. You can go further with routing adjustments and failure remediation, all with automation. It depends on how far you want to push the limits of automation.

Ansible Tower use cases: Be careful of automating mistakes

With automation, you can also automate mistakes. A good starting point is read-only work, such as extracting configuration and checking that specific parameters are there. Then, you could move to device and service provisioning, such as VLAN segments, load balancing rules, and firewall changes. Once you have mastered these operations, you could examine additional Ansible Tower use cases, such as traffic re-routing and advanced security use cases where Ansible Tower can assist in your threat-hunting efforts.

Ansible Tower Features

Highlighting an Organization’s objects

Sometimes, you have multiple independent groups of people managing separate sets of machines. One central Ansible Tower feature to discuss here is the Organization object. If two parts of an enterprise have entirely different requirements but still require Ansible Tower, Organizations let them share a single Red Hat Tower instance without overlapping configuration in the user interface.

An Organization is a tenant with unique User accounts, Teams, Projects, Inventories, and Job Templates. It is like having a separate instance of Ansible Tower that allows you to segregate roles and responsibilities.

Ansible Tower features
Diagram: Ansible Tower features

Red Hat Ansible: Role-based access control (RBAC)

An Organization is the highest level of role-based access control and is a collection of Teams, Projects, and Inventories. If you have a small deployment, you only need one Organization. However, larger deployments allow users and teams to be configured with access to specific sets of resources. Ansible Tower has a default Organization. Users exist at the Red Hat Tower level and can have roles in multiple Organizations.

Role-based access control

When combined with the Ansible Tower features, such as role-based access control capabilities, Playbooks can be deployed at the push of a button but in a controlled and easily audited way. Role-based access control: You can set up teams and users in various roles. These can integrate with your existing LDAP or A.D. environment.

You can control who has access to what, when, and where, and explicitly restrict playbook access to authorized users. For example, one team can run playbooks only in check mode, which is effectively read-only, while more experienced users have full administrative access, with the ability to upgrade IOS versions across a fleet of routers. Developers log into Ansible Tower and, under RBAC, see only the job templates they have permission to access and deploy.
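As a hedged sketch of this delegation, the awx.awx collection drives Tower’s RBAC over the API; here a team is granted execute-only rights on one job template. The team and template names are assumptions, and controller credentials are expected in environment variables such as CONTROLLER_HOST.

```yaml
---
- name: Grant execute-only access under RBAC
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Let the network team run, but not edit, a template
      awx.awx.role:
        team: network-team              # assumption: existing team
        role: execute
        job_templates:
          - Backup Network Configs      # assumption: existing template
        state: present
```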

Anatomy of an Automation Job

In this next section, I will introduce the anatomy of an automation job in Red Hat Tower, giving you a good outline of the available Ansible Tower features. We have a new way to manage old Ansible objects alongside new Tower objects. You will notice that some of the objects used in Ansible Engine are the same, such as Playbooks and Inventory, while we also have new objects, such as Job Templates.

Red Hat Tower
Diagram: Red Hat Tower and automation jobs.

Playbooks and Projects

We still maintain Playbooks containing your tasks. These Playbooks are stored in Projects, which are synced from wherever your playbook source lives.

Credential Management 

One significant benefit of using Ansible Tower is that it separates Credentials from the Project. This allows you to have different Credentials for different Inventories. So, we can have one playbook that targets all hosts and runs against different inventories with different credentials, keeping all your software release environments the same. This scenario is perfect for consistency across dev, test, staging, and production environments.

Inventory

The final part is the Red Hat Ansible Inventory, which defines the assets you connect to with SSH or an API. There are many possible sources here, such as GitHub, NetBox, and ServiceNow. Even though ServiceNow is an ITSM tool, it can be used as a CMDB database for inventory.

Automation Job 

All of these Ansible Tower components come together to form what is known as an automation job. When you look at Job templates and jobs, they always need to reference Projects, Inventory, and Credentials; otherwise, they can’t run. Getting a playbook to run from Tower is a basic four-stage process. The four stages are as follows:

  1. Define a project.
  2. Define an inventory.
  3. Define credentials.
  4. Define a template.

The first three stages can be performed in any order, but the template mentioned in the final stage pulls together the three previously created facets. Therefore, it must be specified last. 
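A hedged sketch of the four stages using the awx.awx collection against the Tower API follows; every name, the repository URL, and the vaulted password variable are illustrative assumptions, and controller credentials are expected in the environment (for example CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN).

```yaml
---
- name: Build a minimal automation job in Tower
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Stage 1 - define a project synced from source control
      awx.awx.project:
        name: network-playbooks
        organization: Default
        scm_type: git
        scm_url: https://github.com/example/network-playbooks.git  # hypothetical

    - name: Stage 2 - define an inventory
      awx.awx.inventory:
        name: lab-routers
        organization: Default

    - name: Stage 3 - define a machine credential
      awx.awx.credential:
        name: router-ssh
        organization: Default
        credential_type: Machine
        inputs:
          username: automation
          password: "{{ vault_router_password }}"  # assumption: vaulted var

    - name: Stage 4 - the job template that glues the rest together
      awx.awx.job_template:
        name: backup-configs
        project: network-playbooks
        inventory: lab-routers
        credentials:
          - router-ssh
        playbook: backup.yml
```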

Main Details on Ansible Tower Features

Projects define the area or space where all your resources and playbooks exist; it is the location where our playbooks are stored. The default points to GitHub, but you can choose manual as the source control credential type, in which case playbooks live in a local directory. This differs from the recommended approach for production, as projects stored locally on the Tower machines have no version control.

Red Hat Ansible: Highlighting Projects Management

Before creating Job Templates, Credentials, Inventories, and everything necessary to run a Playbook, Tower needs to know where to find all the files required for the automation job. This is where projects come into play, and we can execute a lot of governance in project management. 

Source control and branching

First, governance of playbooks with Source Control Management (SCM). The Tower project components support the storage of playbooks in all major SCM systems, such as GitHub. 

Red Hat Ansible
Diagram: Red Hat Ansible and source control.

GitHub

Managing playbooks can be challenging even if only two people work on them. How do we follow changes across the enterprise? What if someone made a mistake? How do you roll back changes made in a local text editor? With source control, you can commit and push changes to GitHub and go back and forth to see who made what change. The advantages of adopting source control are:

  1. Increased scalability and manageability
  2. Audit trails of any modification
  3. Better security  
  4. The ability to perform distributed and automated testing 
  5. Multiple life cycle environments for the Ansible code (i.e., dev, test, Q.A. & prod)
  6. Consistency with CI/CD pipeline integration

Red Hat Ansible: Inventory

Basic Inventory

In its most basic form, an Inventory delivers host information to Ansible to trigger the tasks on the right managed assets. These may be containers, edge devices, or network nodes. In traditional and non-dynamic environments, the static inventory is adequate. However, as we develop our use of automation, we must transition to more effective methods of gathering ever-changing environment details. This is where dynamic inventory and smart inventories come into play.

Dynamic Inventory

When you have a dynamic inventory, such as one on AWS with an EC2 group, several variables are populated directly from AWS. This keeps you current on any instances you have launched on AWS. A prime example is using a dynamic inventory plugin to gather inventory information from a cloud provider or hypervisor. Ansible has built-in dynamic inventory support, so you don’t need to edit configuration files or install additional Python modules.
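Here is a brief sketch of a dynamic inventory plugin configuration, assuming the amazon.aws collection is installed and AWS credentials are available in the environment; the region, tag, and group prefix are illustrative.

```yaml
# aws_ec2.yml - dynamic inventory plugin configuration
plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
filters:
  tag:env: dev            # only instances tagged env=dev
keyed_groups:
  - key: tags.role        # builds groups such as role_web, role_db
    prefix: role
```

Running ansible-inventory -i aws_ec2.yml --graph then shows whatever instances AWS currently reports, with no hosts file to maintain.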

Ansible Red Hat
Diagram: Ansible Red Hat and the Inventory.

Smart Inventory

Ansible and Ansible Tower have long been able to pull inventory from several sources, such as a local CMDB, private cloud, or public cloud. But what if you want to automate the inventory itself? For example, let’s say you want to create an inventory of all machines tagged “dev” or all machines running a potentially vulnerable piece of software.

This is where Smart Inventories come in. Smart Inventories build on Ansible Tower’s fact caching support, letting you create new inventories that include all hosts matching specific criteria. The criteria can be based on host attributes such as group membership or gathered facts, which could be anything, such as the manufacturer or an installed software service.

Benefits of Smart Inventories

This is particularly helpful for dynamically creating inventories of a specific type of host based on a filter, and it saves the need to manually create many different groups, or worse, add the same host multiple times. A brief sketch follows.
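This is a hedged sketch using the awx.awx collection; the host_filter expression follows Tower’s filter syntax, and the inventory name and naming convention are assumptions.

```yaml
---
- name: Create a Smart Inventory of dev hosts
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Membership is computed from a filter, not maintained by hand
      awx.awx.inventory:
        name: all-dev-hosts
        organization: Default
        kind: smart
        host_filter: name__icontains=dev   # assumption: naming convention
```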

Video: Ansible Inventory

This short educational tutorial discusses the Ansible Inventory used to hold the Ansible managed hosts. We will then discuss the different types of inventories, static and dynamic, along with the various ways you can apply variables to hosts in the inventory.

The Ansible Inventory | Ansible Automation

Red Hat Ansible: Machine Credentials 

When running a job template against one or more remote hosts or nodes, you must create a credential and associate it with your job template. The default is the machine credential, but there are many different credential types. A machine credential is, for example, an SSH username and password, or an SSH username and a private key; these are stored securely in the backend database of Tower.

Credential via Hashicorp Vault

HashiCorp Vault, an API-addressable secrets engine, integrates with Ansible Tower via credential plugins and makes life easier for anyone wishing to handle secrets management and automation better. To automate effectively, modern systems require multiple secrets: certificates, database credentials, keys for external services, operating systems, and networking.

Understanding who is accessing secret credentials, and when, is complex and often platform-specific, and managing key rotation, secure storage, and detailed audit logging across a heterogeneous toolset is almost impossible. Red Hat Tower solves numerous issues, and its integration with enterprise secret management solutions lets automation consume secrets on demand without human interaction.

Ansible Tower use cases
Diagram: Ansible Tower use cases and security.

Ansible Vault

Then we have Ansible Vault. Ansible Vault is a feature that keeps sensitive data, for example, passwords or keys, in encrypted form instead of saving them as plain text in roles or playbooks. An Ansible Vault file is a standard file on your disk that you can edit using your favorite text editor, with one key difference: when you hit save, the file is locked inside strong AES-256 encryption. What I like about this is that these vault files can be securely placed in source control or distributed to multiple locations.
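As a sketch, a variable encrypted with ansible-vault encrypt_string can sit inside an otherwise plain YAML vars file; the ciphertext below is a shortened placeholder, not real output.

```yaml
# group_vars/routers.yml - plain YAML with one vaulted value.
# The ciphertext below is a shortened placeholder for illustration.
ansible_user: automation
vault_router_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          62313365396662343061393464336163383764373764613633653634306231386433626436623361
          6134333665353966363534333632666535333761666131620a383433653438363835353933663331
```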

Red Hat Ansible: Ansible Templates

With Ansible Tower, a Playbook is run from a Job Template. Within the job templates, we can specify the number of parameters and environment details for running the playbook. The template is a job definition with all of its parameters. In addition, the Job Template can be launched or scheduled. Scheduling is suitable for running playbooks at regular intervals, such as a nightly backup of configurations of all network devices.
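As a sketch of what such a scheduled Job Template might run nightly, assuming a Cisco IOS inventory group and a hypothetical backup directory:

```yaml
---
- name: Nightly configuration backup
  hosts: routers                 # assumption: network inventory group
  gather_facts: false
  tasks:
    - name: Save each device's running configuration
      cisco.ios.ios_config:
        backup: true
        backup_options:
          dir_path: /var/backups/configs            # hypothetical path
          filename: "{{ inventory_hostname }}.cfg"
```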

 Video: Ansible Tower Job Templates

In this product demonstration, we will review the critical components of Ansible Tower and its use of job templates. We will then examine the different job template parameters you can use to create and deploy an automation job to your managed assets.

Ansible Tower Job Template

Two Options: Job or Workflow Template

So we have two options: add a standard Job Template or a Workflow Template. A job template runs a single playbook with one set of settings. A workflow template, on the other hand, chains templates together: run this job with this playbook, and then, depending on whether it passes or fails, continue with the next step, forming a continuous workflow of multiple templates.

Job Template

  • Default

  • Single tasks

  • Useful with the check feature

Workflow Template

  • Multiple teams

  • Chaining automation

  • Useful with delegation

Workflow Template

The real value here is that you can have one team of users; let’s say the Linux team creates a template. This template will reference its inventory and playbooks and has its permission structure with role-based access control. Then, we can have a Network team that has developed its Playbooks and grouped them into a template with its Inventory, Credentials, and permission structure.

Different teams, playbooks, and credentials

A workflow template allows you to connect all of this. This is done with the workflow visualizer, enabling you to chain numerous playbooks, updates, and other workflows, even if they are run by different users, use different inventories, or rely on different credentials. The vital point is that the various teams use different Playbooks, Credentials, and Inventories, yet everything is easily linked into one automation unit. Complex dependencies between the templates can therefore be broken down into steps.

Workflow approval nodes 

Workflow approval nodes require human interaction to advance the workflow. This interaction lets decision-makers approve the automation before it’s applied in the environment. A simple example of where this could be useful is the finance team checking if funds are available before deploying new services to the public cloud. Or if you want someone to double-check that there is enough capacity on the target hosts.

Ansible Red Hat: Automation Requirements

Ansible network
Diagram: Automation requirements.
  • Requirement: Low barrier of entry

With push-button deployment access, non-privileged users can safely deploy entire applications without any previous Ansible knowledge or risk of causing damage. 

  • Requirement: Better control and manageability

Ansible Tower is a welcomed addition to the power of the original Red Hat Ansible CLI version. It ensures that you can operate your infrastructure with automation and gain all the benefits of automation in a well-managed, secure, and auditable manner. Now, we need the ability to delegate authority to different users or teams and lock down access to particular projects or resources.

  • Requirement: The ability to schedule

Manual and ad hoc practices, even with the role of automation, can be inconsistent. Ansible Tower offers a more uniform and reliable way to manage your environment with Job Scheduling. One of Tower’s primary features is the ability to schedule jobs. Scheduling can enable periodic remediation, continuous deployment, or even scheduled nightly backups.

  • Requirement: Better visibility and real-time updates

Administrators want a real-time view of what Ansible is up to at any time, such as job status updates and playbook runs, as well as what’s working in their Ansible environment. All Ansible automation is centrally logged, ensuring auditability and compliance. With Ansible Tower, we have real-time analysis: it provides a real-time update on the completion of Ansible plays and tasks and each host’s success and failure. In addition, we can see our automation’s status and which jobs will run next.

  • Requirements: Centralized logging and metrics

The Ansible Tower dashboard provides a consolidated view of our inventory, hosts, scheduled tasks, and manual job runs. However, we can incorporate Ansible Tower with the ELK stack for additional information to better understand and predict future trends.

  • Requirement: Inventory management

Ansible Tower supports multiple Inventories, making creating dev, test, and similar production inventories easy. This will help you have better consistency throughout. Additionally, this provides a better way to manage and track their inventory across complex, hybrid virtualized, and cloud environments.

  • Requirement: System tracking and audit trail

System tracking verifies that machines are in compliance and configured as they should be.

  • Requirement: Enterprise integration

For additional Ansible Tower use cases, several authentication methods make it easy to embed Tower into existing tools and processes and help ensure the right people can access Ansible Tower resources. For example, Ansible Tower can link to central directories, such as Lightweight Directory Access Protocol (LDAP) and Azure Active Directory, to assist with authentication, while retaining the ability to create user accounts locally on the server itself.

Enterprise integration embeds Ansible into an existing environment and enterprise toolset. Self-service IT provides the flexibility to free up time and delegate automation jobs to others.

  • Requirement: RESTful API

This allows Red Hat Tower to interact with other IT gear, enabling you to integrate Ansible Tower into existing areas of your infrastructure or your pipeline. For example, we can integrate Ansible Tower with ServiceNow and Infoblox. Every component and function of Ansible Tower can be API-driven, so it depends on your organization and how they operationalize their automation, via the API or the UI.

Ansible Tower is a game-changer when it comes to streamlining IT operations. Its powerful features, centralized management, and extensive integrations make it a valuable tool for organizations of all sizes. By leveraging Ansible Tower, businesses can achieve greater efficiency, reduce human error, and drive innovation. Embrace the power of automation with Ansible Tower and embark on a journey towards a more agile and productive IT infrastructure.

Summary: Ansible Tower

In today’s fast-paced technological landscape, efficient IT operations are crucial for businesses to stay competitive. This is where Ansible Tower comes into play. This blog post explored its features and benefits and how it can revolutionize your IT workflows.

Section 1: Understanding Ansible Tower

Ansible Tower is a powerful automation platform that allows you to centralize and control your IT infrastructure. It provides a user-friendly web-based interface, making managing and automating complex tasks easy. With Ansible Tower, you can effortlessly orchestrate and scale your IT operations, saving time and resources.

Section 2: Key Features of Ansible Tower

Ansible Tower offers a wide range of features that enhance your IT operations. Some notable features include:

1. Job Templates: Create reusable templates for your automation tasks, ensuring consistency and efficiency.

2. Role-Based Access Control: Assign granular permissions to users and teams, ensuring proper access control.

3. Inventory Management: Easily manage your infrastructure inventory, making it simple to target specific hosts.

4. Workflow Visualization: Gain insights into your automation workflows with visual representations, enabling better tracking and troubleshooting.

Section 3: Benefits of Using Ansible Tower

Implementing Ansible Tower in your IT environment brings several benefits:

1. Increased Efficiency: Automate repetitive tasks, eliminating manual errors and saving your IT team valuable time.

2. Enhanced Collaboration: With a centralized platform, teams can collaborate seamlessly, improving communication and productivity.

3. Scalability and Flexibility: Ansible Tower allows you to scale your automation efforts, adapting to your growing infrastructure needs.

4. Compliance and Auditability: Maintain compliance with industry standards by enforcing security policies and tracking changes made through Ansible Tower.

Section 4: Real-World Use Cases

Various organizations across industries have adopted Ansible Tower. Here are a few real-world use cases:

1. Continuous Deployment: Streamline your software deployment processes, ensuring consistency and reducing time-to-market.

2. Configuration Management: Manage and enforce configuration standards across your infrastructure, guaranteeing consistency and minimizing downtime.

3. Security Compliance: Automate security hardening and configuration checks, ensuring compliance with industry regulations.

Conclusion:

Ansible Tower is a game-changer when it comes to streamlining IT operations. Its powerful features, scalability, and ease of use empower organizations to automate tasks, improve productivity, and enhance collaboration. Whether a small startup or a large enterprise, Ansible Tower can revolutionize your IT workflows, enabling you to stay ahead in the ever-evolving digital landscape.


Network Visibility


In the interconnected world of today, where businesses heavily rely on networks to function and communicate, network visibility has emerged as a crucial factor in ensuring robust security and optimal performance. By providing real-time insights into network traffic, network visibility empowers organizations to detect and mitigate potential threats, troubleshoot issues efficiently, and optimize network resources. In this blog post, we will delve into the world of network visibility, exploring its benefits, key components, and best practices for implementation.

Network visibility refers to the ability to gain comprehensive insights into network traffic, both at the macro and micro levels. It involves capturing and analyzing data packets flowing through the network infrastructure, enabling organizations to monitor network behavior, identify anomalies, and gain actionable intelligence.

By having a holistic view of the network, organizations can proactively address security vulnerabilities, optimize resource allocation, and ensure a seamless end-user experience.

Table of Contents

Highlights: Network Visibility

Network Visibility Tools

Traditional network visibility tools give you the foundational data to see what’s going on in your network. Network visibility solutions are familiar, and tools such as NetFlow and IPFIX have been around for a while. However, they give you an incomplete picture of the landscape. Then, we have a new way of looking at things with the practice of distributed systems observability.

Observability

Observability software engineering brings a different context to the meaning of the data, allowing you to examine your infrastructure and its applications from other and more exciting angles. It combines traditional network visibility with a platform approach, enabling robust analysis and visibility with full-stack microservices observability.

Related: Before you proceed, you may find the following posts helpful:

  1. Observability vs. Monitoring
  2. Reliability In Distributed Systems
  3. WAN Monitoring
  4. Network Traffic Engineering
  5. SASE Visibility



Network Visibility Solutions

Key Network Visibility Discussion points:


  • The challenges with monitoring distributed systems.

  • Observability vs monitoring

  • Starting network visibility.

  • Network visibility tools.

  • Network visibility solutions.

  • Multilayer machine learning.

Back to Basics: Network Visibility

The Role of Network Security

Your network and valuable assets are under internal and external threats, ranging from disgruntled employees to worldwide hackers. There is no perfect defense because hackers can bypass, compromise, or evade almost every safeguard, countermeasure, and security control. In addition, bad actors are continually creating new attack techniques, writing new exploits, and discovering new vulnerabilities.

Some essential security aspects stem from understanding bad actors’ strategies, methods, and motivations. You can anticipate future attacks once you learn to think like a hacker. This lets you devise new defenses before a hacker can breach your organization’s network.

Understanding Network Visibility

Network visibility refers to gaining clear insights into the network infrastructure, traffic, and the applications running on it. It involves capturing, monitoring, and analyzing network data to obtain valuable information about network performance, user behavior, and potential vulnerabilities. By having a comprehensive network view, organizations can make informed decisions, troubleshoot issues efficiently, and proactively address network challenges.

Main Network Visibility Components

  • Network visibility relies on robust traffic monitoring tools that capture and analyze network packets in real-time.

  • Network taps are hardware devices that provide a non-intrusive way to access network traffic.

  • Network packet brokers act as intermediaries between network taps and monitoring tools.

  • Packet capture tools capture network packets and provide detailed insights.

  • Flow-based monitoring tools collect information on network flows.

♦ Key Components of Network Visibility

a) Traffic Monitoring: Effective network visibility relies on robust traffic monitoring tools that capture and analyze network packets in real time. These tools provide granular details about network performance, bandwidth utilization, and application behavior, enabling organizations to identify and resolve bottlenecks.

b) Network Taps: Network taps are hardware devices that provide a non-intrusive way to access network traffic. Organizations can gain full visibility into network data by connecting to a network tap without disrupting network operations. This ensures accurate monitoring and analysis of network traffic.

c) Network Packet Brokers: Network packet brokers act as intermediaries between network taps and monitoring tools. They collect, filter, and distribute network packets to the appropriate monitoring tools, optimizing traffic visibility and ensuring efficient data analysis.

d) Packet Capture and Analysis:

Packet capture tools capture network packets and provide detailed insights into network behavior, protocols, and potential issues. These tools enable deep packet inspection and analysis, facilitating troubleshooting, performance monitoring, and security investigations.

e) Flow-Based Monitoring:

Flow-based monitoring tools collect information on network flows, including source and destination IP addresses, protocols, and data volumes. By analyzing flow data, organizations can gain visibility into network traffic patterns, identify anomalies, and detect potential security threats.

 

 Lab Guide: Tcpdump

Capturing Traffic: Network Analysis

Tcpdump is a command-line packet analyzer tool that allows you to capture and analyze network packets. It captures packets from a network interface and displays their contents in real-time or saves them to a file for later analysis. With tcpdump, you can inspect packet headers, filter packets based on specific criteria, and perform detailed network traffic analysis.

Notes:

  1. Run tcpdump -D. This should show you the available interfaces to collect packet data.

  2. Run sudo tcpdump -i ens33 -s0 -w sample.pcap. This command does the following:

    • Captures packets coming from and going to the interface (-i) ens33

    • Sets the snaplength (-s) to the maximum size. You may specify a number if you want to intentionally reduce the size of the packets being captured.

    • Writes (-w) the capture into a packet capture (pcap) format.

  3. Open a web browser and visit http://network-insight.net to generate some IP traffic.

Analysis:

Tcpdump finds its applications in various scenarios. One everyday use case is network troubleshooting, where administrators can capture and analyze packets to identify network issues such as latency, packet loss, or misconfigurations. Another use case is network security analysis, where tcpdump can help detect and investigate malicious activities, such as suspicious network traffic or potential intrusion attempts. Furthermore, tcpdump can be used for network performance monitoring, protocol debugging, and even educational purposes.

Tips:

To make the most out of tcpdump, here are some tips and tricks:

– Utilize filters: Tcpdump allows you to apply filters based on source/destination IP addresses, ports, protocols, and more. Filters help focus on relevant packet captures and reduce noise.

– Save to file: By saving captured packets to a file, you can analyze them later or share them with colleagues for collaborative troubleshooting or analysis.

– Combine with other tools: Tcpdump can be used with network analysis tools like Wireshark for a more comprehensive analysis. Wireshark provides a graphical interface and additional features for in-depth packet inspection.

Conclusion:

Tcpdump is a powerful and versatile tool for network packet analysis. Its ability to capture, filter, and analyze packets makes it invaluable for network administrators, security analysts, and anyone seeking to understand and troubleshoot network traffic. By leveraging tcpdump’s features and following best practices, you can gain valuable insights into your network and ensure its optimal performance and security.

Benefits of Network Visibility

a) Enhanced Performance Management: Network visibility enables organizations to monitor network performance metrics in real-time, such as latency, packet loss, and throughput. Organizations can promptly identify and address performance issues, optimize network resources, improve user experience, and reduce downtime.

b) Advanced Threat Detection: With the rise in sophisticated cyber threats, network visibility plays a crucial role in detecting and mitigating security breaches. Organizations can detect suspicious activities, unauthorized access attempts, and potential data exfiltration by analyzing network traffic patterns and anomalies.

c) Compliance and Regulatory Requirements: Many industries have strict compliance and regulatory requirements regarding data security and privacy. Network visibility helps organizations meet these requirements by providing visibility into data flows, ensuring secure transmission, and facilitating audit trails.

Implementing Network Visibility Strategies

a) Define Objectives: Organizations must identify specific network visibility objectives, such as improving application performance or enhancing security monitoring. Clear goals will guide the selection and implementation of appropriate network visibility solutions.

b) Choose the Right Tools: Organizations should evaluate and select the correct network visibility tools and technologies based on the defined objectives. This includes traffic monitoring tools, network taps, and network packet brokers that align with their requirements and infrastructure.

c) Integration and Scalability: Implementing network visibility solutions requires seamless integration with existing network infrastructure. Organizations should ensure compatibility and scalability to accommodate future growth and changing network dynamics.

 

Lab Guide: Network Visibility with Cisco IOS

Visibility with CDP

Cisco CDP is a proprietary Layer 2 network protocol developed by Cisco Systems. It operates at the Data Link Layer of the OSI model and enables network devices to discover and gather information about other directly connected devices. By exchanging CDP packets, devices can learn about their neighbors, including device types, IP addresses, and capabilities.

♦ The Benefits of Cisco CDP

a) Enhanced Network Visibility: Cisco CDP provides network administrators with real-time information about neighboring devices, enabling them to map and understand the network topology. This visibility helps identify potential points of failure, optimize network design, and troubleshoot issues promptly.

b) Simplified Network Management: With Cisco CDP, network administrators can quickly identify and track devices connected to the network. This simplifies device inventory management, configuration updates, and network change monitoring.

c) Improved Network Efficiency: By automatically discovering neighboring devices, Cisco CDP reduces manual configuration effort and minimizes human error. This leads to improved network efficiency, faster troubleshooting, and reduced downtime.

Use Cases for Cisco CDP

a) Network Troubleshooting: Cisco CDP can help identify the root cause when network issues arise. By revealing information about neighboring devices and their connections, administrators can quickly isolate faulty devices or misconfigurations.

b) Network Design and Planning: During a network infrastructure’s design and planning phase, Cisco CDP assists in creating accurate network diagrams and understanding device interconnections. This information is valuable for optimizing network performance and capacity planning.

c) Security Auditing: Cisco CDP also plays a role in network security auditing. Administrators can ensure network integrity and mitigate potential security risks by identifying unauthorized devices or rogue switches.
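Tying this back to automation, here is a hedged sketch of collecting CDP data at scale with Ansible; the inventory group is an assumption, and the output is simply printed rather than parsed.

```yaml
---
- name: Gather CDP neighbors from access switches
  hosts: switches                # assumption: Cisco IOS inventory group
  gather_facts: false
  tasks:
    - name: Run show cdp neighbors detail
      cisco.ios.ios_command:
        commands:
          - show cdp neighbors detail
      register: cdp

    - name: Display the discovered topology information
      ansible.builtin.debug:
        msg: "{{ cdp.stdout_lines[0] }}"
```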

Conclusion:

Cisco CDP is a game-changer when it comes to network management and efficiency. With its ability to provide detailed information about neighboring devices, network administrators gain unparalleled visibility into their network topology. This enhanced visibility leads to improved troubleshooting, simplified management, and optimized network design. By leveraging the power of Cisco CDP, businesses can ensure a robust and reliable network infrastructure that meets their ever-evolving needs.

Security threats with network analysis and visibility

Remember, those performance problems are often a direct result of a security breach. So, distributed systems observability goes hand in hand with networking and security. It does this by gathering as much data as possible, commonly known as machine data, from multiple data points. It then ingests the data and applies normalization and correlation techniques with some algorithm or statistical model to derive meaning.  

network visibility tools
Diagram: The challenges of network visibility tools.

Starting Network Visibility

Network visibility solutions

Combating the constantly evolving threat actor requires good network analysis and visibility along with analytics into all areas of the infrastructure, especially the host and user behavior aligning with the traffic flowing between hosts. This is where machine learning (ML) and multiple analytical engines detect and respond to suspicious and malicious activity in the network.

This is done against machine data that multiple tools have traditionally gathered and stored in separate databases. Adding context to previously unstructured data allows you to extract all sorts of valuable insights, which can be helpful for security, network performance, and user behavior monitoring.

System observability and data-driven visibility

The big difference between traditional network visibility and distributed systems observability is the difference between seeing what’s happening in your network and understanding why it’s happening. This empowers you to get to the root cause more quickly, be it a network or security-related incident. For all of this, we need to turn to data to find meaning, often called data-driven visibility, in real time, to maximize positive outcomes while minimizing or eliminating issues before they happen.

Machine data and observability

Data-driven visibility is derived from machine data. So, what is machine data? Machine data is everywhere and flows from all the devices we interact with, making up around 90% of today’s data. Harnessing this data can give you powerful insights for networking and security. Furthermore, machine data can be in many formats, both structured and unstructured.

As a result, it can be challenging to predict and process. When you find issues in machine data, you need to be able to fix them quickly, which means being able to pinpoint, correlate, and alert on specific events to save time.

We need a platform that can perform network analysis and visibility instead of only using multiple tools dispersed throughout the network. A platform can take data from any device and create an intelligent, searchable index. For example, a SIEM solution can create a searchable index for you. There are several network visibility solutions, such as cloud-based or on-premise-based solutions. 

distributed systems observability
Diagram: Distributed systems observability and machine data.

Network Visibility Tools

Traditional or legacy network visibility tools rely on the data we collect with SNMP, network flows, and IPFIX, and even on routing tables and geo-location data. To recap, IPFIX is an accounting technology that monitors traffic flows. IPFIX interprets the client, server, protocol, and port used, counts the number of bytes and packets, and sends that data to an IPFIX collector.

Network flow or traffic is the amount of data transmitted across a network over a specific period. Flow identification is performed based on five fields in the packet header: source IP address, destination IP address, protocol identifier, source port number, and destination port number.

Then, we have SNMP, a networking protocol for managing and monitoring network-connected devices. The SNMP protocol is embedded in multiple local devices. None of these technologies is going away; they must be correlated and connected.

Traditional network visibility tools:

Populate charts and create baselines

From this data, we can implement network security. First, we create baselines, identify anomalies, and start to organize network activity. Alerts are triggered when thresholds are met, so we get a warning when a router is down or an application is not performing as expected, either in real time or historically. This is all good for the previous way of doing things. But when an application is not performing well, a threshold tells you nothing; you need to see the full path and the behavior of each part of the transaction.

All of this data was used to populate the charts and graphs, and these dashboards rely on known problems that we have seen in the past. However, today’s networks fail in creative ways, often referred to as unknown/unknowns, calling for a new approach to distributed systems observability that Site Reliability Engineering (SRE) teams employ.

Observability Software Engineering

To start an observability project, we need diverse data and the visibility to see the various things happening today. We don’t just have known problems anymore; we have a mix of issues we have never seen before. Networks fail in creative ways, some of which have never happened before. We need to look at the network differently, with new and old network visibility tools and the practices of observability software engineering.

We need to diversify our data so we have multiple perspectives and can better understand what we are looking at. This can only be done with a distributed systems observability platform. What does this platform need?

Network analysis and visibility:

Multiple data types and point solutions

So, we need to get as much data as possible from all network visibility tools: flows, SNMP, IPFIX, routing tables, packets, telemetry, metrics, logs, and traces. We are familiar with and have used all of these in the past, and each data type provides a different perspective. However, the main drawback of not using a platform is that it lends itself to a series of point solutions, leaving gaps in network visibility.

Without a platform, we end up with a database for each data type: a database of network traffic flow information for application visibility, another for SNMP, and so on. The issue with point solutions is that each one only sees part of the picture. Each data point acts as its own island of visibility, and you will have difficulty understanding what is happening. At a bare minimum, you should have some automation between all these devices.

  • A key point: The use of automation as the starting point

Automation could be used to glue everything together. There are two variants of the Ansible architecture: a CLI version known as Ansible Core and a platform-based approach with Ansible Tower. Automation does not provide visibility, but it is a starting point to glue together the different point solutions to increase network visibility.

For example, you might collect all logs from all firewall devices and send them to a backend for analysis. Ansible variables are recommended here; you can use inventory variables to fine-tune how you connect to your managed assets, as sketched below. Variables bring many benefits and modularity to Ansible playbooks.
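This is a minimal sketch of such inventory variables, assuming Cisco ASA firewalls and hypothetical hostnames and log-collector addresses:

```yaml
# inventory/firewalls.yml - connection details tuned per device group
all:
  children:
    firewalls:
      hosts:
        fw01.example.com:
          syslog_target: 10.10.10.50   # hypothetical log backend
        fw02.example.com:
          syslog_target: 10.10.10.51
      vars:
        ansible_network_os: cisco.asa.asa
        ansible_connection: ansible.netcommon.network_cli
        ansible_user: logcollector
```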

distributed systems observability
Diagram: Distributed systems observability: The issue of point solutions.

Isolated monitoring for practical network analysis and visibility

I know what happens on my LAN, but what happens in my service provider networks? I can see VPC flows from a single cloud provider, but what happens in my multi-cloud designs? I can see my interface states, but what is happening in my overlay networks?

For SD-WAN monitoring, if a performance problem with one of my applications or a bad user experience is reported from a remote office, how do we map this back to tunnels? We have pieces of information but are missing the end-to-end picture. For additional information on monitoring and visibility in SD-WAN environments, check out this SDWAN tutorial.

The issue without data correlation?

How do we find out if there is a problem when we have to search through multiple databases and dashboards? And when there is a problem, how do you correlate to determine the root cause? What if you have tons of logs and must figure out that this interface utilization correlates with this slow DNS lookup time, which links to a change in BGP configuration?

So you can see everything with traditional or legacy visibility, but how do you go beyond that? How do you know why something has happened? This is where distributed systems observability and the practices of observability software engineering come in, giving you full-stack observability with network visibility solutions into all angles of the network.

Distributed Systems Observability:

Seeing is believing

There is a difference between seeing and understanding. Traditional network visibility solutions let you see what’s happening on your networks; observability, on the other hand, helps you understand why it is happening. With observability, we are not replacing network visibility; we are augmenting it with a distributed systems observability platform that lets us connect all the dots to form a complete picture. With such a platform, we still collect the same information.

For example, routing information, network traffic, VPC flow logs, results from synthetic tests, metrics, traces, and logs. But now we have several additional steps of normalization and correlations that the platform takes care of for you.

Distributed systems observability and normalization

Interface statistics might be in packets per second; flow data might be expressed as a percentage of traffic, such as 10% being DNS. We have to normalize and correlate these to understand what happens across the entire business transaction. So, the first step is to ingest as much data as possible, identify or tag the data, and then normalize it. Keep in mind this could be short-lived data, such as interface statistics.

network visibility tools
Diagram: Connecting the dots with network visibility tools.

Applying machine learning algorithms

All these different types of data are ingested, normalized, and correlated, and this cannot be done by a human. Distributed systems observability gives you practical, actionable intelligence, automating root-cause analysis and measuring network health by applying machine learning algorithms.

We will discuss these machine learning algorithms and statistical models momentarily. Supervised and unsupervised machine learning are used heavily in the security world. In summary, for practical network analysis and visibility, we need to do the following:

1. Ingest a large amount of data from many sources and of many types.
2. Automate baselining and anomaly detection, and make them more accurate.
3. Accurately group data and create structure from unstructured data.
4. Correlate the data to learn how everything is related.

This will give you the full-stack observability and enhanced network visibility that traditional network visibility tools cannot provide.
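
As a rough illustration of points 2 and 4 above, here is a small Python sketch of automated baselining with a rolling z-score. The window size and threshold are arbitrary assumptions for demonstration; a real platform applies far more sophisticated supervised and unsupervised models.

```python
from collections import deque
from statistics import mean, stdev

class Baseline:
    """Rolling baseline: flag samples that deviate from recent history."""
    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.samples = deque(maxlen=window)   # recent history
        self.threshold = threshold            # z-score cut-off

    def observe(self, value: float) -> bool:
        """Return True if the value is anomalous versus the baseline."""
        anomalous = False
        if len(self.samples) >= 10:           # need history before judging
            mu, sigma = mean(self.samples), stdev(self.samples)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True
        self.samples.append(value)
        return anomalous

dns_latency = Baseline()
for ms in [20, 22, 19, 21, 20, 23, 20, 21, 22, 20, 19, 180]:
    if dns_latency.observe(ms):
        print(f"anomaly: DNS lookup time {ms} ms deviates from baseline")
```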

Full Stack Observability

We’d like to briefly describe the transitions we have gone through and why we need to address full-stack observability. First, we had the monolithic application, which is still very much alive today and is where mission-critical systems live. We then moved to the cloud and started adopting containers and platforms. Then came the drive to re-architect code from the ground up as cloud-native, and now with observability.

Finally, monitoring becomes even more important with the move to containers and Kubernetes. Why? Because these environments are dynamic, and you still need to embed security somehow.

The traditional world of normality

In the past, network analysis and visibility were simple. Applications ran in single private data centers, potentially two data centers for high availability. These data centers were on-premises, and all components were housed internally. 

In addition, the network and infrastructure were fairly static, with few changes to the stack on any given day. Nowadays, we are in a different environment with complex, distributed applications, whose components and services are located in many different kinds of places, on-premises and in the cloud, depending on both local and remote services.

The wave of containers and its effect on the network analysis and visibility

There has been a considerable rise in the use of containers. The container wave introduces dynamic environments with cloud-like behavior where you can scale up and down very quickly and easily. We have ephemeral components coming up and down inside containers as parts of services.

The paths and transactions are complex and constantly shifting. An application consists of multiple steps or services: a business transaction. You should strive for automatic discovery of business transactions and application topology maps showing how the traffic flows.

The wave of Microservices and its effect on network analysis and visibility

With the wave towards microservices, we get the benefits of scalability and business continuity, but management becomes very complex. In addition, what used to be method calls or interprocess calls within the monolith now go over the network and are susceptible to deviations in latency.

The issue of silo-based monitoring

With these new waves of microservices and containers, silo-based monitoring gives us poor network analysis and visibility in a very distributed environment. Let us look at an example of isolating a problem with traditional network visibility and monitoring.

For mobile or web, the checkout is slow; on the application side, there could be JVM performance issues; on the database, a slow SQL query; and on the network side, an interface running at 80% utilization. Traditional, silo-based visibility and monitoring gives each domain its own tools, but nothing is connected, so how do you quickly get to the root cause of this problem?

Network visibility solutions

Traditional monitoring is based on event alerts and dashboards, all populated with passively sampled data, and it operates per domain. However, we have very complex, distributed, and hybrid environments.

We have a hybrid notion both from a code perspective and in physical location with cloud-native solutions. The way you consume APIs also differs in each area; for example, API authentication for SaaS differs from on-premises and cloud. Keep in mind that API security is a top concern.

With our network visibility solutions, we must support all these journeys in a complex and distributed world. So we need full-stack observability and observability software engineering to see what happens in each domain and to know what is happening in real time.

So, instead of being passive with data, we are active with metrics, logs, traces, events, and any other types of data we can ingest. If there is a network problem, we ingest all network-related data. If there is a security problem, we ingest all security-related information.

Example: Getting hit by Malware 

If malware hits you, you need to detect the affected container quickly. Then, prevent remote code execution attempts from succeeding while putting the affected server in quarantine for patching.

So there are several stages you need to perform, and the security breach affects different domains and teams. The topology changes, too: the backend and frontend will change, so we must re-route traffic while maintaining performance. To solve this, we need to analyze different types of data.

The different data types

You need to ingest as much telemetry data as possible: application, security, VPC, VNet, and Internet statistics. All of this data arrives via automation as metrics, events, logs, and distributed tracing based on OpenTelemetry.

    • Metrics: Metrics are aggregated measurements grouped or collected at standard intervals or over a given period. For example, with a 1-minute aggregate, some detail is lost. Aggregation helps you save on storage but requires proper pre-planning of which metrics to collect.
    • Events: Events are discrete actions happening at a specific moment in time. The more metadata associated with an event, the better. Events help confirm that particular actions occurred at a specific time.
    • Logs: Logs are detailed and have timestamps associated with them. They can be either structured or unstructured. As a result, logs are very versatile and power many use cases.
    • Traces: Traces follow a transaction as it moves between different application components: this item was purchased via credit card at this time, and the transaction took 37 seconds to complete. All chain details and dependencies are part of the trace. Traces allow you to follow what is going on.
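
To make the four data types concrete, the plain-Python sketch below shows one business transaction emitting a trace span, an event, a log line, and a metric, all tied together by a shared trace ID. The field names are illustrative, not the OpenTelemetry wire format.

```python
import time
import uuid

trace_id = str(uuid.uuid4())   # ties all signals to one transaction
now = time.time()

# Trace: one span in the checkout transaction, with its dependency chain.
span = {"trace_id": trace_id, "span": "charge-credit-card",
        "parent": "checkout", "duration_s": 37.0}

# Event: a discrete action at a specific moment, with rich metadata.
event = {"trace_id": trace_id, "action": "purchase.completed",
         "ts": now, "user": "alice", "amount_eur": 49.99}

# Log: a detailed, timestamped line tied to the same transaction.
log = {"ts": now, "level": "INFO", "trace_id": trace_id,
       "msg": "payment gateway responded", "status": 200}

# Metric: an aggregate over an interval; per-request detail is lost.
metric = {"name": "checkout.duration_s.avg_1m", "ts": now, "value": 12.4}

for record in (span, event, log, metric):
    print(record)
```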

In the case of malware detection, a combination of metrics, traces, and logs, together with the ability to switch between views and automated correlation, will help you get to the root cause. But you must also detect and respond appropriately, which leads us to secure network analytics.

Secure Network Analytics

We need good, secure network analytics for visibility and detection, and then to respond in the best way possible. Several different types of analytical engines can be used to detect a threat. In the last few years, there has been growing discussion of analytics and how it can be used in networking and security. Many vendors claim they do both supervised and unsupervised machine learning, both of which are used in the detection phase.

Diagram: Distributed systems observability and the issue of point solutions.

Algorithms and statistical models

For analytics, we have algorithms and statistical models. They aim to achieve some outcome and are extremely useful in understanding constantly evolving domains with many variations, which is precisely what the security domain is, by definition.

However, the threat landscape is growing daily, so if you want to find these threats, you need to sift through a lot of data, commonly known as machine data, which we discussed at the start of the post.

For supervised machine learning, we take a piece of malware and build up a threat profile gleaned from massive amounts of data. When you see a matching behavior profile, you can raise an alarm. But you need a lot of data to start with.

Crypto mining

This can capture very evasive threats such as crypto mining. A cryptocurrency miner is software that uses your computer's resources to mine cryptocurrency. On the network, a crypto mining event often appears as nothing more than a long-lived flow. You need additional ways to gather more metrics and determine that this long-lived flow is malicious and is, in fact, a cryptocurrency miner.
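
As a hedged illustration, the Python sketch below combines flow duration with two extra signals, destination port and byte symmetry, to score a long-lived flow as a possible miner. The thresholds and port list are invented for demonstration; real detection relies on the multilayer machine learning described next.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    dst_port: int
    duration_s: float    # how long the flow has been alive
    bytes_out: int       # client -> server
    bytes_in: int        # server -> client

def looks_like_miner(flow: Flow) -> bool:
    """A long-lived flow alone is not proof; combine simple signals.

    Illustrative thresholds only: long duration, a stratum-style pool
    port, and small but roughly symmetric traffic in both directions.
    """
    long_lived = flow.duration_s > 6 * 3600
    miner_port = flow.dst_port in {3333, 4444, 14444}  # common pool ports
    total = flow.bytes_in + flow.bytes_out
    small_and_symmetric = (0 < total < 50_000_000 and
                           0.2 < flow.bytes_out / max(flow.bytes_in, 1) < 5.0)
    return long_lived and (miner_port or small_and_symmetric)

print(looks_like_miner(Flow(3333, 36_000, 4_000_000, 6_000_000)))  # True
print(looks_like_miner(Flow(443, 120, 20_000, 1_500_000)))         # False
```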

Diagram: Full stack observability will capture crypto mining.

Multilayer Machine Learning Model

By their nature, crypto mining and even Tor will escape most security controls. To capture these, you need a multilayer machine learning model combining supervised and unsupervised approaches. A standard network control that blocks Tor will stop it roughly 70% of the time; the other 30% of entry and exit nodes are unknown.

Machine Learning (ML)

Supervised and unsupervised machine learning give you the additional visibility to find the unknown unknowns: the unique situations lurking on your networks. We make an observation, and these models help you understand whether it is normal. There are different observation triggers.

First, there are known bad behaviors, such as security policy violations and communication with known command-and-control (C&C) servers. Then we have anomaly conditions: observed behavior that differs from the usual. And we need to make these alerts meaningful to the business.

Diagram: Full stack observability with a layered approach.

Meaningful alerts

If IP address 192.168.1.1 uploads a large amount of data, the alert should not just state the raw fact. It should say that the PCI server is uploading a large amount of data to a known malicious external network, along with the remediation options. The statement or alert needs to mean something to the business.

We need to express the algorithms' findings in the company's language. This host, for example, could have a behavior profile in which it is not expected to download or upload anything.
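
The Python sketch below shows one way such business-language alerts could be assembled, joining a raw detection against a hypothetical asset inventory and threat-intelligence feed. The inventory, feed, and wording are invented for illustration.

```python
# Hypothetical lookups; in practice these come from a CMDB and a
# threat-intelligence/reputation feed.
ASSET_INVENTORY = {
    "192.168.1.10": {"role": "PCI server", "expected_upload": False},
}
THREAT_INTEL = {"203.0.113.50": "a known malicious external network"}

def business_alert(src_ip: str, dst_ip: str, mbytes: float) -> str:
    asset = ASSET_INVENTORY.get(src_ip, {"role": f"host {src_ip}",
                                         "expected_upload": True})
    reputation = THREAT_INTEL.get(dst_ip, "an external network")
    alert = (f"{asset['role']} is uploading {mbytes:.0f} MB to "
             f"{reputation} ({dst_ip}).")
    if not asset["expected_upload"]:
        alert += (" This host's behavior profile does not expect uploads."
                  " Remediation options: quarantine the host and review"
                  " egress firewall rules.")
    return alert

print(business_alert("192.168.1.10", "203.0.113.50", 1200))
```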

Augment Information

When events leave the system, you can enrich them with data from other systems, adding telemetry from additional sources that gives the data more meaning. To help with your alarm, you can add information about the entity. There is a lot of telemetry in the network: most devices support NetFlow and IPFIX, and you can add Encrypted Traffic Analysis (ETA) and Deep Packet Inspection (DPI).

Diagram: Encrypted traffic analysis.

You can get valuable insights from these different technologies: usernames, device identities, roles, behavior patterns, and locations as additional data sources. ETA can extract a lot of information just by looking at the header, without performing decryption. So you can enhance your knowledge of the entity with additional telemetry data.

Network analysis and visibility with a tiered alarm system

Once an alert is received, you can trigger actions such as sending a syslog message, an email, an SNMP trap, or a webhook. So you have a tiered alarm system with different priorities and severities on alarms. You can then enrich or extend the detection with data from other products, querying them via their APIs, such as Cisco Talos.

Instead of presenting analysts with all the data, give them the data they care about. This adds context to the investigation and helps the overworked security analyst who spends 90 minutes on a single phishing email investigation.
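
A minimal Python sketch of such a tiered alarm system follows: severities map to an escalating set of delivery channels, with stub functions standing in for real syslog, email, SNMP trap, or webhook senders.

```python
from enum import IntEnum

class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3

# Stub senders; a real system would emit syslog, SMTP mail, SNMP traps,
# or webhooks here.
def send_syslog(msg):  print(f"[syslog]  {msg}")
def send_email(msg):   print(f"[email]   {msg}")
def send_webhook(msg): print(f"[webhook] {msg}")

# Tiered routing: higher severities fan out to more channels.
ROUTES = {
    Severity.INFO:     [send_syslog],
    Severity.WARNING:  [send_syslog, send_email],
    Severity.CRITICAL: [send_syslog, send_email, send_webhook],
}

def raise_alarm(severity: Severity, message: str) -> None:
    for sender in ROUTES[severity]:
        sender(f"{severity.name}: {message}")

raise_alarm(Severity.CRITICAL, "PCI server uploading to malicious network")
```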

Summary: Network Visibility

Network visibility refers to real-time monitoring, analyzing, and visualizing network traffic, data, and activities. It provides a comprehensive view of the entire network ecosystem, including physical and virtual components, devices, applications, and users. By capturing and processing network data, organizations gain valuable insights into network performance bottlenecks, security threats, and operational inefficiencies.

The Benefits of Network Visibility

Enhanced Network Performance: Organizations can proactively identify and resolve performance issues with network visibility. They can optimize network resources, ensure smooth operations, and improve user experience by monitoring network traffic patterns, bandwidth utilization, and latency.

Strengthened Security Posture: Network visibility is a powerful security tool that enables organizations to detect and mitigate potential threats in real-time. By analyzing traffic behavior, identifying anomalies, and correlating events, businesses can respond swiftly to security incidents, safeguarding their digital assets and sensitive data.

Improved Operational Efficiency: Network visibility provides valuable insights into network usage, allowing organizations to optimize resource allocation, plan capacity upgrades, and streamline network configurations. This results in improved operational efficiency, reduced downtime, and cost savings.

Implementing Network Visibility Solutions

Network Monitoring Tools: Deploying robust monitoring tools is essential for achieving comprehensive visibility. These tools collect and analyze network data, generating detailed reports and alerts. Various monitoring techniques, from packet sniffing to flow-based analysis, can suit specific organizational needs.

Traffic Analysis and Visualization: Network visibility solutions often include traffic analysis and visualization capabilities, enabling organizations to gain actionable insights from network data. These visual representations help identify traffic patterns, trends, and potential issues at a glance, simplifying troubleshooting and decision-making processes.

Real-World Use Cases

Network Performance Optimization: A multinational corporation successfully utilized network visibility to identify bandwidth bottlenecks and optimize network resources. By monitoring traffic patterns, it was able to reroute traffic and implement Quality of Service (QoS) policies, enhancing network performance and improving user experience.

Security Incident Response: A financial institution leveraged network visibility to swiftly detect and respond to cybersecurity threats. By analyzing network traffic in real time, it identified suspicious activities and potential data breaches, enabling immediate action and effective risk mitigation.

Conclusion: Network visibility is no longer a luxury but a necessity for businesses operating in today’s digital landscape. It empowers organizations to proactively manage network performance, strengthen security postures, and improve operational efficiency. By implementing robust network visibility solutions and leveraging the insights they provide, businesses can unlock the full potential of their digital infrastructure.


WAN Monitoring

SD WAN Monitoring

In today's digital landscape, the demand for seamless and reliable network connectivity is paramount. This is where Software-Defined Wide Area Networking (SD-WAN) comes into play. SD-WAN offers enhanced agility, cost savings, and improved application performance. However, to truly leverage the benefits of SD-WAN, effective monitoring is crucial. In this blog post, we will explore the importance of SD-WAN monitoring and how it empowers businesses to conquer the digital highway.

SD-WAN monitoring involves the continuous observation and analysis of network traffic, performance metrics, and security aspects within an SD-WAN infrastructure. It provides real-time insights into network behavior, enabling proactive troubleshooting, performance optimization, and security management.

WAN monitoring refers to the practice of actively monitoring and managing a wide area network to ensure its smooth operation. It involves collecting data about network traffic, bandwidth utilization, latency, packet loss, and other key performance indicators. By continuously monitoring the network, administrators can identify potential issues, troubleshoot problems, and optimize performance.

Key Points:

a. Proactive Network Management: WAN monitoring enables proactive identification and resolution of network issues before they impact users. By receiving real-time alerts and notifications, administrators can take immediate action to mitigate disruptions and minimize downtime.

b. Enhanced Performance: With WAN monitoring, administrators gain granular visibility into network performance metrics. They can identify bandwidth bottlenecks, optimize routing, and allocate resources efficiently, resulting in improved network performance and user experience.

c. Security and Compliance: WAN monitoring helps detect and prevent security breaches by monitoring traffic patterns and identifying anomalies. It enables the identification of potential threats, such as unauthorized access attempts or data exfiltration. Additionally, it aids in maintaining compliance with industry regulations by monitoring network activity and generating audit logs.

a. Scalability: When selecting a WAN monitoring solution, it is important to consider its scalability. Ensure that the solution can handle the size and complexity of your network infrastructure, accommodating future growth and network expansions.

b. Real-time Monitoring: Look for a solution that provides real-time monitoring capabilities, allowing you to detect issues as they occur. Real-time data and alerts enable prompt troubleshooting and minimize the impact on network performance.

c. Comprehensive Reporting: A robust WAN monitoring solution should offer detailed reports and analytics. These reports provide valuable insights into network performance trends, usage patterns, and potential areas for improvement.

Highlights: SD WAN Monitoring

The Role of SD-WAN

SD-WAN, or Software-Defined Wide Area Network, is a technology that enables organizations to connect and manage multiple locations through a centralized control mechanism. It allows for simplified network management, enhanced security, and improved application performance. By leveraging software-defined networking principles, SD-WAN provides businesses greater flexibility and agility in their network infrastructure.

Benefits of SD-WAN Monitoring

A. Proactive Issue Detection and Troubleshooting: SD-WAN monitoring equips IT teams to identify potential network issues before they impact user experience. With comprehensive visibility into network performance and traffic patterns, organizations can proactively address bottlenecks, latency, or other performance-related challenges.

B. Performance Optimization: By closely monitoring network traffic and application performance, SD-WAN monitoring enables organizations to optimize bandwidth allocation, ensuring critical applications receive priority and resources are efficiently utilized. This leads to enhanced user experience and increased productivity.

C. Security Management: SD-WAN monitoring plays a vital role in maintaining a secure network environment. It allows for real-time threat detection, anomaly detection, and policy enforcement. IT teams can promptly identify and mitigate security risks, ensuring the integrity and confidentiality of data transmitted across the SD-WAN.


Choosing the Right SD-WAN Monitoring Solution

A. Comprehensive Network Visibility: Look for an SD-WAN monitoring solution that offers granular visibility into network performance, application behavior, and security events. Real-time analytics and customizable dashboards are essential for effective monitoring.

B. Scalability and Flexibility: As your network expands, scalability becomes crucial. Ensure the monitoring solution can handle growth and adapt to changing network requirements. Scalability should include multi-site monitoring capabilities and integration with different SD-WAN vendors.

C. Intelligent Alerting and Reporting: The monitoring solution should provide intelligent alerting mechanisms, notifying IT teams of critical issues or deviations from normal network behavior. Detailed reporting and analytics help assess network performance over time and make informed decisions.

SD-WAN Monitoring: The Components

Stage 1: Application Visibility

For SD-WAN to make the correct provisioning and routing decisions, visibility into application performance is required. SD-WAN enforces the right QoS policy based on how an application is tagged. To determine what prioritization applications need within QoS policies, you need monitoring tools that deliver insights on parameters such as application response times, network saturation, and bandwidth usage. You control the overlay.

Stage 2: Underlay Visibility

Next, you need to consider underlay visibility. I have found a gap in visibility between the tunnels riding over the network and the underlying transport network, since SD-WAN visibility leans heavily on the virtual overlay. For WAN underlay monitoring, remember that the underlay is a hardware-dependent physical network responsible for delivering packets. It can be the Internet, MPLS, satellite, Ethernet, broadband, or any transport mode, and a service provider controls it.

Stage 3: Security Visibility

Finally, and most importantly, security visibility. Here, we need to cover both the underlay and overlay of the SD-WAN network, considering devices, domains, IPs, users, and connections throughout the network. Malicious traffic, such as crypto mining, can often hide in encrypted packets and appear like normal traffic. Traditional deep packet inspection (DPI) engines have proven to fall short here.

We must look at deep packet dynamics (DPD) and encrypted traffic analysis (ETA). Combined with artificial intelligence (AI), it can fingerprint the metadata of the packet and use behavioral heuristics to see through encrypted traffic for threats without the negative aspects of decryption.

Diagram: SD-WAN monitoring.

The traditional WAN

So, within your data center topology, the old approach to the WAN did not scale well. First, there is the cost, complexity, and length of installation times. The network is built on expensive proprietary equipment that is difficult to manage, on top of expensive transport costs that lack agility. Not to mention the complexity of segmentation, with intricate BGP configurations and tagging mechanisms used to control traffic over the WAN. There are also limitations to traditional routing protocols; it is not that they are badly designed, it is simply that a different solution is needed over the WAN.

There was also a distributed control plane where every node had to be considered and managed. And if you had multi-vendor equipment at the WAN edge, different teams could be managing it in different locations.

No WAN control

As soon as you want to upgrade, you could be looking at 8 to 12 weeks of lead time. With the legacy network, all the change control sits with the service provider, which I have found to be a major challenge. There was also a significant architectural change as a continuous flow of applications moved to the cloud. Routing via the primary data center, where the security stack was located, became less important; in a cloud-first world, it was much better to route the application directly to the cloud.

WAN Modernization

The initial use case for SD-WAN and other routing control platforms was to increase the use of Internet-based links and reduce the high costs of MPLS. However, when you start deploying SD-WAN, many other benefits become apparent. As you deploy SD-WAN, you get five-nines availability with dual Internet links, and MPLS at the WAN edge of the network becomes something you can move away from, especially for remote branches.

There was also the need for transport independence and to avoid the long lead times associated with deploying a new MPLS circuit. With SD-WAN, you create SD-WAN overlay tunnels over the top of whatever ISP and mix and match as you see fit.

With SD-WAN, we now have an SD-WAN controller in a central location. This brings with it a lot of consistency in security and performance. In addition, we have a consistent policy pushed through the network regardless of network locations.

  • SD-WAN monitoring and performance-based application delivery

SD-WAN is also application-focused; we now have performance-based application delivery and routing. This type of design was possible with traditional WANs but was challenging and complex to manage daily. It is a better use of capital and delivers better business outcomes: we can use the less expensive connection without dropping packets, and there is no longer a link sitting idle purely as a backup. With SD-WAN, you have several virtual paths and can route around all failures.

Now applications can be routed intelligently, and using performance as a key driver makes WAN monitoring more complete. It is no longer just about making a decision based on up or down. We now have the concept of brownouts, perhaps high latency or high jitter: the circuit is not down, but the application will route around the issue with intelligent WAN segmentation.

Diagram: Performance-based application delivery.

Troubleshoot brownouts

    • Detecting brownouts

Traditional monitoring solutions focus on device health and cannot detect complex network service issues like brownouts. Therefore, it is critical to evaluate solutions that are easy to deploy and that simulate end-user behavior from suitable locations for the relevant network services. Most reported brownout causes require active monitoring to detect. Five of the top six reasons brownouts occur can only be seen with active monitoring: congestion, buffer-full drops, missing or misconfigured QoS, problematic in-line devices, external network issues, and poor planning or design of Wi-Fi.

Troubleshooting a brownout is difficult, especially when trying to understand geo policy and tunnel performance. Which applications and users are affected, and how do you tie this back to the SD-WAN tunnels? Brownouts differ from blackouts: the circuit stays up, but application performance is affected.
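
Below is a rough Python sketch of the active-monitoring idea: a synthetic probe repeatedly times TCP connections to a service and flags a brownout when the service is reachable but latency or jitter has drifted past a threshold. The target, thresholds, and sample count are placeholders; commercial tools simulate much richer end-user behavior.

```python
import socket
import statistics
import time

def probe_latency(host, port, count=5):
    """Actively measure TCP connect time in milliseconds."""
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=2):
                samples.append((time.perf_counter() - start) * 1000)
        except OSError:
            samples.append(float("inf"))   # blackout-style failure
        time.sleep(0.2)
    return samples

# Illustrative brownout rule: up, but latency or jitter beyond baseline.
samples = probe_latency("example.com", 443)
reachable = [s for s in samples if s != float("inf")]
if not reachable:
    print("blackout: service unreachable")
elif statistics.mean(reachable) > 150 or statistics.pstdev(reachable) > 50:
    print(f"brownout: latency {statistics.mean(reachable):.0f} ms, "
          f"jitter {statistics.pstdev(reachable):.0f} ms")
else:
    print("service within baseline")
```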

SD-WAN Monitoring and Visibility

So, there are clear advantages to introducing SD-WAN, but managers and engineers must consider how they will operationalize this new technology. Designing and installing is one aspect, but how will SD-WAN be monitored and maintained? Where do visibility and security fit into the picture? While most SD-WAN solutions provide native network and application performance visibility, this is not enough. I would recommend supplementing native SD-WAN visibility with third-party monitoring tools. SD-WAN vendors are not monitoring or observability experts, much like a networking vendor jumping into the security space.

The issues of encrypted traffic and DPI

Traditionally, we looked for anomalies in unencrypted traffic, where you can inspect the payload and use deep packet inspection (DPI). Nowadays, threats go beyond simple UDP scanning: bad actors operate inside encrypted traffic and can mask their activity among normal traffic. This renders some DPI products ineffective, because they cannot see the payloads. Without appropriate visibility, the appliance will send many alerts that are false positives.

Deep packet inspection technology

Deep packet inspection technology has been around for decades. It utilizes traffic mirroring to analyze the payload of each packet passing through a mirrored sensor or core device, which is the traditional approach to network detection and response (NDR). However, most modern cyberattacks, including ransomware, lateral movement, and Advanced Persistent Threats (APTs), make heavy use of encryption in their attack routines. This creates a security gap, since DPI was not built to analyze encrypted traffic.

Diagram: Deep packet inspection technology.

So, the legacy visibility solutions only work for unencrypted or clear text protocols such as HTTP. In addition, DPI requires a decryption proxy, or middlebox, to be deployed for encrypted traffic. Middleboxes can be costly, introduce performance bottlenecks, and create additional security concerns. Previously, security practitioners would apply DPI techniques to unencrypted HTTP traffic to identify critical session details such as browser user agent, presence of a network cookie, or parameters of an HTTP POST. However, as web traffic moves from HTTP to encrypted HTTPS, network defenders are losing visibility into those details.

Good visibility and security posture

Introducing telemetry

We need to leverage our network monitoring infrastructure effectively for better security and application performance monitoring, especially in the world of SD-WAN. This comes with challenges around collecting and storing standard telemetry and gaining visibility into encrypted traffic. Network teams spend a lot of time on security incidents, and sometimes the security team has to look after network issues, so these two teams must work together. For example, packet analysis should be leveraged by both teams, and flow and other telemetry data need to be analyzed by both.

The role of a common platform

It helps when network and security teams can work off a common platform and common telemetry. A network monitoring system can plug into your SD-WAN controller to help operationalize your SD-WAN environment. Many application performance problems arise from security issues, so you need to know your applications and examine encrypted traffic without decrypting it.

Network performance monitoring and diagnostics

For network performance monitoring and diagnostics, we have flow, SNMP, and APIs. For security teams, we have encrypted traffic analysis and machine learning (ML) for threat and risk identification. This reduces complexity and increases efficiency. With technologies such as secure access service edge (SASE) and SD-WAN emerging, the network and security teams are under pressure to respond better.

Diagram: Network performance monitoring.

Merging of network and security

The market is moving towards the merging of network and security teams. We see this with cloud, SD-WAN, and SASE. With the cloud, for example, we have a lot of security built into the fabric; with a VPC, we have security group policies built into the fabric. With SD-WAN, we have end-to-end segmentation, commonly based on an overlay technology that can also terminate on a virtual private cloud (VPC). SASE is a combination of all of these.

SASE Model

We need to improve monitoring, investigation capabilities, and detection. This is where zero trust architecture and technologies such as single packet authorization can help you monitor and enhance detection alongside detection and response solutions. In addition, we must look at network logging and encrypted traffic analysis to improve investigation capabilities. For investigation, we have traditionally looked at packets and logs, but we also have SNMP, NetFlow, and APIs. A lot of telemetry that was initially viewed as performance monitoring can be used for security, and it is now being applied to security and cybersecurity use cases.

SD-WAN Monitoring: The need for a baseline

You need to understand and baseline the current network for smooth SD-WAN rollouts. When it comes to policy, it is no longer just a primary link and a backup design; now we have intelligent application profiling. Everything is based on performance parameters such as loss, latency, and jitter. Before you start any of this, you must have good visibility and observability. You need to understand your network and get a baseline for policy creation; getting proper visibility is the first step in planning the SD-WAN rollout process.

Network monitoring platform

Traditional networks will have SNMP, flow data, and a lot of multi-vendor equipment. You need to monitor and understand how applications are used across the environment, and not everyone uses the same vendor for everything. For this, you need a network monitoring platform that can easily scale to perform baselining and complete reporting, and take in all multi-vendor networks. To deploy SD-WAN, you need a network monitoring platform that collects multiple types of telemetry, is multi-vendor, and scales.

Variety of telemetry

Consuming packets, decoding them into IPFIX, and bringing in API-based data is critical, and you need to be able to consume all of this data. Visibility is key when you are rolling out SD-WAN. You first need to baseline to see what is normal. This will tell you whether SD-WAN will make a difference and what kind of difference it will make at each site. With SD-WAN, you can deploy application-aware policies that are site-specific or region-specific, but you first need a baseline to tell you what policies you need at each site.

QoS visibility

With a network monitoring platform, you can get visibility into QoS. This can be done by using advanced flow technologies to see the markings. For example, in the case of VoIP, the traffic should be marked expedited forwarding (EF). We also need visibility into queueing, and shaping is critical too. You might assume that user phones automatically mark the traffic as EF.

QoS classification

Still, a misconfiguration at one of the switches in the data path could be remarking this to best effort. Once you have all this data, you must collect and store it. The monitoring platform needs to scale, especially for global customers, and to collect information for large environments. Flow can be challenging: what if you have 100,000 flow records per second?
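
As a small illustration of QoS visibility, the Python sketch below checks the DSCP markings in flow records against the expected per-application class (EF, DSCP 46, for voice) and flags flows that appear to have been remarked somewhere in the data path. The flow record layout is hypothetical.

```python
from typing import Optional

# DSCP values we expect per application class; EF (46) is voice.
EXPECTED_DSCP = {"voip": 46}

def check_marking(flow: dict) -> Optional[str]:
    """Flag flows whose DSCP differs from the expected class marking."""
    expected = EXPECTED_DSCP.get(flow["app"])
    if expected is not None and flow["dscp"] != expected:
        return (f"{flow['app']} flow {flow['src']} -> {flow['dst']} carries "
                f"DSCP {flow['dscp']}, expected {expected} (EF); a switch in "
                f"the data path may be remarking to best effort")
    return None

flows = [
    {"app": "voip", "src": "10.1.1.5", "dst": "10.9.9.9", "dscp": 46},
    {"app": "voip", "src": "10.1.1.6", "dst": "10.9.9.9", "dscp": 0},
]
for f in flows:
    issue = check_marking(f)
    if issue:
        print(issue)
```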

WAN capacity planning

Once you have a baseline, you need to understand WAN capacity planning for each service provider. This allows you to re-evaluate your service provider needs and, in the long run, save costs. We can also use WAN capacity planning to flag when a site is reaching its limit. WAN capacity planning is not just about reports; we now look deeply at the data to draw value from it. Here we see the introduction of artificial intelligence for IT operations (AIOps) and machine learning to help predict WAN capacity and future problems. This gives you a long-term prediction when deciding on WAN bandwidth and service provider needs.
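
A heavily simplified sketch of the trend side of capacity planning is shown below: it fits a linear regression to hypothetical monthly utilization figures and estimates when a circuit reaches its provisioned limit. The numbers are invented, real AIOps engines use far richer models that account for seasonality, and statistics.linear_regression requires Python 3.10 or later.

```python
from statistics import linear_regression

# Hypothetical monthly 95th-percentile utilization (Mbps) for one site.
months = list(range(1, 13))
mbps = [210, 220, 235, 240, 255, 262, 275, 290, 300, 310, 330, 345]

slope, intercept = linear_regression(months, mbps)
circuit_capacity = 500.0   # Mbps, the provisioned limit for this site

# Months from now until the trend line crosses the circuit capacity.
months_to_limit = (circuit_capacity - intercept) / slope - months[-1]
print(f"growth: {slope:.1f} Mbps/month; "
      f"capacity reached in about {months_to_limit:.0f} months")
```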

Get to know your sites and POC.

You also need to know your sites. A network monitoring platform lets you look at sites and understand bandwidth usage across your service providers, enabling you to identify your critical sites. You will want a cross-section of varying sites, including some on satellite connections or LTE, which is common in retail. Look for varied sites, and learn about problematic sites where users have problems with applications; these are good candidates for proofs of concept.

Your network performance management software will give you visibility into which sites to include in your proof of concept. The platform will tell you which sites are critical and which are problematic in terms of performance, giving you a good mix for a proof of concept. When you get the appropriate sites in the mix, you will immediately see the return on investment (ROI) for SD-WAN: uptime will increase, and you will see this immediately. But for this to work, you first need a baseline.

Identify your applications: everything is port 80

So, we have latency, jitter, and loss. When a circuit fails completely, the loss is apparent. However, at 1 to 5% packet loss, there may be no failover, and specific applications can still be negatively affected. Also, many people do not know what applications are running. What about users connecting to the VPN with no split tunnel and then streaming movies? We have IPs and ports to identify the applications running on the network, but everything is port 80 now. So you need to be able to consume different types of telemetry from the network to fully understand your applications.

The issues with deep packet inspection

So, what about the homegrown applications that a DPI engine might not know about? Many DPI vendors have trouble identifying these. You need a network monitoring platform that can categorize and identify applications based on parameters DPI cannot. A DPI engine can classify many applications, but it cannot do everything. A network monitoring platform can create a custom application definition based on, say, an IP address, port number, URL, and URI.
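
The Python sketch below illustrates the idea of custom application definitions: a homegrown app is matched on destination IP, port, and URI prefix before falling back to whatever the DPI engine decides. The application name and match values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CustomApp:
    """A homegrown application a DPI engine would not recognize."""
    name: str
    ips: set = field(default_factory=set)
    ports: set = field(default_factory=set)
    uri_prefixes: tuple = ()

    def matches(self, dst_ip: str, dst_port: int, uri: str = "") -> bool:
        return (dst_ip in self.ips
                and dst_port in self.ports
                and (not self.uri_prefixes
                     or uri.startswith(self.uri_prefixes)))

APPS = [
    CustomApp("payroll-portal", {"10.20.0.15"}, {80, 443}, ("/payroll",)),
]

def classify(dst_ip: str, dst_port: int, uri: str = "") -> Optional[str]:
    for app in APPS:
        if app.matches(dst_ip, dst_port, uri):
            return app.name
    return None   # fall back to the DPI engine's verdict

print(classify("10.20.0.15", 443, "/payroll/login"))   # payroll-portal
print(classify("10.20.0.15", 443, "/intranet"))        # None
```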

Connecting to the SD-WAN Controller

The traditional approach is to talk to each device individually. In the SDN world, however, the network monitoring platform should connect to the controller instead of each device. If you can connect to the controller, you have a single source of truth from which all the information can be gleaned. API connections can be made to the SD-WAN controller to form a holistic view of all devices at the WAN edge. We then need to collect different types of information from the SD-WAN controller for SD-WAN monitoring.

SD-WAN API connectivity 

For example, some SD-WAN vendors expose metrics via SNMP, NetFlow, and APIs, so you should collect all the different types of telemetry data. This allows you to identify network semantics: where the sites are, the WAN interfaces, the service provider transports, and logical constructs such as IP address allocations. Multi-telemetry helps you drive workflows, geo-location mappings, performance visibility, and detailed reporting, all brought together on one platform. We are bringing in all the information about the environment, which is impossible with a NetFlow v5 packet.
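
A sketch of controller-based collection might look like the following Python snippet, using the third-party requests library. The endpoint path, token handling, and response fields are placeholders: every SD-WAN vendor exposes its own REST schema, so the real calls come from the vendor's API reference.

```python
import requests

CONTROLLER = "https://sdwan-controller.example.com"   # placeholder URL
TOKEN = "REDACTED"   # obtained via the vendor's authentication flow

def get_edge_inventory() -> list:
    """Pull site and WAN-edge inventory from the controller.

    The path and fields below are illustrative, not a real vendor API.
    """
    resp = requests.get(
        f"{CONTROLLER}/api/v1/edges",
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# One API call yields a holistic view instead of polling every device.
for edge in get_edge_inventory():
    print(edge["site"], edge["wan_interface"], edge["transport"],
          edge["ip_allocation"])
```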

Requirements for a network monitoring platform

Diagram: Network monitoring platform.

Know application routing

The network monitoring platform needs to know the application policy and routing. It needs to know when there are error-threshold events, as applications are routed based on intelligent policy. Once the policy is understood, you must see how the application is routed over the overlay. With SD-WAN, we can have a topology per segment, based on a VRF or service VPN: full mesh, or regions with hub and spoke. Per-segment topology verification is also needed to know that things are running correctly, to understand the application policy and what the traffic looks like, and to be able to verify brownouts.

Performance-based routing

  • SD-WAN multi-vendor

Due to mergers or acquisitions, you may have an environment with multiple SD-WAN vendors, each with its own secret sauce. There may even be different business units. The network monitoring platform needs to bridge the gap and monitor both sides. So how do you leverage common infrastructure to achieve this? First, leverage telemetry for monitoring and analysis. This matters because an investment in packet analysis should be leveraged by both security and network teams, reducing tool sprawl.

Overcome the common telemetry challenges.

Working with common telemetry does come with challenges, and every type of telemetry has its own. First, big data: there is a lot of volume in terms of storage size, and you need speed and planning around where you will do all the packet analysis. Next, we have the collection and performance side of things. How do we collect all of this data? From a flow perspective, you can get flow from different devices, so how do you collect from all the edge devices and bring the data into a central location?

Finally, we have cost and complexity challenges. You may have different products for different solutions: an NPM for network performance monitoring, an NDR, and packet capture. These products work on the same telemetry. Many organizations start with packet capture and move to an NPM or NDR solution.

A final note on encrypted traffic

  • SD-WAN encryption

With SD-WAN, everything is encrypted across public transport. Most SD-WAN vendors can meter traffic on the LAN side before it enters the SD-WAN tunnels, but many applications are encrypted end to end. You need to identify activity, even keystrokes, inside encrypted sessions. How can you get full visibility into encrypted traffic? Before long, nearly all traffic will be encrypted. Here, a network monitoring platform can identify and analyze threats in encrypted traffic.

  • Deep packet dynamics

You should be able to track and classify traffic using what is known as deep packet dynamics (DPD), which can include, for example, byte distributions, packet sequences, timing, jitter, RTT, and interflow statistics. We can feed this into machine learning to identify applications and any anomalies associated with encrypted traffic. This identifies threats in encrypted traffic without decrypting it.

Deep packet dynamics improves encrypted traffic visibility while remaining scalable, adding no latency, and not violating privacy. It gives us a malware detection method and a cryptographic assessment of secured network sessions that does not rely on decryption; it works without having the keys or decrypting the traffic. Managing session keys for decryption is complex, computationally costly, and often incomplete: solutions may only support session key forwarding on Windows or Linux, not on macOS, never mind the world of IoT.
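
To illustrate the idea, the Python sketch below derives deep-packet-dynamics style features (packet sizes and inter-arrival gaps only, no payload) from synthetic flows and trains an unsupervised IsolationForest to separate a metronome-regular, miner-like flow from benign traffic. It assumes NumPy and scikit-learn are installed, and the synthetic data merely stands in for a real corpus of benign flows.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def dpd_features(pkt_sizes, pkt_times):
    """DPD-style features from one flow: sizes and timing only, so the
    same features work on encrypted traffic without decryption."""
    gaps = np.diff(pkt_times)
    return [np.mean(pkt_sizes), np.std(pkt_sizes),
            np.mean(gaps), np.std(gaps), len(pkt_sizes)]

rng = np.random.default_rng(0)
# Synthetic benign flows: varied sizes, bursty (exponential) timing.
benign = [dpd_features(rng.normal(800, 300, 50),
                       np.cumsum(rng.exponential(0.05, 50)))
          for _ in range(200)]

model = IsolationForest(random_state=0).fit(benign)

# A miner-like flow: tiny, metronome-regular packets over a long period.
suspect = dpd_features(rng.normal(120, 5, 50), np.cumsum(np.full(50, 2.0)))
print("anomalous" if model.predict([suspect])[0] == -1 else "normal")
```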

Diagram: Encrypted traffic analytics.

Encrypted traffic analytics

Cisco’s Encrypted Traffic Analytics (ETA) uses Stealthwatch software to compare the metadata of benign and malicious network packets to identify malicious traffic, even if it is encrypted, providing insight into threats in encrypted traffic without decryption. In addition, recent work on Cisco’s TLS fingerprinting can provide fine-grained details about the applications, operating systems, and processes on the enterprise network.

The issue with packet analysis is that everything is encrypted, especially with TLS 1.3; the traffic being monitored at the WAN edge is encrypted. How do you analyze all of this, and how do you store it? Decrypting traffic can create an exploit and a potential attack surface, and you do not want to decrypt everything; this is where encrypted traffic analysis comes in.

Summary: SD WAN Monitoring

In today’s digital landscape, businesses heavily rely on their networks to ensure seamless connectivity and efficient data transfer. As organizations increasingly adopt Software-Defined Wide Area Networking (SD-WAN) solutions, the need for robust monitoring becomes paramount. This blog post delved into SD-WAN monitoring, its significance, and how it empowers businesses to optimize their network performance.

Section 1: Understanding SD-WAN

SD-WAN, short for Software-Defined Wide Area Networking, revolutionizes traditional networking by leveraging software-defined techniques to simplify management, enhance agility, and streamline connectivity across geographically dispersed locations. By abstracting network control from the underlying hardware, SD-WAN enables organizations to optimize bandwidth utilization, reduce costs, and improve application performance.

Section 2: The Role of Monitoring in SD-WAN

Effective monitoring plays a pivotal role in ensuring the smooth operation of SD-WAN deployments. It provides real-time visibility into network performance, application traffic, and security threats. Monitoring tools enable IT teams to proactively identify bottlenecks, latency issues, or network disruptions, allowing them to address these challenges and maintain optimal network performance swiftly.

Section 3: Key Benefits of SD-WAN Monitoring

3.1 Enhanced Network Performance: SD-WAN monitoring empowers organizations to monitor and analyze network traffic, identify performance bottlenecks, and optimize bandwidth allocation. This leads to improved application performance and enhanced end-user experience.

3.2 Increased Security: With SD-WAN monitoring, IT teams can monitor network traffic for potential security threats, detect anomalies, and quickly respond to attacks or breaches. Monitoring helps ensure compliance with security policies and provides valuable insights for maintaining a robust security posture.

3.3 Proactive Issue Resolution: Real-time monitoring allows IT teams to proactively identify and resolve issues before they escalate. Organizations can minimize downtime, optimize resource allocation, and ensure business continuity by leveraging comprehensive visibility into network performance and traffic patterns.

Section 4: Best Practices for SD-WAN Monitoring

4.1 Choosing the Right Monitoring Solution: Select a monitoring solution that aligns with your organization’s specific needs, supports SD-WAN protocols, and provides comprehensive visibility into network traffic and performance metrics.

4.2 Monitoring Key Performance Indicators (KPIs): Define relevant KPIs such as latency, packet loss, jitter, and bandwidth utilization to track network performance effectively. Regularly monitor these KPIs to identify trends, anomalies, and areas for improvement.

4.3 Integration with Network Management Systems: Integrate SD-WAN monitoring tools with existing network management systems and IT infrastructure to streamline operations, centralize monitoring, and enable a holistic network view.

Conclusion:

SD-WAN monitoring is a critical component of successful SD-WAN deployments. By providing real-time visibility, enhanced network performance, increased security, and proactive issue resolution, monitoring tools empower organizations to maximize the benefits of SD-WAN technology. As businesses continue to embrace SD-WAN solutions, investing in robust monitoring capabilities will be essential to ensuring optimal network performance and driving digital transformation.


Implementing Network Security

Implementing Network Security

In today's interconnected world, where technology reigns supreme, the need for robust network security measures has become paramount. This blog post aims to provide a detailed and engaging guide to implementing network security. By following these steps and best practices, individuals and organizations can fortify their digital infrastructure against potential threats and protect sensitive information.

Network security is the practice of protecting networks and their infrastructure from unauthorized access, misuse, or disruption. It encompasses various technologies, policies, and practices aimed at ensuring the confidentiality, integrity, and availability of data. By employing robust network security measures, organizations can safeguard their digital assets against cyber threats.

Network security encompasses a range of measures designed to protect computer networks from unauthorized access, data breaches, and other malicious activities. It involves both hardware and software components, as well as proactive policies and procedures aimed at mitigating risks. By understanding the fundamental principles of network security, organizations can lay the foundation for a robust and resilient security infrastructure.

Before implementing network security measures, it is crucial to conduct a comprehensive assessment of potential risks and vulnerabilities. This involves identifying potential entry points, evaluating existing security measures, and analyzing the potential impact of security breaches. By conducting a thorough risk assessment, organizations can develop an effective security strategy tailored to their specific needs.

- Implementing Strong Access Controls: One of the fundamental aspects of network security is controlling access to sensitive information and resources. This includes implementing strong authentication mechanisms, such as multi-factor authentication, and enforcing strict access control policies. By ensuring that only authorized individuals have access to critical systems and data, organizations can significantly reduce the risk of unauthorized breaches.

- Deploying Firewalls and Intrusion Detection Systems: Firewalls and intrusion detection systems (IDS) are essential components of network security. Firewalls act as a barrier between internal and external networks, monitoring and filtering incoming and outgoing traffic. IDS, on the other hand, analyze network traffic for suspicious activities or patterns that may indicate a potential breach. By deploying these technologies, organizations can detect and prevent unauthorized access attempts.

- Regular Updates and Patches: Network security is an ongoing process that requires constant attention and maintenance. Regular updates and patches play a crucial role in addressing vulnerabilities and fixing known security flaws. It is essential to keep all network devices, software, and firmware up to date to ensure optimal protection against emerging threats.

Highlights: Implementing Network Security

Implementing Network Security 

Understanding Network Security

Network security refers to the practices and measures used to prevent unauthorized access, misuse, modification, or denial of computer networks and their resources. It involves implementing various protocols, technologies, and best practices to ensure data confidentiality, integrity, and availability. Individuals and organizations can make informed decisions to protect their networks by understanding network security fundamentals.

Firewalls: Firewalls are a crucial barrier between an internal network and the external world. They monitor and control incoming and outgoing network traffic based on predetermined security rules. By analyzing packet data, firewalls can identify and block potential threats, such as malicious software or unauthorized access attempts. Implementing a robust firewall solution is essential to fortify network security.

Intrusion Detection Systems (IDS): Intrusion Detection Systems (IDS) play a proactive role in network security. They continuously monitor network traffic, analyzing it for suspicious activities and potential security breaches. IDS can detect patterns and signatures of known attacks and identify anomalies that may indicate new or sophisticated threats. By alerting network administrators in real time, IDS helps mitigate risks and enable swift response to potential security incidents.

Virtual Private Networks (VPNs): In an era of prevalent remote work and virtual collaboration, Virtual Private Networks (VPNs) have emerged as a vital component of network security. VPNs establish secure and encrypted connections between remote users and corporate networks, ensuring the confidentiality and integrity of data transmitted over public networks. By creating a secure “tunnel,” VPNs protect sensitive information from eavesdropping and unauthorized interception, offering a safe digital environment.

Authentication Mechanisms: Authentication mechanisms are the bedrock of network security, verifying the identities of users and devices seeking access to a network. From traditional password-based authentication to multi-factor authentication and biometric systems, these mechanisms ensure that only authorized individuals or devices gain entry. Robust authentication protocols significantly reduce the risk of unauthorized access and protect against identity theft or data breaches.

Encryption: Encryption plays a crucial role in maintaining the confidentiality of sensitive data. By converting plaintext into an unreadable format using complex algorithms, encryption ensures that the information remains indecipherable to unauthorized parties even if intercepted. Whether it’s encrypting data at rest or in transit, robust encryption techniques are vital to protecting the privacy and integrity of sensitive information.

Understanding IPv4 and IPv6 Network Security

IPv4 Network Security:

IPv4, the fourth version of the Internet Protocol, has been the backbone of the Internet for several decades. However, its limited address space and security vulnerabilities have prompted the need for a transition to IPv6. IPv4 faces various security challenges, such as IP spoofing, distributed denial-of-service (DDoS) attacks, and address exhaustion.

Issues like insufficient address space and lack of built-in encryption mechanisms make IPv4 networks more susceptible to security breaches. To enhance IPv4 network security, organizations should implement measures like network segmentation, firewall configurations, intrusion detection systems (IDS), and regular security audits. Staying updated with security patches and protocols like HTTPS can mitigate potential risks.

IPv6 Network Security:

IPv6, the latest version of the Internet Protocol, offers significant improvements over its predecessor. Its expanded address space, improved security features, and built-in encryption make it a more secure choice for networking.

IPv6 incorporates IPsec (Internet Protocol Security), which provides integrity, confidentiality, and authentication for data packets. With IPsec, end-to-end encryption and secure communication become more accessible, enhancing overall network security.

IPv6 simplifies IP address assignment and reduces the risk of misconfiguration. This feature, together with temporary (privacy) addresses, improves network security by making it harder for attackers to track devices.

IPv4 to IPv6 Transition

Security Considerations

Dual Stack Deployment: While transitioning from IPv4 to IPv6, organizations often deploy dual-stack networks, supporting both protocols simultaneously. However, this introduces additional security considerations, as vulnerabilities in either protocol can impact the overall network security.

Transition Mechanism Security: Various transition mechanisms, such as tunneling and translation, facilitate communication between IPv4 and IPv6 networks. Ensuring the security of these mechanisms is crucial, as they can introduce potential vulnerabilities and become targets for attackers.


Example: IPv6 Access Lists

Understanding IPv6 Access-Lists

IPv6, the next-generation Internet Protocol, brings new features and enhancements. One critical aspect of IPv6 is the access list, which allows network administrators to filter and control traffic based on various criteria. Unlike IPv4 access lists, IPv6 access lists offer a more robust and flexible approach to network security.

One of the primary purposes of IPv6 access lists is to filter traffic based on specific conditions. This section will explore various filtering techniques, including source and destination IP address, protocol, and port-based filtering. We will also discuss using prefix lists and leveraging them to enhance traffic filtering capabilities.

Monitoring and Incident Response

Implement a comprehensive monitoring system to detect and respond to potential security incidents. This includes real-time network traffic analysis, log monitoring, and intrusion detection systems. Establish an incident response plan that outlines the steps to be taken in case of a security breach, ensuring a swift and effective response to minimize damage.

Example: IPv6 Connectivity

Understanding Multicast Communication

Multicast communication allows data to be efficiently transmitted from a single sender to multiple recipients. Unlike unicast (one-to-one) or broadcast (one-to-all) communication, multicast offers a scalable and optimized approach for distributing information across networks. It conserves bandwidth while ensuring that data reaches only the intended recipients.

IPv6 introduces several types of addresses, each serving a specific purpose. These include unicast, anycast, and multicast addresses. Unicast addresses identify individual nodes, anycast addresses designate a group of nodes where the data can be delivered to the nearest one, and multicast addresses enable communication with multiple nodes simultaneously.

The Solicited Node Multicast Address is significant among the different types of multicast addresses. Its primary purpose is to resolve IPv6 unicast addresses to their corresponding multicast addresses efficiently. By using the solicited-node multicast address, devices can discover the presence of other nodes on the network without flooding the entire network with unnecessary traffic.

Structure of IPv6 Solicited Node Multicast Address

The structure of an IPv6 Solicited-Node Multicast Address is derived from the corresponding unicast address. It starts with the prefix FF02::1:FF00:0/104, followed by the last 24 bits of the unicast address. This ensures that the multicast address maps predictably to the unicast address and can be easily derived.
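Because the derivation is purely mechanical, it is easy to sketch in code. The following minimal Python snippet, using only the standard library's ipaddress module, derives the solicited-node multicast address from a unicast address; the sample address is illustrative.

```python
import ipaddress

def solicited_node(unicast: str) -> ipaddress.IPv6Address:
    """Append the last 24 bits of the unicast address to FF02::1:FF00:0/104."""
    addr = ipaddress.IPv6Address(unicast)
    low24 = int(addr) & 0xFFFFFF  # last 24 bits of the unicast address
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)

print(solicited_node("2001:db8::9:800:200c:417a"))  # ff02::1:ff0c:417a
```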

Using IPv6 Solicited Node Multicast Addresses brings several benefits to network communication. First, it enables efficient neighbor discovery and address resolution, reducing unnecessary network traffic. Second, it facilitates the implementation of protocols like the Neighbor Discovery Protocol (NDP) in IPv6 networks. Overall, incorporating solicited-node multicast addresses enhances the scalability, performance, and reliability of IPv6 networks.

Computer Technology is changing.

Computer networking technology is evolving and improving faster than ever before. Most organizations and individuals now have access to wireless connectivity. However, malicious hackers increasingly use every means available to steal identities, intellectual property, and money.

Many organizations spend little time, money, or effort protecting their assets during the initial network installation. Both internal and external threats can cause a catastrophic system failure or compromise. Depending on the severity of the security breach, a company may even be forced to close its doors. Business and individual productivity would be severely hindered without network security.

The Role of Trust

Trust must be established for a network to be secure. An organization's employees tend to assume that all computers and network devices are trustworthy. However, not all trust is created equal: different layers of trust can, and should, be used.

Privileges and permissions are granted to those with a higher trust level. Privileges define what actions an individual may perform on the network, while permissions authorize an individual to access a particular asset. Violations of trust are dealt with by removing the violator's access to the secure environment; for example, an organization may terminate an untrustworthy employee or replace a defective operating system.

Example: IPSec in IPv6 over IPv4 GRE

IPv6 over IPv4 GRE (Generic Routing Encapsulation) is a tunneling protocol that allows the transmission of IPv6 packets over an existing IPv4 network infrastructure. It encapsulates IPv6 packets within IPv4 packets, enabling seamless communication between networks that have not yet fully adopted IPv6.
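To make the encapsulation concrete, here is a rough sketch using the Scapy packet library: an inner IPv6 packet is carried inside an outer IPv4 delivery header, with GRE in between. This illustrates the packet layering only, not a router configuration; all addresses are documentation ranges chosen for illustration.

```python
# pip install scapy
from scapy.all import IP, GRE, IPv6, ICMPv6EchoRequest

pkt = (
    IP(src="198.51.100.1", dst="198.51.100.2")    # outer IPv4 delivery header
    / GRE(proto=0x86DD)                           # 0x86DD = EtherType for IPv6
    / IPv6(src="2001:db8::1", dst="2001:db8::2")  # inner IPv6 packet
    / ICMPv6EchoRequest()
)
pkt.show()  # print the layered headers
```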

IPSec (Internet Protocol Security) ensures the confidentiality, integrity, and authenticity of the data transmitted over the IPv6 over the IPv4 GRE tunnel. IPSec safeguards the tunnel against malicious activities and unauthorized access by providing robust encryption and authentication mechanisms.

Enhanced Security: With IPSec’s encryption and authentication capabilities, IPv6 over IPv4 GRE with IPSec offers a high level of security for data transmission. This is particularly important in scenarios where sensitive information is being exchanged.

Seamless Transition: IPv6 over IPv4 GRE allows organizations to adopt IPv6 gradually without disrupting their existing IPv4 infrastructure. This smooth transition path ensures minimal downtime and compatibility issues.

Expanded Address Space: IPv6 provides a significantly larger address space than IPv4, addressing the growing demand for unique IP addresses. By leveraging IPv6 over IPv4 GRE, organizations can tap into this expanded address pool while still utilizing their existing IPv4 infrastructure.

Network Visibility

Appropriate network visibility is critical to understanding network performance and implementing network security components. Much of the technology used in network performance monitoring, such as NetFlow, is now security-focused. The landscape is challenging: workloads move to the cloud without monitoring or any security plan. We need a solution that gives visibility over these cloud and on-premises applications without rebuilding the entire monitoring and security stack.

Networking is Complex

Our challenge is that the network is complex and constantly changing. We have seen this with WAN monitoring and the issues that can arise from routing convergence. Change may not arrive as a hardware refresh, but the network changes constantly at the software level and must remain dynamic. If you don’t have complete visibility while the network changes, you will end up with security blind spots.

Security Tools

Existing security tools are in place, but they need to be better integrated, and the network can provide that additional integration point. Here, a network packet broker can sit in the middle and feed each security tool with data that has already been transformed, or optimized, for that particular device, reducing false positives.

Port Scanning

When interacting with target systems for the first time, it is common to perform a port scan. A port scan is a way of identifying open ports on the target network. Port scans aren’t conducted for their own sake: they allow you to identify applications and services by the ports they listen on. The objective is always to identify security issues on your target network so your client or employer can improve their security posture, and to identify vulnerabilities, we first need to identify the applications.

Follow a framework

A business needs to follow a methodology that provides additional guidance, and adopting a framework can help. Using NIST’s Cybersecurity Framework, companies can identify the phases in which to implement security controls. According to NIST, the phases are identify, protect, detect, respond, and recover; the NIST Cybersecurity Framework is built around these five functions.

Example: IPv6 Neighbor Discovery

Understanding IPv6 Neighbor Discovery Protocol

The Neighbor Discovery Protocol (NDP) is a fundamental part of the IPv6 protocol suite. It replaces the Address Resolution Protocol (ARP) used in IPv4 networks. NDP plays a crucial role in various aspects of IPv6 networking, including address autoconfiguration, neighbor discovery, duplicate address detection, and router discovery. Network administrators can optimize their IPv6 deployments by understanding how NDP functions and ensuring smooth communication between devices.

Address Autoconfiguration

One of NDP’s key features is its ability to facilitate address autoconfiguration. With IPv6, devices can generate unique addresses based on specific parameters, eliminating the need for manual configuration or reliance on DHCP servers. NDP’s Address Autoconfiguration process enables devices to obtain their global and link-local IPv6 addresses, simplifying network management and reducing administrative overhead.

Neighbor Discovery

Neighbor Discovery is another vital aspect of NDP. It allows devices to discover and maintain information about neighboring nodes on the same network segment. Through Neighbor Solicitation and Neighbor Advertisement messages, devices can determine the link-layer addresses of neighboring devices, verify their reachability, and update their neighbor cache accordingly. This dynamic process ensures efficient routing and enhances network resilience.

Duplicate Address Detection

IPv6 NDP incorporates Duplicate Address Detection (DAD) to prevent address conflicts. When a device joins a network or configures a new address, it performs DAD to ensure the uniqueness of the chosen address. By sending Neighbor Solicitation messages for the tentative address to its solicited-node multicast group, the device can detect whether any other device on the network is already using that address. DAD is an essential mechanism that guarantees the integrity of IPv6 addressing and minimizes the likelihood of address conflicts.

Understanding Multicast Communication

Multicast communication plays a vital role in IPv6 networks, enabling efficient transmission of data to multiple recipients simultaneously. Unlike unicast communication, where data is sent to a specific destination address, multicast uses a group address to reach a set of interested receivers. This approach minimizes network traffic and optimizes resource utilization.

The Role of Solicited Node Multicast Address

The IPv6 Solicited-Node Multicast Address is a specialized multicast address used in IPv6 networks. It is crucial in enabling efficient neighbor discovery and address resolution. When a node joins an IPv6 network, it sends a Neighbor Solicitation message to the solicited-node multicast address corresponding to the IPv6 address in question. Neighboring nodes can then respond quickly with Neighbor Advertisement messages, establishing a communication link.

The construction of an IPv6 Solicited-Node Multicast Address follows a specific pattern: the prefix FF02::1:FF00:0/104 is combined with the last 24 bits of the unicast address of the node being resolved. This scoping ensures that the solicited-node multicast address reaches only the intended recipients.

Using the IPv6 Solicited-Node Multicast Address brings several benefits to IPv6 networks. Firstly, it significantly reduces the volume of network traffic by limiting the scope of Neighbor Solicitation messages to interested nodes. This helps conserve network resources and improves overall network performance. Additionally, the rapid and efficient neighbor discovery enabled by solicited-node multicast addresses enhances the responsiveness and reliability of communication in IPv6 networks.

Example Technology: IPv6 Network Address Translation 

Understanding NPTv6

NPTv6 (IPv6-to-IPv6 Network Prefix Translation, defined in RFC 6296) is often mentioned alongside NAT64, but it translates between IPv6 prefixes rather than between IPv6 and IPv4. It statelessly rewrites one IPv6 prefix to another on a one-to-one basis, allowing a site's internal addressing to remain independent of its provider-assigned prefix.
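A minimal sketch of the prefix-rewrite idea is shown below, assuming two illustrative /32 prefixes. Note that real NPTv6 also adjusts bits within the address to remain checksum-neutral, which this simplified version omits.

```python
import ipaddress

INTERNAL = ipaddress.IPv6Network("fd00:1234::/32")  # illustrative internal prefix
EXTERNAL = ipaddress.IPv6Network("2001:db8::/32")   # illustrative external prefix

def npt_translate(source: str) -> ipaddress.IPv6Address:
    """Swap the internal /32 prefix for the external one, keeping the host bits."""
    host_bits = int(ipaddress.IPv6Address(source)) & ((1 << 96) - 1)
    return ipaddress.IPv6Address(int(EXTERNAL.network_address) | host_bits)

print(npt_translate("fd00:1234::a"))  # 2001:db8::a
```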

NPTv6 offers several notable features that make it a compelling choice for network architects and administrators. It is transparent to end hosts, and because the translation is stateless, the translator does not need to track per-flow state, making it efficient at scale. Its one-to-one, checksum-neutral mapping also preserves end-to-end reachability in a way that traditional stateful NAT does not.

The adoption of NPTv6 has several implications for network infrastructure. It provides address independence: a site can change providers or renumber externally without touching its internal addressing. It also simplifies multihoming, since internal prefixes can be translated to each upstream provider's prefix. As with any address translation, however, it should be deployed with an understanding of its impact on end-to-end address transparency.

Example Technology: NAT64

Understanding NAT64

NAT64 is a translator between IPv6 and IPv4, allowing devices using different protocols to communicate effectively. With the depletion of IPv4 addresses, the transition to IPv6 becomes crucial, and NAT64 plays a vital role in enabling this transition. By facilitating communication between IPv6-only and IPv4-only devices, NAT64 ensures smooth connectivity in a mixed network environment.

NAT64 operates by mapping IPv6 to IPv4 addresses, allowing seamless communication between the two protocols. It employs various techniques, such as stateful and stateless translation, to ensure efficient packet routing between IPv6 and IPv4 networks. NAT64 enables devices to communicate across different network types by dynamically translating addresses and managing traffic flow.
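The best-known mapping uses the well-known prefix 64:ff9b::/96 from RFC 6052, in which the IPv4 address is embedded in the last 32 bits of the IPv6 address. A small Python sketch of that embedding:

```python
import ipaddress

WKP = ipaddress.IPv6Network("64:ff9b::/96")  # RFC 6052 well-known NAT64 prefix

def nat64_address(ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the last 32 bits of the NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(WKP.network_address) | int(v4))

print(nat64_address("192.0.2.33"))  # 64:ff9b::c000:221, the RFC 6052 example
```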

NAT64 offers several advantages, including preserving IPv4 investments, simplified network management, and enhanced connectivity. It eliminates the need for costly dual-stack deployment and facilitates the coexistence of IPv4 and IPv6 networks. However, NAT64 also poses challenges, such as potential performance limitations, compatibility issues, and the need for careful configuration to ensure optimal results.

NAT64 finds applications in various scenarios, including service providers transitioning to IPv6, organizations with mixed networks, and mobile networks facing IPv4 address scarcity. It enables these entities to maintain connectivity and seamlessly bridge the gap between network protocols. NAT64’s versatility and compatibility make it a valuable tool in today’s evolving network landscape.

Related: For pre-information, you may find the following posts helpful:

  1. Technology Insight For Microsegmentation
  2. SASE Visibility
  3. Network Traffic Engineering
  4. Docker Default Networking 101
  5. Distributed Firewalls
  6. Virtual Firewalls



Implementing Network Security.

Key Implementing Network Security Discussion points:


  • The use of a network packet broker.

  • Monitoring and Observability.

  • The different hacking stages.

  • How to implement network security.

  • The issues with encrypted traffic.

Back to Basics: Implementing Network Security

The Role of Network Security

For sufficient network security to be in place, it is essential to comprehend its central concepts and the implied technologies and processes around it that make it robust and resilient to cyber-attacks. However, all of this is complicated when the visibility is blurred by not having a demarcation of the various network boundaries.

Moreover, network security touches upon multiple attributes of security controls that we need to consider, such as security gateways, SSL inspection, threat prevention engines, policy enforcement, cloud security solutions, threat detection and insights, and attack analysis with respect to frameworks, to name a few.

Diagram: Implementing network security.

One of the fundamental components of network security is the implementation of firewalls and intrusion detection systems (IDS). Firewalls act as a barrier between your internal network and external threats, filtering out malicious traffic. On the other hand, IDS monitors network activity and alerts administrators of suspicious behavior, enabling rapid response to potential breaches.

Enforcing Strong Authentication and Access Controls

Unauthorized access to sensitive data can have severe consequences. Implementing robust authentication mechanisms, such as two-factor authentication (2FA) or biometric verification, adds an extra layer of security. Additionally, enforcing stringent access controls, limiting user privileges, and regularly reviewing user permissions minimize the risk of unauthorized access.

Regular Software Updates and Patch Management

Cybercriminals often exploit vulnerabilities in outdated software. Regularly updating and patching your network’s software, including operating systems, applications, and security tools, is crucial to prevent potential breaches. Automating the update process whenever possible helps ensure your network remains protected against emerging threats.

Data Encryption and Secure Communication

Protecting sensitive data in transit is essential to maintain network security. Implementing encryption protocols, such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS), safeguards data as it travels across networks. Additionally, using Virtual Private Networks (VPNs) ensures secure communication between remote locations and adds an extra layer of encryption.

Assessing Vulnerabilities

Conducting a comprehensive assessment of your network infrastructure before diving into network security implementation is crucial. Identify potential vulnerabilities, weak points, and areas that require immediate attention. This assessment will serve as a foundation for developing a tailored security plan.

Building a Strong Firewall

One of the fundamental elements of network security is a robust firewall. A firewall acts as a barrier between your internal network and the external world, filtering incoming and outgoing traffic based on predefined rules. Ensure you invest in a reliable firewall solution with advanced features such as intrusion detection and prevention systems.

Diagram: Firewall traffic flow and NAT

Enforcing Access Controls

Controlling user access is vital to prevent unauthorized entry and data breaches. Implement strict access controls, including strong password policies, multi-factor authentication, and role-based access controls (RBAC). Regularly review user privileges to ensure they align with the principle of least privilege (PoLP).

Encrypting Data

Data encryption is critical to network security, mainly when transmitting sensitive information. Utilize industry-standard encryption algorithms to protect data at rest and in transit. Implement secure protocols like HTTPS for web communication and VPNs for remote access.

Monitoring and Intrusion Detection

Network security is an ongoing process that requires constant vigilance. Implement a robust monitoring and intrusion detection system (IDS) to detect and respond promptly to potential security incidents. Monitor network traffic, analyze logs, and employ intrusion prevention systems (IPS) to protect against attacks proactively.

Knowledge Check: Malware

Virus

Antivirus software is often used to protect against or eradicate malicious software, so it is probably no surprise that virus is one of the most commonly used words to describe malware. Malware is not always a virus, but all computer viruses are malware. For a virus to infect a system, it must be activated: the user must do something to execute it. After infecting the system, the virus may inject code into other programs so that it remains in control when those programs run. Even if the original executable and process are removed, the system remains infected as long as the infected programs run; the virus must be removed entirely.

Worm

There is a common misconception that all worms are malicious, but that is not always the case. Code Red and Nimda are among the notorious worms that have caused severe damage around the world, yet the Welchia/Nachi worm actually removed another worm, Blaster, and patched systems so they were no longer vulnerable to it. Either way, removing the malware alone is insufficient to combat a worm: if the vulnerability the worm exploits is not fixed, the system will simply be reinfected from another source.

Trojan

As with viruses, Trojans are just another type of malware. Their distinctive feature is that they appear to be something they are not. The name comes from the Trojan horse: during the Trojan War, the Greeks presented the Trojans with a horse as a “gift.” Rather than being a mere wooden statue, it concealed Greek soldiers, who crept out at night and attacked Troy from within.

Botnet

Viruses, worms, and Trojan horses can deliver botnet clients as part of their payload. A botnet is a collection of endpoints infected with a particular type of malware. Each botnet client connects to command-and-control (C&C) infrastructure through a small piece of software and receives commands from it. The purpose of a botnet is primarily to generate income for its owner, though it can be used for various purposes; the clients facilitate that process.

Monitoring Observability

Increased enterprise security challenges demand new efforts and methods to stay ahead of threat actors. Therefore, the environment must be monitored from multiple vantage points so we can identify patterns that could be early indicators of attack. Finally, once we know there is an attack, a proactive response model will be crucial to success.

We need good network observability tools to understand what is happening in the environment. Bad actors are always at work, trying new things and creating new ways to exploit systems. When deciding on a monitoring solution, consider how you will gain complete network visibility. In line with the zero-trust approach to security, we must assume that the actor already has access.

So we assume the threat actor already has access and authentication at all levels, even with the correct security appliances in place, such as Web Application Firewalls (WAF), Intrusion Detection Systems (IDS), and Intrusion Prevention Systems (IPS). The most crucial point is to assume we have a breach and the bad actor is already on our network.

Hacking Stages

♦ The hacking stages

There are different stages in an attack chain, and with the correct network visibility, you can break the attack at each stage. First comes the initial reconnaissance and access discovery, where a bad actor maps the lay of the land to determine the next moves. Once they know this, they can attempt to exploit it.

Diagram: Network-derived intelligence.
    • Stage 1: Deter

You must first deter threats and unauthorized access, detect suspicious behavior and access, and automatically respond and alert. For the first stage, deterrence, we have anti-malware devices, perimeter security devices, identity and access management, firewalls, and load balancers.

    • Stage 2: Detect

The next dimension of security is detection. Here, we can draw on the IDS, log insights, and security feeds, aligned with analytics and flow consumption. Any signature-based detection can assist you here.

    • Stage 3: Respond

Then, we need to focus on how to respond. This will be with anomaly detection and response solutions. Remember that all of this must be integrated with, for example, the firewall, enabling you to block and then deter that access.

Red Hat Ansible Tower

Ansible is the common automation language for everyone across your organization. Specifically, Ansible Tower can be the common language between security tools. This reduces repetitive work and enables responding to security events in a standardized way. If you want a unified approach, automation can help here, especially with a platform such as Ansible Tower integrated with your security technologies.

Example: Automating firewall rules. We can add an allowlist entry in the firewall configuration to allow traffic from one machine to another. A playbook first defines the source and destination IPs as variables; once the source and destination objects are defined, the actual access rule between them is created. All of this can be done with automation.
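As a sketch of how such a playbook might be triggered from another tool, the snippet below calls the Tower/AWX REST API to launch a job template, passing the source and destination IPs as extra variables. The host, template ID, token, and variable names are hypothetical placeholders.

```python
# pip install requests
import requests

TOWER = "https://tower.example.com"  # hypothetical Tower/AWX host
TEMPLATE_ID = 42                     # hypothetical job template ID
TOKEN = "..."                        # an OAuth2 token for the Tower API

# Launch the allowlist job template with the IPs the playbook expects.
resp = requests.post(
    f"{TOWER}/api/v2/job_templates/{TEMPLATE_ID}/launch/",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"extra_vars": {"source_ip": "10.0.0.10",
                         "destination_ip": "10.0.0.20"}},
)
resp.raise_for_status()
print("Job started:", resp.json()["id"])  # the new job's ID
```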

Diagram: Ansible vs Tower. Source Red Hat.

Implementing Network Security

No single device can stop an attack. We need multiple approaches that can break the attack at any point in the attack chain. Whether the bad actors are running TCP scans, ARP scans, or malware scans, you want to identify them before they become a threat. Always assume the threat has access, leverage all possible features, and treat every application as critical and worth protecting.

We must improve monitoring, detection, and investigation capabilities across various technologies. A zero-trust architecture can help you monitor and improve detection. In addition, we must look at network visibility, logging, and Encrypted Traffic Analysis (ETA) to improve investigation capabilities.

Knowledge Check: Ping Sweeps

Consider identifying responsive systems within an address space rather than blindly attacking it. Live systems respond appropriately to the network messages sent to them, so you can identify them before attempting to attack or probe. Performing a ping sweep is one way to determine which systems are alive: it involves sending ping messages to every computer on the network. The standard ping uses ICMP echo requests, and if you are not bombarding targets with unusually large or frequent messages, the sweep may go unnoticed. Firewall rules may block ICMP messages from outside the network, however, so ping sweeps do not always succeed.
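A minimal Python sketch of a ping sweep, assuming a Linux-style ping binary is available on the scanning host; the address range is illustrative.

```python
import ipaddress
import subprocess

def ping_sweep(network: str) -> list[str]:
    """Send one ICMP echo request to each host and return those that answered."""
    alive = []
    for host in ipaddress.ip_network(network).hosts():
        # -c 1: a single echo request; -W 1: wait at most one second (Linux ping)
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", str(host)],
            capture_output=True,
        )
        if result.returncode == 0:
            alive.append(str(host))
    return alive

print(ping_sweep("192.168.18.0/28"))  # illustrative test range
```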

Network-derived intelligence

So, when implementing network security, you need to consider that the network and its information add a lot of value. Data collection can still be done with an agent-based approach, where an agent collects data from the host and sends it back to, for example, a data lake, where you set up dashboards and queries. However, an agent-based approach has blind spots: it misses the holistic network view and can’t be used with unmanaged devices, such as far-reaching edge IoT.

The information gleaned from the host misses data that can be derived from the network. With network-derived traffic analysis in particular, you can gain insight into unmanaged hosts, such as IoT devices: any host and its actual data.

This is not something that can be derived from a log file. The issue with log data is that if a bad actor gets inside the network, the first thing they do to cover their tracks is spoof and inject logs.

Agent-based and network-derived intelligence

Network-derived intelligence, with its deep packet inspection process, can complement an agent-based approach. It allows you to pull out a wealth of metadata attributes, such as what the traffic is, what its characteristics are, and, for video, attributes such as the frame rate.

The beauty is that this captures both north-south and east-west traffic, as well as unmanaged devices. So, by combining an agent-based approach with network-derived intelligence, we have expanded coverage across the entire infrastructure.

Detecting rogue activity: Layers of security 

Now we can detect new vulnerabilities, such as old SSL ciphers; shadow IT activity, such as torrents and crypto mining; and suspicious activities, such as port spoofing. Rogue activities such as crypto mining are a big concern: many breaches and attacks install crypto-mining software.

This is one of the easiest ways for a bad actor to make money. The way to detect it is not with an agent but by examining network traffic and looking for anomalies. At first glance, the traffic may not look very different, because the mining software will not generate log files, and there is no conventional command-and-control communication.

We can make observability and the SIEM more targeted to get better information. With the network, we have new capabilities to detect and prevent, adding a new layer of defense in depth and keeping you on top of the cloud threats happening at the moment. NetFlow is used for network monitoring, detection, and response: you can detect threats and integrate with other tools to see a network intrusion as it begins. Decisions are made based on the network, so you can see threats as they happen.

Diagram: Layers of security.

Security Principles: Monitoring and Observability

So, when implementing network security, we must follow security principles and best practices. Firstly, monitoring and observability. To set up adequate security controls on a zero-trust network, you need to have a clear picture of all the users and devices with access to a network and what access privileges they require to do their jobs.

Therefore, a comprehensive audit should include up-to-date access lists and policies. We also need to ensure that network security policies are updated, and testing their effectiveness regularly is an excellent way to ensure that no vulnerabilities have escaped notice. Finally, there is monitoring: in a zero-trust network, traffic is constantly monitored for unusual or suspicious behavior.

You can’t protect what you can’t see.

The first step in the policy optimization process is understanding how the network connects, what is connecting, and what should be connecting. You can’t protect what you can’t see. Therefore, everything disparately managed within a hybrid network must be fully understood and consolidated. Secondly, once you know how things connect, how do you ensure they don’t reconnect through a broader definition of connectivity?

You must support different user groups, security groups, and IP addresses. You can’t rely on IP addresses alone to implement security controls anymore. We need visibility at the traffic flow, process, and contextual data levels. Without this granular application visibility and mapping, understanding normal traffic flows and irregular communication patterns is challenging.

Complete network visibility

We also need to identify threats easily. For this, we need a multi-dimensional security model and good visibility. Network visibility is integral to security, compliance, troubleshooting, and capacity planning. Unfortunately, custom monitoring solutions cannot cope with the explosive growth of networks.

We also have solutions from Cisco, such as the Nexus Dashboard Data Broker (NDDB), a packet-brokering solution that provides a software-defined, programmable way to aggregate, filter, and replicate network traffic using SPAN or optical TAPs for network monitoring and visibility.

What prevents visibility?

A long list of things can prevent visibility. Firstly, there are too many devices, with complexity and variance in how different vendors manage them; even CLI commands from the same vendor vary. Too many changes result in an inability to meet the service level agreement (SLA), as you are layering on connectivity without fully understanding how the network connects.

This results in complex firewall policies. For example, you have access but are not sure whether you should. Again, this leads to large, complex firewall policies without context. More often than not, the entire network lacks visibility. For example, AWS teams understand the Amazon cloud but have no visibility on-premises. We also have distributed responsibilities across multiple groups, which results in fragmented processes and workflows.

Security Principles: Data-flow Mapping

Network security starts with the data. Data-flow mapping enables you to map and understand how data flows within an organization. But first, you must understand how data moves across your hybrid network and between all the different resources and people, such as internal employees, external partners, and customers. Knowing the who, what, when, where, why, and how of your data creates a strong security posture and lets you understand access to sensitive data.

Data-flow mapping will help you create a baseline. Once you have a baseline, you can start chaos engineering projects to help you understand your environment and its limits. One example would be a Kubernetes chaos engineering project that breaks systems in a controlled manner.

What prevents mapping sensitive data flows

What prevents mapping sensitive data flow? Firstly, there is an inability to understand how the hybrid network connects. Do you know where sensitive data is, how to find it, and how to ensure it has the minimum necessary access?

With many teams managing different parts of the network and the rapid pace of application deployments, there is often no documentation and no filing system in place. Application connectivity requirements go unrecorded: people focus on connectivity rather than documentation. More often than not, we end up with an overconnected network environment.

We often connect first and think about security later. There is also an inability to tell whether application connectivity violates security policy, along with a lack of insight into the resources applications require. Finally, there is a lack of visibility into the cloud and the applications and resources deployed there. What is in the cloud, and how is it connected to on-premises systems and external Internet access?

Implementing Network Security and the Different Types of Telemetry

Implementing network security involves leveraging the different types of telemetry for monitoring and analysis. For this, we have various kinds of packet analysis and telemetry data. Packet analysis is critical, involving new tools and technologies such as packet brokers. In addition, SPAN taps need to be installed strategically in the network infrastructure.

Telemetry, such as flow, SNMP, and API data, is also examined. Flow telemetry comes from technologies such as IPFIX and NetFlow. We can also look at API telemetry. Then, we have logs, which provide a wealth of information. So, we have different types of telemetry and different ways of collecting and analyzing it, and we can now use this from both the network and security perspectives.

From the security perspective, it would be for threat detection and response; from the network side, for network and application performance. So, there is a lot of telemetry that can be used for security. These technologies were initially viewed as performance monitoring, but security and networking have since merged to meet cybersecurity use cases. In summary, we have flow, SNMP, and API telemetry for network and application performance, plus encrypted traffic analysis and machine learning for threat and risk identification for security teams.

The issues with packet analysis: Encryption.

The issue with packet analysis is that everything is encrypted, especially with TLS 1.3 and at the WAN edge. How do you decrypt all of this, and how do you store it? Decrypting traffic can itself create an exploit and a potential attack surface, and you don’t want to decrypt everything anyway.

Do not fully decrypt the packets.

One possible solution is not to decrypt the packets fully. By looking at the packet information, especially the headers (the layer 2 and TCP headers, for example), you can often distinguish expected traffic from malicious traffic. You can look at packet lengths, arrival times and their order, and even which DNS server a host uses.

Also, look at the round-trip time and the connection times. There are many features you can extract from encrypted traffic without fully decrypting it, and combining this information can feed different machine learning models that distinguish good traffic from bad.

You don’t need to decrypt everything. You may not have to look at the actual payload; with the right tools, the pattern of the packets can tell you that one site is bad and another is good.
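As an illustration of the kind of features that can be extracted without decryption, the sketch below uses the Scapy library to read a capture file and compute packet sizes and inter-arrival gaps. The capture filename is a placeholder, and a real system would compute such features per flow before feeding them to a model.

```python
# pip install scapy
from statistics import mean
from scapy.all import rdpcap

packets = rdpcap("capture.pcap")  # placeholder path to any traffic capture

sizes = [len(p) for p in packets]                     # bytes on the wire
times = [float(p.time) for p in packets]              # capture timestamps
gaps = [t2 - t1 for t1, t2 in zip(times, times[1:])]  # inter-arrival gaps

features = {
    "packet_count": len(packets),
    "mean_packet_size": mean(sizes),
    "mean_inter_arrival": mean(gaps) if gaps else 0.0,
}
print(features)  # candidate inputs for a traffic-classification model
```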

Key Points: Implementing network security

I have summarized how you might start implementing network security in four main stages. Implementing network security begins with good visibility, and this visibility must be combined with all our existing security tools. A packet broker can be used along with good automation. Finally, this approach must span all our environments, both on-premises and in the cloud.

Diagram: A final note on implementing network security.
  • Stage 1: Know your infrastructure with good visibility

The first thing is getting to know all the traffic around your infrastructure, and you need to know this for on-premises, cloud, and multi-cloud scenarios. You need deep visibility across all environments.

  • Stage 2: Implement security tools

In all environments, we have infrastructure that our applications and services ride upon. Several tools protect this infrastructure, placed in different parts of the network: firewalls, DLP, email gateways, and SIEM, along with other tools that carry out various security functions. These tools will not disappear or be replaced anytime soon, but they must be better integrated.

  • Stage 3: Network packet broker

You can introduce a network packet broker: a packet-brokering device that fetches the data and sends it to the existing security tools you have in place. Essentially, this ensures there are no blind spots in the network. Remember that this network packet broker should support any workload and any tool.

  • Stage 4: Cloud packet broker

In the cloud, you will have a variety of workloads and several tools, such as SIEM, IPS, and APM, that need access to your data. A packet broker can be used in the cloud, too. If you are in a cloud environment, you need to understand the native cloud mechanisms, such as VPC traffic mirroring; this traffic can be brokered, allowing some transformation before we move it over. These transformation functions can include de-duplication, packet slicing, and TLS analysis.

This will give you complete visibility into the data set across VPCs at scale, eliminating blind spots and improving the security posture by sending the appropriate network traffic, whether packets or metadata, to the tool stack in the cloud.

Implementing robust network security measures is of utmost importance in an era where cyber threats continue to evolve and become more sophisticated. Individuals and organizations can fortify their network security posture by assessing vulnerabilities, establishing firewalls and intrusion detection systems, enforcing strong authentication and access controls, conducting regular software updates, and implementing data encryption and secure communication protocols. Remember, network security is an ongoing process that requires continuous monitoring and adaptation to stay one step ahead of potential threats.

Network Security Components

Section 1: Firewalls – The First Line of Defense

Firewalls act as a barrier between your internal network and the outside world. They analyze incoming and outgoing network traffic and block potentially harmful data packets. By setting up firewalls properly, you can control access to your network and protect against unauthorized access attempts.

Section 2: Encryption – Securing Your Data

Encryption converts sensitive data into an unreadable format called ciphertext using cryptographic algorithms. This ensures that even if an attacker gains access to your data, they won’t be able to make sense of it. Implementing encryption protocols, such as SSL/TLS, for data transmission and using encryption algorithms for stored data adds an extra layer of protection.
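As a small illustration of TLS in transit, Python's standard ssl module can wrap a socket in TLS with certificate validation enabled by default; the hostname is an example.

```python
import socket
import ssl

context = ssl.create_default_context()  # validates certificates by default
with socket.create_connection(("example.com", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="example.com") as tls:
        print(tls.version())                 # e.g. 'TLSv1.3'
        print(tls.getpeercert()["subject"])  # the server certificate subject
```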

Section 3: User Authentication – Verifying Legitimate Access

User authentication is vital to prevent unauthorized access to your network. Implementing strong password policies, multi-factor authentication, and regularly reviewing user privileges are effective measures to ensure that only authorized individuals can access your network resources.

Section 4: Intrusion Detection Systems – Detecting and Responding to Threats

Intrusion Detection Systems (IDS) monitor network traffic and identify suspicious activities or potential security breaches. IDS can be network- or host-based, providing real-time alerts and enabling swift response to mitigate potential risks.

Section 5: Network Monitoring – Keeping an Eye on Your Network

Network monitoring tools enable you to monitor network traffic, identify anomalies, and detect potential security incidents. You can proactively address any vulnerabilities by constantly monitoring your network, ensuring your system’s security and integrity.

Section 6: Best Practices for Network Security

To enhance your network security, it is essential to follow best practices. Some key recommendations include regularly updating software and firmware, conducting security audits, performing regular backups, educating employees on cybersecurity awareness, and staying informed about the latest security threats and solutions.

Summary: Implementing Network Security

In today’s interconnected world, where digital communication and data exchange are the norm, ensuring your network’s security is paramount. Implementing robust network security measures not only protects sensitive information but also safeguards against potential threats and unauthorized access. This blog post provided you with a comprehensive guide on implementing network security, covering key areas and best practices.

Section 1: Assessing Vulnerabilities

Before diving into security solutions, it’s crucial to assess the vulnerabilities present in your network infrastructure. Conducting a thorough audit helps identify weaknesses such as outdated software, unsecured access points, or inadequate user permissions.

Section 2: Firewall Protection

One of the fundamental pillars of network security is a strong firewall. A firewall is a barrier between your internal network and external threats, monitoring and filtering incoming and outgoing traffic. It serves as the first line of defense, preventing unauthorized access and blocking malicious activities.

Section 3: Intrusion Detection Systems

Intrusion Detection Systems (IDS) play a vital role in network security by actively monitoring network traffic, identifying suspicious patterns, and alerting administrators to potential threats. IDS can be network- or host-based, providing real-time insights into ongoing attacks or vulnerabilities.

Section 4: Securing Wireless Networks

Wireless networks are susceptible to various security risks due to their inherent nature. Implementing robust encryption protocols, regularly updating firmware, and using unique and complex passwords are essential to securing your wireless network. Additionally, segregating guest networks from internal networks helps prevent unauthorized access.

Section 5: User Authentication and Access Controls

Controlling user access is crucial to maintaining network security. Implementing robust user authentication mechanisms such as two-factor authentication (2FA) or biometric authentication adds an extra layer of protection. Regularly reviewing user permissions, revoking access for former employees, and employing the principle of least privilege ensures that only authorized individuals can access sensitive information.

Conclusion:

Implementing network security measures is an ongoing process that requires a proactive approach. Assessing vulnerabilities, deploying firewalls and intrusion detection systems, securing wireless networks, and implementing robust user authentication controls are crucial steps toward safeguarding your network. By prioritizing network security and staying informed about emerging threats, you can ensure the integrity and confidentiality of your data.

Network Security Components

In today's interconnected world, network security plays a crucial role in protecting sensitive data and ensuring the smooth functioning of digital systems. A strong network security framework consists of various components that work together to mitigate risks and safeguard valuable information. In this blog post, we will explore some of the essential components that contribute to a robust network security infrastructure.

Network security encompasses a range of strategies and technologies aimed at preventing unauthorized access, data breaches, and other malicious activities. It involves securing both hardware and software components of a network infrastructure. By implementing robust security measures, organizations can mitigate risks and ensure the confidentiality, integrity, and availability of their data.

Network security components form the backbone of any robust network security system. By implementing a combination of firewalls, IDS, VPNs, SSL/TLS, access control systems, antivirus software, DLP systems, network segmentation, SIEM systems, and well-defined security policies, organizations can significantly enhance their network security posture and protect against evolving cyber threats.

Highlights: Network Security Components

Common Threats and Vulnerabilities

This section illuminates the various threats and vulnerabilities that networks face. It explores the risks of malware, phishing attacks, social engineering, and insecure network configurations. Understanding these threats is essential for designing effective security measures to counteract them.

As cyber threats continue to evolve, advanced security technologies are gaining importance. We have Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), and Security Information and Event Management (SIEM) tools. Exploring these technologies helps organizations avoid potential attacks and quickly respond to security incidents.

Different Network Security Layers

Designing and implementing a network security architecture means combining different technologies that work at different network security layers of your infrastructure, spanning on-premises environments and the cloud. We can either have separate point systems operating at each network security layer or look for an approach where the network security devices work together holistically. These are the two options. Whichever path of security design you opt for, you will have the same network security components carrying out their security functions, either virtual or physical, or a combination of both.

Platform and Point Solution Approach

However, there will be a platform-based or an individual point-solution approach. Some traditional security functionality that has been around for decades, such as the firewall, is still widely used, along with new ways to protect, especially regarding endpoint protection.

Example: IPv6 Access Lists

Understanding IPv6 Access-lists

IPv6 access lists are a fundamental part of network security architecture. They filter and control the flow of traffic based on specific criteria. Unlike their IPv4 counterparts, IPv6 access lists are designed to handle the larger address space provided by IPv6. They enable network administrators to define rules that determine which packets are allowed or denied access to a network.

IPv6 access lists can be categorized into two main types: standard and extended. Standard access lists are based on the source IPv6 address and allow or deny traffic accordingly. Extended access lists, on the other hand, consider additional parameters such as destination addresses, protocols, and port numbers. This flexibility makes extended access lists more powerful but also more complex to configure.

To configure IPv6 access lists, administrators use commands specific to their network devices, such as routers or switches. This involves defining access list entries, specifying permit or deny actions, and applying the access list to the desired interface or network. Proper configuration requires a clear understanding of the network topology and security requirements.
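To illustrate the matching logic rather than any vendor's CLI syntax, here is a small Python sketch of how an access list evaluates a source address against an ordered rule set, including the implicit deny at the end; the rules and prefixes are illustrative.

```python
import ipaddress

# Illustrative rules, evaluated top-down like an access list.
RULES = [
    ("deny",   ipaddress.IPv6Network("2001:db8:bad::/48")),
    ("permit", ipaddress.IPv6Network("2001:db8::/32")),
]

def evaluate(source: str) -> str:
    addr = ipaddress.IPv6Address(source)
    for action, network in RULES:
        if addr in network:  # first match wins
            return action
    return "deny"            # implicit deny at the end of every list

print(evaluate("2001:db8:bad::1"))  # deny
print(evaluate("2001:db8:1::1"))    # permit
```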

Example: IPv6 over IPv4 GRE with IPSec

Understanding IPv6 over IPv4 GRE

IPv6 over IPv4 GRE is a tunneling technique that allows the transmission of IPv6 packets over an IPv4 network. It encapsulates IPv6 packets within IPv4 packets, enabling seamless communication between IPv6-enabled devices across IPv4 networks. This elegant solution bridges the gap between IPv6 and IPv4, ensuring interoperability and smooth transition.

IPSec, a suite of protocols for securing IP communications, plays a crucial role in implementing IPv6 over IPv4 GRE. By providing authentication, encryption, and integrity checks, IPSec ensures the confidentiality and integrity of the encapsulated IPv6 traffic. This additional layer of security safeguards the data transmitted over the tunnel, mitigating potential security risks.

Implementing IPv6 over IPv4 GRE with IPSec has been found relevant in various real-world scenarios. One prominent use case is in enterprise networks, where organizations mix IPv6 and IPv4 devices. By utilizing this technology, these organizations can establish a unified network infrastructure, enabling efficient communication between all devices. Another application is in service provider networks, where IPv6 connectivity is required over existing IPv4 networks. With IPv6 over IPv4 GRE with IPSec, service providers can deliver IPv6 services to their customers without replacing their entire network infrastructure.

Related: For pre-information, you may find the following posts helpful:

  1. Dynamic Workload Scaling
  2. Stateless Networking
  3. Cisco Secure Firewall
  4. Data Center Security 
  5. Network Connectivity
  6. Distributed Systems Observability
  7. Zero Trust Security Strategy
  8. Data Center Design Guide



Network Security Components

Key Network Security Components Discussion points:


  • Point solutions or integrated devices.

  • Network security challenges.

  • Recommended starting points.

  • Firewall types and load balancers.

  • Endpoint security and packet brokers.

Knowledge Check: Network Security Components

♦ Introducing the network security components

Network security is a critical aspect of any organization’s IT infrastructure. It involves safeguarding the network from unauthorized access, data breaches, and other security threats. Implementing various network security components is required to achieve this goal.

1. Firewalls:

Firewalls are one of the most essential network security components. They monitor and control incoming and outgoing network traffic based on predefined security rules. Firewalls can be hardware-based or software-based and are designed to prevent unauthorized access to the network.

Firewalls act as the first line of defense in network security. They monitor and control incoming and outgoing network traffic based on predetermined security rules. By filtering out unauthorized access attempts and malicious traffic, firewalls help prevent unauthorized access to the network infrastructure.

2. Intrusion Detection and Prevention Systems (IDPS):

IDPS is a security system that monitors network traffic for signs of unauthorized access, misuse, or malicious activity. It can detect and prevent network attacks by analyzing traffic, identifying suspicious activity patterns, and responding to security threats.

An Intrusion Detection System detects and alerts network administrators about any unauthorized or suspicious activities within a network. It monitors network traffic, analyzes patterns, and compares potential security breaches against known attack signatures or behavior anomalies.

Network Security

  • Firewalls
  • Intrusion Detection and Prevention
  • Virtual Private Networks
  • Network Access Control
  • Antivirus and Antimalware
  • SSL and TLS
  • Access Control
  • Data Loss Prevention
  • Network Segmentation
  • SIEM Systems
  • Effective Security Policy

3. Virtual Private Networks (VPNs):

VPNs establish secure connections between remote users and the corporate network. They use encryption and tunneling protocols to ensure that data transmitted between the remote user and the network is secure and cannot be intercepted by unauthorized users.

VPNs provide secure remote connectivity by creating a private and encrypted connection over a public network. By encrypting data and establishing secure tunnels, VPNs ensure the confidentiality and integrity of transmitted information, making them essential for secure remote access and site-to-site connectivity.

1st Lab Guide: IPsec Site-to-Site VPN

IPsec VPN

Site-to-site IPsec VPNs are used to “bridge” two distant LANs together over the Internet. Generally, on the LAN, we use private addresses, so the two LANs cannot communicate without tunneling. In the following lab guide, I have configured IKEv1 IPsec between two Cisco ASA firewalls to bridge two LANs.

Note:

In the pkts encapsulated and pkts decapsulated counters, we can see incrementing packets. This is from the ping (ICMP) traffic. We also lost the first packet because ARP performs its role in the background when the ping is sent from R1.

We can also have a VPN with MPLS, which is common in the service provider environment. Again, we have a combination of protocols such as BGP, LDP, and an IGP. The P nodes in the MPLS network below have no information on the CE routes, yet the CE routers are reachable and can ping each other. This provides a BGP-free core, enabling VPNs across the service provider infrastructure.

Diagram: MPLS VPN

4. Network Access Control (NAC):

NAC is a security solution that controls network access based on predefined policies. It ensures that only authorized users and devices can access the network and comply with the organization’s security policies.

5. Antivirus and Antimalware Software:

Antivirus and antimalware software are essential network security components. They protect the network from malware, viruses, and other malicious software by scanning for and removing any threats detected on the network.

Antivirus and antimalware software protect against malicious software (malware) that can compromise network security. These software solutions scan files and applications for known malware signatures or suspicious behavior, enabling proactive detection and removal of potential threats.

6. Secure Sockets Layer/Transport Layer Security (SSL/TLS):

SSL/TLS protocols provide secure communication over the internet by encrypting data exchanged between a client and a server. These protocols ensure that data transmitted between the two parties remain confidential and tamper-proof, making them vital for secure online transactions and communication.

7. Access Control Systems:

Access control systems regulate and manage user access to network resources. By implementing authentication mechanisms, such as usernames, passwords, or biometric authentication, access control systems ensure that only authorized individuals can access sensitive information, reducing the risk of unauthorized access.

8. Data Loss Prevention (DLP) Systems:

DLP systems monitor and prevent the unauthorized transfer or disclosure of sensitive data. By identifying and classifying sensitive information, DLP systems enforce policy-based controls to prevent data breaches, ensuring compliance with data protection regulations.

9. Network Segmentation:

Network segmentation involves dividing a network into multiple smaller subnetworks to isolate and contain potential security threats. By limiting the impact of an attack on a specific segment, network segmentation enhances security and reduces the risk of lateral movement within a network.

10. Security Information and Event Management (SIEM) Systems:

SIEM systems collect, analyze, and correlate security event logs from various network devices, servers, and applications. By providing real-time monitoring and threat intelligence, SIEM systems enable early detection and response to security incidents, enhancing overall network security posture.

11. Security Policies and Procedures:

Comprehensive security policies and procedures are crucial for maintaining a secure network environment. These policies define acceptable use, access controls, incident response, and other security practices that guide employees in adhering to best security practices.

2nd Lab Guide: Port Scanning

Port Scanning with Netcat

In the following guide, we will look at Netcat, which can be used for security scanning. Netcat, often called “nc,” is a command-line tool that facilitates data connection, transfer, and manipulation across networks. Initially developed for Unix systems, it has since been ported to various operating systems, including Windows. Netcat operates in a client-server model, allowing users to establish connections between two or more machines.

Note:

To familiarize yourself with the configuration and commands, type nc -h to display the manual. In the following screenshot, you can see the options that are available to you. This shows the various choices you can use with the tool and the command syntax to invoke it.

Test Netcat to ensure connectivity between the Ubuntu Desktop and the Target Machine. The target’s IP address is 192.168.18.131, another Ubuntu test network host. Type nc -vz 192.168.18.131 22 to attempt to open a connection from the Ubuntu Desktop to the Target Machine over port 22.

Next, we will create a script to make the scan more dynamic. Essentially, we are creating a port scanner with a bash script. The script asks you to type in the IP address to scan, which allows you to use the same script with different inputs each time it runs instead of modifying the script contents for each scan.

Take note of the two scripts created below.
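
A minimal sketch of such a script follows, assuming the netcat behavior shown above; the port range is an illustrative choice:

    #!/usr/bin/env bash
    # Port-scanner sketch using netcat: prompts for the target each run
    # instead of hardcoding an IP address in the script.
    read -rp "Enter the target IP address: " target

    for port in {20..1024}; do
      # -z: probe without sending data, -w 1: one-second timeout per port
      if nc -z -w 1 "$target" "$port" 2>/dev/null; then
        echo "Port $port is open on $target"
      fi
    done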

Back to Basics: Security Components

The value of network security 

Network security is essential to any company or organization’s data management strategy. It is the process of protecting data, computers, and networks from unauthorized access and malicious attacks. Network security involves various technologies and techniques, such as firewalls, encryption, authentication, and access control.

Firewalls help protect a network from unauthorized access by preventing outsiders from connecting to it. Encryption protects data from being intercepted by malicious actors. Authentication verifies a user’s identity, and access control manages who has access to a network and their access type.

Understanding Encryption

Encryption is a method of encoding information so that only authorized parties can access and understand it. It involves transforming plain text into a scrambled form called ciphertext using complex algorithms and a unique encryption key.

The Role of Encryption in Data Security

Encryption is a robust shield that protects our data from unauthorized access and potential threats. It ensures that even if data falls into the wrong hands, it remains unreadable and useless without the corresponding decryption key.

Types of Encryption Algorithms

Various encryption algorithms are used to secure data, each with its strengths and characteristics. From the widely-used Advanced Encryption Standard (AES) to the asymmetric encryption of RSA, these algorithms employ different mathematical techniques to encrypt and decrypt information.
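
To make this concrete, here is a small sketch of symmetric AES encryption using the widely available openssl tool; secret.txt is a placeholder file, and the tool prompts for a passphrase:

    # Encrypt with AES-256-CBC (PBKDF2 key derivation), then decrypt.
    openssl enc -aes-256-cbc -salt -pbkdf2 -in secret.txt -out secret.enc
    openssl enc -d -aes-256-cbc -pbkdf2 -in secret.enc -out secret.dec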

Understanding Authentication

Authentication, at its core, is the process of verifying the identity of an individual or entity. It serves as a gatekeeper, granting access only to authorized users. By confirming a user’s authenticity, businesses and organizations can protect against unauthorized access and potential security breaches.

The Importance of Strong Authentication

In an era of rising cyber threats, weak authentication measures can leave individuals and organizations vulnerable to attacks. Strong authentication is a crucial defense mechanism, ensuring only authorized users can access sensitive information or perform critical actions. It prevents unauthorized access, data breaches, identity theft, and other malicious activities.

Common Authentication Methods

There are several widely used authentication methods, each with its strengths and weaknesses. Here are a few examples:

1. Password-based authentication: This is the most common method where users enter a combination of characters as their credentials. However, it is prone to vulnerabilities such as weak passwords, password reuse, and phishing attacks.

2. Two-factor authentication (2FA): This method adds an extra layer of security by requiring users to provide a second form of authentication, such as a unique code sent to their mobile device. It significantly reduces the risk of unauthorized access.

3. Biometric authentication: Leveraging unique physical or behavioral traits like fingerprints, facial recognition, or voice patterns, biometric authentication offers a high level of security and convenience. However, it may raise privacy concerns and can be susceptible to spoofing attacks.

Enhancing Authentication with Multi-factor Authentication (MFA)

Multi-factor authentication (MFA) combines multiple authentication factors to strengthen security further. By utilizing a combination of something the user knows (password), something the user has (smartphone or token), and something the user is (biometric data), MFA provides an additional layer of protection against unauthorized access.

Understanding Authorization

Authorization is the gatekeeper of access control. It determines who has the right to access specific resources within a system. By setting up rules and permissions, organizations can define which users or groups can perform certain actions, view specific data, or execute particular functions. This layer of security ensures that only authorized individuals can access sensitive information, reducing the risk of unauthorized access or data breaches.

Granular Access Control

One key benefit of authorization is the ability to apply granular access control. Rather than providing unrestricted access to all resources, organizations can define fine-grained permissions based on roles, responsibilities, and business needs. This ensures that individuals only have access to the necessary resources to perform their tasks, minimizing the risk of accidental or deliberate misuse of data.

Role-Based Authorization

Role-based authorization is a widely adopted approach that simplifies access control management. Organizations can streamline the process of granting and revoking access rights by assigning roles to users. Roles can be structured hierarchically, allowing for easy management of permissions across various levels of the organization. This not only enhances security but also simplifies administrative tasks, as access rights can be managed at a group level rather than individually.

Authorization Policies and Enforcement

Organizations need to establish robust policies that govern access control to enforce authorization effectively. These policies define the rules and conditions for granting or denying resource access. They can be based on user attributes, such as job title or department, and contextual factors, such as time of day or location. By implementing a comprehensive policy framework, organizations can ensure access control aligns with their security requirements and regulatory obligations.

3rd Lab Guide: Generic Firewalling

Firewall and Cisco ASA

The following is a typical firewalling setup. I’m using a Cisco ASA; however, all firewalls, regardless of vendor, work with security zones. In a distinctive firewall design, we have internal, external, and DMZ zones: R1 is internal, R3 is DMZ, and R2 is external. This directs traffic flow: by default, R2 (external) cannot initiate communication with R1 or R3, while R1 (internal) can communicate with both R3 and R2.

Diagram: Default Firewall Inspection.

Note:

The Cisco ASA Firewall uses so-called “security levels” that indicate how trusted an interface is compared to another. The higher the security level, the more trusted the interface is. Each interface on the ASA is a security zone, so using these security levels gives us different trust levels for our security zones.
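
As a sketch of how this looks in configuration (interface numbering and addresses are illustrative, not the lab's exact values):

    ! Higher security-level = more trusted. Inside is fully trusted,
    ! outside is untrusted, and the DMZ sits in between.
    interface GigabitEthernet0/0
     nameif inside
     security-level 100
     ip address 192.168.1.254 255.255.255.0
    !
    interface GigabitEthernet0/1
     nameif outside
     security-level 0
     ip address 192.168.2.254 255.255.255.0
    !
    interface GigabitEthernet0/2
     nameif dmz
     security-level 50
     ip address 192.168.3.254 255.255.255.0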

ASA Security Levels and Traffic Flow

An interface with a high security level can access an interface with a low security level, but not the other way around unless we configure an access list that permits the traffic. In the screenshot below, NAT is configured, and the internal address of R1 is translated to 192.168.2.196. This is known as Dynamic NAT, and it is configured with ASA object groups.

Diagram: Firewall traffic flow and NAT.

Firewall security policy

A firewall is an essential part of an organization’s comprehensive security policy. A security policy defines the goals, objectives, and procedures of security, many of which can be implemented with a firewall. There are many different firewalling modes and types.

Generally, a firewall can focus on the packet header, the packet payload (the packet’s essential data), or both; on the session’s content; or on the establishment of a circuit. Most firewalls concentrate on only one of these. The most common filtering focus is the packet header, with the packet payload a close second.

Firewalls come in various sizes and flavors. The most typical is a dedicated system or appliance that sits in the network and segments an “internal” network from the “external” Internet; a firewall can also run directly on a host. The primary difference between these two types is the number of hosts the firewall protects. Within the network firewall type, there are primary classifications of devices, including the following:

    • Packet-filtering firewalls (stateful and nonstateful)
    • Circuit-level gateways
    • Application-level gateways
Diagram: Displaying the different firewall types.

4th Lab Guide: Dynamic NAT on ASA Firewall

In this lab guide, I will address Dynamic NAT on the ASA firewall. Below, I am using Cisco Modeling Labs. In the middle, we have our ASA; its G0/0 interface belongs to the inside, and the G0/1 interface belongs to the outside. I’m using routers so that I have something to connect to.

Note: Unlike dynamic PAT, which is dynamic NAT with overload, dynamic NAT in its most basic form has no overload functionality: each global IP address is mapped to a single local IP address. There are two variants: Dynamic NAT without fallback and Dynamic NAT with fallback. In the diagram below, if we use Dynamic NAT without fallback and all hosts on the 192.168.1.0 subnet try to access the outside network, we will run out of IP addresses in the public pool. The router R1 has several loopbacks, and I will telnet from each loopback as the source interface.


You can enable NAT fallback if you want. This means that when the public pool runs out of IP addresses, the ASA uses the IP address on the outside interface (192.168.2.254) for translation.

Without fallback, when a packet passes through the ASA, the port fields are left untouched and only the IP addresses are translated. This has significant consequences for matching traffic: because each inside host consumes a whole pool address, you can quickly run out of IP addresses in the translation pool.
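
A sketch of what this could look like with ASA object NAT (post-8.3 syntax assumed; the pool range is illustrative, and the trailing interface keyword enables the fallback described above):

    ! Pool of global addresses used for dynamic NAT.
    object network NAT_POOL
     range 192.168.2.200 192.168.2.210
    !
    ! Inside hosts translate to the pool; fall back to the outside
    ! interface address when the pool is exhausted.
    object network INSIDE_HOSTS
     subnet 192.168.1.0 255.255.255.0
     nat (inside,outside) dynamic NAT_POOL interface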


Network security operating at different network security layers

We have several network security components, from the endpoints to the network edge, be it a public or private cloud. Policy and controls are enforced at each network security layer, giving adequate control and visibility of threats that may seek to access, modify, or break a network and its applications. Firstly, network security is provided from the network: your IPS/IDS, virtual firewalls, and distributed firewall technologies.

Second, there is endpoint security, which protects the end devices and applications. Of course, you can’t have one without the other, but if you were to pick a favorite, it would be endpoint security.

Remember that in most security architectures I see in consultancy engagements, the network security layers are distinct; there may even be a different team looking after each component. This has been the case for a while, but there needs to be integration between the layers of security to keep up with changes in the security landscape.

Diagram: Network security components.

WAN security with Cisco DMVPN

DMVPN: A Routing Technique.

Cisco DMVPN (Dynamic Multipoint Virtual Private Network) is a widely used technology for connecting multiple sites and remote users to a central location. While DMVPN offers many benefits, such as scalability, flexibility, and ease of deployment, security is also essential to consider.
Here are some best practices for DMVPN security:

    • Authentication: DMVPN should always use authentication to ensure that only authorized users can access the network. Authentication mechanisms such as passwords, digital certificates, and tokens can secure the network.
    • Encryption: Encryption algorithms such as AES and 3DES should be used to protect data transmitted over DMVPN.
    • Firewall: DMVPN should be deployed with a firewall to prevent unauthorized access to the network. The firewall should be configured to allow only necessary traffic to pass through.
    • Access Control: Access control should be implemented to restrict access to sensitive data. Mechanisms such as role-based access control (RBAC) can ensure that only authorized users can access sensitive data.
    • Logging and Monitoring: Logging and monitoring are critical to detect and respond to security incidents. DMVPN should be configured to log all network traffic and events, and monitoring tools should be used to detect any unusual activity.

5th Lab Guide: DMVPN

DMVPN Network

In the following lab guide, we have a DMVPN network. DMVPN combines a group of technologies working together, such as GRE for tunneling and NHRP for mapping tunnel endpoints to physical interfaces. In our case, we are running an earlier version of DMVPN: phase 1.

We know this because we have a point-to-point GRE tunnel. DMVPN phase 3, which allows dynamic spoke-to-spoke tunnels between R2 and R3, would use mGRE. By default, DMVPN has no built-in security; security can be provided with IPsec. Here, you will see the command on the spoke sites: tunnel protection ipsec profile DMVPN_IPSEC_PROFILE.
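
For context, an IPsec profile like the one referenced could be built along the following lines. This is a hedged sketch: the policy number, pre-shared key, and transform choices are illustrative rather than the lab's exact values.

    ! Phase 1 policy and a wildcard pre-shared key.
    crypto isakmp policy 10
     encryption aes 256
     hash sha256
     authentication pre-share
     group 14
    crypto isakmp key MY_PRESHARED_KEY address 0.0.0.0 0.0.0.0
    !
    ! Phase 2 transform set; transport mode is typical for DMVPN.
    crypto ipsec transform-set DMVPN_TS esp-aes 256 esp-sha256-hmac
     mode transport
    !
    crypto ipsec profile DMVPN_IPSEC_PROFILE
     set transform-set DMVPN_TS
    !
    interface Tunnel0
     tunnel protection ipsec profile DMVPN_IPSEC_PROFILE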

Diagram: DMVPN configuration.

Network Security Challenges

Multi-cloud

The applications are now diverse. We have container-based virtualization that can be hosted in both on-premises and cloud locations, enabling hybrid and multi-cloud environments that need to be protected. Native security controls in the public cloud are insufficient. For a start, security groups (SGs) in one public cloud do not span multiple clouds; you need an additional technology set that sits in front of the clouds to enable a secure multi-cloud.

Multi-cloud Terraform

The challenge with the cloud is that dynamic infrastructure means infinite volume. Multi-cloud deployments add complexity because each provider has its own interfaces, tools, and workflows. With Terraform, you have the option to deploy across multiple clouds consistently: the same workflow manages multiple providers and handles cross-cloud dependencies. This simplifies management and orchestration for large-scale, multi-cloud infrastructures.
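
The single-workflow point shows up clearly at the command line. Whatever mix of providers a configuration declares, the lifecycle is the same (a sketch, assuming a working directory that already contains Terraform configuration):

    # One workflow, however many clouds the configuration spans.
    terraform init    # downloads every provider the configuration declares
    terraform plan    # computes one change graph across all providers
    terraform apply   # converges resources in every cloud together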

Changes in perimeter location and types

We also know this new paradigm spreads the perimeter, potentially increasing the attack surface with many new entry points. For example, if you are protecting a microservices environment, each unit of work represents a business function that needs security. So we now have many entry points to cover, moving security closer to the endpoint.


A recommended starting point: Enforcement with network security layers

So, we need a multi-layered approach to network security that implements security controls at different points and network security layers. With this approach, we ensure a robust security posture regardless of network design. The network design should become irrelevant to security: the design can change, for example by adding a different cloud, without affecting the security posture. The remainder of the post discusses the standard network security components.

Diagram: Security components.

Network Security Components

Step 1: Access control

Firstly, we need some access control. This is the first step to security. Bad actors are not picky about location when launching an attack. An attack can come from literally anywhere and at any time. Therefore, network security starts with access control carried out with authentication, authorization, accounting (AAA), and identity management.

Authentication proves that the person or service is who they say they are. Authorization allows them to carry out tasks related to their role. Identity management is all about managing the attributes associated with the user, group of users, or other identity that may require access. The following figure shows an example of access control, more specifically, network access control.

Diagram: Example of network access control. Source: Portnox.

Identity-centric access control

It would be best to have an identity based on logical attributes, such as multi-factor authentication (MFA), a transport layer security (TLS) certificate, the application service, or a logical label/tag. Be careful with labels/tags when you have cross-domain security.

So, policies are based on logical attributes rather than on IP addresses, which you may have used to base policies on in the past. This ensures an identity-centric design built around the user’s identity, not the IP address.

Once the initial security controls are passed, a firewall ensures that users can only access the services they are allowed to. These devices decide who gets access to which parts of the network. Depending on the design, the network is divided into different zones or micro-segments. Adopting micro-segments gives finer granularity; this is the essential difference between traditional segmentation and micro-segmentation.

Dynamic access control

Access control is the most critical component of an organization’s cybersecurity protection. For too long, access control has been based on static entitlements. Now, we demand dynamic access control, with decisions made in real time. Access control must support an agile IT approach, with dynamic workloads across multiple cloud environments.

A pivotal point about access control is that it is dynamic and real-time, constantly assessing and determining the risk level, thereby preventing unauthorized access and threats such as a UDP scan. We also have zero trust network design tools, such as single packet authentication (SPA), that keep the network dark until all approved security controls are passed. Once they are passed, access is granted.

Diagram: Identity-centric access control.

Network Security Components | Network Security Layers

Step 2: The firewall and firewall design locations

A firewalling strategy can offer your environment different firewall types, capabilities, and defense-in-depth levels. Each firewall type, positioned in a different part of the infrastructure, forms a security layer, providing defense-in-depth and a robust security architecture. At a high level, there are two firewalling types: internal firewalling, which can be distributed among the workloads, and border-based firewalling.

Firewalling at the different network security layers

The different firewall types offer capabilities that begin with basic packet filters and reflexive ACLs, continue with stateful inspection, and extend to next-generation features such as micro-segmentation and dynamic access control. These can take physical or virtualized forms.

Firewalls purposely built and designed for a particular role should not be repurposed to carry out functions intended for a different firewall type. The following diagram lists the different firewall types; around nine work at different layers in the network.

Diagram: Displaying the different firewall types. Source: Javatpoint.

The Edge Firewall

Macro segmentation

The firewall monitors and controls incoming and outgoing network traffic based on predefined security rules, establishing a barrier between the trusted and untrusted networks. At the network’s edge, the firewall commonly inspects Layer 3 to Layer 4. In addition, to reduce hairpinning and re-architecture, we have internal firewalls. We can also put an IPS/IDS or antivirus on an edge firewall.

In the classic definition, the edge firewall performs access control and segmentation based on IP subnets, known as macro segmentation. Macro segmentation is another term for traditional network segmentation. It is still the most prevalent segmentation technique in most networks and can have benefits and drawbacks.

Same segment, same sensitivity level 

It is easy to implement, but it assumes that all endpoints in the same segment have (or should have) the same security level and can talk freely, as defined by security policy. We will always have groups of endpoints with similar security levels, and for those, macro segmentation is a perfect choice. Why introduce complexity when you do not need to?

Micro-segmentation

The same edge firewall can be used for more granular segmentation; this is known as micro-segmentation. In this case, the firewall works at a finer granularity, logically dividing the data center into distinct security segments, down to the individual workload level, and then defining security controls and delivering services for each unique segment. Each endpoint gets its own segment and can’t talk outside that segment without policy. We can also have a dedicated internal firewall perform the micro-segmentation.

Cisco ACI and microsegmentation

One micro-segmentation solution is Endpoint Groups (EPGs) in Cisco ACI. ACI policy is based on contracts, which use subjects and filters to restrict traffic and enable the policy. Within an EPG, traffic is unrestricted; however, a contract is needed for traffic to cross EPGs.

Internal Firewalls 

Internal firewalls inspect higher up in the application stack and can have different types of firewall context. They operate at a workload level, creating secure micro perimeters with application-based security controls. The firewall policies are application-centric, purpose-built for firewalling east-west traffic with layer 7 network controls with the stateful firewall at a workload level. 

Diagram: Firewall design locations.

Virtual firewalls and VM NIC firewalling

Virtualization internal to the network has introduced the world of virtual firewalls. Virtual firewalls are internal firewalls distributed close to the workloads. For example, we can have the VM NIC firewall: in a virtualized environment, a packet-filtering solution inserted between the virtual machine’s (VM) network interface card and the virtual hypervisor switch. All traffic going in and out of the VM has to pass via this virtual firewall.

Web application firewalls (WAF)

We could use web application firewalls (WAFs) as application-level firewalls. These devices are similar to reverse proxies in that they can terminate and initiate new sessions to the internal hosts. The WAF has been around for quite some time, protecting web applications by inspecting HTTP traffic.

However, WAFs have the additional capability to inspect payloads, which identifies destructive behavior patterns better than a simple VM NIC firewall.

WAFs are good at detecting static and dynamic threats. They protect against common web attacks, such as SQL injection and cross-site scripting, using pattern-matching techniques against HTTP traffic. Known, active web attacks are the primary threats a WAF addresses and where it delivers most value.

Network Security Components

Step 3: The load balancer

A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across several servers. This allows organizations to ensure that their resources are used efficiently and that no single server is overburdened, improving the performance, scalability, and availability of the applications behind it.

Load balancing and load balancer scaling refer to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or pool. For security, a load balancer has some capability to absorb attacks, such as a volumetric DDoS attack. Here, we can have an elastic load balancer running in software.

Diagram: Gateway Load Balancing Protocol (GLBP).

So, it can run in front of a web property and load balance between the various front ends, i.e., web servers. If it detects an attack, it can apply specific mitigation techniques, providing a security function beyond load balancing.
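
A quick way to observe distribution is to probe a balanced endpoint repeatedly; lb.example.com and the X-Backend header are hypothetical stand-ins for whatever your load balancer actually exposes:

    # If the load balancer rotates backends (e.g., round-robin), the
    # identifying response header should change between requests.
    for i in $(seq 1 6); do
      curl -sI http://lb.example.com/ | grep -i '^x-backend'
    done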

 

Network Security Components

Step 4: The IDS

Traditionally, an IDS consists of a sensor installed on the network that monitors traffic for a set of defined signatures. The signatures are downloaded and applied to network traffic daily. Traditional IDS systems do not learn from behaviors or other network security devices over time. The solution only looks at a specific point in time, lacking an overall picture of what’s happening on the network.

They operate from an island of information, only examining individual packets and trying to ascertain whether there is a threat. This approach results in many false positives that cause alert fatigue. Also, when a trigger does occur, there is no copy of the network traffic for investigation. Without it, how do you know the next stage of events? Working with an IDS alone, security professionals are stuck with what to do next.

  • A key point: IPS/IDS  

Then we have the IPS/IDS. An example would be IDS/IPS in Azure.

An intrusion detection system (IDS) is a security system that monitors and detects unauthorized access to a computer or network. It also monitors communication traffic from the system for suspicious or malicious activity and alerts the system administrator when it finds any. An IDS aims to identify and alert the system administrator of any malicious activities or attempts to gain unauthorized access to the system.

An IDS can be a hardware or software solution, or a combination of both. It can detect various malicious activities, such as viruses, worms, and malware. It can also detect attempts to access the system, steal data, or change passwords, along with other activities that are not considered standard.

The IDS uses various techniques to detect intrusion. These techniques include signature-based detection, which compares the incoming traffic against a database of known attacks; anomaly-based detection, which looks for any activity that deviates from normal operations; and heuristic detection, which uses a set of rules to detect suspicious activity.
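
To give signature-based detection some flavor, here is a deliberately crude, hypothetical illustration, not a real IDS; production engines such as Snort or Suricata do this with optimized rule sets:

    # Print ASCII payloads on port 80 and flag a pattern commonly
    # associated with SQL injection attempts.
    sudo tcpdump -i eth0 -A 'tcp port 80' 2>/dev/null \
      | grep --line-buffered -i 'union select'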

Firewalls and static rules

Firewalls use static rules to limit network access, but they don’t monitor for malicious activity. An IPS/IDS examines network traffic flows to detect and prevent vulnerability exploits. The classic IPS/IDS is typically deployed behind the firewall and performs protocol analysis and signature matching on various parts of the data packet.

The protocol matching is, in some sense, a compliance check against the publicly declared specification of the protocol; basic protocol checks catch someone abusing protocol fields. The IPS/IDS then uses signatures to prevent known attacks. For example, an IPS/IDS uses a signature to stop SQL injection.

Move security to the workload.

Like the application-based firewalls, the IPS/IDS functionality at each workload ensures comprehensive coverage without blind spots. So, as you can see, the security functions are moving much closer to the workloads, bringing the perimeter from the edge to the workload.

Network Security Components

Step 5: Endpoint Security

Endpoint security is an integral part of any organization’s security strategy. It involves the protection of endpoints, such as laptops, desktops, tablets, and smartphones, from malicious activity. Endpoint security protects data stored on devices and the device itself from malicious code or activity.

Endpoint security includes various measures, including antivirus and antimalware software, application firewalls, device control, and patch management. Antivirus and antimalware software detect and remove malicious code from devices. Application firewalls protect devices by monitoring incoming and outgoing network traffic and blocking suspicious activity. Device control ensures that only approved devices can be used on the network. Finally, patch management ensures that devices are up to date with the latest security patches.

Network detection and response 

Then, we have network detection and response solutions. Network detection and response (NDR) solutions are designed to detect cyber threats on corporate networks using machine learning and data analytics. They can help you discover evidence, on the network and in the cloud, of malicious activities that are in progress or have already occurred.

Some analysts promote NDR tools as “Next-Gen IDS.” One significant difference between NDR and old IDS tools is that NDR tools use multiple machine learning (ML) techniques to identify normal baselines and anomalous traffic, rather than static rules or IDS signatures, which have trouble handling dynamic threats. The following figure shows an example of a typical attack lifecycle.

Diagram: Example of an attack lifecycle. Source: Palo Alto Networks.

Anti-malware gateway

Anti-malware gateway products have a particular job. They look at a download, take the file, and try to open it. Files are put through a sandbox to test whether they contain anything malicious. However, the bad actors who develop malware test against these systems before releasing it; therefore, the gateways often lag one step behind. Also, anti-malware gateways are limited in scope and not focused on anything but malware.

Endpoint detection and response (EDR) solutions look for evidence and effects of malware that may have slipped past endpoint protection platform (EPP) products. EDR tools also detect malicious insider activities such as data exfiltration attempts, left-behind accounts, and open ports. Endpoint security has the best opportunity to detect several threats and is the closest to providing a holistic offering. It is probably the best point solution, but remember, it is still just a point solution.

  • A key point: DLP security 

By monitoring the machine and its processes, endpoint security is there for the long haul instead of assessing a file on a once-off basis. It can see when malware is executing and then implement DLP. Data Loss Prevention (DLP) solutions are security tools that help organizations ensure that sensitive data, such as Personally Identifiable Information (PII) or Intellectual Property (IP), does not leave the corporate network or reach a user without access. However, endpoint security does not cover every sophisticated use case. For example, it doesn’t care what you print or which Google Drive documents you share.
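
For a feel of what DLP-style content inspection does, here is a deliberately simplistic sketch; a real DLP engine uses validated classifiers, whereas a bare regex like this one generates many false positives (the share path is a placeholder):

    # Flag files whose contents look like a 16-digit card number.
    grep -rlE '\b[0-9]{4}([ -]?[0-9]{4}){3}\b' /srv/share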

  • A key point: Endpoint security and correlation?

In general, endpoint security does not do any correlation. For example, say a .exe connects to the database; there is nothing on the endpoint to indicate that the connection is malicious. Endpoint security finds it hard to distinguish malicious from legitimate activity unless there is a signature. Again, it is the best point solution, but it is not a managed service and lacks a holistic view.

Diagram: Endpoint security.

The issue with point solutions

The security landscape is constantly evolving, and to have any chance, security solutions need to evolve with it. That requires a focused approach: continually developing security in line with today’s and tomorrow’s threats. The answer is not to keep buying more point solutions that are not integrated, but to make continuous investments to ensure the algorithms are accurate and complete. Consider that today, if you want to change the firewall, you may need to buy a new physical or virtual device.

Complex and scattered

That is impossible to do with various point solutions designed with complex integration points scattered through the network domain. It is far more beneficial to, for example, update an algorithm than to update a number of point solutions dispersed throughout the network. A point solution addresses one issue and requires a considerable amount of integration. You must continuously add pieces to the stack, increasing management overhead and complexity, not to mention license costs.

Would you like to buy a car or all the parts?

Let’s say you are searching for a new car. Would you prefer to build the car from all the different parts or buy the already-built car? Security, as the industry has geared it up, is provided as parts.

So I have to add this part here and that part there, and none of these parts connect. Each component must be carefully integrated with another. It’s your job to support, manage, and build the stack over time, and for this, you must be an expert in all the different parts.

Example: Log management

Consider a log management system that needs to integrate numerous event sources such as firewalls, proxy servers, endpoint detection, and behavioral response solutions. We also have the SIEM, which collects logs from multiple systems. SIEMs present deployment challenges and require tremendous work to integrate into existing systems. How do logs get into the SIEM when a device is offline?

How do you normalize the data, write the rules to detect suspicious activity, and investigate whether alerts are legitimate? The results you gain from a SIEM are poor considering the investment you have to make; considerable resources are needed to pull it off successfully.

  • A keynote: Security controls from the different vendors 

As a final note, consider how you have to administer security controls from different vendors. How do you utilize security controls from other vendors, and more importantly, how do you use them adjacent to one another? For example, Palo Alto Networks operates App-ID, a patented traffic classification system only available in Palo Alto Networks firewalls.

Devices from other vendors will not support this feature. This poses the question: how do I utilize next-generation features from one vendor adjacent to devices that don’t support them? Your network needs the ability to support features from one product across the entire network and then consolidate them. How do I use all the next-generation features without being tied to one vendor?

  • A keynote: Use of a packet broker

Instead, it would be better to change an algorithm once and have the change affect all firewalls in your network. That would be an example of an advanced platform controlling your entire infrastructure. Another typical example is a packet broker that sits in the middle of all these tools. It fetches data from the network and endpoints and then sends it to your existing security tools, essentially ensuring that there are no blind spots in the network.

This packet broker tool should support any workload and be able to send to any existing security tools. Now, we are bringing information from the network into your existing security tools and adopting a network-centric approach to security.

Summary: Network Security Components

This blog post delved into the critical components of network security, shedding light on their significance and how they work together to protect our digital realm.

Section 1: Firewalls – The First Line of Defense

Firewalls are the first line of defense against potential threats. Acting as gatekeepers, they monitor incoming and outgoing network traffic, analyzing data packets to determine their legitimacy. By enforcing predetermined security rules, firewalls prevent unauthorized access and protect against malicious attacks.

Section 2: Intrusion Detection Systems (IDS) – The Watchful Guardians

Intrusion Detection Systems play a crucial role in network security by detecting and alerting against suspicious activities. IDS monitors network traffic patterns, looking for any signs of unauthorized access, malware, or unusual behavior. With their advanced algorithms, IDS helps identify potential threats promptly, allowing for swift countermeasures.

Section 3: Virtual Private Networks (VPNs) – Securing Data in Transit

Virtual Private Networks establish secure connections over public networks like the Internet. VPNs create a secure tunnel by encrypting data traffic, preventing eavesdropping and unauthorized interception. This secure communication layer is vital when accessing sensitive information remotely or connecting branch offices securely.

Section 4: Access Control Systems – Restricting Entry

Access Control Systems are designed to manage user access to networks, systems, and data. Through authentication and authorization mechanisms, these systems ensure that only authorized individuals can gain entry. Organizations can minimize the risk of unauthorized access and data breaches by implementing multi-factor authentication and granular access controls.

Section 5: Security Information and Event Management (SIEM) – Centralized Threat Intelligence

SIEM systems provide a centralized platform for monitoring and managing security events across an organization’s network. SIEM enables real-time threat detection, incident response, and compliance management by collecting and analyzing data from various security sources. This holistic approach to security empowers organizations to stay one step ahead of potential threats.

Conclusion:

Network security is a multi-faceted discipline that relies on a combination of robust components to protect against evolving threats. Firewalls, IDS, VPNs, access control systems, and SIEM collaborate to safeguard our digital realm. By understanding these components and implementing a comprehensive network security strategy, organizations can fortify their defenses and ensure the integrity and confidentiality of their data.


Open Networking

In today's digital age, connectivity is at the forefront of our lives. From smart homes to autonomous vehicles, the demand for seamless and reliable network connectivity continues to grow. This is where Open Networking comes into play. In this blog post, we will explore the concept of Open Networking, its benefits, and its impact on the future of technology.

Open Networking refers to separating hardware and software components of a network infrastructure. Traditionally, network equipment vendors provided closed, proprietary systems that limited flexibility and innovation.

However, with Open Networking, organizations can choose the hardware and software components that best suit their needs, fostering greater interoperability and driving innovation.

Table of Contents

Highlights: Open Networking

The Role of Transformation

To undertake an effective SDN data center transformation strategy, we must accept that demands on data center networks come from internal end-users, external customers, and considerable changes in application architecture, all of which put pressure on traditional data center design.

Dealing effectively with these demands requires the network domain to become more dynamic, potentially introducing Open Networking and Open Networking solutions. For this to occur, we must embrace digital transformation and the changes it brings to our infrastructure. Unfortunately, keeping to current methods holds back this transition.

Modern Network Infrastructure

In modern network infrastructures, as has been the case on the server side for many years, customers demand supply chain diversification in hardware and silicon vendors. This diversification reduces the total cost of ownership because businesses can drive better cost savings. In addition, replacing the hardware underneath can be seamless because the software above is common to both vendors.

Further, as architectures streamline and spine-leaf architecture spreads from the data center to the backbone and the edge, a common software architecture across all these environments brings operational simplicity. This perfectly aligns with the broader trend of IT/OT convergence.

Related: For pre-information, you may find the following posts helpful:

  1. OpenFlow Protocol
  2. Software-defined Perimeter Solutions
  3. Network Configuration Automation
  4. SASE Definition
  5. Network Overlays
  6. Overlay Virtual Networking



Open Networking Solutions

Key Open Networking Discussion points:


  • Popularity of Spine Leaf architecture.

  • Lack of fabric-wide automation.

  • Automation and configuration management.

  • Open networking vs open protocols.

  • Challenges with integrated vendors.

Back to Basics: Open Networking

SDN and an SDN Controller

SDN’s three concepts are:

  • Programmability.
  • The separation of the control and data planes.
  • Managing a temporary network state in a centralized control model, regardless of the degree of centralization.

So, we have an SDN controller. In theory, an SDN controller provides services that can realize a distributed control plane and support the concepts of temporary state management and centralization.

Diagram: Open Networking for a data center topology.

The Role of Zero Trust

Zero trust security main components:

  • Zero trust security is a paradigm shift in the way organizations approach their cybersecurity.

  • Every user, device, or application, regardless of its location, must undergo strict verification and authorization processes.

  • Organizations can fortify their defenses, protect sensitive data, and mitigate the risks associated with modern cyber threats.

Benefits of Open Networking:

1. Flexibility and Customization: Open Networking enables organizations to tailor their network infrastructure to their specific requirements. By decoupling hardware and software, businesses can choose the best-of-breed components and optimize their network for performance, scalability, and cost-effectiveness.

2. Interoperability: Open Networking promotes interoperability by fostering open standards and compatibility between different vendors’ equipment. This allows organizations to build multi-vendor networks, reducing vendor lock-in and enabling seamless integration of network components.

3. Cost Savings: With Open Networking, organizations can lower their networking costs by leveraging commodity hardware and open-source software. This reduces capital expenditures and allows for more efficient network management and more effortless scalability.

4. Innovation and Collaboration: Open Networking encourages collaboration and innovation by providing a platform for developers to create and contribute to open-source networking projects. The community’s collective effort drives continuous improvements, leading to faster adoption of new technologies and features.

Open Networking in Practice:

Open Networking is already making its mark across various industries. Cloud service providers, for example, rely heavily on Open Networking principles to build scalable and flexible data center networks. Telecom operators also embrace Open Networking to deploy virtualized network functions, enabling them to offer services more efficiently and adapt to changing customer demands.

Moreover, adopting Software-Defined Networking (SDN) and Network Functions Virtualization (NFV) further accelerates the realization of Open Networking’s benefits. SDN separates the control plane from the data plane, providing centralized network management and programmability. NFV virtualizes network functions, allowing for dynamic provisioning and scalability.

In practice, Open Networking spans, among other things:

  • Cloud service providers
  • Virtualized network functions (VNFs)
  • Virtual private networks
  • Software-Defined Networking (SDN)
  • Network Functions Virtualization (NFV)
  • Dynamic provisioning and scalability
  • Open-source network operating systems (NOS)
  • White-box switches
  • Reduced vendor lock-in and the freedom to choose best-of-breed components
  • Intent-based networking
  • Network virtualization

Open Networking Solutions

Open networking solutions: Data center topology

Now, let’s look at the evolution of data centers to see how we can achieve this modern infrastructure. To evolve in line with current times, you should use technology and your infrastructure as practical tools; that way, you can drive the entire organization to become digital.

Of course, the network components will play a key role. Still, the digital transformation process is an enterprise-wide initiative focusing on fabric-wide automation and software-defined networking.

Open networking solutions: Lacking fabric-wide automation

One central pain point I have seen throughout networking is manual work and the lack of fabric-wide automation. It’s common to deploy applications by combining multiple services that run on a distributed set of resources. As a result, configuration and maintenance are much more complex than in the past. You have two options to implement all of this.

First, you can connect these services manually: spinning up the servers, installing the necessary packages, and SSHing to each one. Or you can go down the path of open networking solutions with automation; in particular, Ansible automation with Ansible Engine, or Ansible Tower with automation mesh. As an automation best practice, use Ansible variables for flexible playbooks that can easily be shared and reused across different environments.

Agility and the service provider

For example, a service provider with thousands of customers needs to deploy segmentation to separate them. Traditionally, the technology of choice would be VRFs or even full-blown MPLS, which requires administrative touchpoints on every box.

Having been part of a full-blown MPLS design and deployment for a larger service provider, I can say the costs and time were extreme. Even when it was finally done, the design lacked the agility of what you could do with Open Networking.

Such a design includes Provider Edge (PE) routers at the edge, to which the customer CPE connects. Then, in the middle of the network, Provider (P) routers switch the traffic based on a label.

Although label switching made it easy to implement IPv6 with 6PE (a technique that provides global IPv6 reachability over IPv4 MPLS), overcoming many IPv6 transition issues, we could not get away from the manual process without investing heavily again. It is commonly a manual process.

Fabric-wide automation and SDN

However, in a software-defined environment, deploying a VRF or any technology, such as an anycast gateway, is a dynamic global command. We now have fabric-wide automation and can deploy with one touch instead of numerous box-by-box configurations.

Essentially, we are moving from box-by-box configuration to the atomic programming of a distributed fabric as a single entity. The beauty is that we can carry out deployments from one configuration point, quickly and without human error.

Diagram: Fabric-wide automation.

Open networking solutions: Configuration management

Manipulating configuration files by hand is tedious, error-prone, and time-consuming. Equally, performing pattern matching to change existing files is risky. The manual approach results in configuration drift, where some servers drift from the desired state.

Configuration drift is caused by inconsistent configuration items across devices, usually due to manual changes and updates that do not follow the automation path. The Ansible architecture can maintain the desired state across various managed assets.

The managed assets, which can range from distributed firewalls to Linux hosts, are stored in what’s known as an inventory file, which can be static or dynamic. Dynamic inventories are best suited to cloud environments where you want to gather host information dynamically. Ansible is all about maintaining the desired state of your domain; a sketch follows.
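
A minimal sketch of the idea, assuming Ansible is installed and SSH access works; the hostnames and group names are placeholders. First the static inventory file:

    # inventory.ini - a small static inventory of managed assets.
    [firewalls]
    fw1.example.com

    [linux_hosts]
    web1.example.com
    web2.example.com

Then an ad-hoc check that Ansible can reach every asset in the inventory:

    ansible all -i inventory.ini -m ping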

Diagram: Ansible automation.

The issue of Silos

To date, the networking industry has been controlled by a few vendors. We have dealt with proprietary silos in the data center, campus/enterprise, and service provider environments. The major vendors continue to provide vertically integrated, locked-in solutions for most customers and will not allow independent, third-party network operating system software to run on their silicon.

Typically, these silos solved the problems of their time. But modern infrastructure needs to be modular, open, and straightforward. To break from the vertically integrated lock-in model, vendors need to allow independent, third-party network operating systems to run on their silicon.

Cisco has started this for the broader industry with the announcement of Cisco Silicon One.

Diagram: The issue of vendor lock-in.

The Rise of Open Networking Solutions

New data center requirements have emerged; therefore, the network infrastructure must break the silos and transform to meet them. One can view the network transformation as moving from a static, conservative mindset that results in cost overruns and inefficiencies to a dynamic routed environment that is simple, scalable, secure, and can reach the far edge. Effective network transformation takes several stages.

Firstly, transition to a routed data center design with a streamlined leaf-spine architecture, along with a standard operating system across cloud, edge, and 5G networks. A viable approach is to do all of this with open standards, without proprietary mechanisms. Then, we need good visibility.

The need for visibility

As part of the transformation, the network is no longer considered a black box that merely needs to be available and provide connectivity to services. Instead, the network is a source of deep visibility that can aid a large set of use cases: network performance, monitoring, security, and capacity planning, to name a few. However, visibility is often overlooked, with an over-focus on connectivity and a failure to treat the network as a valuable source of information.

Diagram: The requirement for deep visibility.

Monitoring a network: Flow level

For efficient network management, we must provide deep visibility of applications at the flow level, on any port and device type. Today, you would have to deploy a separate monitoring network to get anything comparable. Such a network would consist of probes, packet brokers, and tools to process packets for metadata.

Traditional network monitoring tools, such as packet brokers, require their own lifecycle management. A more viable solution integrates network visibility into the fabric itself and does not need as many components. This lets us do more with the data and aids agility in ongoing network operations.

There will always be some requirement for application optimization or a security breach, where visibility can help you quickly resolve these issues.

Monitoring detects known problems and is only valid with pre-defined dashboards for problems you have seen before, such as capacity reaching its limit. Observability, on the other hand, can detect unknown situations and helps you get to the root cause of any problem, known or unknown: Observability vs Monitoring.

Evolution of the Data Center

We are transitioning, and the data center has undergone several design phases. Initially, we started with layer 2 silos, suitable for north-to-south traffic flows. However, layer 2 designs hindered the east-west traffic flows of modern applications and restricted agility, which led to a push to break network boundaries.

Hence, there is a move to routing at the Top of Rack (ToR), with overlays between ToRs to drive inter-application communication. This is the most efficient approach and can be accomplished in several ways.

The leaf-spine “Clos” popularity

The demand for leaf-spine “Clos” designs started in the data center and spread to other environments. A Clos network is a type of non-blocking, multistage switching architecture. This network design extends from the central/backend data center to the micro data centers at the edge. Various parts of the edge network, PoPs, central offices, and the packet core have all been transformed into leaf-spine Clos designs.

Diagram: Leaf-spine architecture.

The network overlay

Building a complete network overlay is common to all software-defined technologies when increasing agility. An overlay is a solution that is abstracted from the underlying physical infrastructure. This means separating and disaggregating the customer applications or services from the network infrastructure. Think of it as a sandbox or private network for each application that is on an existing network.

More often than not, the network overlay is created with VXLAN. Cisco ACI uses VXLAN for the overlay, and the underlay is a combination of BGP and IS-IS. The overlay abstracts a lot of complexity, and Layer 2 and 3 traffic separation is done with a VXLAN network identifier (VNI).

The VXLAN overlay

VXLAN uses a 24-bit segment ID, called the VXLAN network identifier (VNI), for identification. This is much larger than the 12 bits used for traditional VLAN identification. The VNI is conceptually just a VLAN ID, but it supports up to 16 million VXLAN segments.

This is considerably more than the traditional 4094 VLANs. Not only does this provide more segments, but it also enables better network isolation, with many small VXLAN segments instead of one large VLAN domain.

The VXLAN network has become the de facto overlay protocol and brings many advantages to network architecture regarding flexibility, isolation, and scalability. VXLAN effectively implements an Ethernet segment virtualizing a thick Ethernet cable.
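
On a Linux host, you can see the VNI concept directly. A hedged sketch follows: the interface names, VNI, and addressing are illustrative, and a remote peer or multicast group must also be configured before traffic will flow.

    # Create a VXLAN interface for segment (VNI) 100 over eth0, UDP port 4789.
    sudo ip link add vxlan100 type vxlan id 100 dstport 4789 dev eth0
    sudo ip addr add 10.100.0.1/24 dev vxlan100
    sudo ip link set vxlan100 up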


Traditional policy deployment

Traditionally, deploying an application to the network involves propagating policy through the entire infrastructure. Why? Because the network acts as an underlay, and segmentation rules configured on the underlay are needed to separate different applications and services.

This creates a rigid architecture that cannot react quickly or adapt to change, and therefore lacks agility. The applications and the physical network are tightly coupled. With an overlay network, we can instead apply policy in the overlay, with proper segmentation per customer.

How VXLAN works: ToR

What is VXLAN? Virtual networks built with VXLAN originate from servers or ToR switches. Either way, the underlying network transports the traffic and doesn’t need to be configured to accommodate the customer application. That’s all done in the overlay, including the policy. Everything happens in the overlay network, which is most efficient when done in a fully distributed manner.

Diagram: Overlay networking with VXLAN.

Now, application and service deployment occurs without touching the physical infrastructure. For example, if you need to have Layer 2 or Layer 3 paths across the data center network, you don’t need to tweak a VLAN or change routing protocols.

Instead, you add a VXLAN overlay network. This approach removes the tight coupling between the application and network, creating increased agility and simplicity in deploying applications and services.

Diagram: The VXLAN overlay network.

Extending from the data center

Edge computing fundamentally disrupts the business infrastructure teams. We no longer have the framework where IT only looks at backend software, such as Office 365, while OT looks at product-centric routing and switching elements. There is convergence.

Therefore, you need a lot of open APIs. The edge computing paradigm brings processing closer to the end devices, reducing latency and improving the end-user experience. To support this, you need a network that can work with this model; siloed solutions do not.

Common software architecture

So the data center design went from the layer 2 silo to leaf-spine architecture with routing to the ToR. However, there is another missing piece: a standard operating software architecture across all domains and location types for switching and routing, to reduce operating costs. The problem remains that even a single site can run several different operating systems.

Through recent consultancy engagements, I have experienced the operational challenge of having many Cisco operating systems on one site. For example, I had IOS XR for the service provider product lines, IOS XE for the enterprise, and NX-OS for the data center, all on a single site.

Open networking solutions and partially open-source 

Some major players, such as Juniper, started with one operating system and then fragmented significantly. It’s not that these are not great operating systems; the problem is that you end up partitioned into different teams, often one team for each operating system.

Standard operating system software provides a seamless experience across the entire environment. Your operational costs go down, your ability to use software for your specific use cases goes up, and the total cost of ownership drops. This is what Open Networking, and in part open source, brings.

What Is Open Networking

The traditional integrated vendor

Traditionally, networking products were a combination of hardware and software that had to be purchased as an integrated solution. Open networking, on the other hand, disaggregates hardware from software, allowing IT to mix and match at will.

With Open Networking, we are not reinventing how packets are forwarded or how routers communicate. With Open Networking solutions, you are never alone and never tied to a single vendor. The value of software-defined networking and Open Networking is doing as much as possible in software, so you don’t depend on a new generation of hardware to deliver new features. If you want a new feature, it can be quickly implemented in software without swapping the hardware or upgrading line cards.

Move intelligence to software.

You want to move as much intelligence as possible into software, thus removing the intelligence from the physical layer. You don’t want to build in hardware features; you want to use the software to provide the new features. This is a critical philosophy and is the essence of Open Networking. Software becomes the central point of intelligence, not the hardware; this intelligence is delivered fabric-wide.

We have seen this with the rise of SASE. From the customer’s point of view, they get more agility, as they can move from generation to generation of services without hardware dependency and without the operational cost of constantly swapping out hardware.


Open Networking Solutions and Open Networking Protocols

Some vendors build into the hardware the differentiator of the offering. For example, with different hardware, you can accelerate the services. With this design, the hardware level is manipulated to make improvements but does not use standard Open Networking protocols. 

When you look at your hardware to accelerate your services, the result is that you are 100% locked and unable to move as the cost of moving is too much. You could have numerous generations of, for example, line cards, and all have different capabilities, resulting in a complex feature matrix.

It is not that I’m against this, and I’m a big fan of the prominent vendors, but this is the world of closed networking, which was accepted as the norm until recently: you must adapt and fit into the vendor’s model. Instead, we need to use open protocols.

Open networking is a must; open source is not.

The proprietary silo deployments led to proprietary alternatives to the prominent vendors. This meant that the startups and options offered around ten years ago were playing the game on the same pitch as the incumbents. Others built their software and architecture by, for example, saying the Linux network subsystem and the OVS bridge are good enough to solve all data center problems.

With this design, you could build small PoPs with layer 2. But the ground shifts as the design requirements change to routing. So, let’s glue the Linux kernel and Quagga FRRouting (FRR) and devise a routing solution. Unfortunately, many didn’t consider the control plane architecture or the need for multiple data center use cases.

Limited scale

Gluing together an operating system and elements of open-source routing provides limited scale and performance and results in operationally intensive and expensive solutions. The software needs to be built to support the hardware and architectural demands.

Now, we see a lot of open-source networking vendors tackling this problem from the wrong architectural point of view, at least from where the market is moving to. It is not composable, microservices-based, or scalable from an operational viewpoint.

There is a difference between open source and Open Networking. The open-source offerings (especially the control plane) have not scaled because of sub-optimal architectures. 

On the other hand, Open Networking involves building software from first principles using modern best practices, with Open API (e.g., OpenConfig/NetConf) for programmatic access without compromising on the massive scale-up and scale-out requirements of modern infrastructure.

SDN Network Design Options

We have both controller and controllerless options. With a controllerless solution, setup is faster, agility increases, and there is more robustness against single points of failure; you also avoid the out-of-band management network otherwise needed to interconnect all the controllers.

A controllerless architecture is more self-healing; anything in the overlay network is also part of the control plane resilience. An SDN controller or controller cluster may add complexity and impede resiliency. Since the network depends on them for operation, they become a single point of failure and can impact network performance. The intelligence kept in a controller can be a point of attack.

So, there are workarounds where the data plane can continue forwarding without the SDN controller, but you should always avoid a single point of failure, or complex quorum mechanisms, in a controller-based architecture.

software defined architecture
Diagram: Software defined architecture.

Software Defined Architecture & Automation

We have two main types of automation to consider: day 0 and days 1-2. First and foremost, day 0 automation simplifies the build and reduces human error when constructing the infrastructure. Days 1-2 automation touches the customer more; this may include installing services quickly on the fabric, e.g., VRF configuration, with the automation built into the fabric.

Day 0 automation

As I said, day 0 automation builds basic infrastructures, such as routing protocols and connection information. These stages need to be carried out before installing VLANs or services. Typical tools software-defined networking uses are Ansible or your internal applications to orchestrate the build of the network.

These are known as fabric automation tools. Once the tools discover the switches, the devices are connected in a particular way, and the fabric network is built without human intervention. It simplifies traditional automation, which is helpful in day 0 automation environments.
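
As a rough sketch of what a day 0 run looks like with Ansible, the playbook below enables the underlay routing features across a fabric of Nexus switches. It assumes the cisco.nxos collection; the fabric inventory group and the OSPF process tag UNDERLAY are placeholder names, not a prescribed design.

    - name: Day 0 underlay bring-up (sketch)
      hosts: fabric                   # hypothetical group of leaf and spine switches
      gather_facts: false
      tasks:
        - name: Enable the features the underlay depends on
          cisco.nxos.nxos_feature:
            feature: "{{ item }}"
            state: enabled
          loop:
            - ospf
            - bgp

        - name: Lay down a baseline OSPF process
          cisco.nxos.nxos_config:
            lines:
              - router ospf UNDERLAY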

Configuration Management

Ansible is a configuration management tool that can help alleviate manual challenges. Ansible replaces the need for an operator to tune configuration files manually and does an excellent job in application deployment and orchestrating multi-deployment scenarios.  

Ansible configuration
Diagram: Ansible Configuration

Pre-deployed infrastructure

Ansible does not deploy the infrastructure; you could use other solutions, such as Terraform, that are better suited for this. Terraform is an infrastructure-as-code tool. Ansible is often described as a configuration management tool and is typically mentioned along the same lines as Puppet, Chef, and Salt. However, there is a considerable difference in how they operate.

Most notably, the installation of agents: Ansible is relatively easy to adopt because it is agentless. The Ansible architecture can be used in large environments with Ansible Tower, using execution environments and automation mesh. I have recently worked with automation mesh, a powerful overlay feature that enables automation closer to the network’s edge.

Current and desired state [ YAML playbooks, variables ]

Ansible is all about state management: it ensures that the managed asset’s current state matches the desired state. It does this with playbooks, written in YAML. A playbook is Ansible’s term for a configuration management script that drives the asset toward the desired state.
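
A minimal, generic playbook illustrates the desired-state idea: you declare the end state, and Ansible converges each host toward it, doing nothing when the state is already met. The package and service names here (chrony/chronyd) are just examples.

    - name: Ensure the NTP desired state (minimal example)
      hosts: all
      become: true
      tasks:
        - name: The NTP package is present
          ansible.builtin.package:
            name: chrony
            state: present

        - name: The NTP daemon is running and enabled at boot
          ansible.builtin.service:
            name: chronyd
            state: started
            enabled: true

Run it twice and the second run reports no changes; that idempotency is what makes playbooks safe to schedule and re-run.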

open networking solutions
Diagram: Configuration management.

Day 1-2 automation

With day 1-2 automation, SDN does two things.

Firstly, the ability to install or provision services automatically across the fabric. With one command, human error is eliminated. The fabric synchronizes the policies across the entire network. It automates and disperses the provisioning operations across all devices. This level of automation is not classical, as this strategy is built into the SDN infrastructure. 

Secondly, it integrates network operations and services with virtualization infrastructure managers such as OpenStack, vCenter, OpenDaylight, or, at an advanced level, OpenShift SDN networking. How does the network adapt to the instantiation of new workloads via these systems? The network admin should not even be in the loop if, for example, a new virtual machine (VM) is created. 

There should be a signal that a VM with specific configurations should be created, which is then propagated to all fabric elements. You shouldn’t need to touch the network when the virtualization infrastructure managers provide a new service. This represents the ultimate in agility as you are removing the network components. 

The first steps of creating a software-defined data center

It is agreed that agility is a necessity. So, what is the prime step? One critical step is creating a software-defined data center that will allow the rapid deployment of computing and storage for workloads. In addition to software-defined computing and storage, the network must be automated and not be an impediment. 

The five critical layers of technology

To achieve software-defined agility for the network, we need an affordable solution that delivers on five essential layers of technology:

  1. Comprehensive telemetry/granular visibility into endpoints and traffic traversing the network fabric for performance monitoring and rapid troubleshooting.
  2. Network virtualization overlay, like computer virtualization, abstracts the network from the physical hardware for increased agility and segmentation.
  3. Software-defined networking (SDN) to control and automate the physical underlay, eliminating the mundane and error-prone box-by-box configuration.
  4. Open network underlay is a cost-effective physical infrastructure with no proprietary hardware lock-in that can leverage open source.
  5. Open Networking solutions are a must, as understanding the implications of open source in large, complex data center environments is essential.

The Future of Open Networking:

Open Networking will be crucial in shaping the future as technology evolves. The rise of 5G, the Internet of Things (IoT), and artificial intelligence (AI) will require highly agile, scalable, and intelligent networks. Open Networking’s flexibility and interoperability will meet these demands and enable a connected future.

Summary: Open Networking

Networking is vital in bringing people and ideas together in today’s interconnected world. Traditional closed networks have their limitations, but with the emergence of open networking, a new era of connectivity and collaboration has dawned. This blog post explored the concept of open networking, its benefits, and its impact on various industries and communities.

Section 1: What is Open Networking?

Open networking uses open standards, open-source software, and open APIs to build and manage networks. Open networking promotes interoperability, flexibility, and innovation, unlike closed networks that rely on proprietary systems and protocols. It allows organizations to customize and optimize their networks based on their unique requirements.

Section 2: Benefits of Open Networking

2.1 Enhanced Scalability and Agility

Open networking enables organizations to scale their networks more efficiently and adapt to changing needs. Decoupling hardware and software makes adding or removing network components easier, making the network more agile and responsive.

2.2 Cost Savings

With open networking, organizations can choose hardware and software components from multiple vendors, promoting competition and reducing costs. This eliminates vendor lock-in and allows organizations to use cost-effective solutions without compromising performance or reliability.

2.3 Innovation and Collaboration

Open networking fosters innovation by encouraging collaboration among vendors, developers, and users. Developers can create new applications and services that leverage the network infrastructure with open APIs and open-source software. This leads to a vibrant ecosystem of solutions that continually push the boundaries of what networks can achieve.

Section 3: Open Networking in Various Industries

3.1 Telecommunications

Open networking has revolutionized the telecommunications industry. Telecom operators can now build and manage their networks using standard hardware and open-source software, reducing costs and enabling faster service deployments. It has also paved the way for adopting virtualization technologies like Network Functions Virtualization (NFV) and Software-Defined Networking (SDN).

3.2 Data Centers

In the world of data centers, open networking has gained significant traction. Data center operators can achieve greater agility and scalability by using open standards and software-defined networking. Open networking also allows for better integration with cloud platforms and the ability to automate network provisioning and management.

3.3 Enterprise Networks

Enterprises are increasingly embracing open networking to gain more control over their networks and reduce costs. Open networking solutions offer greater flexibility regarding hardware and software choices, enabling enterprises to tailor their networks to meet specific business needs. It also facilitates seamless integration with cloud services and enhances network security.

Conclusion:

Open networking has emerged as a powerful force in today’s digital landscape. Its ability to promote interoperability, scalability, and innovation makes it a game-changer in various industries. Whether revolutionizing telecommunications, transforming data centers, or empowering enterprises, open networking connects the world in ways we never thought possible.

Zero Trust Security Strategy

In this fast-paced digital era, where cyber threats are constantly evolving, traditional security measures alone are no longer sufficient to protect sensitive data. This is where the concept of Zero Trust Security Strategy comes into play. In this blog post, we will delve into the principles and benefits of implementing a Zero Trust approach to safeguard your digital assets.

Zero Trust Security is a comprehensive and proactive security model that challenges the traditional perimeter-based security approach. Instead of relying on a trusted internal network, Zero Trust operates on the principle of “never trust, always verify.” It requires continuous authentication, authorization, and strict access controls to ensure secure data flow throughout the network.

Highlights: Zero Trust Security Design

Networks are Complex

Today’s networks are complex beasts, and reaching a fully zero trust network design is a long journey; it means different things to different people. Networks these days are heterogeneous, hybrid, and dynamic. Over time, technologies have been adopted, from punch card coding to the modern-day cloud, container-based virtualization, and distributed microservices.

This complex situation leads to a dynamic and fragmented network along with fragmented processes. The problem is that enterprises over-focus on connectivity without fully understanding security. Just because you connect does not mean you are secure.

Rise in Security Breaches

Unfortunately, this misconception may allow the most significant breaches. Those who can move towards a zero-trust environment with a zero-trust security strategy gain new techniques that can help prevent breaches, such as zero trust microsegmentation and zero trust networking, along with Remote Browser Isolation technologies that render web content remotely.

 

Related: For pre-information, you may find the following posts helpful:

  1. Identity Security
  2. Technology Insight For Microsegmentation
  3. Network Security Components

 



Zero Trust and Microsegmentation

Key Zero Trust Security Strategy Discussion points:


  • People overfocus on connectivity and forget security.

  • Control vs. visibility.

  • Starting a data-centric model.

  • Automation and Orchestration.

  • Starting a Zero Trust security journey.

 

Back to basics with the Zero Trust Security Design

Traditional perimeter model

The security zones are formed with a firewall/NAT device between the internal network and the internet. There is the internal “secure” zone, the DMZ (also known as the demilitarized zone), and the untrusted zone (the internet). If this organization needed to interconnect with another at some point in the future, a device would be placed on that boundary similarly. The neighboring organization will likely become a new security zone, with particular rules about traffic going from one to the other, just like the DMZ or the secure area.

 

 Key Components of Zero Trust

To effectively implement a Zero Trust Security Strategy, several crucial components need to be considered. These include:

1. Identity and Access Management (IAM): Implementing strong IAM practices ensures that only authenticated and authorized users can access sensitive resources.

2. Microsegmentation: By dividing the network into smaller segments, microsegmentation limits lateral movement and prevents unauthorized access to critical assets.

3. Least Privilege Principle: Granting users the least amount of privileges necessary to perform their tasks minimizes the risk of unauthorized access and potential data breaches.

Advantages of Zero Trust Security

Adopting a Zero Trust Security Strategy offers numerous benefits for organizations:

1. Enhanced Security: Zero Trust ensures a higher level of security by continually verifying and validating access requests, reducing the risk of insider threats and external breaches.

2. Improved Compliance: With stringent access controls and continuous monitoring, Zero Trust aids in meeting regulatory compliance requirements.

3. Reduced Attack Surface: Microsegmentation and strict access controls minimize the attack surface, making it harder for cybercriminals to exploit vulnerabilities.

Challenges and Considerations

While Zero Trust Security Strategy offers great potential, its implementation comes with challenges. Some factors to consider include:

1. Complexity: Implementing Zero Trust can be complex, requiring careful planning, collaboration, and integration of various security technologies.

2. User Experience: Striking a balance between security and user experience is crucial. Overly strict controls may hinder productivity and frustrate users.

 

Zero trust and microsegmentation 

The concept of zero trust and micro segmentation security allows organizations to execute a Zero Trust model by erecting secure micro-perimeters around distinct application workloads. Organizations can eliminate zones of trust that increase their vulnerability by acquiring granular control over their most sensitive applications and data. It enables organizations to achieve a zero-trust model and helps ensure the security of workloads regardless of where they are located.

 

Control vs. visibility

Zero trust and microsegmentation overcome this with an approach that provides visibility over the network and infrastructure to ensure you follow security principles such as least privilege. Essentially, you are giving up control but also gaining visibility. This provides the ability to understand all the access paths in your network. 

For example, within a Kubernetes environment, administrators probably don’t know how the applications connect to your on-premises data center or get Internet connectivity visibility. Hence, one should strive to give up control for visibility to understand all the access paths. Once all access paths are known, you need to review them consistently in an automated manner.

 

zero trust security strategy
Diagram: Zero trust security strategy. The choice of control over visibility.

 

Zero Trust Security Strategy

The move to zero trust security strategy can assist in gaining the adequate control and visibility needed to secure your networks. However, it consists of a wide spectrum of technologies from multiple vendors. For many, embarking on a zero trust journey is considered a data- and identity-centric approach to security instead of what we initially viewed as a network-focused journey.  

 

Zero Trust Security Strategy: Data-Centric Model

Zero trust and microsegmentation

In pursuit of zero trust and microsegmentation, it is recommended to abandon traditional perimeter-based security and focus on the zero trust reference architecture and its data. An organization that understands and maps its data flows can then create a micro perimeter of control around its sensitive data assets to gain visibility into how the data is used. Ideally, you need to identify your data and map its flow. Many claim that zero trust starts with the data, and the first step in building a zero trust security architecture is identifying your sensitive data and mapping its flow.

We understand that you can’t protect what you cannot see; gaining correct visibility of the data and understanding the data flow are critical. However, securing your data, even though it is the most crucial step, may not be your first zero trust step. Why? It’s a complex task.

 

zero trust environment
Diagram: Zero trust environment. The importance of data.

 

Start a zero trust security strategy journey

For a successful Zero Trust Network (ZTN), I would recommend starting with one aspect of zero trust as a project and then working your way out from there. When implementing disruptive technologies that are complex to roll out, we should focus on outcomes, gain small results, and then repeat and expand.

 

  • A key point. Zero trust automation

This would be similar to how you may start an automation journey. Rolling out automation is considered risky. It brings consistency and a lot of peace of mind when implemented correctly. But simultaneously, if you start with advanced automation use cases, there could be a large blast radius.

As a best practice, I would start your automation journey with config management and continuous remediation, and then move to more advanced use cases throughout your organization, such as edge networking, full security (firewall, PAM, IDPS, etc.), and CI/CD integration.
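
As a sketch of that starting point, the playbook below asserts a single, low-risk baseline item, a login banner, across a group of IOS devices. The inventory group and banner text are illustrative; running with --check first shows any drift without changing anything.

    - name: Enforce a login-banner baseline (sketch)
      hosts: ios_devices              # hypothetical inventory group
      gather_facts: false
      tasks:
        - name: Ensure the approved login banner is configured
          cisco.ios.ios_banner:
            banner: login
            text: Authorized access only. Activity is monitored.
            state: present

A dry run such as ansible-playbook baseline.yml --check reports what would change; dropping --check remediates the drift.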

 

  • A key point: You can’t be 100% zero trust

It is impossible to be 100% secure. You can only strive to be as secure as possible without hindering agility. It is similar to that of embarking on a zero-trust project. It is impossible to be 100% zero trust as this would involve turning off everything and removing all users from the network. We could use single-packet authorization without sending the first packet! 

 

Do not send a SPA packet

When doing so, we would keep the network and infrastructure dark without sending the first SPA packet to kick off single-packet authentication. However, lights must be on, services must be available, and users must access the services without too much interference. Users expect some downtime. Nothing can be 100% reliable all of the time.

Then you can balance velocity and stability with practices such as chaos engineering on Kubernetes. But users don’t want to hear of a security breach.

 

zero trust journey
Diagram: Zero trust journey. What is your version of trust?

 

  • A key point. What is trust?

So the first step toward zero trust is to determine a baseline: not a baseline for the network and security, but a baseline of trust. Zero trust is different for each organization, and it boils down to the level of trust: what level does your organization consider zero trust, and what mechanisms do you have in place?

There are many avenues of correlation and enforcement to reach the point where you can call yourself a zero trust environment. It may never become a zero trust environment but is limited to certain zones, applications, and segments that share a standard policy and rule base.

 

  • A key point: Choosing the vendor

Also, regarding vendor selection: can zero trust be achieved with a single vendor? No one should consider implementing zero trust with a one-vendor solution. However, many zero trust elements can be implemented under a SASE definition known as Zero Trust SASE.

In reality, there are too many pieces to a zero-trust project, and not one vendor can be an expert on them. Once you have determined your level of trust and what you expect from a zero-trust environment, you can move to the main zero-trust element and follow the well-known zero-trust principles. Firstly, automation and orchestration. You need to automate, automate and automate.

 

zero trust reference architecture
Diagram: Zero trust reference architecture.

 

Zero Trust Security Strategy: The Components

Automation and orchestration

Zero trust is impossible to maintain without automation and orchestration. Firstly, you need identification of data along with access requirements, and all of this must be defined together with the network components and policies, so that if there is a violation, you know how to reclaim your posture without human intervention. This is where automation comes to light; it is a powerful tool in your zero trust journey and should be enabled end-to-end throughout your enterprise.

An enterprise-grade zero trust solution must work quickly, with the scaling ability to improve automated responses and reactions to internal and external threats. The automation and orchestration stage defines and manages the micro perimeters to provide the new and desired connectivity. The Ansible architecture consists of Ansible Tower and Ansible Core, the CLI-based engine, giving a platform approach to automation.

 

Zero trust automation

With the matrix of identities, workloads, locations, devices, and data continuing to grow more complicated, automation provides a necessity. And you can have automation in different parts of your enterprise and at different levels. 

You can have pre-approved playbooks stored in a Git repository that can be version controlled with a Source Control Management system (SCM). Storing playbooks in a Git repository puts all playbooks under source control, so everything is better managed.

Then you can use different security playbooks already approved for different security use cases. Also, when you bring automation into zero-trust environments, Ansible variables can separate site-specific information from the playbooks, which makes your playbooks more flexible. You can also have variables specific to the inventory, known as Ansible inventory variables.

 

  • Schedule zero trust playbooks under version control

For example, you can kick off a playbook to run at midnight daily to check that patches are installed. If there is a deviation from a baseline, the playbook could send notifications to relevant users and teams.
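
A sketch of such a nightly job: the first task performs a dry-run package update to detect hosts that have fallen behind the patch baseline, and the second task raises an alert only when drift is found. The webhook URL is hypothetical; in Tower/AAP, the midnight schedule and notifications would typically hang off the job template instead.

    - name: Nightly patch-level baseline check (sketch)
      hosts: all
      become: true
      tasks:
        - name: Detect pending updates without applying them
          ansible.builtin.package:
            name: "*"
            state: latest
          check_mode: true            # dry run only; never changes the host
          register: patch_state

        - name: Alert when the host deviates from the baseline
          ansible.builtin.uri:
            url: https://hooks.example.com/alerts   # hypothetical alerting endpoint
            method: POST
            body_format: json
            body:
              host: "{{ inventory_hostname }}"
              pending_updates: true
          when: patch_state is changed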

 

Ansible Tower: Delegation of Control

I use Ansible Tower, which has built-in playbook scheduling and notifications, for many of my security baselines. I can combine this with the “check” feature so that less experienced team members can run playbook “sanity” checks without needing the full rights to perform change tasks.

Role-based access control can be tightly scoped for even better delegation of control. You can integrate Ansible Tower with your security appliances for advanced security use cases. Now we have tight integration between security and automation. Integration is essential; unified automation approaches require integration between your automation platform and your security technologies.

 

Security integration with automation

For example, we can have playbooks that automatically collect logs from all your firewall devices. These can be automatically sent back to a log storage backend for the analysts, where machine learning (ML) algorithms can perform threat hunting and look for any deviations.
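
A simplified version of such a collection playbook, assuming Linux-based firewalls reachable over SSH; the log path and destination directory are placeholders for whatever your log storage backend ingests.

    - name: Collect firewall logs for analysis (sketch)
      hosts: firewalls                # hypothetical inventory group
      gather_facts: false
      tasks:
        - name: Pull the current traffic log back to the control node
          ansible.builtin.fetch:
            src: /var/log/traffic.log              # hypothetical log path
            dest: /backup/logs/{{ inventory_hostname }}/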

Also, I find Ansible Tower’s workflow templates handy; they can be used to chain different automation jobs into one coherent workflow. So now we can chain different automation events together, and you can trigger follow-on actions based on success, failure, or always.

 

  • A key point – Just alert and not block

You could just run a playbook to raise an alert. It does not necessarily mean you should block. I would only block something when necessary. So we are using automation to instantiate a playbook to bring those entries that have deviated from the baseline back into what you consider to be zero trust. Or we can automatically move an endpoint into a sandbox zone. So the endpoint can still operate but with less access. 

Consider that when you first implemented network access control (NAC), you didn’t block everything immediately; you allowed traffic through and logged it for some time, and from this you built a baseline. I recommend the same approach for automation and orchestration: when I do block something, I recommend adding human approval to the workflow.

 

zero trust automation
Diagram: Zero trust automation. Adaptive access.

 

Zero Trust Least Privilege, and Adaptive Access

Enforcement points and flows

As you build out the enforcement points, the decision can be a binary yes or no, similar to a firewall’s binary rules, and some authentication mechanisms work the same way. However, you should also monitor for anomalies in things like flows. You must stop trusting packets as if they were people and eliminate the idea of trusted and untrusted networks.

 

Identity centric design

Rather than using IP addresses to base policies on, zero trust policies are based on logical attributes. This ensures an identity-centric design around the user identity, not the IP address. This is a key component of zero trust, how you can have adaptive access for your zero trust versus a simple yes or no. Again, following a zero trust identity approach is easier said than done. 

 

  • A key point: Zero trust identity approach

With a zero trust identity approach, the identity should be based on logical attributes, for example, the multi-factor authentication (MFA), transport layer security (TLS) certificate, the application service, or the use of a logical label/tag. Tagging and labeling are good starting points as long as those tags and labels make sense when they flow across different domains. Also, consider the security controls or tagging offered by different vendors.

How do you utilize the different security controls from different vendors, and more importantly, how do you use them adjacent to one another? For example, Palo Alto utilizes an App-ID, a patented traffic classification system. Please keep in mind vendors such as Cisco have end-to-end tagging and labeling when you integrate all of their products, such as the Cisco ACI and SD-Access.

Zero trust environment and adaptive access

Adaptive access control uses policies that allow administrators to control user access to applications, files, and network features based on multiple real-time factors. Not only are there multiple factors to consider, but these are considered in real-time. What we are doing is responding to potential threats in real-time by continually monitoring user sessions for a variety of factors. We are not just looking at IP or location as an anchor for trust.

 

  • Pursue adaptive access

Anything tied to an IP address is useless. Adaptive access is more of an advanced zero trust technology, likely later in the zero trust journey. Adaptive access is not something you would initially start with.

 

 Micro segmentation and zero trust security
Diagram: Micro segmentation and zero trust security.

 

Zero Trust and Microsegmentation 

VMware introduced the concept of microsegmentation to data center networking in 2014 with VMware NSX micro-segmentation. And it has grown in usage considerably since then. It is challenging to implement and requires a lot of planning and visibility.

Zero trust and microsegmentation security enforce the security of a data center by monitoring the flows inside the data center. The main idea is that in addition to network security at the perimeter, data center security should focus on the attacks and threats from the internal network.

 

Small and protected isolated sections

With zero trust and microsegmentation security, the traffic inside the data center is differentiated into small isolated parts, i.e., micro-segments depending on the traffic type and sensitivity level. A strict micro-granular security model that ties security to individual workloads can be adopted.

Security is not simply tied to a zone; we are going to the workload level to define the security policy. By creating a logical boundary between the requesting resource and protected assets, we have minimized lateral movement elsewhere in the network, gaining east-west segmentation.

 

Zero trust and microsegmentation

It is often combined with micro perimeters. By shrinking the security perimeter of each application, we can control a user’s access to the application from anywhere and any device without relying on large segments that may or may not have intra-segment filtering.

 

  • Use case: Zero trust and microsegmentation:  5G

Micro segmentation aligns multiple security tools and their capabilities with specific policies. One example of building a micro perimeter into a 5G edge is with containers. The completely new use cases and services included in 5G raise significant concerns about the security of the mobile network and therefore require a different approach to segmentation.

 

Micro segmentation and 5G

In a 5G network, a micro segment can be defined as a logical network portion decoupled from the physical 5G hardware. Several micro-segments can then be chained together to create end-to-end connectivity that maintains application isolation. So we have end-to-end security based on micro segmentation, and each micro segment can have fine-grained access controls.

 

  • A key point: Zero trust and microsegmentation: The solutions

A significant proposition for enabling zero trust is micro segmentation and micro perimeters. Their use must be clarified upfront. Essentially, their purpose is to minimize and contain the breach (when it happens). Rather than using IP addresses to base segmentation policies, the policies are based on logical constructs. Not physical attributes. 

 

Monitor flows and alert

Ideally, favor vendors whose micro segmentation solutions monitor baseline flows and alert on anomalies. They should also continuously assess the relative level of risk/trust based on the network session behavior observed. This may include unusual connectivity patterns, excessive bandwidth, excessive data transfers, and communication with URLs or IP addresses with a lower level of trust.

 

Micro segmentation in networking

The level of complexity comes down to what you are trying to protect. This can be something on the edges, such as a 5G network point, IoT, or something central to the network. Both of which may need physical and logical separation. A good starting point for your micro segmentation journey is to build a micro segment but not in enforcement mode. So you are starting with the design but not implementing it fully. The idea is to watch and gain insights before you turn on the micro segment.

 

Containers and Zero Trust

Let us look at a practical example of applying the zero trust principles to containers. There are many layers within a container-based architecture to which you can apply zero trust. For communication with the containers, we have two layers: the nodes and the services. The services communicate through a service mesh, with a mutual TLS (mTLS) type of solution controlling the communication between them.

Then we have the application overall, which is where you have the ingress and egress access points.
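
At the service layer, a Kubernetes NetworkPolicy is one standard way to express this least-privilege, east-west control. The sketch below, with hypothetical namespace, labels, and port, admits traffic to the payments pods only from the frontend pods; everything else is denied by omission.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: payments-allow-frontend   # hypothetical name
      namespace: payments             # hypothetical namespace
    spec:
      podSelector:
        matchLabels:
          app: payments-api
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  app: frontend
          ports:
            - protocol: TCP
              port: 8443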


 

The OpenShift secure route

OpenShift networking SDN is similar to a routing control platform based on Open vSwitch; it operates with the OVS bridge programmed with OVS rules. OpenShift also has what’s known as a route construct. Routes provide access to specific services, and the service then acts as a software load balancer for the correct pods. So we have a route construct that sits in front of the services. This abstraction layer and the OVS architecture bring many benefits to security.

 

openshift sdn
Diagram: Openshift SDN.

 

The service is the first level of exposing applications, but services are unrelated to DNS name resolution. To make services reachable by FQDN, we use the OpenShift route resource, and the route provides the DNS entry. In Kubernetes terms, we would use an Ingress, which exposes services to the external world. However, in OpenShift, it is best practice to use routes; routes are an alternative to Ingress.

 

OpenShift security: OpenShift SDN and the secure route 

One of the advantages of the OpenShift route construct is that you can have secure routes. Secure routes provide advanced features that might not be supported by standard Kubernetes Ingress controllers, such as TLS re-encryption, TLS passthrough, and split traffic for blue-green deployments. 
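
A minimal secure route manifest, as a sketch with hypothetical hostname and service names: the reencrypt termination means traffic is TLS from the client to the router and then re-established as TLS from the router to the pod.

    apiVersion: route.openshift.io/v1
    kind: Route
    metadata:
      name: payments                    # hypothetical route name
    spec:
      host: payments.apps.example.com   # hypothetical hostname
      to:
        kind: Service
        name: payments-api
      tls:
        termination: reencrypt
        insecureEdgeTerminationPolicy: Redirect   # send plain HTTP clients to HTTPS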

Securing containerized environments is considerably different from securing the traditional monolithic application because of the inherent nature of the microservices architecture. A monolithic application has few entry points, for example, ports 80 and 443. 

Not every monolithic component is exposed to external access and must accept requests directly. Now with a secure openshift route, we can implement security where it matters most and at any point in the infrastructure. 

 

Context Based Authentication

For zero trust, it depends on what you can do with the three different types of layers. The layer you want to apply zero trust depends on the context granularity. For context-based authentication, you need to take in as much context as possible to make access decisions, and if you can’t, what are the mitigating controls?

You can’t just block. We have identity-based controls versus the traditional network-type perimeter of controls. If you cannot rely on the identity and context information, you fall back to network-based controls, as we did initially. Network-based controls have been around for decades and create holes in the security posture. 

However, suppose you are not at a stage to implement access based on identity and context information. In that case, you may need to keep the network-based control and look deeper into your environment where you can implement zero trust to regain a good security posture. This is a perfect example of why you implement zero trust in isolated areas.

 

  • Examine zero trust layer by layer.

So you should look layer by layer at specific use cases and then at which technology components you can apply zero trust principles to. It is not a question of starting with identity or micro segmentation; the result should be a combination of both. However, identity is the crown jewel to look after: take in as much context as possible to make access decisions and keep threats out. 

 

Take a data-centric approach. Zero trust data

Gaining visibility into the interaction between users, apps, and data across many devices and locations is imperative. This allows you to set and enforce policies irrespective of location. A data-centric approach takes location out of the picture. It comes down to “WHAT,” which is always the data. What are you trying to protect? So you should build out the architecture method over the “WHAT.”

 

Zero Trust Data Security

  • Step 1: Identify your sensitive data 

You can’t protect what you can’t see. Everything managed disparately within a hybrid network needs to be fully understood and consolidated into a single console. Secondly, once you know how things connect, how do you ensure they don’t reconnect through a broader definition of connectivity?

You can’t just rely on IP addresses anymore to implement security controls. So here, we need to identify and classify sensitive data. By defining your data, you can identify sensitive data sources to protect. Next, simplify your data classification. This will allow you to segment the network based on data sensitivity. When creating your first zero trust micro perimeter, start with a well-understood data type or system.

 

  • Step 2: Zero trust and microsegmentation

Micro segmentation software that segments the network based on data sensitivity  

Secondly, you need to segment the network based on data sensitivity. Here we are defining a micro perimeter around sensitive data. Once you determine the optimal flow, identify where to place the micro perimeter. Remember that virtual networks are designed to optimize network performance; they can’t prevent malware propagation, lateral movement, or unauthorized access to sensitive data. Think of the VLAN: it was introduced for performance but ended up being used as a security tool.

 

A final note: Firewall micro segmentation

Enforce micro perimeter with physical or virtual security controls. There are multiple ways to enforce micro perimeters. For example, we have NGFW from a vendor like Check Point, Cisco, Fortinet, or Palo Alto Networks.  If you’ve adopted a network virtualization platform, you can opt for a virtual NGFW to insert into the virtualization layer of your network. You don’t always need an NGFW to enforce network segmentation; software-based approaches to microsegmentation are also available.

 

Conclusion:

In conclusion, Zero Trust Security Strategy is an innovative and robust approach to protect valuable assets in today’s threat landscape. By rethinking traditional security models and enforcing strict access controls, organizations can significantly enhance their security posture and mitigate risks. Embracing a Zero Trust mindset is a proactive step towards safeguarding against ever-evolving cyber threats.

 

Cisco ACI Components

In today's rapidly evolving digital landscape, businesses constantly seek innovative solutions to streamline their network infrastructure. Enter Cisco ACI (Application Centric Infrastructure), a groundbreaking technology that promises to revolutionize how networks are designed, deployed, and managed.

In this blog post, we will delve into the intricacies of Cisco ACI, its key features, and the benefits it brings to organizations of all sizes.

Cisco ACI is an advanced software-defined networking (SDN) solution that enables organizations to build and manage their networks in a more holistic and application-centric manner. By abstracting network policies and services from the underlying hardware, ACI provides a unified and programmable approach to network management, making it easier to adapt to changing business needs.

Table of Contents

Highlights: Cisco ACI Components

Hardware-based Underlay

In ACI, hardware-based underlay switching offers a significant advantage over software-only solutions due to specialized forwarding chips. Furthermore, thanks to Cisco’s ASIC development, ACI brings many advanced features, including security policy enforcement, microsegmentation, dynamic policy-based redirect (inserting external L4-L7 service devices into the data path), or detailed flow analytics—besides the vast performance and flexibility.

The Legacy data center

The legacy data center topologies have a static infrastructure that specifies the constructs to form the logical topology. We must configure the VLANs, the Layer 2/Layer 3 interfaces, and the protocols we need on the individual devices. Also, the process we used to define these constructs was manual. We may have used Ansible playbooks to back up configurations or check specific network parameters, but we generally operated with a statically defined process.

  • Poor Resources

The main roadblock to application deployment was the physical bare-metal server. It was chunky and could only host one application due to the lack of process isolation. So, the network had one application per server to support and provide connectivity to. This is the opposite of how ACI Cisco, also known as Cisco SDN ACI, operates.

Related: For pre-information, you may find the following helpful:

  1. Data Center Security 
  2. VMware NSX



Cisco SDN ACI 

Key ACI Cisco Discussion points:


  • Birth of virtualization and SDN.

  • Cisco ACI integrations.

  • ACI design and components.

  • VXLAN networking and ECMP.

  • Focus on ACI and SD-WAN.

Back to Basics: Cisco ACI components

Key Features of Cisco ACI

a) Application-Centric Policy Model: Cisco ACI allows administrators to define and manage network policies based on application requirements rather than traditional network constructs. This approach simplifies policy enforcement and enhances application performance and security.

b) Automation and Orchestration: With Cisco ACI, network provisioning and configuration tasks are automated, reducing the risk of human error and accelerating deployment times. The centralized management framework enables seamless integration with orchestration tools, further streamlining network operations.

c) Scalability and Flexibility: ACI’s scalable architecture ensures that networks can grow and adapt to evolving business demands. Spine-leaf topology and VXLAN overlay technology allow for seamless expansion and simplify the deployment of multi-site and hybrid cloud environments.

Cisco Data Center

Cisco ACI

Key Features

  • Application-Centric Policy Model

  • Automation and Orchestration

  • Scalability and Flexibility

  • Built-in Security 

Cisco Data Center

Cisco ACI 

Key Advantages

  • Enhanced Security

  • Agility and Time-to-Market

  • Simplified Operations

  • Open software flexibility for DevOps teams.

Benefits of Cisco ACI

a) Enhanced Security: By providing granular microsegmentation and policy-based controls, Cisco ACI helps organizations strengthen their security posture. Malicious lateral movement within the network can be mitigated, reducing the attack surface and preventing data breaches.

b) Agility and Time-to-Market: The automation capabilities of Cisco ACI significantly reduce the time and effort required for network provisioning and changes. This agility enables organizations to respond faster to market demands, launch new services, and gain a competitive edge.

c) Simplified Operations: The centralized management and policy-driven approach of Cisco ACI simplify network operations, leading to improved efficiency and reduced operational costs. The intuitive user interface and comprehensive analytics provide administrators with valuable insights, enabling proactive troubleshooting and optimization.

The Cisco ACI SDN Solution

Cisco ACI is a software-defined networking (SDN) solution that integrates with software and hardware. With the ACI, we can create software policies and use hardware for forwarding, an efficient and highly scalable approach offering better performance. The hardware for ACI is based on the Cisco Nexus 9000 platform product line. The APIC centralized policy controller drives the software, which stores all configuration and statistical data.

Nexus Family

To build the ACI underlay, you must exclusively use the Nexus 9000 family of switches. You can choose from modular Nexus 9500 switches or fixed 1U to 2U Nexus 9300 models. Specific models and line cards are dedicated to the spine function in ACI fabric; others can be used as leaves, and some can be used for both purposes. You can combine various leaf switches inside one fabric without any limitations.

Spine and Leaf

For Nexus 9000 switches to be used as an ACI spine or leaf, they must be equipped with powerful Cisco CloudScale ASICs manufactured using 16-nm technology. The following figure shows the Cisco ACI based on the Nexus 9000 series. Cisco Nexus 9300 and 9500 platform switches support Cisco ACI. As a result, organizations can use them as the spine or leaf switches to fully utilize an automated, policy-based systems management approach. 

Cisco ACI Components
Diagram: Cisco ACI Components. Source is Cisco
  • A key point: The birth of virtualization

Server virtualization helped to a degree where we could decouple workloads from the hardware, making the compute platform more scalable and agile. However, the server is not the main interconnection point for network traffic. So, we need to look at how we could virtualize the network infrastructure in a way similar to the agility gained from server virtualization.

This is carried out with software-defined networking and overlays that could map network endpoints and be spun up and down as needed without human intervention. In addition, the SDN architecture includes an SDN controller and an SDN network that enables an entirely new data center topology.

server virtualization
Diagram: The need for virtualization and software-defined networking.

ACI Cisco: Integrations

Routing Control Platform

Then came along Cisco SDN ACI, the ACI Cisco, which operates differently from the traditional data center with an application-centric infrastructure. The Cisco application-centric infrastructure achieves resource elasticity with automation through standard policies for data center operations and consistent policy management across multiple on-premises and cloud instances.

It uses a Software-Defined Networking (SDN) architecture like a routing control platform. The Cisco SDN ACI also provides a secure networking environment for Kubernetes. In addition, it integrates with various other solutions, such as Red Hat OpenShift networking.
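
Because the APIC exposes everything through a REST API, those standard policies can themselves be automated. Below is a minimal sketch using the cisco.aci Ansible collection; the APIC hostname, credentials variable, and tenant name are hypothetical.

    - name: Ensure an ACI tenant exists (sketch)
      hosts: localhost
      gather_facts: false
      tasks:
        - name: Create or verify the prod tenant on the APIC
          cisco.aci.aci_tenant:
            host: apic.example.com          # hypothetical APIC address
            username: admin
            password: "{{ apic_password }}"
            validate_certs: false           # lab setting; validate certificates in production
            tenant: prod
            description: Production workloads
            state: present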

Cisco ACI: Integration Options

What makes the Cisco ACI interesting is its several vital integrations. I’m not talking only about extending the data center with multi-pod and multi-site; there are integrations with, for example, AlgoSec, Cisco AppDynamics, and SD-WAN. AlgoSec enables secure application delivery and policy across hybrid network estates, AppDynamics lives in the world of distributed systems observability, and SD-WAN enables per-application path performance over virtual WANs.

Cisco ACI Components: ACI Cisco and Multi-Pod

Cisco ACI Multi-Pod is part of the “Single APIC Cluster / Single Domain” family of solutions, as a single APIC cluster is deployed to manage all the interconnected ACI networks. These separate ACI networks are named “pods,” and each looks like a regular two-tier spine-leaf topology. The same APIC cluster can manage several pods, and to increase the resiliency of the solution, the various controller nodes that make up the cluster can be deployed across different pods.

ACI Multi-Pod
Diagram: Cisco ACI Multi-Pod. Source Cisco.

Cisco ACI Components: ACI Cisco and AlgoSec

With AlgoSec integrated with the Cisco ACI, we can now provide automated security policy change management for multi-vendor devices and risk and compliance analysis. The AlgoSec Security Management Solution for Cisco ACI extends ACI’s policy-driven automation to secure various endpoints connected to the Cisco SDN ACI fabric.

These simplify the network security policy management across on-premises firewalls, SDNs, and cloud environments. It also provides the necessary visibility into the security posture of ACI, even across multi-cloud environments. 

Cisco ACI Components: ACI Cisco and AppDynamics 

Then, with AppDynamics, we are heading into Observability and controllability. Now, we can correlate app health and network for optimal performance, deep monitoring, and fast root-cause analysis across complex distributed systems with numbers of business transactions that need to be tracked. This will give your teams complete visibility of your entire technology stack, from your database servers to cloud-native and hybrid environments. In addition, AppDynamics works with agents that monitor application behavior in several ways. We will examine the types of agents and how they work later in this post.

Cisco ACI Components: ACI Cisco and SD-WAN 

SD-WAN brings a software-defined approach to the WAN. These enable a virtual WAN architecture to leverage transport services such as MPLS, LTE, and broadband internet services. So, SD-WAN is not a new technology; its benefits are well known, including improving application performance, increasing agility, and, in some cases, reducing costs.

The Cisco ACI and SD-WAN integration makes active-active data center design less risky than in the past. The following figures give a high-level overview of the Cisco ACI and SD-WAN integration. For pre-information generic to SD-WAN, go here: SD-WAN Tutorial

SD WAN integration
Diagram: Cisco ACI and SD-WAN integration

The Cisco SDN ACI and SD-WAN Integration

The Cisco SDN ACI with SD-WAN integration helps ensure an excellent application experience by defining application Service-Level Agreement (SLA) parameters. Cisco ACI release 4.1(1i) added support for WAN SLA policies. This feature enables admins to apply pre-configured policies that specify the packet loss, jitter, and latency levels for the tenant traffic over the WAN.

When you apply a WAN SLA policy to the tenant traffic, the Cisco APIC sends the pre-configured policies to the vManage controller. The vManage controller, configured as an external device manager that provides the SD-WAN capability, chooses the best WAN link that meets the loss, jitter, and latency parameters specified in the SLA policy.

Cisco ACI Components: Openshift and Cisco SDN ACI

OpenShift Container Platform (formerly known as OpenShift Enterprise), or OCP, is Red Hat’s offering for an on-premises private platform as a service (PaaS). OpenShift is based on the Origin open-source project and is a Kubernetes distribution, the de facto standard for container-based virtualization. The foundation of OpenShift networking SDN is based on Kubernetes, and it therefore shares much of the same networking technology, along with some enhancements such as the OpenShift route construct.

Cisco ACI Components: Other data center integrations

Cisco SDN ACI has another integration with Cisco DNA Center/ISE that maps user identities consistently to endpoints and apps across the network, from campus to the data center. Cisco Software-Defined Access (SD-Access) provides policy-based automation from the edge to the data center and the cloud.

Cisco SD-Access provides automated end-to-end segmentation to separate user, device, and application traffic without redesigning the network. This integration will enable customers to use standard policies across Cisco SD-Access and Cisco ACI, simplifying customer policy management using Cisco technology in different operational domains.

Let us recap before we look at the ACI integrations in more detail.

The Cisco SDN ACI Design  

Introduction to leaf and spine

The Cisco SDN ACI works with a Clos architecture, a fully meshed ACI network based on a spine-leaf architecture. As a result, every leaf is physically connected to every spine, enabling traffic forwarding through non-blocking links. Physically, we have a set of leaf switches creating a leaf layer attached to the spines in a full bipartite graph.

This means that each leaf is connected to each spine, and each spine is connected to each leaf. The ACI uses a horizontally elongated leaf-and-spine architecture with one hop to every host in an entirely meshed ACI fabric, offering the throughput and convergence needed for today’s applications.

Cisco ACI
Diagram: Cisco ACI: Improving application performance.

The ACI fabric: Aggregate

A key point to note in the spine-and-leaf design is the fabric concept, which is like a stretched network. One of the core ideas of a fabric is that it does not aggregate traffic. This increases data center performance and supports a non-blocking architecture. With the spine-leaf topology, we are spreading the fabric across multiple devices.

The result of the fabric is that each edge device has the total bandwidth of the fabric available to every other edge device. This is one big difference from traditional data center designs, where traffic is aggregated by either stacking multiple streams onto a single link or carrying the streams serially.

SDN data center
Diagram: Cisco ACI fabric checking.

The issues with oversubscription

With the traditional 3-tier design, we aggregate everything at the core, leading to oversubscription ratios that degrade performance. With the ACI leaf-and-spine design, we spread the load across all devices with equidistant endpoints. Therefore, we can carry the streams in parallel.

Horizontal scaling load balancing

Then, we have horizontal scaling load balancing. Load balancing with this topology uses multipathing to achieve the desired bandwidth between the nodes. Although this forwarding paradigm can be based on Layer 2 forwarding (bridging) or Layer 3 forwarding (routing), the ACI leverages a routed approach to the Leaf and Spine design, with Equal-Cost Multi-Path (ECMP) for both Layer 2 and Layer 3 traffic.
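As a rough sketch of how ECMP keeps each flow on one path while spreading different flows across all uplinks, the snippet below hashes the 5-tuple to pick a Spine. Real switches do this in hardware with vendor-specific hash functions; this Python version only illustrates the idea, and all names are invented.

```python
import hashlib

def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, paths):
    """Pick one of several equal-cost Leaf-to-Spine links by hashing the
    5-tuple, so every packet of a given flow follows the same path."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]

spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
# Same flow always hashes to the same spine; different flows spread out.
print(ecmp_next_hop("10.0.1.10", "10.0.2.20", "tcp", 49152, 443, spines))
print(ecmp_next_hop("10.0.1.11", "10.0.2.20", "tcp", 49152, 443, spines))
```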

Highlighting the overlay and underlay

Mapping Traffic

So you may be asking how we can have a Layer 3 routed core and still pass Layer 2 traffic. This is done using the overlay, which can map different traffic types to different overlays. So, we can have Layer 2 traffic mapped to an overlay over a routed core. ACI links between the Leaf and the Spine switches are Layer 3 active-active links. Therefore, we can intelligently load balance and steer traffic to avoid issues, and we don’t need to rely on STP to block links or fix the topology.

When networks were first developed, there was no such thing as an application moving from one place to another while it was in use. So the original architects of IP, the communication protocol used between computers, used the IP address to mean both the identity of a device connected to the network and its location on the network.  Today, in the modern data center, we need to be able to communicate with an application or application tier, no matter where it is.

Overlay Encapsulation

One day, it may be in location A and the next in location B, but its identity, which we communicate with, is the same on both days. An overlay is when we encapsulate an application’s original message with the location to which it needs to be delivered before sending it through the network.

Once it arrives at its final destination, we unwrap it and deliver the original message as desired. The identities of the devices (applications) communicating are in the original message, and the locations are in the encapsulation, thus separating the location from the identity. This wrapping and unwrapping is done on a per-packet basis and, therefore, must be done quickly and efficiently.

Overlay and underlay components

The Cisco SDN ACI has a concept of overlay and underlay, forming a virtual overlay solution. The role of the underlay is to glue together devices so the overlay can work and be built on top. So, the overlay, which is VXLAN, runs on top of the underlay, which is IS-IS. In the ACI, the IS-IS protocol provides the routing for the overlay, which is why we can provide ECMP from the Leaf to the Spine nodes. The routed underlay provides an ECMP network where all leaves can access Spine and have the same cost links. 

ACI overlay
Diagram: Overlay. Source Cisco

Example: 

Let’s take a simple example to illustrate how this is done. Imagine that application App-A wants to send a packet to App-B. App-A is located on a server attached to switch S1, and App-B is initially on switch S2. When App-A creates the message, it will put App-B as the destination and send it to the network; when the message is received at the edge of the network, whether a virtual edge in a hypervisor or a physical edge in a switch, the network will look up the location of App-B in a “mapping” database and see that it is attached to switch S2.

It will then put the address of S2 outside of the original message. So, we now have a new message addressed to switch S2. The network will forward this new message to S2 using traditional networking mechanisms. Note that the location of S2 is very static, i.e., it does not move, so using traditional mechanisms works just fine.

Upon receiving the new message, S2 will remove the outer address and thus recover the original message. Since App-B is directly connected to S2, it can easily forward the message to App-B. App-A never had to know where App-B was located, nor did the network’s core. Only the edge of the network, specifically the mapping database, had to know the location of App-B. The rest of the network only had to see the location of switch S2, which does not change.

Let’s now assume App-B moves to a new location, switch S3. Now, when App-A sends a message to App-B, it does the same thing it did before, i.e., it addresses the message to App-B and gives the packet to the network. The network then looks up the location of App-B and finds that it is now attached to switch S3. So, it puts S3’s address on the message and forwards it accordingly. At S3, the message is received, the outer address is removed, and the original message is delivered as desired.

The movement of App-B was not tracked by App-A at all. The address of App-B identified App-B, while the address of the switch, S2 or S3, identified App-B’s location. App-A can communicate freely with App-B no matter where App-B is located, allowing the system administrator to place App-B in any location and move it as desired, thus achieving the flexibility needed in the data center.
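A minimal Python sketch of this identity/location split: the mapping database stores which switch each application sits behind, encapsulation stamps the location onto the packet, and a move only updates the mapping entry. The names mirror the App-A/App-B example above and are purely illustrative.

```python
# The mapping database tracks which switch (location) each application
# (identity) sits behind; the network core only forwards on locations.
mapping_db = {"App-B": "S2"}

def send(src_app, dst_app, payload):
    location = mapping_db[dst_app]              # edge lookup: where is App-B?
    return {"outer_dst": location,              # encapsulation: the location
            "inner": {"src": src_app, "dst": dst_app, "data": payload}}

def receive(packet):
    return packet["inner"]                      # decapsulate at the egress switch

pkt = send("App-A", "App-B", "hello")
print(pkt["outer_dst"])                          # S2: what the core routes on
print(receive(pkt)["dst"])                       # App-B: the unchanged identity

mapping_db["App-B"] = "S3"                       # App-B moves; only the map changes
print(send("App-A", "App-B", "hello")["outer_dst"])  # S3
```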

Multicast Distribution Tree (MDT)

On top of this, we have a Multicast Distribution Tree (MDT) used to forward multi-destination traffic without creating loops. The MDT is dynamically built to carry flood traffic for specific protocols, again without creating loops in the overlay network. The tunnels created for the endpoints to communicate have tunnel endpoints, known as VTEPs. The VTEP addresses are assigned to each Leaf switch from a pool that you specify during the ACI startup and discovery process.

Normalize the transports

VXLAN tunnels in the ACI fabric are used to normalize the transport in the ACI network. Traffic between endpoints is delivered over these VXLAN tunnels, so any transport network can be used regardless of the device connecting to the fabric.
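To see what that normalized transport looks like on the wire, the sketch below builds a VXLAN-encapsulated frame with the Scapy packet library (an assumption; Scapy is not part of ACI). The inner Ethernet frame rides inside a routed UDP/4789 packet between two leaf VTEPs; all addresses and the VNI are made up.

```python
# pip install scapy
from scapy.layers.inet import IP, UDP
from scapy.layers.l2 import Ether
from scapy.layers.vxlan import VXLAN

outer = (Ether() /
         IP(src="10.0.0.1", dst="10.0.0.2") /   # leaf VTEP to leaf VTEP
         UDP(sport=49152, dport=4789) /          # 4789 = IANA VXLAN port
         VXLAN(vni=10100))                       # segment identifier

inner = (Ether(src="00:00:00:aa:aa:aa", dst="00:00:00:bb:bb:bb") /
         IP(src="192.168.1.10", dst="192.168.1.20"))

frame = outer / inner
frame.show()   # dump the full encapsulation stack, outer headers first
```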

Building the VXLAN tunnels 

So, using VXLAN in the overlay enables any network, and you don’t need to configure anything special on the endpoints for this to happen. The endpoints that connect to the ACI fabric do not need special software or hardware. The endpoints send regular packets to the leaf nodes they are connected to directly or indirectly. As endpoints come online, they send traffic to reach a destination.

Bridge domain and VRF

Therefore, under the hood, the Cisco SDN ACI automatically builds the VXLAN overlay network for you. The VXLAN network is based on the Bridge Domain (BD) and VRF ACI constructs deployed to the leaf switches: the Bridge Domain is for Layer 2, and the VRF is for Layer 3. So, as devices come online and send traffic to each other, the overlay grows in reachability within the Bridge Domain or the VRF.
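If you want to see what these constructs look like as objects, here is a minimal sketch of creating a tenant with one VRF and one Bridge Domain through the APIC REST API. The class names (fvTenant, fvCtx, fvBD) come from the ACI management information model, but the hostname and credentials are placeholders, and the payload is a simplified illustration rather than a production configuration.

```python
import requests

APIC = "https://apic.example.com"   # placeholder APIC address
session = requests.Session()

# Authenticate; the APIC returns a session cookie the Session object keeps.
session.post(f"{APIC}/api/aaaLogin.json", verify=False,  # lab use only
             json={"aaaUser": {"attributes": {"name": "admin",
                                              "pwd": "secret"}}})

# Tenant with one VRF (fvCtx) and one BD (fvBD) bound to that VRF.
payload = {
    "fvTenant": {
        "attributes": {"name": "demo-tenant"},
        "children": [
            {"fvCtx": {"attributes": {"name": "demo-vrf"}}},
            {"fvBD": {
                "attributes": {"name": "demo-bd"},
                "children": [
                    {"fvRsCtx": {"attributes": {"tnFvCtxName": "demo-vrf"}}}
                ]}}
        ]
    }
}
resp = session.post(f"{APIC}/api/mo/uni.json", json=payload, verify=False)
print(resp.status_code)
```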

Horizontal scaling load balancing
Diagram: Horizontal scaling load balancing.

Routing for endpoints

Routing within each tenant VRF is based on host routing for endpoints directly connected to the Cisco ACI fabric. For IPv4, host routing is based on /32 routes, giving the ACI a very accurate picture of the endpoints. Therefore, we have exact routing in the ACI.

In conjunction, we have a COOP database running on the Spines that offers a remarkably optimized fabric in terms of knowing where all the endpoints are located. To facilitate this, every node in the fabric has a TEP address, and there are different types of TEPs depending on the role of the device. The Spine and the Leaf both have TEP addresses, but they differ from each other.

COOP database
Diagram: COOP database

The VTEP and PTEP

The Leaf nodes are the VXLAN Tunnel Endpoints (VTEPs). In ACI, these are known as PTEPs, the physical tunnel endpoints. The PTEP addresses represent the “WHERE” in the ACI fabric where an endpoint lives.

Cisco ACI uses a dedicated VRF and a subinterface of the uplinks from the Leaf to the Spines as the infrastructure to carry VXLAN traffic. In Cisco ACI terminology, the transport infrastructure for VXLAN traffic is known as Overlay-1, which is part of the tenant “infra.” 

The Spine TEP

The Spines also have a PTEP and an additional proxy TEP, which is used for forwarding lookups into the mapping database. The Spines have a global view of where everything is, held in the COOP database and synchronized across all Spine nodes. All of this is done automatically for you.

For this to work, the Spines have an Anycast IP address known as the Proxy TEP. The Leaf can use this address when it does not know where an endpoint is: it asks the Spine about any unknown endpoint, and the Spine checks the COOP database. This brings many benefits to the ACI solution, especially traffic optimization and reduced flooding in the ACI, giving us an optimized fabric with better performance.
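The lookup order can be sketched in a few lines of Python: the Leaf consults its local, demand-learned endpoint cache first and only punts unknown destinations to the Spine proxy TEP, which answers from the fabric-wide COOP view. The tables and names here are invented for illustration.

```python
leaf_cache = {"10.0.1.10": "leaf-101"}            # partial, demand-learned
coop_db = {"10.0.1.10": "leaf-101",               # authoritative, on the Spines
           "10.0.2.20": "leaf-102"}

def forward(dst_ip):
    if dst_ip in leaf_cache:                       # hit: tunnel directly
        return f"tunnel to {leaf_cache[dst_ip]}"
    if dst_ip in coop_db:                          # miss: ask the Spine proxy TEP
        leaf_cache[dst_ip] = coop_db[dst_ip]       # learn for next time
        return f"spine proxy -> tunnel to {coop_db[dst_ip]}"
    return "drop: endpoint unknown to the fabric"

print(forward("10.0.2.20"))   # resolved via the Spine proxy
print(forward("10.0.2.20"))   # now a local cache hit
print(forward("10.0.9.99"))   # unknown endpoint: dropped, not flooded
```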

Cisco ACI
Diagram: Routing control platform.

The ACI optimizations

Mouse and elephant flows

This design provides better performance when load balancing different flows. For example, in most data centers, we have latency-sensitive flows, known as mouse flows, and long-lived, bandwidth-intensive flows, known as elephant flows.

The ACI load-balances traffic more precisely using algorithms that optimize mouse and elephant flows and distribute traffic based on flowlets: flowlet load balancing. Within the Leaf and Spine fabric, latency is low and consistent from port to port. The maximum latency of a packet from one port to another is the same regardless of the network size, so you can scale the network without degrading performance. Scaling is often done on a POD-by-POD basis: for more extensive networks, each POD is its own Leaf and Spine network.
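A toy sketch of flowlet detection: if the idle gap between two packets of the same flow exceeds the path-delay skew, the next burst (a flowlet) can safely be re-hashed onto a different equal-cost path without reordering. The threshold and timestamps below are invented for illustration.

```python
FLOWLET_GAP = 0.5  # seconds; illustrative threshold, not a real ACI value

def split_flowlets(arrival_times, gap=FLOWLET_GAP):
    """Group packet arrival times into flowlets separated by idle gaps."""
    flowlets, current = [], [arrival_times[0]]
    for prev, now in zip(arrival_times, arrival_times[1:]):
        if now - prev > gap:        # idle gap long enough to switch paths
            flowlets.append(current)
            current = []
        current.append(now)
    flowlets.append(current)
    return flowlets

times = [0.00, 0.01, 0.02, 1.50, 1.51, 3.00]
for i, fl in enumerate(split_flowlets(times)):
    print(f"flowlet {i} -> may take a new equal-cost path: {fl}")
```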

ARP optimizations: Anycast gateways

The ACI comes with many traffic optimizations by default. Firstly, instead of using ARP and broadcasting across the network, which can hamper performance, the Leaf can assume that the Spine knows where the destination is (it does, via the COOP database), so there is no need to broadcast to everyone to find a destination.

If the Spine knows where the endpoint is, it forwards the traffic to the correct Leaf. If not, it drops the traffic.

Fabric anycast addressing

This again adds performance benefits to the ACI solution, as the table sizes on the Leaf switches can be kept smaller than they would be if they needed to know where all the destinations were, even those they never communicate with. On the Leaf, we have an Anycast address too.

These fabric anycast addresses are available for Layer 3 interfaces. On the Leaf ToR, we can establish an SVI that uses the same MAC address on every ToR; therefore, when an endpoint needs to route through a ToR, it doesn’t matter which ToR it uses. The Anycast address is spread across all ToR leaf switches.

Pervasive gateway

Now we have predictable latency to the first hop, and traffic uses the local VRF route table within that ToR instead of traversing the fabric to a different ToR. This is the Pervasive Gateway feature, used on all Leaf switches. The Cisco ACI has many advanced networking features, but the pervasive gateway is my favorite: it takes away all the configuration mess we had in the past.

The Cisco SDN ACI Integrations

OpenShift and Cisco ACI

  • Open vSwitch virtual network

OpenShift does this with an SDN layer that enhances Kubernetes networking to provide a virtual network across all the nodes, built on Open vSwitch (OVS). With OpenShift SDN, this pod network is established and maintained by the OpenShift SDN itself, which configures an overlay network using a virtual switch called the OVS bridge and programs it with several OVS rules. OVS is a popular open-source solution for virtual switching.

Openshift sdn
Diagram: OpenShift SDN.

OpenShift SDN plugin

We mentioned that you can tailor the virtual network topology to suit your networking requirements; this is determined by the OpenShift SDN plugin and the SDN mode you select. Several modes are available with the default OpenShift SDN. The mode you choose governs how connectivity between applications is managed and how external access is provided. Some modes are more fine-grained than others; the Cisco ACI plugin offers the most granular control.

Integrating ACI and OpenShift platform

The Cisco ACI CNI plugin for the OpenShift Container Platform provides a single, programmable network infrastructure, enterprise-grade security, and flexible micro-segmentation possibilities. The APIC can provide all networking needs for the workloads in the cluster. Kubernetes workloads become fabric endpoints, like Virtual Machines or Bare Metal endpoints.

The Cisco ACI CNI plugin extends the ACI fabric capabilities to OpenShift clusters to provide IP Address Management, networking, load balancing, and security functions for OpenShift workloads. In addition, the Cisco ACI CNI plugin connects all OpenShift Pods to the integrated VXLAN overlay provided by Cisco ACI.

The Cisco SDN ACI and AppDynamics

AppDynamics overview

So, an application takes multiple steps or services to do its work. These services may include logging in, searching, or adding something to a shopping cart. These services invoke various applications, web services, third-party APIs, and databases; these units of work are known as business transactions.

The user’s critical path

A business transaction is the essential user interaction with the system and is the customer’s critical path. Therefore, business transactions are the things you care about: if they start to degrade, your system degrades. So, you need ways to discover your business transactions and determine if there are any deviations from baselines. This should also be automated, as learning baselines and business transactions in deep systems is nearly impossible with a manual approach.

So, how do you discover all these business transactions?

AppDynamics automatically discovers business transactions and builds an application topology map of how the traffic flows. The topology map reveals usage patterns and hidden flows, a perfect feature for an observability platform.

Cisco AppDynamics
Diagram: Cisco AppDynamics.

AppDynamic topology

AppDynamics will discover the topology for all of your application components. All of this is done automatically for you. It can then build a performance baseline by capturing metrics and traffic patterns. This allows you to highlight issues when services and components are slower than usual.

AppDynamics uses agents to collect all the information it needs. An agent monitors and records the calls made to a service, starting from the entry point and following execution along its path through the call stack.
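As a toy illustration of that idea (not AppDynamics code), the decorator below records every call made under an entry point, along with its depth in the call stack and its duration, which is essentially the raw material a business-transaction trace is built from. All function names are invented.

```python
import functools
import time

trace, depth = [], 0

def traced(fn):
    """Record each call's stack depth and duration, like a tracing agent."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global depth
        start, depth = time.perf_counter(), depth + 1
        try:
            return fn(*args, **kwargs)
        finally:
            depth -= 1
            trace.append((depth, fn.__name__, time.perf_counter() - start))
    return wrapper

@traced
def query_db():
    time.sleep(0.05)        # stand-in for a downstream database call

@traced
def checkout():             # entry point of the business transaction
    query_db()

checkout()
for d, name, dur in reversed(trace):   # print entry point first
    print("  " * d + f"{name}: {dur * 1000:.1f} ms")
```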

Types of Agents for Infrastructure Visibility

If the agent is installed on all critical parts, you can get information about each specific instance, which helps you build a global picture. So we have an App Agent, a Network Agent, and Machine Agents covering server visibility and hardware/OS.

  • App Agent: This agent monitors apps and app servers; example metrics include slow transactions, stalled transactions, response times, wait times, block times, and errors.
  • Network Agent: This agent monitors network packets, TCP connections, and TCP sockets. Example metrics include performance-impact events, packet loss and retransmissions, RTT for data transfers, TCP window size, and connection setup/teardown.
  • Machine Agent (Server Visibility): This agent monitors the number of processes, services, caching, swapping, paging, and querying. Example metrics include hardware/software interrupts, virtual memory/swapping, process faults, and CPU/disk/memory utilization by process.
  • Machine Agent (Hardware/OS): Disks, volumes, partitions, memory, and CPU. Example metrics include CPU busy time, memory utilization, and page file usage.

Automatic establishment of the baseline

A baseline is an essential, critical step in your monitoring strategy. Doing this manually is hard, if not impossible, with complex applications, so having it done automatically is much better. You must automatically establish the baseline and be alerted to deviations from it. This helps you pinpoint issues faster and resolve problems before users are affected. Platforms such as AppDynamics can help here. Malicious activity can be spotted as deviations from the security baseline, and performance issues as deviations from the network baseline.
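A minimal sketch of the idea: learn the mean and spread of a metric such as response time from history, then flag samples that deviate by more than a few standard deviations. Real platforms use far more sophisticated, seasonal baselines; the numbers and threshold here are invented.

```python
import statistics

baseline = [120, 118, 125, 122, 119, 121, 124, 120]  # ms, learned history

def is_anomaly(sample_ms, history, threshold=3.0):
    """Flag a sample that sits more than `threshold` stdevs off the mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(sample_ms - mean) > threshold * stdev

print(is_anomaly(123, baseline))   # False: within normal variation
print(is_anomaly(300, baseline))   # True: deviates from the baseline
```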

Summary: Cisco ACI Components

In the ever-evolving world of networking, organizations are constantly seeking ways to enhance their infrastructure’s performance, security, and scalability. Cisco ACI (Application Centric Infrastructure) presents a cutting-edge solution to these challenges. By unifying physical and virtual environments and leveraging network automation, Cisco ACI revolutionizes how networks are built and managed.

Section 1: Understanding Cisco ACI Architecture

At the core of Cisco ACI lies a robust architecture that enables seamless integration between applications and the underlying network infrastructure. The architecture comprises three key components:

1. Application Policy Infrastructure Controller (APIC):

The APIC serves as the centralized management and policy engine of Cisco ACI. It provides a single point of control for configuring and managing the entire network fabric. Through its intuitive graphical user interface (GUI), administrators can define policies, allocate resources, and monitor network performance.

2. Nexus Switches:

Cisco Nexus switches form the backbone of the ACI fabric. These high-performance switches deliver ultra-low latency and high throughput, ensuring optimal data transfer between applications and the network. Nexus switches provide the necessary connectivity and intelligence to enable the automation and programmability features of Cisco ACI.

3. Application Network Profiles:

Application Network Profiles (ANPs) are a fundamental aspect of Cisco ACI. ANPs define the policies and characteristics required for specific applications or application groups. By encapsulating network, security, and quality of service (QoS) policies within ANPs, administrators can streamline the deployment and management of applications.

Section 2: The Power of Network Automation

One of the most compelling aspects of Cisco ACI is its ability to automate network provisioning, configuration, and monitoring. Through the APIC’s powerful automation capabilities, network administrators can eliminate manual tasks, reduce human errors, and accelerate the deployment of applications. With Cisco ACI, organizations can achieve greater agility and operational efficiency, enabling them to rapidly adapt to evolving business needs.

Section 3: Security and Microsegmentation with Cisco ACI

Security is a paramount concern for every organization. Cisco ACI addresses this by providing robust security features and microsegmentation capabilities. With microsegmentation, administrators can create granular security policies at the application level, effectively isolating workloads and preventing lateral movement of threats. Cisco ACI also integrates with leading security solutions, enabling seamless network enforcement and threat intelligence sharing.

Conclusion:

Cisco ACI is a game-changer in the realm of network automation and infrastructure management. Its innovative architecture, coupled with powerful automation capabilities, empowers organizations to build agile, secure, and scalable networks. By leveraging Cisco ACI’s components, businesses can unlock new levels of efficiency, flexibility, and performance, ultimately driving growth and success in today’s digital landscape.