Kentik - Network Observability
Understanding Network Traffic Blockages in AWS

Phil Gervasi, Director of Tech Evangelism

Summary

In this post, explore the challenges of diagnosing network traffic blockages in AWS, which stem from the complex and dynamic nature of cloud networks. Learn how Kentik addresses these issues by integrating AWS flow data, metrics, and security policies into a single view, allowing engineers to quickly identify the source of blockages, enhancing visibility and speeding up resolution.


The complexity of cloud networks can sometimes lead to unexpected traffic blockages or denials, which affect application availability and performance and frustrate engineering teams working at the application layer. Platform, networking, and reliability teams are often tasked with identifying and resolving these issues quickly to unblock teams and keep the packets flowing. Because of the scale and nature of cloud networking components, this can be challenging.

The challenge of identifying traffic blockages in AWS

AWS networks are built on a combination of VPCs, subnets, gateways, route tables, security groups, and network ACLs. In many ways, this isn’t so different from traditional networking, except that our only visibility in the cloud comes from what cloud providers give us. Engineers feel this lack of visibility acutely because cloud workloads change rapidly, spinning up and down as new applications are built and demand fluctuates. While cloud networking components offer flexibility and control, they also introduce complexity when troubleshooting network traffic issues.

For example, a misconfigured security group or an ACL that’s too restrictive could inadvertently block legitimate traffic. That’s difficult enough to pinpoint in traditional networking, but finding the root cause of these blockages in the cloud is even more tedious, especially in large-scale environments.

In a traditional network, we can log into devices, run show commands, look at firewall logs, and so on. In the cloud, we can't do that, and many conventional network monitoring tools fall short because they lack visibility into cloud-native components and can't correlate different data types.

To solve this, Kentik offers a comprehensive solution that leverages cloud flow data, metrics, and metadata to provide context and a clear picture of where and why network traffic is blocked or denied.

Leveraging flow data for insight

Flow data is an essential component in understanding network traffic. In AWS, VPC flow logs record information about the IP traffic going to and from network interfaces within VPCs. However, these logs can become overwhelming due to the sheer volume of data generated, especially in dynamic and large-scale cloud environments.
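To make the raw material concrete, here is a minimal Python sketch that parses one record in the default (version 2) VPC flow log format. The field order follows the AWS default format. The `FlowRecord` class and the `parse_flow_record` helper are hypothetical names used for illustration only, and the sample values are made up; this is not how Kentik ingests flow logs.

```python
from dataclasses import dataclass
from typing import Optional

# Field order of the default (version 2) VPC flow log format.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


@dataclass
class FlowRecord:  # hypothetical container, for illustration only
    interface_id: str
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    protocol: int
    packets: int
    bytes: int
    action: str  # "ACCEPT" or "REJECT"


def parse_flow_record(line: str) -> Optional[FlowRecord]:
    """Parse one space-separated default-format record.

    Returns None for NODATA/SKIPDATA records, whose traffic fields are "-".
    """
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}")
    raw = dict(zip(DEFAULT_FIELDS, values))
    if raw["log_status"] != "OK":
        return None
    return FlowRecord(
        interface_id=raw["interface_id"],
        srcaddr=raw["srcaddr"],
        dstaddr=raw["dstaddr"],
        srcport=int(raw["srcport"]),
        dstport=int(raw["dstport"]),
        protocol=int(raw["protocol"]),
        packets=int(raw["packets"]),
        bytes=int(raw["bytes"]),
        action=raw["action"],
    )


# Illustrative record: an RDP connection attempt (TCP port 3389) that was rejected.
sample = ("2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 "
          "49761 3389 6 20 4249 1418530010 1418530070 REJECT OK")
record = parse_flow_record(sample)
print(record.action, record.dstport)
```

Even this single record shows why volume becomes a problem: every interface emits one such line per flow per aggregation window, so filtering and enrichment matter more than the parsing itself.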

Kentik ingests flow data from AWS VPC flow logs and enriches it with metadata, providing insights that go beyond raw log data. Kentik was built to operate at an enormous scale and accommodate the amount of data generated by some of the largest organizations in the world.

Kentik then puts that flow data into context by adding information such as application tags, geographic identifiers, hostnames, and ENI and IP address tags. By filtering and analyzing this enriched flow data, Kentik can identify patterns that indicate traffic blockages, such as a sudden drop in traffic volume, an increase in denied connections, or specific IP addresses or ports repeatedly being blocked.
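As a rough illustration of the kind of pattern described above, the following Python sketch counts rejected flows per source, destination, and port, and computes the overall share of rejected flows. It assumes flow records have already been parsed into dictionaries; the `top_rejected` and `reject_ratio` helpers and the sample data are hypothetical and do not reflect Kentik's internals.

```python
from collections import Counter


def top_rejected(flows, n=3):
    """Return the n most frequently rejected (srcaddr, dstaddr, dstport) tuples."""
    rejected = Counter(
        (f["srcaddr"], f["dstaddr"], f["dstport"])
        for f in flows
        if f["action"] == "REJECT"
    )
    return rejected.most_common(n)


def reject_ratio(flows):
    """Fraction of flows whose action is REJECT (0.0 for an empty input)."""
    if not flows:
        return 0.0
    return sum(1 for f in flows if f["action"] == "REJECT") / len(flows)


# Made-up sample data: one client is repeatedly blocked on port 5432.
flows = [
    {"srcaddr": "10.0.1.15", "dstaddr": "10.0.2.20", "dstport": 5432, "action": "REJECT"},
    {"srcaddr": "10.0.1.15", "dstaddr": "10.0.2.20", "dstport": 5432, "action": "REJECT"},
    {"srcaddr": "10.0.1.15", "dstaddr": "10.0.2.20", "dstport": 5432, "action": "REJECT"},
    {"srcaddr": "10.0.1.16", "dstaddr": "10.0.2.20", "dstport": 443, "action": "ACCEPT"},
    {"srcaddr": "10.0.3.7", "dstaddr": "10.0.2.20", "dstport": 22, "action": "REJECT"},
]

for (src, dst, port), count in top_rejected(flows):
    print(f"{src} -> {dst}:{port} rejected {count} times")
print(f"reject ratio: {reject_ratio(flows):.0%}")
```

Tracking the ratio over successive time windows is one simple way to spot an increase in denied connections; the per-tuple counts point at which source and port to look at first.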

In the image below from Data Explorer, you can see a very simple filter for VPC, application, firewall action, and flow direction. Filtering for specific dimensions allows us to immediately focus on what’s important to us; in this case, all blocked traffic is denoted in AWS by a REJECT firewall action.

Data Explorer filter showing blocked traffic

Kentik’s advanced analysis platform also correlates flow data with other metrics, helping pinpoint the blockage’s exact location. This can include identifying whether the blockage is due to a security group rule, a routing issue, or perhaps a problem with DNS.
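One hedged example of narrowing down the location from flow data alone: security groups are stateful, so return traffic for an allowed connection is permitted automatically, while network ACLs are stateless. If a flow is accepted in one direction and its reply is rejected on the same interface, a network ACL is the more likely culprit than a security group. The sketch below applies that heuristic to parsed flow records. The function names and sample data are hypothetical, and the result is a hint rather than proof.

```python
def flow_key(f):
    """Identify a flow by interface and 5-tuple."""
    return (f["interface_id"], f["srcaddr"], f["srcport"],
            f["dstaddr"], f["dstport"], f["protocol"])


def reverse_key(f):
    """The key the opposite direction of this flow would have."""
    return (f["interface_id"], f["dstaddr"], f["dstport"],
            f["srcaddr"], f["srcport"], f["protocol"])


def likely_nacl_rejects(flows):
    """Rejected flows whose opposite direction was accepted on the same interface.

    Heuristic only: a stateful security group would have allowed the reply,
    so the rejection probably comes from a stateless network ACL.
    """
    accepted = {flow_key(f) for f in flows if f["action"] == "ACCEPT"}
    return [f for f in flows
            if f["action"] == "REJECT" and reverse_key(f) in accepted]


# Made-up sample: an inbound HTTPS request is accepted, but the reply is rejected.
flows = [
    {"interface_id": "eni-aaa", "srcaddr": "203.0.113.12", "srcport": 51000,
     "dstaddr": "172.31.16.139", "dstport": 443, "protocol": 6, "action": "ACCEPT"},
    {"interface_id": "eni-aaa", "srcaddr": "172.31.16.139", "srcport": 443,
     "dstaddr": "203.0.113.12", "dstport": 51000, "protocol": 6, "action": "REJECT"},
    {"interface_id": "eni-aaa", "srcaddr": "198.51.100.5", "srcport": 40000,
     "dstaddr": "172.31.16.139", "dstport": 22, "protocol": 6, "action": "REJECT"},
]

for f in likely_nacl_rejects(flows):
    print(f"{f['srcaddr']}:{f['srcport']} -> {f['dstaddr']}:{f['dstport']} "
          "rejected after the request was accepted; check the network ACL")
```

The third record is rejected with no accepted counterpart, so the heuristic says nothing about it; that case still needs the security group and ACL rules themselves.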

Analyzing metrics and metadata

While flow data provides a wealth of information, it’s crucial to analyze metrics and metadata in conjunction with flow data to truly understand where and why traffic is being blocked.

Kentik collects a wide range of cloud metrics from AWS, including latency, packet loss, and throughput. By correlating these metrics with flow data, Kentik can help identify performance-related blockages.

Public cloud mesh

Correlating firewall policies and security groups

AWS security groups and network ACLs are essential components for controlling network traffic within your cloud environment. They can also be a source of traffic blockages, especially if they’re misconfigured or too restrictive, which can easily happen when workloads are pushed into production quickly.

Kentik’s platform allows for the correlation of flow data with firewall policies and security group rules. By mapping traffic patterns to the associated security policies, Kentik can quickly identify which rules are causing traffic to be blocked.

For example, Kentik can analyze flow data and correlate it with the security group rules applied to a specific instance or subnet. If traffic is being denied, Kentik can filter for the specific rule within the security group that is causing the denial, allowing for quick remediation. This correlation extends beyond just identifying the rule; Kentik also provides insights into why the rule blocks traffic, such as determining whether the traffic is coming from an unexpected source or using a non-standard port.

Using the Connectivity Checker (below), we can trace network traffic in AWS and see where traffic is blocked and by which security group. This same data can be used to audit security groups to determine when a policy is more permissive than intended.

Detail showing where traffic is blocked
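To illustrate the idea of matching a flow against security group rules and explaining a denial, here is a minimal Python sketch. It relies on one well-known property of security groups: rules only allow traffic, and anything not matched is implicitly denied. The `IngressRule` class, the `explain_ingress` function, and the sample rules are hypothetical simplifications (CIDR sources only, no referenced security groups), not Kentik's implementation or the AWS API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:  # hypothetical, simplified model of an inbound rule
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    cidr: str       # allowed source range


def explain_ingress(rules, src_ip, protocol, dst_port):
    """Say whether inbound traffic is allowed and, if not, the likely reason."""
    src = ipaddress.ip_address(src_ip)
    port_open_to_others = False
    for rule in rules:
        if rule.protocol not in ("-1", protocol):
            continue
        # A rule for all protocols covers every port.
        if rule.protocol != "-1" and not rule.from_port <= dst_port <= rule.to_port:
            continue
        if src in ipaddress.ip_network(rule.cidr):
            return "allowed"
        port_open_to_others = True  # port is open, just not to this source
    if port_open_to_others:
        return "denied: unexpected source"
    return "denied: port not opened by any rule"


# Made-up rules: HTTPS from anywhere, SSH only from the internal range.
rules = [
    IngressRule("tcp", 443, 443, "0.0.0.0/0"),
    IngressRule("tcp", 22, 22, "10.0.0.0/8"),
]

print(explain_ingress(rules, "10.1.2.3", "tcp", 22))
print(explain_ingress(rules, "203.0.113.7", "tcp", 22))
print(explain_ingress(rules, "10.1.2.3", "tcp", 8443))
```

The same loop can be turned around for auditing: a rule whose CIDR is far wider than the sources actually seen in flow data is a candidate for being more permissive than intended.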

Keep in mind that with Kentik, this can be completely self-service in that a network, platform, or SRE team doesn’t necessarily have to be the one to fix the issue. Since Kentik is built on a single database, an engineer of almost any type can use the Kentik Map, Connectivity Checker, or Data Explorer to pinpoint the problem. This means an engineer less comfortable with traditional networking can just as easily find the root cause of a problem.


In the world of AWS networking, identifying and resolving traffic blockages requires more than just basic monitoring tools. Kentik offers a powerful solution that integrates flow data, metrics, metadata, and security policies into a single platform. This approach allows network and cloud engineers to quickly identify where and why traffic is being blocked, ensuring that cloud networks remain secure, performant, and reliable.
