The Benefits and Drawbacks of SNMP and Streaming Telemetry
Summary
Is SNMP on life support, or is it as relevant today as ever? The answer is more complicated than a simple yes or no. SNMP is reliable, customizable, and very widely supported. However, SNMP has some serious limitations, especially for modern network monitoring — limitations that streaming telemetry solves. In this post, learn about the advantages and drawbacks of SNMP and streaming telemetry and why they should both be a part of a network visibility strategy.
Today, people consume most applications over a network. Routers, switches, firewalls, load balancers, wireless access points, network services, and so on all work together as a complex application delivery system. Each device type provides valuable insight into the health of the system and why an application is performing the way it is.
To ensure reliable and high-quality application performance, we must understand how the network is functioning, and to do that, we gather data. Since 1988, that has usually meant SNMP; however, some vendors are beginning to support various forms of streaming telemetry, which solves several glaring problems with SNMP.
Naturally, some in the networking industry have claimed that SNMP is no longer relevant in light of streaming telemetry, and there’s an argument to be made there. Streaming telemetry does collect more accurate and complete data, so shouldn’t streaming telemetry be the default for everyone?
But before we throw the baby out with the bath water, it’s important to understand that streaming telemetry has not replaced SNMP. Not yet, at least. Perhaps it will one day, but today and in the foreseeable future, a solid network visibility strategy must be able to ingest and analyze SNMP and streaming telemetry together.
SNMP: veteran of network monitoring
SNMP has been a cornerstone in network monitoring for decades, and for good reason. Despite its shortcomings, it’s a powerful network management and visibility mechanism.
Benefits of SNMP
SNMP has widespread compatibility. One of SNMP’s greatest strengths is its universal support across network devices (as well as many non-network devices), making it a reliable standard for basic monitoring tasks. Most network devices support it, regardless of the device type or vendor.
For example, most brand new $300,000 data center chassis switches support SNMP, and so does that 20-year-old router still in production for out-of-band management.
Also, SNMP is relatively simple and cost-effective. It’s easy to set up and manage, which is a massive bonus for networks that don’t require deep granularity in monitoring and don’t have the staff to set up and manage more advanced forms of telemetry. It doesn’t require extensive training or specialized knowledge, making it accessible to a broad range of IT professionals.
And though age is often looked down upon in tech, SNMP, one of the oldest protocols used in network management, is mature and proven reliable over time. This long history means that it has been thoroughly tested, resulting in a stable and dependable tool in network operations.
From a cost perspective, many SNMP tools are available for free or at a low cost. This makes SNMP a cost-effective solution for network monitoring, especially for smaller organizations or those with limited budgets.
It’s also important to remember that SNMP can poll a device for a data point or for metadata, and that SNMP traps offer event notification. While this may sound like a basic feature, it’s critical in proactive network management and quickly responding to issues.
Drawbacks of SNMP
There are drawbacks, however.
Although SNMP can poll devices for a data point and traps offer event notification, alerting and thresholding all occur in the network monitoring system (NMS), not within SNMP itself.
Also, SNMP struggles to provide the more in-depth data that modern networks often need. For example, it doesn’t easily provide sub-minute information, at least without overloading a typical device. And sub-minute monitoring is required for some modern network operations activities.
That means SNMP can produce misleading output. For example, when polling a device every five minutes, the resulting graph shows only the values collected at each five-minute mark, and five minutes is an eternity in the networking world. If there are significant changes in the metrics between polls, such as spikes or drops in CPU utilization, traffic, device memory, interface errors, and so on, SNMP wouldn’t report them.
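For counter-based metrics such as interface traffic, a poller can only divide the counter change by the time between polls, so what it plots is an average. The sketch below illustrates the effect with made-up numbers: the link speed, baseline, and burst values are assumptions chosen for illustration, not measurements.

```python
# Hypothetical per-second traffic over one 5-minute polling window:
# a steady 100 Mbit/s with a 10-second burst at 900 Mbit/s.
POLL_INTERVAL_S = 300
BASELINE_BPS = 100_000_000
BURST_BPS = 900_000_000
BURST_START_S, BURST_LEN_S = 120, 10

per_second_bps = [
    BURST_BPS if BURST_START_S <= t < BURST_START_S + BURST_LEN_S else BASELINE_BPS
    for t in range(POLL_INTERVAL_S)
]

# An octet counter only ever increases; the poller sees its value at each poll.
counter_at_start = 0
counter_at_end = counter_at_start + sum(bps // 8 for bps in per_second_bps)

# All the poller can compute is the average rate between the two polls.
polled_avg_bps = (counter_at_end - counter_at_start) * 8 / POLL_INTERVAL_S
true_peak_bps = max(per_second_bps)

print(f"true peak:   {true_peak_bps / 1e6:.1f} Mbit/s")   # 900.0 Mbit/s
print(f"polled rate: {polled_avg_bps / 1e6:.1f} Mbit/s")  # 126.7 Mbit/s
```

In this toy case, the 900 Mbit/s burst shows up on the chart as a single point at roughly 127 Mbit/s.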
Notice the image below, from a presentation at NANOG 73, which shows spikes in bandwidth usage higher than the interface can carry, which is clearly incorrect.
Ultimately, infrequent polling leads to averaging, which hides spikes. SNMP data is also not timestamped at the source, and the time it takes for an SNMP response to be sent, received, and recorded is variable. So when we poll every five minutes, we actually get back results at slightly different times. This means we’re making an educated guess as to what the truth is. This uncertainty introduces additional artifacts in the data, which isn’t the case with streaming telemetry, which timestamps at the source.
Notice in the graph below, also from NANOG 73, that incorrect timestamps create false spikes. The graph on the left shows how SNMP can produce incorrect results, in this case spikes in traffic that never occurred. On the right, we can see the result of streaming telemetry, which produces smoother data.
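The timestamp effect can be reproduced with simple arithmetic. The following is a minimal sketch, assuming a link that carries a perfectly constant 950 Mbit/s and a device that reads its counter 20 seconds late on one poll; all values and names are hypothetical.

```python
LINK_CAPACITY_BPS = 1_000_000_000
TRUE_RATE_BPS = 950_000_000
NOMINAL_INTERVAL_S = 300

# Hypothetical moments (in seconds) at which the device actually read its
# counter. The third reading happens 20 seconds late.
actual_read_times_s = [0, 300, 620, 900]

# Counter value at each read, kept in bits to simplify the arithmetic.
counters_bits = [TRUE_RATE_BPS * t for t in actual_read_times_s]


def rates(counters, interval_lengths_s):
    """Rate per interval: counter delta divided by the interval length used."""
    return [
        (later - earlier) / dt
        for earlier, later, dt in zip(counters, counters[1:], interval_lengths_s)
    ]


# Without source timestamps, the collector assumes every interval is nominal.
assumed = rates(counters_bits, [NOMINAL_INTERVAL_S] * 3)

# With source timestamps, the collector divides by the real elapsed time.
actual_intervals_s = [b - a for a, b in zip(actual_read_times_s, actual_read_times_s[1:])]
source_stamped = rates(counters_bits, actual_intervals_s)

for a, s in zip(assumed, source_stamped):
    print(f"assumed interval: {a / 1e6:7.1f} Mbit/s   source timestamp: {s / 1e6:7.1f} Mbit/s")
```

Under these assumptions, the late reading produces an apparent rate of about 1,013 Mbit/s on a 1 Gbit/s link, followed by an equally false dip, while the source-timestamped series stays flat at 950 Mbit/s.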
Next, SNMP simply makes the current status and metrics available. Yet nearly every engineer looks at charts, and to build them, even though SNMP isn’t built for it, we all do the same thing: ask a question repeatedly at a consistent interval so we can plot the answers on a chart. This is at the heart of why SNMP is inefficient and how streaming telemetry makes data collection more efficient. SNMP is made for the NMS to ask the router a question and for the router to answer.
Conversely, streaming telemetry is made for the NMS to ask the router a question once and subscribe to the answer indefinitely. The router then keeps answering without being asked again. Additionally, the router can schedule the preparation and sending of that data in a way that is efficient for itself.
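The difference between the two models can be sketched with a toy simulation. This is not a real SNMP or gNMI client; the `Router` class and its `get` and `subscribe` methods are hypothetical stand-ins that only count how many requests the device has to handle.

```python
SAMPLES = 12  # e.g. one hour of data at a five-minute cadence


class Router:
    """Toy device that counts how many requests it must parse and answer."""

    def __init__(self):
        self.requests_handled = 0
        self.counter = 0

    def _sample(self):
        # Pretend the interface counter grows by a fixed amount per sample.
        self.counter += 1000
        return self.counter

    def get(self):
        """Poll model: every data point costs one request."""
        self.requests_handled += 1
        return self._sample()

    def subscribe(self, samples):
        """Subscription model: one request, then the device keeps pushing."""
        self.requests_handled += 1
        for _ in range(samples):
            yield self._sample()


polled = Router()
polled_values = [polled.get() for _ in range(SAMPLES)]

streamed = Router()
streamed_values = list(streamed.subscribe(SAMPLES))

print("poll requests:     ", polled.requests_handled)
print("subscribe requests:", streamed.requests_handled)
print("same data:         ", polled_values == streamed_values)
```

Both approaches deliver the same twelve data points here, but the polled device answers twelve separate questions while the subscribed device answers one.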
Also, let’s face it, SNMP doesn’t scale well. And by scale, the issue isn’t necessarily the number of devices we’re polling but the number of interfaces multiplied by the number of metrics, collected again at every polling interval.
For example, imagine polling a single switch with multiple linecards full of interfaces and hundreds of sub-interfaces, each with 20 metrics, once per minute (or even faster). This can overwhelm the switch and create an incredible amount of noise on the network as the NMS is, in essence, asking for the same information over and over, usually with the same answer every time from the switch.
As network complexity and size increase, SNMP’s effectiveness tends to decrease. This may not be an issue in small and medium-sized networks, but it can be a significant stumbling block for larger organizations.
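To put rough numbers on that example, the sketch below multiplies it out. Only the 20 metrics and the one-minute interval come from the text; the linecard, interface, and sub-interface counts are assumptions picked for illustration.

```python
def values_per_second(interfaces, metrics_per_interface, interval_s):
    """Average number of values the device must return each second."""
    return interfaces * metrics_per_interface / interval_s


# Hypothetical switch: 8 linecards x 48 ports, plus 600 sub-interfaces.
LINECARDS = 8
PORTS_PER_LINECARD = 48
SUB_INTERFACES = 600
METRICS_PER_INTERFACE = 20

interfaces = LINECARDS * PORTS_PER_LINECARD + SUB_INTERFACES
values_per_cycle = interfaces * METRICS_PER_INTERFACE

for interval_s in (300, 60, 10):
    rate = values_per_second(interfaces, METRICS_PER_INTERFACE, interval_s)
    per_day = values_per_cycle * (86_400 // interval_s)
    print(f"{interval_s:>3}s interval: {rate:7.1f} values/s, {per_day:,} values/day")
```

With these assumed counts, one switch returns 19,680 values per polling cycle, and moving from five-minute to one-minute polling multiplies the daily total by five.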
Streaming telemetry provides a modern approach
Streaming telemetry is often touted as the next step in network monitoring evolution. It differs in how it works and not necessarily in the information it provides. So, it’s not precisely that streaming telemetry is more customizable or robust than SNMP in the information it can provide. Instead, streaming telemetry solves some of the problems we face when using SNMP.
Benefits of streaming telemetry
The various forms of streaming telemetry, such as gNMI, structured data approaches (like YANG), and proprietary methods, offer near-real-time data from network devices. Rather than waiting minutes for the next poll to occur, network administrators get information about their devices in (almost) real time.
Because data is pushed in near real time from the devices and not pulled by an NMS, streaming telemetry can provide much higher-resolution data compared to SNMP. That is an important advantage, especially in complex and high-performance networks that need visibility at sub-minute or finer intervals.
A push-based model is generally more efficient than SNMP. Streaming telemetry processing often happens in hardware at the ASIC itself instead of the CPU. Streaming telemetry can, therefore, scale to larger networks without affecting the devices’ performance.
Drawbacks of streaming telemetry
Arguably, the biggest drawback of streaming telemetry isn’t the technology itself but the fact that it isn’t widely supported yet. Many vendors don’t support any form of streaming telemetry, or if they do, it’s often on only a select few platforms.
Some network vendors use a proprietary form of streaming telemetry, requiring the vendor’s monitoring system to ingest and analyze the data. Vendor-specific constraints are a major stumbling block to widespread adoption.
Next, there’s a learning curve that must be overcome when implementing streaming telemetry. Setting it up on devices and monitoring systems can be complex and may require specialized skills. For example, traditional network engineers may need to learn how to use APIs to query various network devices for the first time.
Lastly, if not configured properly, streaming telemetry can send so much information that it negatively impacts bandwidth utilization and the storage capacity of a network monitoring system. Proper tuning is essential for any visibility technology, but it’s especially crucial with streaming telemetry.
Balancing today’s needs with tomorrow’s tech
Throwing SNMP out entirely would likely mean replacing many of the devices in a network. In time, more and more devices will support streaming telemetry, but that just isn’t the case today.
It’s fair to say that streaming telemetry will slowly gain in usage as vendors choose to support it on their platforms, but until then, SNMP is everywhere, reliable, and well-understood.
Especially for smaller, simpler networks that don’t need incredibly granular data resolution, SNMP may work just fine for years to come. In the real world of network operations, there’s no one-size-fits-all solution. SNMP and streaming telemetry offer different benefits and cater to different needs, and in this case, the answer isn’t to throw out one protocol in favor of another.
The answer is a visibility strategy that utilizes both SNMP and streaming telemetry to collect information from devices already deployed and those that will be in the future.