Transforming NetOps with Big Data
Summary
Looking ahead to tomorrow’s economy, today’s savvy companies are transitioning into the world of digital business. In this post — the second of a three-part series — guest contributor Jim Metzler examines the key role that Big Data can play in that transformation. By revolutionizing how operations teams collect, store, access, and analyze network data, a Big Data approach to network management enables the agility that companies will need to adapt and thrive.
Enabling a Data-driven Approach for Better Network Management
Across virtually every sector of the economy today, companies face a common imperative: integrate digital technologies and practices or risk falling by the wayside. As we discussed in an earlier post, From NetFlow Analysis to Business Outcome, a Big Data approach to collecting, storing, accessing, and analyzing management data enables the level of collaboration that’s required for an organization to exhibit the key characteristics of a digital business. Those characteristics can be identified as:
- Agile business models and rapid innovation;
- An agile IT function.
In this post we’ll continue our focus on the intelligent use of network management data to enable companies to transform. Building on the first post, we’ll look at the role of network data in enabling this transformation, particularly how a Big Data approach to network management data provides an IT organization with the needed depth of insight to achieve data-driven network operations.
Traffic Growth Stresses Traditional Management Tools
The volume of data traffic across today’s networks is large and growing. Driven in part by the digital business movement, this growth is also propelled by factors such as the increasing use of streaming, the growing number of mobile users and connected devices, and the emerging adoption of the Internet of Things (IoT). Every indication is that the growth will continue. In May 2015, for example, the Cisco VNI Global IP Traffic Forecast projected (see graph) that global IP traffic will almost triple in volume between 2014 and 2019, from about 60 to nearly 168 exabytes per month.
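As a rough sanity check on "almost triple", the growth multiple and the implied compound annual growth rate can be worked out from the ratio of the two endpoint figures (roughly 60 to 168 in the same unit). This is a minimal sketch; the function name `implied_cagr` is an illustrative choice, not something from the forecast itself.

```python
def implied_cagr(start, end, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1

# Endpoint figures as quoted in the text; only their ratio matters here.
start, end = 60.0, 168.0
years = 2019 - 2014

growth_multiple = end / start
cagr = implied_cagr(start, end, years)

print(f"growth multiple: {growth_multiple:.2f}x")
print(f"implied annual growth: {cagr:.1%}")
```

Under these inputs the multiple is 2.8x, which works out to growth of roughly 23 percent per year, compounding over five years.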
As pointed out in the previous post, traditional approaches to network management don’t provide the visibility required to identify security incidents or to troubleshoot problems. In part that’s because legacy systems discard most raw network data, storing only a small fraction for the long term. As data traffic grows, and the volume of management data grows with it, these traditional approaches will likely store an even smaller percentage of management data and so be even less able to provide the required detailed visibility.
One of the key characteristics of network management is that there are multiple sources of management data, each with advantages and disadvantages. Flow data (NetFlow, sFlow, J-Flow, and IPFIX), for example, is critical because it provides pervasive coverage and details on the traffic. The tradeoff for this detail is a huge volume of data to collect, store, and analyze. Flow data also doesn’t provide insight into all aspects of performance.
Another source of management data comes from routing protocols such as BGP. One advantage of BGP is that it provides details on the end-to-end traffic paths. But it lacks any awareness of the traffic that transits those paths. Additional sources of network data are GeoIP databases — which map IP addresses to specific geographic locations such as region, country and city — and packet capture (PCAP), which can give insight into application and network performance.
Traditional management systems typically rely on a single type of data source. Unfortunately, since network organizations usually don’t know in advance the source of the problem they are trying to troubleshoot, they won’t know whether a particular data source is the best, or even an adequate, source of insight. A much more powerful approach is to fuse multiple data types to create a multi-dimensional view of network operations. To do that effectively, enabling ad hoc querying across multiple data types, you need to be able to retain, in detail, all of the relevant types of management data. That’s not feasible within the constraints of traditional approaches.
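To make the idea of fusing data types concrete, here is a minimal sketch in Python that enriches flow records with GeoIP and BGP information and then answers two different questions from the same enriched records. All of the data, field names, and helper functions (`lookup`, `enrich`, `bytes_by`) are hypothetical and hand-written for illustration; a real deployment would draw on live flow exports, a GeoIP database, and BGP routing tables.

```python
import ipaddress
from collections import defaultdict

# Hypothetical sample data using documentation address ranges and private ASNs.
FLOWS = [
    {"src": "10.0.0.5", "dst": "203.0.113.10", "bytes": 1200},
    {"src": "10.0.0.6", "dst": "203.0.113.77", "bytes": 800},
    {"src": "10.0.0.5", "dst": "198.51.100.9", "bytes": 500},
]
GEOIP = {"203.0.113.0/24": "DE", "198.51.100.0/24": "JP"}
BGP_ROUTES = {
    "203.0.113.0/24": [64500, 64510],
    "198.51.100.0/24": [64500, 64520, 64530],
}

def lookup(table, ip):
    """Longest-prefix match of ip against a {prefix: value} table."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return best[1] if best else None

def enrich(flow):
    """Attach destination country and AS path to a flow record."""
    return {
        **flow,
        "dst_country": lookup(GEOIP, flow["dst"]),
        "as_path": lookup(BGP_ROUTES, flow["dst"]),
    }

def bytes_by(flows, key_fn):
    """Sum bytes grouped by an arbitrary key computed from each flow."""
    totals = defaultdict(int)
    for f in flows:
        totals[key_fn(f)] += f["bytes"]
    return dict(totals)

enriched = [enrich(f) for f in FLOWS]
by_country = bytes_by(enriched, lambda f: f["dst_country"])
by_dest_as = bytes_by(enriched, lambda f: f["as_path"][-1] if f["as_path"] else None)

print(by_country)
print(by_dest_as)
```

Because the enriched records keep every dimension, the same data can be grouped by country, by destination AS, or by any other field without deciding in advance which question will matter.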
Big Data for Network Management
As described by the Gartner IT Glossary, Big Data involves high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing to enable enhanced insight, decision making, and process automation. Wikipedia, meanwhile, defines Big Data as a term for data sets that are so large or complex that traditional data processing applications are inadequate. To the extent that Big Data enables more accurate analysis, decisions can be made with greater confidence, and better decisions can result in greater operational efficiency, cost reduction, and reduced risk.
Based on these definitions, network management data is a perfect application for Big Data. As noted, the velocity and volume of network management data are so high that in current approaches most of the raw data must be discarded and only rollups of predefined aggregates can be kept. Further, the need to combine multiple types of network data for effective analysis meets the “high-variety” criterion established by Gartner.
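The cost of keeping only rollups of predefined aggregates can be illustrated with a small sketch. The records and field names below are hypothetical; the point is that once the raw records are discarded, a question that was not anticipated when the rollup was defined can no longer be answered.

```python
from collections import Counter

# Hypothetical raw flow records (illustrative values only).
RAW = [
    {"src": "10.0.0.5", "dst_port": 443, "bytes": 1200},
    {"src": "10.0.0.5", "dst_port": 53, "bytes": 100},
    {"src": "10.0.0.6", "dst_port": 443, "bytes": 800},
]

def rollup(records, key):
    """Sum bytes per value of one field."""
    totals = Counter()
    for r in records:
        totals[r[key]] += r["bytes"]
    return dict(totals)

# Rollup-only approach: the one predefined aggregate is all that is kept.
per_src = rollup(RAW, "src")

# An ad hoc question ("which destination port carries the most traffic?")
# cannot be derived from per_src; it needs the raw records.
per_port = rollup(RAW, "dst_port")

print(per_src)
print(per_port)
```

Here `per_src` has no port dimension at all, so the per-port breakdown is only available while the raw records are retained.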
A Big Data solution for network management enables ingesting, storing, and querying of massive amounts of management data. While the batch-based processing of Big Data systems commonly used for business intelligence (BI) analytics may require hours to run queries, a Big Data solution for network operations demands a much faster query response timeframe. The OODA (observe, orient, decide, act) loop for network operations requires actionable information within minutes. This means that Big Data queries must complete in seconds, since even knowledgeable engineers will need to query data repeatedly to observe conditions sufficiently to orient, decide, and execute a course of action.
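The step from "minutes" to "seconds" is simple arithmetic, sketched below. The numbers are hypothetical assumptions chosen only to illustrate the reasoning (a five-minute budget, fifteen successive queries, ten seconds of reading time per result); they are not measurements.

```python
def max_query_seconds(budget_s, n_queries, think_s):
    """Per-query latency that fits n_queries, plus think time, into the budget."""
    remaining = budget_s - n_queries * think_s
    return max(remaining, 0) / n_queries

# Hypothetical scenario: all three inputs are illustrative assumptions.
budget_s = 5 * 60   # time available to reach a decision
n_queries = 15      # successive, refined queries an engineer might run
think_s = 10        # time spent reading each result before the next query

per_query = max_query_seconds(budget_s, n_queries, think_s)
print(f"each query must return within about {per_query:.0f} seconds")
```

Under these assumptions each query has about ten seconds to return, which is why a system that takes hours per query cannot support this kind of iterative investigation.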
The good news is that Big Data technologies have been advancing at a rapid rate, enabling them to support the multiple data types, use cases, and OODA timeframes required for network management. As a result, Big Data network management insight is poised to enable data-driven network operations. In our next post we’ll look at how that plays a key role in helping companies transform into the agile digital businesses that are most likely to thrive in tomorrow’s economy.