Network Functions Virtualization (NFV): the phrase has been around for years, describing a market that always seems on the verge of explosive growth but is never quite a done deal. It is time to ask what NFV really is, and whether it is really going to happen. Perhaps more importantly, it is time—in this day of cloud services and edge computing—to ask what implications NFV has outside the rather closed worlds of networking and communications.
Let’s start with a definition. NFV is the process of reimplementing in software the tasks that go on inside networking hardware boxes, and running that software on commodity servers in a data center.
This definition leads quickly to an observation that may cut through a lot of marketing hype and media confusion. NFV is not a market, and it is not an application. It is a technique for implementing networks (Figure 1). And since there are networks in all kinds of systems today, from the Internet to 5G cellular to enterprises, data centers, edge-computing boxes, and embedded systems, NFV is relevant to a wide range of systems across an equally huge array of applications.
That is not to say that all these diverse applications are advancing in-phase toward a virtualized future. Each area has its own incentives, priorities, and market realities for what functions to virtualize, and its own signature of bandwidth, memory, and computing requirements for each function. Accordingly, the level of interest, degree of progress, and level of optimism or skepticism about NFV varies across the engineering world as well.
A comprehensive survey across applications, functions, and implementations would be impossible here. So, to present the concepts, we will look at a few specific functions important to one of the furthest-developed applications and generalize from there.
The application that has moved fastest to implement NFV has been 5G telecommunications. A convergence of factors—potentially 10 Gbps data rates to individual mobile devices, rapidly shifting geographic distribution of clients and services, intense local hot spots, and the infeasibility of provisioning the entire metro-area network for peak local workloads, plus the desire to match capital investment against new revenues in the early years of 5G deployment—all argue in favor of software-defined networks based on NFV rather than fixed-function hardware boxes.
But what functions to virtualize? Rather than do a top-down system analysis of a hypothetical 5G network to derive a set of kernel functions—the way math or application libraries are defined—NFV developers have taken a more pragmatic approach. The functions they have virtualized correspond closely to the boxes and products that already existed in the networking world prior to NFV. This allowed developers of the virtualized network functions (VNFs) to pitch their products as almost drop-in replacements for existing networking hardware and software, with no new concepts to propound to system administrators, at least early in the discussion.
Accordingly, the 5G service providers focused on functions that were already key to their 4G LTE networks. Those included broadband network gateways, routers, IPsec functions, the Evolved Packet Core (EPC), and control-plane processes for connection and resource management.
Some of these functions—notably the control-plane tasks—have already been done in software for a long time, so virtualization is just a matter of moving code from one execution environment to another. But other functions have traditionally been implemented in dedicated hardware using ASICs or FPGAs, and these require considerably more work to convert to VNFs.
One example that figures prominently in most discussions of 5G NFV is the Broadband Network Gateway (BNG). This box sits between the network’s clients and external Internet-protocol networks, in effect a gateway between the cellular service provider’s baseband boxes and the public Internet. Examined closely, the BNG is not a single function, but rather a cluster of functions necessary to provide a controlled gateway. At its simplest, the BNG provides address mapping between local and Internet addresses, packet routing in both directions, and control-plane interface functions that allow Management and Orchestration (MANO) blocks to set up address maps and routing tables.
In a fuller implementation, the BNG would also include session-level authentication—a cryptographic task—and accounting, some level of security, and at packet level, policy enforcement and quality-of-service management (Figure 2). Unlike routing, which generally just involves extracting address fields from packets, doing a table look-up, and forwarding the packets to appropriate buffers, accounting, security, policy, and QoS can require inspection of packet payloads, use of regular-expression processors, and fast, very deep buffers in order to grant priority to privileged packets.
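At its simplest, the data-plane core of a BNG reduces to a pair of control-plane-populated tables: an address map and a routing table consulted per packet. The sketch below is a loose, illustrative model in Python (the class and table names are invented, not from any real BNG), showing the longest-prefix route lookup that routing hardware performs in nanoseconds:

```python
import ipaddress

class MiniBNG:
    """Toy gateway data plane: an address map between local and public
    addresses, plus a longest-prefix-match route lookup. All names and
    table formats here are illustrative, not from any real BNG."""

    def __init__(self):
        self.addr_map = {}   # local address -> public address, set by the control plane
        self.routes = []     # (network, next_hop), kept sorted longest prefix first

    def map_address(self, local, public):
        self.addr_map[local] = public

    def add_route(self, cidr, next_hop):
        self.routes.append((ipaddress.ip_network(cidr), next_hop))
        # Sorting by prefix length makes the first match the most specific one.
        self.routes.sort(key=lambda r: r[0].prefixlen, reverse=True)

    def lookup(self, dst_ip):
        addr = ipaddress.ip_address(dst_ip)
        for net, hop in self.routes:
            if addr in net:          # first hit wins: longest prefix match
                return hop
        return None

bng = MiniBNG()
bng.add_route("0.0.0.0/0", "uplink")         # default route toward the Internet
bng.add_route("10.0.0.0/8", "local-fabric")  # traffic staying inside the site
print(bng.lookup("10.1.2.3"))   # local-fabric
print(bng.lookup("8.8.8.8"))    # uplink
```

Hardware does this lookup with TCAMs in a single cycle; the point of the sketch is only the table structure, not the speed.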
In principle there is no problem implementing any of these tasks in software in a data center. At modest packet rates, the data-plane tasks like address mapping, inspection, policy application, and routing can be distributed across many server CPU cores and handled in parallel. Until you get to more advanced methods of content inspection for security purposes, there is no linkage between packets that would prevent parallel processing. But as packet rates increase into the range envisioned for a large 5G network—let alone for the speeds of the networks inside data centers—scaling becomes an issue.
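The flow-level parallelism described above is typically achieved by hashing each packet's flow identifier (its five-tuple of addresses, ports, and protocol) to choose a worker core, so that all packets of one flow stay in order on one core while unrelated flows spread across the pool. A minimal sketch, with a hypothetical packet format:

```python
import zlib

NUM_WORKERS = 8

def worker_for(pkt):
    """Pick a worker core by hashing the flow five-tuple. Packets of the
    same flow always land on the same worker, so no cross-core ordering
    or locking is needed. The packet dict format is invented for this sketch."""
    flow = f"{pkt['src']}:{pkt['sport']}>{pkt['dst']}:{pkt['dport']}/{pkt['proto']}"
    return zlib.crc32(flow.encode()) % NUM_WORKERS

a = {"src": "10.0.0.1", "sport": 40000, "dst": "8.8.8.8", "dport": 53, "proto": "udp"}
b = dict(a)  # another packet of the same flow
assert worker_for(a) == worker_for(b)   # same flow, same core
```

This is essentially what NIC receive-side scaling does in hardware; the scaling limit the text describes appears when a single flow, or a shared inspection state, exceeds what one core can handle.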
Doing a function in software is necessarily slower and more energy-consuming than doing it in purpose-built hardware: often dramatically slower. And as system integrators compose a virtual BNG (VBNG, if you will) out of software functions, another issue arises. In hardware BNG data planes, functional blocks are pipelined with only shallow buffers between stages. But in a virtualized system, moving packets between tasks often means writing them from one core’s cache to a shared L2 or L3, or worse, to DRAM, or much worse, across the backplane network to another server card’s DRAM, and then into another core’s L1 cache. What had been a local FIFO write and read in hardware becomes a cascading burst of cache, DRAM, and even network activity, with concomitant risk of saturating delicate DRAM channels or sending caches into thrashing. And these issues will be traffic-dependent, not readily predictable at design time. This has led many implementers to look at hardware accelerators as alternatives to pure software implementations of data-plane functions.
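The hand-off cost can be seen even in a toy model. Below, two stages of a hypothetical data plane run on separate threads joined by a queue; each put and get stands in for the shared-cache or DRAM traffic that replaces an on-chip FIFO in hardware (the stage functions and packet format are invented for illustration):

```python
import queue
import threading

# Two-stage toy data plane: a 'parse' stage feeds a 'route' stage through a
# bounded queue. In hardware this boundary is a shallow on-chip FIFO; in a
# virtualized system every put/get crosses shared cache or DRAM, which is
# exactly where the costs described above accumulate.
q = queue.Queue(maxsize=64)
results = []

def parse_stage(raw_packets):
    for raw in raw_packets:
        q.put({"dst": raw})   # hand-off: a memory write another core will read
    q.put(None)               # end-of-stream marker

def route_stage():
    while (pkt := q.get()) is not None:
        results.append(("uplink", pkt["dst"]))  # trivial stand-in for routing

t = threading.Thread(target=route_stage)
t.start()
parse_stage(["8.8.8.8", "10.1.2.3"])
t.join()
print(len(results))  # 2
```

With two stages the traffic is modest; chain six or eight stages across server cards and the hand-offs themselves can dominate, which is the argument for acceleration that follows.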
The first question with hardware acceleration is what to accelerate. And the easiest answer is usually “everything in the data plane.” Some developers of VBNG software have taken exactly this approach, linking their VBNG code to a commercial hardware BNG box to handle the most demanding packet streams. This can make sense for an edge routing application, but it scales less gracefully to a full-blown 5G central office.
Another approach would be to select some specific, particularly demanding tasks and direct their data flows through a chip-level accelerator. FPGAs are filling this role today, because they are already used in hardware packet processing—often in the very hardware boxes the VNF developer is trying to virtualize—they have already been integrated into data centers as accelerator chips, and they can be quickly reconfigured to accelerate different tasks. And the approach scales: distributing FPGAs through a data center as offload engines can actually reduce total cost of ownership compared to scaling out to more server cards.
That leaves another vital question: where to put the accelerator. The chip can be closely coupled to the server CPU via a dedicated coherent port that allows the FPGA access to CPU-chip caches. It can be used as a smart network interface controller, terminating the backplane network on the server card and communicating with the CPU caches via PCI Express* (PCIe*). Or the accelerator chips can have their own high-speed network, shadowing the CPUs on the backplane network with fast gateways between the two layers. Each approach requires a different level of architectural involvement, and each has its own level of ability to deal with data bottlenecks and to offer agility as workloads shift.
The Packet Core
Another key block for the 5G world is the Evolved Packet Core (EPC). The EPC was defined by the 3GPP standards organization as the set of core functions by which 4G LTE networks would unify both data networking and voice telephony in a single packet-switched Internet Protocol (IP) network.
The EPC block is a superset of a BNG, and contains four major functions, according to the 3GPP definition (Figure 3). Two of the functions are gateways that combine to form a BNG between the radio access network’s baseband processing units and external IP networks such as the carrier’s metro network or the Internet. A third function works through this BNG to handle signaling between client devices and the network control plane. This block is responsible for mobility processing: tracking client devices as they appear at different base stations, paging them when they are idle, and ensuring that connections are secure and authorized properly. This block is assisted by a fourth function, essentially a database of subscriber data and a supporting player in managing mobility, set-up, and authentication.
Some VNF vendors have implemented full virtualized EPCs (vEPCs, naturally). Another approach would be to implement a 5G-directed VBNG, and provide application programming interfaces (APIs) to software modules for the other two EPC functions. This should be effective, as most of the operations done in these two supporting blocks are happening at connection and tracking speeds, not packet speeds.
However VNF developers and system integrators choose to implement the 5G core—as an atomic vEPC, as a VBNG integrated with supporting VNFs, or as an integration of many smaller functions—virtualization also requires some new blocks that have no direct equivalent in hardware networking equipment. These involve MANO: the management and orchestration of the VNFs. The European Telecommunications Standards Institute (ETSI) has thoughtfully grouped these blocks into three virtualized components: an NFV Orchestrator that keeps track of available VNFs, their active instances, and the infrastructure resources available in the data center for them; a VNF Manager that oversees VNF instances from creation to deletion; and a Virtualized Infrastructure Manager that works with the data-center operating system and hypervisor to get and release the CPU, accelerator, memory, network, and storage resources the VNFs use. The framework also defines the connections between these components and between them and the outside world in a set of open APIs.
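The division of labor among those components can be sketched, very loosely, as a catalog plus a resource ledger. Everything below (class name, methods, core counts) is invented for illustration and does not follow the actual ETSI APIs:

```python
import itertools

class NfvOrchestrator:
    """Loose sketch of the MANO roles collapsed into one object: a VNF
    catalog (orchestrator), instance lifecycle tracking (VNF Manager),
    and a resource pool (Virtualized Infrastructure Manager)."""

    def __init__(self, total_cores):
        self.catalog = {}              # VNF name -> cores needed per instance
        self.instances = {}            # instance id -> VNF name
        self.free_cores = total_cores  # stands in for the infrastructure pool
        self._ids = itertools.count(1)

    def onboard(self, name, cores):
        self.catalog[name] = cores

    def instantiate(self, name):
        cores = self.catalog[name]
        if cores > self.free_cores:
            raise RuntimeError("insufficient infrastructure resources")
        self.free_cores -= cores       # VIM role: claim resources
        iid = next(self._ids)
        self.instances[iid] = name     # VNF Manager role: track the instance
        return iid

    def terminate(self, iid):
        name = self.instances.pop(iid)
        self.free_cores += self.catalog[name]  # release resources on deletion

mano = NfvOrchestrator(total_cores=16)
mano.onboard("vbng", cores=8)
mano.onboard("vfw", cores=4)
i1 = mano.instantiate("vbng")
i2 = mano.instantiate("vfw")
mano.terminate(i2)
print(mano.free_cores)  # 8
```

The real framework separates these roles into distinct components with open APIs between them precisely so that catalogs, lifecycle managers, and infrastructure can come from different vendors.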
A full set of VNFs, including those we have looked at, another vital category that virtualizes the radio access network’s baseband functions, and miscellaneous other blocks, constitutes the elements of a 5G central-office network. A system integrator would select VNFs—integrators are rarely VNF developers except by necessity—and interface them together with network management and administration functions in a framework comprising MANO functions, the data center operating system (OS) and hypervisors, and a development environment. This is not a hypothetical scenario. Communications service providers are working with integrators today, especially in China, to implement and field-test portions of this virtualized network, with every intention of having it ready for the first 5G client deployments late next year or early in 2020.
But this is only one rather specialized use for the NFV concept. Component pieces—the little blocks that do routing, inspection, policy enforcement, crypto functions, firewalling, and so on—will have wide application across other markets. And the 5G movement’s growing focus on open-source APIs and VNFs will make it far easier for developers and integrators in other applications to import these basic components into their own virtualized networks.
One obvious application is enterprise networking. Most large enterprises already have extensive networks and large data centers, making the shift to virtualized networks natural. But in enterprise networks the clients are relatively static and the traffic fairly predictable compared to a full 5G network of the future. So, there is no need for the mobility functions in the vEPC. A software-configurable collection of router, policy-enforcer, IPsec, and BNG instances, supported by MANO functions, should be more than adequate. The transition may require added CPU and DRAM resources and upgrading of the data center’s internal data networks. And particularly intense traffic hotspots or challenging workloads will want hardware acceleration. But all of this is feasible under the umbrella of the work being done for 5G.
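Such a software-configurable collection might be handed to the MANO layer as a declarative service chain. The structure below is purely illustrative: the field names and VNF types are assumptions for this sketch, not any standard descriptor format.

```python
# Hypothetical enterprise service chain for a MANO layer to instantiate.
# Field names, VNF types, and the "accel" hint are invented for illustration.
service_chain = {
    "name": "branch-uplink",
    "vnfs": [
        {"type": "router",   "instances": 2, "accel": None},
        {"type": "policy",   "instances": 2, "accel": None},
        {"type": "ipsec",    "instances": 1, "accel": "fpga"},  # crypto offload
        {"type": "bng",      "instances": 1, "accel": "fpga"},
    ],
    # Packet path through the chain, in order:
    "links": [("router", "policy"), ("policy", "ipsec"), ("ipsec", "bng")],
}

def total_instances(chain):
    """How many VNF instances the orchestrator must place and resource."""
    return sum(v["instances"] for v in chain["vnfs"])

print(total_instances(service_chain))  # 6
```

The appeal of a declarative form is that scaling for a hotspot is an edit to an instance count, not a hardware purchase.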
The rise of edge computing, to the extent that it puts miniature data centers at the network edge, extends this idea further down-market. The ability to just connect all the local aggregation points such as WiFi routers on a site directly into an edge-computing rack and virtualize all the local network functions right there could be quite appealing—especially if MANO can be handled remotely from the big enterprise data center. But in this limited environment, the ability to parallelize away bottlenecks by adding CPU cores may be more limited, necessitating greater reliance on accelerator chips.
We can take this progression even further. Just as enterprises are shedding private data centers in favor of cloud computing, some vendors are proposing networking as a service (NaaS): linking aggregation points via the Internet or private connections to a public cloud data center, where a service provider would operate a virtualized network core.
This could require careful attention to the VNFs, which would now have to conform to the cloud service provider’s rules, and would have to be developed, executed, accelerated, and interconnected through the cloud service provider’s platform. But once again the basic suite of functions should be adaptable from those developed for enterprise networks.
Finally, a further extension. As cable companies shift from pushing video toward being internet access providers, their physical infrastructure is looking more and more like a conventional, if quite asymmetric, data network. The head end, once a mass of disk drives, video cables, and switches, is looking more and more like a data center. So, the transition to a virtualized network implemented by VNFs in the head end is entirely natural. This shift is still in the investigative stage, and will almost certainly require additional, domain-specific VNFs. But cable companies and their jointly sponsored R&D facility CableLabs are busy laying the groundwork.
We have seen that many applications that rely on networks, or in some cases, on interconnections that merely resemble data networks, are moving to NFV. Are there implications outside the networking space?
Certainly, one observation is that if your system design relies on a private IP network—whether it is a plant-wide industrial Ethernet or just an IP spine linking the sensors in a box—you need to keep an eye on NFV. It may be about to offer you some interesting alternatives.
But beyond that, the underlying lesson is that more and more of today’s hardware functions—across many domains, not just networking—are ending up in software. As they go through the migration, the experience of NFV can offer some important ideas.
First, in defining functions, hardware blocks don’t just map into CPUs. It is vitally important to consider the data flows between functions, especially when the data is streaming. Connections that start out as wires or small FIFO buffers in hardware can end up as shared memory structures in the virtualized system, and can put crushing loads on server cache and DRAM bandwidths.
Second, hardware acceleration of the virtualized function may be necessary to match legacy hardware performance. Just parallelizing the code and adding CPUs may hit diminishing returns—either through creating shared-memory bottlenecks or simply through the ill will of Amdahl’s Law—before you reach high enough throughput.
Finally, moving to virtualized functions in a data center requires creation of new kinds of blocks—for orchestration, function management, and working with data-center resources—that aren’t present in hardware-centric designs. And code development for data centers is culturally very different from embedded-system programming.
There are new opportunities, new challenges, and new cultures here. It is not going to be a boring future.
For Further Reading
See an overview of Intel® FPGAs in NFV.