You lusted after the early demonstration units. You watched with growing anticipation as more vendors announced products. As prices began to fall, you laid your plans. And then at the perfect moment, you pounced! A 4K ultra-high definition (UHD) TV, with a screen the size of a dining table, a two-page list of features, and a final price well under $1000 was yours.
You’ve taken delivery, unpacked all the bits, and several volumes of easy instructions later, you’ve set it up. Now let’s watch some UHD.
The answer to the question you are about to ask is a multi-part story involving standards, psychology, politics, economics, and a fascinating new chapter in digital microelectronics. And it ends with you actually getting to watch true 4K content on your new TV, eventually.
One of the issues delaying the arrival of UHD content has been the breadth of needs the new standard must span. In the transition from standard definition to high definition, just about everybody agreed that even lowly 720-line progressive-scan (720p) looked a lot better than standard 525-line interlaced-scan broadcast video. But now we are starting to see diminishing returns, or, more accurately, starting not to see them. In the transition from high definition (HD), which is now often 1080p, to 4K UHD there is less universal agreement about what will make a noticeable difference in the viewing experience, and consequently about where the effort should go first.
“The Society of Motion Picture and Television Engineers (SMPTE) did a survey,” relates Altera Corporation strategic marketing manager, Raemin Wang. “They asked members—who pretty much are the experts on TV quality—what was the most important thing to improve for the next generation: resolution, color depth (the number of bits used to represent color in each pixel) or frame rate. The respondents chose color depth, followed by frame rate, with resolution a distant last.”
And there is yet another independent variable: refresh rate. There are arguments of near-religious intensity for frame rates all the way from 24 frames per second (FPS) to upwards of 300 FPS. But for most people, the whole point of 4K is increased resolution, and they associate that number with a vastly improved viewing experience. They have no idea how much fine print comes after the 4K.
In the manner of standards bodies everywhere, the International Telecommunication Union (ITU), keeper of the UHD definition, has tried to accommodate nearly everyone. The result is a remarkable array of possibilities. Resolution can be 3840 by 2160 (4K) or 7680 by 4320 (8K) pixels per frame. The frame rate can be up to 120 FPS. The color space, as defined by ITU-R Recommendation BT.2020, may be significantly larger than for HD video (although a given program may or may not use all of it) and may be represented by 10-bit or 12-bit chroma values in any of a number of formats.
A UHD video stream, then, can employ many different combinations of parameters. And those parameters are influenced by three major external issues. One is the source of the content, another is the capability of the production studio, and the last is the distribution medium through which it will travel to the viewer. The interaction of these issues helps determine what actually goes into the video stream, what goes on in the production studio, and what will be required of perhaps the most critical piece of our story, the video compression process. Finally, it will fall to the consumer electronics at the end of the chain to turn what it receives into the best possible experience (Figure 1). Each of these issues contributes to answering our original question: where is my content?
The source of content for UHD programming can influence what happens in the steps after capture and, ultimately, your viewing experience. Today, a good portion of newly created material is captured in native 4K UHD. Most new professional cameras support 4K over a range of frame rates. So a fair amount of news, studio programming, sports, and direct-to-video production is starting out in 4K, often at a fairly high frame rate to suppress motion artifacts. But don’t get excited yet: it probably won’t reach you in that format.
Older material is quite a different matter. Existing video, even from fairly recent productions, will be in one or another of the various HD formats, or even in legacy standard-definition format. Movies present yet another set of issues. If the studio has a good print and 4K scanning equipment, the input will be a true 4K UHD stream, usually at 24 FPS. Without a print, the studio is stuck with whatever video format they have as an input.
The typical UHD studio then has to accept a wide range of input formats and transcode them, either into a common internal working format or directly into whatever format the distribution head-end requires. This transcoding is not a purely mathematical process of interpolation. There is a good deal of judgment, and of art, in tasks such as preserving the geometry and texture of objects when you are expanding one scan line of source material into two or four lines of output. This is especially the case with human facial features, to which human viewers are remarkably sensitive. Changing frame rates can be an equally fraught issue. Some experts, for example, insist that movies be at 24 FPS, even if that means preserving (or digitally recreating) the slight jerkiness of motion and the blurring of fast-moving objects that make up part of the unconscious experience of watching film projected in a theater. Many of these decisions are at least partly aesthetic, and make up the experiential signature of a particular studio.
In many studios carefully planned transcoding will be the first processing step after content comes in. From there, a typical program might go through editing, video processing for special effects, post-processing, compression, and forwarding to distribution. At each stage the transition to UHD forces a significant increase in processing rate and local storage capacity. In some cases, UHD will also require new algorithms. All of these upgrades must be in place before 4K UHD programming can begin its journey from the studio to your new TV.
Most of the video processing, especially for near-real-time content like live news feeds or sports, is in effect composed of fairly simple Boolean or arithmetic merge operations. Such operations can be done in flow-through pipelines, and often allow segmenting of the frame so that several pipelines can run in parallel. Thus for most common video switch operations the main impact of 4K on the switch is the need for faster processing and larger buffer memory. At 2160p and 60 FPS, we are talking about roughly half a gigapixel per second, so the speed requirements are not trivial. But with some use of parallel processing and careful attention to memory use, the task is within the reach of modern FPGAs, ASICs, and even multicore server software.
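To see where the figure of roughly half a gigapixel per second comes from, the arithmetic can be sketched in a few lines of Python. The function names are illustrative only, and the even split of the frame across pipelines is an assumption made for simplicity; real equipment may segment frames differently.

```python
def pixel_rate(width, height, fps):
    """Pixels per second in an uncompressed stream of the given format."""
    return width * height * fps


def per_pipeline_rate(width, height, fps, pipelines):
    """Pixel rate seen by each pipeline, assuming the frame is split into
    equal segments (an idealisation, not a description of any product)."""
    return pixel_rate(width, height, fps) / pipelines


if __name__ == "__main__":
    for fps in (24, 30, 60, 120):
        rate = pixel_rate(3840, 2160, fps)
        print(f"2160p at {fps} FPS: {rate / 1e9:.3f} gigapixels per second")
    share = per_pipeline_rate(3840, 2160, 60, 4)
    print(f"Four parallel pipelines at 60 FPS: {share / 1e6:.1f} megapixels per second each")
```

At 60 FPS the stream works out to 497,664,000 pixels per second, which is the "half a gigapixel" in the text. At the 120 FPS ceiling the rate doubles.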
Of course there are studio effects, such as dynamic conformal mapping, that require per-pixel matrix operations and can consume very large numbers of operations per pixel at 4K. Such effects are often done off-line, but equipment vendors have to be aware of which functions studio engineers will expect—and which directors will want—to be executed in real time.
An extreme example would be the mapping used to wrap the face of actor Paul Walker onto body doubles to complete the movie Fast and Furious 7 after Walker’s untimely death. Today such operations are painstaking off-line tasks and only available in a few specialist shops. But could synthesizing an actor’s face be done in-studio during a commercial break, or in time for a simulated instant replay? Such questions put a commercial value on computing headroom, even as 4K resolution is gobbling up the machine cycles.
While equipment vendors and studio engineers plan their moves to 4K, the landscape is shifting beneath their feet. Dedicated hardware is giving way to virtualized functions: software running on servers. This evolution is abetted by the reorganization of studios from hardware silos into networks spreading around a central media-storage system (Figure 2). Adding to the change, those networks are moving from dedicated video transport links to 10G Ethernet, using the SMPTE 2022 protocol to encapsulate the UHD video for transport over Internet Protocol (IP). Some experts say that all video transport in studios will be over IP within a few years.
Thus studio engineers face a multivariate timing problem, in which miscalculation can mean having to rip out and replace good equipment well before its normal end of life. They must produce 4K content from a variety of sources, as soon as possible. To do this they must purchase new equipment and software—just as the nature of the network is changing and functions are moving from hardware to software.
This challenge is making studio managers understandably cautious about their spending commitments. “The move to 4K is just filtering into the studio,” says Wang. “Video capture is pretty much there. But 4K transport and processing capabilities are more rare.”
One function in particular illustrates the technical challenges faced by equipment developers and the hard choices faced by studio engineers. That function is the H.265 codec, formally known as High Efficiency Video Coding (HEVC).
The raw 4K video stream requires four times the bandwidth of today’s 1080p HDTV. Inside the studio, that means faster networks and more storage capacity, both aided by new lossless compression algorithms, and it means more processing power. But out in the distribution network (the satellites, cable wiring, Internet connections, and Blu-ray players that actually move signals to consumers) quadrupling the available bandwidth is not economically feasible. Nor is greater compression using the existing H.264 Advanced Video Coding (AVC) codec a viable answer. Viewers will attest that H.264 is already being stretched to its limits for acceptable picture quality (angry cable customers in the US would say well beyond them) as distributors try to force ever more bits through existing infrastructure. For 4K we need stronger compression.
The HEVC standard delivers exactly that: half the bit rate of H.264 for the same perceived picture quality. That won’t make up for a factor of four—4K will still demand more infrastructure build-out for service providers—but it at least makes a solution feasible.
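The net effect of those two factors can be worked out directly. The sketch below uses only the figures already given here (four times the pixels, half the bit rate for the same perceived quality). The function name is hypothetical, and the assumption that bit rate scales linearly with pixel count at constant quality is a simplification.

```python
def relative_bitrate(pixel_ratio, codec_bitrate_factor):
    """Bit rate of the new stream relative to the old one.

    Assumes bit rate scales linearly with pixel count at constant quality,
    which is a simplification; real encoders do not behave this neatly.
    """
    return pixel_ratio * codec_bitrate_factor


# Four times the pixels per frame: 3840x2160 versus 1920x1080.
PIXEL_RATIO_4K_VS_1080P = (3840 * 2160) / (1920 * 1080)

# From the article: HEVC needs half the bit rate of H.264 for the same
# perceived picture quality.
HEVC_VS_H264 = 0.5

if __name__ == "__main__":
    factor = relative_bitrate(PIXEL_RATIO_4K_VS_1080P, HEVC_VS_H264)
    print(f"4K over HEVC needs about {factor:.1f}x the bit rate of 1080p over H.264")
```

Under these assumptions the 4K stream still needs about twice the bit rate of its 1080p predecessor, which is why HEVC narrows the gap without closing it.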
To accomplish this feat, the HEVC encoder not only demands more processing than an H.264 encoder but also introduces new functions to the encoding process. As a result, the encoder could require more than an order of magnitude more operations per pixel than H.264 did. Through very careful design the developers of the codec managed to keep the decoder significantly less demanding. Consequently HEVC decoders can be implemented in software, even on relatively small processors such as ARM® cores. But real-time encoders will require new hardware.
The encoder employs all the familiar tricks H.264 uses: discrete cosine transform compression of blocks of pixels, elimination of information that is unchanged between frames using motion-compensated prediction, and prediction within a frame from pixels already decoded. H.264’s best technique for coding the remaining signal, context-adaptive binary arithmetic coding (CABAC), is extended for more efficient compression. But then HEVC adds new techniques, based, according to Altera senior manager Neal Forse, on limitations of human visual perception.
H.265 improves these existing tools and adds more: new filters defined to reduce errors in the coded image, more accurate transforms in more sizes, and much more efficient ways to encode motion from one frame to the next. CABAC has been overhauled to remove some of its bottlenecks. The most important addition is a wide range of block sizes, from 4×4 pixels to 64×64. The encoder has much more freedom than before to choose block sizes and types within a region to achieve the desired combination of compression ratio and image quality for a given frame. The best choice cannot be computed analytically, and when you factor in inter-frame encoding and motion estimation, even where to look for the best choice may not be at all obvious. So the encoder must search for the best choices of partition sizes from a vast range of possibilities by applying heuristics and then trying likely alternatives, a process that yields incrementally better compression in exchange for a rapidly increasing investment of computing resources. But it is a process based on experience and lots of viewing hours, not purely on mathematical analysis.
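The flavor of that search can be illustrated with a toy quadtree partitioner. This is a minimal sketch, not the HEVC algorithm or any vendor's encoder: the cost function (pixel variance times block area, plus a fixed per-block overhead) is a stand-in for a real rate-distortion cost, and the names `partition`, `overhead`, and `flat_threshold` are hypothetical. The one heuristic shown is an early exit that skips trying to split blocks that are already nearly flat.

```python
def block_variance(frame, x, y, size):
    """Population variance of the size x size block whose top-left corner is (x, y)."""
    pixels = [frame[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def partition(frame, x, y, size, min_size=4, overhead=50.0, flat_threshold=1.0):
    """Return (cost, blocks) for the cheaper of coding the block whole or split.

    Toy cost model (an assumption): variance * area approximates the bits
    needed for the residual, and `overhead` approximates per-block signalling.
    """
    variance = block_variance(frame, x, y, size)
    whole_cost = variance * size * size + overhead

    # Heuristic early exit: a nearly flat block is unlikely to benefit from
    # splitting, so skip the expensive recursive trial.
    if size <= min_size or variance < flat_threshold:
        return whole_cost, [(x, y, size)]

    half = size // 2
    split_cost, split_blocks = 0.0, []
    for dy in (0, half):
        for dx in (0, half):
            cost, blocks = partition(frame, x + dx, y + dy, half,
                                     min_size, overhead, flat_threshold)
            split_cost += cost
            split_blocks.extend(blocks)

    if split_cost < whole_cost:
        return split_cost, split_blocks
    return whole_cost, [(x, y, size)]


def demo_frame():
    """A 16x16 flat grey frame with one bright 4x4 patch in the top-left corner."""
    frame = [[100] * 16 for _ in range(16)]
    for r in range(4):
        for c in range(4):
            frame[r][c] = 200
    return frame


if __name__ == "__main__":
    cost, blocks = partition(demo_frame(), 0, 0, 16)
    print(f"cost={cost}, blocks={blocks}")
```

On the demo frame the search keeps the three flat 8×8 quadrants whole and splits only the quadrant containing the bright patch into four 4×4 blocks. A real encoder weighs far more options per block (prediction modes, transform sizes, motion vectors), which is where the cost in computing resources comes from.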
To make the huge increase in computing feasible (four times more pixels and many times more options per pixel to consider), HEVC introduces two ways of breaking the image up for parallel processing: tiles and wavefronts. Neither approach entirely eliminates the need for interprocessor communication or for restitching the image at the termination of the parallel threads, so they have different implications for different kinds of processors. What works well for a multicore CPU may not work as well for a GPU. And an FPGA or ASIC design may forgo breaking up the image altogether in favor of fast pipelined operation.
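The wavefront idea can be sketched as a simple scheduling exercise. The model below rests on stated assumptions: the picture is divided into 64×64 blocks, every block takes one time step, there is one worker per block row, and each row may start only once the row above is two blocks ahead (the lag usually described for HEVC wavefront processing). The function names are hypothetical, and the model ignores tiles and the communication costs mentioned above.

```python
import math


def wavefront_schedule(rows, cols, lag=2):
    """Earliest start step for each block under an idealised wavefront.

    Assumes unit time per block and one worker per row. Block (r, c) waits
    for its left neighbour and for the block `lag - 1` columns to its right
    in the row above (clamped at the row's last block).
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            after_left = start[r][c - 1] + 1 if c > 0 else 0
            if r > 0:
                dep_col = min(c + lag - 1, cols - 1)
                after_above = start[r - 1][dep_col] + 1
            else:
                after_above = 0
            start[r][c] = max(after_left, after_above)
    return start


def makespan(rows, cols, lag=2):
    """Total time steps needed to finish the whole picture."""
    return wavefront_schedule(rows, cols, lag)[rows - 1][cols - 1] + 1


if __name__ == "__main__":
    block = 64  # assumed block size in pixels
    cols = math.ceil(3840 / block)
    rows = math.ceil(2160 / block)
    serial = rows * cols
    parallel = makespan(rows, cols)
    print(f"{rows} rows x {cols} cols: {serial} steps serially, {parallel} with wavefronts")
```

Even in this idealised model the speedup falls well short of the number of rows, because rows at the top finish while rows at the bottom are still waiting to start. That ramp-up and ramp-down is one reason the technique suits some processor types better than others.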
In general HEVC presents the designer with much room for creativity. The specification defines an output bitstream and a set of transforms that may be used. It doesn’t specify algorithms for choosing the parameters or implementing the transforms. So long as you have complied with the bitstream format, there are no right or wrong answers: codec quality is based on subjective judgment of video clips as rated by golden eyes in armchairs, presumably with large buckets of buttered popcorn.
“The HEVC algorithms require iterative development,” says Altera senior manager Neal Forse. “The longer you work at it, the better your codec can be.” This in turn means content providers depend on HEVC encoder developers to achieve the bit-rate/quality point they need for their business objectives.
It is that interaction of algorithms, subjective judgment, and business goals that will determine the fate of 4K. Even with HEVC compression a 4K bitstream will be twice the size of its 1080p equivalent if you are going to maintain the 4x advantage in image quality. But that is a big if. Blu-ray discs can in principle deliver enough bits per second that, with HEVC compression, they can deliver a better viewing experience from 4K than they did from 1080p. The Blu-ray consortium is working frantically to get the necessary electronics into the market for the Christmas 2015 selling season. Some satellite and over-the-top Internet streaming services are already providing 4K service, as is at least one US cable TV company. But the exact bit rates, compression ratios, and image qualities of these services are not public information. Some of the Internet streaming services have said that the user needs to have at least 25 megabits per second (Mbps) Internet access, which gives a hint.
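As a rough illustration of what that hint implies, the sketch below compares an uncompressed 4K stream with a 25 Mbps channel. Everything beyond the 25 Mbps figure is an assumption made for the sake of the arithmetic: 10-bit samples, 4:2:0 chroma subsampling (1.5 samples per pixel), and the two frame rates shown. The recommended access speed is also an upper bound on the stream, not its actual bit rate, so the true compression ratios are likely higher than these.

```python
def raw_bitrate_bps(width, height, fps, bit_depth, samples_per_pixel):
    """Uncompressed bit rate of the active picture, in bits per second."""
    return width * height * fps * bit_depth * samples_per_pixel


def implied_compression_ratio(raw_bps, channel_bps):
    """How many times the raw stream must shrink to fit the channel."""
    return raw_bps / channel_bps


CHANNEL_BPS = 25e6          # from the article: 25 Mbps Internet access
BIT_DEPTH = 10              # assumption: 10-bit samples
SAMPLES_PER_PIXEL = 1.5     # assumption: 4:2:0 chroma subsampling

if __name__ == "__main__":
    for fps in (24, 60):    # assumption: film rate and a common video rate
        raw = raw_bitrate_bps(3840, 2160, fps, BIT_DEPTH, SAMPLES_PER_PIXEL)
        ratio = implied_compression_ratio(raw, CHANNEL_BPS)
        print(f"{fps} FPS: raw {raw / 1e9:.2f} Gbps, at least {ratio:.0f}:1 compression")
```

Under these assumptions the encoder has to discard well over 99 percent of the raw bits, roughly 120:1 at 24 FPS and 300:1 at 60 FPS.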
In these early days of limited infrastructure and high compression ratios, a great deal of responsibility falls on the HEVC encoder and decoder. Obviously, the encoder has to work as hard as it possibly can to deliver a viewing experience better than 1080p through a channel that is not twice as wide. And it must provide enough redundant information to allow the decoder to correct errors quickly. But the decoder—far more limited in computing resources—will bear the greatest responsibility: it will have to take a signal with compression artifacts, missing packets, and non-deterministic timing and from it reconstruct a great-looking experience. Especially in the beginning, much of the subjective quality of the 4K experience will depend on the proprietary image-improvement and error-recovery algorithms in lowly set-top boxes and TV receivers.
Peril lies to either side of perfect strategy for the service providers. If a wide selection of 4K content takes too long to emerge—now that the TV set manufacturers have started the clock running with steep price drops—viewers may get tired of waiting and flock to some other next big thing. But if service providers push too hard on compression ratios to cram 4K content through their existing infrastructure, poor results will persuade viewers that 4K doesn’t look as good as 1080p.
Either way, get it wrong and 4K UHD ends up like 3D TV: filed under i, for irrelevant. But get it right, and rapid consumer acceptance will yield excellent returns on a new generation of equipment, from cameras to IP-based virtual studios to long-needed investment in distribution networks and Internet access infrastructure. It is a prize worth pursuing. And that proud new 4K UHD TV owner is sitting in front of his screen, waiting, for now.