Over the next two or three years, streaming producers will increasingly be tasked with supplying optimized video streams to devices as disparate as cell phones and set top boxes, along with different quality versions for users accessing the content over the general Internet. While there have been multiple proprietary approaches to this problem, including Microsoft’s multiple bit rate video, one very strong candidate will be an H.264 extension called Scalable Video Coding.
In the early days of streaming, web sites had to supply multiple streams to satisfy users connecting at different speeds, and web sites with different icons for modem, ISDN and LAN connections were very common. With three streams, this tripled the administrative burden of encoding and linking the files, and greatly complicated distribution over content delivery networks, particularly edge networks where files were stored for local delivery.
The most prevalent streaming suppliers of the day, RealNetworks and Microsoft, both developed technologies to reduce the problem, most notably Microsoft’s multiple bitrate technology, which encapsulated multiple streams into one file. Not only did this reduce the administrative burden associated with multiple files, but a Microsoft streaming server could also dynamically adjust to changing line conditions, sending a lower bit rate stream when the player reported packet loss.
Once broadband became pervasive and modems disappeared, the problem largely went away, as one 500 kbps stream could satisfy virtually all relevant classes of users. With video over cellular becoming increasingly important, and high bitrate streams to set top boxes in the living room also on the roadmap for many streaming producers, efficiently delivering different quality streams to multiple devices over various connection bandwidths is again becoming critical.
At least three new technologies have moved into the space: Adaptive Streaming from Move Networks, Dynamic Streaming from Adobe, and Smooth Streaming for Silverlight from Microsoft. Within the context of a relatively closed system (a single server delivering to a single class of player, with no delivery network), all of these technologies are very sound solutions.
However, once you involve a content delivery network, or an alternative form of transport or player, proprietary systems start to break down. For example, to add a CDN to the mix for any of the three technologies mentioned, the CDN would have to support the proprietary adaptive technology, which takes time and investment. For set top boxes and cellular phones to play the streams, they would also have to support the proprietary technology, which the manufacturers of these devices are loath to do.
For the streaming market to successfully expand to the living room and cellular markets, it’s very likely that a vendor-agnostic standard for adaptive streaming will have to emerge. Fortunately, there’s one handy – the Scalable Video Coding extension to H.264 (H.264 SVC) – and it offers a very efficient and elegant solution to the problem of supplying multiple streams.
H.264 Scalable Video Coding
H.264 SVC encodes video into “layers,” starting with the “base” layer, which contains the lowest level of detail spatially (resolution), temporally (frames per second) and in terms of quality (the fidelity of each frame). Additional layers can increase the quality of the stream along any or all of these dimensions.
For example, the base layer of a stream might be encoded at 15 frames per second, a resolution of 320×240, and a data rate of 300 kbps. Additional layers could expand that stream to 720p video at 3 Mbps suitable for a set top box, with convenient stopping points for relatively high-quality streaming over the Internet, say at 640×480 and 30 fps at 600 kbps, and 720p at 2 Mbps. All the layers are incorporated into a single file, reducing the administrative expense of linking and distributing via CDNs.
Figure 1. MainConcept showed three streams of SVC encoded video at NAB.
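To make the idea of “stopping points” concrete, here is a minimal Python sketch of how a server or player might choose among the operating points in the example stream above. It is an illustration only, not part of the SVC standard or any vendor’s API: the names OperatingPoint, LADDER and pick_operating_point are hypothetical, the data rates are treated as cumulative (base layer plus every enhancement layer up to that point), and 720p is assumed to mean 1280×720 at 30 fps, which the example does not state.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """One decodable 'stopping point' in a layered stream (hypothetical type)."""
    label: str
    width: int
    height: int
    fps: int
    kbps: int  # assumed cumulative: base layer plus all layers up to this point


# Operating points taken from the example in the text.
# Assumption: 720p is 1280x720 at 30 fps; the text gives no frame rate for it.
LADDER = [
    OperatingPoint("base", 320, 240, 15, 300),
    OperatingPoint("internet-sd", 640, 480, 30, 600),
    OperatingPoint("internet-720p", 1280, 720, 30, 2000),
    OperatingPoint("set-top-720p", 1280, 720, 30, 3000),
]


def pick_operating_point(
    available_kbps: int, ladder: Sequence[OperatingPoint] = LADDER
) -> Optional[OperatingPoint]:
    """Return the highest-rate operating point that fits the available bandwidth.

    Returns None when even the base layer does not fit.
    """
    fitting = [p for p in ladder if p.kbps <= available_kbps]
    if not fitting:
        return None
    return max(fitting, key=lambda p: p.kbps)
```

Under these assumptions, a phone with roughly 700 kbps available would stop at the 640×480 point, while a set top box with 3 Mbps or more would receive every layer. The layers above the chosen point are simply not sent, so no separate file is needed for each device.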
Compared to other approaches, H.264 SVC is very efficient, as the SVC-encoded file should only be about 20% larger than the file necessary to supply the highest quality representation. In other words, if you were encoding the file to send to the set top box at 3 Mbps, the SVC-encoded file would have an overall data rate of about 3.6 Mbps. In addition, the base layer should be compatible with existing H.264 players, so no player upgrade will be necessary to view the base layer stream. With hardware encoders, streaming producers can convert their current formats to SVC-compatible streams on the fly, so video publishers like CNN and ESPN won’t need to convert their entire library to leverage the new technology.
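The arithmetic behind that figure can be sketched in a few lines of Python. The 20% overhead and the 3 Mbps top rate come from the text; the comparison against four separately encoded files at 300 kbps, 600 kbps, 2 Mbps and 3 Mbps is an added illustration using the rates from the earlier example, and the function names are hypothetical.

```python
def svc_total_kbps(top_rate_kbps: int, overhead_pct: int = 20) -> float:
    """Overall rate of the layered file: the top rate plus the SVC overhead."""
    return top_rate_kbps * (100 + overhead_pct) / 100


def separate_total_kbps(rates_kbps: list[int]) -> int:
    """Combined rate if every quality level is encoded as its own file."""
    return sum(rates_kbps)


# Rates from the earlier example, in kbps (illustrative only).
rates = [300, 600, 2000, 3000]

svc = svc_total_kbps(max(rates))        # 3 Mbps plus 20% overhead
separate = separate_total_kbps(rates)   # four independent encodes
savings = 1 - svc / separate            # fraction saved by the single layered file
```

With these inputs the layered file works out to 3,600 kbps against 5,900 kbps for four separate encodes, a reduction of roughly 39% in what has to be stored and distributed. Real-world overhead varies with content and encoder settings, so the 20% figure should be treated as the article’s estimate rather than a guarantee.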
You want it when?
When will H.264 SVC become generally available? In terms of widespread adoption, it’s going to be a while, because all three elements – encoder, server and player – must be SVC aware to fully leverage the technology. CDN support will also be necessary for larger outlets.
The wheels are in motion, however. For example, H.264 encoding vendor MainConcept showed a technology preview at IBC 2008 in Amsterdam (see the figure above), and is currently looking for technology partners to supply the end-to-end technology components – encoder, decoder and “network components.” However, the technology will likely find its initial implementations in closed systems, like Google’s Gmail Chat, which is based upon H.264 SVC technology licensed from Vidyo (http://blogs.zdnet.com/Google/?p=1176), and security applications like those offered by GE Security (http://www.university-gesecurity.eu/catalogue/VSC_SysSupport_Cert.htm).