This article compares H.264 to WebM, Google's new media format built around the VP8 codec, using three variables (encoding time, compressed quality, and CPU requirements for playback) on four personal computers. Here's the CliffsNotes version of the results: Using Sorenson Squeeze to produce both H.264 and WebM, the latter definitely took longer, but there are techniques that you can use to reduce the spread to less than 25%, which is pretty much irrelevant. Though H.264 offers slightly higher quality than the VP8 codec used by WebM at the aggressive (e.g., very low data rate) parameters that I tested, at normal web parameters, you couldn't tell the difference without a scorecard. Even compared to H.264 files produced with x264, VP8 holds its own.
The most significant difference between the technologies is the required CPU horsepower to play back the respective files, as shown in Table 1, which contains the results from four different computers. All numbers are “best case,” or the lowest CPU utilization in any of the tested browsers. More on test procedures later.
On a MacBook Pro with GPU acceleration for H.264 decoding, WebM took 38% of the total CPU to play back a 720p file, compared to 24% for H.264 played via Flash and 15% via HTML5 in Apple Safari. On an Acer Aspire One netbook without GPU acceleration for H.264, WebM was actually slightly more efficient than H.264 played back either via Flash or HTML5, though the difference wasn't significant. Note that the tests on this small-screen netbook involved a 640×480 file, not 720p.
On an HP 8710w mobile workstation with GPU acceleration for H.264 playback, H.264 via Flash required 70% less CPU power than WebM to play back the 720p file, and H.264 via HTML5 took 47% less CPU power. On my daughter’s iMac, WebM and nonaccelerated Flash-based H.264 playback ran neck and neck, while Apple’s Safari, presumably with hardware acceleration, proved 54% more efficient than WebM.
Basically, though, there are huge swings with the individual browsers. Where GPU acceleration exists for H.264, it’s significantly more efficient than WebM; where it doesn’t, the two formats run neck and neck. At this point, between Flash Player 10.1 with hardware acceleration on supported graphics cards and platforms and Apple’s own Safari browser, there are a lot of hardware-accelerated platforms for H.264 playback and few if any for WebM, though they may come in time.
Interestingly, on the WebM website, Google says, “Note: The initial developer preview releases of browsers supporting WebM are not yet fully optimized and therefore have a higher computational footprint for screen rendering than we expect for the general releases. The computational efficiencies of WebM are more accurately measured today using the development tools in the VP8 SDKs. Optimizations of the browser implementations are forthcoming.”
Truth be told, I'm not that much of a geek, so the low-level development tools were a nonstarter for me. However, I did download the DirectShow components to my two Windows computers and played the WebM file via Windows Media Player. On the HP 8710w, CPU load during playback of the same HD WebM file was 18% with all acceleration disabled, compared to a low of 70% in any of the tested browsers and 21% for hardware-accelerated Flash H.264 playback. On the Acer Aspire One, CPU load dropped to 24% (30% with hardware acceleration disabled), down from a low of 51% with any of the tested browsers and compared to 53% for non-hardware-accelerated Flash-based H.264 playback.
I’m from Missouri (the “Show Me” state) when it comes to all codec-related claims, so I’m not willing to assume that subsequent updates will reduce browser-based WebM playback loads to these levels. If that occurs, however, the value proposition for WebM as compared to H.264 changes to similar quality, a bit slower encode, but much lower playback requirements, which could be pretty compelling, particularly for low-powered mobile markets.
Now that you know the end of the story, let’s dive in at the beginning.
WebM: A Brief History
According to www.webmproject.org/about, WebM is a "royalty-free, media file format designed for the web." Briefly, WebM uses the VP8 video codec that Google purchased from On2, the Vorbis audio codec, and a file structure based upon the Matroska container. Though WebM is new, the VP8 codec itself was first launched on Sept. 13, 2008, and comes with some history and some baggage. The history is its predecessor, VP6, which came to prominence when Adobe bundled it into Flash and is still the most widely used video codec on the internet today. The baggage includes statements in the VP8 press release such as the following:
“With the introduction of On2 VP8, On2 Video now dramatically surpasses the compression performance of all other commercially available formats. For example, leading H.264 implementations require as much as twice the data to deliver the same quality video as On2 VP8 (as measured in objective peak signal to noise ratio [PSNR] testing).”
The press release continues: “In addition, the On2 VP8 bitstream requires fewer processing cycles to decode, so users do not need to have the latest and greatest PC or mobile device to enjoy On2 VP8 video quality.”
During the 11 months that followed the release, On2 never made VP8 available for testing, at least not to me or Streaming Media. And after Google signed the agreement to purchase On2 on Aug. 5, 2009, information about VP8 became tougher to find than President Obama’s fabled Kenyan birth certificate. Google closed the deal on Feb. 19, 2010, and launched WebM on May 19.
Google has never repeated the exact claims that On2 made in the initial press release. But when Google makes claims such as "highest quality real-time video delivery" and "low computational footprint," it does draw some skepticism. It's also important to note that VP8 is not a new technology; it's actually been around for close to 2 years, so it doesn't get the benefit of the doubt on initial quality or encoding-related issues. That said, you'd have to assume that all browser-related ports began after Google signed the agreement to purchase On2 (August 2009) and perhaps as late as after the deal closed in February, so there certainly could be room for improvement there.
That’s the chatty background. Let’s start looking at test results.
Encoding
For encoding, I looked at encoding speed and video quality; let's start with the former. As mentioned, I used a prerelease version of Sorenson Squeeze 6.0.4.63 to produce both the WebM and H.264 files. I produced two test files (one SD, one HD) using my long-standardized encoding parameters. For the SD file, this meant a target data rate of 500Kbps (468Kbps video/32Kbps audio), using two-pass VBR encoding with all quality-related settings set to the max. The Squeeze interface makes this simple, with only a few VP8-related encoding controls, such as trading off size versus complexity and compression quality versus speed (Figure 1).
Figure 1. Key VP8-related encoding options in the Squeeze interface. Note the Encoding Threads option.
Interestingly, one of the benefits that Google touts about WebM on its website is “Click and encode. Minimal codec profiles, sub-options; when possible, let the encoder make the tough choices.” I was certainly willing to do that and used Sorenson-provided presets for the most part, along with the required adjustments to meet my resolution and data rate targets.
For H.264, I used the High profile, with CABAC enabled, three B-frames, and three reference frames, and with encoding effort set to best. To complete the circle, I configured the HD test files at 720p at a VBR data rate of 800Kbps video/128Kbps audio (Figure 2).
Figure 2. The H.264-related options used in the test comparison.
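Squeeze doesn't expose its internals, but for readers who want to produce roughly comparable encodes, here's a minimal sketch using ffmpeg's libvpx and libx264 encoders. The mapping from Squeeze's settings to ffmpeg flags is my assumption, not Squeeze's actual pipeline; the file names are hypothetical, and option names vary somewhat by ffmpeg version. The -threads and -slices values anticipate the multiple-CPU discussion that follows.

```python
# Rough ffmpeg equivalents of the encoding settings described above;
# assumes an ffmpeg build with libvpx, libvorbis, libx264, and AAC support.
import subprocess

SRC = "source_720p.mov"  # hypothetical 720p source clip

def run(args):
    subprocess.run(args, check=True)

# WebM (VP8 + Vorbis): two-pass VBR at 800Kbps video/128Kbps audio,
# quality settings maxed (-quality best, -cpu-used 0), 12 encoding threads.
vp8 = ["ffmpeg", "-y", "-i", SRC, "-c:v", "libvpx", "-b:v", "800k",
       "-quality", "best", "-cpu-used", "0", "-threads", "12"]
run(vp8 + ["-pass", "1", "-an", "-f", "webm", "/dev/null"])  # use NUL on Windows
run(vp8 + ["-pass", "2", "-c:a", "libvorbis", "-b:a", "128k", "out.webm"])

# H.264: High profile (CABAC on by default), three B-frames, three reference
# frames, slowest/best preset, 12 slices for the multi-slice test.
run(["ffmpeg", "-y", "-i", SRC, "-c:v", "libx264", "-profile:v", "high",
     "-preset", "veryslow", "-bf", "3", "-refs", "3", "-slices", "12",
     "-b:v", "800k", "-c:a", "aac", "-b:a", "128k", "out.mp4"])
```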
When producing both WebM and H.264, Squeeze lets you improve encoding speed by using multiple CPUs: WebM via the Encoding Threads option shown in Figure 1, H.264 by using multiple slices (not shown in Figure 2). As you can see in Table 2, I tested using one thread/slice and 12 threads/slices on my HP Z800 with two six-core 3.33GHz Xeon processors, doubled to 24 logical cores via Hyper-Threading Technology (HTT).
With one thread selected, producing the WebM file was very inefficient, with only about 4% of available CPU utilized, and producing the WebM file took almost four times longer than H.264. With 12 threads/slices selected, CPU utilization jumped to as high as 30%, and the differential dropped to less than 25% for the SD file, though WebM still took 85% longer for the HD file. Note that I tried encoding with all 24 threads enabled, and it actually increased encoding time.
The rap against using multiple threads/slices is that it can degrade quality because the encoder doesn't search for redundancies across slice boundaries, just within each slice. However, I compared the two WebM files and saw minimal, if any, quality differential. Run your own comparison on your content to verify this. But if encoding time is a concern for you, buy a 24-core system like the Z800 and use multiple threads when producing your WebM files. The bottom line is that WebM takes longer to encode, but the differential isn't that great and probably will only impact high-volume shops.
Quality Trials
When WebM was first announced, I compared a WebM file against an H.264 file produced by Sorenson Squish. I concluded that “H.264 still offers better quality, but the difference wouldn’t be noticeable in most applications.” Now, I’ve spent a bunch of time producing both formats, and I’ve reached the same conclusion.
Note that Sorenson Squeeze uses the MainConcept codec, which has been the highest-quality commercially available H.264 codec in my comparison tests. To supplement these tests, I also produced comparison files with the x264 codec, using the QuickTime-based x264Encoder version 1.2.13 (dated 6/27/2010) set to the highest quality, slowest encode preset (Figure 3).
Figure 3. Settings used to create the x264 comparison files.
In my tests, x264 produced slightly higher quality than the H.264 files produced by Squeeze with the MainConcept encoder, which was slightly better than WebM (Figure 4). In my view, however, slight differences in quality are irrelevant if the typical viewer wouldn't notice the difference absent side-by-side comparisons at normal data rates.
Figure 4. Three comparison images produced with x264, WebM, and H.264 using the MainConcept codec.
To explain, I produce my 720p test file at 800Kbps, which means that I'm allocating 0.029 bits per pixel in the 29.97 frames-per-second file. In comparison, YouTube produces its 720p H.264 video at about 2Mbps, which means an allocation of 0.072 bits per pixel, 2.5 times higher than mine. Why are my test files compressed so highly? Because if the data rate is high enough, all technologies look good, and it's impossible to differentiate.
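Bits per pixel is simply the video bitrate divided by the number of pixels delivered per second; here's a quick sanity check of the figures above, with Python standing in for a calculator:

```python
# bits per pixel = video bitrate / (width * height * frames per second)
def bits_per_pixel(bitrate_bps, width, height, fps):
    return bitrate_bps / (width * height * fps)

print(bits_per_pixel(800_000, 1280, 720, 29.97))    # my test file: ~0.029
print(bits_per_pixel(2_000_000, 1280, 720, 29.97))  # YouTube 720p: ~0.072
```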
What's the most relevant test? In my recent review of video files produced by a range of broadcast and corporate sites, the lowest bits-per-pixel allocation that I found was 0.043, with most well above 0.07. Apple produces its iPad advertisements at 0.168 bits per pixel, about five times higher than my test file, while ESPN produces at 0.173 and CNN at 0.106. The notoriously penurious Tiger Woods publishes at 0.136, though perhaps he'll tighten this up after losing $750 million in his divorce.
Long story short, if YouTube produces its WebM-based 720p files at 2Mbps, only the most discriminating viewer will be able to distinguish them from H.264, and then only with side-by-side comparisons, which, of course, viewers never have. It's not whether one technology is better than another; it's whether it's sufficiently better to make a difference for the typical viewer.
I should point out that some highly respected sources don't share this opinion. For example, the Graphics and Media Lab of Moscow State University produces a codec comparison every year; this year it included VP8. In terms of H.264 encoding quality, the report concludes that "[t]he x264 encoder demonstrates better quality on average, and MainConcept shows slightly lower quality," which was my primary motivation for including x264 in this evaluation. Regarding VP8, the report concludes, "When comparing VP8 and x264, VP8 also shows 5-25 [times] lower encoding speed with 20-30% lower quality at average." I just didn't see that.
Then there’s x264 developer Jason Garrett-Glaser’s extensive analysis of VP8 and comparison to x264, which concludes that x264 is 28% better than VP8, though his comparisons seem to focus on 1080p delivery, so it’s unclear how much you can generalize these results to streaming. In any event, Garrett-Glaser’s analysis is wonderful reading for anyone who wants to understand the inner workings of the VP8 codec and WebM spec, as well as the patent issues that WebM may be facing.
Both these comparisons rely primarily on automated quality measurements such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM), which compare the encoded frame to the original and produce a comparative numerical score. My approach is more hands-on: I produce the files with the different technologies, making sure that they're within 5% of the target data rate without dropped frames. I then grab frames for comparison purposes and watch the files side by side to assess the presence of motion artifacts. You can view my HD comparisons at www.doceo.com/HD_Comps.html and my SD comparisons at www.doceo.com/SD_Comps.html, and comparative frame grabs are at www.streaminglearningcenter.com. Draw your own conclusions.
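For the curious, PSNR is easy to compute yourself; here's a minimal sketch using NumPy, with frame extraction and alignment left aside. SSIM is more involved, so I'd point to an existing implementation such as the one in scikit-image.

```python
import numpy as np

def psnr(original, encoded, max_value=255.0):
    """Peak Signal-to-Noise Ratio between two same-sized uint8 frames;
    higher is better, and identical frames score infinity."""
    diff = original.astype(np.float64) - encoded.astype(np.float64)
    mse = np.mean(diff ** 2)  # mean squared error across all pixels
    if mse == 0:
        return float("inf")
    return 10 * np.log10(max_value ** 2 / mse)
```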
Overall, I'm sure that Garrett-Glaser can coax more quality out of x264 than I can. But I find comfort in the fact that the Moscow study concluded that MainConcept "shows slightly lower quality" than x264, which is consistent with my results. Certainly, if you're using a MainConcept-based tool such as Squeeze, the quality difference between VP8 and H.264 will be meaningless at most relevant data rates.
Playback Requirements
Which takes us neatly back to playback requirements, which is where we started. Let’s introduce the computers and browsers.
Briefly, I tested on four computers. The MacBook Pro has a 3.06GHz Core 2 Duo CPU, 8GB of RAM, and an NVIDIA GeForce 9600M graphics chip, and was running OS X 10.6.2. The iMac has a 2.0GHz Core 2 Duo CPU, 2.5GB of RAM, and an ATI Radeon X1600 graphics chip, and was running OS X 10.6.
The HP 8710w was running a 64-bit version of Windows 7 on a 2.2GHz Intel Core 2 Duo CPU with 2GB of RAM and an NVIDIA Quadro FX 1600M graphics controller. The Acer Aspire One is a netbook running Windows XP Home Edition with a 1.60GHz Intel Atom CPU, 1GB of RAM, and an integrated Intel 945 Express Chipset for graphics. All computers except the Aspire netbook were measured playing back the 720p file, and Table 3 shows the browsers used in the respective tests.
During testing, I followed this procedure:
• I turned off as many background processes as possible.
• I updated my graphics card drivers.
• First, I loaded the page and waited until the video completely downloaded (watching the little bar thingie on the bottom of the player).
• Next, I played the file, monitoring and recording the CPU usage in Windows Task Manager and the percentage idle in the Mac's Activity Monitor. Obviously, to compute CPU utilization for the Macs, I subtracted the percentage idle from 100%. (A scripted version of this sampling appears below.)
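I took my readings by hand, but the sampling itself could be scripted; here's a minimal sketch using the cross-platform psutil package, offered as a stand-in for Task Manager and Activity Monitor rather than what I actually ran:

```python
import time
import psutil  # cross-platform system-monitoring library

def sample_cpu(duration_s=30, interval_s=1.0):
    """Record system-wide CPU utilization once per interval while the
    video plays back; returns the samples and their average."""
    samples = []
    end = time.time() + duration_s
    while time.time() < end:
        # cpu_percent() blocks for interval_s, then returns total CPU %
        samples.append(psutil.cpu_percent(interval=interval_s))
    return samples, sum(samples) / len(samples)

if __name__ == "__main__":
    samples, avg = sample_cpu()
    print(f"average CPU over {len(samples)} samples: {avg:.1f}%")
```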
Table 4 shows the results grouped by format and playback environment. Pulling some conclusions out of these disparate numbers is challenging, but here goes. Most obvious is the fractured nature of the HTML5 market, which still lacks a single codec that can play across all platforms. With Microsoft only supporting VP8 playback in IE 9 if the codec is otherwise installed, and Apple squarely in the H.264 camp, Google's open sourcing of VP8 has done nothing to break this logjam. As you have probably heard, Adobe has announced that a future version of the Flash Player will support WebM; that may end up being the only easy way to display WebM video across all relevant browsers. However, one suspects that you could buy a Tastee Freez in Hades before WebM will play on the walled garden of Apple iDevices.
If you're a Mac owner who consumes lots of video, you can see that results vary by browser and by whether your Flash implementation is hardware accelerated. If it is, Safari is your best option; if not, go with Chrome. As a random thought, people watching Flash video on a nonaccelerated Mac via Opera or Safari probably consider Flash a CPU hog; I wonder if their opinions would change if they switched to Chrome. Ditto for HTML5-based H.264 playback on the Mac, where Safari is the efficiency king and Chrome the laggard.
Regarding WebM, on the platforms without Flash GPU acceleration (the iMac and Aspire), the most efficient WebM implementation required less CPU than the most efficient Flash implementation, though the WebM implementations varied more widely. Absent hardware acceleration, CPU utilization on playback appears to be a wash.
As I mentioned at the beginning of this article, when playing a WebM 720p file on the HP 8710w via Media Player, CPU requirements dropped to 18%, about a third of the requirement of the most efficient browser and within spitting distance of GPU-accelerated Flash playback. On the Aspire, for the 640×480 file, CPU load dropped to 24% (30% with hardware acceleration disabled), which is much lower than any other tested technology. If any of the browser vendors can enable this level of efficiency for web-based playback, WebM would have a significant competitive advantage over H.264, whether Flash- or HTML5-based. If not, given that many newer computers and mobile devices offer some measure of Flash acceleration, WebM may trail in quality, encoding speed, and playback efficiency.
Analysis
As it stands today, WebM's value proposition is basically that it's free and better than Ogg Theora, though it's far more demanding to decode than H.264 on the rapidly expanding base of GPU-accelerated hardware. If the browser vendors and Google can reduce CPU playback requirements to the levels shown in Media Player, however, the story changes considerably, at least for pay-per-view and subscription-based video distributors, who currently pay a fee to deliver H.264 video. Ironically, given the lack of universal HTML5 browser support, the most efficient way to distribute WebM to your customers may be via the Flash Player, assuming that the Flash Player's CPU playback requirements are competitive. You'll probably still have to write checks to MPEG-LA for delivering to Apple's iDevices, however.
As you probably know, H.264 is royalty-free for free internet distribution, at least through 2015, so there’s no financial incentive to switch. If your organization wants to migrate toward HTML5, WebM doesn’t provide that single codec solution, is still lower quality than H.264 (however small the difference), and takes longer to encode. Overall, for those not charging for their video, H.264 is still a better solution, and given the rapidly increasing size of the GPU-accelerated installed base, it will likely remain so, unless and until Google creates distribution channels that you can’t access with H.264.