I just finished testing YouTube’s new H.264-based HD video, and it is very compelling. You can see the video I uploaded at the YouTube site; click here.
Briefly, to have your video encoded in HD, all you have to do is upload the video in 720p format. Note that the usual file size (1 GB) and duration (10 minutes) limits apply. You can upload in a variety of formats; I used H.264. YouTube first creates a very ugly 320×180 video for normal viewing, then the big enchilada: the HD version.
Video Encoding Parameters
What do we know about the video? Well, thanks to the information screen of the free MediaInfo video analysis tool, we know quite a bit. For starters, we can see that YouTube is producing the file at 720p (1280×720) at a video data rate of about 2 Mbps, which translates to a fairly economical 0.072 bits per pixel. By way of comparison, Apple used 0.089 bits per pixel for a recent marketing video, while Cranky Geeks uses 0.072, pretty much identical to YouTube.
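For readers who want to check the bits-per-pixel figure, here is a minimal sketch of the arithmetic: the data rate divided by the number of pixels delivered each second. The 30 fps frame rate is an assumption (the frame rate isn't stated above), and the function name is just illustrative.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of bits spent on each pixel of each frame."""
    return bitrate_bps / (width * height * fps)


# 720p at roughly 2 Mbps; the 30 fps frame rate is an assumption.
youtube_bpp = bits_per_pixel(2_000_000, 1280, 720, 30)
print(f"{youtube_bpp:.3f}")  # 0.072
```

The same function works for comparing any two encodes, as long as you plug in each file's actual resolution, frame rate, and video data rate.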
On the H.264 front, YouTube is encoding with the High profile with CABAC enabled. These are the specs I’ve been recommending for a while, but few real-world publishers have actually used them, so it’s nice to get affirmation from the largest video publisher in the world. Of course, those producing with Apple Compressor, including, presumably, Apple, can’t use either advanced parameter because Compressor doesn’t support CABAC or the High profile. In addition, only the most recent versions of the QuickTime Player could play back the High profile, so if you were producing for QuickTime playback, the Main profile made more sense.
On the other hand, the Flash Player can play all levels of the High profile, as well as Main and Baseline, so when producing H.264 for Flash, High is the way to go. Interestingly, though not shown, YouTube produced the file with an MP4 extension for playback in the Flash Player, not F4V. Flash Player doesn’t care; it will play either format, but F4V is the extension used by most recently released encoding tools.
Audio Encoding Parameters
On the audio front, YouTube encoded with AAC LC, with two channels (i.e., stereo) at a data rate of 93 kbps. I would have used mono here to save a channel, but nobody asked. If you’ve been encoding with MP3, you probably think that 93 kbps is miserly, but AAC LC is much higher quality compression, so it should approximate at least 128 kbps MP3, if not higher.
Other Encoding Parameters
What else do we know about the file? Thanks to Inlet Semaphore, we can really dig into the file for details.
Semaphore’s useful file analysis tool.
I’ve attached Semaphore’s file analysis output to this article. From that, you can see that YouTube used a maximum of 3 reference frames (I’ve been using 4). From the frame graph above, you can see that YouTube inserted a key frame every two seconds (the little vertical lines below the time line are key frames). I prefer one every ten seconds because it should improve quality slightly and reduce pulsing, the periodic, noticeable pulse that occurs each time a key frame updates the entire frame. YouTube did enable key frames at scene changes, which is why you see key frames at irregular locations.
I used Semaphore’s frame analysis tool to identify how frequently YouTube inserted B frames, and it looks like they used a value of 2, which means 2 B-Frames between each P or I-Frame, which is in line with my recommendations.
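To make the frame structure concrete, here is a rough sketch that prints one group of pictures in display order for a key frame every two seconds with 2 B-frames between reference frames. It assumes 30 fps (not stated above) and ignores the extra key frames YouTube inserts at scene changes, so treat it as an illustration of the pattern rather than a dump of the actual file.

```python
def gop_pattern(fps: int, keyframe_seconds: int, b_frames: int) -> str:
    """Return one group of pictures in display order, e.g. 'IBBPBBP...'."""
    gop_length = fps * keyframe_seconds
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")          # key frame opens the group
        elif i % (b_frames + 1) == 0:
            frames.append("P")          # reference frame after each run of B-frames
        else:
            frames.append("B")
    return "".join(frames)


# Assumed 30 fps; a key frame every 2 seconds and 2 B-frames, as described above.
youtube_gop = gop_pattern(fps=30, keyframe_seconds=2, b_frames=2)
print(len(youtube_gop))   # 60
print(youtube_gop[:10])   # IBBPBBPBBP
```

With the ten-second interval I prefer, the same call with keyframe_seconds=10 yields a 300-frame group, meaning one fifth as many regularly scheduled key frames.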
For better or for worse, love it or hate it, YouTube is the single largest distributor of streaming video on the planet, so their video-related actions are always noteworthy. As a video producer, I like that YouTube is willing to be my HD content delivery network, though I’m sure the quality of service guarantees are limited (har, har). Note, also, the player concerns discussed below.
YouTube’s configuration also reflects their obvious judgment that a useful number of ‘Netizens can receive video in real time at 2.1 Mbps, which is among the highest streaming data rates that I’ve seen. That raised my eyebrows. If you’re producing H.264, YouTube appears to be saying that the High profile with CABAC is the way to go. Again, I’d recommend using a longer key frame interval, but the numbers of B-Frames and reference frames are spot on.
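One plausible reading of the 2.1 Mbps figure is the roughly 2 Mbps video stream plus the 93 kbps audio stream; that is an assumption on my part, not something the analysis tools report directly. The sketch below adds the two and estimates how much data a full 10-minute clip would deliver at that rate, ignoring container overhead.

```python
VIDEO_BPS = 2_000_000   # approximate video data rate
AUDIO_BPS = 93_000      # AAC LC stereo audio data rate


def total_mbps(video_bps: int, audio_bps: int) -> float:
    """Combined stream rate in megabits per second."""
    return (video_bps + audio_bps) / 1_000_000


def delivered_megabytes(video_bps: int, audio_bps: int, seconds: int) -> float:
    """Approximate payload size in megabytes, ignoring container overhead."""
    return (video_bps + audio_bps) * seconds / 8 / 1_000_000


print(f"{total_mbps(VIDEO_BPS, AUDIO_BPS):.1f} Mbps")                  # 2.1 Mbps
print(f"{delivered_megabytes(VIDEO_BPS, AUDIO_BPS, 600):.0f} MB")      # 157 MB
```

In other words, a viewer needs a sustained connection of a bit over 2 Mbps to watch in real time, and a maximum-length clip amounts to roughly 157 MB of data.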
I’ve embedded the video below. Note that you can access the HD version by clicking on the lower right, but that the player won’t expand like it does on YouTube. You can click the full screen button to view the video in full screen, where the full resolution is apparent.
To me, the video playback looked a touch jerky, less smooth than I’ve seen in the past. To make sure that this wasn’t related to the video encoding parameters that YouTube used, I uploaded the file I downloaded from YouTube and posted it here. If you scroll over to about 46 seconds in both videos, you’ll see a panning shot that looks much smoother in the simple player that I created from a template in Flash CS3. To be clear, the same file is playing from both players; they’re just located on two different servers.
I found and played several other HD videos on YouTube and noticed that many, but not all, shared the jerkiness of my test video. I’m not sure what to make of this, but if I were deploying HD video via the Flash Player, I’d be sure to test real-world playback before finalizing my player. No huge revelation there, but there it is.