Comparing H.264, HEVC, VP9, and AV1 in SBE: From BD-Rate to Contextual ROI

Most video engineers use a similar tool stack: MediaInfo for file data, Bitrate Viewer to view the bitrate of H.264-encoded files on Windows, Moscow State University VQMT on Windows for metrics, and a custom combination of data input scripts and Excel for RD curves and BD-Rate data.

Most tools have critical gaps. Few let you compare videos during real-time playback, which is essential to verify whether quality scores actually reflect subjective quality. Few tools below four figures show frame and GOP-related data. None helps you analyze whether a new codec or higher-quality encoding configuration actually makes economic sense.

Most quality-seeking encoding decisions involve increasing encoding cost and increasing storage requirements. AV1 might drop bitrates by 35%, but AV1 is expensive to encode, and the storage costs are additive. Moving from the medium to the very slow preset will allow you to deliver the same quality at a lower bitrate but might quadruple your encoding costs. This might make sense if each file is viewed 20,000 times, but probably not if the number is 500 views.

Understanding these dynamics is critical to effective decision-making. All encoding decisions are, at their core, economic decisions, but there’s no easy way to get from VMAF data to ROI. If you’re using different tools for different functions, you’re often copying and pasting data from one tool to another, including tools working on different operating systems.

The Streaming Learning Center’s Bitrate Explorer (SBE) was designed to simplify and accelerate the types of analysis that help you make better encoding-related decisions. It combines file data, metrics, still-image and frame comparisons, BD-Rate and RD curves, and an ROI analysis you can refine to incorporate the actual delivery patterns experienced by your users, all integrated and available on Windows and Mac (Intel and ARM).

To illustrate this, I encoded a short 1080p60 football clip to codec-specific encoding ladders with H.264, HEVC, VP9, and AV1 at matched quality (~VMAF 93 at the top rung). In this article, I review SBE’s capabilities and workflow, taking you through the various tools tab by tab. You’ll learn what each tab shows, what it means, and where the cross-codec comparison isn’t as direct as it looks.

File Intelligence

MediaInfo is pervasive because basic file details underpin any file analysis. At its core, MediaInfo parses the file with its own library and reports back data like resolution, codec, and bitrate. SBE gathers the same data via FFprobe, but adds several features:

    • A thumbnail so you don’t have to play the file to figure out what it is
    • With x264 and x265 encodes, it analyzes the data FFprobe delivers to identify the preset used to encode the file
    • If you click the Hide Defaults check box on the right, SBE hides all default settings so you can see the ones that were actually customized. If you’ve hunted through MediaInfo HTML view to identify how a particular encode is different, you’ll find this instantly useful.
Figure 1. The File Intelligence view provides the same data as MediaInfo with several additional features.

File Compare

The File Compare function is next. If you’ve ever opened up and scanned multiple MediaInfo versions trying to identify how encodes differed, you’ll grasp the utility immediately. Load multiple files into SBE and click the Compare button, and SBE displays the file data in columns. Click Show Deltas Only, and the tool hides all common configurations, allowing you to instantly see how the files differ.

Figure 2. File Compare with Show Deltas Only enabled, four files side by side.

Figure 2 shows the top rung for all four codecs. File sizes at the top: H.264 9.0 MB, HEVC 7.1 MB, VP9 6.8 MB, AV1 4.1 MB. Same source, same target VMAF, and AV1 is less than half the size of H.264.

Below are the parameter deltas: preset, profile, GOP, B-frames (3/4/0/0), reference frames, and rate control. The B-frame counts look like the headline structural difference. The Frame Type section below explains why that comparison isn’t as direct as it looks.

In the interest of full and fair disclosure, as with MediaInfo, SBE can only report the metadata the encoder included in the file. With x264 and x265, the information is comprehensive. With VP9 and AV1, much less so. With third-party encoders like AWS MediaConvert, there’s usually even less data. 

How are the bits being spent? — Bitrate Chart

The next tab is the Bitrate Chart, a bit like Bitrate Viewer but it can handle any codec FFprobe can analyze, including AV1, VP9, and VVC, as well as H.264 and HEVC. 

Scanning Figure 3, you see that the average bitrate is only a summary statistic. The shape of the bitrate curve over time tells you how each codec’s rate control actually behaves on this content: where it spikes, where it conserves, whether it tracks the scene complexity, and where it fights it. As with Bitrate Viewer, you can also view the bitrate in GOP or frame view.

Figure 3. The Bitrate Chart is like Bitrate Viewer, but it can analyze many more codecs.

Two patterns worth calling out.

Peak-to-average ratios. H.264 peaks at 7.53 Mbps against a 3.78 Mbps average, roughly 2× the average. HEVC is similar. VP9 and AV1 land closer to 1.6×. That ratio matters for CDN cost modeling (your peak bandwidth purchases aren’t priced on averages) and for buffer behavior on constrained networks. A codec that delivers the same average bitrate but with calmer peaks is a different deployment proposition.
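If you want to sanity-check a peak-to-average figure outside SBE, here is a minimal Python sketch. It assumes you have already exported per-packet timestamps and sizes (for example, with `ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,size -of csv=p=0`) and that a one-second window is an acceptable definition of "peak." The function name and the window length are illustrative assumptions, not how SBE computes its numbers.

```python
from collections import defaultdict


def peak_to_average(packets, window=1.0):
    """Return (avg_mbps, peak_mbps, ratio) for a list of packets.

    packets: iterable of (pts_seconds, size_bytes) tuples.
    window:  bucket length in seconds (assumption: 1-second windows).
    Note: a partial final window is treated like a full one.
    """
    buckets = defaultdict(int)
    for pts, size in packets:
        buckets[int(pts // window)] += size  # total bytes per window
    if not buckets:
        raise ValueError("no packets supplied")

    # Convert bytes per window to Mbps, including any empty windows.
    rates = [buckets.get(i, 0) * 8 / window / 1e6
             for i in range(max(buckets) + 1)]
    avg = sum(rates) / len(rates)
    peak = max(rates)
    return avg, peak, peak / avg


# Synthetic example: three calm seconds and one busy second.
sample = [(0.0, 250_000), (1.0, 125_000), (1.5, 125_000),
          (2.0, 250_000), (3.0, 750_000)]
avg_mbps, peak_mbps, ratio = peak_to_average(sample)
print(f"avg {avg_mbps:.2f} Mbps, peak {peak_mbps:.2f} Mbps, ratio {ratio:.2f}x")
```

Shorter windows produce higher peaks, so compare ratios only when they were computed with the same window length.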

Where the peaks land. All four curves spike at the same time positions, because that’s where the source content gets harder to encode. Scene complexity drives bit allocation, regardless of codec. The fact that all four track the same content moments is a sanity check that the rate control is doing its job; if one codec were peaking in completely different places, that would be a red flag worth investigating before you trusted any other number it produced.

Most codec comparisons skip this view entirely. They shouldn’t. The bitrate chart is where you catch rate control pathologies that explain the headline numbers but never appear in a VMAF average.

If you study the top toolbar above the bitrate chart you’ll see the Frame Viewer. This allows you to click and view the still frames or video during real-time playback. We’ll explore this below when studying the Metrics Tab. 

As you’ll learn, you can view a frame and then instantly swap the same frame from the other loaded files. You can also play the loaded videos, instantly switching from one codec to another while preserving real-time playback. We use VLC as the player, so if a file plays at full frame rate in VLC on your analysis computer, it will play at full frame rate in SBE as well. If you doubt that AV1 can deliver quality similar to HEVC at close to 50% of the bitrate, you can easily see for yourself in Frame Viewer, which is available on most of the tabs after this one. 

Are Your GOPs Aligned? – Frame Type

The Frame Type tab shows the I/P/B distribution from each codec’s bitstream over time, with average bitrate and VMAF overlaid on the same chart. Useful for verifying GOP structure, confirming that scene-cut I-frames land where you expected, and checking that B-frame patterns match what you configured at the encoder.

Figure 4 shows the four top rungs stacked.


Figure 4. Frame Type for the four codec winners.

Read at face value, Figure 4 says VP9 and AV1 don’t use bidirectional prediction. That isn’t quite right, and it’s worth being precise about why.

H.264 and HEVC have an explicit B-frame frame type in their bitstream syntax. When the encoder uses bidirectional prediction, the bitstream marks the frame as B and FFprobe reports it. SBE shows it.

VP9 doesn’t have a B-frame frame type in its bitstream. It uses alternate reference frames — hidden frames future frames can reference — and compound prediction modes inside frames the bitstream classifies as P. A VP9 encode using bidirectional prediction extensively still shows zero B-frames in any tool reading the bitstream classification.

AV1 follows the same pattern. Its frame types (KEY_FRAME, INTER_FRAME, INTRA_ONLY_FRAME, and SWITCH_FRAME) include no distinct B-frame type. SVT-AV1 defaults to multi-level hierarchical reference structures functionally similar to H.264 pyramid B-frame patterns, but the frames are labeled INTER_FRAME (P) at the bitstream level.

So all four codecs use bidirectional temporal prediction on this content. Two of them surface it via the frame type field; the other two do it via prediction-mode mechanisms FFprobe doesn’t currently expose.

What the current SBE chart does reliably across all four codecs: verify regular GOP intervals, confirm scene-cut I-frame placement, check that x264 and x265 encodes returned the B-frame distribution you configured, and detect open GOPs where you specified closed. The next SBE release adds hidden-frame detection (VP9 altrefs, AV1 frames with show_frame=0) and hierarchical reference level per frame, which will make the VP9 and AV1 columns directly comparable to H.264 and HEVC.

For now, read I/P/B percentages directly across H.264 and HEVC, and treat them as single-codec only for VP9 and AV1.
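To cross-check I/P/B percentages and GOP lengths by hand, the sketch below summarizes FFprobe frame data in Python. It assumes input lines in the form `key_frame,pict_type` (for example, from `ffprobe -v error -select_streams v:0 -show_entries frame=key_frame,pict_type -of csv=p=0`), and the function name is hypothetical. As discussed above, it sees only the frame type the bitstream reports, so VP9 and AV1 will still show zero B-frames.

```python
from collections import Counter


def summarize_frames(lines):
    """Return (shares, gop_lengths) from 'key_frame,pict_type' lines.

    shares:      percentage of frames per picture type (I, P, B, ...).
    gop_lengths: frame count between successive keyframes.
    Assumption: each line looks like '1,I' or '0,B'; other lines are skipped.
    """
    types = Counter()
    gop_lengths = []
    current = 0
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) < 2 or not parts[1]:
            continue  # skip blank or malformed lines
        key_flag, pict_type = parts[0], parts[1]
        if key_flag == "1" and current:
            gop_lengths.append(current)  # close the previous GOP
            current = 0
        current += 1
        types[pict_type] += 1
    if current:
        gop_lengths.append(current)  # close the final GOP

    total = sum(types.values())
    shares = {t: round(100 * n / total, 1) for t, n in types.items()}
    return shares, gop_lengths


# Synthetic example: two 4-frame GOPs with an I-B-B-P pattern.
demo_lines = ["1,I", "0,B", "0,B", "0,P"] * 2
shares, gops = summarize_frames(demo_lines)
print(shares, gops)
```

A uniform list of GOP lengths suggests a fixed keyframe interval. Shorter, irregular entries usually indicate scene-cut keyframes.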

Measure VMAF, PSNR, and SSIM – Metrics

The Metrics tab runs VMAF, PSNR, and SSIM against the source reference in a single pass, with per-frame curves for each metric. Results are verified to within 0.01% of MSU VQMT. You can switch between metrics with the radio buttons without re-running, export to Markdown for reports or CSV for further analysis, and jump directly into Frame Viewer to investigate any frame of interest.

Figure 5 shows the per-frame VMAF curves for the four top rungs. Averages: H.264 93.0, HEVC 93.1, VP9 93.1, AV1 93.0, confirming the matched quality. Note the poor quality at the start of the x265 encode, which is very common with x265. If you’re testing 5-10 second clips, this drop significantly impacts your average score, though you wouldn’t be aware of it without a view like this.

Figure 5. Per-frame VMAF for the four winners against the source reference.

Note the shared dip near 0:19. All four codecs lose quality at the same moment, by roughly the same magnitude. That’s a sanity check that the dip is a hard moment in the source content, not a codec-specific failure. But a metric drop into the high 60s is exactly where averages stop helping. Is it perceptually significant, or is it noise that the metric is over-reporting?

This is where Frame Viewer earns its keep. Click into the dip on any of the four files and SBE opens the still frame in the Frame Viewer window, with the mini chart below tracking your position. From there, you can navigate frame by frame, switch between any of the loaded files with the dropdown or a keyboard shortcut, hit play to view at normal speed (or 0.5×, 2×, 3×), and switch between encodes mid-playback. The mini chart can show bitrate, frame type, or any of the three metrics, synced to the current frame.

Figure 6 shows Frame 1154 at 19.25s on the HEVC winner at the bottom of the dip.

Figure 6. The worst frame on the HEVC winner. VMAF 68.0.

The frame is a sponsor overlay transition with motion blur on the dissolving graphic. VMAF doesn’t score graphics transitions well, so it reports low scores on frames that viewers wouldn’t notice. A/B playback against the source confirms it: you can swap between the reference, the HEVC encode, and the AV1 encode during playback, and the dip is invisible to the eye.

This is the part of the workflow that hasn’t really existed before. Frame-level still comparison is what MSU VQMT does on Windows. Real-time A/B video playback against a reference, with keyboard-shortcut switching between encodes mid-playback and a metric chart synced to the current frame, is the gap SBE was built to fill. If you’ve ever tried to verify whether a VMAF score corresponded to a real perceptual issue and ended up screen-grabbing frames into Photoshop, you know why this matters.

BD-Rate

The BD-Rate tab is next. Bjøntegaard Delta Rate is the definitive measure of codec efficiency, computing how much bitrate one codec needs to deliver the same quality as another, averaged across the full rate-distortion curve. Building it in Excel with the standard plugin is a 2-hour project per source, which is why most practitioners skip it. SBE runs it in under a minute, with results verified against the same plugin.
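To make the idea concrete, here is a simplified Python sketch of a BD-Rate-style calculation. Note the swap: the classic Bjøntegaard method fits a cubic polynomial to each curve, whereas this sketch uses piecewise-linear interpolation of log-bitrate over quality to stay dependency-free. It is not SBE’s implementation or the Excel plugin’s, the function names are hypothetical, and the ladder values are made up for illustration.

```python
import math


def _interp(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly ascending."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the curve")


def bd_rate(anchor, test, steps=1000):
    """Average bitrate difference (%) of `test` vs `anchor` at equal quality.

    anchor, test: lists of (bitrate_kbps, quality) ladder points.
    Negative result: `test` needs less bitrate for the same quality.
    Assumption: quality rises strictly with bitrate on each curve.
    """
    def prep(points):
        pts = sorted(points, key=lambda p: p[1])  # sort by quality
        return [q for _, q in pts], [math.log(r) for r, _ in pts]

    qa, la = prep(anchor)
    qt, lt = prep(test)
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])  # shared quality range
    if hi <= lo:
        raise ValueError("curves do not overlap in quality")

    # Average the log-bitrate gap across the shared range (midpoint rule).
    gap = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * (hi - lo) / steps
        gap += _interp(q, qt, lt) - _interp(q, qa, la)
    return (math.exp(gap / steps) - 1) * 100


# Hypothetical ladders: the test codec matches quality at half the bitrate.
anchor = [(1000, 80.0), (2000, 88.0), (4000, 93.0), (8000, 96.0)]
test = [(500, 80.0), (1000, 88.0), (2000, 93.0), (4000, 96.0)]
print(f"{bd_rate(anchor, test):.2f}%")
```

Because the comparison covers only the quality range both curves share, ladders that barely overlap produce unreliable numbers.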

Load your reference, drag in the encode ladders, and SBE groups files by codec from the metadata. For non-codec comparisons, such as preset variants, single- vs. multi-pass, or CRF ladders, you can tag manually. Press Calculate and you get the RD curves and the BD-Rate matrix.
Figure 7 shows the result.

Figure 7. RD curves and BD-Rate matrix for the four-codec comparison.

The matrix reads row codec vs column codec. Negative values mean the row codec is more efficient, so it needs less bitrate to deliver the same quality across the ladder. As in the Excel plugin, green is good, red is bad. Against H.264 as the baseline:

  • AV1: -59.83% (AV1 needs roughly 60% less bitrate than H.264 for the same quality)
  • VP9: -35.08%
  • HEVC: -31.48%

VP9 vs HEVC: -5.65%. VP9 is marginally more efficient than HEVC on this content. That ordering can flip on other content, which is a feature of BD-Rate rather than a bug; worth running on multiple sources before drawing a general conclusion about either codec.

Two limitations of the matrix worth flagging. First, BD-Rate weights all bitrate ranges equally, which doesn’t match any real audience’s viewing distribution. Second, it doesn’t account for the encode-time cost of getting there. So, AV1 at preset 4 might be 60% more efficient than H.264, but if it takes 8× the encoding time, the deployment decision is different than the BD-Rate matrix alone suggests. The Breakeven tab handles both.

Breakeven

The Breakeven tab takes the BD-Rate result and adds the cost side: encoding cost premium, time to ROI, net dollars over the period, and (under an audience distribution) the quality each codec actually delivers to the viewer. The point is that BD-Rate’s same-quality efficiency number doesn’t determine whether a codec is worth deploying. Encoding cost premium, audience volume, audience distribution, and quality lift all enter the decision.

The relationship between BD-Rate and audience-weighted Breakeven matters for reading the numbers. BD-Rate is a same-quality comparison by definition: how much less bitrate codec X needs to match codec Y’s quality across the full RD curve. When you apply an audience distribution, you stop holding quality constant. You’re computing the audience-weighted bitrate each codec delivers and the audience-weighted VMAF that comes with it. The savings percentages and quality deltas will differ from BD-Rate, sometimes substantially.
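The Python sketch below shows one plausible way to compute audience-weighted bitrate and VMAF. The ladders, the viewing shares, and the rung-by-rung matching of viewers are all hypothetical assumptions made for illustration. They are not SBE’s data or its documented method.

```python
def audience_weighted(ladder, shares):
    """Return (weighted_bitrate_mbps, weighted_vmaf) for one codec.

    ladder: list of (bitrate_mbps, vmaf) per rung, top rung first.
    shares: fraction of viewing on each rung; must sum to 1.
    """
    if len(ladder) != len(shares):
        raise ValueError("one share per rung is required")
    if abs(sum(shares) - 1.0) > 1e-6:
        raise ValueError("shares must sum to 1")
    bitrate = sum(b * s for (b, _), s in zip(ladder, shares))
    vmaf = sum(v * s for (_, v), s in zip(ladder, shares))
    return bitrate, vmaf


# Hypothetical three-rung ladders and a mobile-leaning distribution.
baseline = [(4.0, 93.0), (2.0, 86.0), (1.0, 76.0)]   # e.g. the older codec
candidate = [(2.0, 93.0), (1.2, 89.0), (0.8, 82.0)]  # e.g. the newer codec
shares = [0.2, 0.3, 0.5]

base_rate, base_vmaf = audience_weighted(baseline, shares)
cand_rate, cand_vmaf = audience_weighted(candidate, shares)

savings_pct = (1 - cand_rate / base_rate) * 100  # bandwidth saved
vmaf_delta = cand_vmaf - base_vmaf               # quality lift delivered
print(f"savings {savings_pct:.1f}%, VMAF change {vmaf_delta:+.2f}")
```

In this made-up example, the two ladders match quality only at the top rung, so weighting toward the lower rungs produces both a bandwidth saving and a quality lift. A same-quality BD-Rate figure cannot show that combination.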

Here are the inputs shared across the three views below: CDN cost: $0.02/GB; baseline encoding cost: $5/hour; encoding cost multipliers (AV1: 4x, HEVC: 2x, VP9: 1.5x); projected viewing hours: 250,000.

Figure 8 shows the default view, with no audience distribution applied.

Figure 8. Breakeven with no audience distribution. The savings percentages match BD-Rate.

Efficiency over H.264 (the BD-Rate result): AV1 -59.83%, VP9 -35.08%, HEVC -31.48%.

Breakeven hours: VP9 520, HEVC 1,140, AV1 2,025.

Net savings over 250k viewing hours: AV1 $1,836, VP9 $1,199, HEVC $1,091.

VMAF change: blank. BD-Rate is a same-quality comparison by definition, so there’s no quality delta to report.

The story this view tells is the relationship between bandwidth savings and encoding cost premium. AV1 saves the most bandwidth per hour but needs the most viewing hours to pay back its 4x encoding cost. VP9’s 1.5x multiplier means it reaches profitability roughly four times faster than AV1 (520 vs. 2,025 viewing hours) even though it saves much less bandwidth per hour. Below about 2,000 viewing hours, AV1’s encoding premium hasn’t been amortized, and below HEVC’s 1,140-hour breakeven, the only codec actually making money is VP9.
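The link between breakeven hours and net savings can be sketched with simple arithmetic. The Python example below assumes a linear model in which bandwidth savings accrue evenly per viewing hour and the encoding premium is paid once for one hour of content. Both are my assumptions, not a documented SBE formula. The cost inputs and breakeven hours are the ones listed above.

```python
BASE_ENCODE_COST = 5.00    # $ per hour of content (input listed above)
VIEWING_HOURS = 250_000    # projected viewing hours (input listed above)
CONTENT_HOURS = 1.0        # ASSUMPTION: content duration is not stated

MULTIPLIERS = {"VP9": 1.5, "HEVC": 2.0, "AV1": 4.0}
BREAKEVEN_HOURS = {"VP9": 520, "HEVC": 1140, "AV1": 2025}  # no-distribution view


def encoding_premium(codec):
    """Extra encoding cost over the H.264 baseline, in dollars."""
    return BASE_ENCODE_COST * (MULTIPLIERS[codec] - 1) * CONTENT_HOURS


def net_savings(codec, viewing_hours=VIEWING_HOURS):
    """Net dollars after the premium, under the linear assumption."""
    premium = encoding_premium(codec)
    # At breakeven, cumulative bandwidth savings equal the premium.
    savings_per_viewing_hour = premium / BREAKEVEN_HOURS[codec]
    return savings_per_viewing_hour * viewing_hours - premium


for codec in MULTIPLIERS:
    print(f"{codec}: premium ${encoding_premium(codec):.2f}, "
          f"net ${net_savings(codec):,.2f}")
```

Under these assumptions, the sketch lands within about a dollar of the net savings reported for this view, which suggests the reported figures are consistent with a simple linear payback model.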

Figure 9 applies a top-heavy distribution, with 81.7% of viewing on the 3.8 Mbps top rung. That’s a premium-tier viewing profile.

Figure 9. Breakeven with a top-heavy distribution.

Efficiency over H.264: AV1 -51.12%, VP9 -21.97%, HEVC -18.65%. Each codec is less efficient at this audience-weighted bitrate than its BD-Rate average across the full ladder.

Breakeven hours: VP9 368, HEVC 866, AV1 948.

Net savings over 250k viewing hours: AV1 $3,942, VP9 $1,697, HEVC $1,438. Dollar numbers go up despite lower percentages because more absolute bandwidth is in motion at premium-tier viewing volume.

VMAF change: AV1 +0.93, VP9 +0.57, HEVC +0.58. Small deltas. Premium content sits near the top of the quality curve where all four codecs converge, so the codec choice mostly buys bandwidth efficiency rather than quality lift.

The story this view tells is that audience volume amplifies absolute dollar savings even when percentage savings shrink. AV1’s net dollars more than double the no-distribution case, and VP9’s breakeven drops to 368 hours. At a premium service with significant viewing volume, AV1 produces the largest absolute return. At smaller premium audiences, VP9 may pay back faster and net more total dollars.

Figure 10 shifts to a mobile-centric distribution, with viewing weighted toward the lower rungs of the ladder.

Figure 10. Breakeven with a mobile-centric distribution. The ranking shifts.

Efficiency over H.264: AV1 -23.16%, VP9 -4.96%, HEVC -3.99%. Each codec is much less efficient at this audience-weighted bitrate than its BD-Rate average.

Breakeven hours: VP9 3,188, AV1 4,099, HEVC 7,936.

Net savings over 250k viewing hours: AV1 $899, VP9 $194, HEVC $152.

VMAF change: AV1 +6.96, VP9 +3.96, HEVC +3.48. H.264’s audience-weighted VMAF is 84.01 at this distribution. AV1 brings it to 90.97, VP9 to 87.97, HEVC to 87.49.

The story this view tells is that, at constrained bitrates for mobile-centric content, codec efficiency shows up more as a quality lift than as a bandwidth reduction. AV1’s bandwidth savings shrink to 23%, but it delivers nearly +7 VMAF over the H.264 baseline. The deployment question on this audience profile isn’t mostly about saving bandwidth. It’s about whether your viewers see 84 VMAF or 91 VMAF.

Taken together, the three views point to different deployment recommendations along three axes.

Audience volume. Below VP9’s breakeven point, no codec is making money. VP9 is the only one paying back fast enough to matter at small scale, because its encoding cost premium is low enough that even thin bandwidth savings cover it.

Audience distribution. Premium-tier viewing makes the choice mostly about bandwidth efficiency, because the codecs converge near the top of the quality curve. Mobile-centric viewing makes it about quality delivered at constrained bitrates, because that’s where the codecs diverge.

Optimization target. If you’re optimizing for absolute dollars at scale, AV1 produces the highest return across all three distributions. If you’re optimizing for fastest payback, VP9 reaches it first in every distribution. If you’re optimizing for QoE on mobile-centric content, AV1 again, because the +7 VMAF lift is large enough to justify the encoding premium at any reasonable audience size.

HEVC gets squeezed on every distribution at this content profile: less bandwidth savings than VP9, no meaningful quality advantage over VP9 (the two are within 0.01 VMAF on the top-heavy distribution, and VP9 leads on the mobile-centric one), longer breakeven than VP9, and fewer net dollars than VP9. There’s an audience and a content type where HEVC is still the right answer, but it isn’t this one.

This is the analysis the field has been doing by hand in custom spreadsheets, or skipping entirely. Having it in the same window as the BD-Rate that feeds it is what turns “which codec is more efficient” into “which codec is right for our deployment, our audience, and our quality target.”

Wrapping up

The findings from this comparison:

  • AV1 produced a file less than half the size of H.264 at matched VMAF (4.1 MB vs 9.0 MB)
  • VP9 and AV1 deliver calmer peak-to-average bitrate behavior than H.264 and HEVC
  • I/P/B percentages are directly comparable across H.264 and HEVC; single-codec only for VP9 and AV1 until the next SBE release surfaces hidden frames and hierarchical references
  • The shared metric dip at 0:19 turned out to be a sponsor overlay transition — perceptually clean once you A/B against the source in Frame Viewer
  • BD-Rate ranking: AV1 ~60% more efficient than H.264, VP9 ~35%, HEVC ~31%, with VP9 marginally outperforming HEVC on this content
  • Codec ranking depends on audience distribution. HEVC nearly stops being worth the encoding cost on a mobile-centric audience

BD-Rate provides only general guidance and little meaningful insight into the economic impact of your encoding decisions. To recommend a new codec, or even a simple change in encoding configuration, you need to understand the return on investment it provides, given your encoding ladder, your distribution pattern, and your encoding and distribution cost structure. Before that, you need to verify that the metrics underpinning the BD-Rate calculations are accurate, which means still-image and real-time playback comparison.

SBE delivers a comprehensive, integrated toolset that lets you elevate your analysis and recommendations from generic BD-Rate to contextual ROI. It’s the only available tool that enables real-time playback comparisons and the only one that allows you to comprehend and present the economic impact of your encoding-related decisions.

SLC Bitrate Explorer is free for a full 14-day trial. After the trial, File Intelligence, File Compare, Bitrate Chart, and still-frame Frame Viewer stay in the free tier. Metrics, BD-Rate, Frame Viewer video playback, and the Breakeven Calculator are Pro at $109.99 once. SBE is available on Windows, Intel Mac, and Apple Silicon.

About Jan Ozer

I help companies train new technical hires in streaming media-related positions; I also help companies optimize their codec selections and encoding stacks and evaluate new encoders and codecs. I am a contributing editor to Streaming Media Magazine, writing about codecs and encoding tools. I have written multiple authoritative books on video encoding, including Video Encoding by the Numbers: Eliminate the Guesswork from your Streaming Video (https://amzn.to/3kV6R1j) and Learn to Produce Video with FFmpeg: In Thirty Minutes or Less (https://amzn.to/3ZJih7e). I have multiple courses relating to streaming media production, all available at https://bit.ly/slc_courses. I currently work at www.netint.com as a Senior Director in Marketing.
