This post explores the economic factors to consider when choosing a preset for the SVT-AV1 codec. It’s adapted from a new lesson added to the course, Encoding with the AV1 Codec. While this analysis uses the SVT-AV1 codec, it’s useful for choosing a preset for any codec.
I’ve always considered codec preset selection a cost- and quality-driven decision. You see this in Figure 1, which displays two preset selection criteria, encoding time and overall quality, as a percentage from 0% to 100%. Preset 0 delivers the maximum value for both encoding quality and encoding time; the scores for each preset are the respective percentages of those maximums.
Accordingly, as compared to preset 0, preset 3 shaves roughly 92% off the maximum encoding time and delivers 98.30% of the maximum quality. If you’re running your own encoding farm, this means that preset 3 costs about 92% less to run than preset 0, which is a significant saving.
Upon consideration, however, while Figure 1 is valuable data, it presents an incomplete picture. Why is that? Because producers don’t choose a preset based upon encoding cost and simply accept the quality that preset delivers. Rather, they choose the required quality level and figure out the most cost-effective preset to achieve that level.
If the quality output by that preset doesn’t meet the required level, you increase the bitrate to meet your target. For this reason, it’s incomplete to evaluate preset 3 on encoding time alone; you also have to consider the additional bandwidth necessary to achieve your target quality.
Incorporating Encoding and Bandwidth Costs into Preset Selection
That’s what’s shown in Figure 2, which tracks, for each preset, the encoding time and the additional bitrate needed to match the quality of preset 0. To produce these numbers, I encoded at increasingly higher bitrates with each preset until the VMAF quality equaled that of preset 0. I did this with two ten-second test files and averaged the results.
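The bitrate-matching procedure described above can be sketched roughly as follows. This is a minimal illustration, not the exact tooling used for Figure 2: the function names, the 50 kbps step, and the upper limit are all assumptions, and `measure_vmaf` is a hypothetical callable that you would supply to encode a test file at a given bitrate with the preset under test and return its VMAF score.

```python
from typing import Callable


def find_matching_bitrate(
    measure_vmaf: Callable[[int], float],
    target_vmaf: float,
    start_kbps: int,
    step_kbps: int = 50,      # assumed step size; pick one that suits your ladder
    max_kbps: int = 20000,    # assumed safety limit so the loop always ends
) -> int:
    """Return the lowest tested bitrate whose VMAF meets or beats the target.

    measure_vmaf is a hypothetical hook: it should encode at the given
    bitrate (in kbps) with the preset under test and return the VMAF score.
    """
    bitrate = start_kbps
    while bitrate <= max_kbps:
        if measure_vmaf(bitrate) >= target_vmaf:
            return bitrate
        bitrate += step_kbps
    raise ValueError("target VMAF not reached below max_kbps")


def bitrate_increase_pct(base_kbps: int, matched_kbps: int) -> float:
    """Extra bitrate, as a percentage of the preset 0 bitrate."""
    return (matched_kbps - base_kbps) * 100.0 / base_kbps


# Toy stand-in for a real encode-and-measure step (made-up numbers):
# VMAF rises by one point for every 100 kbps above 2000 kbps.
def toy_measure(kbps: int) -> float:
    return 90.0 + (kbps - 2000) / 100.0


matched = find_matching_bitrate(toy_measure, target_vmaf=91.0, start_kbps=2000)
print(matched, bitrate_increase_pct(2000, matched))
```

In a real test, each call to `measure_vmaf` is a full encode plus a VMAF measurement, so a coarser step or a binary search would cut the number of encodes; the linear scan simply mirrors the "increasingly higher bitrates" approach described above.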
Getting back to our example, preset 3 does shave encoding time by 92%, but you have to increase your bandwidth by 5% to achieve the same quality as preset 0. To understand whether preset 3 is right for you, you have to incorporate both cost factors into your analysis, along with the expected view count. For an interesting perspective on this, check out How Facebook Encodes Your Videos, which details how Meta incorporates expected view counts and other factors into choosing a codec and encoding parameters to encode uploaded videos.
As shown in Figure 3, when viewed in this light, each preset offers a unique blend of encoding and bandwidth costs. If you’re delivering each video millions of times, it might make sense to use preset 0 to save that 0.5% of bandwidth. With preset 2, you save 70% of encoding time but have to increase bandwidth by a full 1% as compared to preset 0.
If you’re streaming to fewer viewers, preset 6 might make sense, where you reduce encoding costs by 99+% but have to increase bandwidth by 27% to achieve the same quality. Of course, if the view count is really low, AV1 probably doesn’t make sense and you’re probably better off sticking with H.264 or HEVC.
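To see how view count tips the balance, here is a rough sketch that adds encoding cost and delivery cost for the presets quoted above. The percentages come from the text (preset 6 is approximated as 1% of preset 0's encoding time, since the text says "99+%"), and only the presets with both figures quoted are included. The dollar amounts, the class, and the function names are purely hypothetical placeholders; substitute your own costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresetProfile:
    name: str
    encode_time_pct: float    # encoding time as a percentage of preset 0
    extra_bitrate_pct: float  # extra bitrate needed to match preset 0 quality


# Percentages quoted in this post; preset 6's 1% is an approximation of "99+%".
PRESETS = [
    PresetProfile("preset 0", 100.0, 0.0),
    PresetProfile("preset 2", 30.0, 1.0),
    PresetProfile("preset 3", 8.0, 5.0),
    PresetProfile("preset 6", 1.0, 27.0),
]

# Hypothetical costs, for illustration only.
ENCODE_COST_PRESET0 = 10.00     # dollars to encode one video with preset 0
DELIVERY_COST_PER_VIEW = 0.01   # dollars to deliver one view at preset 0's bitrate


def total_cost(preset: PresetProfile, views: int) -> float:
    """One-time encoding cost plus delivery cost across all views."""
    encode = ENCODE_COST_PRESET0 * preset.encode_time_pct / 100.0
    deliver = views * DELIVERY_COST_PER_VIEW * (1.0 + preset.extra_bitrate_pct / 100.0)
    return encode + deliver


def cheapest_preset(views: int) -> PresetProfile:
    """Preset with the lowest combined cost at this view count."""
    return min(PRESETS, key=lambda p: total_cost(p, views))


for views in (100, 1_000, 10_000, 1_000_000):
    best = cheapest_preset(views)
    print(f"{views:>9} views: {best.name} (${total_cost(best, views):,.2f})")
```

The model assumes encoding cost scales linearly with encoding time and delivery cost scales linearly with bitrate, which is a simplification. Even so, it shows the pattern: the faster presets win at low view counts, and the slower presets pull ahead as views climb.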
Choosing the Best Preset for Your Production
The point is that when it comes to preset selection, encoding time and bandwidth are two sides of the same coin; you shouldn’t consider one without the other. Also, since preset selection is almost certainly the largest factor in encoding cost, you should perform this analysis with more and longer test files over complete encoding ladders. Do a bit more work and you’ll almost certainly get a clearer picture. Once you understand your encoding cost per preset, and the necessary bandwidth adjustments to achieve your quality target levels, you can factor in view counts and compute the most cost-effective preset for your production scenario.