As quality metrics go, PSNR is considered a blunt tool made obsolete by higher-end metrics like Netflix’s VMAF or SSIMwave’s SSIMplus. That said, it’s accessible, well understood, and has been used for years. Of course, none of that matters if it delivers misleading results.
So that’s what I decided to check in a pair of tests. This is the first, which covers H.264 configuration options like those in my recent book, Video Encoding by the Numbers, which relies very heavily on PSNR results. The second will see how the metrics compare when evaluating the quality of codecs like x264, x265, and VP9.
For this series, I encoded four 720p files in configurations that tested the quality produced by different H.264 profiles, x264 presets, bitrate control techniques, and B-frame intervals. Then I computed PSNR, VMAF, and SSIMplus scores using tools defined in the attached PowerPoint.
I entered the results into a table, multiplied the PSNR values by 2.25 to bring them close to the VMAF and SSIMplus scores, and graphed the results. After a short test description, these graphs are presented in the downloadable PDF accessible below. I am not a math geek, and I know there are mathematical comparisons that I could have supplied. Instead, I decided to let the graphs stand on their own.
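For readers who want to reproduce the scaling step, here is a minimal Python sketch of it. The 2.25 multiplier comes from the text above; the configuration names and every score in the `scores` dictionary are made-up placeholders, not results from these tests, and the helper names `scale_psnr` and `build_rows` are hypothetical.

```python
# Sketch of the scaling step: multiply PSNR (dB) by 2.25 so it lands in
# roughly the same numeric range as VMAF and SSIMplus before graphing.
# All scores below are made-up placeholders, not results from the article.

PSNR_SCALE = 2.25  # multiplier described in the article


def scale_psnr(psnr_db, factor=PSNR_SCALE):
    """Return a PSNR value scaled for plotting next to VMAF/SSIMplus."""
    return psnr_db * factor


def build_rows(scores):
    """Turn {config: {metric: value}} into rows ready for a table or chart."""
    rows = []
    for config, metrics in scores.items():
        rows.append({
            "config": config,
            "psnr_scaled": scale_psnr(metrics["psnr"]),
            "vmaf": metrics["vmaf"],
            "ssimplus": metrics["ssimplus"],
        })
    return rows


# Hypothetical example data (three H.264 profiles, one clip).
scores = {
    "baseline": {"psnr": 38.0, "vmaf": 84.0, "ssimplus": 86.0},
    "main": {"psnr": 39.0, "vmaf": 87.0, "ssimplus": 88.0},
    "high": {"psnr": 40.0, "vmaf": 89.0, "ssimplus": 90.0},
}

rows = build_rows(scores)
for row in rows:
    print(row)
```

The scaling only affects how the lines sit on a shared chart. Multiplying every PSNR value by the same positive constant does not change which configuration PSNR ranks highest.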
Here are the main conclusions from the final page of the PDF.
- The key question is, how many times would I have reached a different recommendation by using VMAF or SSIMplus? The answer is not that often (and never where both SSIMplus and VMAF agreed).
- So, for simple configuration decisions, PSNR results were reasonably consistent with VMAF and SSIMplus. If PSNR is the only tool you have affordable access to, it appears useful for these types of comparisons.
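The comparison behind these conclusions can be sketched in a few lines of Python. This is a simplification that assumes the "recommendation" is just the configuration with the highest score on each metric; my actual calls came from reading the graphs. The data is made up for illustration, and the function names `best_config` and `compare_recommendations` are hypothetical.

```python
# Sketch: would PSNR have led to a different pick than VMAF or SSIMplus?
# Assumption: the "recommendation" is simply the highest-scoring configuration.
# All scores below are made-up placeholders, not results from the article.

METRICS = ("psnr", "vmaf", "ssimplus")


def best_config(scores, metric):
    """Return the configuration name with the highest score for one metric."""
    return max(scores, key=lambda config: scores[config][metric])


def compare_recommendations(scores):
    """Report each metric's pick and whether PSNR disagrees with the others."""
    picks = {metric: best_config(scores, metric) for metric in METRICS}
    others_agree = picks["vmaf"] == picks["ssimplus"]
    return {
        "picks": picks,
        # True when VMAF and SSIMplus choose the same configuration.
        "others_agree": others_agree,
        # True when PSNR differs from at least one of the other two metrics.
        "psnr_differs": picks["psnr"] != picks["vmaf"]
        or picks["psnr"] != picks["ssimplus"],
        # True only when VMAF and SSIMplus agree and PSNR picks something else.
        "psnr_contradicts_consensus": others_agree
        and picks["psnr"] != picks["vmaf"],
    }


# Hypothetical example: three B-frame settings for one clip.
scores = {
    "b0": {"psnr": 38.2, "vmaf": 85.0, "ssimplus": 87.0},
    "b3": {"psnr": 39.1, "vmaf": 88.0, "ssimplus": 89.5},
    "b5": {"psnr": 39.0, "vmaf": 87.5, "ssimplus": 89.0},
}

result = compare_recommendations(scores)
print(result)
```

With this placeholder data all three metrics pick the same setting, so `psnr_contradicts_consensus` is false, which is the pattern the first bullet describes.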
My preliminary peeks at the codec comparisons show that I won’t be able to make the same claims for that analysis.
Here’s the PDF.