MSU VQMT Gets Auto-Scaling But Watch the Scaling Algorithm

One of the first operations I learned in FFmpeg was how to scale subsampled video files back to the source resolution to compute PSNR in the Moscow State University (MSU) Video Quality Measurement Tool (VQMT). Thankfully, for those not familiar with FFmpeg, as of VQMT version 11.1, this operation is no longer necessary. Yup, VQMT can now auto-scale your lower-resolution video files back to source resolution in both the GUI and command line, saving time and disk space.

In testing this new feature, I learned two things. First, the feature is fast and accurate, so I’ll never pre-scale for VQMT again. No surprise there. Second, and this was the surprise, your choice of scaling technique can swing your VMAF score by as much as 6 points for a 640×360 file scaled back to 1080p resolution. If you use video quality metrics to analyze lower-resolution streams in encoding ladders, you’re going to want to read the rest of this article, even if you don’t use VQMT.

But first, the facts. MSU added auto-scaling to version 11.1 of VQMT, which shipped around the end of May 2019. In the GUI, you enable the feature on the left by clicking Geometry transform and choosing the size, strategy, and algorithm.

Using the settings shown in the figure, VQMT will scale the 640×360 MP4 file back to the 1920×1080 resolution of the source file and use the Lanczos scaling method as opposed to the default Bicubic. There are multiple options for size and scale strategy, though none appear appropriate for simple metric computations. The only real variable you’d want to adjust would be the algorithm, which I’ll get back to in a moment.

In the command line, you add this switch to accomplish the same computation as shown in the Figure. Obviously, the manual has more details should you want to adjust the other variables.

-resize lanczos to orig

Testing Reveals that Scaling Algorithms Really Matter

The first aspect of auto-scaling that I tested was performance. To assess this, I compared the VMAF processing time of a 640×360 file auto-scaled in VQMT to a file previously scaled to 1080p Y4M format in FFmpeg. The MP4 file scaled by VQMT took 11:20 (min:sec) to process, the Y4M file took 10:30. So, scaling in VQMT added 50 seconds, about 8%. Not a huge deal.
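The overhead figure is simple arithmetic on the two run times. As a minimal Python sketch (the helper names are my own, purely for illustration), the calculation looks something like this:

```python
def to_seconds(min_sec):
    """Convert a 'min:sec' string such as '11:20' to total seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)


def scaling_overhead(autoscaled, prescaled):
    """Return (extra seconds, extra time as a percentage of the pre-scaled run)."""
    base = to_seconds(prescaled)
    extra = to_seconds(autoscaled) - base
    return extra, 100.0 * extra / base


# Run times reported above: MP4 scaled by VQMT vs. Y4M pre-scaled in FFmpeg.
extra, percent = scaling_overhead("11:20", "10:30")
print(f"{extra} seconds, {percent:.1f}%")
```

The percentage is taken relative to the pre-scaled run, which is where the "about 8%" comes from.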

The second aspect I tested was accuracy/consistency with previous results. Checking the VMAF scores, the file scaled in VQMT scored a 53.31 compared to 59.27 for the file previously scaled in FFmpeg. That’s a huge and unexpected difference.

By way of background, when computing objective metrics on lower-resolution files, I’ve always scaled video in FFmpeg using the Lanczos algorithm because a few years back I found a white paper from graphics vendor NVIDIA that stated that this was the method used on their graphics cards. When scaling the 640×360 video to 1920×1080 to compare to the 1080p source, we’re essentially asking the question, “how does that 640×360 video look scaled to 1080p for full-scale viewing?” I used Lanczos because this was the technique deployed on the gazillion NVIDIA chips that would actually be doing the scaling.
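For readers who haven't pre-scaled in FFmpeg before, here is a rough sketch of what that step can look like, wrapped in Python. This is not necessarily the exact command I ran: the file names are hypothetical, and the `yuv420p` pixel format is an assumption you should match to your own source. The `scale` filter with `flags=lanczos` is the standard FFmpeg way to select the Lanczos scaler.

```python
import subprocess


def build_prescale_command(src, dst, width=1920, height=1080):
    """Build an FFmpeg command that upscales src to width x height with Lanczos."""
    return [
        "ffmpeg",
        "-i", src,
        # scale filter, with the Lanczos scaler selected through flags
        "-vf", f"scale={width}:{height}:flags=lanczos",
        # assumed pixel format; match this to your source file
        "-pix_fmt", "yuv420p",
        dst,  # a .y4m extension makes FFmpeg write uncompressed Y4M
    ]


# Hypothetical file names, for illustration only.
cmd = build_prescale_command("input_360p.mp4", "scaled_1080p.y4m")
print(" ".join(cmd))

# To actually run the conversion (requires FFmpeg on the PATH):
# subprocess.run(cmd, check=True)
```

Keep in mind that Y4M output is uncompressed, which is why pre-scaling costs disk space as well as time.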

I use two primary tools to compute VMAF scores: VQMT for desktop use and the Hybrik Cloud media analysis function for high-volume scoring. With VQMT, I’ve always pre-scaled as described, but Hybrik Cloud scales automatically. Nonetheless, VMAF scores from both products have always been virtually identical, so I could mix the results of one with the results of the other. If the new auto-scaled scores in VQMT were the new normal, I couldn’t use this feature.

Lanczos vs. Bicubic

It was then I noticed that the default scaling mechanism in VQMT was Bicubic, not Lanczos. Since I used the default scaling algorithm for my initial tests, in essence, I was measuring the quality difference between scaling with Lanczos in FFmpeg and scaling with Bicubic in VQMT. When I switched to Lanczos antialiasing in VQMT, the score increased from 53.31 to 57.18, which was great, but still 2.1 points away from the score achieved by scaling in FFmpeg.

At this point, I explained the situation to the good folks at MSU. They understood the issue and added the “prefer FFmpeg scaling algorithm” checkbox shown above for users who want consistent results between prescaling with Lanczos in FFmpeg and scaling with Lanczos in VQMT. Using this option, the scores were within one-thousandth of a point (scale in FFmpeg=59.2730; scale in VQMT=59.2731).
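To put the three VQMT results side by side, this short Python sketch computes each score's gap from the FFmpeg Lanczos reference. The numbers come from the tests described above; the dictionary labels are just my shorthand, not VQMT option names.

```python
# VMAF score for the file pre-scaled with Lanczos in FFmpeg (the reference).
FFMPEG_LANCZOS = 59.2730

# VMAF scores for the same 640x360 file auto-scaled in VQMT.
vqmt_scores = {
    "Bicubic (VQMT default)": 53.31,
    "Lanczos": 57.18,
    "Lanczos + prefer FFmpeg scaling algorithm": 59.2731,
}

# Gap of each VQMT result relative to the FFmpeg reference, in VMAF points.
gaps = {
    label: round(score - FFMPEG_LANCZOS, 4)
    for label, score in vqmt_scores.items()
}

for label, gap in gaps.items():
    print(f"{label}: {gap:+.4f}")
```

The gaps work out to roughly 5.96 points for Bicubic, 2.09 for Lanczos, and 0.0001 with the new checkbox enabled.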

The featured image at the top of this article shows the VMAF result plot for the two streams. As you can see on the top right of the plot, the Y4M file scaled in FFmpeg is in red; the MP4 file scaled by VQMT is in green (hover your pointer over the image if you can’t see the whole image). Though the plot includes the VMAF scores of both files over the duration of the files, the visible plot shows only green because the two scores are virtually identical.

Note that I only checked Lanczos because that’s the algorithm that I use. If you use another algorithm you should perform your own tests to ensure consistency between the two approaches.
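If you do run your own tests, a simple pass/fail check can help. The sketch below flags any pairing whose pre-scaled and auto-scaled scores differ by more than a tolerance. The function name and the 0.01-point tolerance are my assumptions, not anything defined by VQMT or FFmpeg, so pick a threshold that suits your own work.

```python
def inconsistent_pairs(pairs, tolerance=0.01):
    """Return labels whose two scores differ by more than `tolerance` VMAF points.

    `pairs` maps a label to a (prescaled_score, autoscaled_score) tuple.
    """
    return [
        label
        for label, (prescaled, autoscaled) in pairs.items()
        if abs(prescaled - autoscaled) > tolerance
    ]


# Sample data taken from the Lanczos tests in this article.
pairs = {
    "VQMT Lanczos + prefer FFmpeg": (59.2730, 59.2731),
    "VQMT Lanczos": (59.2730, 57.18),
}

flagged = inconsistent_pairs(pairs)
print(flagged)
```

With these numbers, only the plain Lanczos pairing is flagged; the pairing that uses the "prefer FFmpeg scaling algorithm" option passes.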

So, what’s this all add up to?

  • That your choice of scaling algorithm makes a huge difference when measuring VMAF of lower-resolution files scaled to source resolutions. I’ll continue to use Lanczos going forward for the stated reasons. Whichever algorithm you use, make sure you use it consistently, and if you’re using multiple tools that auto-scale, make sure that they use (or make available) the same technique.
  • If you previously scaled with the Lanczos filter in FFmpeg, you should be able to efficiently produce similar results using VQMT’s new auto-scaling feature. If you used any other scaling technique you should run your own tests to ensure consistent results. I’d appreciate hearing from anyone who tries this to get a more general feel for how this new feature is working (janozer@gmail.com).

About Jan Ozer

I help companies train new technical hires in streaming media-related positions; I also help companies optimize their codec selections and encoding stacks, and evaluate new encoders and codecs.
