Versatile Video Coding (VVC) is a codec “drafted by a joint collaborative team of ITU-T and ISO/IEC experts known as the Joint Video Experts Team (JVET), which is a partnership of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG),” as MPEG explains. The codec is designed to meet upcoming needs in videoconferencing, OTT streaming, mobile telephony, and contribution.
According to the final requirements document (Version 5), the target compression performance is a 30% to 50% bitrate reduction compared to the HEVC Main Profile at the same perceptual quality, with an encoding complexity of approximately 10 times that of HEVC or more. Early third-party performance results show improvements in the 27% to 33% range. VVC will be a royalty-bearing codec, though how royalties will be set and administered is unclear.
VVC should be finalized by the end of 2020.
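To make those percentages concrete, here is a minimal Python sketch that converts a bitrate reduction into the bitrate a VVC stream would need to match an HEVC stream. The 6,000 kbps HEVC figure and the function name are assumptions chosen purely for illustration; only the percentages come from the requirements document and the early test results cited above.

```python
def bitrate_after_reduction(hevc_kbps, reduction_pct):
    """Return the bitrate (kbps) left after cutting hevc_kbps by reduction_pct percent."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    return hevc_kbps * (100 - reduction_pct) / 100


# Hypothetical HEVC bitrate for a 1080p stream (an assumption, not a measured value).
HEVC_KBPS = 6000

# Percentages quoted in the article: the 30%-50% target and the 27%-33% early results.
scenarios = {
    "early results, low end": 27,
    "target, low end": 30,
    "early results, high end": 33,
    "target, high end": 50,
}

for label, pct in scenarios.items():
    vvc_kbps = bitrate_after_reduction(HEVC_KBPS, pct)
    print(f"{label}: {pct}% reduction -> {vvc_kbps:.0f} kbps")
```

At the low end of the target, the hypothetical 6,000 kbps stream drops to 4,200 kbps; at the high end, it drops to 3,000 kbps.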
VVC Overview
Figure 1 shows the development timeline of recent codec standards with VVC in red at the far right. As with previous standards, VVC is jointly promulgated by ISO/IEC via the Moving Picture Experts Group, or MPEG, and the International Telecommunication Union (ITU). Initial work on VVC was conducted jointly with ISO/IEC JTC1 SC29/WG11 (MPEG) from October 2015 to October 2017 by the Joint Video Exploration Team (JVET). In October 2017 a new team was assembled with “the same acronym but called Joint Video Experts Team (JVET).”
Figure 1. Progression of video standards. From Trends and Recent Developments in Video Coding Standardization by Jens-Rainer Ohm and Mathias Wien (via SlideShare).
The initial requirements document called for a technology that could manage the following:
- Picture formats: At least from VGA to 8Kx4K.
- Color space and color sampling: YCbCr color spaces with 4:2:0 sampling, 10 bits per component, YCbCr/RGB 4:4:4 and YCbCr 4:2:2, bit depths up to 16 bits per component, with high dynamic range and wide-gamut color, along with auxiliary channels for transparency, depth, and more.
- Frame rates: Starting with 0Hz and upper limits defined by levels.
The codec was named Versatile Video Coding because it is “meant to be very versatile and address all of the video needs from low resolution and low bitrates to high resolution and high bitrates, HDR, 360 omnidirectional and so on,” explains Christian Feldmann, a codec engineer with Bitmovin. Figure 2 illustrates that idea, listing the test sequences defined in the call for proposals, which included a range of standard dynamic range 1080p and 4K content, as well as 8 HDR videos and 5 360° videos.
Figure 2. Test sequences in the VVC Call for Proposals. Figure by Mathias Wien.
VVC Formation
Every MPEG specification sounds like it was assembled during the production of a James Bond movie. With VVC, the initial Request for Contributions on Future Video Compression Technology was published at MPEG Meeting 111 in Geneva, Switzerland (in February 2015), while the initial Requirements Document was published in Warsaw, Poland, in June 2015.
In Hobart, Australia, MPEG issued a Joint Call for Evidence on Video Compression With Capability beyond HEVC in April 2017. In July 2017, in Torino, Italy, MPEG published the results of the evidence, which found “significant gain compared to HEVC was achieved for a considerable number of test cases. Comparable subjective quality (i.e. a comparable MOS value) was observed at 40% to 50% less bit rate than HEVC for the SDR and HDR test cases. Moreover, even higher bit rate savings was observed for some test sequences… As a result of the evaluation of submitted evidence, it has been concluded that evidence exists of the existence of technology that is likely to significantly outperform the compression capability of the HEVC standard after further development and standardization.”
This led to a Joint Call for Proposals on Video Compression With Capability Beyond HEVC, published in Macau in October 2017, and the start of the formal standardization process. The schedule presented in that document shows the standard being completed in October 2020.
From there, VVC progressed through multiple working drafts up to Draft 5, which was published in Geneva in March 2019. The most recent Test Model (6) was published in Gothenburg, Sweden, in July 2019.
Technical Components of VVC
“As in most preceding standards, VVC has a block-based hybrid coding architecture, combining inter-picture and intra-picture prediction and transform coding with entropy coding” (from VVC Test Model 6).
Figure 3. The block diagram of the VVC encoder (from VVC Test Model 6).
VVC contains a range of mostly evolutionary technologies, including intra prediction, inter prediction, transformation, in-loop filtering, palette coding, block partitioning, affine motion, and decoder side search. These are summarized in Figure 4, from a 2018 presentation entitled Versatile Video Coding – Video Compression Beyond HEVC: Coding Tools for SDR and 360° Video by Mathias Wien, a lecturer at RWTH Aachen University. Note that I combined two slides into one to create the figure.
Figure 4. New features in VVC.
You can find a more detailed explanation of the new features in an article entitled VVC Video Codec – The Next Generation Codec by Bitmovin’s Christian Feldmann. Feldmann also compared AV1 and VVC in a presentation at Streaming Media East 2019. Download the presentation handout and watch the video.
Early Performance Tests for VVC
As mentioned above, responses to the Joint Call for Evidence showed savings in the 40% to 50% range for SDR and HDR tests with higher savings on some clips, though lower gains were reported in the 360° video sequences. These results convinced MPEG to greenlight VVC. Since then, there have been several comparisons involving primarily VVC, AV1, and HEVC.
Before diving in, let’s clarify the nomenclature. JEM stands for the Joint Exploration Model, which was an experimental software version that contained all the tools then under consideration for VVC. It appears that the VVC Test Model (VTM) has succeeded the JEM since the initial set of tests described below were run. Again, VTM is reference software that contains all the tools in the codec, and I could find no tests using the VTM software.
HM is the HEVC Test Model that also contains all the encoding tools in the spec and is not a commercial encoder. AV1 is an actual shipping version of the AV1 codec. JM is the equivalent for H.264, again not a shipping version of the codec but a reference version with all the available tools in the spec.
While these names sound obscure and complicate the interpretation of comparisons like those shown in Figures 5 and 6, they identify the actual encoder used to produce the comparison. For this reason, these names are more precise than designating the results VVC, HEVC, or H.264, which could be produced by any compliant encoder from any vendor.
Back on point, one of the more notable comparisons was produced by the BBC, published in June 2018 and updated in June 2019. Tests involved HD and UHD clips “representative of typical broadcasting content” measured with both objective and subjective tests. As Figure 5 shows, AV1 produced only minor efficiencies compared to HEVC while VVC (labeled JEM in green) was 27% more efficient than HEVC with UHD content and 33% more efficient with HD video content.
Figure 5. BBC results comparing VVC (JEM in green), AV1 (yellow), and HEVC (HM in orange).
Together with a company called b<>com, Harmonic produced a codec study comparing H.264, HEVC, VP9, AV1, and VVC. Figure 6 shows the BD-Rate results, which list the bitrate savings (negative numbers) or surplus needed for each codec to produce the same quality as the codec it’s being compared to, with PSNR on the left and VMAF on the right.
Focusing on the bottom row, as gauged with PSNR, VVC can produce the same quality as H.264 (JM) at a 63% bitrate savings, the same quality as HEVC (HM) at a 24% bitrate savings, and the same quality as AV1 with a 20% bitrate savings. With VMAF, the numbers are a savings of 65% (H.264), 30% (HEVC), and 18% (AV1).
Figure 6. BD-Rate savings for VVC as compared to H.264, HEVC, and AV1. Image edited to highlight JEM performance (no numbers were changed).
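For readers unfamiliar with how BD-Rate figures like those in Figure 6 are derived, here is a minimal Python sketch of the idea. It is not the code used in the Harmonic and b<>com study. As a simplification, it interpolates log bitrate against quality with straight lines rather than the cubic fit used in the usual Bjøntegaard calculation, and the rate/PSNR points at the bottom are invented for illustration.

```python
import math


def _integrate_log_rate(points, q_lo, q_hi):
    """Integrate log10(bitrate) over quality in [q_lo, q_hi].

    points is a list of (bitrate, quality) pairs with distinct quality values.
    Piecewise-linear interpolation is a simplification of the cubic fit
    normally used for BD-Rate.
    """
    pts = sorted((q, math.log10(rate)) for rate, q in points)

    def interp(q):
        for (q0, l0), (q1, l1) in zip(pts, pts[1:]):
            if q0 <= q <= q1:
                t = (q - q0) / (q1 - q0)
                return l0 + t * (l1 - l0)
        raise ValueError("quality outside the measured range")

    # Integrate segment by segment so the trapezoid rule is exact.
    knots = [q_lo] + [q for q, _ in pts if q_lo < q < q_hi] + [q_hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        area += 0.5 * (interp(a) + interp(b)) * (b - a)
    return area


def bd_rate(anchor, test):
    """Average bitrate difference in percent of test vs. anchor at equal quality.

    Negative values mean the test codec needs less bitrate than the anchor.
    """
    q_lo = max(min(q for _, q in anchor), min(q for _, q in test))
    q_hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if q_hi <= q_lo:
        raise ValueError("the two curves share no quality range")
    avg_log_diff = (
        _integrate_log_rate(test, q_lo, q_hi)
        - _integrate_log_rate(anchor, q_lo, q_hi)
    ) / (q_hi - q_lo)
    return (10 ** avg_log_diff - 1) * 100


# Hypothetical (bitrate in kbps, PSNR in dB) points, not taken from any study.
anchor_curve = [(1000, 34.0), (2000, 37.0), (4000, 40.0), (8000, 43.0)]
# The test codec reaches the same PSNR values at 70% of the anchor bitrate.
test_curve = [(700, 34.0), (1400, 37.0), (2800, 40.0), (5600, 43.0)]

print(f"BD-Rate: {bd_rate(anchor_curve, test_curve):.1f}%")
```

Because the invented test curve reaches every quality level at 70% of the anchor's bitrate, the sketch reports a BD-Rate of about -30%, which reads the same way as the negative numbers in Figure 6: a bitrate savings at equal quality.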
Not to get off track, but in all fairness to AV1, these results differ from what others have published, including Facebook, Bitmovin, and Moscow State University, which compared AV1 to HEVC or VP9, not VVC, using actual shipping codecs like x264 and x265, not reference models. In Streaming Media tests also performed with x264 and x265, AV1 produced the same quality as x265 with a data rate savings of 18%.
Regarding VVC, these results are promising but will need to be recalibrated once we know the final toolset that will be included in any particular licensed version of VVC. To explain, take a look at the final section.
VVC Licensing Structure
VVC will have a royalty. As we discuss in our article, Inside MPEG’s Ambitious Plan to Launch 3 Video Codecs in 2020, it appears that a third party, the Media Coding Industry Forum, will create profiles and levels based upon licensing terms proposed by each IP owner. It also seems likely that there will be multiple patent pools. In this schema, pool pricing could determine which encoding tools are included in each profile.
How this will work has not been made public yet, but presumably, if one pool charges three times the royalty of the other pools, the tools represented in that pool might be placed in a profile that few implementors would choose to use. Obviously, this impacts competitive quality; the results above include all existing tools, including those that may not be in the profile selected by any particular user. This schema was created by MPEG to avoid the patent holdup that occurred with HEVC where several patent holders declared their proposed royalty structure well after the codec had been implemented.
This is the first codec where patented encoding tools may be relegated to different profiles based upon their royalty rate or other licensing terms. With HEVC and H.264, all tools were included in the final licensable codec, so it’s fair to use the reference models containing all tools to evaluate codec performance. However, with VVC, until we know the final royalty-related profile structure, we won’t know how the performance of the reference model will relate to the version deployed by a particular implementor. The bottom line is that until we know VVC’s licensing structure we really can’t gauge VVC’s quality or performance.
Summing Up
VVC is clearly a CPU-intensive codec that will require hardware support before it can be widely deployed, which pushes deployment to at least two years after the spec is finalized. If the licensing structure isn’t crystal clear at that time, expect few product developers to start incorporating VVC into their product plans. For the same reason, until the IP situation is clear, any quality or other performance evaluations may bear little resemblance to what’s actually deployed by any implementor.