Apple's iPhone 15 Pro stores spatial video using MV-HEVC for 3D viewing on the Apple Vision Pro.

What is MV-HEVC

Multiview High Efficiency Video Coding (MV-HEVC) is a video compression standard that extends High Efficiency Video Coding (HEVC), also known as H.265. MV-HEVC is designed to efficiently encode multiple views or perspectives of a scene within a single video stream, making it particularly useful for 3D video content. This technology allows for stereoscopic effects, giving the illusion of depth by presenting a slightly different view of the same scene to each eye.

High-Profile Implementation: Apple iPhone and Vision Pro

Figure 1. Recording spatial video with the Apple iPhone 15 Pro. From this article in Digital Camera World.

One of the most high-profile implementations of MV-HEVC is by Apple, specifically for its iPhone 15 Pro and Vision Pro devices. Starting in November 2023, users can capture spatial video for the Apple Vision Pro headset using the iPhone 15 Pro or iPhone 15 Pro Max with iOS 17.2, enabling content production directly from these devices. This marks a significant step in making 3D content creation more accessible to consumers and professionals alike.

The Standard and Its Key Benefits

MV-HEVC was standardized as part of the second version of the HEVC standard, which was completed and approved in 2014 and published in early 2015. The key benefit of MV-HEVC is its ability to deliver efficient 3D display capabilities using standard HEVC decoder hardware. This efficiency is achieved by exploiting the high similarity between the different views of a scene, thereby reducing the amount of data required to represent the additional views.
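To make the point about similarity between views concrete, here is a minimal Python sketch. It is not MV-HEVC: it assumes a synthetic scene, models the right view as the left view shifted by a fixed number of pixels, and uses zlib as a crude stand-in for a real codec. The frame size, the shift, and all function names are hypothetical and chosen only for illustration.

```python
import random
import zlib

WIDTH, HEIGHT, SHIFT = 64, 32, 2  # hypothetical frame size and view offset


def make_views(seed=0):
    """Build a synthetic left view and a right view that is the same scene shifted."""
    rng = random.Random(seed)
    left = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]
    # The right view sees the scene SHIFT pixels over, plus a few new edge pixels.
    right = [row[SHIFT:] + [rng.randrange(256) for _ in range(SHIFT)] for row in left]
    return left, right


def predict_right_from_left(left):
    """Predict the right view by shifting the left view; unknown edge pixels are 0."""
    return [row[SHIFT:] + [0] * SHIFT for row in left]


def residual(actual, prediction):
    """Per-pixel difference (mod 256) between the actual view and its prediction."""
    return [[(a - p) % 256 for a, p in zip(arow, prow)]
            for arow, prow in zip(actual, prediction)]


def compressed_size(view):
    """Bytes needed after zlib compression (a stand-in for a real video codec)."""
    return len(zlib.compress(bytes(v for row in view for v in row), 9))


left, right = make_views()
independent = compressed_size(right)
dependent = compressed_size(residual(right, predict_right_from_left(left)))
print(f"right view coded alone: {independent} bytes")
print(f"right view coded as a residual against the left view: {dependent} bytes")
```

Because the prediction from the left view already explains almost every pixel of the right view, the residual is mostly zeros and compresses far better than the right view coded on its own. Real encoders reach the same kind of saving with HEVC's own prediction tools rather than a fixed shift.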

Figure 2. Capturing spatial video using two cameras on the iPhone 15 Pro.

Capturing MV-HEVC content typically involves using two cameras to shoot slightly different images of the same scene. These images represent the left and right views needed to create a stereoscopic effect, which Apple calls spatial video.

As shown in Figure 2, the Apple iPhone 15 Pro and Pro Max capture video separately from two different lenses and store it in MV-HEVC encoded files. The iPhones’ 3D output has been criticized because the two cameras aren’t far enough apart to create a true stereoscopic image. Still, because it’s Apple, and because producing and displaying spatial video is simple, spatial video has been nearly uniformly lauded by the press (see here and here).
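The criticism about camera spacing follows from basic stereo geometry. For two parallel pinhole cameras, the horizontal offset (disparity) of an object between the two views is roughly the focal length times the camera separation (baseline) divided by the object's distance. The sketch below assumes that simplified model. The focal length and both baselines are hypothetical round numbers, not measurements of any iPhone or of human eyes.

```python
def disparity_px(baseline_mm, depth_mm, focal_px):
    """Disparity in pixels for parallel pinhole cameras: d = f * B / Z."""
    if depth_mm <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_mm / depth_mm


# All three values below are assumptions for illustration only.
FOCAL_PX = 1500.0          # hypothetical focal length expressed in pixels
PHONE_BASELINE_MM = 20.0   # hypothetical spacing between two phone lenses
EYE_BASELINE_MM = 63.0     # hypothetical spacing comparable to human eyes

for depth_m in (0.5, 2.0, 10.0):
    depth_mm = depth_m * 1000.0
    phone = disparity_px(PHONE_BASELINE_MM, depth_mm, FOCAL_PX)
    eyes = disparity_px(EYE_BASELINE_MM, depth_mm, FOCAL_PX)
    print(f"{depth_m:>4} m: phone-like {phone:.1f} px, eye-like {eyes:.1f} px")
```

Under this model, disparity scales linearly with the baseline, so a narrower lens spacing produces proportionally weaker depth cues at every distance.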

Apple stores the MV-HEVC video at 1080p resolution and 30 frames per second (1080p30). As discussed in more detail below, this tells you several things, specifically that the video is neither 4K nor 360 degrees.
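For a sense of scale, the arithmetic behind 1080p30 versus 4K can be sketched as follows. The 1.5 bytes per pixel figure assumes 8-bit 4:2:0 sampling, which is an assumption for illustration and not something stated about Apple's files. The numbers describe uncompressed video, not the size of the encoded MV-HEVC file.

```python
def pixels_per_frame(width, height):
    """Number of pixels in one frame of one view."""
    return width * height


def raw_bytes_per_second(width, height, fps, views, bytes_per_pixel=1.5):
    """Uncompressed data rate; 1.5 bytes/pixel assumes 8-bit 4:2:0 (an assumption)."""
    return pixels_per_frame(width, height) * bytes_per_pixel * fps * views


full_hd = pixels_per_frame(1920, 1080)
uhd_4k = pixels_per_frame(3840, 2160)
print(f"1080p frame: {full_hd:,} pixels")
print(f"4K UHD frame: {uhd_4k:,} pixels ({uhd_4k // full_hd}x as many)")

stereo_rate = raw_bytes_per_second(1920, 1080, fps=30, views=2)
print(f"two raw 1080p30 views: {stereo_rate / 1e6:.1f} MB/s before compression")
```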

How MV-HEVC Compression Works

Figure 2. The frame structure for MV-HEVC video. The stream for the left eye is a standard HEVC stream; the other stream contains only the differences between the left and right eye views.

MV-HEVC stores a base view, usually the left-eye stream, and a stereo layer containing the deltas (differences) between the left- and right-eye views. These deltas are encoded using standard HEVC frame types. This “2D Plus Delta” technique makes the stream backward compatible: standard 2D decoders play only the base view for the left eye, while 3D decoders reconstruct both views and present each to the corresponding eye, creating a 3D effect.
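The backward compatibility described above rests on how HEVC labels its data. Each NAL unit (the basic packet of an HEVC bitstream) starts with a two-byte header that includes a layer identifier, and a decoder that only understands single-layer HEVC can skip every unit whose layer identifier is above zero. The sketch below parses that header and filters a synthetic stream. It assumes the base view sits in layer 0 and the second view in layer 1, and the payload bytes are placeholders rather than real video data. The helper names are hypothetical.

```python
def make_nal_header(nal_unit_type, nuh_layer_id, temporal_id_plus1=1):
    """Pack the 2-byte HEVC NAL header: 1 zero bit, 6-bit type, 6-bit layer, 3-bit tid."""
    value = (nal_unit_type << 9) | (nuh_layer_id << 3) | temporal_id_plus1
    return value.to_bytes(2, "big")


def parse_nal_header(nal_unit):
    """Unpack the fields from the first two bytes of a NAL unit."""
    value = int.from_bytes(nal_unit[:2], "big")
    return {
        "nal_unit_type": (value >> 9) & 0x3F,
        "nuh_layer_id": (value >> 3) & 0x3F,
        "temporal_id_plus1": value & 0x07,
    }


def select_layers(nal_units, max_layer_id):
    """Keep only the NAL units a decoder limited to max_layer_id would process."""
    return [nal for nal in nal_units
            if parse_nal_header(nal)["nuh_layer_id"] <= max_layer_id]


# Synthetic stream: each frame has a base-view unit (layer 0) and a second-view
# unit (layer 1, an assumption). Payloads are placeholder bytes, not real video.
TRAIL_R = 1  # NAL unit type for an ordinary coded slice
stream = []
for frame in range(3):
    stream.append(make_nal_header(TRAIL_R, 0) + b"base-view-data")
    stream.append(make_nal_header(TRAIL_R, 1) + b"second-view-delta")

two_d_units = select_layers(stream, max_layer_id=0)
three_d_units = select_layers(stream, max_layer_id=1)
print(f"2D decoder processes {len(two_d_units)} of {len(stream)} units")
print(f"3D decoder processes {len(three_d_units)} of {len(stream)} units")
```

The 2D path never needs to understand the second layer, which is why the same file can serve both kinds of player.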

Playback on Apple Vision Pro

Devices like the Apple Vision Pro decode and display both streams of the MV-HEVC content, providing a depth-filled viewing experience beyond traditional 2D content. However, it’s important to note that while MV-HEVC facilitates a form of 3D video, it does not support fully immersive 360-degree videos.

The difference is illustrated in Figure 3. On the left is, in essence, MV-HEVC video as you would experience it on the Apple Vision Pro when viewing spatial video shot on the iPhone. Three degrees of freedom means you can move your head up, down, and side to side and view something different, but you’re viewing a 1080p video with the illusion of depth. It has edges on all four sides and is relatively low resolution (at 1080p).

Figure 3. MV-HEVC video compared to fully immersive 3D video.

Other technologies, like V-Nova PresenZ, offer much more immersion with 6 degrees of freedom. This means truer depth, so if you walk into the video, objects move from in front to behind you.
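The difference between 3 and 6 degrees of freedom can be sketched in a few lines. With 3 degrees of freedom a player tracks only head rotation; with 6 it also tracks where the viewer's head is in space. The toy model below is a simplification with hypothetical names and distances, and it is not how any particular headset or format works. It only shows why walking forward changes what is in front of you in one case and not the other.

```python
from dataclasses import dataclass


@dataclass
class HeadPose:
    # Rotation, tracked by both 3DoF and 6DoF playback.
    yaw_deg: float = 0.0
    pitch_deg: float = 0.0
    roll_deg: float = 0.0
    # Translation in meters, used only by 6DoF playback. z_m is "forward".
    x_m: float = 0.0
    y_m: float = 0.0
    z_m: float = 0.0


def forward_distance_to_object(object_z_m, pose, six_dof):
    """Distance from viewer to object along the forward axis; negative means behind."""
    viewer_z = pose.z_m if six_dof else 0.0  # 3DoF ignores head translation
    return object_z_m - viewer_z


# Hypothetical scene: an object 2 m ahead, and a viewer who walks 3 m forward.
pose = HeadPose(z_m=3.0)
for six_dof in (False, True):
    distance = forward_distance_to_object(2.0, pose, six_dof)
    where = "behind" if distance < 0 else "in front of"
    label = "6DoF" if six_dof else "3DoF"
    print(f"{label}: object is {abs(distance):.1f} m {where} the viewer")
```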

This is neither a criticism of the iPhone nor of MV-HEVC. You can’t produce an immersive experience with a consumer smartphone, and the storage requirements for immersive video with 6 degrees of freedom are orders of magnitude higher than those for MV-HEVC. Sometimes, however, it’s easier to understand what you get with a particular technology when you understand what you don’t.

Working with MV-HEVC

At least in the short term, editing and output support for MV-HEVC are limited, though this may not last. Many video editors currently edit stereoscopic content; the question with MV-HEVC videos from the iPhone is input and output, meaning ingesting MV-HEVC into the editor and outputting it in the same format.

As of this writing, you can only trim spatial videos on the iPhone; you can’t otherwise edit them. This video describes how to convert spatial videos into an editable format using the iPhone app Spatialify, while this video describes editing spatial video in DaVinci Resolve, though again by converting MV-HEVC into an editable format. It does not appear that Final Cut Pro supported MV-HEVC editing as of May 2024.

On the encoding side, Ateme has announced that its TITAN encoders support MV-HEVC and showed this capability at NAB 2024. The Alteon.io platform also supports MV-HEVC. While this almost certainly means ingest, the blog didn’t state whether the platform could output MV-HEVC. At the time of this writing, it doesn’t appear that FFmpeg supports MV-HEVC output.

Conclusion

To date, MV-HEVC is almost exclusively linked to spatial video produced by the iPhone 15 Pro and consumed on the Apple Vision Pro. As evidenced by application support, industry acceptance has been meager, making it more a format for hobbyists than professional producers. Certainly, at 1080p30, MV-HEVC footage captured by these iPhones has limited use for professional productions. We’ll see if any of this changes over the next few years.

About Jan Ozer

I help companies train new technical hires in streaming media-related positions; I also help companies optimize their codec selections and encoding stacks and evaluate new encoders and codecs. I am a contributing editor to Streaming Media Magazine, writing about codecs and encoding tools. I have written multiple authoritative books on video encoding, including Video Encoding by the Numbers: Eliminate the Guesswork from your Streaming Video (https://amzn.to/3kV6R1j) and Learn to Produce Video with FFmpeg: In Thirty Minutes or Less (https://amzn.to/3ZJih7e). I have multiple courses relating to streaming media production, all available at https://bit.ly/slc_courses. I currently work at www.netint.com as a Senior Director in Marketing.
