Defining Transcoder, VPU, and VCU

A woman walks into a bar and says, “I’ll have two video transcoders, please.” The bartender says, “We don’t carry transcoders; we have VPUs. Will they do?” The woman scratches her head and says, “Hmm, I don’t know.”

Do you? If not, you’re in the right place. This short article will briefly define transcoder, VPU, and VCU, and get you in and out in under 4 minutes. And hey, it will definitely help the next time you try to order encoding gear in a bar.

In the Beginning, There Were Transcoders

Simply stated, a transcoder is any technology, software or hardware, that takes in a compressed stream (decode) and outputs a compressed stream (encode). FFmpeg is a transcoder, and it works fine for most low-volume video-on-demand applications. For live applications, particularly high-volume live interactive applications, you’ll probably need a hardware transcoder to achieve the necessary capital cost per stream (CAPEX), operating cost per stream (OPEX), and density.
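To make the decode-then-encode idea concrete, here is a minimal sketch in Python that assembles an FFmpeg command line for a simple software transcode. The file names, the libx264 encoder, and the 3M bitrate are assumptions chosen for illustration, not settings from this article. The script only builds and prints the command; it does not run FFmpeg.

```python
import shlex


def build_transcode_command(src, dst, video_codec="libx264", bitrate="3M"):
    """Return an FFmpeg argument list that decodes src and re-encodes it to dst.

    The codec and bitrate defaults are illustrative assumptions.
    """
    return [
        "ffmpeg",
        "-i", src,            # FFmpeg selects a decoder based on the input stream
        "-c:v", video_codec,  # encoder used for the output video
        "-b:v", bitrate,      # target video bitrate
        "-c:a", "copy",       # pass the audio through without re-encoding
        dst,
    ]


# Hypothetical file names, used only to show the shape of the command.
cmd = build_transcode_command("input.mp4", "output.mp4")
print(shlex.join(cmd))
```

Passing the list to subprocess.run would execute the transcode on a machine with FFmpeg installed. A hardware transcoder plugs into the same decode and encode steps but moves the work off the host CPU.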

A transcoder decodes input files and encodes output files.

If Webster’s defined transcoder (it doesn’t), it might have a picture of the NETINT T408 as the perfect transcoder specimen. Based on a custom transcoding ASIC, the T408 is inexpensive ($400), capable (4K @ 60 FPS or 4x 1080p60 streams), flexible (H.264 and HEVC), and exceptionally power efficient (only 7 watts).

T408 Video Transcoder

The T408 Video Transcoder decodes and encodes H.264 and HEVC video. What doesn’t the T408 do? Well, that leads us to the difference between a transcoder and a VPU.

First, the T408 doesn’t scale the video. If you’re building a full encoding ladder from a high-resolution source, the host CPU performs all the scaling for the lower rungs. In addition, the T408 doesn’t perform overlay in hardware. So, if you insert a logo or other bug over your videos, again, the CPU does the heavy lifting.
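As a rough sketch of what scaling on the host looks like, the Python snippet below builds a single FFmpeg command that produces a three-rung ladder from a 1080p source. The rung heights, bitrates, output names, and the libx264 encoder are all hypothetical. FFmpeg's scale filter, as used here, runs in software on the CPU, which is the work a transcoder-only device leaves to the host.

```python
# Hypothetical ladder: (output height, video bitrate). Not from the article.
LADDER = [(1080, "5M"), (720, "3M"), (360, "1M")]


def build_ladder_command(src, ladder=LADDER, source_height=1080):
    """Return one FFmpeg argument list that writes every rung of the ladder."""
    cmd = ["ffmpeg", "-i", src]
    for height, bitrate in ladder:
        if height < source_height:
            # Software scaling on the host CPU; -2 keeps the aspect ratio
            # and rounds the width to an even number.
            cmd += ["-vf", f"scale=-2:{height}"]
        # Output options apply to the output file that follows them.
        cmd += ["-c:v", "libx264", "-b:v", bitrate, f"out_{height}p.mp4"]
    return cmd


ladder_cmd = build_ladder_command("source_1080p.mp4")
print(" ".join(ladder_cmd))
```

Every rung below the source resolution adds a scale operation on the CPU, so the host load grows with the number of lower rungs.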

Finally, the T408 was launched in 2019, the first ASIC-based transcoder to ship in quite a long time. So, it’s not surprising that it doesn’t incorporate any artificial intelligence processing capabilities.

What is a Video Processing Unit (VPU)?

A Video Processing Unit is a hardware device that performs decode and encode, plus other functions like scaling, overlay, and AI processing. You can see this in the transcoding pipeline shown below, which is for the NETINT Quadra.

The processing pipeline for the Quadra Video Processing Unit.

When it came to labeling the Quadra, you see the problem: it does much more than a video transcoder. Not only does it decode VP9, H.264, and HEVC and encode AV1, H.264, and HEVC, it also performs scaling, overlay, and AI processing in hardware. That makes it a video processing unit (VPU). For the record, the Quadra delivers four times the throughput of the T408 (one 8K60 output, or 16 1080p60 outputs) and costs around $1,500.
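The four-times figure can be checked with simple pixel-rate arithmetic. The sketch below assumes that 4K means 3840x2160 and 8K means 7680x4320 (the UHD frame sizes), which the article does not state. The prices and stream counts are the ones quoted above.

```python
def pixel_rate(width, height, fps, streams=1):
    """Pixels processed per second for a number of identical streams."""
    return width * height * fps * streams


# T408 figures quoted in the article: one 4K60 stream or four 1080p60 streams.
t408_4k = pixel_rate(3840, 2160, 60)
t408_1080p = pixel_rate(1920, 1080, 60, streams=4)

# Quadra figures quoted in the article: one 8K60 output or sixteen 1080p60 outputs.
quadra_8k = pixel_rate(7680, 4320, 60)
quadra_1080p = pixel_rate(1920, 1080, 60, streams=16)

throughput_ratio = quadra_8k / t408_4k

# Hardware price per 1080p60 stream, using the article's prices.
t408_cost_per_stream = 400 / 4
quadra_cost_per_stream = 1500 / 16

print(f"T408 pixel rate:   {t408_4k:,} pixels/s")
print(f"Quadra pixel rate: {quadra_8k:,} pixels/s")
print(f"Throughput ratio:  {throughput_ratio:.1f}x")
print(f"Cost per 1080p60 stream: T408 ${t408_cost_per_stream:.2f}, "
      f"Quadra ${quadra_cost_per_stream:.2f}")
```

Under these assumptions, one 4K60 stream and four 1080p60 streams carry the same pixel rate, and the Quadra figures work out to exactly four times the T408 figures.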

A VPU is a device that decodes, encodes, scales, overlays, and has AI capabilities.

As much as NETINT would like to claim the acronym, it existed before the Quadra. That said, if Webster’s defined VPU (it doesn’t)... oh, you get the point. Here’s the required glamour shot of the Quadra.

Here’s the NETINT Quadra VPU.

Just to throw a final wrinkle into the mix, when Google shipped their ASIC-based transcoder, Argos, they labeled it a Video Coding Unit, or VCU. Like the T408 and Quadra, the benefits of this ASIC-based technology are profound; as reported by CNET, “Argos handles video 20 to 33 times more efficiently than conventional servers when you factor in the cost to design and build the chip, employ it in Google’s data centers, and pay YouTube’s colossal electricity and network usage bills.”

Of course, Google doesn’t sell the Argos, so if you want the CAPEX and OPEX efficiencies that ASIC-based VPUs deliver, you’ll have to buy from NETINT. Like our woman in the bar, make it a double.

About Jan Ozer

I help companies train new technical hires in streaming media-related positions; I also help companies optimize their codec selections and encoding stacks and evaluate new encoders and codecs. I am a contributing editor to Streaming Media Magazine, writing about codecs and encoding tools. I have written multiple authoritative books on video encoding, including Video Encoding by the Numbers: Eliminate the Guesswork from your Streaming Video (https://amzn.to/3kV6R1j) and Learn to Produce Video with FFmpeg: In Thirty Minutes or Less (https://amzn.to/3ZJih7e). I have multiple courses relating to streaming media production, all available at https://bit.ly/slc_courses. I currently work at NETINT (www.netint.com) as a Senior Director in Marketing.
