Four days on the show floor at IBC solidified my view that your choice of live encoder is dictated by your encoding application. In this article, I’ll review the types of encoders and the trade-offs associated with each, and identify the type of encoder that works best for a few selected encoding applications.
Types of Live Transcoders
There are generally four kinds of encoders: software-only encoding driven by the host CPU, and hardware encoding delivered by CPUs, GPUs, and ASICs. Compared to software-only encoding, these hardware encoders include codec-specific gates on the chip to perform the transcoding, relieving the host CPU of most transcoding chores.
All encoding approaches involve different trade-offs, and most of these are delineated in Table 1. As the table shows, the primary advantage of software encoding is quality. That is, with a software encoder running on a high-core-count computer, or in a distributed environment like the cloud, you can optimize quality beyond that attainable by hardware transcoders.
The trade-offs are equally obvious. High CPU usage translates to increased cost, increased power consumption and the associated carbon emissions, and some increase in latency. And a solution driven by a high CPU count isn’t very dense, so throughput is limited, particularly with more advanced codecs like HEVC or AV1.
In contrast, all hardware devices, whether CPU, GPU, or ASIC, deploy a limited set of encoding tools to deliver real-time throughput. This produces slightly lower quality than the best software-only solutions running on multi-core systems, but with many operational and OPEX advantages, though these vary by type.
CPUs are Central Processing Units, the chips that drive your computer system. They are relatively expensive devices, which boosts the cost per stream. CPUs are also very power hungry; for example, the Intel Core i9-12900K CPU draws 150 watts of power, more than 21x the power of the NETINT T408 ASIC. Most computers have only one or two CPU sockets, which limits overall throughput and translates to very low density.
GPUs are Graphics Processing Units, relatively general-purpose devices designed primarily for graphics applications. This broad functionality boosts the price of most data-center-oriented GPUs; for example, the NVIDIA T4 costs $2,299. GPUs are also power hungry; the same T4 draws 70 watts while producing about 10.5 H.264 1080p streams at normal latency. While it’s possible to incorporate a large number of GPUs into a single system, the power requirements are significant, boosting OPEX and carbon emissions.
Which takes us to ASICs. ASIC stands for Application-Specific Integrated Circuit, and NETINT’s ASICs are purpose-built for video decoding and encoding. This focused functionality enables minimal power consumption. For example, the NETINT T408, which can produce eight 1080p30 streams in either H.264 or HEVC format, draws only 7 watts. The newer-generation Quadra can produce 32 1080p30 streams in H.264, HEVC, or AV1, and draws only 20 watts.
This makes power a non-issue when configuring a system; you can deploy up to 24 T408s or Quadras in a 2RU server without any special power requirements. Form factor is another plus: both ASIC generations are available in the U.2 form factor used by NVMe storage devices, and enclosures are compact and inexpensive. Both generations are also available on PCIe cards; for example, the T432 is a low-height PCIe card with four Codensity™ G4 ASICs, and the Quadra T2 is a similarly sized card with two Codensity G5 ASICs.
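The density math is easy to check. This sketch uses only the per-device figures quoted above (8 streams/7 W for the T408, 32 streams/20 W for the Quadra, 24 devices per 2RU server) to compute per-server throughput and power draw:

```python
# Per-device figures quoted in the article
DEVICES = {
    "T408":   {"streams": 8,  "watts": 7},    # 1080p30 streams, power draw
    "Quadra": {"streams": 32, "watts": 20},
}
DEVICES_PER_2RU = 24  # U.2 devices per 2RU server, per the article

for name, spec in DEVICES.items():
    total_streams = spec["streams"] * DEVICES_PER_2RU
    total_watts = spec["watts"] * DEVICES_PER_2RU
    print(f"{name}: {total_streams} x 1080p30 streams at {total_watts} W per 2RU")
```

A fully loaded T408 server works out to 192 1080p30 streams at 168 watts; a Quadra server, 768 streams at 480 watts — both well within an ordinary server’s power budget.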
Table 2 shows the cost and power consumption per stream for two NETINT transcoders and the NVIDIA T4. As you can see, the ASICs deliver about a 75% reduction in cost per stream and up to a 90% reduction in power consumption.
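The power side of this comparison can be sanity-checked from the figures already quoted: divide each device’s draw by its stream count to get watts per stream, then compare against the T4. (Only power is reproduced here; the cost side depends on NETINT pricing not quoted in this article.)

```python
# Watts per 1080p stream, from figures quoted earlier in the article
t4_w_per_stream = 70 / 10.5      # NVIDIA T4: 70 W, ~10.5 H.264 1080p streams
t408_w_per_stream = 7 / 8        # T408: 7 W, 8 x 1080p30 streams
quadra_w_per_stream = 20 / 32    # Quadra: 20 W, 32 x 1080p30 streams

for name, w in [("T408", t408_w_per_stream), ("Quadra", quadra_w_per_stream)]:
    reduction = 1 - w / t4_w_per_stream
    print(f"{name}: {w:.3f} W/stream, {reduction:.1%} less than the T4")
```

These back-of-the-envelope numbers (roughly an 87% reduction for the T408 and 91% for the Quadra) line up, within rounding, with the "up to 90%" figure above.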
Matching Production Type to Live Transcoder
Now that we know the encoding methods and the relevant differentiating criteria, let’s look at three applications and identify the encoding type ideal for each.
The Super Bowl, or any similar event, is a one-off production where video quality is of paramount importance, and the rest of the differentiating criteria are largely irrelevant. This makes software encoding the preferred encoding tool.
Gaming origination means encoding a single stream on the game player’s computer. Quality is always important, but since the gamer didn’t pay billions for broadcast rights, it’s not as critical as for the Super Bowl. Latency is a bit more important, but since the stream originates from a single player, cost per stream, power consumption, and density are largely unimportant.
Another factor not addressed in the table, utilization of the host CPU, disqualifies software encoding for many gamers. High CPU utilization can translate to sluggish gameplay, so many gamers prefer either CPU- or GPU-based hardware encoding.
The final application is 24/7 interactive gaming. Again, quality is always important, though not up to Super Bowl levels. Here, however, the other factors become much more important. Interactive gaming lives and dies on low latency, though most hardware encoders can meet this requirement.
However, cost per stream, power consumption, and density really differentiate the hardware contenders. A service provider may support thousands or tens of thousands of simultaneous inputs, requiring a low cost per stream. Power consumption is also critical, particularly in Europe where energy prices have spiked for corporations over the last few months.
As mentioned above, CPUs can’t compete on any of these parameters, and between GPUs and ASICs, ASICs are far and away the less expensive alternative across the board.
Summary and Conclusion
The bottom line is that all encoder categories have their strengths and weaknesses. When choosing an encoder, you should first delineate the particular requirements of your production and choose the encoder that delivers the best match for those requirements.