Encoding Brief: Apple Releases HLS Authoring Specification for Apple TV

March 30th, 2016 | Blogs

Executive Summary: In October 2015, Apple released a document entitled HLS Authoring Specifications for Apple TV (HLS stands for HTTP Live Streaming, the adaptive bit rate technology used to deliver video to Apple TV and other iOS devices). If you’re producing for Apple TV and aren’t aware of these specs, you should review them immediately. In a broader sense, Apple is introducing another TN2224-like fixed encoding ladder while the industry moves towards content-aware encoding, with a different ladder for each type of content. It will be interesting to see how this dynamic evolves in the next 12-24 months.

Overview and Discussion:

Apple is obviously the expert when it comes to encoding for their devices, and the new document is much more specific than Apple Technote TN2224, the previous document of record for Apple TV encoding. The new specification has 12 sections, including video encoding, audio encoding, advertising, accessibility, subtitles, trick play, media segment durations (6 seconds with a keyframe interval of 2 seconds), M3U8 playlists, master playlists, and security. The specification does not reference any Apple testing requirements, or what happens if the requirements or recommendations are not met. Presumably, as with TN2224, producers who don’t meet the spec may have their apps rejected when submitted for Apple TV approval. 
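As a rough illustration of how the segment and keyframe figures above fit together, the sketch below works out the GOP length in frames and the number of keyframes per segment. The function name `gop_plan` and the 30 fps example are assumptions made for illustration; only the 6-second segment and 2-second keyframe interval come from the spec as described here.

```python
def gop_plan(fps: int, keyframe_interval_s: int = 2, segment_duration_s: int = 6) -> dict:
    """Derive frame counts from a keyframe interval and a segment duration.

    Hypothetical helper: assumes an integer frame rate and that every
    segment should start on a keyframe.
    """
    if segment_duration_s % keyframe_interval_s != 0:
        # A segment can only start on a keyframe if its duration is a
        # whole multiple of the keyframe interval.
        raise ValueError("segment duration must be a multiple of the keyframe interval")
    return {
        "gop_frames": fps * keyframe_interval_s,
        "keyframes_per_segment": segment_duration_s // keyframe_interval_s,
        "frames_per_segment": fps * segment_duration_s,
    }


# Example: a 30 fps stream (frame rate chosen for illustration only)
plan = gop_plan(30)
print(plan)
```

With these defaults, each segment contains three GOPs, so a player can begin decoding at the start of any segment.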

Most provisions make perfect sense. For example, producers can now use 200% constrained VBR for encoding, rather than the 110% constrained VBR in TN2224.
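To make the difference concrete, here is a minimal sketch that assumes "N% constrained VBR" means the peak bit rate is capped at N% of the target (average) bit rate, which is the common reading of the term. The 2000 kbps target and the function name `peak_bitrate_cap` are hypothetical and not taken from Apple's ladder.

```python
def peak_bitrate_cap(target_kbps: float, constraint_percent: float) -> float:
    """Peak bit rate allowed under constrained VBR.

    Assumption: the constraint is expressed as a percentage of the
    target (average) bit rate.
    """
    return target_kbps * constraint_percent / 100.0


target_kbps = 2000  # hypothetical target bit rate for one rung

cap_tn2224 = peak_bitrate_cap(target_kbps, 110)    # TN2224-style constraint
cap_new_spec = peak_bitrate_cap(target_kbps, 200)  # constraint in the new spec

print(f"110% constrained VBR: peak up to {cap_tn2224:.0f} kbps")
print(f"200% constrained VBR: peak up to {cap_new_spec:.0f} kbps")
```

The looser cap gives the encoder more headroom to spend bits on hard-to-compress scenes while keeping the same average.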

30i to 60p?

Others are confusing. For example, the spec states, “You SHOULD de-interlace 30i content to 60p.” A note to the encoding ladder (presented below) also states, “30i source content is considered to have a source frame rate of 60 fps.” Though 30i footage is clearly on the wane, the approach taken by the vast majority of producers is to deinterlace to 30p, not 60p. 

One problem is that each 30i field has only half the resolution of the complete frame. Converting from 30i to 60p works well if you subsample the video down to half the original resolution, but requires special processing and interpolation to create a frame with the original resolution (see here). These techniques are time consuming and generally not incorporated into most encoding tools. 

The other problem with 60p files is that at the same encoded data rate as 30p, the 60p file has half the bits per pixel value, which generally means lower compressed quality. Combining this with the interpolation described above could lead to significantly degraded video quality. 
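The bits-per-pixel arithmetic can be sketched as follows. The 1080p resolution and 4.5 Mbps data rate are illustrative assumptions, not values from Apple's table; the point is only that doubling the frame rate at a fixed data rate halves the result.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of encoded bits available per pixel, per frame."""
    return bitrate_bps / (width * height * fps)


# Hypothetical stream: 1920x1080 at 4.5 Mbps (values chosen for illustration)
bitrate_bps = 4_500_000
bpp_30p = bits_per_pixel(bitrate_bps, 1920, 1080, 30)
bpp_60p = bits_per_pixel(bitrate_bps, 1920, 1080, 60)

print(f"30p: {bpp_30p:.4f} bits per pixel")
print(f"60p: {bpp_60p:.4f} bits per pixel")
print(f"60p / 30p ratio: {bpp_60p / bpp_30p:.2f}")
```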

Note that the new spec references RFC 2119 for the meaning of words like SHOULD. In this regard, RFC 2119 states, “This word, or the adjective “RECOMMENDED”, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.” While most of the new spec seems reasonable, this requirement SHOULD very likely be ignored. 

Regarding the Fixed Encoding Ladder

Apple's recommended ladder is fixed, but Netflix started what will be a groundbreaking trend with its per-title optimization strategy, announced in December 2015. As explained in the Netflix blog post, because all videos present different degrees of encoding complexity, deploying the same fixed ladder for all is inherently inefficient and suboptimal. Netflix claims that per-title encoding is necessary to optimize viewer quality of experience, which tests performed by the Streaming Learning Center have confirmed. YouTube has since revealed a similar strategy, and there will be a number of third-party encoding tools and/or services enabling per-title optimization in the first half of 2016 and beyond. Most major producers will likely be using per-title encoding by early to mid 2017, if not before.

The spec introduces the new ladder with the statement, “For expected bit rate variants, see Table 2-1” (included below), which is devoid of any RFC2119-defined terms. So it’s unclear whether using the fixed ladder is mandatory. 



1. If you’re encoding for Apple TV, you should check out the new specification immediately.

Additional Resources: 

HLS Authoring Specification for Apple TV

Apple Technical Note TN2224

How Netflix Pioneered Per-Title Video Encoding Optimization, Streaming Media Magazine, January 14, 2016. 

Those unfamiliar with HTTP Live Streaming may want to read How to Encode Video for HLS Delivery, Streaming Media Magazine, March 2014.

About the Streaming Learning Center:    

The Streaming Learning Center is a premier resource for companies seeking advice or training on cloud or on-premise encoder selection, preset creation, video quality optimization, adaptive bitrate distribution, and associated topics. Please contact Jan Ozer at jan@streaminglearningcenter.com for more information about these services.


#1 Sylvain Corvaisier said this on 03/31/2016 at 05:58 am

The document seems to be a compiled version of "best practices for HLS" (gathered from the HLS Draft, Apple developer forums and deployments) over the years, which is good for those who have not been familiar with the standard from day 1.

This encoding ladder is not mandatory, we now have 15 linear TV channels in apps available on Apple TV, and a few VOD ones. I have to say though that the ladder we've been using for 3 years is very close to the one recommended by Apple now on Apple TV. Some other "MUST" from the document don't prevent validation

The ads section is really important as it says only Server Side Ad Insertion/Ad Stitching should be used. That makes it the best way to prevent adblocking.

It's interesting to see that Apple also suggests 2s keyframe period (10s before), I guess to enhance stream launch times on Apple TV.

If followed strictly, these recommendations would mean it wouldn't be possible to share the same encoding ladder with other devices (even iOS ones), even with manifest manipulation. This is untrue; we share the same encoding ladders (with manifest manipulations) for streams also broadcast on Apple TV.


#4 Jan said this on 04/01/2016 at 07:51 am, in reply to #1


Thanks for the real world input. 

:-) Jan 

#2 Ian Getz said this on 03/31/2016 at 06:06 pm

I always find great articles on this site. But maybe I missed something here... where is this 'new specification'? 

The only changes to the specification at https://developer.apple.com/library/tvos/documentation/General/Reference/HLSAuthoringSpec/index.html#//apple_ref/doc/uid/TP40016596-CH4-SW1 are the following:


Updated Dolby Digital bitrate recommendation to 384 kbps.


Fixed typo in section 10.2.

Thank you.


#3 Jan said this on 04/01/2016 at 07:46 am, in reply to #2

Thanks Ian; as the article states, the spec was announced in October of 2015. However, I first heard about it about two weeks ago. So, while not new, it is new to me.


Thanks for the kind words about the site.

Jan

#5 Ian Getz said this on 04/01/2016 at 10:02 am, in reply to #3

Ah okay. Thanks, Jan. Fair enough. :D

I learn so much from your articles so please keep up the great work.

#6 Alex Zambelli said this on 04/01/2016 at 06:45 pm

To be fair, the practice of deinterlacing 30i to 30p (and 25i to 25p) for the web was primarily driven by two factors:

1) the desire to get higher picture (spatial) quality at a lower bitrate, and

2) the inability of early software-decoded web plugins and mobile devices to smoothly decode and render 60 fps. For example, prior to the introduction of hardware-accelerated decoding and scaling in Flash and Silverlight (around 2011), neither was able to provide a smooth 60 fps playback experience.

So on one hand I support Apple's insistence on preserving the full temporal resolution of programs recorded at 60 or 50 Hz in order to provide a full TV-like experience, but I disagree with their suggestion that the jump from half-framerate to full-framerate should occur at 2 Mbps. Doubling the framerate while keeping the same bitrate hurts spatial picture quality. I would've preferred to see either a separate encoding profile for 25/30 vs 50/60 fps, or the full framerate variant introduced only at 7.8 Mbps.