Contextual Targeting’s Second Act: Past, Present, and Future

Contextual advertising matches ads to content rather than to people. The logic is straightforward: a viewer watching content about home renovation is likely receptive to a Home Depot ad at that moment, whereas a behavioral retarget from last week’s browsing is not. The problem is that behavioral targeting depends on third-party data that is becoming increasingly unavailable as privacy regulation tightens and identifiers disappear.

Over the last few years, AI-based technologies, including automated speech recognition, computer vision, and large language models, have given marketers a much more accurate read on the content’s topic, tone, and mood, making contextual targeting a serious strategy rather than a fallback.

Contextual ads have always had a role in CTV. But for most of its history, contextual was the fallback option when better targeting data wasn’t available. Targeting by show genre, channel, or network was blunt but serviceable, and the main job was avoiding bad adjacencies rather than finding the right moment.

Two things changed. Privacy regulations and signal loss have made behavioral targeting harder, putting contextual targeting back on the table as a primary strategy rather than a safety net. And AI rebuilt what contextual can actually do: what used to mean keyword lists and genre labels now means scene-level analysis, reading the objects on screen, the dialogue, and the emotional tone, and using all of that to place ads in specific moments rather than broad content buckets.

The results when it works are notable. In an IAB Tech Lab article contributed by Amazon Ads, Amazon shared that a PepsiCo campaign using AI-driven contextual targeting reported 3x ROAS and a 62% reduction in cost per acquisition. A BlueAir campaign with agency Tinuiti saw CPMs drop 42% and new-to-brand customers increase 34%. Those are outlier results, not guarantees, but they show the ceiling. They suggest that when you stop targeting a generic ‘Sports Fan’ and start targeting the specific moment of a game-winning goal, the advertising becomes a service rather than an interruption.

The timeline below tracks that journey from 2015 to where things are headed in 2026, showing how contextual went from a blunt genre filter to something that can read emotional tone in a live sports broadcast.

If you want to go deeper than the timeline (or if you have no idea what ROAS, CPM, and cost per acquisition mean), my course, Streaming Monetization 101, covers contextual advertising, the privacy frameworks that made it relevant again, and the broader ad tech stack it sits within.

Here’s the timeline.

2015: Genre as Targeting
In 2015, contextual targeting mostly meant matching ads to the content on a page or the program on a screen. On the open web, advertisers relied on page‑level keywords, topics, and basic semantic analysis; on TV and early CTV, they aligned campaigns with shows, channels, or broad genres. It was valued as a privacy‑friendly, brand‑safe way to reach likely‑interested audiences based on what they were consuming in the moment, but it wasn’t yet treated as a finely tuned signal for intent or nuance.

2018: Blocklists and Blunt Instruments
Advertisers layered on keyword blocklists to avoid unsafe or off‑brand content. That helped, but it was crude—over‑blocking legitimate inventory and missing genuinely problematic adjacencies. Industry discussions began to distinguish “brand safety” from “brand suitability,” but the underlying tools were still simple: keyword and category rules.
Good overview of the safety vs suitability shift:
https://www.peer39.com/blog/brand-suitability
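To see why blocklists over-block, consider this toy Python sketch. The blocklist and page text are invented, but the failure mode is the classic one: a bare keyword match can’t tell a shooting guard from a shooting.

```python
# Sketch of why keyword blocklists over-block. The list and pages are
# invented, but the failure mode is the classic one.
BLOCKLIST = {"shooting", "crash", "attack"}

def is_blocked(page_text: str) -> bool:
    """Block the page if any bare keyword appears anywhere in it."""
    words = {w.strip(".,!?").lower() for w in page_text.split()}
    return bool(words & BLOCKLIST)

# Both perfectly safe pages get blocked:
print(is_blocked("Rookie shooting guard lifts team in overtime"))        # True
print(is_blocked("Markets rally after holiday crash course in savings")) # True
```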

2019–2020: Captions as Contextual Signal
As automated speech recognition and captioning got cheaper, the dialogue inside long‑form video became searchable. Contextual moved from “this is a comedy on Channel X” to “this scene is about money, health, or politics.” It was still noisy—ASR errors, sarcasm, and complex news tripped the systems—but it was a real step beyond genre.
Background on computational analysis of scripts/subtitles:
https://research-api.cbs.dk/ws/portalfiles/portal/109009600/Computational_Content_Analysis_in_Advertising_Research.pdf
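As a rough illustration of the idea (not any vendor’s actual pipeline), here’s a minimal sketch that scores a window of caption text against topic keyword lists. The topics, keywords, and captions are all invented for the example; real systems use far richer models than keyword matching.

```python
# Minimal sketch: score caption windows against topic keyword lists.
# Topics, keywords, and the captions below are invented for illustration.
from collections import Counter

TOPIC_KEYWORDS = {
    "finance": {"money", "loan", "mortgage", "invest", "savings"},
    "health": {"doctor", "diet", "exercise", "symptom", "sleep"},
    "politics": {"senate", "election", "policy", "vote", "campaign"},
}

def score_window(caption_lines):
    """Return topic keyword counts for one window of caption text."""
    words = " ".join(caption_lines).lower().split()
    counts = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        counts[topic] = sum(1 for w in words if w.strip(".,!?") in keywords)
    return counts

# One 30-second window of (hypothetical) ASR output:
window = ["We refinanced the mortgage last spring,",
          "and the savings went straight into the renovation."]
print(score_window(window).most_common(1))  # [('finance', 2)]
```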

2021: Computer Vision Sees the Screen
Computer‑vision models started detecting logos, products, faces, and activities directly from video frames. The image became a signal alongside the transcript. A beer brand could find scenes with people drinking; an auto brand could surface road or garage moments. Contextual stopped depending entirely on what characters said and started looking at what was actually on screen.
Logo/visual detection context:
https://dl.acm.org/doi/abs/10.1145/3611309
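Here’s a hypothetical sketch of the roll-up step: per-frame detection labels (standing in for the output of a real vision model, which isn’t shown) get aggregated into scene-level tags. The label names and tag rules are invented for illustration.

```python
# Sketch: roll per-frame object detections up to scene-level tags.
# The detections stand in for the output of a real vision model;
# label names and the tag rules are invented.

SCENE_RULES = {
    "driving_moment": {"car", "road", "steering_wheel"},
    "social_drinking": {"beer_bottle", "glass", "bar"},
}

def tag_scene(frame_detections, min_hits=2):
    """frame_detections: list of label sets, one per sampled frame."""
    seen = set().union(*frame_detections)
    return [tag for tag, required in SCENE_RULES.items()
            if len(seen & required) >= min_hits]

frames = [{"car", "road"}, {"car", "person"}, {"road", "sky"}]
print(tag_scene(frames))  # ['driving_moment']
```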

2022: First Multimodal Fusion
Vendors began combining transcripts, visual cues, and basic audio signals into unified content scores. Instead of separate “text” and “image” classifications, systems produced a single view of what each piece of content was about and how intense or emotional it was. This was the first practical version of multimodal contextual for video and CTV, even if coverage and accuracy were uneven.
Multimodal CTV overview:
https://www.anoki.ai/multimodal-ai-for-ctv-101
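A toy version of the fusion step might look like the following. The modalities, weights, and scores are illustrative; real systems are far more sophisticated than a hand-tuned weighted average.

```python
# Sketch: fuse per-modality relevance scores into one content score.
# Modalities, weights, and scores are illustrative, not any vendor's model.

def fuse(scores, weights):
    """Weighted average of per-modality scores for one topic."""
    total = sum(weights.values())
    return sum(scores[m] * w for m, w in weights.items()) / total

# How strongly each modality says this scene is about "cooking" (0-1):
scores = {"transcript": 0.9, "visual": 0.7, "audio": 0.2}
# Trust the transcript most, ambient audio least:
weights = {"transcript": 0.5, "visual": 0.35, "audio": 0.15}
print(fuse(scores, weights))  # ~0.725
```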

2023: Contextual 2.0 – LLMs Enter
Large language models changed expectations for what “understanding content” means. IAB Tech Lab describes this as the point where “the old limitations of contextual targeting fade away” because AI can “recognize deeper semantic connections” and see not just what’s on a page, but how it relates to behaviors and predicted actions. Vendors started talking about “Contextual 2.0” instead of just keyword lists.

This is not a return to cookie-based tracking. While legacy targeting follows a specific individual’s browsing history, Contextual 2.0 analyzes the intent inherent in the media itself. It recognizes that a viewer watching a high-performance car review is in a “purchasing mindset” regardless of who they are or what they searched for an hour ago.
IAB Tech Lab / Amazon Ads piece:
https://iabtechlab.com/the-ai-leap-in-contextual-advertising-transforming-a-legacy-solution/
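To make “Contextual 2.0” concrete, here’s a hypothetical sketch of scene classification with an LLM. call_llm stands in for whatever model API you actually use, and the prompt and label set are invented for the example.

```python
# Hypothetical sketch of "Contextual 2.0" classification with an LLM.
# call_llm is a stand-in for whatever model API you use; the prompt
# and label set are invented for illustration.

LABELS = ["purchasing mindset", "research mindset", "passive viewing"]

def classify_scene(transcript: str, call_llm) -> str:
    """Ask the model which viewer mindset best fits this scene."""
    prompt = (
        "Given this scene transcript, pick the single best viewer "
        f"mindset from {LABELS}. Answer with the label only.\n\n"
        f"Transcript: {transcript}"
    )
    return call_llm(prompt).strip()

# A trivial fake model so the sketch runs without any real API:
fake_llm = lambda prompt: "purchasing mindset"
scene = "The 0-60 time is astonishing, and it starts under forty grand."
print(classify_scene(scene, fake_llm))
```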

2024: The Moment, Not the Show
On CTV, the most visible shift was scene‑level targeting. Instead of buying an entire show, advertisers could buy specific moments inside it—tagged by theme, mood, objects on screen, or emotional tone. Anoki’s description captures the point: “a single program can contain many different contexts and emotional moments,” and scene‑level intelligence lets you target the ones that actually fit your message.
Scene‑level targeting intros:
https://www.viantinc.com/insights/blog/ctv-scene-level-targeting/
https://www.anoki.ai/post/hot-for-summer-scene-level-contextual-targeting-for-ctv-ads
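Conceptually, scene-level buying means the unit of inventory is a time range with metadata, not a show title. Here’s an invented sketch of matching scenes to a campaign brief; real scene metadata is vendor-specific.

```python
# Sketch: buy moments, not shows. The scene records and the brief
# below are invented; real scene metadata is vendor-specific.
from dataclasses import dataclass

@dataclass
class Scene:
    show: str
    start_s: int     # offset into the program, seconds
    end_s: int
    themes: set
    mood: str

scenes = [
    Scene("Road Trip Diaries", 0,   410, {"travel", "friendship"},   "upbeat"),
    Scene("Road Trip Diaries", 410, 680, {"car_trouble"},            "tense"),
    Scene("Road Trip Diaries", 680, 900, {"scenic_drive", "travel"}, "calm"),
]

def match(scenes, want_themes, want_mood):
    """Return only the scenes that fit the campaign's themes and mood."""
    return [s for s in scenes
            if s.themes & want_themes and s.mood == want_mood]

# An auto brand wants calm driving moments, not breakdowns:
for s in match(scenes, {"scenic_drive"}, "calm"):
    print(f"{s.show} @ {s.start_s}-{s.end_s}s")
```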

2024: Proof It Works (At the High End)
Contextual AI moved from deckware to case studies. In the IAB Tech Lab / Amazon Ads article, BlueAir’s campaign with Tinuiti using AI‑driven contextual reportedly delivered 2.4x greater detail page‑view rates, a 42% CPM drop, and 34% more new‑to‑brand customers; PepsiCo’s Prime Day effort reported 3x ROAS and a 62% lower CPA. These are outliers, but they showed the ceiling was real.
Case‑study details:
https://iabtechlab.com/the-ai-leap-in-contextual-advertising-transforming-a-legacy-solution/

2025: Privacy‑Safe and Standardized
As cookies are deprecated or disabled by users and identifiers get constrained, contextual is being reframed as “privacy‑first precision.” IAB Tech Lab emphasizes that AI‑driven contextual is strengthened by industry‑wide adoption of its Content Taxonomies, which standardize how AI‑derived context is labeled and traded. The AI handles the understanding; the taxonomy makes it interoperable between buyers and sellers.
Taxonomy and privacy framing:
https://iabtechlab.com/standards/content-taxonomy/
https://iabtechlab.com/event/taxonomy-mapping-made-ai-easy/
https://www.aidigital.com/blog/contextual-advertising
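As a hedged sketch of what that interoperability looks like on the wire, here’s how standardized content categories might ride along in an OpenRTB-style bid request. The content.cat and content.cattax field names follow my reading of OpenRTB 2.6; the category IDs and the cattax value are placeholders, not real taxonomy entries.

```python
# Sketch: AI-derived context traveling as standardized taxonomy IDs in an
# OpenRTB-style bid request. Field names (content.cat, content.cattax)
# follow my reading of OpenRTB 2.6; the IDs and cattax value below are
# placeholders, not real taxonomy entries.
import json

bid_request_fragment = {
    "app": {
        "content": {
            "genre": "documentary",
            # IDs drawn from an IAB Tech Lab Content Taxonomy (placeholders):
            "cat": ["CONTENT_CAT_ID_1", "CONTENT_CAT_ID_2"],
            # Enum saying which taxonomy the IDs come from (illustrative):
            "cattax": 2,
        }
    }
}
print(json.dumps(bid_request_fragment, indent=2))
```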

2026+: Emotional Intelligence and Live Moments (Projected)
The next frontier is emotional and live context. Roadmaps discuss “enhanced emotional intelligence,” “narrative arc awareness,” and “emotional journey mapping” across shows, so campaigns can follow tension, relief, or celebration rather than just topics. At the same time, publishers are piloting real‑time contextual for live content, where ads respond to the high-density data of a live stream, like a scoring play or a shift in crowd volume, rather than the static genre tags assigned weeks before kickoff. This moves the target from a broad category like “Live Sports” to a specific psychological window where viewer engagement is at its peak.
Live and emotional context coverage:
https://www.anoki.ai/multimodal-ai-for-ctv-101
https://www.streamtvinsider.com/advertising/nbcu-debuts-real-time-contextual-ad-targeting-live-content
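Since this is projected, any code is speculative, but the event-driven shape is easy to sketch: live signals open short, high-engagement windows that the ad decision can react to. The event names and trigger rule below are invented.

```python
# Speculative sketch: react to live in-game events rather than static
# genre tags. Event names and the trigger rule are invented.
HIGH_ENGAGEMENT = {"goal_scored", "crowd_volume_spike", "overtime_start"}

def on_live_event(event: dict) -> None:
    """Open a short high-engagement window when a peak moment hits."""
    if event["type"] in HIGH_ENGAGEMENT:
        print(f"[{event['ts']}] engagement window open: {event['type']} "
              "-> prioritize celebration-toned creative for the next break")

for ev in [{"ts": "71:03", "type": "substitution"},
           {"ts": "73:48", "type": "goal_scored"}]:
    on_live_event(ev)
```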

What This Means Now

Contextual targeting has traveled a long road from genre labels to scene-level emotional analysis, and it is still moving. The infrastructure is real, the early performance results are encouraging, and the privacy alignment is genuine.

The friction is now operational rather than technical. Three primary bottlenecks remain: the lack of standardized scene-level metadata across different buying platforms, the absence of creative versioning that can automatically match an ad’s tone to a scene’s sentiment, and the difficulty of attributing a sale to a specific moment rather than a broader campaign. Until these workflows are automated, the technology remains a high-touch tactic rather than a standard programmatic buy.

If you are buying or building CTV advertising today, the question is not whether contextual works. It clearly can. The question is whether your stack is ready to use it at the granularity that makes the difference between a genre label and a scene-level moment.

That is exactly what Streaming Monetization 101 covers. More information here.

About Jan Ozer

I help companies train new technical hires in streaming media-related positions; I also help companies optimize their codec selections and encoding stacks and evaluate new encoders and codecs. I am a contributing editor to Streaming Media Magazine, writing about codecs and encoding tools. I have written multiple authoritative books on video encoding, including Video Encoding by the Numbers: Eliminate the Guesswork from your Streaming Video (https://amzn.to/3kV6R1j) and Learn to Produce Video with FFmpeg: In Thirty Minutes or Less (https://amzn.to/3ZJih7e). I have multiple courses relating to streaming media production, all available at https://bit.ly/slc_courses. I currently work at www.netint.com as a Senior Director of Marketing.
