Transforming Video Metadata Management with AI: Lessons from DPG Media

The Power of AI in Enhancing Video Metadata: A Case Study with DPG Media

This post was co-written with Lucas Desard, Tom Lauwers, and Sam Landuydt from DPG Media.

DPG Media is a leading media company in Benelux operating multiple online platforms and TV channels. DPG Media’s VTM GO platform alone offers over 500 days of non-stop content.

With a growing library of long-form video content, DPG Media recognizes the importance of efficiently managing and enhancing video metadata such as actor information, genre, summary of episodes, the mood of the video, and more. Having descriptive metadata is key to providing accurate TV guide descriptions, improving content recommendations, and enhancing the consumer’s ability to explore content that aligns with their interests and current mood.

This post shows how DPG Media introduced AI-powered processes using Amazon Bedrock and Amazon Transcribe into its video publication pipelines in just 4 weeks, as an evolution towards more automated annotation systems.

The challenge: Extracting and generating metadata at scale

DPG Media receives video productions accompanied by a wide range of marketing materials such as visual media and brief descriptions. These materials often lack standardization and vary in quality. As a result, DPG Media producers have to screen the content themselves to understand it well enough to generate the missing metadata, such as brief summaries. For some content, additional screening is performed to generate subtitles and captions.

As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics.

The following were some initial challenges in automation:

– Language diversity – The services host both Dutch and English shows. Some local shows feature Flemish dialects, which can be difficult for some large language models (LLMs) to understand.
– Variability in content volume – The amount of content per title varies widely, from single-episode films to multi-season series.
– Release frequency – New shows, episodes, and movies are released daily.
– Data aggregation – Metadata needs to be available at the top-level asset (program or movie) and must be reliably aggregated across different seasons.

Solution overview

To address the challenges of automation, DPG Media decided to combine AI techniques with existing metadata to generate new, accurate content and category descriptions, mood, and context.

The project focused solely on audio processing due to its cost-efficiency and faster processing time. Video data analysis with AI wasn’t required for generating detailed, accurate, and high-quality metadata.

The general architecture of the metadata pipeline consists of two primary steps:

– Generate transcriptions of audio tracks – Use speech recognition models to generate accurate transcripts of the audio content.
– Generate metadata – Use LLMs to extract and generate detailed metadata from the transcriptions.
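
The two steps can be sketched as a small function that chains a transcription call and an LLM call. This is a minimal illustration, not DPG Media's implementation: the `EpisodeMetadata` fields, the prompt wording, and the `transcribe` and `complete` callables are all hypothetical stand-ins for whichever speech recognition service and LLM are actually used.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EpisodeMetadata:
    """Hypothetical metadata fields; the real schema is not given in the post."""
    summary: str
    genre: str
    mood: str


# Hypothetical prompt wording, for illustration only.
INSTRUCTIONS = (
    "Read the transcript and answer with a JSON object that has the keys "
    "'summary', 'genre', and 'mood'."
)


def generate_metadata(
    audio_uri: str,
    transcribe: Callable[[str], str],
    complete: Callable[[str], str],
) -> EpisodeMetadata:
    """Run the two pipeline steps for a single audio track.

    transcribe: maps an audio location to transcript text (step 1).
    complete:   maps a prompt to the raw LLM response text (step 2).
    """
    # Step 1: speech recognition produces the transcript.
    transcript = transcribe(audio_uri)

    # Step 2: the LLM extracts metadata from the transcript.
    prompt = INSTRUCTIONS + "\n\nTranscript:\n" + transcript
    raw_response = complete(prompt)

    # Assumes the model returns valid JSON; production code would validate this.
    fields = json.loads(raw_response)
    return EpisodeMetadata(
        summary=fields["summary"],
        genre=fields["genre"],
        mood=fields["mood"],
    )
```

Passing the two services in as callables keeps the pipeline logic testable without network access and makes it straightforward to swap one transcription or LLM backend for another.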

In the following sections, we discuss the components of the pipeline in more detail.

Step 1. Generate transcriptions of audio tracks

To generate the audio transcripts needed for metadata extraction, the DPG Media team evaluated two transcription options: Whisper-v3-large, which requires at least 10 GB of VRAM and considerable operational effort to run, and Amazon Transcribe, a managed service that adds speaker diarization and automatic model updates from AWS over time. The evaluation focused on two key factors: price-performance and transcription quality.
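
As a rough illustration of the managed-service option, the sketch below builds the request parameters for an Amazon Transcribe batch job with speaker diarization turned on. The post does not say how DPG Media configured its jobs, so the helper name `build_transcribe_request`, the default language code, and the speaker limit are assumptions; the keys themselves follow the `StartTranscriptionJob` request shape.

```python
from typing import Any, Dict


def build_transcribe_request(
    job_name: str,
    media_uri: str,
    language_code: str = "nl-NL",  # assumed default; Dutch content is mentioned in the post
    max_speakers: int = 10,        # assumed value, not taken from the post
) -> Dict[str, Any]:
    """Build keyword arguments for a Transcribe job with speaker labels.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    if max_speakers < 2:
        raise ValueError("speaker diarization needs at least 2 speaker labels")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            # Speaker diarization: label which speaker said each segment.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }
```

With the AWS SDK for Python installed and credentials configured, the result could be passed on as `boto3.client("transcribe").start_transcription_job(**params)`.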

To evaluate transcription quality, the team compared the results against ground truth subtitles on a large test set, using the following metrics:

– Word error rate (WER) – This metric measures the percentage of words that are incorrectly transcribed compared to the ground truth. A lower WER indicates a more accurate transcription.
– Match error rate (MER) – MER measures the proportion of aligned word positions (correct matches, substitutions, deletions, and insertions) that are errors. Unlike WER, it cannot exceed 100%. A lower MER signifies better accuracy.
– Word information lost (WIL) – This metric quantifies the amount…
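
To make these metrics concrete, here is a minimal sketch that computes them from a word-level edit-distance alignment between a reference and a transcript. It is not the team's evaluation code. The formulas are the commonly used definitions (WIL follows the usual 1 - H²/(N_ref · N_hyp) form, which is assumed to be what the post refers to), and the function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignmentCounts:
    hits: int
    substitutions: int
    deletions: int
    insertions: int


def align(reference: str, hypothesis: str) -> AlignmentCounts:
    """Count hits/substitutions/deletions/insertions on a minimum-edit alignment."""
    ref, hyp = reference.split(), hypothesis.split()
    # Each cell holds (edits, hits, subs, dels, ins) for ref[:i] vs hyp[:j].
    prev = [(j, 0, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            e, h, s, d, n = prev[j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, h + 1, s, d, n)      # words match
            else:
                diagonal = (e + 1, h, s + 1, d, n)  # substitution
            e, h, s, d, n = prev[j]
            deletion = (e + 1, h, s, d + 1, n)      # reference word missing
            e, h, s, d, n = curr[j - 1]
            insertion = (e + 1, h, s, d, n + 1)     # extra hypothesis word
            # Pick the cheapest option; ties prefer the diagonal move.
            curr.append(min(diagonal, deletion, insertion, key=lambda c: c[0]))
        prev = curr
    _, h, s, d, n = prev[-1]
    return AlignmentCounts(h, s, d, n)


def wer(c: AlignmentCounts) -> float:
    """Errors divided by the number of reference words."""
    ref_len = c.hits + c.substitutions + c.deletions
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return (c.substitutions + c.deletions + c.insertions) / ref_len


def mer(c: AlignmentCounts) -> float:
    """Errors divided by all aligned positions, including insertions."""
    total = c.hits + c.substitutions + c.deletions + c.insertions
    if total == 0:
        raise ValueError("reference and hypothesis are both empty")
    return (c.substitutions + c.deletions + c.insertions) / total


def wil(c: AlignmentCounts) -> float:
    """1 - H^2 / (reference length * hypothesis length)."""
    ref_len = c.hits + c.substitutions + c.deletions
    hyp_len = c.hits + c.substitutions + c.insertions
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    if hyp_len == 0:
        return 1.0  # nothing was transcribed, so all information is lost
    return 1.0 - (c.hits * c.hits) / (ref_len * hyp_len)
```

For example, aligning the reference `"a b c"` with the transcript `"a x b c"` yields three hits and one insertion, so WER is 1/3 while MER and WIL are both 0.25. In practice, text is usually normalized (casing, punctuation) before scoring.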
