Exclusive Content:

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

Haiper Emerges from Stealth Mode with $13.8 Million Seed...

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

Warning: Stolen ChatGPT Credentials a Hot Commodity on the...

VOXI UK Launches First AI Chatbot to Support Customers

VOXI Launches AI Chatbot to Revolutionize Customer Services in...

DPG Media Leveraging Amazon Bedrock and Amazon Transcribe to Improve Video Metadata Using AI-Driven Pipelines

Transforming Video Metadata Management with AI: Lessons from DPG Media

The Power of AI in Enhancing Video Metadata: A Case Study with DPG Media

This post was co-written with Lucas Desard, Tom Lauwers, and Sam Landuydt from DPG Media.

DPG Media is a leading media company in Benelux operating multiple online platforms and TV channels. DPG Media’s VTM GO platform alone offers over 500 days of non-stop content.

With a growing library of long-form video content, DPG Media recognizes the importance of efficiently managing and enhancing video metadata such as actor information, genre, summary of episodes, the mood of the video, and more. Having descriptive metadata is key to providing accurate TV guide descriptions, improving content recommendations, and enhancing the consumer’s ability to explore content that aligns with their interests and current mood.

This post shows how DPG Media introduced AI-powered processes using Amazon Bedrock and Amazon Transcribe into its video publication pipelines in just 4 weeks, as an evolution towards more automated annotation systems.

The challenge: Extracting and generating metadata at scale

DPG Media receives video productions accompanied by a wide range of marketing materials such as visual media and brief descriptions. These materials often lack standardization and vary in quality. As a result, DPG Media Producers have to run a screening process to consume and understand the content sufficiently to generate the missing metadata, such as brief summaries. For some content, additional screening is performed to generate subtitles and captions.

As DPG Media grows, they need a more scalable way of capturing metadata that enhances the consumer experience on online video services and aids in understanding key content characteristics.

The following were some initial challenges in automation:

– Language diversity – The services host both Dutch and English shows. Some local shows feature Flemish dialects, which can be difficult for some large language models (LLMs) to understand.
– Variability in content volume – They offer a range of content volume, from single-episode films to multi-season series.
– Release frequency – New shows, episodes, and movies are released daily.
– Data aggregation – Metadata needs to be available at the top-level asset (program or movie) and must be reliably aggregated across different seasons.

Solution overview

To address the challenges of automation, DPG Media decided to implement a combination of AI techniques and existing metadata to generate new, accurate content and category descriptions, mood, and context.

The project focused solely on audio processing due to its cost-efficiency and faster processing time. Video data analysis with AI wasn’t required for generating detailed, accurate, and high-quality metadata.

The general architecture of the metadata pipeline consists of two primary steps:

– Generate transcriptions of audio tracks: use speech recognition models to generate accurate transcripts of the audio content.
– Generate metadata: use LLMs to extract and generate detailed metadata from the transcriptions.

In the following sections, we discuss the components of the pipeline in more detail.

Step 1. Generate transcriptions of audio tracks

To generate the necessary audio transcripts for metadata extraction, the DPG Media team evaluated two different transcription strategies: Whisper-v3-large, which requires at least 10 GB of vRAM and high operational processing, and Amazon Transcribe, a managed service with the added benefit of automatic model updates from AWS over time and speaker diarization. The evaluation focused on two key factors: price-performance and transcription quality.

To evaluate the transcription accuracy quality, the team compared the results against ground truth subtitles on a large test set, using the following metrics:

– Word error rate (WER) – This metric measures the percentage of words that are incorrectly transcribed compared to the ground truth. A lower WER indicates a more accurate transcription.
– Match error rate (MER) – MER assesses the proportion of correct words that were accurately matched in the transcription. A lower MER signifies better accuracy.
– Word information lost (WIL) – This metric quantifies the amount…

**[Continue reading on the original source >>](provide the URL)**

Latest

Forecasting Employee Turnover Using SHAP: A Comprehensive HR Analytics Guide

Predicting Employee Attrition: A Data-Driven Approach Using SHAP Feel free...

Daniel Nadler, Cofounder of OpenEvidence, Joins the Billionaire Ranks

"Revolutionizing Healthcare: Daniel Nadler's OpenEvidence Secures $210 Million to...

Generative AI Develops APIs Faster Than Teams Can Secure Them

Navigating API Sprawl: Tackling Complexity in an Era of...

Don't miss

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

Haiper Emerges from Stealth Mode with $13.8 Million Seed...

VOXI UK Launches First AI Chatbot to Support Customers

VOXI Launches AI Chatbot to Revolutionize Customer Services in...

Investing in digital infrastructure key to realizing generative AI’s potential for driving economic growth | articles

Challenges Hindering the Widescale Deployment of Generative AI: Legal,...

Microsoft launches new AI tool to assist finance teams with generative tasks

Microsoft Launches AI Copilot for Finance Teams in Microsoft...

Enhance Generative AI Workflows with NVIDIA DGX Cloud on AWS and...

Unlocking AI Innovation: Leveraging NVIDIA DGX Cloud on AWS for Generative AI Solutions Co-Authors: Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA Introduction...

Develop AI-Powered Policy Development for Vehicle Data Collection and Automation with...

Transforming Automotive Policy Creation with Generative AI Revolutionizing Data Utilization in Software-Defined Vehicles Overview of Sonatus's AI-Powered Solutions Addressing Challenges in Data Collection and Automation Key Metrics for...

Implementing User-Level Access Control for Multi-Tenant Machine Learning Platforms on Amazon...

Implementing Efficient Access Control in Amazon SageMaker AI Environments Overview of Access Control Challenges in ML Workflows Strategies for Efficient Permission Management Implementing Attribute-Based Access Control (ABAC) Key...