
Innovating Content Discovery: How Warner Bros. Discovery Transformed ML Inference with AWS Graviton

By Nukul Sharma, Machine Learning Engineering Manager, and Karthik Dasani, Staff Machine Learning Engineer, Warner Bros. Discovery


In the dynamic landscape of digital entertainment, exceptional content alone is not enough; viewers need sophisticated systems to discover what resonates with their unique tastes. At Warner Bros. Discovery (WBD), home to beloved brands like HBO, CNN, and DC Entertainment, we’ve taken on the monumental task of delivering personalized content to our audience of over 125 million users across more than 100 countries. In this post, we delve into how we leveraged AWS Graviton-based instances on Amazon SageMaker to enhance our machine learning (ML) capabilities and revolutionize content recommendation for our viewers.

The Challenge: Scaling Personalization Without Compromise

Delivering localized recommendations on our HBO Max platform, which spans nine AWS Regions globally, posed a significant challenge. Speed is paramount in recommendation systems: users expect real-time, personalized suggestions. With viewing traffic surging up to 500% during major premieres, we had to balance scaling requirements against budgetary constraints while maintaining sub-100ms latency.

Our sophisticated recommendation algorithms rely on extensive data science and user behavior analysis. However, the approach must remain cost-effective to ensure continuous delivery of high-quality recommendations that engage audiences.

Our Solution: Harnessing AWS Graviton for Cost-Efficient ML Inference

In searching for a viable solution, we turned to two powerful AWS tools: AWS Graviton processors and Amazon SageMaker. Combining these technologies allowed us to tackle performance and cost challenges head-on.

The Power of AWS Graviton

AWS Graviton processors are engineered for superior price-performance in cloud workloads, with features well suited to ML. Arm Neon SIMD vector processing and native bfloat16 support made them a compelling choice for our latency-sensitive recommender systems.
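On Graviton, the aarch64 builds of TensorFlow route heavy operators through oneDNN, which can be steered toward bfloat16 fast-math kernels via environment variables. A minimal configuration sketch (variable names as documented for TensorFlow and oneDNN; whether bfloat16 fast math preserves your model's accuracy is workload-dependent, so validate offline before serving):

```shell
# Enable oneDNN-optimized operator implementations in TensorFlow.
export TF_ENABLE_ONEDNN_OPTS=1
# Allow oneDNN to down-convert eligible fp32 matmuls/convolutions to bfloat16.
export DNNL_DEFAULT_FPMATH_MODE=BF16
```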

We started by testing Graviton with our XGBoost and TensorFlow models. The initial sandbox environment allowed us to optimize worker threads to maximize throughput on a single instance, revealing substantial gains in performance over x86-based instances. Transitioning to production traffic confirmed these benefits through shadow testing, showing Graviton’s linear scalability, even under heavy CPU loads.
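In practice this tuning targeted knobs such as XGBoost's `nthread` and TensorFlow's intra-op thread pool. As a generic, stdlib-only illustration of the sweep itself (the `score` function below is a hypothetical stand-in for a model's predict call, not our actual model):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def score(batch):
    # Placeholder for model.predict() on one request batch.
    return sum(x * 0.5 for x in batch)

def throughput(workers, requests, repeats=200):
    """Measured requests served per second at a given worker-thread count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        start = time.perf_counter()
        for _ in range(repeats):
            list(pool.map(score, requests))
        elapsed = time.perf_counter() - start
    return repeats * len(requests) / elapsed

requests = [list(range(100)) for _ in range(8)]
# Sweep candidate worker counts and keep the one with the best throughput.
best = max((1, 2, 4, 8), key=lambda w: throughput(w, requests))
print("best worker count:", best)
```

The same loop works for any thread knob: benchmark each setting on a single instance, then carry the winner into load testing.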

Streamlining with SageMaker

Amazon SageMaker’s Inference Recommender streamlined our testing process, automating the evaluation across various instance types and configurations. This enabled rapid identification of optimal settings for our models, ensuring data-driven decisions that accelerated our deployment.
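Inference Recommender reports per-instance latency and cost metrics, and the decision it enables boils down to logic like the following sketch (instance names, latencies, and prices here are illustrative placeholders, not actual benchmark results or AWS pricing):

```python
# Hypothetical per-instance benchmark results, shaped like what an
# Inference Recommender job reports: tail latency and hourly cost.
results = [
    {"instance": "ml.c7g.2xlarge", "p99_ms": 42, "cost_per_hour": 0.29},
    {"instance": "ml.c6i.2xlarge", "p99_ms": 55, "cost_per_hour": 0.34},
    {"instance": "ml.c7g.4xlarge", "p99_ms": 31, "cost_per_hour": 0.58},
]

def pick_instance(results, latency_slo_ms=100):
    """Cheapest instance type whose P99 latency meets the SLO."""
    eligible = [r for r in results if r["p99_ms"] <= latency_slo_ms]
    if not eligible:
        raise ValueError("no instance meets the latency SLO")
    return min(eligible, key=lambda r: r["cost_per_hour"])

print(pick_instance(results)["instance"])  # cheapest under the 100ms SLO
```

Tightening the SLO (e.g., `latency_slo_ms=40`) can flip the choice to a larger, faster instance, which is exactly the cost/latency trade-off the automated evaluation surfaces.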

Utilizing SageMaker’s shadow testing capabilities allowed us to assess new deployments in a safe manner, comparing performance against existing systems without disrupting user experience. This strategic method enabled us to fine-tune our setup and preemptively address potential challenges.
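Conceptually, a shadow test mirrors production traffic to the candidate fleet and compares tail latencies before any cutover. A minimal sketch of that comparison, using synthetic latency samples (the numbers are illustrative, not our measured data):

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

# Synthetic per-request latencies (ms) from the live fleet and the
# shadow (candidate) fleet receiving mirrored copies of the traffic.
production = [80 + (i % 40) for i in range(1000)]
shadow = [30 + (i % 20) for i in range(1000)]

p99_prod = percentile(production, 99)
p99_shadow = percentile(shadow, 99)
reduction = 100 * (p99_prod - p99_shadow) / p99_prod
print(f"P99: {p99_prod} ms -> {p99_shadow} ms ({reduction:.0f}% lower)")
```

Because the shadow fleet never serves responses to users, a regression here costs nothing; only a candidate that wins on the tail metrics gets promoted.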

A Comprehensive Approach

The integration of Graviton processors with SageMaker has transformed our ML inference framework. By combining them with other AWS managed services, like Amazon S3 and DynamoDB, we achieved our objective of scaling personalized content delivery while maintaining cost efficiency.

Achievements and Results

Our transition to AWS Graviton-based instances yielded remarkable results across our recommendation systems:

  • 60% Cost Savings: Our analysis showed average savings of 60%, with catalog ranking models experiencing reductions up to 88%.
  • Latency Improvements: Several models saw substantial latency reductions; our XGBoost model, for example, achieved a 60% decrease in P99 latency.
  • Enhanced User Experience: Thanks to these improvements, users benefit from more responsive recommendations that align closely with their interests.

Collaboration with our AWS account and service teams made the migration smooth, taking about a month from initial benchmarking to full deployment, a testament to the efficiency gains realized through this partnership.

Future Aspirations: Commitment to ML Excellence

Spurred by these efficiencies, we aim to run 100% of our recommendation systems on Graviton instances, optimizing our infrastructure further for cost and performance.

Conclusion: The Path Ahead

Our journey with AWS Graviton has not only refined how we deliver personalized recommendations but also exemplified how cloud innovation can drive operational efficiency. As we continue optimizing our ML framework, these enhancements will solidify WBD’s competitive edge in the rapidly evolving entertainment landscape, ultimately enriching our viewers’ engagement and loyalty.

For further insights and developments in our personalization strategy, stay tuned!

Acknowledgments

The success of this initiative would not have been possible without our collaborative partners at AWS. Special thanks to Sunita Nadampalli, Utsav Joshi, Karthik Rengasamy, Tito Panicker, Sapna Patel, and Gautham Panth for their invaluable contributions.


About the authors:

Nukul Sharma is a seasoned Machine Learning Engineering Manager at WBD, with extensive expertise in developing scalable ML solutions. Karthik Dasani serves as a Staff Machine Learning Engineer, specializing in recommendation systems and performance optimization. Together, they bring depth and experience to WBD’s innovative content personalization strategies.
