Innovating Content Discovery: How Warner Bros. Discovery Transformed ML Inference with AWS Graviton

Insights from Machine Learning Engineering Leaders on Cost-Effective, Scalable Solutions for Global Audiences

By Nukul Sharma, Machine Learning Engineering Manager, and Karthik Dasani, Staff Machine Learning Engineer, Warner Bros. Discovery


In the dynamic landscape of digital entertainment, exceptional content alone isn't enough; viewers need sophisticated systems to discover what resonates with their unique tastes. At Warner Bros. Discovery (WBD), home to beloved brands like HBO, CNN, and DC Entertainment, we've taken on the monumental task of delivering personalized content to a growing audience of over 125 million users across more than 100 countries. In this post, we delve into how we used Amazon SageMaker on AWS Graviton-based instances to enhance our machine learning (ML) capabilities and transform content recommendation for our viewers.

The Challenge: Scaling Personalization Without Compromise

Delivering localized recommendations on our HBO Max platform, which spans nine AWS Regions globally, posed a significant challenge. Speed is paramount in recommendation systems: users expect real-time, personalized suggestions. With viewing traffic surging by up to 500% during major premieres, we had to meet scaling requirements within budget while maintaining sub-100 ms latency.

Our sophisticated recommendation algorithms rely on extensive data science and user behavior analysis. However, the approach must remain cost-effective to ensure continuous delivery of high-quality recommendations that engage audiences.

Our Solution: Harnessing AWS Graviton for Cost-Efficient ML Inference

In searching for a viable solution, we turned to two powerful AWS tools: AWS Graviton processors and Amazon SageMaker. Combining these technologies allowed us to tackle performance and cost challenges head-on.

The Power of AWS Graviton

AWS Graviton processors are engineered for superior price-performance in cloud workloads, including ML inference. Features such as Neon SIMD vector processing and support for bfloat16 made them a compelling choice for our latency-sensitive recommender systems.
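As a rough illustration of how bfloat16 can be enabled for TensorFlow inference on Graviton3, the sketch below turns on oneDNN's bfloat16 fast-math mode before the framework initializes. The environment variable follows the public Graviton getting-started guidance, and the model path and input shape are placeholders rather than details of our production setup.

import os

# Assumption: oneDNN's bfloat16 fast-math mode, per the public AWS Graviton
# getting-started guidance. It must be set before TensorFlow initializes.
os.environ["DNNL_DEFAULT_FPMATH_MODE"] = "BF16"

import tensorflow as tf  # imported after the flag so oneDNN picks it up

model = tf.keras.models.load_model("recsys_model")       # placeholder path
preds = model.predict(tf.random.uniform((32, 128)))      # illustrative batch shape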

We started by testing Graviton with our XGBoost and TensorFlow models. An initial sandbox environment let us tune worker threads to maximize throughput on a single instance, revealing substantial performance gains over comparable x86-based instances. Shadow testing against production traffic then confirmed these benefits and showed that Graviton scales linearly even under heavy CPU load.
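For context, here is a minimal sketch of the kind of worker-thread sweep we ran in the sandbox. The model file, feature count, and batch size are placeholders, not values from our production models.

import time

import numpy as np
import xgboost as xgb

# Load an existing model; "model.xgb" is a placeholder path.
booster = xgb.Booster()
booster.load_model("model.xgb")

# Synthetic scoring batch: 512 rows x 200 features (illustrative sizes).
batch = xgb.DMatrix(np.random.rand(512, 200).astype(np.float32))

# Sweep worker-thread counts and report single-instance throughput.
for threads in (1, 2, 4, 8, 16):
    booster.set_param({"nthread": threads})
    start = time.perf_counter()
    for _ in range(100):
        booster.predict(batch)
    elapsed = time.perf_counter() - start
    print(f"nthread={threads}: {100 * 512 / elapsed:,.0f} predictions/sec")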

Streamlining with SageMaker

Amazon SageMaker’s Inference Recommender streamlined our testing process, automating the evaluation across various instance types and configurations. This enabled rapid identification of optimal settings for our models, ensuring data-driven decisions that accelerated our deployment.
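For readers who want to try the same approach, the sketch below starts a default Inference Recommender job with the boto3 SageMaker client to compare Graviton and x86 instance types. The job name, role ARN, model package ARN, and candidate instance types are illustrative assumptions, not our actual configuration.

import boto3

sm = boto3.client("sagemaker")

# Default Inference Recommender job comparing Graviton and x86 instance types.
# All names, ARNs, and instance types below are placeholders.
sm.create_inference_recommendations_job(
    JobName="recsys-xgb-instance-eval",
    JobType="Default",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    InputConfig={
        "ModelPackageVersionArn": (
            "arn:aws:sagemaker:us-east-1:111122223333:model-package/recsys-xgb/1"
        ),
        "ContainerConfig": {
            "SupportedInstanceTypes": ["ml.c7g.4xlarge", "ml.c5.4xlarge"],
        },
    },
)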

Utilizing SageMaker’s shadow testing capabilities allowed us to assess new deployments in a safe manner, comparing performance against existing systems without disrupting user experience. This strategic method enabled us to fine-tune our setup and preemptively address potential challenges.
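The general shape of such a shadow test, expressed through the SageMaker API, is sketched below: live traffic served by an x86 production variant is mirrored to a Graviton-backed shadow variant whose responses are logged for comparison but never returned to callers. Endpoint, model, and variant names are made up for illustration.

import boto3

sm = boto3.client("sagemaker")

# Endpoint config pairing a production variant with a shadow variant.
sm.create_endpoint_config(
    EndpointConfigName="recsys-shadow-config",            # placeholder name
    ProductionVariants=[{
        "VariantName": "x86-production",
        "ModelName": "recsys-model-x86",                   # placeholder model
        "InstanceType": "ml.c5.4xlarge",
        "InitialInstanceCount": 2,
        "InitialVariantWeight": 1.0,
    }],
    ShadowProductionVariants=[{
        "VariantName": "graviton-shadow",
        "ModelName": "recsys-model-graviton",              # placeholder model
        "InstanceType": "ml.c7g.4xlarge",
        "InitialInstanceCount": 2,
        # Weight relative to the production variant controls the share of
        # mirrored traffic (assumption based on the public API documentation).
        "InitialVariantWeight": 1.0,
    }],
)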

A Comprehensive Approach

The integration of Graviton processors with SageMaker has transformed our ML inference framework. By combining them with other AWS managed services, such as Amazon S3 and DynamoDB, we were able to scale personalized content delivery effectively while maintaining cost efficiency.

Achievements and Results

Our transition to AWS Graviton-based instances yielded remarkable results across our recommendation systems:

  • 60% Cost Savings: Our analysis showed average savings of 60%, with catalog ranking models seeing reductions of up to 88%.
  • Latency Improvements: Several models saw substantial latency reductions; our XGBoost model, for example, recorded a 60% decrease in P99 latency.
  • Enhanced User Experience: These improvements translate into more responsive recommendations that align more closely with each user's interests.

The migration was smooth thanks to close collaboration with our AWS account and service teams, taking about a month from initial benchmarking to full deployment, a testament to the strength of that partnership.

Future Aspirations: Commitment to ML Excellence

Spurred by these efficiencies, we are working toward running 100% of our recommendation systems on Graviton instances, further optimizing our infrastructure for cost and performance.

Conclusion: The Path Ahead

Our journey with AWS Graviton has not only refined how we deliver personalized recommendations but also shown how cloud innovations can drive operational efficiency. As we continue optimizing our ML framework, these enhancements will solidify WBD's competitive edge in the rapidly evolving entertainment landscape, ultimately enriching our viewers' engagement and loyalty.

For further insights and developments in our personalization strategy, stay tuned!

Acknowledgments

The success of this initiative would not have been possible without our collaborative partners at AWS. Special thanks to Sunita Nadampalli, Utsav Joshi, Karthik Rengasamy, Tito Panicker, Sapna Patel, and Gautham Panth for their invaluable contributions.


About the authors:

Nukul Sharma is a seasoned Machine Learning Engineering Manager at WBD, with extensive expertise in developing scalable ML solutions. Karthik Dasani serves as a Staff Machine Learning Engineer, specializing in recommendation systems and performance optimization. Together, they bring depth and experience to WBD’s innovative content personalization strategies.
