Optimizing AI Workloads with AWS Graviton3: A Case Study with Sprinklr

In today’s digital age, artificial intelligence (AI) has become an indispensable tool for businesses looking to streamline processes, gather insights, and improve decision-making. With the vast amount of data generated daily, companies need AI solutions that can handle diverse workloads efficiently while also being cost-effective. Sprinklr, a leading provider of customer experience management software, has achieved remarkable success in optimizing mixed AI workload inference performance with the help of AWS Graviton3-based instances.

Sprinklr’s AI platform runs on thousands of servers to fine-tune and serve over 750 pre-built AI models across various verticals, making more than 10 billion predictions per day. To deliver tailored user experiences, Sprinklr deploys specialized AI models for specific business applications, built on nine layers of machine learning that extract meaning from data in different formats. This diverse portfolio of models presents unique challenges in choosing a deployment infrastructure that balances scale and cost-effectiveness.

For mixed AI workloads with real-time latency requirements, traditional production instances were not cost-effective, because the models were relatively small and inference requests arrived infrequently. This led Sprinklr to explore instance types that could offer a better balance of performance and cost. AWS Graviton3 processors emerged as the ideal choice, offering optimized support for ML workloads and significant performance improvements over comparable x86-based instances.
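A quick way to evaluate a candidate instance type before committing to a migration is to benchmark inference latency and throughput on it directly. The sketch below is a minimal, illustrative harness and is not Sprinklr's tooling; `fake_infer` is a hypothetical stand-in for a real model call.

```python
import time

def fake_infer(payload):
    # Hypothetical stand-in for a real model call; swap in your own
    # endpoint request or in-process model invocation.
    return sum(payload) / len(payload)

def benchmark(infer, payload, n_requests=200):
    """Measure per-request latency and overall throughput for an
    inference callable. Illustrative harness, not Sprinklr's tooling."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        infer(payload)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "throughput_rps": n_requests / elapsed,          # requests per second
        "p50_ms": 1000 * latencies[n_requests // 2],     # median latency
        "p99_ms": 1000 * latencies[int(n_requests * 0.99) - 1],  # tail latency
    }

print(benchmark(fake_infer, payload=list(range(256))))
```

Running the same harness with the same model and payload on both the incumbent x86 instance and a Graviton3-based c7g instance gives a like-for-like comparison of the throughput and tail-latency numbers discussed below.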

By migrating mixed inference/search workloads to Graviton3-based c7g instances, Sprinklr achieved impressive results:
– 20% throughput improvement and 30% latency reduction
– 25-30% cost savings
– Improved customer experience through reduced latency and increased throughput
– Lower carbon footprint by running more efficiently on the same number of instances
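Cost savings of this kind follow from simple per-inference arithmetic: a cheaper hourly rate combined with higher throughput compounds into a lower cost per request. The sketch below uses hypothetical hourly prices and request rates (not real AWS pricing), chosen only to mirror the reported ~20% throughput gain.

```python
def cost_per_million(hourly_price, throughput_rps):
    # Cost to serve one million inferences at a sustained request rate.
    seconds_per_million = 1_000_000 / throughput_rps
    return hourly_price * seconds_per_million / 3600

# Hypothetical numbers for illustration only (not actual AWS rates):
# a baseline x86 instance vs. a Graviton3 c7g instance serving the
# same model with ~20% higher throughput.
x86_cost = cost_per_million(hourly_price=0.34, throughput_rps=100)
c7g_cost = cost_per_million(hourly_price=0.29, throughput_rps=120)

savings = 1 - c7g_cost / x86_cost
print(f"cost per 1M inferences: x86=${x86_cost:.2f}, c7g=${c7g_cost:.2f}")
print(f"savings: {savings:.0%}")
```

With these placeholder inputs the savings land near 29%, in the same range as the 25-30% the article reports; plugging in your own instance prices and measured throughput gives the figure that matters for your workload.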

The transition to Graviton3-based instances was seamless, requiring minimal engineering time, and the performance improvements were quickly realized in production workloads. Sprinklr has already migrated several models to Graviton3-based instances and plans to move more to further optimize performance and cost.

In conclusion, Sprinklr’s success with AWS Graviton3-based instances serves as a testament to the importance of adopting cutting-edge technologies for efficiency, cost-saving, and improved customer experience. As technology continues to evolve, companies must stay ahead of the curve and seek new compute solutions that not only reduce costs but also enhance their products and services.
