Optimizing AI Workloads with AWS Graviton3: A Case Study with Sprinklr
Artificial intelligence (AI) has become an indispensable tool for businesses looking to streamline processes, gather insights, and improve decision-making. With the vast amount of data generated daily, companies need AI solutions that handle diverse workloads efficiently and cost-effectively. Sprinklr, a leading provider of customer experience management software, has improved inference performance for its mixed AI workloads by moving them to AWS Graviton3-based instances.
Sprinklr’s AI platform runs on thousands of servers to fine-tune and serve over 750 pre-built AI models across various verticals, making more than 10 billion predictions per day. To deliver tailored user experiences, Sprinklr deploys specialized AI models for specific business applications that use nine layers of machine learning to extract meaning from data across different formats. This diverse portfolio of models makes it challenging to choose a deployment infrastructure that balances scale and cost-effectiveness.
For mixed AI workloads with real-time latency requirements, traditional production instances were not cost-effective due to smaller model sizes and infrequent inference requests. This led Sprinklr to explore new instances that could offer the right balance of performance and cost-efficiency. AWS Graviton3 processors emerged as the ideal choice, offering optimized support for ML workloads and significant improvements in performance compared to x86-based instances.
By migrating mixed inference/search workloads to Graviton3-based c7g instances, Sprinklr achieved impressive results:
– 20% throughput improvement and 30% latency reduction
– 25-30% cost savings
– Improved customer experience through reduced latency and increased throughput
– Lower carbon footprint by running more efficiently on the same number of instances
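To see how a throughput gain and an instance price difference combine into a per-request cost saving, here is a minimal Python sketch. Only the 20% throughput improvement comes from the figures above. The hourly prices, the baseline request rate, and the function names are hypothetical placeholders, not Sprinklr's or AWS's numbers, and the sketch does not claim to show how the reported savings were actually calculated.

```python
def cost_per_million(hourly_price_usd: float, throughput_rps: float) -> float:
    """USD to serve one million requests on one instance at steady throughput."""
    requests_per_hour = throughput_rps * 3600
    return hourly_price_usd / requests_per_hour * 1_000_000


def relative_saving(baseline_cost: float, candidate_cost: float) -> float:
    """Fraction by which the candidate is cheaper than the baseline."""
    return (baseline_cost - candidate_cost) / baseline_cost


# Hypothetical inputs for illustration only.
baseline_price = 1.00    # assumed USD/hour for the existing instance
baseline_rps = 100.0     # assumed requests/second on the existing instance
candidate_price = 0.90   # assumed 10% lower hourly price on the new instance
candidate_rps = baseline_rps * 1.20  # 20% throughput improvement (from the list above)

saving = relative_saving(
    cost_per_million(baseline_price, baseline_rps),
    cost_per_million(candidate_price, candidate_rps),
)
print(f"Cost saving per request: {saving:.1%}")
```

With these placeholder inputs the saving is 1 - 0.90 / 1.20 = 25%. Substituting real instance prices and measured throughput gives the figure for an actual workload.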
The transition to Graviton3-based instances was seamless, with engineering time kept minimal and performance improvements quickly realized in production workloads. Sprinklr has already migrated several models to Graviton3-based instances and plans to move more models to further optimize performance and cost savings.
In conclusion, Sprinklr’s success with AWS Graviton3-based instances serves as a testament to the importance of adopting cutting-edge technologies for efficiency, cost-saving, and improved customer experience. As technology continues to evolve, companies must stay ahead of the curve and seek new compute solutions that not only reduce costs but also enhance their products and services.