Cost-Effective Deployment of Vision-Language Models for Pet Behavior Detection Using AWS Inferentia2

Transforming Pet Monitoring: How Tomofun Optimized Furbo’s Inference with AWS Inferentia2

Revolutionizing Remote Pet Interaction with Furbo

Challenge: Reducing GPU Inference Costs for Scalable Real-Time Monitoring

Solution Overview: Pet Behavior Detection Architecture on AWS

Improving BLIP on Inferentia2: A Modular Approach

Original Model Code: Ensuring Integrity During Migration

Wrapper Code: Adapting to Neuron’s Requirements

Model Compilation for Inferentia2: Streamlining the Process

Model Deployment on Inferentia2: Achieving Seamless Integration

Stress Testing: Validating Performance at Scale

Conclusion: Achieving 83% Cost Reduction and Maintaining High Throughput

About the Authors: Meet the Team Behind the Innovation

Redefining Remote Pet Interaction: Tomofun’s Success with Furbo and AWS Inferentia2

In an age where technology continuously transforms our lifestyles, Tomofun, a pet-tech startup headquartered in Taiwan, is leading the charge in redefining how pet owners interact with their furry friends from afar. With its innovative Furbo Pet Camera, the company blends smart camera technology with artificial intelligence (AI) to allow pet parents to monitor behaviors such as barking and unusual activity in real time. But this evolution hasn’t come without its own set of challenges.

The Challenge: Scaling AI Efficiently

Initially, Tomofun hosted its vision-language models on GPU-based Amazon Elastic Compute Cloud (EC2) instances. GPUs handle real-time inference well, but they become costly when monitoring runs continuously across hundreds of thousands of devices. Tomofun needed to lower inference costs without sacrificing model fidelity or throughput.

With the inference workloads needing to run constantly to deliver real-time alerts, a transformation was essential. This is where AWS Inferentia2 came into play.

Solution Overview: Migrating to EC2 Inf2 Instances

Tomofun turned to EC2 Inf2 instances, purpose-built by AWS for AI workloads. Shifting to Inferentia2 required minimal changes to the existing system, so Tomofun could make the transition without rewriting large portions of its already optimized BLIP (Bootstrapping Language-Image Pre-training) codebase.

The architecture leverages various AWS services to efficiently manage pet behavior detection. At the center sits Furbo’s API, orchestrating image streams from customer cameras to inference endpoints. With Elastic Load Balancing (ELB) and Auto Scaling groups, the design accommodates real-time scaling to handle incoming requests.

When a frame is captured, it flows through Amazon CloudFront and ELB before hitting the API servers dedicated to monitoring pet behavior. These servers forward requests to an Auto Scaling group designed specifically for model inference.

Real-Time Flexibility and Monitoring

A significant advantage of this new architecture is its ability to direct inference requests between GPU and Inferentia2 backends in real time. Amazon CloudWatch continuously monitors key operational metrics, including latency, throughput, and error rates, allowing Tomofun to react swiftly to performance changes.
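The article does not describe how requests are split between the two backends. As a minimal sketch, assuming a simple threshold rule, the following shows one way health metrics could drive a routing weight. The `BackendStats` fields, the threshold values, and the function names are all hypothetical and are not taken from Tomofun's system.

```python
import random
from dataclasses import dataclass


@dataclass
class BackendStats:
    """Recent health numbers for one backend (hypothetical fields)."""
    p95_latency_ms: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def inf2_weight(stats: BackendStats,
                max_latency_ms: float = 500.0,   # assumed threshold
                max_error_rate: float = 0.01,    # assumed threshold
                healthy_weight: float = 1.0) -> float:
    """Return the share of traffic to send to the Inferentia2 backend.

    If latency or error rate crosses a threshold, all traffic
    falls back to the GPU backend.
    """
    if stats.p95_latency_ms > max_latency_ms or stats.error_rate > max_error_rate:
        return 0.0
    return healthy_weight


def choose_backend(weight_inf2: float, rng: random.Random) -> str:
    """Pick a backend for one request according to the current weight."""
    return "inf2" if rng.random() < weight_inf2 else "gpu"


# Usage: a healthy Inferentia2 backend receives the traffic,
# a degraded one hands it back to GPU.
rng = random.Random(0)
healthy = BackendStats(p95_latency_ms=120.0, error_rate=0.001)
degraded = BackendStats(p95_latency_ms=900.0, error_rate=0.001)
print(choose_backend(inf2_weight(healthy), rng))
print(choose_backend(inf2_weight(degraded), rng))
```

In practice the metric values would come from the monitoring system rather than being constructed by hand, and a gradual weight change is usually safer than an all-or-nothing switch.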

Improving BLIP With Inferentia2

With the move to Inferentia2, Tomofun had to ensure that its BLIP model components were compatible with the new hardware. The approach involved creating lightweight wrappers for the three essential components of the BLIP model: the Image Encoder, the Text Encoder, and the Text Decoder. By isolating these elements, Tomofun could compile each one for Inferentia2 without altering the core BLIP architecture, so the compiled components fit into the existing inference pipeline.
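The wrapper code itself is not shown in this article. The sketch below illustrates the general idea only, assuming the wrapper's job is to pin a component to a fixed positional interface with tuple outputs, which is what tracing-style compilers typically expect. It is plain Python and does not call the Neuron SDK; `PositionalWrapper` and `fake_text_encoder` are hypothetical names, and the stand-in encoder is not BLIP.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


class PositionalWrapper:
    """Adapt a component that takes keyword arguments and returns a dict
    into one that takes positional arguments and returns a tuple.

    The wrapped component is called unchanged; only its interface is adapted.
    """

    def __init__(self, component: Callable[..., Dict[str, Any]],
                 input_names: Sequence[str],
                 output_names: Sequence[str]) -> None:
        self.component = component
        self.input_names = tuple(input_names)
        self.output_names = tuple(output_names)

    def __call__(self, *args: Any) -> Tuple[Any, ...]:
        if len(args) != len(self.input_names):
            raise TypeError(
                f"expected {len(self.input_names)} inputs, got {len(args)}")
        outputs = self.component(**dict(zip(self.input_names, args)))
        # Keep only the outputs the downstream pipeline needs, in a fixed order.
        return tuple(outputs[name] for name in self.output_names)


def fake_text_encoder(input_ids, attention_mask):
    """Hypothetical stand-in for a real text encoder: drops masked tokens."""
    kept = [tok for tok, keep in zip(input_ids, attention_mask) if keep]
    return {"last_hidden_state": kept, "num_tokens": len(kept)}


wrapped = PositionalWrapper(
    fake_text_encoder,
    input_names=("input_ids", "attention_mask"),
    output_names=("last_hidden_state",),
)
print(wrapped([101, 2003, 0], [1, 1, 0]))  # ([101, 2003],)
```

In a real migration the wrapped object would be a PyTorch module handed to the Neuron compiler with example inputs; that step is omitted here because it depends on the SDK version and hardware.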

Stress Testing: Validating Performance

To confirm the system’s capability to handle real-world workloads, Tomofun conducted stress tests that simulated practical scenarios such as detecting whether a dog was barking or playing. The results were promising: the EC2 Inf2 instances handled high-throughput request loads while maintaining low latency, and the tests confirmed an 83% cost reduction compared to the previous GPU setup.
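The article does not say which load-testing tool was used. As a rough sketch of what such a test measures, the following fires concurrent requests at a callable and reports throughput and latency percentiles. `fake_inference` is a hypothetical stand-in that sleeps for a few milliseconds; a real test would call the inference endpoint instead, and the request counts here are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def fake_inference(_frame_id: int) -> str:
    """Hypothetical stand-in for a call to the inference endpoint."""
    time.sleep(0.005)
    return "barking"


def timed_call(fn, arg) -> float:
    """Run fn(arg) and return the elapsed time in seconds."""
    start = time.perf_counter()
    fn(arg)
    return time.perf_counter() - start


def stress_test(fn, num_requests: int = 200, concurrency: int = 20) -> dict:
    """Send num_requests calls with the given concurrency and summarize them."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda i: timed_call(fn, i),
                                  range(num_requests)))
    wall_seconds = time.perf_counter() - wall_start
    cuts = quantiles(latencies, n=100)  # 99 cut points: cuts[k-1] is the k-th percentile
    return {
        "requests": num_requests,
        "throughput_rps": num_requests / wall_seconds,
        "p50_ms": cuts[49] * 1000,
        "p95_ms": cuts[94] * 1000,
    }


report = stress_test(fake_inference, num_requests=40, concurrency=8)
print(report)
```

Running the same harness against both backends, and dividing each backend's hourly instance price by its measured throughput, gives a cost-per-request figure that can be compared directly.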

Conclusions: A Model for Future Workloads

By leveraging AWS Inferentia2, Tomofun not only slashed operational costs but also upheld high performance levels—a critical requirement for their global customer base. The seamless migration strategy of using lightweight wrappers ensured that the core logic of the BLIP model remained untouched, allowing for easy scalability in response to demand.

Looking ahead, Tomofun plans to expand the use of Inferentia2 beyond vision-language models to include audio event detection and other innovative AI applications that could enhance pet-owner interactions.

Explore Further

For those interested in optimizing AI workloads similar to Tomofun’s journey, the AWS Neuron documentation provides a wealth of resources. Additionally, the Furbo website outlines the AI-powered features that keep pets safe and connected with their owners.

About the Authors

Chen-Hsin Ding, a Staff Machine Learning Engineer at Tomofun, specializes in Generative AI and MLOps best practices.

Ray Wang and Howard Su are Senior Solutions Architects at AWS, bringing their extensive experience in cloud solutions and software development to a diverse range of innovative projects.


Through its commitment to precision, efficiency, and continuous technological integration, Tomofun exemplifies how startups can harness AI to better engage with the world around them—starting with our beloved pets.
