Exploring RoboTic-Tac-Toe: A Fusion of LLMs, Robotics, and AWS Technologies
Large language models (LLMs) are moving beyond content summarization toward reasoning about complex tasks. One particularly exciting area of development involves applying generative AI to robotics and physical hardware. Inspired by this potential, we created a game for the AWS re:Invent 2024 Builders Fair: RoboTic-Tac-Toe. The game uses Amazon Bedrock, Strands Agents, AWS IoT Core, AWS Lambda, and Amazon DynamoDB to show how LLMs can reason about game strategy and control physical robots in real time.
An Interactive Experience
RoboTic-Tac-Toe offers a unique interaction between humans, robots, and AI. Participants can access the game through a simple QR code scan and choose from several modes:
- Player vs. Player: Compete against a human opponent.
- Player vs. LLM: Test your skills against an LLM opponent.
- LLM vs. LLM: Observe two AI models strategizing and competing autonomously.
In this engaging setup, physical robots navigate a tic-tac-toe board, responding to player commands and placing X or O markers based on natural language input.
Solution Overview
RoboTic-Tac-Toe integrates several AWS services and replaces pre-programmed sequences with instructions generated dynamically in real time. At the heart of the architecture is AWS IoT Core, which carries messages between the Raspberry Pi-controlled robots and the cloud.
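As a rough illustration of the cloud-to-robot link, the sketch below builds the kind of JSON command that could be published to an MQTT topic through AWS IoT Core. The topic layout, the field names, and the `build_robot_command` helper are assumptions made for this example, not the project's actual message schema.

```python
import json

# Hypothetical topic layout; the real project may name its topics differently.
TOPIC_TEMPLATE = "robotictactoe/robots/{robot_id}/commands"


def build_robot_command(robot_id: str, steps: list[str]) -> tuple[str, bytes]:
    """Return the (topic, payload) pair for one robot command message."""
    topic = TOPIC_TEMPLATE.format(robot_id=robot_id)
    # The payload fields are illustrative, not the project's real schema.
    payload = json.dumps({"robot": robot_id, "steps": steps}).encode("utf-8")
    return topic, payload


topic, payload = build_robot_command("x", ["forward", "turn_left", "forward"])
print(topic)
print(payload.decode("utf-8"))
```

Inside a Lambda function, a pair like this could be handed to the boto3 `iot-data` client's `publish(topic=..., qos=1, payload=...)` call, with the Raspberry Pi subscribed to the same topic over MQTT.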
Key Services in Action
- Hardware Setup:
  - A tic-tac-toe board embedded with LED indicators highlights placements.
  - Two modified toy robots operate via Raspberry Pi controllers, equipped with infrared and RF modules.
  - A Raspberry Pi camera provides vision-based analysis, capturing board state data for processing.
- Software Functionality:
  - AWS Lambda manages game logic and orchestration.
  - OpenCV enables computer vision capabilities for precise robot movements.
  - Amazon Bedrock agents orchestrate tasks, generating movement plans and game strategies.
Strands Agents in Action
Strands Agents automate application tasks by managing interactions between foundation models, data sources, software applications, and user conversations.
Supervisor Agent
The Supervisor Agent orchestrates player strategies by managing the Move Agent and the Game Agent:
- Receives gameplay events, such as "Player X moved to 2B," and determines which specialized agent to invoke.
- An AWS Lambda function acts as the central controller, routing requests and logging interactions for traceability.
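The routing idea can be sketched in a few lines. In the project the decision is made by an LLM-backed agent; the function below is a deterministic stand-in that shows only the shape of the dispatch. The event format is modeled on the "Player X moved to 2B" example, and the `route_event` name and the returned dictionary keys are hypothetical.

```python
import re

# Hypothetical event format, modeled on the example "Player X moved to 2B".
MOVE_EVENT = re.compile(r"^Player (?P<player>[XO]) moved to (?P<cell>[1-3][A-C])$")


def route_event(event: str) -> dict:
    """Decide which specialized agent should handle a gameplay event."""
    match = MOVE_EVENT.match(event)
    if match:
        # A chosen cell means a robot has to travel there: hand off to the Move Agent.
        return {"agent": "move", "player": match["player"], "cell": match["cell"]}
    # Everything else (for example, a request for the LLM's next move)
    # is treated here as work for the Game Agent.
    return {"agent": "game", "event": event}


print(route_event("Player X moved to 2B"))
```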
Move Agent
The Move Agent generates Python code for robot navigation:
- Receives the start and destination positions and determines the movements needed to travel between them.
- The LLM Navigator uses Strands Agents to produce and log movement instructions.
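To give a sense of what generated navigation code might look like, here is a hand-written sketch that turns a start and destination cell into unit steps. It assumes rows are numbered 1 to 3 from top to bottom and columns are lettered A to C from left to right, and it ignores obstacles and robot heading. The `parse_cell` and `plan_moves` names are illustrative and are not taken from the project.

```python
def parse_cell(cell: str) -> tuple[int, int]:
    """Convert a label such as "2B" into zero-based (row, column) indices."""
    row = int(cell[0]) - 1
    col = "ABC".index(cell[1].upper())  # raises ValueError for unknown columns
    if not 0 <= row <= 2:
        raise ValueError(f"row out of range in {cell!r}")
    return row, col


def plan_moves(start: str, dest: str) -> list[str]:
    """Return unit steps from start to dest, moving along columns first."""
    (r0, c0), (r1, c1) = parse_cell(start), parse_cell(dest)
    horizontal = ["right" if c1 > c0 else "left"] * abs(c1 - c0)
    vertical = ["down" if r1 > r0 else "up"] * abs(r1 - r0)
    return horizontal + vertical


print(plan_moves("1A", "3C"))  # ['right', 'right', 'down', 'down']
```

A real plan would also need to account for the robot's current heading and for cells occupied by the other robot, which this sketch leaves out.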
Game Agent
The Game Agent functions as an opponent for human players:
- Interacts with players through a mobile-friendly portal and tracks game history in Amazon DynamoDB.
- Processes player moves, retrieves the board state, and generates the next move in real time.
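The bookkeeping around the LLM call can be sketched as follows: replay a stored move history into a board, check for a winner, and list the cells still open, which is the kind of context the model would need before choosing a move. The history format (a list of `(player, cell)` pairs) and all function names are assumptions for illustration; the project's actual DynamoDB schema is not described here.

```python
from typing import Optional

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def board_from_history(history: list[tuple[str, str]]) -> list[str]:
    """Replay (player, cell) records such as ("X", "2B") into a 9-cell board."""
    board = [" "] * 9
    for player, cell in history:
        index = (int(cell[0]) - 1) * 3 + "ABC".index(cell[1])
        if board[index] != " ":
            raise ValueError(f"cell {cell} is already taken")
        board[index] = player
    return board


def winner(board: list[str]) -> Optional[str]:
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board: list[str]) -> list[str]:
    """List the labels of all empty cells, in row-major order."""
    return [f"{i // 3 + 1}{'ABC'[i % 3]}" for i, mark in enumerate(board) if mark == " "]


history = [("X", "2B"), ("O", "1A"), ("X", "1B"), ("O", "3C"), ("X", "3B")]
board = board_from_history(history)
print(winner(board))       # X
print(legal_moves(board))  # ['1C', '2A', '2C', '3A']
```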
Powering Robot Navigation with Computer Vision
Computer vision is crucial in ensuring the accuracy of robot movements and gameplay. An overhead Raspberry Pi camera continuously monitors the game board, feeding images for processing.
- Principal Component Analysis (PCA) is used to track each robot's orientation and position, while an OpenCV module running in a container on Amazon SageMaker analyzes the images.
- An AWS Lambda function orchestrates the workflow, processing vision results and updating robot positions in real time.
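To show how PCA yields an orientation, the sketch below computes the angle of the first principal component for a set of 2D pixel coordinates belonging to one robot. It uses the closed-form result for a 2x2 covariance matrix, so it needs only the standard library. How the pixels are segmented from the camera image (for example, with an OpenCV thresholding step) is outside this sketch, and the `principal_axis_angle` name is illustrative.

```python
import math


def principal_axis_angle(points: list[tuple[float, float]]) -> float:
    """Return the angle in degrees, within (-90, 90], of the first principal component."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix of the point cloud.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # For a symmetric 2x2 matrix the leading eigenvector's angle has this closed form.
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))


# Pixels lying roughly along a diagonal, as an elongated robot body might.
blob = [(0, 0), (1, 1), (2, 2), (3, 3), (1, 2), (2, 1)]
print(round(principal_axis_angle(blob), 1))
```

PCA recovers an axis, not a direction, so the result is ambiguous by 180 degrees; telling the front of a robot from its back requires an extra cue such as a colored marker. Note also that image coordinates usually have the y axis pointing down, which flips the sign of the angle relative to the usual mathematical convention.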
Conclusion
RoboTic-Tac-Toe brings together AI, robotics, and cloud computing, and shows how AWS IoT, machine learning, and generative AI can be applied to gaming and education. As AI-driven robotics continues to advance, our project offers a glimpse into a more intelligent, interactive future for gaming.
Stay tuned for enhancements and expanded gameplay modes that will further enrich AI-powered interactions.
About the Authors
Georges Hamieh is a Senior Technical Account Manager at AWS, specializing in Data and AI. With a passion for innovation, he guides customers on their digital transformation journeys.
Mohamed Salah is a Senior Solutions Architect at AWS, dedicated to supporting organizations in the Middle East and North Africa with scalable cloud solutions.
Saddam Hussain is a Senior Solutions Architect at AWS, focusing on Generative AI and helping public sector clients innovate using cloud technologies.
Dr. Omer Dawelbeit is a Principal Solutions Architect at AWS, passionate about designing scalable solutions to address complex challenges across various sectors.