Create a Serverless Workflow for Amazon Bedrock Batch Job Orchestration with AWS Step Functions

Streamlining Large-Scale Inference with Amazon Bedrock Batch Processing

As organizations increasingly adopt foundation models (FMs) for artificial intelligence (AI) and machine learning (ML) workloads, efficiently managing large-scale inference operations becomes critical. Amazon Bedrock provides two primary inference patterns: real-time inference and batch inference. The latter is particularly advantageous for processing extensive datasets when immediate results are not required.

Cost-Effective Batch Inference with Amazon Bedrock

Amazon Bedrock’s batch inference offers a cost-effective solution, reducing processing costs by 50% compared to on-demand options. This makes it ideal for high-volume, time-insensitive workloads. However, scaling batch inference operations presents challenges, including managing input formats, adhering to job quotas, orchestrating concurrent executions, and performing post-processing tasks. To address these complexities, a robust solution is essential.

A Scalable Solution for Batch Inference

In this post, we introduce a flexible, scalable solution that enhances the batch inference workflow. Our approach simplifies managing FM batch inference, whether generating embeddings for millions of documents or executing custom evaluation and completion tasks on large datasets.

Solution Overview

Our automated workflow consists of three main phases:

  1. Preprocessing Input Datasets: Transforming data into the required format, such as prompt formatting.
  2. Executing Batch Inference Jobs: Running jobs in parallel for maximum efficiency.
  3. Post-processing Outputs: Parsing model responses to extract useful insights.

By utilizing AWS Step Functions within an AWS Cloud Development Kit (AWS CDK) stack, we streamline these operational phases, allowing for seamless orchestration of batch jobs.
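To make the preprocessing phase concrete, the following sketch shows one way a dataset row could be turned into the JSON Lines record format that Bedrock batch inference expects, with a recordId and a model-specific modelInput. The prompt template, field names, and model settings here are illustrative assumptions rather than the repository's actual code.

import json

# Illustrative prompt template; the repository's prompt_templates.py defines its own.
PROMPT_TEMPLATE = "Answer the following question step by step.\n\nQuestion: {question}"

def to_batch_record(record_id: str, row: dict) -> dict:
    """Convert one dataset row into a Bedrock batch inference record
    for an Anthropic Claude model (Messages API format)."""
    return {
        "recordId": record_id,
        "modelInput": {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [
                {"role": "user", "content": PROMPT_TEMPLATE.format(**row)}
            ],
        },
    }

# Write one JSON object per line (JSONL), ready for upload to Amazon S3.
rows = [{"question": "What is 17 * 24?"}, {"question": "Why is the sky blue?"}]
with open("batch_input.jsonl", "w") as f:
    for i, row in enumerate(rows):
        f.write(json.dumps(to_batch_record(f"rec-{i}", row)) + "\n")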

Use Case: The SimpleCoT Dataset

For our demonstration, we utilize a dataset from SimpleCoT, containing 2.2 million rows of task-oriented examples aimed at enhancing chain-of-thought (CoT) reasoning in language models. This diverse dataset addresses various challenges, including reading comprehension, mathematical reasoning, and natural language processing.

Architectural Considerations

To effectively manage batch processing workflows with Amazon Bedrock, our architecture incorporates scalable serverless components that address key considerations:

  • Input File Format & Storage: Job inputs must be structured as JSON Lines (JSONL) files stored in an Amazon S3 bucket, ensuring compatibility with the API request structure for each FM provider.
  • Step Functions State Machine: This robust orchestration tool coordinates asynchronous, long-running jobs. Using Amazon DynamoDB, we maintain an inventory of job states while adhering to quota limits on jobs in progress.
  • Postprocessing Mechanisms: AWS Lambda functions handle parsing and joining outputs to the original input data after batch results are available.
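The quota limits mentioned in the list above matter because Amazon Bedrock caps the number of batch jobs that can be in flight at once. As a rough, illustrative sketch (not the repository's actual Lambda code), a submission step might count in-flight jobs before creating a new one; the role ARN, bucket names, and quota value below are placeholders.

import boto3

bedrock = boto3.client("bedrock")

# Placeholder quota; check Service Quotas for the actual limit in your account and Region.
MAX_CONCURRENT_JOBS = 20

def count_in_flight_jobs() -> int:
    """Count batch jobs that still consume the concurrent-job quota (pagination omitted for brevity)."""
    total = 0
    for status in ("Submitted", "InProgress"):
        resp = bedrock.list_model_invocation_jobs(statusEquals=status)
        total += len(resp.get("invocationJobSummaries", []))
    return total

if count_in_flight_jobs() < MAX_CONCURRENT_JOBS:
    bedrock.create_model_invocation_job(
        jobName="cot-batch-001",
        roleArn="arn:aws:iam::<account-id>:role/<batch-inference-role>",
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        inputDataConfig={"s3InputDataConfig": {"s3Uri": "s3://<bucket>/inputs/batch_input.jsonl"}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://<bucket>/outputs/"}},
    )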

Implementation Steps

Prerequisites

Before deploying the solution, ensure you have:

  1. Node.js and npm installed.
  2. The AWS CDK set up.

Clone the GitHub repository:

git clone https://github.com/aws-samples/amazon-bedrock-samples
cd amazon-bedrock-samples/poc-to-prod/bedrock-batch-orchestrator

Deployment

Install the required packages:

npm i

In the prompt_templates.py file, configure a new prompt template for your use case, making sure the template's placeholder keys match the column names in your input dataset.
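The repository defines the exact structure of prompt_templates.py; purely as an illustration of the idea, a mapping from a prompt_id to a template whose placeholders mirror your dataset's column names might look like the following (the variable name and entries are hypothetical).

# prompt_templates.py (illustrative sketch only; the repository defines its own structure)
# Each template's placeholders must match column names in the input dataset.
prompt_id_to_template = {
    "cot_reasoning": (
        "Solve the following task. Think step by step before giving your final answer.\n\n"
        "Task: {task}"
    ),
    "summarization": "Summarize the following document in three sentences:\n\n{document}",
}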

Deploy the AWS CDK stack:

npm run cdk deploy

Take note of the outputs, which will include information about the workflow and S3 bucket created:

✅ BedrockBatchOrchestratorStack

✨ Deployment time: 23.16s
Outputs:
BedrockBatchOrchestratorStack.bucketName = batch-inference-bucket-
BedrockBatchOrchestratorStack.stepFunctionName = bedrockBatchOrchestratorSfnE5E2B976-4yznxekguxxm

Job Input Structure

You can either use a Hugging Face dataset ID or reference an Amazon S3 dataset. For Hugging Face datasets, reference the required dataset ID and split to pull data directly from Hugging Face Hub. For S3 datasets, ensure the file structure aligns with the model requirements.
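If you prefer to prepare your own S3 dataset, for example by exporting a split from the Hugging Face Hub or any other tabular source as CSV, a minimal sketch might look like the following; the dataset ID, bucket name, and object key are placeholders.

import boto3
from datasets import load_dataset

# Placeholder dataset and split; any tabular source that can be written to CSV works.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

# Persist the split locally as CSV, then upload it to the stack's input prefix.
local_path = "/tmp/dataset.csv"
dataset.to_csv(local_path)

s3 = boto3.client("s3")
s3.upload_file(local_path, "<your-batch-inference-bucket>", "inputs/dataset.csv")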

Generate Batch Embeddings

For embedding generation, ensure your input CSV file includes a column labeled input_text. The job input payload resembles the following:

{
  "s3_uri": "s3://batch-inference-bucket-/inputs/embeddings/embedding_input.csv",
  "job_name_prefix": "test-embeddings-job1",
  "model_id": "amazon.titan-embed-text-v2:0",
  "prompt_id": null
}
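One way to submit this payload, if you prefer the AWS SDK over the console, is to start an execution of the deployed state machine directly; the state machine ARN below is a placeholder assembled from the stepFunctionName stack output.

import json
import boto3

sfn = boto3.client("stepfunctions")

# Job input mirroring the JSON above; replace the placeholders with your stack outputs.
job_input = {
    "s3_uri": "s3://<your-batch-inference-bucket>/inputs/embeddings/embedding_input.csv",
    "job_name_prefix": "test-embeddings-job1",
    "model_id": "amazon.titan-embed-text-v2:0",
    "prompt_id": None,
}

response = sfn.start_execution(
    stateMachineArn="arn:aws:states:<region>:<account-id>:stateMachine:<stepFunctionName>",
    input=json.dumps(job_input),
)
print(response["executionArn"])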

Step Functions Workflow

The Step Functions workflow processes your jobs through several stages, including preparing inputs, orchestrating jobs, and concurrent post-processing to merge model responses back with the original data. Monitoring the workflow provides insights into job status and resource utilization.
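To give a sense of what the post-processing stage produces, the sketch below parses a Bedrock batch output file, whose lines carry the recordId and the modelOutput returned by the model, and joins each response back to its original input record. The file names and the Anthropic-style response field access are simplified illustrations, not the repository's actual Lambda code.

import json

def load_jsonl(path: str) -> list[dict]:
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# Original inputs keyed by recordId, plus the corresponding batch output file.
inputs = {rec["recordId"]: rec for rec in load_jsonl("batch_input.jsonl")}
outputs = load_jsonl("batch_input.jsonl.out")  # output object name is illustrative

merged = []
for rec in outputs:
    # For Anthropic Claude models, the completion text lives under content[0].text.
    content = rec.get("modelOutput", {}).get("content") or [{}]
    merged.append({
        "recordId": rec["recordId"],
        "prompt": inputs[rec["recordId"]]["modelInput"]["messages"][0]["content"],
        "completion": content[0].get("text", ""),
    })

# Write merged prompt/completion pairs for downstream analysis.
with open("merged_results.jsonl", "w") as f:
    for row in merged:
        f.write(json.dumps(row) + "\n")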

Conclusion

In this post, we’ve explored a serverless architecture using Amazon Bedrock for large-scale batch processing. This solution is versatile for various use cases beyond inference, including large-scale data labeling and embedding generation.

The solution is publicly available in the GitHub repository, and we encourage you to implement this architecture to unlock new possibilities in your AI/ML endeavors.

Meet the Authors

  • Swagat Kulkarni: Senior Solutions Architect at AWS, passionate about cloud-native services and innovative AI solutions.
  • Evan Diewald: Data & ML Engineer, dedicated to developing and deploying ML solutions across various industries.
  • Shreyas Subramanian: Principal Data Scientist, specializing in generative AI and deep learning, with a rich background in cutting-edge research.

We look forward to seeing how you leverage this architecture for your projects!
