
Unlocking the Power of Customized AI with On-Demand Deployment in Amazon Bedrock

In the rapidly evolving world of artificial intelligence, flexibility and customization are key. Amazon Bedrock has stepped forward with a groundbreaking feature: on-demand deployment for customized foundation models (FMs). This lets organizations tailor models to their specific needs without paying for idle compute resources. In this post, we’ll explore what this means, how it works, and best practices for implementation.

What is On-Demand Deployment?

Amazon Bedrock now offers on-demand deployment for customized models, enabling real-time processing of requests without pre-provisioned compute resources. This new configuration is ideal for varying workloads, allowing organizations to invoke their models only as needed.

The pricing model is token-based: you are charged for the tokens processed during inference, a pay-as-you-go approach. This complements the existing Provisioned Throughput option, giving users the flexibility to choose the deployment strategy that aligns with their business objectives.
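As a toy illustration of token-based billing (the per-token rates below are hypothetical placeholders, not actual Amazon Bedrock pricing):

# Hypothetical rates for illustration only; check the Amazon Bedrock pricing page
input_rate = 0.0008 / 1000   # $ per input token (placeholder)
output_rate = 0.0032 / 1000  # $ per output token (placeholder)
input_tokens, output_tokens = 12_000, 3_000
cost = input_tokens * input_rate + output_tokens * output_rate
print(f"Estimated charge: ${cost:.4f}")  # -> $0.0192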

The Custom Model Lifecycle

Understanding the journey from model conceptualization to deployment is essential. The process starts by defining specific use cases and preparing appropriate data. You can customize models using Amazon Bedrock’s fine-tuning or model distillation features.
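For instance, a fine-tuning job can be started programmatically. The following is a minimal sketch using boto3; the job name, role ARN, S3 URIs, base model ID, and hyperparameters are all placeholder assumptions to adapt to your use case.

import boto3

bedrock = boto3.client("bedrock")

# Start a fine-tuning job (all names, ARNs, and URIs below are placeholders)
job = bedrock.create_model_customization_job(
    jobName="my-finetune-job",
    customModelName="my-custom-model",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.nova-lite-v1:0",  # assumed; check which models support fine-tuning in your Region
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={"epochCount": "2"},  # assumed; valid keys depend on the base model
)
print(job["jobArn"])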

Once the model is customized, you enter the evaluation phase, where decisions are made about how to deploy the model for inference. Here, on-demand deployment plays a pivotal role, particularly for businesses with fluctuating demands.

Step-by-Step Implementation Guide

Implementing via the Amazon Bedrock Console

  1. Select Your Model: Choose the customized model you want to deploy and navigate to the inference setup.
  2. Deployment Configuration: Provide a name and description, and optionally add tags.
  3. Start Deployment: Click "Create" to launch your on-demand model deployment.
  4. Monitor Status: Track your deployment status (Creating, Active, or Failed) and view details such as the deployment ARN and creation time.

Implementing via API or SDK

Using APIs allows for more programmatic control over deployment. Here’s how:

  1. Create Bedrock Client: Start by configuring your Amazon Bedrock client to interact with the service.
  2. Deploy Model: Use the CreateCustomModelDeployment API with your model details.
  3. Check Deployment Status: Use GetCustomModelDeployment to monitor whether the deployment is Active (steps 1–3 are sketched in code after this list).
  4. Invoke Model: Use the InvokeModel or Converse API to call your deployed model, providing input data in the appropriate format.
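Here is a minimal sketch of steps 1 through 3 using boto3. The deployment name, model ARN, and account details are placeholders, and the parameter and status names follow the Amazon Bedrock API reference; verify them against the current SDK documentation.

import time
import boto3

# Control-plane client for managing deployments
bedrock = boto3.client("bedrock")

# Create the on-demand deployment (name and model ARN are placeholders)
deployment = bedrock.create_custom_model_deployment(
    modelDeploymentName="my-on-demand-deployment",
    modelArn="arn:aws:bedrock:us-east-1:123456789012:custom-model/my-custom-model",
    description="On-demand deployment for my customized model",
)
deployment_arn = deployment["customModelDeploymentArn"]

# Poll until the deployment leaves the Creating state
status = "Creating"
while status not in ("Active", "Failed"):
    time.sleep(30)
    status = bedrock.get_custom_model_deployment(
        customModelDeploymentIdentifier=deployment_arn
    )["status"]
print(f"Deployment status: {status}")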

Here’s a concise example of invoking a deployed model with the Converse API (for an on-demand deployment, the deployment ARN serves as the model ID):

import boto3

# Runtime client for inference calls
bedrock_runtime = boto3.client("bedrock-runtime")

response = bedrock_runtime.converse(
    modelId="your_model_id",  # for on-demand deployments, pass the deployment ARN
    messages=[{"role": "user", "content": [{"text": "Hello, model!"}]}]
)
print(response["output"]["message"]["content"][0]["text"])

Best Practices for On-Demand Deployment

Implementing an on-demand deployment strategy requires careful consideration of several operational factors:

  1. Cold Start Latency: Models that haven’t received requests recently may respond slowly to the first invocation (see the retry sketch after this list).
  2. Regional Availability: On-demand deployment is currently available in specific AWS Regions, so check where you can deploy your custom models.
  3. Quota Management: Pay attention to quotas for tokens per minute and requests per minute; each deployment operates under its assigned limits.
  4. Cost Management: Use cost allocation tags to track spending and analyze costs with AWS Cost Explorer.
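To smooth over cold starts, you can retry the first invocation with a short backoff. The sketch below assumes the Bedrock runtime surfaces a ModelNotReadyException while an on-demand model warms up; confirm the exception name in the current SDK documentation.

import time
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def converse_with_retry(deployment_arn, messages, attempts=5, delay=10):
    # Retry while the on-demand model warms up (cold start)
    for attempt in range(attempts):
        try:
            return bedrock_runtime.converse(modelId=deployment_arn, messages=messages)
        except bedrock_runtime.exceptions.ModelNotReadyException:
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff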

Cleaning Up Resources

If you test on-demand deployment and decide not to continue, ensure you clean up resources to avoid unwanted costs. You can delete deployments either through the Amazon Bedrock Console or using the DeleteCustomModelDeployment API.
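For example, a cleanup call with boto3 might look like the following; the deployment ARN is a placeholder:

import boto3

bedrock = boto3.client("bedrock")

# Delete the on-demand deployment to stop incurring charges
bedrock.delete_custom_model_deployment(
    customModelDeploymentIdentifier="arn:aws:bedrock:us-east-1:123456789012:custom-model-deployment/abc123"
)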

Conclusion

The launch of on-demand deployment for customized models in Amazon Bedrock is a game-changer for organizations looking to leverage AI more effectively. The feature delivers cost optimization, operational simplicity, and scalability, all while ensuring you pay only for what you use.

Getting started is straightforward. Customize your model today through fine-tuning or distillation, choose on-demand deployment, and integrate seamlessly into your workflows.

To delve deeper into on-demand deployments and to explore comprehensive guides, check out the Amazon Bedrock documentation and visit our GitHub repository for code samples.

About the Authors
A dedicated team of AWS experts contributed to this guide, combining extensive AI/ML experience to help you navigate your journey through customized AI solutions.

Let’s unlock new possibilities with Amazon Bedrock!
