Simplifying ML Model Deployment with Amazon SageMaker Canvas and Serverless Inference
Deploying machine learning (ML) models into production can often feel like navigating a labyrinth—especially for customers who lack deep ML and DevOps expertise. Fortunately, Amazon SageMaker Canvas is here to simplify that journey. With its no-code interface, you can create highly accurate ML models using your existing data sources—without writing a single line of code. However, building the model is just the beginning; deploying it efficiently and cost-effectively is equally crucial.
That’s where Amazon SageMaker Serverless Inference comes into play. Designed for workloads with variable traffic patterns and idle periods, SageMaker Serverless Inference automatically provisions and scales infrastructure based on demand, sparing you the burdens of server management and preconfigured capacity. In this blog post, we will walk you through the steps to take an ML model built in SageMaker Canvas and deploy it using SageMaker Serverless Inference.
Solution Overview
Let’s explore an example workflow for creating a serverless endpoint for a SageMaker Canvas-trained model:
- Add the trained model to the Amazon SageMaker Model Registry.
- Create a new SageMaker model with the right configuration.
- Create a serverless endpoint configuration.
- Deploy the serverless endpoint with the specified model and endpoint configuration.
You can also automate this process—a feature we’ll discuss later.
Example: Deploying a Pre-trained Classification Model
In this example, we'll deploy a pre-trained classification model to a serverless SageMaker endpoint. This lets us serve predictions on demand for variable traffic without provisioning always-on inference infrastructure.
Prerequisites
Before we dive into the steps, make sure you have:
- Access to Amazon Simple Storage Service (Amazon S3) and Amazon SageMaker AI.
- A regression or classification model that you have trained (you can train this in SageMaker Canvas, including data wrangling and transformations).
For demonstration purposes, we'll use a classification model trained on a dataset named canvas-sample-shipping-logs.csv.
Step 1: Save Your Model to SageMaker Model Registry
- Launch Amazon SageMaker Studio via the SageMaker AI console.
- Open SageMaker Canvas in a new tab.
- Locate the model and version you wish to deploy.
- On the options menu (three vertical dots), select Add to Model Registry.
You can log out of SageMaker Canvas afterward to manage costs and prevent extra charges.
Step 2: Approve Your Model for Deployment
- In the SageMaker Studio UI, choose Models from the navigation pane.
- Find the model exported from SageMaker Canvas, which will have a deployment status of Pending manual approval.
- Update the status to Approved.
- Navigate to the Deploy tab to view the model container information. Note the inference image (Amazon ECR URI), the model artifact location (Amazon S3 URI), and environment variables such as SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT; you'll need these values in the next step.
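If you manage many models, you can also approve programmatically with the boto3 SageMaker client. A minimal sketch, assuming a hypothetical model package group name (replace it with the group Canvas created for your model):
import boto3

sagemaker = boto3.client("sagemaker")

# Look up the most recent model package in the group
# ("canvas-shipping-data-model-group" is a placeholder name).
packages = sagemaker.list_model_packages(
    ModelPackageGroupName="canvas-shipping-data-model-group",
    SortBy="CreationTime",
    SortOrder="Descending",
    MaxResults=1,
)
package_arn = packages["ModelPackageSummaryList"][0]["ModelPackageArn"]

# Move the package from PendingManualApproval to Approved
sagemaker.update_model_package(
    ModelPackageArn=package_arn,
    ModelApprovalStatus="Approved",
)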
Step 3: Create a New Model
- Open a new SageMaker AI console tab.
- In the Inference section, choose Models, then choose Create model.
- Name your model.
- Leave the container input as Provide model artifacts and inference image location using the CompressedModel type.
- Enter the Amazon ECR URI, Amazon S3 URI, and environment variables you noted earlier. Ensure each environment variable is on its own line:
SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT: text/csv
SAGEMAKER_INFERENCE_OUTPUT: predicted_label
...
- Choose Create model.
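If you prefer to script this step instead of using the console, the same model can be created with boto3. This is a minimal sketch; the model name, execution role ARN, image URI, artifact URI, and environment variables are placeholders for the values you noted in Step 2:
import boto3

sagemaker = boto3.client("sagemaker")

# All names and URIs below are placeholders -- substitute the ECR image URI,
# S3 model artifact URI, and environment variables from the model's Deploy tab.
sagemaker.create_model(
    ModelName="canvas-shipping-data-model-1",
    ExecutionRoleArn="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",
    PrimaryContainer={
        "Image": "<ecr-inference-image-uri>",
        "ModelDataUrl": "s3://<bucket>/<path>/model.tar.gz",
        "Environment": {
            "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT": "text/csv",
            "SAGEMAKER_INFERENCE_OUTPUT": "predicted_label",
        },
    },
)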
Step 4: Create an Endpoint Configuration
- On the SageMaker AI console, go to Endpoint configurations.
- Create a new model endpoint configuration, selecting Serverless as the endpoint type.
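The equivalent boto3 call looks like the sketch below. With a serverless endpoint, you size memory and concurrency instead of choosing instance types; the 2048 MB and 5 concurrent invocations used here are illustrative values, not recommendations:
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_endpoint_config(
    EndpointConfigName="canvas-shipping-data-model-1-serverless-config",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "canvas-shipping-data-model-1",  # the model from Step 3
            # Serverless: pick memory (1024-6144 MB) and max concurrent invocations
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 5,
            },
        }
    ],
)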
Step 5: Create an Endpoint
- Choose Endpoints from the navigation pane and create a new endpoint.
- Name the endpoint, select the configuration created in the previous step, and choose Create endpoint.
The endpoint takes a few minutes to provision. When its status changes to InService, it's ready to invoke.
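Scripted, endpoint creation plus the wait for InService looks like the following sketch (names carried over from the previous steps):
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_endpoint(
    EndpointName="canvas-shipping-data-model-1-serverless-endpoint",
    EndpointConfigName="canvas-shipping-data-model-1-serverless-config",
)

# Block until the endpoint finishes provisioning (typically a few minutes)
waiter = sagemaker.get_waiter("endpoint_in_service")
waiter.wait(EndpointName="canvas-shipping-data-model-1-serverless-endpoint")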
Sample Code for Inference
Here’s how you can invoke the endpoint directly from a Jupyter notebook in your SageMaker Studio environment:
import ast
import boto3
import csv
import time
from io import StringIO

def invoke_shipping_prediction(features):
    """Send one row of features to the serverless endpoint as CSV and parse the reply."""
    sagemaker_client = boto3.client('sagemaker-runtime')

    # Serialize the feature list as a single CSV row
    output = StringIO()
    csv.writer(output).writerow(features)
    payload = output.getvalue()

    response = sagemaker_client.invoke_endpoint(
        EndpointName="canvas-shipping-data-model-1-serverless-endpoint",
        ContentType="text/csv",
        Accept="text/csv",
        Body=payload
    )

    # The endpoint returns a single CSV row:
    # predicted label, confidence, class probabilities, possible labels
    response_body = response['Body'].read().decode()
    result = list(csv.reader(StringIO(response_body)))[0]

    return {
        'predicted_label': result[0],
        'confidence': float(result[1]),
        # ast.literal_eval safely parses the list-formatted fields (avoid eval)
        'class_probabilities': ast.literal_eval(result[2]),
        'possible_labels': ast.literal_eval(result[3])
    }

# Features for inference
features_set = [
    ["Bell", "Base", 14, 6, 11, 11, "GlobalFreight", "Bulk Order", "Atlanta", "2020-09-11 00:00:00", "Express", 109.25199890136719],
    ["Bell", "Base", 14, 6, 15, 15, "MicroCarrier", "Single Order", "Seattle", "2021-06-22 00:00:00", "Standard", 155.0483856201172]
]

for features in features_set:
    start_time = time.time()
    result = invoke_shipping_prediction(features)
    end_time = time.time()
    print(f"Prediction: {result['predicted_label']}, "
          f"Confidence: {result['confidence']*100:.2f}% "
          f"(latency: {end_time - start_time:.2f}s)")
Automate the Process
To automatically create endpoints each time a new model is approved, you can use AWS CloudFormation. Here is an illustrative YAML template for creating Lambda functions to manage this process:
AWSTemplateFormatVersion: "2010-09-09"
Description: Lambda function to handle SageMaker model package creation and endpoint management
...
This template automates several aspects of SageMaker deployment, but treat it as a starting point rather than a production-ready solution; always test thoroughly before deploying.
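To make the Lambda piece concrete, here is a sketch of a handler that an Amazon EventBridge rule (source aws.sagemaker, detail type SageMaker Model Package State Change) could trigger. The detail-field names and the resource-naming scheme are assumptions for illustration; validate them against the actual events in your account:
import boto3

sagemaker = boto3.client("sagemaker")

def lambda_handler(event, context):
    detail = event["detail"]
    # Only act on packages that were just approved
    if detail.get("ModelApprovalStatus") != "Approved":
        return

    # Assumed to be present in the event detail
    package_arn = detail["ModelPackageArn"]
    # Derive resource names from the group name and version (placeholder scheme)
    parts = package_arn.split("/")
    name = f"auto-{parts[-2]}-{parts[-1]}"

    # A SageMaker model can reference the approved model package directly
    sagemaker.create_model(
        ModelName=name,
        ExecutionRoleArn="<execution-role-arn>",  # placeholder
        PrimaryContainer={"ModelPackageName": package_arn},
    )
    sagemaker.create_endpoint_config(
        EndpointConfigName=f"{name}-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "ServerlessConfig": {"MemorySizeInMB": 2048, "MaxConcurrency": 5},
        }],
    )
    sagemaker.create_endpoint(
        EndpointName=f"{name}-endpoint",
        EndpointConfigName=f"{name}-config",
    )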
Clean Up
To avoid unnecessary costs, remember to log out of SageMaker Canvas and stop your JupyterLab instance after testing the endpoint.
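If you created the endpoint only for testing, you can also delete the deployment resources themselves. A minimal sketch, reusing the names from the earlier steps:
import boto3

sagemaker = boto3.client("sagemaker")

# Delete in dependency order: endpoint, then its configuration, then the model
sagemaker.delete_endpoint(
    EndpointName="canvas-shipping-data-model-1-serverless-endpoint")
sagemaker.delete_endpoint_config(
    EndpointConfigName="canvas-shipping-data-model-1-serverless-config")
sagemaker.delete_model(ModelName="canvas-shipping-data-model-1")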
Conclusion
In this post, we demonstrated how to deploy a SageMaker Canvas model to a serverless endpoint using SageMaker Serverless Inference. This approach provides quick, efficient prediction serving without the need to manage infrastructure.
This seamless deployment experience exemplifies how AWS services like SageMaker Canvas and SageMaker Serverless Inference simplify the ML journey, enabling organizations of all sizes and technical levels to unlock the power of AI and ML.
About the Authors
Nadhya Polanco is a Solutions Architect at AWS in Brussels, Belgium, supporting organizations in integrating AI and ML into their workflows. Brajendra Singh is a Principal Solutions Architect at AWS, helping enterprise customers implement innovative solutions across Data Analytics and ML.
Embark on your machine learning journey with SageMaker and discover a world of possibilities!