Deploying OpenAI’s GPT-OSS Models for Enhanced Stock Analysis with Amazon SageMaker
Introduction
Explore the deployment of OpenAI’s open-weight models on Amazon SageMaker to create a powerful stock analyzer that utilizes advanced agent workflows.
Solution Overview
An overview of the multi-agent orchestration and the key components of the stock evaluation system.
Prerequisites
Requirements to set up your environment and ensure smooth deployments.
Deploying GPT-OSS Models to SageMaker Inference
Step-by-step guidance on deploying the models and customizing your inference setup.
Building a Stock Analyzer Agent with LangGraph
Leverage LangGraph to orchestrate a sophisticated multi-agent stock analysis system.
Deploying to Amazon Bedrock AgentCore
Instructions for deploying your agent to a scalable cloud infrastructure.
Invoking the Agent
Methods for invoking the stock analysis agent and handling its responses.
Clean Up
Best practices for deleting resources post-testing to manage costs effectively.
Conclusion
Key takeaways and benefits of deploying open-weight models for stock analysis.
About the Authors
Learn more about the experts behind this solution and their contributions to the field of AI and machine learning.
Leveraging OpenAI’s GPT-OSS Models with Amazon SageMaker for a Stock Analysis Agent
OpenAI has made waves in the AI landscape by releasing powerful open-weight models: gpt-oss-120b (117 billion parameters) and gpt-oss-20b (21 billion parameters). Both models are built on a Mixture of Experts (MoE) architecture and offer a 128K context window. Benchmarks from Artificial Analysis place them among the leading open-weight models, particularly excelling in reasoning and agentic workflows.
This blog post will provide an insightful overview of how to deploy the gpt-oss-20b model on Amazon SageMaker, create a multi-agent stock analysis assistant using LangGraph, and ultimately deploy these agents on Amazon Bedrock AgentCore.
Solution Overview
Within this solution, we are developing a stock analysis agent that comprises the following key components:
- GPT OSS 20B Model: Deployed to a SageMaker endpoint using vLLM, an open-source serving framework optimized for large language models.
- LangGraph: Utilized for orchestrating multi-agent workflows.
- Amazon Bedrock AgentCore: The platform for deploying agents seamlessly.
Architecture Diagram
(Note: Insert architecture diagram)
This architecture exemplifies a multi-agent workflow hosted on the Amazon Bedrock AgentCore Runtime. When a user submits a query, it is managed by a pipeline of specialized agents—Data Gathering Agent, Stock Performance Analyzer Agent, and Stock Report Generation Agent—each responsible for distinct aspects of the stock evaluation process.
These agents collaborate within Amazon Bedrock’s runtime environment, invoking the hosted GPT OSS model whenever language understanding or generation is needed, ensuring a modular, serverless, and scalable system leveraging open-source capabilities.
Prerequisites
Before diving into deployment, here are some prerequisites:
- Instance Quota: Ensure you have the required quota for G6e instances. If not, request additional quota.
- SageMaker Domain: Create a SageMaker domain if you’re a first-time user.
- IAM Roles: Make sure your IAM roles have permissions needed for deploying SageMaker Models and Endpoints. Refer to the SageMaker Developer Guide for guidance.
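As a quick check on the first prerequisite, you can list your current G6e-related SageMaker quotas with the AWS CLI. The quota-name filter below is an assumption; confirm the exact quota name in the Service Quotas console:

```shell
# List SageMaker quotas whose names mention g6e endpoint usage.
# The name pattern is an assumption -- verify in the Service Quotas console.
aws service-quotas list-service-quotas \
    --service-code sagemaker \
    --query "Quotas[?contains(QuotaName, 'g6e')].[QuotaName,Value]" \
    --output table
```

If the returned value is 0, submit a quota increase request before attempting deployment.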
Deploying GPT-OSS Models to SageMaker Inference
If you’re interested in customizing your models, SageMaker provides fully managed hosting that simplifies the deployment process. The GPT-OSS models also use a 4-bit quantization scheme (MXFP4), enabling faster inference with lower memory use.
To deploy effectively, we will build a vLLM container supporting GPT OSS models on SageMaker. Below is a sample Dockerfile and deployment script for setting it up:
# Dockerfile for vLLM
FROM <base_image>
# Install necessary dependencies
RUN pip install vllm
COPY . /app
WORKDIR /app
After building and pushing the container to Amazon ECR, the next step is to launch Amazon SageMaker Studio, where you can create a Jupyter environment for deployment.
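The build-and-push step typically looks like the following; the account ID, Region, and repository name are placeholders you should replace with your own values:

```shell
# Placeholders: replace with your account ID, Region, and repository name.
ACCOUNT=123456789012
REGION=us-east-1
REPO=vllm-gpt-oss

# Authenticate Docker to your private ECR registry.
aws ecr get-login-password --region $REGION | \
    docker login --username AWS --password-stdin $ACCOUNT.dkr.ecr.$REGION.amazonaws.com

# Create the repository if it does not exist, then build, tag, and push.
aws ecr create-repository --repository-name $REPO --region $REGION || true
docker build -t $REPO .
docker tag $REPO:latest $ACCOUNT.dkr.ecr.$REGION.amazonaws.com/$REPO:latest
docker push $ACCOUNT.dkr.ecr.$REGION.amazonaws.com/$REPO:latest
```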
The deployment configuration can be set via:
from sagemaker import Model

lmi_model = Model(
    image_uri=inference_image,
    env=config,
    role=role,
    name=model_name,
)
lmi_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    ...
)
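Once the endpoint is in service, the vLLM container exposes an OpenAI-compatible chat schema, so a request payload can be built as sketched below. The model parameters and the invocation call shown in the comment are illustrative assumptions; adapt them to your setup:

```python
import json

# Build an OpenAI-style chat payload for the vLLM-served gpt-oss-20b model.
# The parameter values below are illustrative assumptions.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful stock analysis assistant."},
        {"role": "user", "content": "Summarize the recent performance of SIM_STOCK."},
    ],
    "max_tokens": 512,
    "temperature": 0.2,
}
body = json.dumps(payload)

# With boto3's SageMaker Runtime client, this body would be sent as:
#   runtime.invoke_endpoint(EndpointName=endpoint_name,
#                           ContentType="application/json", Body=body)
print(body[:60])
```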
Utilizing LangGraph for a Stock Analyzer Agent
Using LangGraph allows us to orchestrate interactions between various agents handling specific tasks. The comprehensive system we’ve built comprises three specialized tools working cohesively to conduct thorough stock analysis:
- Gather Stock Data Tool: Compiles stock data with current prices, historical performance, financial metrics, and market insights.
- Analyze Stock Performance Tool: Performs in-depth analysis based on technical indicators and fundamental metrics.
- Generate Stock Report Tool: Produces professional PDF reports from the data and analysis, which are then stored in Amazon S3.
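To make the flow concrete, here is a minimal, self-contained sketch of the three-stage pipeline using plain Python functions and hard-coded sample data. In the actual agent, each function is registered as a LangGraph tool and calls the SageMaker-hosted model; every name and value below is illustrative:

```python
# Illustrative sketch of the three-tool pipeline; all data is hard-coded.
# In the real agent these are LangGraph tools backed by gpt-oss-20b.

def gather_stock_data(ticker: str) -> dict:
    """Stage 1: compile price, history, and financial metrics for a ticker."""
    return {"ticker": ticker, "price": 101.5, "pe_ratio": 18.2,
            "history": [98.0, 99.5, 100.2, 101.5]}

def analyze_stock_performance(data: dict) -> dict:
    """Stage 2: derive simple technical/fundamental signals from the data."""
    trend = "up" if data["history"][-1] > data["history"][0] else "down"
    return {"ticker": data["ticker"], "trend": trend,
            "valuation": "fair" if data["pe_ratio"] < 25 else "rich"}

def generate_stock_report(analysis: dict) -> str:
    """Stage 3: render a short report (a PDF written to S3 in the real system)."""
    return (f"Report for {analysis['ticker']}: trend is {analysis['trend']}, "
            f"valuation looks {analysis['valuation']}.")

report = generate_stock_report(analyze_stock_performance(gather_stock_data("SIM_STOCK")))
print(report)
```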
Local Testing
You can initially test components locally by importing the necessary functions, making it easier to refine the agent logic before scaling up.
result = langgraph_stock_sagemaker({
    "prompt": "Analyze SIM_STOCK Stock for Investment purposes."
})
print(result)
Deploying to Amazon Bedrock AgentCore
After developing the LangGraph framework, it’s time to deploy it to Amazon Bedrock AgentCore Runtime, which streamlines the management of your infrastructure while providing persistent execution environments.
Create the necessary IAM role with appropriate permissions for seamless interaction between the SageMaker endpoint and your agents:
role_arn = create_bedrock_agentcore_role(
    role_name="MyStockAnalyzerRole",
    ...
)
Once the role is created, use the AgentCore Starter Toolkit for deployment:
agentcore_runtime = Runtime()
response = agentcore_runtime.configure(
    entrypoint="langgraph_stock_sagemaker_gpt_oss.py",
    ...
)
launch_result = agentcore_runtime.launch(local=False)
This toolkit automatically sets up the requisite HTTP server and endpoints for processing agent invocations.
Invoking the Agent
After deployment, invoking your newly created agent can be performed as follows:
response = agentcore_client.invoke_agent_runtime(
    agentRuntimeArn=launch_result.agent_arn,
    ...
)
You’ll need to parse the response to present the stock analysis neatly, including sections such as stock data gathering, performance analysis, and report generation.
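For example, if the runtime returns the agent's output as a JSON-encoded byte body (the exact response shape is an assumption; check the AgentCore documentation for your SDK version), parsing might look like this:

```python
import json

# Simulated raw response body from the agent runtime; the structure below
# is an illustrative assumption, not the documented API shape.
raw_body = b'{"stock_data": "gathered", "analysis": "bullish", "report_url": "s3://bucket/report.pdf"}'

result = json.loads(raw_body.decode("utf-8"))

# Present each section of the stock analysis.
for section in ("stock_data", "analysis", "report_url"):
    print(f"{section}: {result[section]}")
```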
Clean Up
To avoid unnecessary costs, remember to clean up your resources when testing is complete:
sess.delete_endpoint(endpoint_name)
sess.delete_model(model_name)
Conclusion
This post outlined how to deploy OpenAI’s GPT-OSS models into a functional multi-agent stock analysis system using Amazon SageMaker and Amazon Bedrock AgentCore. By combining these services, organizations can streamline stock analysis workflows and improve analyst productivity.
Experiment with the code samples provided, and enhance your workflows for practical applications in your business!
About the Authors
Vivek Gangasani is a Worldwide Lead GenAI Specialist for SageMaker Inference. He specializes in helping enterprises scale their Generative AI models.
Surya Kari is a Senior Generative AI Data Scientist at AWS, focusing on developing solutions leveraging advanced foundation models.
With innovative approaches and scalable solutions, the future of stock analysis is now at your fingertips!