Fine-Tuning Large Language Models with Oumi and Amazon Bedrock

Co-authored by David Stewart and Matthew Persons from Oumi.

In the rapidly evolving landscape of artificial intelligence, the journey from experimentation to production can be cumbersome, especially when fine-tuning open-source large language models (LLMs). Managing training configurations, handling artifacts, and scaling deployments all add friction to this transition. This blog post aims to simplify that journey by detailing a workflow that uses Oumi to fine-tune a Llama model on Amazon EC2, stores the artifacts in S3, and deploys the model using Amazon Bedrock’s Custom Model Import.

Benefits of Oumi and Amazon Bedrock

Oumi is an open-source platform that streamlines the foundation model lifecycle, from data preparation through evaluation. The key advantages of pairing Oumi with Amazon Bedrock include:

  1. Recipe-driven Training: Define your configuration just once and reuse it across experiments, leading to reduced boilerplate and enhanced reproducibility.

  2. Flexible Fine-tuning: Select from full fine-tuning or parameter-efficient methods such as LoRA, depending on your project constraints.

  3. Integrated Evaluation: Evaluate your model checkpoints using pre-defined benchmarks without the need for extra tooling.

  4. Data Synthesis: Generate task-specific datasets when your production data is limited.
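To make “recipe-driven” concrete, here is a minimal sketch of what an Oumi training config can look like. The key names follow Oumi’s config schema as we understand it, but treat this as illustrative; the configs/oumi-config.yaml in the companion repository is the source of truth.

```shell
# Write a small, illustrative Oumi recipe to a demo file. Exact keys can
# vary between Oumi versions; configs/oumi-config.yaml in the repo is canonical.
mkdir -p configs
cat > configs/oumi-config-demo.yaml <<'EOF'
model:
  model_name: "meta-llama/Llama-3.2-1B-Instruct"

data:
  train:
    datasets:
      - dataset_name: "tatsu-lab/alpaca"

training:
  trainer_type: "TRL_SFT"
  output_dir: "models/final"
EOF
grep dataset_name configs/oumi-config-demo.yaml
```

Because the whole run is captured in one file, re-running an experiment is a single `oumi train -c <config>` invocation with no flags to remember.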

Amazon Bedrock complements this process with managed, serverless inference. Once your model is fine-tuned, you can import it in three steps: upload the artifacts to S3, create the import job, and invoke the model, all without managing inference infrastructure.
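For readers who want to see the moving parts, the three steps map onto plain AWS CLI calls roughly as sketched below. This is illustrative rather than the post’s helper scripts: the function names, `my-imported-llama`, and the environment variables are placeholders, and the repo’s scripts remain the canonical path.

```shell
# Illustrative sketch of the three import steps as raw AWS CLI calls.
# Defined as functions so nothing runs until you call them.
upload_artifacts() {   # step 1: copy fine-tuned weights to S3
  aws s3 sync models/final "s3://$S3_BUCKET/$S3_PREFIX/"
}

create_import_job() {  # step 2: start the Custom Model Import job
  aws bedrock create-model-import-job \
    --job-name "llama-import-job" \
    --imported-model-name "my-imported-llama" \
    --role-arn "$BEDROCK_ROLE_ARN" \
    --model-data-source "{\"s3DataSource\": {\"s3Uri\": \"s3://$S3_BUCKET/$S3_PREFIX/\"}}"
}

invoke_model() {       # step 3: call the imported model once it is active
  aws bedrock-runtime invoke-model \
    --model-id "$MODEL_ARN" \
    --cli-binary-format raw-in-base64-out \
    --body '{"prompt": "What is the capital of France?", "max_gen_len": 128}' \
    response.json
}
```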


Figure 1: Oumi manages data, training, and evaluation on EC2. Amazon Bedrock provides managed inference via Custom Model Import.

Solution Overview

This workflow consists of three main stages:

  1. Fine-tune with Oumi on EC2: Start a GPU-optimized instance, install Oumi, and run training with your configuration. Oumi also supports distributed training for larger models with strategies like Fully Sharded Data Parallel (FSDP) and DeepSpeed.

  2. Store Artifacts on S3: Upload your model weights, checkpoints, and logs to S3 for durable storage.

  3. Deploy to Amazon Bedrock: Use the Custom Model Import job in Amazon Bedrock to point to your S3 artifacts, allowing automatic provisioning of inference infrastructure.

This architecture is designed to tackle common challenges associated with moving fine-tuned models into a production environment.

Technical Implementation

Let’s dive into a hands-on example using the meta-llama/Llama-3.2-1B-Instruct model. We chose this model because it fits comfortably on an AWS g6.12xlarge EC2 instance, but the methodology applies to other open-source models as well.

Prerequisites

To follow this walkthrough, complete the following setup steps:

  1. Clone the repository:

    git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
    cd sample-oumi-fine-tuning-bedrock-cmi
  2. Run the setup script:

    ./scripts/setup-aws-env.sh [--dry-run]

This script will prompt you for details about your AWS Region, S3 bucket name, EC2 key pair name, and security group ID.

Once your instance is up, SSH into it and continue with the next steps.

Step 1: Set Up the EC2 Environment

  1. Update your EC2 instance and install dependencies:

    sudo yum update -y
    sudo yum install python3 python3-pip git -y
  2. Clone the repository again (this time on the EC2 instance) and navigate to the project directory:

    git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
    cd sample-oumi-fine-tuning-bedrock-cmi
  3. Configure your environment variables:

    export AWS_REGION=your-region
    export S3_BUCKET=your-bucket-name
    export S3_PREFIX=your-s3-prefix
    aws configure set default.region "$AWS_REGION"
  4. Run the setup script to configure the environment:

    ./scripts/setup-environment.sh
    source .venv/bin/activate
  5. Authenticate with Hugging Face to access gated model weights.
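Concretely, this can be done with the Hugging Face CLI. The snippet below is a sketch: it assumes you have created a read token (HF_TOKEN) in your Hugging Face account settings, and it only attempts the login when that variable is set.

```shell
# Authenticate with Hugging Face so training can download the gated Llama weights.
# HF_TOKEN is a read token from your Hugging Face account settings.
if [ -n "${HF_TOKEN:-}" ]; then
  pip install -U "huggingface_hub[cli]"
  huggingface-cli login --token "$HF_TOKEN"
  msg="Hugging Face login attempted."
else
  msg="Set HF_TOKEN to a Hugging Face read token first."
fi
echo "$msg"
```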

Step 2: Configure Training

By default, the dataset is set to tatsu-lab/alpaca, which Oumi downloads automatically. If you want to change it, update the dataset_name parameter in configs/oumi-config.yaml.
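As a sketch of that edit, the snippet below rewrites dataset_name in a throwaway copy of the config (the nesting shown is our assumption about Oumi’s schema; adapt the sed pattern to the real file):

```shell
# Demo: swap the training dataset by rewriting dataset_name.
# Runs against a throwaway copy so the real config is untouched.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
data:
  train:
    datasets:
      - dataset_name: "tatsu-lab/alpaca"
EOF
sed -i 's|dataset_name: .*|dataset_name: "databricks/databricks-dolly-15k"|' "$cfg"
grep dataset_name "$cfg"
```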

If you’re interested in generating synthetic data, update the model_name in configs/synthesis-config.yaml and run:

oumi synth -c configs/synthesis-config.yaml

Step 3: Fine-Tune the Model

Fine-tune using Oumi’s training recipe:

./scripts/fine-tune.sh --config configs/oumi-config.yaml --output-dir models/final [--dry-run]

Monitor your job with nvidia-smi or AWS CloudWatch. If needed, enable EC2 Automatic Instance Recovery for long-running jobs.
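For a quick check from the shell, a one-shot nvidia-smi query like the following works well; wrap the call in `watch -n 10` for continuous output. The guard just makes the snippet safe on machines without a GPU.

```shell
# One-shot GPU utilization snapshot; use `watch -n 10` around the
# nvidia-smi call for continuous monitoring during training.
if command -v nvidia-smi >/dev/null 2>&1; then
  gpu_status=$(nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv)
else
  gpu_status="nvidia-smi not available on this machine"
fi
echo "$gpu_status"
```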

Step 4: Evaluate the Model (Optional)

Evaluate your fine-tuned model using standard benchmarks:

oumi evaluate -c configs/evaluation-config.yaml

Step 5: Deploy to Amazon Bedrock

Upload your model artifacts to S3 and import the model into Amazon Bedrock:

./scripts/upload-to-s3.sh --bucket $S3_BUCKET --source models/final --prefix $S3_PREFIX
./scripts/import-to-bedrock.sh --model-name my-fine-tuned-llama --s3-uri s3://$S3_BUCKET/$S3_PREFIX --role-arn $BEDROCK_ROLE_ARN --wait
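The --wait flag above blocks until the import finishes. If you script this yourself, a polling loop against the Bedrock API looks roughly like the sketch below; wait_for_import is a hypothetical helper, and the job ARN comes from the create-model-import-job response.

```shell
# Hypothetical helper: poll a Custom Model Import job until it finishes.
wait_for_import() {
  local job_arn=$1 status
  while true; do
    status=$(aws bedrock get-model-import-job \
      --job-identifier "$job_arn" \
      --query 'status' --output text)
    echo "import status: $status"
    case "$status" in
      Completed) return 0 ;;
      Failed)    return 1 ;;
    esac
    sleep 30
  done
}
```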

Invoke the model:

./scripts/invoke-model.sh --model-id $MODEL_ARN --prompt "Translate this text to French: What is the capital of France?"

Step 6: Clean Up

To avoid ongoing costs, remove the resources created during this walkthrough:

aws ec2 terminate-instances --instance-ids $INSTANCE_ID
aws s3 rm s3://$S3_BUCKET/$S3_PREFIX/ --recursive
aws bedrock delete-imported-model --model-identifier $MODEL_ARN

Conclusion

In this post, we explored how to fine-tune the Llama-3.2-1B-Instruct model using Oumi on EC2 and deploy it via Amazon Bedrock’s Custom Model Import feature. This approach gives you full control over the fine-tuning process while offloading inference to a managed service.

You can kickstart your own fine-tuning pipeline by checking out the companion repository. Happy building!

Acknowledgements

Special thanks to Pronoy Chopra and Jon Turdiev for their invaluable contributions.


About the Authors:

Bashir Mohammed is a Senior Lead GenAI Solutions Architect at AWS, specializing in architectural deployment for production-scale applications.

Bala Krishnamoorthy is a Senior GenAI Data Scientist at Amazon Bedrock GTM, helping startups leverage AI technology effectively.

Greg Fina is a Principal Startup Solutions Architect specializing in Generative AI, focusing on application modernization.

David Stewart leads Field Engineering at Oumi, enhancing generative AI applications through custom solutions.

Matthew Persons is a cofounder at Oumi, dedicated to developing open generative AI systems for practical uses.
