Streamlining Your MLflow Migration: From Self-Managed Tracking Server to Amazon SageMaker’s Serverless MLflow

Running a self-managed MLflow tracking server gives you flexibility, but it also brings administrative work such as server maintenance and resource scaling. As machine learning (ML) teams grow, managing resources during high-demand periods becomes essential to keeping costs under control. Organizations that run MLflow on Amazon EC2 or on premises can simplify operations and lower costs by migrating to serverless MLflow in Amazon SageMaker.

In this post, we guide you through migrating your self-managed MLflow tracking server to a serverless MLflow App on SageMaker, which automatically scales resources based on demand. After the migration, you no longer need to patch servers or manage storage.

Why Migrate to Amazon SageMaker AI?

Moving to a serverless model improves resource management and reduces operational overhead. The MLflow Export Import tool transfers your experiments, runs, models, and other resources from one tracking server to another. The same tool is also useful for version upgrades and backup routines.

Step-by-Step Guide: Transitioning to SageMaker with MLflow

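The migration uses the MLflow Export Import tool in two phases: export the resources from your source tracking server to a local directory, then import them from that directory into the MLflow App.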

Prerequisites

Before diving into the migration, ensure you have the following:

Access to your existing MLflow tracking server
An AWS account with the necessary permissions for SageMaker
Basic familiarity with command line tools

Step 1: Verify MLflow Version Compatibility

Check if your existing MLflow version is compatible with the MLflow Export Import tool and Amazon SageMaker. To do this:

Determine your current MLflow version.
Refer to the Amazon SageMaker MLflow documentation to see the latest supported version.

If needed, upgrade to the latest supported version, replacing {supported_version} with the version number from the documentation:

pip install --upgrade mlflow=={supported_version}
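The version check in this step can be sketched in Python. This is a minimal sketch, not part of the original procedure: the helper names are hypothetical, the version strings in the usage note are placeholders, and the parser only compares the leading numeric parts of a version (pre-release suffixes such as rc1 are ignored).

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.16.2' into (2, 16, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_upgrade(installed, supported):
    """True when the installed version is older than the supported one."""
    return parse_version(installed) < parse_version(supported)


def installed_mlflow_version():
    """Return the installed MLflow version string, or None if MLflow is absent."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None
```

Call installed_mlflow_version() in the environment that runs your tracking server, then pass the result and the supported version from the SageMaker documentation to needs_upgrade().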

Step 2: Create a New MLflow App

Set up your target environment by creating a serverless MLflow App in Amazon SageMaker:

Launch Amazon SageMaker Studio.
Navigate to the MLflow section and create a new MLflow App.
Note the ARN of the new MLflow App; you’ll need it in Step 6.

Step 3: Install MLflow and the SageMaker Plugin

Ensure your execution environment can connect to your MLflow servers by installing MLflow and the required SageMaker plugin. Run:

pip install mlflow sagemaker-mlflow

Step 4: Install the MLflow Export Import Tool

Next, install the MLflow Export Import tool, which facilitates the migration. Execute:

pip install git+https://github.com/mlflow/mlflow-export-import/#egg=mlflow-export-import
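Before exporting, it can help to confirm that the packages from Steps 3 and 4 are actually present in the active environment. The sketch below assumes the distribution names match the names used in the pip commands above; missing_packages is a hypothetical helper, not part of any of these tools.

```python
from importlib import metadata

# Distribution names assumed from the pip commands in Steps 3 and 4.
REQUIRED = ["mlflow", "sagemaker-mlflow", "mlflow-export-import"]


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

An empty list from missing_packages(REQUIRED) suggests the environment is ready; any returned name points at a package to reinstall.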

Step 5: Export MLflow Resources to a Directory

It’s time to export your MLflow resources. First, create a target directory for the export (mlflow-export in this example). Then point the tracking URI at your source server and run the export:

export MLFLOW_TRACKING_URI=http://localhost:8080
export-all --output-dir mlflow-export

Step 6: Import MLflow Resources to Your MLflow App

Once the export is complete, set the tracking URI to the full ARN of your new MLflow App (the ARN below is a truncated placeholder) and run:

export MLFLOW_TRACKING_URI=arn:aws:sagemaker:::mlflow-app/app-
import-all --input-dir mlflow-export
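Steps 5 and 6 can also be driven from a single Python script, which keeps the source and target tracking URIs from leaking into each other through the shell environment. This is a sketch under assumptions: build_step and migrate are hypothetical helpers, and they simply invoke the same export-all and import-all commands shown above with the same flags.

```python
import os
import subprocess


def build_step(command, dir_flag, directory, tracking_uri):
    """Return (argv, env) for one CLI step without running it."""
    # Copy the current environment and override only the tracking URI.
    env = dict(os.environ, MLFLOW_TRACKING_URI=tracking_uri)
    return [command, dir_flag, directory], env


def migrate(source_uri, target_uri, directory="mlflow-export"):
    """Export from source_uri, then import into target_uri."""
    os.makedirs(directory, exist_ok=True)
    steps = [
        build_step("export-all", "--output-dir", directory, source_uri),
        build_step("import-all", "--input-dir", directory, target_uri),
    ]
    for argv, env in steps:
        # check=True stops the migration if a step exits with an error.
        subprocess.run(argv, env=env, check=True)
```

Calling migrate() with your source server URL and the full ARN of your MLflow App runs both steps in order; if the export fails, the import is never attempted.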

Step 7: Validate Your Migration Results

To ensure a successful migration, open the dashboard of your new MLflow App and check for:

Presence of all exported resources with original metadata
Complete run histories, metrics, and parameters
Downloadable model artifacts
Preserved tags and notes

You can also programmatically verify access by running:

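import mlflow

# Point the client at the new MLflow App; replace the truncated ARN with your app's full ARN.
mlflow.set_tracking_uri('arn:aws:sagemaker:::mlflow-app/app-')

# List every experiment and count the runs that were imported into it.
experiments = mlflow.search_experiments()
for exp in experiments:
    print(f"Experiment Name: {exp.name}")
    runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])
    print(f"Number of runs: {len(runs)}")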
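Printing run counts for the target alone does not show whether anything was lost. One possible extension, sketched below, collects counts from both servers and reports the differences. The helper names are hypothetical, and the sketch assumes experiment names are preserved by the import; by default mlflow.search_runs returns only active (non-deleted) runs, so deleted runs are not counted on either side.

```python
def count_runs(tracking_uri):
    """Map experiment name -> number of active runs on one tracking server."""
    import mlflow  # imported lazily so compare_counts works without MLflow

    mlflow.set_tracking_uri(tracking_uri)
    counts = {}
    for exp in mlflow.search_experiments():
        runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])
        counts[exp.name] = len(runs)
    return counts


def compare_counts(source, target):
    """Return {name: (source_count, target_count)} for every mismatch."""
    mismatches = {}
    for name in sorted(set(source) | set(target)):
        pair = (source.get(name), target.get(name))
        if pair[0] != pair[1]:
            mismatches[name] = pair
    return mismatches
```

Run count_runs() once against the source server and once against the MLflow App, then pass both results to compare_counts(). An empty dictionary means every experiment has the same number of runs on both sides; a None in a pair means the experiment exists on only one server.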

Considerations

Make sure your execution environment has enough storage and compute to handle the volume of data you migrate. If network connectivity is limited or the data volume is large, consider running the migration in smaller batches.
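Batching can be sketched as splitting the list of experiment names into fixed-size groups and migrating one group at a time. The helpers below are hypothetical. How each batch is handed to the export tool is an assumption to verify against the MLflow Export Import README: the sketch only produces a comma-separated string of names of the kind a bulk experiment export option would typically accept.

```python
def batches(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def as_cli_argument(batch):
    """Join one batch of experiment names into a comma-separated string."""
    return ",".join(batch)
```

Smaller batches keep each export directory small and make it easier to retry a single failed group instead of restarting the whole migration.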

Cleanup

Keep the costs of your SageMaker managed MLflow tracking server in mind. Billing is based on server usage and data storage, so stop or delete servers you no longer need. For details, see Amazon SageMaker pricing.

Conclusion

This guide showed how to migrate a self-managed MLflow tracking server to Amazon SageMaker using the MLflow Export Import tool. With a serverless MLflow App, you no longer need to patch servers or manage storage, and resources scale automatically with demand.

For more information, code samples, and examples, check our AWS Samples GitHub repository. To explore Amazon SageMaker’s extensive features, visit the Amazon SageMaker AI documentation.

About the Authors

Rahul Easwar – Senior Product Manager at AWS, with a rich background in managing scalable ML platforms.

Roland Odorfer – Solutions Architect at AWS, specializing in secure and scalable solutions for industrial clients.

Anurag Gajam – Software Development Engineer at AWS, recognized for contributions to enhancing MLflow capabilities.

Engage with us on LinkedIn to discover more about our work at the intersection of AI and cloud technology!
