

Unleashing AI Potential with NVIDIA DGX Cloud on AWS

This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA.

The AI revolution is upon us, and at its forefront is the synergy between NVIDIA's GPU expertise and Amazon Web Services (AWS). The launch of DGX Cloud on AWS is a pivotal step in democratizing access to high-performance AI infrastructure. By combining high performance, strong security, and flexible consumption, the platform is set to redefine how organizations approach AI innovation.

In this post, we walk through an end-to-end development workflow that uses NVIDIA DGX Cloud on AWS, Run:ai, and Amazon Bedrock Custom Model Import to fine-tune the openly available Llama 3.1 70B model. We explore how this combination can shorten training cycles, reduce operational complexity, and open up new business opportunities.

NVIDIA DGX Cloud on AWS: A Game Changer in AI Training

Organizations are increasingly focused on harnessing generative AI and agentic AI solutions to quickly derive business value. In response, AWS and NVIDIA have joined forces to create a fully managed, high-performance AI training platform—NVIDIA DGX Cloud on AWS. This platform offers short-term access to expansive GPU clusters optimized for fast training times and overall productivity.

The performance of DGX Cloud is bolstered by access to the latest NVIDIA architectures, including the upcoming Amazon EC2 P6e-GB200 UltraServer with the NVIDIA Grace Blackwell GB200 Superchip. Moreover, it offers continuous access to NVIDIA’s AI experts, ensuring 24/7 support and maximized return on investment (ROI).

Amazon Bedrock Custom Model Import: Simplifying Deployment

Built for flexibility and integration, Amazon Bedrock is a fully managed service featuring high-performing foundation models (FMs) from industry leaders. It allows organizations to explore a serverless experience, facilitating quick customization and deployment without complex infrastructure management.

Amazon Bedrock Custom Model Import takes this a step further, enabling users to access imported custom models seamlessly. By utilizing built-in capabilities, enterprise developers can accelerate generative AI application development while maintaining security and privacy.

Architecture Overview: Optimized for AI Workloads

DGX Cloud is designed for customers who need to train or fine-tune models. Built on Amazon EC2 p5.48xlarge instances, each equipped with eight NVIDIA H100 GPUs, the platform organizes node instances for optimal AI/ML workloads, yielding lower latency and quicker results. By pairing Amazon EKS with NVIDIA software, DGX Cloud further streamlines Kubernetes cluster deployment and optimization.

DGX Cloud also provides private access options through AWS PrivateLink and AWS Transit Gateway. This allows customers secure and direct connections between their clusters and AWS accounts—reinforcing a robust architecture for AI workloads.

Setting Up Your DGX Cloud Cluster

Once access to your DGX Cloud cluster is provisioned, you can quickly set it up to run a variety of workloads. A cluster administrator can create departments and projects, enabling effective quota management for users. Through the Run:ai interface, users can then allocate GPUs and other resources to their projects.

For example, you can launch an interactive Jupyter notebook workspace with the nvcr.io/nvidia/nemo:25.02 image to preprocess data and manage code. Connecting the workspace to Amazon S3 buckets enables efficient data transfer directly to and from your AWS account.
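From inside the notebook workspace, staging training data in S3 can be scripted with boto3. The sketch below is illustrative, not the post's exact code; the bucket name and key prefix are hypothetical placeholders, and the pure `dataset_s3_uri` helper is separated out so the URI convention is easy to verify without AWS access.

```python
def dataset_s3_uri(bucket: str, prefix: str, filename: str) -> str:
    """Build the S3 URI where a training file will live (pure helper)."""
    key = f"{prefix.strip('/')}/{filename}"
    return f"s3://{bucket}/{key}"

def upload_dataset(bucket: str, prefix: str, local_path: str) -> str:
    """Upload a local dataset file to S3 and return its URI."""
    import boto3  # lazy import: only needed when actually uploading
    s3 = boto3.client("s3")
    filename = local_path.rsplit("/", 1)[-1]
    key = f"{prefix.strip('/')}/{filename}"
    s3.upload_file(local_path, bucket, key)
    return dataset_s3_uri(bucket, prefix, filename)
```

A call such as `upload_dataset("my-dgxc-datasets", "llama31/train", "train.jsonl")` would place the file at `s3://my-dgxc-datasets/llama31/train/train.jsonl`, ready for the fine-tuning job to consume.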

Fine-tuning the Llama 3.1 Model

After setting up your Jupyter notebook, you can start fine-tuning the Llama 3.1-70b model. Upload datasets and leverage the NVIDIA NeMo framework to optimize your model for specific tasks. Utilizing NVIDIA’s powerful infrastructure, this process is expedited, allowing for rapid iterations.

Once your model is fine-tuned, conversion back to Hugging Face’s format simplifies the transfer to Amazon S3, preparing it for the final deployment phase.
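The converted Hugging Face checkpoint is a directory of files (weights, tokenizer, config) that must land in S3 with its relative layout preserved so Bedrock can read it. A minimal sketch, assuming boto3 and a hypothetical bucket/prefix; the pure `model_files_to_keys` helper computes the local-file-to-S3-key mapping so the layout logic is testable offline.

```python
import os

def model_files_to_keys(local_dir: str, s3_prefix: str):
    """Map every file under a local model directory to an S3 key,
    preserving the directory's relative layout (pure helper)."""
    mapping = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            rel = os.path.relpath(local_path, local_dir)
            key = f"{s3_prefix.strip('/')}/{rel.replace(os.sep, '/')}"
            mapping.append((local_path, key))
    return mapping

def upload_model(local_dir: str, bucket: str, s3_prefix: str) -> str:
    """Upload the whole model directory and return its S3 URI prefix."""
    import boto3  # lazy import: only needed when actually uploading
    s3 = boto3.client("s3")
    for local_path, key in model_files_to_keys(local_dir, s3_prefix):
        s3.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{s3_prefix.strip('/')}/"
```

The returned URI prefix is what you point the import job at in the next step.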

Importing Your Custom Model to Amazon Bedrock

To seamlessly deploy your fine-tuned model, import it via the Amazon Bedrock console. Once your model files are configured, you can even set up custom encryption settings for additional security. Monitoring the import job is straightforward, ensuring transparency throughout the process.

Inference with Amazon Bedrock

The Amazon Bedrock playground lets you test your newly imported model interactively. Its visual interface supports quick experimentation, making it simpler to tune inference settings before integrating the model into an application.
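Outside the playground, the imported model can be invoked programmatically with the `bedrock-runtime` client. As we understand it, imported models keep the request schema of their base family, so the body below follows the Meta Llama format; the model ARN is a placeholder, and the schema should be confirmed against the Bedrock documentation for your model.

```python
import json

def llama_request_body(prompt: str, max_gen_len: int = 512,
                       temperature: float = 0.5, top_p: float = 0.9) -> str:
    """Serialize an InvokeModel body in the Meta Llama request schema
    (pure helper, so the payload can be checked before any API call)."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
        "top_p": top_p,
    })

def invoke(model_arn: str, prompt: str) -> str:
    """Call the imported model and return its generated text."""
    import boto3  # lazy import: only needed when actually invoking
    runtime = boto3.client("bedrock-runtime")
    resp = runtime.invoke_model(modelId=model_arn,
                                body=llama_request_body(prompt))
    return json.loads(resp["body"].read())["generation"]
```

Passing the imported model's ARN as `modelId` is what routes the request to your custom weights rather than a base foundation model.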

Cleanup and Conclusion

To avoid ongoing costs, remember to delete the resources created during your workflow. This means removing the imported model and any generated AWS KMS keys.

In conclusion, the integration of NVIDIA DGX Cloud on AWS with Amazon Bedrock Custom Model Import results in a powerful solution for the development, fine-tuning, and operationalization of generative and agentic AI applications. This collaboration allows organizations to minimize overhead while fostering rapid innovation.

Are you ready to accelerate your AI initiatives? Start exploring NVIDIA DGX Cloud on AWS today, and don’t forget to check out the examples available in the dgxc-benchmarking GitHub repository!

Resources

For additional insights, explore the AWS documentation and the NVIDIA newsletters to stay current on the latest developments in AI technologies.

About the Authors

Vara Bonthu

Principal Open Source Specialist SA at AWS, leading open-source initiatives and helping a diverse range of organizations adopt AI/ML and Kubernetes.

Chad Elias

Senior Solutions Architect at AWS, specializing in infrastructure modernization and a passionate contributor to open-source projects.

Brian Kreitzer

Partner Solutions Architect at AWS, focused on technical co-sell opportunities and evangelizing cloud solutions.

Timothy Ma

Principal Specialist in generative AI at AWS, assisting customers in deploying cutting-edge machine learning solutions.

Andrew Liu

Manager of DGX Cloud Technical Marketing Engineering at NVIDIA, showcasing the capabilities of DGX Cloud through various use cases.

Chelsea Isaac

Senior Solutions Architect for DGX Cloud at NVIDIA, dedicated to helping enterprises scale their AI solutions in the cloud.

Zoey Zhang

Technical Marketing Engineer with expertise in integrating machine learning models in cloud environments.

Charlie Huang

Senior Product Marketing Manager for Cloud AI at NVIDIA, responsible for bringing NVIDIA DGX Cloud to market.
