Unlocking AI Innovation: Leveraging NVIDIA DGX Cloud on AWS for Generative AI Solutions
Unleashing AI Potential with NVIDIA DGX Cloud on AWS
This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA.
The AI revolution is upon us, and at its forefront is the powerful synergy between NVIDIA's GPU expertise and Amazon Web Services (AWS). The launch of NVIDIA DGX Cloud on AWS represents a pivotal step in democratizing access to high-performance AI infrastructure. By streamlining performance, strengthening security, and offering unmatched flexibility, the platform is set to redefine how organizations approach AI innovation.
In this post, we walk through an end-to-end development workflow that uses NVIDIA DGX Cloud on AWS, Run:ai, and Amazon Bedrock Custom Model Import to fine-tune the open-source Llama 3.1 70B model. We explore how this collaboration can speed up training, reduce operational complexity, and open new business opportunities.
NVIDIA DGX Cloud on AWS: A Game Changer in AI Training
Organizations are increasingly focused on harnessing generative AI and agentic AI solutions to quickly derive business value. In response, AWS and NVIDIA have joined forces to create a fully managed, high-performance AI training platform—NVIDIA DGX Cloud on AWS. This platform offers short-term access to expansive GPU clusters optimized for fast training times and overall productivity.
The performance of DGX Cloud is bolstered by access to the latest NVIDIA architectures, including the upcoming Amazon EC2 P6e-GB200 UltraServer with the NVIDIA Grace Blackwell GB200 Superchip. Moreover, it offers continuous access to NVIDIA’s AI experts, ensuring 24/7 support and maximized return on investment (ROI).
Amazon Bedrock Custom Model Import: Simplifying Deployment
Built for flexibility and integration, Amazon Bedrock is a fully managed service featuring high-performing foundation models (FMs) from industry leaders. It allows organizations to explore a serverless experience, facilitating quick customization and deployment without complex infrastructure management.
Amazon Bedrock Custom Model Import takes this a step further, enabling users to access imported custom models seamlessly. By utilizing built-in capabilities, enterprise developers can accelerate generative AI application development while maintaining security and privacy.
Architecture Overview: Optimized for AI Workloads
DGX Cloud is purpose-built for customers who need to train or fine-tune models. Using Amazon EC2 p5.48xlarge instances, each equipped with eight NVIDIA H100 GPUs, the platform places nodes for optimal AI/ML workload performance, yielding lower latency and faster results. By combining Amazon EKS with NVIDIA software, DGX Cloud further streamlines Kubernetes cluster deployment and optimization.
DGX Cloud also provides private access options through AWS PrivateLink and AWS Transit Gateway. This allows customers secure and direct connections between their clusters and AWS accounts—reinforcing a robust architecture for AI workloads.
Setting Up Your DGX Cloud Cluster
Once access to your DGX Cloud cluster is secured, you can easily set up to run various workloads. A cluster administrator can create departments and projects, enabling effective quota management for users. With the flexibility of the Run:ai interface, users can seamlessly allocate GPUs and resources for their projects.
For example, you can launch an interactive Jupyter notebook workspace using the nvcr.io/nvidia/nemo:25.02 image to preprocess data and develop code. Connecting to Amazon S3 buckets in your AWS account enables efficient data transfer into and out of the workspace.
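As a concrete illustration of that S3 connection, the sketch below pulls a training dataset from an S3 bucket into the workspace filesystem. This is a minimal example, not part of the DGX Cloud tooling; the bucket name and prefix are hypothetical placeholders you would replace with your own.

```python
import os


def s3_key_to_local_path(key: str, prefix: str, dest_dir: str) -> str:
    """Map an S3 object key to a local path under dest_dir, preserving structure."""
    relative = key[len(prefix):].lstrip("/")
    return os.path.join(dest_dir, relative)


def download_dataset(bucket: str, prefix: str, dest_dir: str = "/workspace/data") -> None:
    """Download every object under the given prefix into the workspace."""
    import boto3  # lazy import: only needed when actually talking to AWS

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            local = s3_key_to_local_path(obj["Key"], prefix, dest_dir)
            os.makedirs(os.path.dirname(local), exist_ok=True)
            s3.download_file(bucket, obj["Key"], local)


# Usage (hypothetical bucket and prefix):
#   download_dataset("my-training-data", "llama-finetune/")
```

The paginator matters here: `list_objects_v2` returns at most 1,000 keys per call, so iterating pages keeps the sketch correct for large datasets.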
Fine-tuning the Llama 3.1 70B Model
After setting up your Jupyter notebook, you can start fine-tuning the Llama 3.1 70B model. Upload your datasets and use the NVIDIA NeMo framework to adapt the model to your specific tasks. Running on NVIDIA's infrastructure, this process is accelerated, allowing for rapid iteration.
Once your model is fine-tuned, conversion back to Hugging Face’s format simplifies the transfer to Amazon S3, preparing it for the final deployment phase.
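Once the checkpoint is in Hugging Face format, transferring it to S3 is a straightforward recursive upload. The sketch below shows one way to do this with boto3; the model directory, bucket, and key prefix are hypothetical and would be replaced with your own values.

```python
import os


def model_files_to_upload(model_dir: str, key_prefix: str):
    """Yield (local_path, s3_key) pairs for every file in the converted checkpoint."""
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            local = os.path.join(root, name)
            rel = os.path.relpath(local, model_dir)
            yield local, f"{key_prefix.rstrip('/')}/{rel}"


def upload_model(model_dir: str, bucket: str, key_prefix: str) -> None:
    """Upload the checkpoint directory to S3, preserving its layout."""
    import boto3  # lazy import: only needed when actually uploading

    s3 = boto3.client("s3")
    for local, key in model_files_to_upload(model_dir, key_prefix):
        s3.upload_file(local, bucket, key)


# Usage (hypothetical paths):
#   upload_model("/workspace/llama-3.1-70b-hf", "my-model-bucket", "models/llama-3.1-70b")
```

Preserving the directory layout matters because Amazon Bedrock Custom Model Import expects the standard Hugging Face file set (config, tokenizer, and safetensors weights) under a single S3 prefix.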
Importing Your Custom Model to Amazon Bedrock
To deploy your fine-tuned model, import it via the Amazon Bedrock console. Once your model files are configured, you can also specify a customer-managed AWS KMS key for additional security. Monitoring the import job is straightforward, giving you visibility throughout the process.
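The same import can be started programmatically with the Bedrock `CreateModelImportJob` API. The sketch below assembles the request and kicks off the job; the job name, role ARN, and S3 URI are hypothetical placeholders, and the optional KMS key corresponds to the custom encryption setting mentioned above.

```python
def import_job_params(job_name, model_name, role_arn, s3_uri, kms_key_arn=None):
    """Assemble the request for Amazon Bedrock's CreateModelImportJob API."""
    params = {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,  # IAM role with read access to the model bucket
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }
    if kms_key_arn:
        # Optional customer-managed key for encrypting the imported model
        params["importedModelKmsKeyId"] = kms_key_arn
    return params


def start_import(params):
    """Create the import job and return its ARN for status polling."""
    import boto3  # lazy import: only needed when actually calling AWS

    bedrock = boto3.client("bedrock")
    return bedrock.create_model_import_job(**params)["jobArn"]


# Usage (hypothetical values):
#   job_arn = start_import(import_job_params(
#       "llama-ft-import", "llama-3-1-70b-finetuned",
#       "arn:aws:iam::123456789012:role/BedrockImportRole",
#       "s3://my-model-bucket/models/llama-3.1-70b/"))
```

You can then poll the job with `get_model_import_job(jobIdentifier=job_arn)` until its status reaches `Completed`.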
Inference with Amazon Bedrock
The Amazon Bedrock playground lets you test and integrate your newly imported model efficiently. Its visual interface supports quick experimentation, making it simpler to tune inference parameters before integrating the model into applications.
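Beyond the playground, you can invoke the imported model through the Bedrock Runtime API. The sketch below assumes the Meta Llama request schema (`prompt`, `max_gen_len`, `temperature`); a custom imported model follows its own native schema, so adjust the body accordingly. The model ARN is a hypothetical placeholder.

```python
import json


def llama_request_body(prompt: str, max_gen_len: int = 512, temperature: float = 0.5) -> str:
    """Build a JSON invocation body using the Meta Llama schema (an assumption here)."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
    })


def invoke(model_arn: str, prompt: str) -> dict:
    """Send the prompt to the imported model and return the parsed response."""
    import boto3  # lazy import: only needed when actually calling AWS

    runtime = boto3.client("bedrock-runtime")
    response = runtime.invoke_model(
        modelId=model_arn,  # for imported models, this is the model ARN
        body=llama_request_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())


# Usage (hypothetical ARN):
#   result = invoke("arn:aws:bedrock:us-east-1:123456789012:imported-model/abc123",
#                   "Summarize the benefits of GPU clusters for fine-tuning.")
```

Note that `modelId` accepts the imported model's ARN rather than a base model identifier, which is what routes the request to your custom model.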
Cleanup and Conclusion
To avoid ongoing costs, remember to delete the resources created during your workflow. This means removing the imported model and any generated AWS KMS keys.
In conclusion, the integration of NVIDIA DGX Cloud on AWS with Amazon Bedrock Custom Model Import results in a powerful solution for the development, fine-tuning, and operationalization of generative and agentic AI applications. This collaboration allows organizations to minimize overhead while fostering rapid innovation.
Are you ready to accelerate your AI initiatives? Start exploring NVIDIA DGX Cloud on AWS today, and don’t forget to check out the examples available in the dgxc-benchmarking GitHub repository!
Resources
For additional insights, explore the AWS documentation and NVIDIA newsletters to stay updated on the latest developments in AI technologies.
About the Authors
Vara Bonthu
Principal Open Source Specialist SA at AWS, tackling open-source initiatives and enabling diverse organizations in AI/ML and Kubernetes.
Chad Elias
Senior Solutions Architect at AWS, specializing in infrastructure modernization and a passionate contributor to open-source projects.
Brian Kreitzer
Partner Solutions Architect at AWS, focused on technical co-sell opportunities and evangelizing cloud solutions.
Timothy Ma
Principal Specialist in generative AI at AWS, assisting customers in deploying cutting-edge machine learning solutions.
Andrew Liu
Manager of DGX Cloud Technical Marketing Engineering at NVIDIA, showcasing the capabilities of DGX Cloud through various use cases.
Chelsea Isaac
Senior Solutions Architect for DGX Cloud at NVIDIA, dedicated to helping enterprises scale their AI solutions in the cloud.
Zoey Zhang
Technical Marketing Engineer with expertise in integrating machine learning models in cloud environments.
Charlie Huang
Senior Product Marketing Manager for Cloud AI at NVIDIA, responsible for bringing NVIDIA DGX Cloud to market.