
Enhance Generative AI Workflows with NVIDIA DGX Cloud on AWS and Custom Model Import via Amazon Bedrock


NVIDIA DGX Cloud on AWS: A Game Changer in AI Training

Amazon Bedrock Custom Model Import: Simplifying Deployment

Architecture Overview: Optimized for AI Workloads

Setting Up Your DGX Cloud Cluster

Fine-tuning the Llama 3.1-70B Model

Importing Your Custom Model to Amazon Bedrock

Inference with Amazon Bedrock

Cleanup and Conclusion

Resources

About the Authors


This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA.

The AI revolution is upon us, and at the forefront of this shift is the powerful synergy between NVIDIA’s GPU expertise and Amazon Web Services (AWS). The launch of DGX Cloud on AWS represents a pivotal step in democratizing access to high-performance AI infrastructure. By improving performance, strengthening security, and offering unparalleled flexibility, this platform is set to redefine how organizations approach AI innovation.

In this post, we delve into an end-to-end development workflow that leverages NVIDIA DGX Cloud on AWS, Run:ai, and Amazon Bedrock Custom Model Import to fine-tune the open-source Llama 3.1-70B model. We will explore how this collaboration can expedite training, reduce operational complexity, and open new business prospects.

NVIDIA DGX Cloud on AWS: A Game Changer in AI Training

Organizations are increasingly focused on harnessing generative AI and agentic AI solutions to quickly derive business value. In response, AWS and NVIDIA have joined forces to create a fully managed, high-performance AI training platform—NVIDIA DGX Cloud on AWS. This platform offers short-term access to expansive GPU clusters optimized for fast training times and overall productivity.

The performance of DGX Cloud is bolstered by access to the latest NVIDIA architectures, including the upcoming Amazon EC2 P6e-GB200 UltraServer with the NVIDIA Grace Blackwell GB200 Superchip. Moreover, it offers continuous access to NVIDIA’s AI experts, ensuring 24/7 support and maximized return on investment (ROI).

Amazon Bedrock Custom Model Import: Simplifying Deployment

Built for flexibility and integration, Amazon Bedrock is a fully managed service featuring high-performing foundation models (FMs) from industry leaders. It allows organizations to explore a serverless experience, facilitating quick customization and deployment without complex infrastructure management.

Amazon Bedrock Custom Model Import takes this a step further, enabling users to access imported custom models seamlessly. By utilizing built-in capabilities, enterprise developers can accelerate generative AI application development while maintaining security and privacy.

Architecture Overview: Optimized for AI Workloads

DGX Cloud is meticulously designed for customers needing to train or fine-tune models. Built on Amazon EC2 p5.48xlarge instances, each with eight NVIDIA H100 GPUs, the platform places nodes close together for low-latency GPU-to-GPU networking, yielding faster training runs. By employing Amazon EKS and NVIDIA software, DGX Cloud further streamlines Kubernetes cluster deployment and optimization.

DGX Cloud also provides private access options through AWS PrivateLink and AWS Transit Gateway. This allows customers secure and direct connections between their clusters and AWS accounts—reinforcing a robust architecture for AI workloads.

Setting Up Your DGX Cloud Cluster

Once access to your DGX Cloud cluster is secured, you can quickly set it up to run a variety of workloads. A cluster administrator can create departments and projects, enabling effective quota management for users. Through the Run:ai interface, users can then seamlessly allocate GPUs and other resources to their projects.

For example, you can launch an interactive Jupyter notebook workspace from the nvcr.io/nvidia/nemo:25.02 image to preprocess data and manage code. Connecting the workspace to Amazon S3 buckets in your AWS account allows for efficient data handling.
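Workspace launches like the one above can also be scripted against the Run:ai CLI. The sketch below assembles a `runai submit` invocation; the flag names follow common Run:ai CLI conventions but are assumptions here, so verify them against the CLI version deployed on your cluster (the project name is a placeholder):

```python
# Sketch: assemble a Run:ai CLI command for an interactive Jupyter
# workspace. Flag names are assumptions -- check them against the
# Run:ai CLI version on your DGX Cloud cluster.

def build_runai_submit(name: str, image: str, gpus: int, project: str) -> list[str]:
    """Return the argv list for an interactive workspace submission."""
    return [
        "runai", "submit", name,
        "--project", project,      # Run:ai project created by the cluster admin
        "--image", image,          # e.g. the NeMo container from NGC
        "--gpu", str(gpus),        # GPUs to allocate from the project quota
        "--interactive",           # interactive workload rather than a batch job
    ]

cmd = build_runai_submit(
    name="llama-ft-notebook",
    image="nvcr.io/nvidia/nemo:25.02",  # NeMo image referenced in this post
    gpus=1,
    project="demo-project",             # hypothetical project name
)
print(" ".join(cmd))
```

Keeping the invocation in a helper like this makes it easy to submit the same workspace configuration repeatedly as project quotas or images change.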

Fine-tuning the Llama 3.1-70B Model

After setting up your Jupyter notebook, you can start fine-tuning the Llama 3.1-70B model. Upload your datasets and leverage the NVIDIA NeMo framework to optimize the model for specific tasks. Running on NVIDIA’s infrastructure expedites this process, allowing for rapid iteration.
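To make the fine-tuning step concrete, the sketch below assembles an illustrative supervised fine-tuning configuration in the spirit of NeMo's Hydra-style configs. The keys and values are illustrative assumptions, not NeMo's exact schema, which varies by version; consult the NeMo documentation for the real config structure:

```python
# Sketch: an illustrative fine-tuning configuration, loosely modeled on
# NeMo's Hydra-style configs. Keys and values are assumptions for
# illustration only -- NeMo's actual schema is versioned and differs.

def build_finetune_config(model_path: str, train_file: str, num_gpus: int) -> dict:
    """Return a nested config dict for a supervised fine-tuning run."""
    return {
        "trainer": {
            "devices": num_gpus,     # GPUs per node allocated via Run:ai
            "precision": "bf16",     # bfloat16 is typical on H100-class GPUs
            "max_steps": 1000,
        },
        "model": {
            "restore_from_path": model_path,  # base Llama 3.1-70B checkpoint (placeholder path)
            "tensor_model_parallel_size": 8,  # shard layers across the 8 GPUs of a p5.48xlarge
            "data": {"train_ds": {"file_names": [train_file]}},
        },
    }

cfg = build_finetune_config("/models/llama31-70b.nemo", "/data/train.jsonl", 8)
print(cfg["model"]["tensor_model_parallel_size"])
```

Tensor parallelism of 8 mirrors one full p5.48xlarge node; a 70B-parameter model typically needs multiple nodes, so real runs would also set pipeline or data parallelism.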

Once your model is fine-tuned, conversion back to Hugging Face’s format simplifies the transfer to Amazon S3, preparing it for the final deployment phase.
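Before uploading to Amazon S3, it is worth checking that the converted checkpoint contains the files Bedrock Custom Model Import expects, i.e. Hugging Face-format safetensors weights plus config and tokenizer files. The helper below is a simplified sketch of such a check; the required-file list is an assumption, so consult the Bedrock documentation for the authoritative requirements:

```python
# Sketch: sanity-check a converted Hugging Face model directory before
# uploading it to S3 for Bedrock Custom Model Import. The required-file
# set is a simplified assumption -- see the Bedrock docs for the
# authoritative list.

REQUIRED_FILES = {"config.json", "tokenizer_config.json"}

def validate_hf_export(file_names: list[str]) -> dict:
    """Report missing required files and whether safetensors weights exist."""
    names = set(file_names)
    return {
        "missing": sorted(REQUIRED_FILES - names),
        "has_weights": any(n.endswith(".safetensors") for n in names),
    }

report = validate_hf_export(
    ["config.json", "tokenizer_config.json", "model-00001-of-00030.safetensors"]
)
print(report)  # → {'missing': [], 'has_weights': True}
```

Running a check like this in the notebook before the (slow) multi-gigabyte upload saves a failed import job later.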

Importing Your Custom Model to Amazon Bedrock

To deploy your fine-tuned model, import it via the Amazon Bedrock console. When configuring the import, you can optionally specify a customer managed AWS KMS key for additional security. Monitoring the import job from the console is straightforward, ensuring transparency throughout the process.
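Imports can also be driven programmatically through Bedrock's CreateModelImportJob API. The sketch below builds the request and shows, in comments, how it would be passed to boto3; the bucket name and IAM role ARN are placeholders:

```python
import json

# Sketch: build the request for Amazon Bedrock's CreateModelImportJob API.
# The S3 URI and IAM role ARN are placeholders for illustration.

def build_import_job_request(job_name: str, model_name: str,
                             role_arn: str, s3_uri: str) -> dict:
    """Assemble the keyword arguments for bedrock.create_model_import_job."""
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,  # role with read access to the model bucket
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }

req = build_import_job_request(
    job_name="llama31-70b-import",
    model_name="llama31-70b-finetuned",
    role_arn="arn:aws:iam::111122223333:role/BedrockImportRole",  # placeholder
    s3_uri="s3://my-model-bucket/llama31-70b-hf/",                # placeholder
)
print(json.dumps(req, indent=2))

# With boto3 (not executed here, requires AWS credentials):
#   bedrock = boto3.client("bedrock")
#   job = bedrock.create_model_import_job(**req)
#   bedrock.get_model_import_job(jobIdentifier=job["jobArn"])  # poll until complete
```

Scripting the import this way makes it repeatable across iterations of the fine-tuned checkpoint.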

Inference with Amazon Bedrock

The Amazon Bedrock playground lets you test your newly imported model interactively. Its visual interface supports experimentation, making it simpler to tune inference settings before integrating the model into applications.
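Beyond the playground, the imported model can be invoked through the Bedrock Runtime InvokeModel API. The sketch below builds a request body; the field names ("prompt", "max_gen_len", "temperature") follow the Llama convention on Bedrock but are assumptions for an imported custom model, so match them to your model's expected schema:

```python
import json

# Sketch: build an InvokeModel request body for an imported model.
# Field names follow the Llama convention on Bedrock but are
# assumptions here -- match them to your model's expected schema.

def build_invoke_body(prompt: str, max_tokens: int = 256,
                      temperature: float = 0.5) -> str:
    """Serialize the request body for bedrock-runtime invoke_model."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_tokens,
        "temperature": temperature,
    })

body = build_invoke_body("Summarize the benefits of GPU clusters.")
print(json.loads(body)["max_gen_len"])  # → 256

# With boto3 (not executed here, requires AWS credentials):
#   runtime = boto3.client("bedrock-runtime")
#   resp = runtime.invoke_model(modelId="<imported-model-arn>", body=body)
#   print(json.loads(resp["body"].read()))
```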

Cleanup and Conclusion

To avoid ongoing costs, remember to delete the resources created during your workflow. This means removing the imported model and any generated AWS KMS keys.
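The cleanup above can be sketched in code. The boto3 calls (`delete_imported_model`, `schedule_key_deletion`) are shown in comments so the helper runs without AWS credentials; the model identifier and KMS key ID are placeholders:

```python
# Sketch: enumerate the cleanup actions for this workflow. The boto3
# calls appear only in comments so this runs without AWS credentials;
# the model identifier and KMS key ID are placeholders.

def cleanup_plan(model_id: str, kms_key_id: str) -> list[tuple[str, str]]:
    """Return (service, action) pairs describing what to delete."""
    return [
        ("bedrock", f"delete_imported_model(modelIdentifier='{model_id}')"),
        ("kms", f"schedule_key_deletion(KeyId='{kms_key_id}', PendingWindowInDays=7)"),
    ]

for service, action in cleanup_plan("llama31-70b-finetuned", "1234abcd-placeholder"):
    print(f"{service}: {action}")

# With boto3 (not executed here):
#   boto3.client("bedrock").delete_imported_model(modelIdentifier="llama31-70b-finetuned")
#   boto3.client("kms").schedule_key_deletion(KeyId="...", PendingWindowInDays=7)
```

Note that KMS keys are not deleted immediately; scheduling deletion starts a waiting period (seven days in this sketch) during which the key can still be recovered.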

In conclusion, the integration of NVIDIA DGX Cloud on AWS with Amazon Bedrock Custom Model Import results in a powerful solution for the development, fine-tuning, and operationalization of generative and agentic AI applications. This collaboration allows organizations to minimize overhead while fostering rapid innovation.

Are you ready to accelerate your AI initiatives? Start exploring NVIDIA DGX Cloud on AWS today, and don’t forget to check out the examples available in the dgxc-benchmarking GitHub repository!

Resources

For additional detail, see the AWS documentation for Amazon Bedrock Custom Model Import and NVIDIA’s DGX Cloud resources to stay current on the latest developments in AI technologies.

About the Authors

Vara Bonthu

Principal Open Source Specialist SA at AWS, tackling open-source initiatives and enabling diverse organizations in AI/ML and Kubernetes.

Chad Elias

Senior Solutions Architect for AWS, specializing in modernizing infrastructures with passionate contributions to open-source projects.

Brian Kreitzer

Partner Solutions Architect at AWS, focused on technical co-sell opportunities and evangelizing cloud solutions.

Timothy Ma

Principal Specialist in generative AI at AWS, assisting customers in deploying cutting-edge machine learning solutions.

Andrew Liu

Manager of DGX Cloud Technical Marketing Engineering at NVIDIA, showcasing the capabilities of DGX Cloud through various use cases.

Chelsea Isaac

Senior Solutions Architect for DGX Cloud at NVIDIA, dedicated to helping enterprises scale their AI solutions in the cloud.

Zoey Zhang

Technical Marketing Engineer at NVIDIA with expertise in integrating machine learning models in cloud environments.

Charlie Huang

Senior Product Marketing Manager for Cloud AI at NVIDIA, responsible for bringing NVIDIA DGX Cloud to market.
