Introducing NVIDIA Nemotron 3 Nano: Unleashing Serverless AI Innovation on Amazon Bedrock
This collaborative post details NVIDIA’s latest advancements in generative AI with the launch of the Nemotron 3 Nano model, now available in Amazon Bedrock, highlighting its features, capabilities, and application use cases for various industries.
Co-written with Abdullahi Olaoye, Curtice Lockhart, and Nirmal Kumar Juluru from NVIDIA.
We are thrilled to announce the availability of NVIDIA’s Nemotron 3 Nano as a fully managed, serverless model within Amazon Bedrock. Following our initial unveiling of the NVIDIA Nemotron 2 Nano 9B and 12B models at AWS re:Invent, this new release represents another significant step in helping organizations leverage generative AI without the complexities of managing infrastructure.
Accelerating Generative AI Development
With the NVIDIA Nemotron open models available on Amazon Bedrock, developers can move from prototype to production without managing infrastructure. Through Amazon Bedrock's serverless inference, businesses can combine Nemotron's capabilities with a set of managed features and tools that simplify development workflows.
About Nemotron 3 Nano
The NVIDIA Nemotron 3 Nano is a small language model (SLM) built on a hybrid Mixture-of-Experts (MoE) architecture, boasting exceptional compute efficiency and accuracy. This model is fully open, complete with open weights, datasets, and recipes that reinforce transparency for developers and enterprises.
Key Specifications:
- Architecture: Hybrid Transformer-Mamba with Mixture-of-Experts
- Model Size: 30 billion parameters, with 3 billion active at any given time
- Context Length: Up to 256K tokens
- Input/Output: text in, text out
Nemotron 3 Nano excels at coding, scientific reasoning, mathematics, tool calling, instruction following, and interactive chat, outpacing other models of similar size on benchmarks such as SWE-Bench Verified and IFBench.
Model Benchmarks
Nemotron 3 Nano ranks near the top of the Artificial Analysis Openness Index relative to its Intelligence Index score, reflecting its fully open weights, datasets, and training recipes. This openness not only boosts developer confidence but also enables straightforward auditing and governance.
(Chart showing Nemotron 3 Nano’s position in the Openness vs. Intelligence Index)
It scores 52 points on the Intelligence vs. Output Speed Index, a substantial improvement over its predecessor, Nemotron 2 Nano. The model is built for rapid inference, which is vital for agentic AI applications.
Use Cases Across Industries
The versatility of Nemotron 3 Nano allows it to power diverse applications, including:
- Finance: Streamlining loan processing by evaluating data, identifying fraud, and reducing risks.
- Cybersecurity: Enhancing threat detection and automating vulnerability assessments.
- Software Development: Empowering code summarization and generation.
- Retail: Enriching customer experiences through real-time, personalized recommendations.
Getting Started with NVIDIA Nemotron 3 Nano in Amazon Bedrock
To begin exploring Nemotron 3 Nano, follow these steps:
- Open the Amazon Bedrock console and navigate to the Chat/Text playground.
- Select NVIDIA from the model category list, then choose NVIDIA Nemotron 3 Nano.
- Click Apply to load the model and start generating text instantly.
For example, you can prompt it to generate a Python unit test with the following command:
Write a pytest unit test suite for a function called calculate_mortgage(principal, rate, years).
Given this prompt, the model uses its reasoning capabilities to produce a complete, runnable test suite.
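The prompt leaves calculate_mortgage undefined, so for reference, a hand-written suite of the kind the model is asked to generate might look like the following. Both the function (assuming the standard fixed-rate amortization formula) and the tests are illustrative sketches, not actual model output:

```python
import math

def calculate_mortgage(principal, rate, years):
    """Monthly payment for a fixed-rate loan; rate is the annual rate (e.g. 0.06)."""
    n = years * 12                      # total number of monthly payments
    if rate == 0:                       # zero-interest edge case: simple division
        return principal / n
    r = rate / 12                       # monthly interest rate
    return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

# pytest discovers plain assert-based functions named test_*
def test_typical_loan():
    # $300,000 at 6% over 30 years is a textbook ~$1,798.65/month
    assert math.isclose(calculate_mortgage(300_000, 0.06, 30), 1798.65, abs_tol=0.01)

def test_zero_interest():
    # With no interest, the payment is just principal divided by months
    assert calculate_mortgage(120_000, 0.0, 10) == 1000.0

def test_shorter_term_costs_more_per_month():
    assert calculate_mortgage(200_000, 0.05, 15) > calculate_mortgage(200_000, 0.05, 30)
```

Comparing model output against a reference like this is a quick way to judge whether the generated tests cover the edge cases that matter.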
Programmatic Access via AWS CLI and SDKs
The model can be invoked programmatically using model ID nvidia.nemotron-nano-3-30b. Below is an example of how to invoke this model using the AWS CLI:
aws bedrock-runtime invoke-model \
--model-id nvidia.nemotron-nano-3-30b \
--region us-west-2 \
--body '{"messages": [{"role": "user", "content": "Type_Your_Prompt_Here"}], "max_tokens": 512, "temperature": 0.5, "top_p": 0.9}' \
--cli-binary-format raw-in-base64-out \
invoke-model-output.txt
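The same call can be made from an AWS SDK; here is a minimal boto3 sketch that mirrors the CLI example above (the prompt text is a placeholder, and the region and model ID are taken from that example):

```python
import json

MODEL_ID = "nvidia.nemotron-nano-3-30b"

def build_request_body(prompt, max_tokens=512, temperature=0.5, top_p=0.9):
    """Serialize the same messages-style payload used in the CLI example."""
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    })

if __name__ == "__main__":
    import boto3  # requires AWS credentials with Amazon Bedrock access

    client = boto3.client("bedrock-runtime", region_name="us-west-2")
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_request_body("Explain mixture-of-experts in two sentences."),
    )
    # The response body is a streaming blob; read and parse it as JSON
    print(json.loads(response["body"].read()))
```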
You can also interact with the model through the OpenAI-compatible endpoint using the OpenAI SDK.
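For teams already on the OpenAI SDK, the switch amounts to pointing the client at Bedrock. A hedged sketch follows; the endpoint URL shape and the Bedrock API key environment variable are assumptions based on Bedrock's OpenAI-compatible Chat Completions support, so verify both against the Amazon Bedrock documentation for your region:

```python
import os

MODEL_ID = "nvidia.nemotron-nano-3-30b"

def bedrock_openai_base_url(region):
    """OpenAI-compatible Chat Completions endpoint exposed by Amazon Bedrock.
    The URL shape is an assumption -- confirm it in the Bedrock docs."""
    return f"https://bedrock-runtime.{region}.amazonaws.com/openai/v1"

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url=bedrock_openai_base_url("us-west-2"),
        api_key=os.environ["AWS_BEARER_TOKEN_BEDROCK"],  # a Bedrock API key
    )
    completion = client.chat.completions.create(
        model=MODEL_ID,
        messages=[{"role": "user", "content": "Summarize what an SLM is."}],
        max_tokens=256,
    )
    print(completion.choices[0].message.content)
```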
Enhanced Capabilities with Amazon Bedrock Features
Harness the full potential of Amazon Bedrock by integrating Nemotron 3 Nano with managed tools such as:
- Amazon Bedrock Guardrails: Implement safety features to filter harmful content and ensure responsible AI usage.
- Amazon Bedrock Knowledge Bases: Automate Retrieval-Augmented Generation (RAG) workflows to improve response accuracy and relevance.
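As a sketch of how a guardrail might be attached when calling Nemotron 3 Nano through the Bedrock Converse API: the guardrail identifier below is a placeholder for one you create in your own account, and the prompt is purely illustrative:

```python
def build_converse_request(prompt, guardrail_id, guardrail_version="DRAFT"):
    """Assemble kwargs for bedrock-runtime's Converse API with a guardrail
    attached. guardrail_id is a placeholder from your own account."""
    return {
        "modelId": "nvidia.nemotron-nano-3-30b",
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
        },
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.5},
    }

if __name__ == "__main__":
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-west-2")
    response = client.converse(**build_converse_request(
        "Draft a polite loan-decision letter.", guardrail_id="YOUR_GUARDRAIL_ID"))
    print(response["output"]["message"]["content"][0]["text"])
```

Because the guardrail is applied server-side, the same configuration works unchanged if you later swap in a different Bedrock model.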
Conclusion
In this post, we showed how to get started with NVIDIA Nemotron 3 Nano in Amazon Bedrock. The model's launch is a significant step toward making generative AI practical across a range of fields and real-world workloads. We invite developers and enterprises alike to explore, build, and optimize their solutions with Nemotron 3 Nano.
The model is now available in several AWS Regions, including US East (N. Virginia), Asia Pacific (Tokyo), and Europe (Milan). For more information, visit the NVIDIA Nemotron page and try Nemotron 3 Nano in the Amazon Bedrock console today.