Revolutionizing Music Creation with Generative AI: How Splash Music is Leading the Charge
Generative AI is rapidly transforming the music industry, making it easier for creators of all skill levels to produce professional-quality tracks. Using advanced foundation models (FMs), artists can personalize compositions in real time, meeting the growing demand for unique, instantly generated content. To address this need, Splash Music joined forces with AWS to develop and scale music generation FMs, democratizing music creation for millions worldwide.
In this post, we’ll explore how Splash Music is setting new standards in AI-powered music creation through its advanced HummingLM model, leveraging AWS Trainium on Amazon SageMaker HyperPod.
The Challenge: Scaling Music Generation
Having already achieved over 600 million streams globally, Splash Music empowers a new generation of creators by making music production accessible, enjoyable, and relevant. However, the journey to unlock this creative freedom posed several challenges:
- Model Complexity and Scale: Splash Music developed HummingLM, a multi-billion-parameter model capable of capturing the nuances of human humming and converting creative ideas into music tracks. Achieving the fidelity necessary for studio-quality music required significant expansions in computing power and storage.
- Rapid Pace of Change: As the music industry rapidly evolves, Splash Music must consistently adapt by training and fine-tuning models to keep up with user expectations.
- Infrastructure Scaling: The previous reliance on external GPU clusters resulted in unpredictable costs and management challenges. The need for a more scalable and cost-effective infrastructure became clear.
Overview of HummingLM: The Innovative Foundation Model
HummingLM interprets and generates music through a multimodal approach. The model pairs a transformer-based large language model (LLM) with a specialized music encoder upsampler.
Key Innovations in HummingLM:
- Descript-Audio-Codec (DAC): This audio encoding technique provides compressed audio representations, capturing essential frequency and timbre characteristics.
- Transformative Capacity: HummingLM can turn hummed melodies into professional instrumental performances by fusing melodic intent with stylistic cues, allowing users to create high-fidelity tracks simply by humming a tune.
The dual-component architecture, an LLM paired with an upsampler, allows for faster model training and efficient adaptation to new genres and user preferences.
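To make the "compressed audio representations" of the Descript-Audio-Codec bullet more concrete: DAC is built on residual vector quantization (RVQ), where each stage quantizes the error left over by the previous stage, so a frame is stored as a short list of codebook indices. The following is a toy sketch of that idea in pure Python, not DAC's or HummingLM's actual implementation. The codebooks `COARSE` and `FINE`, the two-dimensional frame, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Codebook = Sequence[Sequence[float]]


def nearest(codebook: Codebook, vec: Sequence[float]) -> int:
    """Return the index of the codebook entry closest to vec (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(frame: Sequence[float], codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize a frame stage by stage; each stage encodes the remaining residual."""
    residual = list(frame)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct a frame by summing the selected entry from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy codebooks: a coarse stage and a finer correction stage.
COARSE = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
FINE = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]]
CODEBOOKS = [COARSE, FINE]

frame = [1.2, 0.9]  # stand-in for one latent audio frame
codes = rvq_encode(frame, CODEBOOKS)
recon = rvq_decode(codes, CODEBOOKS)
coarse_only = rvq_decode(codes[:1], CODEBOOKS[:1])
```

In this sketch the frame is stored as two small integers instead of two floats, and adding the second stage lowers the reconstruction error relative to `coarse_only`. A real codec learns its codebooks and operates on learned latent frames rather than hand-picked vectors.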
Accelerating Model Development with AWS Trainium on Amazon SageMaker HyperPod
To fast-track the development of HummingLM, Splash Music collaborated with AWS, utilizing Amazon SageMaker HyperPod and AWS Trainium chips. The architecture leverages best practices to optimize performance and scalability.
Key Stages of Model Development:
- Dataset Preparation:
  - Efficiently process large-scale audio datasets through a robust feature extraction pipeline.
  - Resample audio files and generate synthetic representations to enhance extracted features.
  - Isolate different musical components using an advanced stem separation system.
- Model Architecture and Optimization:
  - The HummingLM architecture divides responsibilities between the LLM for generating foundational structures and the upsampling component for high-fidelity audio.
  - Innovations such as flexible control signal design and zero-shot capability allow the model to adapt to unseen instrument presets rapidly.
- Training on AWS Trainium:
  - Optimize model training for scalability and efficiency by employing techniques like sequence and data parallelism.
  - Implement AWS Neuron optimizations to enhance the deployment process.
- Inference on AWS:
  - Deploy the model on Amazon Elastic Container Service (Amazon ECS), ensuring the efficient handling of user submissions and real-time audio processing.
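The data parallelism mentioned in the training stage can be illustrated with a minimal, framework-free sketch. It is not Splash Music's training code and does not use the AWS Neuron SDK: the linear model, the `BATCH` data, the learning rate, and every function name are hypothetical. The point it shows is that when a batch is split into equal shards, each worker computes a local gradient, and the gradients are averaged (the role an all-reduce plays on real hardware), the update matches a single-worker step on the full batch.

```python
from typing import List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pair


def gradient(w: float, b: float, batch: Sequence[Example]) -> Tuple[float, float]:
    """Gradient of mean squared error for the model y = w * x + b."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def shard(batch: Sequence[Example], num_workers: int) -> List[Sequence[Example]]:
    """Split a batch into equal-sized shards, one per worker."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    return [batch[i::num_workers] for i in range(num_workers)]


def all_reduce_mean(grads: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Average per-worker gradients; stands in for a hardware all-reduce."""
    n = len(grads)
    return sum(g[0] for g in grads) / n, sum(g[1] for g in grads) / n


def data_parallel_step(
    w: float, b: float, batch: Sequence[Example], num_workers: int, lr: float
) -> Tuple[float, float]:
    """One SGD step where every worker holds the same weights but a different shard."""
    local_grads = [gradient(w, b, s) for s in shard(batch, num_workers)]
    grad_w, grad_b = all_reduce_mean(local_grads)
    return w - lr * grad_w, b - lr * grad_b


# Hypothetical data drawn from y = 3x + 1.
BATCH = [(float(x), 3.0 * x + 1.0) for x in range(8)]

parallel = data_parallel_step(0.0, 0.0, BATCH, num_workers=4, lr=0.01)
single = data_parallel_step(0.0, 0.0, BATCH, num_workers=1, lr=0.01)
```

Here `parallel` and `single` agree up to floating-point rounding, which is why data parallelism can raise throughput without changing what the model learns. Sequence parallelism, also named above, instead splits work along the sequence dimension and is not covered by this sketch.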
Results and Impact
Through its collaboration with AWS, Splash Music has established a unified infrastructure, generating significant benefits:
- Automated and Scalable Training: SageMaker HyperPod manages orchestration, resource allocation, and fault recovery, reducing the need for manual setup and virtually eliminating downtime.
- Cost and Speed Enhancements: Splash Music decreased training costs by over 54% while doubling its training speed compared to previous GPU solutions, allowing for more extensive model iterations and quicker updates.
- Industry Recognition: Its innovations earned Splash Music a spotlight at the AWS Summit Sydney 2025 keynote, underscoring its leadership in the AI-driven music landscape.
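As a rough back-of-the-envelope reading of the cost and speed figures above, the short calculation below assumes the 54% reduction applies to the cost of one complete training run and that "doubling" means a 2x speedup in wall-clock time. Both are interpretations rather than statements from Splash Music or AWS, and the function names are hypothetical.

```python
def runs_per_old_run_cost(cost_reduction: float) -> float:
    """Training runs affordable for the former cost of one run."""
    if not 0.0 <= cost_reduction < 1.0:
        raise ValueError("cost_reduction must be in [0, 1)")
    return 1.0 / (1.0 - cost_reduction)


def wall_clock_fraction(speedup: float) -> float:
    """Fraction of the former wall-clock time one run now takes."""
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


runs = runs_per_old_run_cost(0.54)       # about 2.17 runs for the old cost of one
time_fraction = wall_clock_fraction(2.0)  # each run takes half the former time
```

Under these assumptions, the same budget covers a little more than twice as many training runs, and each run finishes in half the time, which is consistent with the claim of more extensive model iterations and quicker updates.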
Conclusion and Next Steps
Splash Music is changing how music creators translate their ideas into reality, enabling the generation of personalized tracks that resonate with audiences worldwide. By building on AWS services, including Amazon SageMaker HyperPod and AWS Trainium, Splash Music is positioned to further improve the creation experience.
Looking ahead, Splash Music aims to expand its training datasets tenfold and explore multimodal audio/video generation, continuing its commitment to pushing the boundaries of AI in music.
Ready to create your own music? Discover the transformative capabilities of Splash Music and explore Amazon SageMaker HyperPod and AWS Trainium today!