Introducing Stable Audio Open - An Open Source Model for Audio Samples and Sound Design
Key Takeaways:
Stable Audio Open is an open source text-to-audio model for generating up to 47 seconds of samples and sound effects.
Users can create drum beats, instrument riffs, ambient sounds, foley and production elements.
The model enables audio variations and style transfer of audio samples.
We’re excited to announce Stable Audio Open, an open source model optimised for generating short audio samples, sound effects and production elements using text prompts. This release marks a key milestone as we further open portions of our generative audio capabilities to empower sound designers, musicians and creative communities.
What is Stable Audio Open?
Stable Audio Open allows anyone to generate up to 47 seconds of high-quality audio data from a simple text prompt. Its specialised training makes it ideal for creating drum beats, instrument riffs, ambient sounds, foley recordings and other audio samples for music production and sound design.
A key benefit of this open source release is that users can fine-tune the model on their own custom audio data. For example, a drummer could fine-tune on samples of their own drum recordings to generate new beats.
How is it Different from Stable Audio?
Our commercial Stable Audio product produces high-quality, full tracks of up to three minutes in length with coherent musical structure. It also offers advanced capabilities such as audio-to-audio generation and coherent multi-part musical compositions.
Stable Audio Open, on the other hand, specialises in audio samples, sound effects and production elements. While it can generate short musical clips, it is not optimised for full songs, melodies or vocals. This open model provides a glimpse into generative AI for sound design while prioritising responsible development alongside creative communities.
The new model was trained on audio data from Freesound and the Free Music Archive. This allowed us to create an open audio model while respecting creator rights.
Getting Started
The Stable Audio Open model weights are available on Hugging Face. We encourage sound designers, musicians, developers and audio enthusiasts to download the model, explore its capabilities and provide feedback.
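As a rough illustration of the prompt-plus-duration workflow described above, the Python sketch below prepares a text prompt and a clip length, clamping the length to the 47-second limit. It does not load or call the model. The names ClipRequest, make_request and num_samples are hypothetical helpers invented for this example, and the 44.1 kHz sample rate is an assumption; the actual value should be read from the configuration shipped with the weights.

```python
from dataclasses import asdict, dataclass

MAX_SECONDS = 47.0    # upper bound stated in this announcement
SAMPLE_RATE = 44_100  # assumption: output sample rate in Hz; check the model config


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical container for one text-to-audio request."""
    prompt: str
    seconds_start: float
    seconds_total: float


def make_request(prompt: str, seconds: float) -> ClipRequest:
    """Validate the prompt and clamp the duration to the 47-second limit."""
    text = prompt.strip()
    if not text:
        raise ValueError("prompt must not be empty")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return ClipRequest(text, 0.0, min(float(seconds), MAX_SECONDS))


def num_samples(request: ClipRequest, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of audio frames per channel needed for the requested duration."""
    return round(request.seconds_total * sample_rate)


if __name__ == "__main__":
    # A 60-second request is clamped to 47 seconds.
    req = make_request("128 BPM tech house drum loop", seconds=60)
    print(asdict(req))
    print(num_samples(req))
```

Clamping rather than rejecting an over-long request is a design choice made for this sketch; a stricter tool might raise an error instead so that the user notices the limit.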
While this release is an exciting step forward, it is still just the beginning for open and responsible audio generation capabilities. We look forward to continuing research and prioritising development hand-in-hand with creative communities. Let the open exploration of AI audio begin!
To stay updated on our progress, follow us on Twitter, Instagram and LinkedIn, and join our Discord Community.