Introducing Stable Diffusion 3.5

Updated October 29th with release of Stable Diffusion 3.5 Medium

Key Takeaways:

  • Today we are introducing Stable Diffusion 3.5. This open release comprises multiple model variants: Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, and, as of October 29th, Stable Diffusion 3.5 Medium.

  • These models are highly customizable for their size, run on consumer hardware, and are free for both commercial and non-commercial use under the permissive Stability AI Community License.

  • You can download all Stable Diffusion 3.5 models from Hugging Face and the inference code on GitHub now.

Today we are releasing Stable Diffusion 3.5, our most powerful models yet. This open release includes multiple variants that are customizable, run on consumer hardware, and are available for use under the permissive Stability AI Community License. You can download Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo models from Hugging Face and the inference code on GitHub now. 

In June, we released Stable Diffusion 3 Medium, the first open release from the Stable Diffusion 3 series. This release didn't fully meet our standards or our communities’ expectations. After listening to the valuable community feedback, instead of a quick fix, we took the time to further develop a version that advances our mission to transform visual media. 

Stable Diffusion 3.5 reflects our commitment to empower builders and creators with tools that are widely accessible, cutting-edge, and free for most use cases. We encourage the distribution and monetization of work across the entire pipeline - whether it's fine-tuning, LoRA, optimizations, applications, or artwork.

What’s being released

Stable Diffusion 3.5 offers a variety of models developed to meet the needs of scientific researchers, hobbyists, startups, and enterprises alike:

  • Stable Diffusion 3.5 Large: At 8.1 billion parameters, with superior quality and prompt adherence, this base model is the most powerful in the Stable Diffusion family. This model is ideal for professional use cases at 1 megapixel resolution.

  • Stable Diffusion 3.5 Large Turbo: A distilled version of Stable Diffusion 3.5 Large, this model generates high-quality images with exceptional prompt adherence in just 4 steps, making it considerably faster than Stable Diffusion 3.5 Large.

  • Stable Diffusion 3.5 Medium: At 2.5 billion parameters, with improved MMDiT-X architecture and training methods, this model is designed to run “out of the box” on consumer hardware, striking a balance between quality and ease of customization. It can generate images at resolutions between 0.25 and 2 megapixels.
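As a rough illustration of how the three variants differ in practice, the sketch below maps each one to a Hugging Face repository and a set of sampling defaults, then generates a single image with the `diffusers` library's `StableDiffusion3Pipeline`. Only the 4-step figure for Large Turbo comes from this post. The repository ids, the other step counts, and the guidance values are assumptions that should be checked against the model cards, and the weights may require accepting the license on Hugging Face before download.

```python
"""Sketch: picking a Stable Diffusion 3.5 variant and generating one image."""

# Assumed repository ids and sampler defaults; verify against the model cards.
# Only the 4-step setting for Large Turbo is stated in the announcement.
VARIANTS = {
    "large": {
        "repo": "stabilityai/stable-diffusion-3.5-large",
        "steps": 28,
        "guidance": 3.5,
    },
    "large-turbo": {
        "repo": "stabilityai/stable-diffusion-3.5-large-turbo",
        "steps": 4,
        "guidance": 0.0,
    },
    "medium": {
        "repo": "stabilityai/stable-diffusion-3.5-medium",
        "steps": 40,
        "guidance": 4.5,
    },
}


def generate(variant, prompt, seed=0):
    """Load the chosen variant on a CUDA GPU and return one PIL image."""
    # Heavy imports are deferred so the table above can be read without a GPU.
    import torch
    from diffusers import StableDiffusion3Pipeline

    cfg = VARIANTS[variant]
    pipe = StableDiffusion3Pipeline.from_pretrained(
        cfg["repo"], torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")

    # A fixed seed makes a run repeatable; changing it varies the output.
    generator = torch.Generator("cuda").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=cfg["steps"],
        guidance_scale=cfg["guidance"],
        generator=generator,
    )
    return result.images[0]
```

On a machine with a suitable GPU, something like `generate("large-turbo", "a lighthouse at dusk, line art").save("lighthouse.png")` would write one image to disk. Deferring the imports keeps the settings table usable on its own, for example in a configuration check.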

Developing the models

In developing the models, we prioritized customizability to offer a flexible base to build upon. To achieve this, we integrated Query-Key Normalization into the transformer blocks, stabilizing the model training process and simplifying further fine-tuning and development.
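To make the idea concrete, here is a minimal sketch of Query-Key Normalization for a single query/key pair, written in plain Python. The post does not say which normalization is used, so the RMS normalization without a learned gain shown here is an assumption, and the vectors are made-up values chosen to mimic activations that have grown large during training.

```python
import math


def rms_normalize(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is about 1 (no learned gain)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def attention_logit(query, key, qk_norm=True):
    """Scaled dot-product attention logit for one query/key pair."""
    if qk_norm:
        # Normalizing both vectors bounds the logit regardless of their scale.
        query = rms_normalize(query)
        key = rms_normalize(key)
    dot = sum(q * k for q, k in zip(query, key))
    return dot / math.sqrt(len(query))


# Hypothetical activations that have drifted to large magnitudes.
q = [120.0, -80.0, 60.0, 40.0]
k = [90.0, -110.0, 30.0, 70.0]

raw = attention_logit(q, k, qk_norm=False)
normed = attention_logit(q, k, qk_norm=True)
print(f"without QK norm: {raw:.1f}, with QK norm: {normed:.3f}")
```

With normalization, the logit's magnitude cannot exceed the square root of the head dimension, so the softmax over attention scores is less likely to saturate. That bounded range is one plausible reason the technique stabilizes training and makes later fine-tuning more forgiving.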

To support this level of downstream flexibility, we had to make some trade-offs. The same prompt may produce more varied outputs across different seeds. This is intentional, as it helps preserve a broader knowledge base and more diverse styles in the base models. As a result, however, prompts that lack specificity might lead to less predictable outputs, and the aesthetic level may vary.

For the Medium model specifically, we made several adjustments to the architecture and training protocols to enhance quality, coherence, and multi-resolution generation abilities.

Where the models excel

Stable Diffusion 3.5 excels in the following areas, making it one of the most customizable and accessible image models on the market, while maintaining top-tier performance in prompt adherence and image quality:

  • Customizability: Easily fine-tune the model to meet your specific creative needs, or build applications based on customized workflows.

  • Efficient Performance: Optimized to run on standard consumer hardware without heavy demands, especially the Stable Diffusion 3.5 Medium and Stable Diffusion 3.5 Large Turbo models. 

    We compared the hardware requirements of Stable Diffusion 3.5 Medium with those of other open image base models. This model requires only 9.9 GB of VRAM (excluding text encoders) to unlock its full performance, making it highly accessible and compatible with most consumer GPUs.

  • Diverse Outputs: Creates images representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting. 

  • Versatile Styles: Capable of generating a wide range of styles and aesthetics like 3D, photography, painting, line art, and virtually any visual style imaginable.
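For a sense of scale behind the hardware claim in the list above, the parameter counts can be turned into a back-of-the-envelope estimate of weight memory. This is a sketch only: the bytes-per-parameter table is an assumption about storage formats, and the result covers the weights alone, not activations, the VAE, or the text encoders. The post does not break down its 9.9 GB figure, so the numbers here are not a substitute for it.

```python
# Assumed storage formats and their size per parameter, in bytes.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}

# Parameter counts as stated in the announcement.
PARAMS = {
    "Stable Diffusion 3.5 Large": 8.1e9,
    "Stable Diffusion 3.5 Medium": 2.5e9,
}


def weight_memory_gb(num_params, dtype="bf16"):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, count in PARAMS.items():
    print(f"{name}: ~{weight_memory_gb(count):.1f} GB of weights in bf16")
```

Under these assumptions, Medium's weights come to about 5.0 GB in bf16 and Large's to about 16.2 GB, which is consistent with Medium being the variant aimed at typical consumer GPUs.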

Additionally, our analysis shows that Stable Diffusion 3.5 Large leads the market in prompt adherence and rivals much larger models in image quality.

Stable Diffusion 3.5 Large Turbo offers some of the fastest inference times for its size, while remaining highly competitive in both image quality and prompt adherence, even when compared to non-distilled models of similar size.

Stable Diffusion 3.5 Medium outperforms other medium-sized models, offering a balance of prompt adherence and image quality, making it a top choice for efficient, high-quality performance.

The Stability AI Community License at a glance

We are pleased to release these models under our permissive community license. Here are the key components of the license:

  • Free for non-commercial use: Individuals and organizations can use the model free of charge for non-commercial use, including scientific research.  

  • Free for commercial use (up to $1M in annual revenue): Startups, small to medium-sized businesses, and creators can use the model for commercial purposes at no cost, as long as their total annual revenue is less than $1M.

  • Ownership of outputs: Retain ownership of the media generated without restrictive licensing implications.

For organizations with more than $1M in annual revenue, please contact us to inquire about an Enterprise License.

More ways to access the models

While the model weights are available on Hugging Face now for self-hosting, you can also access the models through hosted platforms.

Our commitment to safety

We believe in safe, responsible AI practices and take deliberate measures to ensure integrity starts at the early stages of development. This means we have taken, and continue to take, reasonable steps to prevent the misuse of Stable Diffusion 3.5 by bad actors. For more information about our approach to safety, please visit our Stable Safety page.

Coming soon

We will also launch ControlNets soon, providing advanced control features for a wide variety of professional use cases.

We look forward to hearing your feedback on Stable Diffusion 3.5 and seeing what you create with the models. You can share your thoughts directly with us through our feedback form.

To stay updated on our progress, follow us on X, LinkedIn, and Instagram, and join our Discord Community.
