Stable Diffusion Now Optimized for AMD Radeon™ GPUs and Ryzen™ AI APUs
Key Takeaways
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion family of models, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs.
AMD-optimized versions of Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, Stable Diffusion XL 1.0, and Stable Diffusion XL Turbo are now available on Hugging Face, suffixed with “_amdgpu”. End users can try out the AMD-optimized models using Amuse 3.0.
You can learn more about the technical details of these speed upgrades on AMD’s blog post.
We’ve collaborated with AMD to deliver select ONNX-optimized versions of the Stable Diffusion model family, engineered to run faster and more efficiently on AMD Radeon™ GPUs and Ryzen™ AI APUs. This joint engineering effort focused on maximizing inference performance without compromising model output quality or our open licensing.
The result is a set of accelerated models that integrate into any ONNX Runtime-supported environment, making it easy to drop them into your existing workflows right out of the box. Whether you’re deploying Stable Diffusion 3.5 (SD3.5), our most advanced image model, or Stable Diffusion XL Turbo (SDXL Turbo), these models are ready to power faster creative applications on AMD hardware.
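As an illustration, here is a minimal sketch of loading an ONNX Stable Diffusion XL pipeline through Hugging Face Optimum and ONNX Runtime. The repository id below is a placeholder rather than the exact published name of an “_amdgpu” model, and the DirectML execution provider is one common choice for Radeon GPUs on Windows; adjust both to match your setup.

```python
# Sketch: running an ONNX-exported SDXL pipeline with ONNX Runtime via Optimum.
# The repo id is a placeholder -- substitute the actual "_amdgpu"-suffixed model
# from Hugging Face. "DmlExecutionProvider" assumes the onnxruntime-directml build.
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

pipe = ORTStableDiffusionXLPipeline.from_pretrained(
    "your-org/stable-diffusion-xl-base-1.0_amdgpu",  # placeholder repo id
    provider="DmlExecutionProvider",                 # DirectML on AMD Radeon GPUs
)

image = pipe(
    prompt="a photo of an astronaut riding a horse on Mars",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

image.save("astronaut.png")
```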
As generative visual media adoption accelerates, it’s essential our models are optimized for leading hardware. This collaboration ensures builders and businesses can integrate Stable Diffusion into their production pipelines, making workflows faster, more efficient, and ready to scale.
Available models
AMD has optimized four models across SD3.5 and SDXL for improved performance.
SD3.5 models:
AMD-optimized SD3.5 models deliver up to 2.6x faster inference than the base PyTorch models.
SDXL models:
With AMD optimization, SDXL 1.0 and SDXL Turbo achieve up to 3.8x faster inference than the base PyTorch models.
Analysis compares the inference speed of the AMD-optimized models to the base PyTorch models. Testing was conducted using Amuse 3.0 RC and the AMD Adrenalin 24.30.31.05 driver (25.4.1 preview).
Get started
The AMD-optimized Stable Diffusion models are available now on Hugging Face, suffixed with “_amdgpu”. End users can also try out the AMD-optimized models using Amuse 3.0. You can learn more about the technical details of these speed upgrades on AMD’s blog post.
To stay updated on our progress, follow us on X, LinkedIn, and Instagram, and join our Discord Community.