Stable Video Diffusion

Stable Video Diffusion is designed to serve a wide range of video applications in fields such as media, entertainment, education, and marketing. It empowers individuals to transform text and image inputs into vivid scenes and turn concepts into live-action, cinematic creations.


Stable Video Diffusion Specifications

Stable Video Diffusion is released in the form of two image-to-video models, capable of generating 14 and 25 frames at customizable frame rates between 3 and 30 frames per second.
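
To make the relationship between frame count and frame rate concrete, here is a minimal Python sketch. It assumes playback length is simply the number of frames divided by the frame rate, with no frame interpolation or looping. The function name `clip_duration_seconds` is hypothetical and is not part of any Stability AI API.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length of a clip, assuming duration = frames / fps.

    Hypothetical helper for illustration only; the 3-30 FPS bounds are the
    range stated in the specifications above.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 3 <= fps <= 30:
        raise ValueError("fps must be between 3 and 30")
    return num_frames / fps


if __name__ == "__main__":
    # The two released models generate 14 and 25 frames respectively.
    for num_frames in (14, 25):
        for fps in (3, 7, 30):
            seconds = clip_duration_seconds(num_frames, fps)
            print(f"{num_frames} frames at {fps} FPS -> {seconds:.2f} s")
```

Under this assumption, 14 frames at 7 FPS play for 2 seconds and 25 frames at 5 FPS play for 5 seconds, so lower frame rates give longer clips from the same number of generated frames.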

At the time of release, we found that these models, in their foundational form, surpassed the leading closed models in user preference studies.

Video duration: 2-5 seconds
Frame rate: up to 30 FPS (frames per second)
Processing time: 2 minutes or less


Build with Stability AI

Stability AI licenses offer flexibility for your generative AI needs by combining our range of state-of-the-art open models with self-hosting benefits.