Introducing Stable Video Diffusion
Today, we are releasing Stable Video Diffusion, our first foundation model for generative video based on the image model Stable Diffusion.
Now available in research preview, this state-of-the-art generative AI video model marks a significant step in our journey toward creating models of every type for everyone.
With this research release, we have made the code for Stable Video Diffusion available in our GitHub repository, and the weights required to run the model locally can be found on our Hugging Face page. Further details on the model's technical capabilities are available in our research paper.
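For those who want to experiment with the weights locally, the sketch below shows one possible route using the Hugging Face diffusers library's StableVideoDiffusionPipeline. The repository name, input resolution, and parameters are assumptions based on the Hugging Face release rather than details stated in this post.

```python
# Minimal sketch of running Stable Video Diffusion locally via diffusers.
# The model ID and settings below are assumptions, not official guidance.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the image-to-video weights (assumed repository name on Hugging Face).
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# Condition generation on a single still image (resized to a landscape resolution).
image = load_image("input.png").resize((1024, 576))

# Generate a short clip; decode_chunk_size trades VRAM for decoding speed.
frames = pipe(image, decode_chunk_size=8).frames[0]

# Write the generated frames to disk as an MP4 clip.
export_to_video(frames, "generated.mp4", fps=7)
```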
Adaptable to Numerous Video Applications
Our video model can be easily adapted to various downstream tasks, including multi-view synthesis from a single image with finetuning on multi-view datasets. We are planning a variety of models that build on and extend this base, similar to the ecosystem that has grown around Stable Diffusion.
In addition, you can sign up today for our waitlist here to access an upcoming web experience featuring a Text-to-Video interface. This tool showcases the practical applications of Stable Video Diffusion in numerous sectors, including advertising, education, entertainment, and beyond.
Competitive in Performance
Stable Video Diffusion is released in the form of two image-to-video models, capable of generating 14 and 25 frames respectively at customizable frame rates between 3 and 30 frames per second. At the time of release, in their foundational form, external evaluation shows these models surpass the leading closed models in user preference studies.
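As a rough illustration of these controls, the snippet below (assuming the same diffusers pipeline sketched earlier) shows how the frame count and frame-rate conditioning might be set. The exact parameter names and checkpoint split are assumptions, not specifications from this announcement.

```python
# Frame-count and frame-rate controls, assuming the pipeline loaded above.
frames = pipe(
    image,
    num_frames=25,        # 14 for the base model, 25 for the larger checkpoint (assumed)
    fps=7,                # frame-rate conditioning within the advertised 3-30 fps range
    decode_chunk_size=8,  # trade VRAM for decoding speed
).frames[0]

export_to_video(frames, "generated_25_frames.mp4", fps=7)
```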
Exclusively for Research
While we eagerly update our models with the latest advancements and work to incorporate your feedback, we emphasize that this model is not intended for real-world or commercial applications at this stage. Your insights and feedback on safety and quality are important to refining this model for its eventual release.
This aligns with our previous releases in new modalities, and we look forward to sharing the full release with you all.
Our Ever-Expanding Suite of AI Models
Stable Video Diffusion is a proud addition to our diverse range of open-source models. Spanning modalities including image, language, audio, 3D, and code, our portfolio is a testament to Stability AI's dedication to amplifying human intelligence.
Stay updated on our progress by signing up for our newsletter and discovering more about commercial applications by contacting us here.
Follow us on Twitter, Instagram, LinkedIn, and join our Discord Community.