Introducing Stable LM 2 12B

Key takeaways

  • Stable LM 2 12B is a pair of powerful 12 billion parameter language models trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch, released as both a base model and an instruction-tuned variant. You can now try Stable LM 2 12B live here.

  • Both models are available for testing on Hugging Face (base & chat) and can be used non-commercially, as well as commercially with a Stability AI Membership (a minimal loading sketch follows this list).

  • This release also includes an update to Stable LM 2 1.6B that improves its conversational skills in all seven of the aforementioned languages and adds tool usage and function calling.

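For developers who want to try the checkpoints right away, the sketch below shows one way to load the chat model with the Hugging Face transformers library. The repository ID, dtype, and generation settings are assumptions for illustration; consult the model cards on Hugging Face for the exact checkpoint names and recommended usage.

```python
# Minimal sketch of loading the instruction-tuned model with Hugging Face transformers.
# The repository ID "stabilityai/stablelm-2-12b-chat" is assumed here; check the
# Stability AI organization on Hugging Face for the exact base and chat checkpoint names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-12b-chat"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 12B weights within ~24 GB
    device_map="auto",           # requires `accelerate`; older transformers may also need trust_remote_code=True
)

# The chat checkpoint ships a chat template, so messages can be formatted
# with apply_chat_template before generation.
messages = [{"role": "user", "content": "Summarize the Stable LM 2 12B release in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```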
Introducing the latest additions to our Stable LM 2 language model series: a 12 billion parameter base model and an instruction-tuned variant, trained on 2 trillion tokens in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. This medium-sized model balances strong performance, efficiency, memory requirements, and speed, following our established Stable LM 2 1.6B framework as detailed in our previously released technical report. With this release, we’re extending our model range, offering a transparent and powerful tool for developers to innovate in AI language technology. Soon, we plan to introduce a long-context variant of these models which will be available on Hugging Face upon release.


Today, we are also releasing a new version of Stable LM 2 1.6B, improving its conversational abilities in the same seven languages while retaining its remarkably low system requirements. The original release of Stable LM 2 1.6B has already achieved a leading position on the Open LLM Leaderboard, demonstrating its exceptional performance for its size.


Capabilities

Stable LM 2 12B is designed as an efficient open model tailored for multilingual tasks, with smooth performance on widely available hardware. It can handle a variety of tasks that are typically feasible only for significantly larger models, such as large Mixture-of-Experts (MoE) models, which often require substantial compute and memory. Moreover, the instruction-tuned version performs well at tool usage and function calling, making it well suited to a range of uses, including as a central component of retrieval-augmented generation (RAG) systems.
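As a rough illustration of how the instruct model's function calling could slot into a RAG-style pipeline, the sketch below describes a hypothetical retrieval tool in JSON, exposes it to the model through the system prompt, and parses the returned call. The tool schema, prompt wording, and JSON reply format are assumptions for illustration only; the exact function-calling format expected by Stable LM 2 12B is documented in its model card on Hugging Face.

```python
import json

# Hypothetical retrieval tool exposed to the model; the schema layout is an
# assumption, not the format mandated by the Stable LM 2 chat template.
search_tool = {
    "name": "search_documents",
    "description": "Retrieve passages relevant to a query from a document store.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Search query."}},
        "required": ["query"],
    },
}

messages = [
    {
        "role": "system",
        "content": (
            "You can call tools. Available tools:\n"
            f"{json.dumps(search_tool)}\n"
            'Reply with JSON of the form {"tool": <name>, "arguments": {...}} when a tool is needed.'
        ),
    },
    {"role": "user", "content": "What does the Stable LM 2 technical report say about training data?"},
]

# Reuses the tokenizer and model loaded in the earlier sketch.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

try:
    call = json.loads(reply)
    print("Tool requested:", call["tool"], "with arguments", call["arguments"])
except json.JSONDecodeError:
    print("Model answered directly:", reply)
```

In a full RAG loop, the parsed arguments would be passed to the real retrieval backend and the returned passages fed back to the model as a follow-up message.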


Performance

We compare Stable LM 2 12B to other popular, strong language models such as Mixtral (MoE, 13B active parameters out of 47B total), Llama 2 (13B & 70B), Qwen 1.5 (14B), Gemma (8.5B), and Mistral (7B). As shown below, the new Stable LM 2 12B offers solid performance when tested on zero-shot and few-shot tasks across general benchmarks outlined in the Open LLM Leaderboard and (the newly corrected) MT-Bench.

[Benchmark charts: MT-Bench (Inflection-corrected); Open LLM Leaderboard (instruct models); Open LLM Leaderboard (base models); 0-shot NLP tasks (base models)]

With this new release, we extend the Stable LM 2 family of models into the 12B category, providing an open and transparent model that makes no compromise on power and accuracy. We are confident it will enable developers and businesses to continue building the future while retaining full control over their data.

Stable LM 2 12B can now be used for commercial and non-commercial purposes with a Stability AI Membership. Learn more about commercial applications by contacting us here.

To stay updated on our progress, follow us on Twitter, Instagram, LinkedIn, and join our Discord Community.
