Date: 2024-04-18

Stable Diffusion 3 and Stable Diffusion 3 Turbo were unveiled in preview in February. Now, Stability AI is finally making the artificial intelligence (AI) text-to-image models available to some users. The company will let developers access the models through the Stability AI Developer Platform API. It has partnered with the API platform Fireworks AI to bring the models to the public. Notably, the company's next-generation image models come with improved text understanding and spelling capabilities.

Stability AI announced the limited availability of the AI models via a post in its newsroom, and said, “As revealed in the Stable Diffusion 3 research paper, this model is equal to or outperforms state-of-the-art text-to-image generation systems such as DALL-E 3 and Midjourney v6 in typography and prompt adherence, based on human preference evaluations.”

The new text-to-image models have two noteworthy upgrades. First, their understanding of the prompt text has improved. They can now better understand the context within a prompt and generate images that are closer to what the user wants. Second, their spelling capabilities have improved, which will help when a user wants to generate an image with written words in it. The company highlighted earlier that the AI will take a closer look at what is being written and offer better output. Overall image quality is also expected to improve.

These new AI models will also be open-sourced in the near future, at least to some extent. The company said that it will soon make the model weights available for self-hosting with a Stability AI Membership. Stability AI also explained that it used a new Multimodal Diffusion Transformer (MMDiT) architecture for the model.

Apart from the AI image generators, Stability AI also invited a limited number of users to participate in the early release of its Stable Assistant, which is currently in beta. The AI assistant is powered by Stable Diffusion 3 and by Stable LM 2 12B, which adds conversational capabilities. It can generate images from conversations, generate content, and improve content to match the generated image. It is not yet known when the company might release the new AI image models to all members.

