Stable Diffusion 3 Medium is a two-billion-parameter image generation model.

Written by Akash Dutta, Edited by David Delima | Updated: 13 June 2024 14:34 IST
Photo Credit: Hugging Face/ Stability AI

Stability AI says the Stable Diffusion 3 Medium is its most advanced text-to-image AI model

Highlights
  • Stable Diffusion 3 Medium’s open model weights are hosted on Hugging Face
  • Reportedly, the AI model has a minimum requirement of 5GB of GPU VRAM
  • Stable Diffusion 3 Medium is also available via API and Stable Artisan
Stability AI on Wednesday released a smaller version of its Stable Diffusion 3 (SD3) artificial intelligence (AI) model. The company introduced the smaller text-to-image model, dubbed Stable Diffusion 3 Medium, as its most advanced image generation model. While retaining all the functionality of the larger generative AI model, the latest tool has lower GPU requirements and consumes less power than previous models. The open weights have also been made available on Hugging Face, and the company says that this AI model can run efficiently on consumer PCs and laptops.

Stability AI Introduces Stable Diffusion 3 Medium

While the Stable Diffusion 3 model (now called Stable Diffusion 3 Large) became publicly available in April, its high GPU and compute requirements made it difficult for most people with a consumer-grade PC or laptop to run it efficiently. The company is addressing this problem by offering Stable Diffusion 3 Medium, which it says can run on most laptops and PCs.

According to a report by VentureBeat, the minimum requirement for the AI model is 5GB of GPU VRAM and the recommended requirement is 16GB of GPU VRAM. Notably, the Nvidia GeForce RTX 3090 features 24GB of GDDR6X VRAM.
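As a rough illustration of how these reported figures could be applied, the Python sketch below sorts a GPU into a tier based on its VRAM. The two thresholds are the figures cited above. The function name `classify_vram` and the tier labels are hypothetical, chosen only for this example, and are not part of any Stability AI tooling.

```python
# Reported figures cited in the article: 5GB minimum, 16GB recommended.
MIN_VRAM_GB = 5
RECOMMENDED_VRAM_GB = 16


def classify_vram(vram_gb: float) -> str:
    """Return a hypothetical tier label for a GPU with the given VRAM in GB."""
    if vram_gb < 0:
        raise ValueError("VRAM cannot be negative")
    if vram_gb < MIN_VRAM_GB:
        return "below minimum"
    if vram_gb < RECOMMENDED_VRAM_GB:
        return "meets minimum"
    return "meets recommended"


if __name__ == "__main__":
    # The GeForce RTX 3090 mentioned above has 24GB of VRAM.
    print(classify_vram(24))
```

By this measure, a 24GB card such as the RTX 3090 sits above the recommended figure, while an 8GB laptop GPU would meet only the minimum.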

Despite the smaller size of two billion parameters (as opposed to SD3 Large's eight billion parameters), Stability AI said in a newsroom post that the Stable Diffusion 3 Medium will be able to show a similar level of efficiency as its larger counterpart. The latest image generation model will deliver detailed photorealistic outputs as well as high-quality outputs in flexible styles. To improve realism in hands and faces, the AI firm is using a 16-channel VAE (Variational Autoencoder).
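To put the two parameter counts in perspective, the back-of-the-envelope Python sketch below estimates the memory needed to hold the model weights alone. It assumes 16-bit precision (two bytes per parameter), which is an assumption for illustration and not a figure from Stability AI. It also ignores text encoders, the VAE, and working memory used during generation, so real-world usage will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights alone, in gigabytes (10^9 bytes).

    bytes_per_param defaults to 2, an assumed 16-bit precision.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts as stated in the article.
medium_gb = weight_memory_gb(2e9)  # SD3 Medium: two billion parameters
large_gb = weight_memory_gb(8e9)   # SD3 Large: eight billion parameters

if __name__ == "__main__":
    print(f"SD3 Medium weights: about {medium_gb:.0f}GB")
    print(f"SD3 Large weights: about {large_gb:.0f}GB")
```

Under that assumption, the Medium model's weights come to roughly 4GB against roughly 16GB for the Large model, a fourfold difference that mirrors the gap in parameter counts.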

Prompt adherence will also be at the same level as the larger model. SD3 Medium can understand complex prompts that include spatial reasoning, compositional elements, actions, and styles. Further, typography, which has been a common pitfall of image generation models, has also been improved in the latest AI model, added the company.

Stable Diffusion 3 Medium has been made generally available via the company's Fireworks AI-powered API (Application Programming Interface). The text-to-image AI model can also be accessed via the Stable Assistant platform or the Stable Artisan Discord server. Further, the open weights have been made available under a non-commercial licence on Hugging Face. To use the model for commercial purposes, users will have to get a creator licence from the company.

