Mistral Announces Pixtral 12B Multimodal AI Model With 'Computer Vision' Feature

Mistral’s Pixtral 12B AI model can accept images as input and answer queries about them.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 12 September 2024 13:49 IST
Highlights
  • Pixtral 12B cannot generate images
  • Mistral’s new AI model has a size of 24GB
  • Pixtral 12B will also be available on le Chat and la Plateforme soon

Pixtral 12B is built on Mistral’s Nemo 12B large language model

Photo Credit: Unsplash/Solen Feyissa

Mistral released its first multimodal artificial intelligence (AI) model, dubbed Pixtral 12B, on Wednesday. The AI firm, known for its open-source large language models (LLMs), has also made the model available on GitHub and Hugging Face for users to download and test. Notably, despite being multimodal, Pixtral only accepts images as input: it processes them using computer vision technology and answers queries about them. Two special components, a vision encoder and a vision adapter, have been added for this functionality. It cannot generate images the way tools such as Stable Diffusion or Midjourney can.

Mistral Releases Pixtral 12B

In keeping with its reputation for minimalist announcements, Mistral released the AI model through its official account on X (formerly known as Twitter) by simply sharing a magnet link. The total file size of Pixtral 12B is 24GB, and running the model will require an NPU-enabled PC or one with a powerful GPU.
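The 24GB figure is roughly what one would expect from a 12-billion-parameter model if each weight is stored as a 16-bit number. That storage format is an assumption here, not something stated in the announcement. A back-of-the-envelope check, with a hypothetical helper name, might look like this:

```python
# Rough estimate of a model's file size from its parameter count.
# Assumption: each weight takes 16 bits (2 bytes). The announcement
# does not state the storage format, so treat this as illustrative.

def estimated_size_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Return the approximate size in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 10**9


if __name__ == "__main__":
    params = 12_000_000_000  # the "12B" in Pixtral 12B
    print(f"{estimated_size_gb(params):.0f} GB at 16 bits per weight")
    print(f"{estimated_size_gb(params, 4):.0f} GB at 32 bits per weight")
```

The estimate ignores file metadata and any extra parameters in the vision components, so it should be read as an approximation only.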

Pixtral 12B comes with 12 billion parameters and is built on the company's existing Nemo 12B AI model. According to the details shared with the release, the model uses the Gaussian Error Linear Unit (GeLU) activation function in its vision adapter and 2D Rotary Position Embedding (RoPE) in its vision encoder.
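For readers unfamiliar with GeLU, it is an activation function commonly defined as the input multiplied by the standard normal cumulative distribution evaluated at that input. The sketch below computes this textbook definition in plain Python. It illustrates the function itself and is not taken from Pixtral's code.

```python
import math


def gelu(x: float) -> float:
    """Exact GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Large negative inputs are pushed towards zero, while large
    # positive inputs pass through almost unchanged.
    for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"gelu({value:+.1f}) = {gelu(value):+.4f}")
```

Unlike a hard cutoff at zero, GeLU changes smoothly, which is one reason it is widely used in transformer-style models.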


Users can upload image files or image URLs to Pixtral 12B, and it should be able to answer queries about the image, such as identifying objects, counting them, and sharing additional information. Since it is built on Nemo, the model should also be adept at typical text-based tasks.

A Reddit user posted an image showing the benchmark scores of Pixtral 12B, and it appears that the model outperforms Claude 3 Haiku and Phi-3 Vision in multimodal capabilities on the ChartQA benchmark. It also outperforms both rival AI models on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark for multimodal knowledge and reasoning.

Citing a company spokesperson, TechCrunch reports that the Mistral AI model can be fine-tuned and used under an Apache 2.0 licence. This means the model and its outputs can be used for both personal and commercial purposes. Additionally, Sophia Yang, the Head of Developer Relations at Mistral, clarified in a post that Pixtral 12B will soon be available on Le Chat and La Plateforme.


For now, users can download the AI model directly using the magnet link provided by the company. Alternatively, the model weights are also hosted on Hugging Face and GitHub.

 

© Copyright Red Pixels Ventures Limited 2026. All rights reserved.