Search

Hugging Face Introduces Compact Versions of SmolVLM Vision Language Model That Can Run on Consumer Laptops

Hugging Face claimed that the SmolVLM-256M is the world’s smallest vision language model.

Advertisement
Highlights
  • The new SmolVLM models are available in 256M and 500M parameter sizes
  • SmolVLM can analyse images and process visual information at high speeds
  • The open-source models are available with an Apache 2.0 licence
Hugging Face Introduces Compact Versions of SmolVLM Vision Language Model That Can Run on Consumer Laptops

Hugging Face introduced the SmolVLM 2B model in December 2024

Photo Credit: Hugging Face

Hugging Face introduced two new variants to its SmolVLM vision language models last week. The new artificial intelligence (AI) models are available in 256 million and 500 million parameter sizes, with the former being claimed as the world's smallest vision model by the company. The new variants focus on retaining the efficiency of the older two-billion parameter model while reducing the size significantly. The company highlighted that the new models can be locally run on constrained devices, consumer laptops, or even potentially browser-based inference.

Hugging Face Introduces Smaller SmolVLM AI Models

In a blog post, the company announced the SmolVLM-256M and SmolVLM-500M vision language models, in addition to the existing 2 billion parameter model. The release brings two base models and two instruction fine-tuned models in the abovementioned parameter sizes.

Hugging Face said that these models can be loaded directly to transformers, Machine Learning Exchange (MLX), and Open Neural Network Exchange (ONNX) platforms and developers can build on top of the base models. Notably, these are open-source models available with an Apache 2.0 licence for both personal and commercial usage.

With the new AI models, Hugging Face aims to bring multimodal models focused on computer vision to portable devices. The 256 million parameter model, for instance, can be run on less than one GB of GPU memory and 15GB of RAM to process 16 images per second (with a batch size of 64).

Andrés Marafioti, a machine learning research engineer at Hugging Face told VentureBeat, “For a mid-sized company processing 1 million images monthly, this translates to substantial annual savings in compute costs.”

To reduce the size of the AI models, the researchers switched the vision encoder from the previous SigLIP 400M to a 93M-parameter SigLIP base patch. Additionally, the tokenisation was also optimised. The new vision models encode images at a rate of 4096 pixels per token, compared to 1820 pixels per token in the 2B model.

Notably, the smaller models are also marginally behind the 2B model in terms of performance, but the company said this trade-off has been kept at a minimum. As per Hugging Face, the 256M variant can be used for captioning images or short videos, answering questions about documents, and basic visual reasoning tasks.

Developers can use transformers and MLX for inference and fine-tuning the AI model as they work with the old SmolVLM code out-of-the-box. These models are also listed on Hugging Face.

For the latest tech news and reviews, follow Gadgets 360 on X, Facebook, WhatsApp, Threads and Google News. For the latest videos on gadgets and tech, subscribe to our YouTube channel. If you want to know everything about top influencers, follow our in-house Who'sThat360 on Instagram and YouTube.

 
Show Full Article
Please wait...
Advertisement

Related Stories

Popular Mobile Brands
  1. Moto G96 5G Launched in India With 50-Megapixel Sony Lytia 700C Camera
  2. OnePlus Nord 5, OnePlus Nord CE 5 Launched in India at These Prices
  3. OnePlus Nord 5 Review
  4. Samsung Galaxy Buds 3 Pro's Amazon Prime Day 2025 Offer Revealed
  5. AI+ Pulse, AI+ Nova 5G With 50-Megapixel Rear Cameras Launched in India
  6. Oppo Reno 14 Gets a New Variant With a Colour Changing Rear Panel
  7. Google Pixel Phones Receiving Monthly Software Update for July 2025
  8. Nothing Phone 3 Review: Enters the Big League With a Big Price
  9. OnePlus Nord CE 5 Review
  10. Ai+ Wearbuds Smartwatch Launched in India With Built-In TWS Earbuds
  1. Ghost of Yotei Is Getting a Gameplay Deep Dive at a State of Play Livestream This Week
  2. Moto G96 5G Launched in India With Snapdragon 7s Gen 2 SoC, 50-Megapixel Sony Lytia 700C Rear Camera
  3. Realme 15 Pro 5G Confirmed to Get Snapdragon 7 Gen 4 SoC, GT Boost 3.0 for Gaming
  4. Intel's Arrow Lake Refresh With Upgraded NPU to Bring Microsoft Copilot+ Features to Desktop PCs: Report
  5. Zoom, Meta Join Forces To Roll Out Standalone App for Its VR Headsets
  6. iPhone 17 Air Dummy Unit Surfaces in Hands-on Video, Showcasing Thin Design
  7. OnePlus Pad Lite Launched With 11-Inch Display, 9,340mAh Battery: Price, Specifications
  8. Gmail Announces Manage Subscriptions View for Decluttering Inbox on Android, iOS and Web
  9. Google Pixel Phones Receiving Android 16-Based Monthly Software Update for July 2025: What’s New
  10. Samsung Galaxy Unpacked 2025 Event Today: Galaxy Z Fold 7, Z Flip 7 Launch Expected, How to Watch Livestream
Gadgets 360 is available in
Download Our Apps
App Store App Store
Available in Hindi
App Store
© Copyright Red Pixels Ventures Limited 2025. All rights reserved.
Trending Products »
Latest Tech News »