Hugging Face Releases SmolVLA Open Source AI Model For Robotics Workflows

Hugging Face’s SmolVLA is a vision language action (VLA) model with 450 million parameters.

Written by Akash Dutta, Edited by David Delima | Updated: 5 June 2025 19:59 IST
Highlights
  • SmolVLA works with affordable robotics hardware such as SO-100 and SO-101
  • The AI model supports real-time robotics workflows
  • Hugging Face claimed the model can outperform larger VLA models

The company said the model is trained on compatibly licensed, open-source community-shared datasets

Photo Credit: Unsplash/Aideal Hwa

Hugging Face on Tuesday released SmolVLA, an open source vision language action (VLA) artificial intelligence (AI) model. The model is aimed at robotics workflows and training-related tasks. The company claims that the AI model is small and efficient enough to run locally on a computer with a single consumer GPU, or on a MacBook. The New York, US-based AI model repository also claimed that SmolVLA can outperform models that are much larger than it. The AI model is currently available to download.

Hugging Face's SmolVLA AI Model Can Run Locally on a MacBook

According to Hugging Face, advancements in robotics have been slow, despite the growth in the AI space. The company says that this is due to a lack of high-quality and diverse data, and a lack of large language models (LLMs) that are designed for robotics workflows.

VLAs have emerged as a solution to one of the problems, but most of the leading models from companies such as Google and Nvidia are proprietary and are trained on private datasets. As a result, the larger robotics research community, which relies on open-source data, faces major bottlenecks in reproducing or building on these AI models, the post highlighted.

These VLA models can take in images, videos, or a direct camera feed, interpret the real-world conditions, and then carry out a prompted task using robotics hardware.

Hugging Face says SmolVLA addresses both pain points currently faced by the robotics research community: it is an open-source, robotics-focused model that is trained on an open dataset from the LeRobot community. SmolVLA is a 450 million parameter AI model that can run on a desktop computer with a single compatible GPU, or even on one of the newer MacBook devices.
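To put the 450 million parameter figure in context, the rough arithmetic below estimates how much memory the weights alone would occupy at common numeric precisions. This is a back-of-the-envelope sketch, not a figure from Hugging Face: the function name is illustrative, and the estimate ignores activations, caches, and framework overhead, so actual usage will be higher.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 450_000_000  # parameter count reported for SmolVLA

# Bytes per parameter for two common storage precisions.
PRECISIONS = {"float32": 4, "float16/bfloat16": 2}

for name, size in PRECISIONS.items():
    print(f"{name}: about {weight_memory_gb(NUM_PARAMS, size):.2f} GB")
```

Under these assumptions the weights come to roughly 1.8 GB in 32-bit precision and 0.9 GB in 16-bit precision, which is consistent with the claim that the model fits on a single consumer GPU.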

In terms of architecture, SmolVLA is built on the company's vision language models (VLMs). It consists of a SigLIP vision encoder and a language decoder (SmolLM2). Visual information is extracted by the vision encoder, while natural language prompts are tokenised and fed into the decoder.

When dealing with movement or physical action (executing the task via robotic hardware), sensorimotor signals are condensed into a single token. The decoder then combines all of this information into a single stream and processes it together. This enables the model to understand the real-world data and the task at hand in context, rather than as separate entities.
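The idea of a single combined stream can be illustrated with a toy sketch. It is not SmolVLA's actual code: the embedding width, the number of tokens, the random values standing in for encoder output, the linear projection used for the robot state, and the order of concatenation are all assumptions made for illustration.

```python
import random

EMBED_DIM = 8  # hypothetical embedding width, chosen only for this demo
STATE_DIM = 6  # hypothetical number of joint readings from the robot


def random_tokens(count, rng):
    """Stand-in for encoder output: `count` embedding vectors."""
    return [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(count)]


def project_state(state, weights):
    """Linearly project the whole sensorimotor state into one token."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def build_stream(vision_tokens, text_tokens, state_token):
    """Concatenate all modalities into one sequence for the decoder."""
    return vision_tokens + text_tokens + [state_token]


rng = random.Random(0)
vision_tokens = random_tokens(4, rng)  # pretend image features
text_tokens = random_tokens(3, rng)    # pretend prompt embeddings
weights = [[rng.uniform(-1, 1) for _ in range(STATE_DIM)] for _ in range(EMBED_DIM)]
robot_state = [0.1, -0.4, 0.0, 0.7, 0.2, -0.1]

stream = build_stream(vision_tokens, text_tokens, project_state(robot_state, weights))
print(len(stream), "tokens of width", len(stream[0]))
```

Because every token in the stream has the same width, the decoder can attend across images, text, and robot state at once instead of handling each input separately.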

SmolVLA then passes this processed information to another component called the action expert, which figures out what action to take. The action expert is a transformer-based architecture with 100 million parameters. It predicts a series of future moves for the robot (walking steps, arm movements, etc.), also known as action chunks.
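A minimal sketch of how action chunks might be consumed by a control loop is shown below. The `predict_chunk` function is a hypothetical stand-in for the action expert, the chunk size is arbitrary, and the one-dimensional "robot" is a toy; none of these names come from Hugging Face's release.

```python
from collections import deque

CHUNK_SIZE = 5  # hypothetical number of future actions predicted per call


def predict_chunk(observation, chunk_size=CHUNK_SIZE):
    """Stand-in for the action expert: returns `chunk_size` future actions."""
    return [observation + step + 1 for step in range(chunk_size)]


def run_episode(start, num_steps):
    """Execute queued actions one by one, asking for a new chunk only when empty."""
    queue = deque()
    observation = start
    executed = []
    model_calls = 0
    for _ in range(num_steps):
        if not queue:
            queue.extend(predict_chunk(observation))
            model_calls += 1
        action = queue.popleft()
        executed.append(action)
        observation = action  # toy robot: it reaches the commanded position exactly
    return executed, model_calls


executed, model_calls = run_episode(start=0, num_steps=12)
print(executed)
print("model calls:", model_calls)
```

In this toy run, 12 control steps need only 3 calls to the predictor, which illustrates why predicting several moves at once can reduce how often the model has to run.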

While it applies to a niche demographic, those working with robotics can download the open weights, datasets, and training recipes to either reproduce or build on the SmolVLA model. Additionally, robotics enthusiasts who have access to a robotic arm or similar hardware can also download these to run the model and try out real-time robotics workflows.

 
