Google Introduces Gemini 3.1 Flash-Lite as Its Fastest and Most Cost-Efficient AI Model

The Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API in AI Studio and Vertex AI.

Written by Akash Dutta, Edited by Rohan Pal | Updated: 5 March 2026 15:32 IST
Highlights
  • The AI model costs $0.25 per million input tokens
  • Gemini 3.1 Flash-Lite costs $1.50 per million output tokens
  • Google claims the new model outperforms 2.5 Flash in response speed

Gemini 3.1 Flash-Lite achieved an Elo score of 1432 on the Arena.ai Leaderboard

Photo Credit: Google

Google introduced the Gemini 3.1 Flash-Lite artificial intelligence (AI) model on Thursday. Calling it the fastest and most cost-efficient AI model in the Gemini 3 series, the Mountain View-based tech giant said it is designed for high-volume developer workloads. The model is not currently available to end users and is offered only to developers and enterprises via specific channels. The company also claimed that the model's output speed is higher than that of Gemini 2.5 Flash. Notably, the Gemini 3.1 Flash-Lite is currently available only in preview.

Gemini 3.1 Flash-Lite Is Here

In a blog post, the tech giant announced and detailed its latest Gemini 3.1 series large language model (LLM). Currently, the Gemini 3.1 Flash-Lite can be accessed in preview via the Gemini application programming interface (API) in Google AI Studio, and via Vertex AI for enterprises.


Coming to capabilities, the company said the 3.1 Flash-Lite outperforms 2.5 Flash with a “2.5X faster Time to First Answer Token” and a 45 percent increase in output speed, citing the Artificial Analysis benchmark. The model is also said to have achieved an Elo score of 1432 on the Arena.ai leaderboard, and Google claims it outperforms GPT-5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast in terms of output speed.
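
To get a rough sense of what the two quoted ratios could mean for a single response, the sketch below models total response time as time to first token plus output tokens divided by output speed. This model is a simplifying assumption, not something Google has published. The article quotes only ratios, so the baseline figures in the example (1.0 second to first token, 200 tokens per second) are hypothetical placeholders, and "2.5X faster" is read here as the baseline time divided by 2.5.

```python
def response_times(baseline_ttft_s, baseline_tokens_per_s, output_tokens,
                   ttft_speedup=2.5, output_speed_gain=0.45):
    """Return (baseline_seconds, improved_seconds) for one response.

    Assumed model: total time = time to first token + tokens / output speed.
    ttft_speedup and output_speed_gain default to the ratios quoted in the
    article; the baseline values are supplied by the caller.
    """
    baseline = baseline_ttft_s + output_tokens / baseline_tokens_per_s
    improved_ttft = baseline_ttft_s / ttft_speedup
    improved_speed = baseline_tokens_per_s * (1 + output_speed_gain)
    improved = improved_ttft + output_tokens / improved_speed
    return baseline, improved


if __name__ == "__main__":
    # Hypothetical baseline, chosen only for illustration.
    base, new = response_times(baseline_ttft_s=1.0,
                               baseline_tokens_per_s=200.0,
                               output_tokens=1000)
    print(f"baseline: {base:.2f} s, with quoted ratios applied: {new:.2f} s")
```

Under these assumptions, the gain on long responses is dominated by the output-speed increase, while short responses benefit mostly from the faster first token.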

In AI Studio and Vertex AI, developers will be able to access the LLM in standard and thinking modes, with the latter allowing users to control the thinking time for a task. Highlighting some use cases, Google said the model can handle high-volume translation and content moderation, and can also be used for complex tasks, such as generating user interfaces and dashboards, creating simulations, or just following instructions.


The company also claimed that the Gemini 3.1 Flash-Lite is a cost-efficient AI model, priced at $0.25 (roughly Rs. 23) per million input tokens and $1.50 (roughly Rs. 137) per million output tokens. In comparison, the Gemini 2.5 Flash costs $0.30 (roughly Rs. 27.5) per million input tokens and $2.50 (roughly Rs. 229) per million output tokens.
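
The per-million-token prices quoted above translate into workload costs with simple arithmetic. The following is a minimal sketch, assuming cost scales linearly with token counts and ignoring any discounts, caching, or tiered pricing. The dictionary keys are illustrative labels rather than official API model identifiers, and the token volumes in the example are hypothetical.

```python
# Prices in USD per one million tokens, as quoted in the article.
# The keys are illustrative labels, not official API model identifiers.
PRICES_PER_MILLION = {
    "Gemini 3.1 Flash-Lite": {"input": 0.25, "output": 1.50},
    "Gemini 2.5 Flash": {"input": 0.30, "output": 2.50},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from its token counts."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10 million input and 2 million output tokens.
    for name in PRICES_PER_MILLION:
        cost = estimate_cost(name, 10_000_000, 2_000_000)
        print(f"{name}: ${cost:.2f}")
```

For this hypothetical workload, the estimate comes to $5.50 on the Gemini 3.1 Flash-Lite and $8.00 on the Gemini 2.5 Flash.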

 

© Copyright Red Pixels Ventures Limited 2026. All rights reserved.