Google Introduces Gemini 3.1 Flash-Lite as Its Fastest and Most Cost-Efficient AI Model

The Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API in AI Studio and Vertex AI.

Photo Credit: Google

Gemini 3.1 Flash-Lite achieved an Elo score of 1432 on the Arena.ai Leaderboard

Highlights
  • The AI model costs $0.25 per million input tokens
  • Gemini 3.1 Flash-Lite costs $1.50 per million output tokens
  • Google claims the new model outperforms Gemini 2.5 Flash in response speed

Google introduced the Gemini 3.1 Flash-Lite artificial intelligence (AI) model on Thursday. Calling it the fastest and most cost-efficient model in the Gemini 3 series, the Mountain View-based tech giant said it is designed for high-volume developer workloads. The model is not yet available to end users; it is reserved for developers and enterprises, who can currently access it only in preview through specific channels. The company also claimed that the model's output speed is higher than that of Gemini 2.5 Flash.

Gemini 3.1 Flash-Lite Is Here

In a blog post, the tech giant announced and detailed its latest Gemini 3.1 series large language model (LLM). Currently, the Gemini 3.1 Flash-Lite can be accessed in preview via the Gemini application programming interface (API) in Google AI Studio, and via Vertex AI for enterprises.

On performance, the company said the 3.1 Flash-Lite outperforms Gemini 2.5 Flash with a “2.5X faster Time to First Answer Token” and a 45 percent increase in output speed, citing the Artificial Analysis benchmark. The model is also said to have achieved an Elo score of 1432 on the Arena.ai leaderboard and to outperform GPT-5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast in output speed.
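To give a rough sense of what those two speed claims could mean for a single response, here is a minimal Python sketch. It assumes that "2.5X faster" means the time to first token is divided by 2.5, and that total response time is simply time to first token plus output tokens divided by output speed. The baseline figures (1.0 second and 200 tokens per second) are hypothetical placeholders, not measurements from Google or Artificial Analysis.

```python
# Claimed improvements over Gemini 2.5 Flash, as quoted in the article.
TTFT_SPEEDUP = 2.5         # "2.5X faster Time to First Answer Token"
OUTPUT_SPEED_GAIN = 0.45   # 45 percent increase in output speed


def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Simplified model: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_s


def apply_claims(base_ttft_s: float, base_tokens_per_s: float) -> tuple[float, float]:
    """Scale baseline metrics by the claimed gains (interpretation is an assumption)."""
    return base_ttft_s / TTFT_SPEEDUP, base_tokens_per_s * (1 + OUTPUT_SPEED_GAIN)


if __name__ == "__main__":
    # Hypothetical baseline values, chosen only for illustration.
    base_ttft_s, base_tps = 1.0, 200.0
    new_ttft_s, new_tps = apply_claims(base_ttft_s, base_tps)

    tokens = 500
    before = response_time(base_ttft_s, base_tps, tokens)
    after = response_time(new_ttft_s, new_tps, tokens)
    print(f"baseline: {before:.2f}s, with claimed gains: {after:.2f}s")
```

Under this simplified model, the time-to-first-token gain matters most for short responses, while the output-speed gain dominates as responses get longer.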

In AI Studio and Vertex AI, developers will be able to access the LLM in standard and thinking modes, with the latter letting them control how much thinking time the model spends on a task. Highlighting some use cases, Google said the model can handle high-volume tasks such as translation and content moderation, as well as more complex ones, such as generating user interfaces and dashboards, creating simulations, and following instructions.

The company also claimed that the Gemini 3.1 Flash-Lite is a cost-efficient AI model, priced at $0.25 (roughly Rs. 23) per million input tokens and $1.50 (roughly Rs. 137) per million output tokens. In comparison, the Gemini 2.5 Flash costs $0.30 (roughly Rs. 27.5) per million input tokens and $2.50 (roughly Rs. 229) per million output tokens.
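For developers weighing the two price lists, the difference is easiest to see on a concrete workload. The short Python sketch below uses the per-million-token prices quoted above. The dictionary keys are plain labels chosen for this example rather than official API model identifiers, and the workload (10 million input tokens, 2 million output tokens) is hypothetical.

```python
# USD prices per one million tokens, as quoted in the article.
# Keys are illustrative labels, not official API model IDs.
PRICES = {
    "gemini-3.1-flash-lite": {"input": 0.25, "output": 1.50},
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 10_000_000, 2_000_000)
        print(f"{model}: ${cost:.2f}")
```

For this example workload, the quoted prices work out to $5.50 on the 3.1 Flash-Lite and $8.00 on the 2.5 Flash. Because the gap in output pricing is larger than the gap in input pricing, output-heavy workloads see the larger saving.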

Akash Dutta
Akash Dutta is a Chief Sub Editor at Gadgets 360. He is particularly interested in the social impact of technological developments and loves reading about emerging fields such as AI, the metaverse, and the fediverse. In his free time, he can be seen supporting his favourite football club, Chelsea, watching movies and anime, and sharing passionate opinions on food.
© Copyright Red Pixels Ventures Limited 2026. All rights reserved.