Google Introduces Gemini 3.1 Flash-Lite as Its Fastest and Most Cost-Efficient AI Model

The Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API in AI Studio and Vertex AI.

Written by Akash Dutta, Edited by Rohan Pal | Updated: 5 March 2026 15:32 IST

Photo Credit: Google

Gemini 3.1 Flash-Lite achieved an Elo score of 1432 on the Arena.ai Leaderboard

Highlights

The AI model costs $0.25 per million input tokens
Gemini 3.1 Flash-Lite costs $1.50 per million output tokens
Google claims the new model outperforms 2.5 Flash in response speed

Google introduced the Gemini 3.1 Flash-Lite artificial intelligence (AI) model on Thursday. Calling it the fastest and most cost-efficient model in the Gemini 3 series, the Mountain View-based tech giant said it is designed for high-volume developer workloads. The model is not yet available to end users; it is reserved for developers and enterprises via specific channels. The company also claimed that the model's output speed is higher than that of Gemini 2.5 Flash. Notably, the Gemini 3.1 Flash-Lite is currently available only in preview.

Gemini 3.1 Flash-Lite Is Here

In a blog post, the tech giant announced and detailed its latest Gemini 3.1 series large language model (LLM). Currently, the Gemini 3.1 Flash-Lite can be accessed in preview via the Gemini application programming interface (API) in Google AI Studio, and via Vertex AI for enterprises.

In terms of capabilities, the company said the 3.1 Flash-Lite outperforms 2.5 Flash with a “2.5X faster Time to First Answer Token” and a 45 percent increase in output speed, citing the Artificial Analysis benchmark. Google also claims it outperforms GPT-5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast in output speed, and says the model achieved an Elo score of 1432 on the Arena.ai leaderboard.
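To see what the two speed claims could mean for a single response, the sketch below models total response time as time to first token plus generation time. This is an illustration only: the baseline figures (1.0 second to first token, 200 tokens per second) and the 580-token response are hypothetical, not measured values for Gemini 2.5 Flash, and reading "2.5X faster" as dividing the time to first token by 2.5 is an assumption.

```python
def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Total time = time to first token + time to generate the output."""
    return ttft_s + output_tokens / tokens_per_s


# Hypothetical baseline, NOT measured figures for Gemini 2.5 Flash.
baseline_ttft_s = 1.0
baseline_tokens_per_s = 200.0
output_tokens = 580  # hypothetical response length

# Apply the claimed improvements (interpretation is an assumption):
# "2.5X faster" time to first token -> divide by 2.5
# 45 percent increase in output speed -> multiply by 1.45
new_ttft_s = baseline_ttft_s / 2.5
new_tokens_per_s = baseline_tokens_per_s * 1.45

baseline_time = response_time(baseline_ttft_s, baseline_tokens_per_s, output_tokens)
new_time = response_time(new_ttft_s, new_tokens_per_s, output_tokens)

print(f"Baseline: {baseline_time:.2f} s")  # 3.90 s
print(f"With claimed gains: {new_time:.2f} s")  # 2.40 s
```

Under these assumed numbers the response finishes in about 2.4 seconds instead of 3.9. The actual gain depends on real baseline latency and on how long the output is.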

In AI Studio and Vertex AI, developers will be able to access the LLM in standard and thinking modes, with the latter allowing users to control the thinking time for a task. Highlighting some use cases, Google said the model can handle high-volume translation and content moderation, and can also be used for more complex tasks, such as generating user interfaces and dashboards, creating simulations, and following instructions.

The company also claimed that the Gemini 3.1 Flash-Lite is a cost-efficient AI model, priced at $0.25 (roughly Rs. 23) per million input tokens and $1.50 (roughly Rs. 137) per million output tokens. In comparison, the Gemini 2.5 Flash costs $0.30 (roughly Rs. 27.5) per million input tokens and $2.50 (roughly Rs. 229) per million output tokens.
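As a rough illustration of the price difference, the per-million-token rates quoted above can be turned into a cost estimate for a batch job. The sketch below is not from Google. The `Pricing` class and the workload of 10 million input tokens and 2 million output tokens are hypothetical, chosen only to make the arithmetic concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in US dollars (hypothetical helper class)."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates as quoted in the article.
FLASH_LITE_3_1 = Pricing(input_per_million=0.25, output_per_million=1.50)
FLASH_2_5 = Pricing(input_per_million=0.30, output_per_million=2.50)

# Hypothetical workload, for illustration only.
input_tokens = 10_000_000
output_tokens = 2_000_000

lite_cost = FLASH_LITE_3_1.cost(input_tokens, output_tokens)
flash_cost = FLASH_2_5.cost(input_tokens, output_tokens)
savings = 1 - lite_cost / flash_cost

print(f"Gemini 3.1 Flash-Lite: ${lite_cost:.2f}")  # $5.50
print(f"Gemini 2.5 Flash:      ${flash_cost:.2f}")  # $8.00
print(f"Savings:               {savings:.2%}")  # 31.25%
```

The saving depends on the input-to-output ratio: output tokens are discounted more steeply (from $2.50 to $1.50) than input tokens (from $0.30 to $0.25), so output-heavy workloads benefit more.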
