The Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API in AI Studio and Vertex AI.
Gemini 3.1 Flash-Lite achieved an Elo score of 1432 on the Arena.ai Leaderboard
Photo Credit: Google
Google introduced the Gemini 3.1 Flash-Lite artificial intelligence (AI) model on Thursday. Calling it the fastest and most cost-efficient AI model in the Gemini 3 series, the Mountain View-based tech giant said it is designed for high-volume developer workloads. The model is not currently available to end users and is reserved for developers and enterprises via specific channels. The company also claimed that the model's output speed is higher than that of Gemini 2.5 Flash. Notably, the Gemini 3.1 Flash-Lite is currently only available in preview.
In a blog post, the tech giant announced and detailed its latest Gemini 3.1 series large language model (LLM). Currently, the Gemini 3.1 Flash-Lite can be accessed in preview via the Gemini application programming interface (API) in Google AI Studio, and via Vertex AI for enterprises.
In terms of performance, the company said the 3.1 Flash-Lite outperforms 2.5 Flash with a “2.5X faster Time to First Answer Token” and a 45 percent increase in output speed, citing the Artificial Analysis benchmark. The model is also said to have achieved an Elo score of 1432 on the Arena.ai leaderboard, and Google claims it outperforms GPT-5 mini, Claude 4.5 Haiku, and Grok 4.1 Fast in output speed.
In AI Studio and Vertex AI, developers will be able to access the LLM in standard and thinking modes, with the latter allowing users to control the thinking time for a task. Highlighting some use cases, Google said the model can handle high-volume translation and content moderation, and can also be used for more complex tasks, such as generating user interfaces and dashboards, creating simulations, and following instructions.
On pricing, Google said the Gemini 3.1 Flash-Lite costs $0.25 (roughly Rs. 23) per million input tokens and $1.50 (roughly Rs. 137) per million output tokens. In comparison, the Gemini 2.5 Flash costs $0.30 (roughly Rs. 27.5) per million input tokens and $2.50 (roughly Rs. 229) per million output tokens.
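To put the per-million-token prices in perspective, the short Python sketch below estimates what a workload might cost on each model. The prices are the ones quoted above; the dictionary labels, the `estimate_cost` helper, and the monthly token volumes are hypothetical and chosen only for illustration. They are not official model identifiers or Google-published usage figures.

```python
# Prices in USD per one million tokens, as quoted in the article.
# The keys are plain labels for this sketch, not official API model IDs.
PRICES_PER_MILLION = {
    "3.1 Flash-Lite": {"input": 0.25, "output": 1.50},
    "2.5 Flash": {"input": 0.30, "output": 2.50},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 50 million input and 10 million output tokens.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

lite_cost = estimate_cost("3.1 Flash-Lite", INPUT_TOKENS, OUTPUT_TOKENS)
flash_cost = estimate_cost("2.5 Flash", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"3.1 Flash-Lite: ${lite_cost:.2f}")
print(f"2.5 Flash:      ${flash_cost:.2f}")
print(f"Difference:     ${flash_cost - lite_cost:.2f}")
```

Under these assumed volumes, the estimate works out to $27.50 for the 3.1 Flash-Lite against $40.00 for the 2.5 Flash. Actual bills would depend on the real mix of input and output tokens and on any pricing terms not covered in the announcement.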