Google Releases Cost-Efficient and Low-Latency Gemini 2.5 Flash AI Model

Google said the Gemini 2.5 Flash model is ideal for responsive virtual assistants and real-time summarisation tools.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 11 April 2025 12:23 IST
Highlights
  • Gemini 2.5 Flash comes with native reasoning capability
  • Google said it will be added to Vertex AI and AI Studio soon
  • Gemini 2.5 Flash can also be used to build AI agents

There is no word on when the AI model will be rolled out to end consumers

Photo Credit: Google

Google released its second artificial intelligence (AI) model in the Gemini 2.5 family on Thursday. Dubbed Gemini 2.5 Flash, it is a cost-efficient, low-latency model designed for tasks that require real-time inference, conversations at scale, and general-purpose work. The Mountain View-based tech giant will soon make the AI model available on both Google AI Studio and Vertex AI, so that users and developers can access Gemini 2.5 Flash and build applications and agents with it.

Gemini 2.5 Flash Is Now Available on Vertex AI

In a blog post, the tech giant detailed its latest large language model (LLM). Alongside announcing the debut of the Flash model, the post also confirmed that the Gemini 2.5 Pro model is now available on Vertex AI. Differentiating between the use cases of the two models, Google said the Pro model is ideal for tasks that require intricate knowledge, multi-step analyses, and making nuanced decisions.


On the other hand, the Flash model prioritises speed, low latency, and cost efficiency. Calling it a workhorse model, the tech giant said it is an “ideal engine for responsive virtual assistants and real-time summarisation tools where efficiency at scale is key.”

While launching the 2.5 Pro model, Google had specified that all LLMs in this series would feature natively built reasoning or “thinking” capability. This means the 2.5 Flash also comes with “dynamic and controllable reasoning.” Developers can adjust the processing time for a query based on its complexity, giving them granular control over response generation times.
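
As a rough illustration of what adjusting reasoning effort per query could look like, here is a minimal Python sketch. It does not call any Google API; the function name `pick_reasoning_budget`, the keyword list, and the budget scale are all hypothetical assumptions made for this example, not details from Google's announcement.

```python
# Hypothetical sketch: choose a larger "reasoning budget" for more complex queries.
# Nothing here is part of a Google SDK; the names and numbers are illustrative only.

# Assumed cue phrases hinting at multi-step work (matched as plain substrings).
MULTI_STEP_CUES = ("then", "compare", "explain why", "step")


def pick_reasoning_budget(query: str, max_budget: int = 1024) -> int:
    """Return a budget between 0 and max_budget based on a crude complexity score."""
    text = query.lower()
    word_count = len(text.split())
    cue_count = sum(text.count(cue) for cue in MULTI_STEP_CUES)
    # Longer queries and queries with multi-step cues score higher, capped at 1.0.
    score = min(1.0, word_count / 100 + 0.25 * cue_count)
    return int(score * max_budget)


if __name__ == "__main__":
    simple = "What time is it?"
    complex_query = (
        "Compare the two contracts, then explain why the second one "
        "is riskier, step by step."
    )
    print(pick_reasoning_budget(simple))
    print(pick_reasoning_budget(complex_query))
```

In this sketch, a short factual question gets a small budget and a faster response, while a multi-part request gets the full budget. A real application would pass the chosen value to whatever reasoning control the model's API exposes.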


For its enterprise clients, Google is also introducing the Vertex AI Model Optimiser tool. Available as an experimental feature within the platform, it spares users from having to choose a specific model when they are unsure which one fits. The feature can automatically generate the highest-quality response for each prompt, weighing factors such as quality and cost.

Google did not release a technical paper or model information card alongside the release, so information about its architecture, pre- and post-training processes, and benchmark scores is not known. The company might release these details later, when it makes the model available to end consumers.


Meanwhile, the tech giant is also adding new tools to support agentic application building on Vertex AI. The company is adding a new Live application programming interface (API) for Gemini models that will allow AI agents to process streaming audio, video, and text with low latency, letting them complete tasks in real time.

The Live API, which is powered by Gemini 2.5 Pro, also supports resumable sessions longer than 30 minutes, multilingual audio output, time-stamped transcripts for analysis, tool integration, and more.

 
