OpenAI Introduces GPT-Realtime Speech Generation Model, Makes Realtime API Generally Available

OpenAI’s GPT-Realtime is reportedly the company’s most advanced voice model, designed for customer support and assistance.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 29 August 2025 13:21 IST
Highlights
  • OpenAI said the model was trained in collaboration with companies
  • GPT-Realtime will be available with new Cedar and Marin voices
  • The Realtime API was first released as a public beta in October 2024

OpenAI’s new speech generation model can also analyse and read text in images

Photo Credit: OpenAI

OpenAI, on Thursday, announced a new artificial intelligence (AI) speech generation model dubbed GPT-Realtime. This is an enterprise-focused model that is capable of generating native audio with low latency, enabling two-way, real-time voice conversations. The San Francisco-based AI firm said that compared to its existing voice models, the Realtime model offers higher quality output, lower processing times, as well as additional features such as tool calling, support for remote Model Context Protocol (MCP) servers and image input, and the ability to detect alphanumeric sequences in select non-English languages.

OpenAI Brings New Speech Model for Enterprises

In a post, the AI firm announced the release of its most advanced speech generation model, GPT-Realtime. A speech generation model differs from the traditional voice assistants that companies use for customer support. Those chain together multiple systems, such as speech-to-text and text-to-speech, to carry out a voice conversation with a human. In comparison, the OpenAI model can natively process speech input and generate corresponding speech output, resulting in significantly lower response times.

GPT-Realtime features several new and enhanced capabilities. Similar to Advanced Voice Mode, it is capable of generating a highly expressive and natural-sounding voice, which developers can fine-tune with text-based instructions. Two new voices are being introduced, male voice Cedar and female voice Marin, and the company is also updating the existing eight voices.

In terms of performance, the model can capture non-verbal cues, such as laughter, and respond to them. It can also switch languages mid-sentence and adapt to the user's tone. Based on internal evaluations, OpenAI claims that the model displays higher performance in detecting alphanumeric sequences (such as phone and policy numbers) in non-English languages, such as Chinese, French, Japanese, and Spanish.

The company claimed that GPT-Realtime scored 82.8 percent on the Big Bench Audio benchmark, which measures a voice model's accuracy and reasoning ability. This is significantly higher than its predecessor from December 2024, which scored 65.6 percent.

Additionally, OpenAI claimed that the speech generation model has higher instruction adherence, supports function and tool calling, and can be configured to support remote MCP servers. It can also analyse and read images, allowing use cases where users can upload an image for better context, and the model can then incorporate it into the conversation.

Notably, GPT-Realtime is an enterprise-focused offering, and it is exclusively available with the company's Realtime API, which is now generally available to all developers. The API was first introduced in October 2024 as a public beta.

Coming to the model's pricing, GPT-Realtime will cost developers $32 (roughly Rs. 2,800) per million audio input tokens and $64 (roughly Rs. 5,600) per million audio output tokens. Cached input tokens are priced at $0.40 (roughly Rs. 35) per million.
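
To put the per-million-token rates in perspective, the following is a minimal sketch of a cost estimate using only the dollar prices quoted above. The function name `estimate_cost` and the token counts are hypothetical, and the sketch assumes cached input tokens are billed at the cached rate instead of the regular input rate; actual billing may differ.

```python
from decimal import Decimal

# US dollar prices per one million tokens, as quoted in the article.
PRICE_PER_MILLION = {
    "input": Decimal("32.00"),
    "cached_input": Decimal("0.40"),
    "output": Decimal("64.00"),
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Return an estimated cost in US dollars (hypothetical helper).

    Assumption: cached input tokens are charged at the cached rate
    and are not also counted as regular input tokens.
    """
    cost = Decimal(0)
    cost += Decimal(input_tokens) * PRICE_PER_MILLION["input"] / ONE_MILLION
    cost += Decimal(cached_input_tokens) * PRICE_PER_MILLION["cached_input"] / ONE_MILLION
    cost += Decimal(output_tokens) * PRICE_PER_MILLION["output"] / ONE_MILLION
    return cost


if __name__ == "__main__":
    # Illustrative token counts only; real usage will vary by conversation.
    print(estimate_cost(input_tokens=500_000, output_tokens=250_000,
                        cached_input_tokens=1_000_000))
```

Under these assumptions, output tokens cost twice as much as uncached input tokens, so the length of the model's spoken responses is likely to dominate the bill for conversational workloads.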

 

© Copyright Red Pixels Ventures Limited 2026. All rights reserved.