OpenAI Introduces GPT-Realtime Speech Generation Model, Makes Realtime API Generally Available

OpenAI’s GPT-Realtime is reportedly the company’s most advanced voice model, designed for customer support and assistance.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 29 August 2025 13:21 IST
Highlights
  • OpenAI said the model was trained in collaboration with companies
  • GPT-Realtime will be available with new Cedar and Marin voices
  • The Realtime API was first released as a public beta in October 2024

OpenAI’s new speech generation model can also analyse and read text in images

Photo Credit: OpenAI

OpenAI, on Thursday, announced a new artificial intelligence (AI) speech generation model dubbed GPT-Realtime. This is an enterprise-focused model that is capable of generating native audio with low latency, enabling two-way, real-time voice conversations. The San Francisco-based AI firm said that compared to its existing voice models, the Realtime model offers higher quality output, lower processing times, as well as additional features such as tool calling, support for remote Model Context Protocol (MCP) servers and image input, and the ability to detect alphanumeric sequences in select non-English languages.

OpenAI Brings New Speech Model for Enterprises

In a post, the AI firm announced the release of its most advanced speech generation model, GPT-Realtime. A speech generation model differs from the traditional voice assistants that companies use for customer support. Those assistants chain together multiple systems, such as speech-to-text and text-to-speech, to carry out a voice conversation with a human. In comparison, the OpenAI model natively processes speech input and generates the corresponding speech output, resulting in significantly lower response times.

GPT-Realtime features several new and enhanced capabilities. Similar to Advanced Voice Mode, it can generate a highly expressive and natural-sounding voice, which developers can fine-tune with text-based instructions. Two new voices are being introduced, Cedar (male) and Marin (female), and the company is also updating its eight existing voices.

In terms of performance, the model can capture non-verbal cues, such as laughter, and respond to them. It can also switch languages mid-sentence and adapt to the user's tone. Based on internal evaluations, OpenAI claims that the model is more accurate at detecting alphanumeric sequences (such as phone and policy numbers) in non-English languages, including Chinese, French, Japanese, and Spanish.

The company claimed that GPT-Realtime scored 82.8 percent on the Big Bench Audio benchmark, which measures a voice model's accuracy and reasoning ability. Its predecessor from December 2024 scored 65.6 percent, making this a gain of 17.2 percentage points.

Additionally, OpenAI claimed that the speech generation model has higher instruction adherence, supports function and tool calling, and can be configured to support remote MCP servers. It can also analyse and read images, allowing use cases where users can upload an image for better context, and the model can then incorporate it into the conversation.

Notably, GPT-Realtime is an enterprise-focused offering, and it is available exclusively through the company's Realtime API, which is now generally available to all developers. The API was first introduced in October 2024 as a public beta.

As for pricing, GPT-Realtime will cost developers $32 (roughly Rs. 2,800) per million audio input tokens and $64 (roughly Rs. 5,600) per million audio output tokens. Cached input tokens are priced at $0.40 (roughly Rs. 35) per million.
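
To put these rates in perspective, the cost of a workload can be estimated with simple arithmetic. The Python sketch below is illustrative only: it uses the per-million-token prices quoted above, while the function name, the token counts, and the assumption that cached input tokens are billed at the cached rate instead of the regular input rate are hypothetical and not taken from OpenAI's documentation.

```python
from decimal import Decimal

# Per-million-token prices in USD, as quoted in this article.
PRICE_PER_MILLION_USD = {
    "input": Decimal("32"),
    "cached_input": Decimal("0.40"),
    "output": Decimal("64"),
}
TOKENS_PER_MILLION = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> Decimal:
    """Estimate the cost in USD for the given token counts.

    Assumption (not confirmed by the article): cached input tokens are
    billed at the cached rate and are not also counted as regular input.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * PRICE_PER_MILLION_USD["input"]
        + cached_input_tokens * PRICE_PER_MILLION_USD["cached_input"]
        + output_tokens * PRICE_PER_MILLION_USD["output"]
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 500,000 input, 250,000 output and
    # 1,000,000 cached input tokens.
    print(estimate_cost_usd(500_000, 250_000, cached_input_tokens=1_000_000))
```

Decimal is used here instead of floating point so that small amounts such as the $0.40 cached rate do not pick up rounding noise when summed.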

 

© Copyright Red Pixels Ventures Limited 2025. All rights reserved.