WhatsApp is reportedly working on a new artificial intelligence (AI) feature. The new feature is said to allow users to hold hands-free verbal conversations with Meta AI, the AI chatbot integrated into the app. Earlier, a report claimed that WhatsApp was working on letting users send voice notes to Meta AI, allowing for one-way verbal communication, but the latest information suggests that the AI chatbot will also respond verbally. The voice mode feature might also arrive with several voice options to choose from, although the differences between them are not known.

WhatsApp to Reportedly Add Meta AI Voice Mode

According to a post by WhatsApp feature tracker WABetaInfo, the voice mode feature for Meta AI was spotted in WhatsApp beta for Android version 2.24.17.16. A separate post also found the same feature in WhatsApp beta for iOS version 24.16.10.70.

Meta AI voice mode feature

Photo Credit: WABetaInfo

The feature is currently not visible in the beta version of the app, likely because the company is still working on it. As a result, those who have enrolled in the Google Play Beta programme will not be able to test the Meta AI voice mode.

As per the screenshots shared by the feature tracker, a new voice icon, represented by an audio waveform, appears next to the text field in the Meta AI chat. Tapping it appears to open a bottom sheet with Meta AI written at the top. In the middle is a circular shape made up of multiple bubbles. At the bottom, the text “Hi, how can I help” is shown alongside an expanded audio waveform icon, suggesting that the AI is listening.

Additional screenshots reveal that the Meta AI voice mode may have up to 10 different voices to choose from. It is unclear what the differences between the voices would be, but they might have different accents, energy levels, or tonalities. It is unlikely that the voices would support multiple languages.

Apart from that, an option to turn on captions and transcriptions using speech-to-text can also be seen. This option likely transcribes the entire verbal conversation as text so that the user can refer to it later. It is not known when the voice mode feature might be rolled out to the public.