LAGOS, Nigeria (VOICE OF NAIJA) – OpenAI is enhancing ChatGPT with new voice and image capabilities, marking a significant evolution beyond its text-based origins.
Users will now be able to hold voice conversations with ChatGPT, asking questions aloud or requesting spoken responses such as bedtime stories.
Additionally, users can employ image-based prompts, like uploading a picture and asking ChatGPT to explain it or provide instructions.
The voice feature relies on a text-to-speech model capable of producing human-like voices from text and short speech samples.
OpenAI collaborated with established voice actors to create five distinct voices for this feature.
It also employs its Whisper speech recognition system to transcribe verbal input into text.
OpenAI has also partnered with Spotify, which is using the voice technology to let podcasters translate their shows from English into Spanish, French, or German while preserving their original voice.
However, access to this technology is not available to the general public and is limited to specific podcasters.
OpenAI acknowledges the potential risks associated with voice synthesis technology, such as impersonation and fraud.
Despite these challenges, the company believes the new voice technology can serve creative and accessibility-focused applications.
These features will roll out to paying Plus and Enterprise subscribers in the next two weeks.
Users can activate voice capabilities by navigating to the app’s settings menu, selecting “new features,” and opting into voice conversations.
Initially, voice functionality will be available as an opt-in beta on the ChatGPT Android and iOS apps, while image input will be enabled by default across all platforms.