OpenAI’s new voice AI model gpt-4o-transcribe lets you add speech to your existing text apps in seconds

ChatGPT creator OpenAI has released three new voice models: gpt-4o-transcribe and gpt-4o-mini-transcribe for speech-to-text, and gpt-4o-mini-tts for text-to-speech. They are available to developers through its API, while individual users can try them out on a demo site, OpenAI.fm.
The models, which are variants of the GPT-4o model, have been trained to excel at transcription and speech generation. The voice can be customised by the user, who can choose an accent, change the pitch and tone, and dictate which emotions it should convey.
They are aimed at customer call centres, meeting note-taking and AI-assistant apps, and can be integrated into apps built on GPT-4o with just nine lines of code using OpenAI’s new Agents SDK, according to the firm.
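
For developers, the voice customisation described above could be sketched as a plain HTTP request to the text-to-speech model. This is a minimal illustration, not the nine-line Agents SDK integration the firm refers to. It assumes the `/v1/audio/speech` endpoint with `model`, `input`, `voice` and `instructions` fields, and the voice name `coral`, as understood from OpenAI's public API reference; the helper names `build_speech_request` and `synthesize` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for OpenAI's text-to-speech API.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, instructions, voice="coral"):
    """Return the JSON payload for a text-to-speech request.

    `instructions` is a free-text description of how the voice should
    sound (accent, tone, emotion). `voice` is assumed to be one of the
    preset voice names.
    """
    return {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,
        "instructions": instructions,
    }


def synthesize(payload, api_key, out_path="speech.mp3"):
    """POST the payload and save the returned audio bytes to out_path."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path


# Build a request for a call-centre style greeting; nothing is sent here.
payload = build_speech_request(
    "Thanks for calling. How can I help you today?",
    "Speak in a calm, friendly tone.",
)
print(json.dumps(payload, indent=2))
```

To actually generate audio, a developer would call `synthesize(payload, api_key)` with a valid API key; the sketch above only builds and prints the request body.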
OpenAI is hosting a competition for the public to post their most creative examples of using the demo site. The winner will receive a custom Teenage Engineering radio with the OpenAI logo.
