Summary

  • ElevenLabs, the AI voice cloning and generation start-up founded by Palantir alumni, has released Scribe v1, a speech-to-text model billed as the most accurate across multiple languages.
  • The model outperforms speech-to-text offerings from Google, OpenAI and Deepgram on accuracy, achieving record-low error rates, and can transcribe audio with up to 32 speakers while distinguishing between them.
  • Scribe is available now through the ElevenLabs website and API at $0.40 per hour of audio, with a 50% discount for the next six weeks.
  • The start-up also plans a low-latency version aimed at real-time applications, such as communication tools.
  • The arrival of Scribe alongside Octave, a text-to-speech model from rival Hume AI that handles the reverse task, highlights growing competition in AI-driven audio models.

By Carl Franzen
