ElevenLabs
ElevenLabs is an AI audio research and deployment company founded in 2022 that develops text-to-speech, voice cloning, dubbing, and conversational voice agent technologies based on proprietary deep learning models.
ElevenLabs is a London- and New York-based artificial intelligence company specialising in generative audio and voice technologies. Founded in 2022 by former Google engineer Piotr Dabkowski and former Palantir deployment strategist Mati Staniszewski, the company emerged with a focus on high-quality multilingual text-to-speech and voice cloning, and has since expanded to dubbing, sound effects generation, conversational voice agents, and audio APIs for developers. The company's models are notable for naturalness of prosody, emotional expressiveness, and cross-lingual capability, and have been widely adopted in audiobook production, gaming, accessibility, education, journalism, and customer service.
Products and Capabilities
Text-to-Speech
ElevenLabs's text-to-speech models generate naturalistic spoken audio from written text with control over voice, language, accent, emotion, and pacing. The platform supports a library of ready-made voices as well as custom voices, and offers programmatic control through an API. Multilingual model variants such as Multilingual v2 and Turbo v2 added support for languages spanning English, Spanish, German, French, Italian, Portuguese, Polish, Dutch, Hindi, Arabic, Chinese, Japanese, Korean, and others, with cross-lingual voice transfer that preserves speaker identity across languages.
Voice Cloning
ElevenLabs offers Instant Voice Cloning, which produces a voice copy from a short audio sample of around one minute, and Professional Voice Cloning, which trains a higher-fidelity voice clone from several hours of studio-quality recordings. Voice cloning has been used for audiobook production by authors recreating archival voices with permission, for accessibility tools that allow individuals losing their ability to speak to retain a personal voice, and for media localisation.
Dubbing
The ElevenLabs Dubbing service translates and dubs video content into multiple target languages while preserving the original speaker's voice characteristics. The service segments input video by speaker, transcribes the audio, translates the transcript, and re-synthesises speech in the target language using cross-lingual voice transfer.
Conversational AI and Voice Agents
The Conversational AI platform allows developers to build low-latency voice agents that combine large language model reasoning with ElevenLabs's text-to-speech and a speech recognition front end. Use cases include customer service voice bots, telephone-based assistants, and interactive characters for games and virtual reality applications.
Sound Effects and Audio APIs
A sound effects generation product produces short audio clips from text prompts. The company also offers speech-to-text APIs and APIs for developers to integrate audio generation into mobile apps, websites, and enterprise software.
Safety and Responsible Use
The realism of ElevenLabs voice cloning has raised concerns about misuse for fraud, harassment, and political disinformation, including a high-profile 2024 incident involving cloned political speech in the United States. The company has invested in safety measures including the AI Speech Classifier, an open detection model that estimates whether audio was generated by ElevenLabs, voice verification at clone-creation time, content moderation on the generation pipeline, terms of service prohibiting deceptive impersonation, and participation in industry initiatives such as the Coalition for Content Provenance and Authenticity (C2PA). The company has also been a signatory and contributor to safety frameworks discussed in international AI safety conferences.
Funding and Growth
ElevenLabs raised a Series A round in 2023 led by Andreessen Horowitz at a reported valuation in the hundreds of millions of US dollars, followed by a Series B in 2024 at a multi-billion-dollar valuation with participation from Andreessen Horowitz, Sequoia, NEA, ICONIQ Capital, World Innovation Lab, and others. The company has expanded into enterprise sales with customers in publishing, gaming, e-learning, and media.
Open Models and Research
ElevenLabs publishes a portion of its research and has open-sourced certain models including the AI Speech Classifier. The company also operates research collaborations with academic institutions on voice biometrics, accessibility, and synthetic media safety.
References
- ElevenLabs. (2024). Company Overview and Research Page. London: ElevenLabs Inc.
- ElevenLabs. (2024). AI Speech Classifier: Detecting Synthetic Speech. ElevenLabs Research Blog.
- Coalition for Content Provenance and Authenticity (C2PA). (2024). Technical Specification 2.0.
- MOSTI Malaysia. (2024). National Guidelines on Artificial Intelligence Governance and Ethics. Putrajaya: MOSTI.
- Malaysian Communications and Multimedia Commission. (2024). Guidelines on Online Safety. Cyberjaya: MCMC.