ElevenLabs, a company that offers AI voice cloning and a text-to-speech API, introduced the ability to build conversational AI bots on Monday.
The company announced that its developer platform now lets users build complete conversational bots, with adjustable settings such as tone of voice and response length.
Until now, ElevenLabs has focused mainly on providing a range of voices and AI tools for text-to-speech services. According to TechCrunch, Sam Sklar, the company’s head of growth, said many of its clients were already using these tools to build conversational AI agents. The most challenging parts, however, were integrating a knowledge base and handling customer interruptions. For this reason, the company decided to build a complete pipeline for conversational bots.
After logging in to their ElevenLabs account, users can start building a conversational agent by choosing a template or creating a new project. To define the agent’s persona, they set its first message, system prompt, and primary language. Developers must also choose a large language model (Gemini, GPT, or Claude), the temperature of responses (which controls how creative they are), and a token usage limit.
They can also adjust other settings, including voice, latency, stability, authentication requirements, and the maximum length of a conversation with the AI agent.
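To make the list of settings concrete, here is a minimal Python sketch of how a developer might group them before sending them to a platform. It is not ElevenLabs’ actual schema or SDK: the `AgentConfig` class, its field names, the default values, and the 0.0 to 1.0 temperature range are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict

# Model-family labels taken from the article; these are not real model IDs.
SUPPORTED_LLMS = {"gemini", "gpt", "claude"}


@dataclass
class AgentConfig:
    """Hypothetical container for agent settings (not ElevenLabs' schema)."""
    first_message: str
    system_prompt: str
    language: str = "en"
    llm: str = "gpt"
    temperature: float = 0.5          # assumed range 0.0-1.0; higher is more creative
    max_tokens: int = 512             # assumed token usage limit
    voice: str = "default"            # placeholder voice name
    max_duration_seconds: int = 600   # assumed cap on conversation length

    def validate(self) -> None:
        """Raise ValueError if any setting falls outside the assumed limits."""
        if self.llm not in SUPPORTED_LLMS:
            raise ValueError(f"unsupported LLM: {self.llm}")
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        if self.max_tokens <= 0 or self.max_duration_seconds <= 0:
            raise ValueError("limits must be positive")


config = AgentConfig(
    first_message="Hi, how can I help you today?",
    system_prompt="You are a friendly support agent.",
    llm="claude",
    temperature=0.3,
)
config.validate()
settings = asdict(config)  # plain dict, ready to serialize as JSON
```

Validating the settings locally before any network call is a design choice of this sketch; a real platform would apply its own limits.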
The conversational bot can draw on the user’s own knowledge base, which can be a file, a URL, or a block of text. Businesses can also connect the bot to their own custom LLM. ElevenLabs’ SDK supports Python, JavaScript, React, and Swift, and the company provides a WebSocket API for further customization.
Businesses can also define data collection criteria, such as capturing the name and email of the customer talking to the agent, and write evaluation criteria in natural language to determine whether a call was successful.
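As a rough illustration of the two kinds of criteria, the Python sketch below stores the fields to collect alongside natural-language success criteria and reports which fields a call failed to capture. The `CallCriteria` class and its field names are hypothetical and do not reflect ElevenLabs’ API; judging the natural-language criteria would presumably be left to an LLM and is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallCriteria:
    """Hypothetical per-agent criteria (illustrative only)."""
    # Fields the agent should try to collect during the call.
    collect: List[str] = field(default_factory=lambda: ["name", "email"])
    # Natural-language success criteria; evaluating them is out of scope here.
    evaluation: List[str] = field(
        default_factory=lambda: ["The caller's question was answered."]
    )

    def missing_fields(self, collected: Dict[str, str]) -> List[str]:
        """Return the requested fields that are absent or empty."""
        return [name for name in self.collect if not collected.get(name)]


criteria = CallCriteria()
# The email was requested but came back empty, so it is reported as missing.
missing = criteria.missing_fields({"name": "Ada", "email": ""})
```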
ElevenLabs handles the text-to-speech portion with its existing pipeline. For the new conversational AI product, the company also had to develop its own speech-to-text capabilities. In the future, it may offer its speech-to-text API as a stand-alone product, which would put it in competition with specialized APIs such as OpenAI’s Whisper, AssemblyAI, Deepgram, Speechmatics, and Gladia, as well as speech-to-text APIs from Google, Microsoft, and Amazon.
The company, which is seeking to raise further capital at a valuation of more than $3 billion, will compete with other speech AI startups such as Vapi and Retell, which are also building conversational bots. It will also compete with OpenAI’s real-time conversational API. According to ElevenLabs, however, its customization options and ability to switch between models will give it a competitive advantage over OpenAI.

