ElevenLabs Will Introduce AI For Audio Effects

ElevenLabs, a two-year-old artificial intelligence (AI) company founded by former Google and Palantir employees, is expanding its portfolio with a new text-to-sound model after making its name in machine learning (ML) based voice cloning and synthesis.

The model, which was teased a few hours ago, will let creators generate sound effects simply by describing what they have in mind. It is expected to make content richer in the era of AI-powered digital experiences.

Although the model isn’t yet available to the general public, ElevenLabs has demonstrated its potential with a minute-long teaser featuring videos generated by OpenAI’s new Sora model and enhanced with ElevenLabs’ AI-generated sound effects. In addition, the company has invited prospective customers to join an early access waitlist through a sign-up page.

AI Sound Effects: Going Beyond Speech

ElevenLabs, founded in 2022, has been developing AI to make audio and video material, such as podcasts and movies, accessible across languages and borders. To this end, the company has introduced several products, including text-to-speech and speech-to-speech models that can generate AI speech in 29 different languages from a given piece of content (text, audio, or video). In speech-to-speech mode, the output emulates the natural voice and emotions of the original speaker.

While businesses and individual content creators continue to use both of these technologies widely, fully AI-generated material has also become more popular thanks to tools from Runway, Pika, and most recently OpenAI (with Sora). These tools use simple text prompts to create convincing AI videos, but the videos come without audio by default. This is where ElevenLabs’ new model comes in: it lets users describe the sound they want and generates matching sound effects for their material.

The product can help AI artists improve their work by adding the background sounds a scene would naturally have. The effects can range from chirping birds to moving cars and honking horns, or even the sounds of people dining, conversing, or strolling down a crowded street.

“We at ElevenLabs have only ever displayed our text-to-speech models in front of an audience. But we still have a ton of stuff in the works,” wrote Luke Harries, head of growth at ElevenLabs. “And when OpenAI announced their Sora model, which generates incredible videos but without sound, we decided to show a sneak peek of our new product line.” Harries reshared an X post that included several Sora-generated videos enhanced with AI sound effects from the company’s model.

The sounds generated by the new model could be used in anything that needs background audio, from AI-generated videos and speech synthesized from text to an Instagram clip, an ad, or a video game teaser. How good the output is and how it will be used in practice are still unknown.

Register For Priority Access

ElevenLabs has opened signups for early access to the model, but it has not said when it intends to make it publicly available. Interested users can visit the sign-up page, register with their name and email address, and specify what they need the sound effects for. To help tune the model’s responses, ElevenLabs is also asking early registrants to write a sample prompt for an AI sound effect.

After signing up, the user is added to a waitlist and will be granted access once the model becomes available. The timeline, however, is still unknown.

ElevenLabs might have a first-mover advantage with the new text-to-sound technology, but it’s worth remembering that several other companies working on AI speech could also enter this market. Well-known players such as MURF.AI, Play.ht, and WellSaid Labs are among them.

The global market for these products was valued at $1.2 billion in 2022 and is projected to grow at a compound annual growth rate (CAGR) of just over 15.40% to reach nearly $5 billion in 2032, according to Market US.
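
The growth figures above can be sanity-checked with the standard compound-growth formula. The sketch below is only an illustration: the function names are hypothetical, and it assumes the 15.40% rate compounds once per year over the ten years from 2022 to 2032, which the source does not spell out.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Value after compounding annual_rate once per year for `years` years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Figures from the article, in billions of US dollars.
    # Assumption: ten annual compounding periods between 2022 and 2032.
    projected = project_value(1.2, 0.154, 10)
    print(f"Projected 2032 market size: ${projected:.2f} billion")

    # Reverse check: the rate implied by growing from $1.2B to $5B.
    rate = implied_cagr(1.2, 5.0, 10)
    print(f"Implied CAGR: {rate:.2%}")
```

Under these assumptions the projection lands at roughly $5 billion, in line with the cited estimate.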
