There are times when voice-driven systems don’t work all that well because of background noise or other voices. That’s because it’s hard for machines (and humans) to pull out a particular voice when many others are speaking. This is sometimes called ‘the cocktail party problem.’ Yobe was created out of research at MIT on how to solve this issue, and today it announced $1.8 million in seed funding.
The investment comes from Clique Capital Partners, a $100 million fund created specifically to fund innovative voice technology. Yobe had previously received $790,000 in the form of a National Science Foundation SBIR grant in 2016.
Company co-founder and CEO Ken Sutton says Yobe is solving an entrenched problem: identifying a particular voice in noise. That means, for instance, that if you are at a party and want Alexa to play a Spotify playlist, you could (in theory at least) say the wake word from across a crowded room and give the playlist command, and the device would execute it in spite of the noise. That’s because Yobe can pinpoint a voice based on biometric markers, aggressively enhance the volume and then use AI to smooth it out.
As Sutton says, most of these voice interface technologies fail in this situation because they can’t distinguish your voice from the background noise, but Yobe is supposed to solve this.
Sutton made clear the research phase is done and the funding is about getting ready to go to market. “The capital raised is not to continue R&D. The capital raised is to streamline and optimize [the technology] for deployment. We will be in market with a product to sell in 30 days, and all of the usual suspects are lined up and waiting for us to call with a live demo,” Sutton told TechCrunch.
Ultimately, the company hopes to license its technology to chip or phone manufacturers and others, in a scenario not unlike Dolby. Sutton acknowledges there are many use cases for a solution that could identify a specific voice in noise or among other voices, including law enforcement, hearing aids and meeting transcription services. You could even use voice as a biometric marker for authentication purposes. But the company has decided to focus its early efforts on voice-driven devices as an initial go-to-market strategy.
The company was founded by Sutton and Dr. S. Hamid Nawab, an MIT PhD and researcher, whose work has focused on applying AI to signal processing. Nawab is the company’s Chief Scientist.