Sam Liang talks about Otter's real time voice transcription services

Sam Liang is the CEO and founder of Otter.AI, based in Los Altos, California. Otter.AI is funded by some amazing VCs and angel investors. They include Horizons Ventures (DeepMind, Waze, Zoom, Facebook), Tim Draper, David Cheriton (Stanford Professor Billiona
50 minutes
Podcast: Sam Talks Technology, a round-up of the hottest stories and latest news.

Description

6 years ago

Sam Sethi talked with Sam Liang about his entrepreneurial
journey, from his beginnings as a computer science student at Peking
University through to his time as a software engineer at Google, where
he was responsible for developing the blue dot on Google Maps.


Sam sold his first startup, a location platform called Alohar
Mobile, to Alibaba before starting Otter, a 30-person startup
whose team hails from Google, Facebook, Nuance, Yahoo, as
well as Stanford, Duke, MIT and Cambridge.


Otter was founded just over three years ago and has raised $13
million in funding from a who’s who of Silicon Valley investors,
including Horizons Ventures (a backer of Viv, DeepMind, Siri, Slack
and others), which led the $10 million Series A. Also participating
were Bridgewater Associates, i-Hatch Ventures, MetaLab, Jay
Markley, and Boston investors Jim Pallotta and Stu Porter.


Seed investors included Tim Draper through
Draper Associates and Draper Dragon; Dave Morin through Slow
Ventures; David Cheriton "The billionaire
Professor and 1st investor in Google"; SV Tech Ventures,
Danhua Capital, and 500 Startups.


Otter wants to make it as easy to search your voice conversations
as it is to search your email and texts. The idea is to create a new
voice assistant focused on transcribing everyday conversations,
such as meetings and interviews.


Essentially, a voice recorder that offers automatic
transcription, Otter is designed to be able to understand and
capture long-form conversations that take place between multiple
people.


This is a different sort of voice technology from what has been
developed to date for voice assistants such as Alexa or Google
Assistant.


“The existing technologies are not good enough for
human-to-human conversations,” explains Sam Liang.


“Google’s voice API has been trained to optimize voice
search,” he says, adding that when people talk to voice
assistants, it’s typically only one person talking and they tend
to speak more slowly and clearly than usual.


They also often ask shorter questions, like “what’s the
weather?”, rather than carrying on long conversations.


“Human meetings are much more complicated, it usually involves
at least two people, and the people could talk for an hour. It’s
a long-form conversation.”


With Otter, the goal is to capture those conversations –
meetings, interviews, lectures, etc. – and turn them into a
searchable archive where everything said is immediately
transcribed by Otter's software.


The entire technology stack, including speech recognition, was
built in-house. The company is not using existing speech
recognition APIs because it wanted to improve accuracy and
optimize for multiple speakers, says Liang.


To identify when someone else starts talking, Otter uses a
technology called diarization to separate each
individual speaker; it then generates a voice print for each
person’s voice.


Broadly speaking, this is like the voice equivalent to facial
recognition, with the voice print being used to identify the
speaker going forward.
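
For readers curious how voice-print matching can work in principle, here is a minimal sketch. It is not Otter's implementation, which is not described in the episode notes. It assumes each audio segment has already been turned into a numeric embedding by some speaker model, and that a stored voice print is simply a reference embedding per person. The function names, the three-number embeddings and the 0.8 threshold are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(segment_embedding, voice_prints, threshold=0.8):
    """Return the name whose stored voice print best matches the segment.

    Returns None when no stored print reaches the (assumed) similarity threshold.
    """
    best_name, best_score = None, threshold
    for name, print_vec in voice_prints.items():
        score = cosine_similarity(segment_embedding, print_vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def label_segments(segments, voice_prints, threshold=0.8):
    """Attach a speaker label to each (text, embedding) segment of a conversation."""
    labeled = []
    for text, embedding in segments:
        name = identify_speaker(embedding, voice_prints, threshold)
        labeled.append((name or "Unknown speaker", text))
    return labeled


# Hypothetical voice prints and segments; real embeddings would have many more dimensions.
voice_prints = {
    "Sam Liang": [0.9, 0.1, 0.0],
    "Sam Sethi": [0.1, 0.9, 0.2],
}
segments = [
    ("Human meetings are much more complicated.", [0.8, 0.2, 0.0]),
    ("Tell me about your time at Google.", [0.2, 0.8, 0.2]),
    ("(background voice)", [0.0, 0.0, 1.0]),
]
for speaker, text in label_segments(segments, voice_prints):
    print(f"{speaker}: {text}")
```

In this sketch a segment that matches no stored voice print closely enough is labeled as an unknown speaker rather than forced onto the nearest name, which is one simple way to handle a new participant joining a meeting.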


Sam envisions a number of potential use cases for Otter's
technology, including in enterprise, health care, education,
phone calls and more.
