Realistic Enough

Leor Grebler
Jul 3, 2021

Screen grab from HBO’s Westworld.

A few years ago, one question I constantly came back to was whether we’d ever pass a Turing Test for speech synthesis. Now, I’m not sure that matters for most applications. I know that Alexa, Siri, and Google Assistant are not real people. In fact, the more they try to convince me that they are, the more I’m turned off.

Their voice does not represent a person. It’s a way of presenting information to me aurally. Sure, it can present the information in a way that sounds better with certain inflections and emphasis than without, but the point isn’t necessarily to convince me that it’s a person.

Removing the constraint of realism might actually open up the possibility of a better user experience. What if the way a normal person reads something isn’t the best way for us to consume that type of information? What if the speech could be sped up, via SSML, according to sentiment?
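
To make that last idea a little more concrete, here is a rough sketch of what it might look like. Everything in it is an assumption for illustration: the sentiment scores are made up, the function names are hypothetical, and the mapping from sentiment to speed (plus or minus 20%) is arbitrary. It relies only on SSML’s `<prosody rate="...">` attribute with percentage values, which the major voice platforms generally accept, though exact support varies by vendor.

```python
from xml.sax.saxutils import escape


def rate_for_sentiment(score: float) -> str:
    """Map a sentiment score in [-1, 1] to an SSML prosody rate.

    Hypothetical mapping: neutral text is read at 100% speed,
    the most positive at 120%, the most negative at 80%.
    """
    clamped = max(-1.0, min(1.0, score))
    return f"{round(100 + 20 * clamped)}%"


def to_ssml(sentences):
    """Wrap each (text, sentiment_score) pair in its own <prosody> tag."""
    parts = []
    for text, score in sentences:
        rate = rate_for_sentiment(score)
        # escape() protects characters such as & and < that would break the XML.
        parts.append(f'<prosody rate="{rate}">{escape(text)}</prosody>')
    return "<speak>" + " ".join(parts) + "</speak>"


if __name__ == "__main__":
    # The scores here are invented; a real system would get them from a sentiment model.
    passage = [
        ("Your package arrived early.", 0.8),
        ("One item was damaged in transit.", -0.6),
    ]
    print(to_ssml(passage))
```

Whether faster-for-positive is actually the right direction is exactly the kind of thing that would need to be tested with listeners.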

Somewhere, I hope, is a group of graduate students studying how speech synthesis can be modified to improve comprehension of a passage read aloud. Such a study could read passages back at different speeds and with different expressions, then test listeners on their understanding. You’re welcome, world!

One hypothesis is that what we’re getting from the current voice assistants is realistic enough. We don’t need them to mimic a human voice perfectly for them to be useful.

Written by Leor Grebler

Independent daily thoughts on all things future, voice technologies and AI. More at http://linkedin.com/in/grebler
