Image for post
Image for post
Image by M4th5 [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)]

Amazon announced today a new newscaster style voice to Amazon Polly. While Google has been working on great, lifelike voice synthesis with Wavenet and Tacotron2 projects, Amazon’s announcements makes me think that Google was barking up the wrong tree.

The real challenge with speech synthesis is that it’s hard to tolerate flat, emotionless voice for anything longer that a quick response. Getting information read out with these voices is painful. Being lifelike doesn’t necessarily make the experience better. It just means listening to a real person speak flatly and emotionlessly back to you.

The newscaster voice mimics NPR-style news reading. It adds a cadence and prosody that makes it tolerable for listening to longer text.

What’s weird is that announcer voices are more fake sounding but in this case, make the voice sound more real.

Written by

Independent daily thoughts on all things future, voice technologies and AI. More at http://linkedin.com/in/grebler

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store