
Continuing the SSML theme, prompted by the new feature we launched for our Alexa Skill earlier this week, here are a few reasons why SSML matters when building voice interactions.

SSML and other speech synthesis markup tools are how text-to-speech (TTS) goes from robotic to natural. They are also how these services will get closer to passing Turing tests and creating better interactions.

While technologies like WaveNet will make text-to-speech services (at least Google's) sound much more realistic, SSML will let developers add the inflection and correct pronunciations needed for these interactions to be convincing.
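To make that concrete, here is a minimal sketch of what adding inflection and a pronunciation hint can look like. It uses Python's standard XML library to assemble a small SSML document with `phoneme`, `break`, and `prosody` elements from the W3C SSML specification. The sentence, the IPA transcription, and the `build_ssml` helper name are illustrative assumptions rather than anything from our skill, and each provider supports a slightly different subset of tags and attribute values, so check the documentation for your platform.

```python
import xml.etree.ElementTree as ET


def build_ssml() -> str:
    """Assemble a small SSML document (hypothetical example content)."""
    speak = ET.Element("speak")
    speak.text = "Your order from the "

    # Pronunciation hint: spell out how the word should sound in IPA.
    # The transcription here is an illustrative assumption.
    phoneme = ET.SubElement(speak, "phoneme", alphabet="ipa", ph="pɪˈkɑːn")
    phoneme.text = "pecan"
    phoneme.tail = " bakery is ready."

    # A short pause before the closing line.
    pause = ET.SubElement(speak, "break", time="300ms")
    pause.tail = " "

    # Inflection: slow the closing line down and raise the pitch slightly.
    prosody = ET.SubElement(speak, "prosody", rate="slow", pitch="+10%")
    prosody.text = "Thanks for waiting!"

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

The resulting string is what a skill would return as its SSML speech output in place of plain text. Building it with an XML library instead of string concatenation keeps the markup well-formed, which matters because services generally reject malformed SSML.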

Although Alexa only recently added SSML support, SSML is by no means exclusive to it. Other providers (Google, Microsoft, Nuance, IBM) offer their own implementations. Future SSML standards may need to add emotion tags as text-to-speech providers become better able to express emotion in their services.

Independent daily thoughts on all things future, voice technologies and AI. More at
