If you haven’t read about Project Understood (also referred to at Google as Project Euphonia), you should check it out. It’s a great project that aims to address the difficulties that people with Down syndrome, or anyone who speaks differently than most, have with voice assistants. Voice assistants can provide new independence, but not if they can’t understand the person speaking. Project Understood tries to solve this by collecting voice samples from those who speak differently.
When we were working on the Ubi, one of the markets that it could have served well was the aging population.
However, we hit challenges with speech recognition for older users and, more importantly, with detecting the end of speech and timing its start. The issue was twofold.
With other populations, the issue at the time (2012) was that Google’s models were trained on US English, and the samples, while the broadest of any available, were still limited. That problem has since been addressed many times over: Google is robust, other services like Speechmatics offer multi-accent versions of their APIs, and Amazon is also much better. It’s just a matter of expanding the model to understand a wider range of how things are said.
Project Common Voice also plans to make a wide range of voice samples available, but it hasn’t yet expanded to those who speak differently. Unlike Google, Common Voice lets anyone access the recordings to build their own speech recognition service. It doesn’t have the pull of Google (marketing, engineering, research), but it’s more democratic.
Project Understood makes me hopeful. It means we’re continuing to think about those who might not be the “primary target” of our services but who could gain even more benefit from them. We all benefit when design is done this way. It also cements my belief that, in the not-too-distant future, these technologies will be better at understanding us than we are at understanding each other.