
Some might say that it’s “Mission Accomplished” for far-field microphone beamforming and acoustic echo cancellation. Thinking about this today, I’d say we’re definitely at the point of diminishing returns on performance. Having a mic work 3 feet farther away or perform 5 dB better will not bring the user significantly more joy.

There are still challenges left in far-field voice interaction that will be cracked over the next few years: the cocktail party problem, multi-user transcription, and whisper detection, among others. Tens of millions of research dollars will be poured into resolving these last issues. However, we’re now at a point where far-field capture is good enough for the speech recognition engines to take over, cleaning up and processing the audio themselves.

As the quality of speech-to-text (STT) improves across all scenarios, the pressure will shift to making the interaction itself worthwhile and compelling, and to making sure the user takes delight in it. This means new applications and services, not just new technologies.

Independent daily thoughts on all things future, voice technologies and AI. More at
