It was only a few months ago that Microsoft announced that they had beaten human transcription accuracy levels for voice recognition. Now IBM is announcing that they’ve beaten even Microsoft’s record.
Using a trained speech engine, they were able to reach a 5.5% word error rate.
What this means is that we’re on track to get generally available, high-accuracy speech-to-text over the next year. With this, our Echos and Google Homes, and potentially all our other devices, will likely understand us as well as we understand each other.
These devices will then layer on far-field modeling and noise cancellation that will require less audio pre-conditioning.