Starbucks Improves Speech Recognition

… at least for my name.

Last week, I was pleasantly surprised when I got my cup back and it had at least an alternative spelling of my name on it. Maybe it was me overly pronouncing the “R”.

If Starbucks wanted to go overboard on this, it could create its own hardware to help baristas with this task. The setup would consist of:

  • A directional microphone array aimed at the customer’s mouth
  • A small computer with connectivity
  • A display to show results
  • A clip-on mic for the barista
  • Optional: a button to mark the start and end of speech

A trigger phrase such as “can I get your name” could mark the start of speech processing. The audio from the mic array would then stream to an ASR API constrained to names, and the top result plus the next two, each with a confidence level, would be shown on the display. As an alternative to the clip-on mic, the barista could use a push button to signal listening for the customer’s name.
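
To make that flow a bit more concrete, here is a rough Python sketch of the two pieces that do not depend on any particular ASR vendor: spotting the trigger phrase in the barista's transcript and picking the top three name hypotheses for the display. Everything named here (`Hypothesis`, `heard_trigger`, `format_display`) is made up for illustration, and the confidence values are assumed to come back from whatever names-constrained ASR service is used.

```python
from dataclasses import dataclass

# Trigger phrase from the post, lowercased for matching.
TRIGGER_PHRASE = "can i get your name"


@dataclass
class Hypothesis:
    """One name guess from a hypothetical names-constrained ASR service."""
    name: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def heard_trigger(barista_transcript: str) -> bool:
    """Return True if the barista's transcript contains the trigger phrase."""
    # Drop punctuation and collapse whitespace so "Can I get your name?" matches.
    cleaned = "".join(
        ch for ch in barista_transcript.lower() if ch.isalnum() or ch.isspace()
    )
    return TRIGGER_PHRASE in " ".join(cleaned.split())


def top_results(hypotheses, n=3):
    """Return the n most confident hypotheses, best first."""
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)[:n]


def format_display(hypotheses):
    """Build the lines for the display: top result plus the next two."""
    return [f"{h.name} ({h.confidence:.0%})" for h in top_results(hypotheses)]
```

In a real setup, `heard_trigger` (or the push button) would open the stream from the mic array, and `format_display` would run on whatever list of hypotheses the ASR service returns.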

If Starbucks app users are found to be within a geofence of that location, results could be further biased toward their names.
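
The geofence idea could look something like the sketch below. It is only an illustration under my own assumptions: app users are plain dictionaries with `name`, `lat`, and `lon` keys, the geofence is a simple radius around the store, and the bias is a fixed multiplier (`boost`) applied to matching names before renormalizing. None of this reflects how the Starbucks app actually works.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def nearby_names(store, users, radius_m=50.0):
    """Lowercased names of app users inside the (assumed circular) geofence."""
    store_lat, store_lon = store
    return {
        u["name"].lower()
        for u in users
        if haversine_m(store_lat, store_lon, u["lat"], u["lon"]) <= radius_m
    }


def bias_scores(scores, names, boost=2.0):
    """Multiply scores of names found nearby by `boost`, then renormalize."""
    boosted = {
        name: score * (boost if name.lower() in names else 1.0)
        for name, score in scores.items()
    }
    total = sum(boosted.values())
    if total <= 0:
        return boosted
    return {name: score / total for name, score in boosted.items()}
```

With a nearby app user named Leor, a raw ranking that put “Lior” first could flip in favor of “Leor” once the boost is applied. The boost factor and radius would need tuning so that a nearby name does not override what was clearly said.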

I won’t hold it against them that they spelled Leor with an “i”.

Earmuffs for the young’uns, but maybe this is the reason why name issues existed before…

