We’re about to see quite a few devices hit the market with Alexa Voice Service (AVS) built in but missing the hands-free wake word that’s found on the Echo. Instead, these devices are “push-to-talk”, meaning that you need to push a button to initiate the voice interaction.
Part of the reason this implementation of AVS will initially be more popular is that it doesn’t require a review from Amazon before products ship. Amazon only requires a review once there’s a hands-free trigger.
There are two primary modes when it comes to push-to-talk:
- Press, release, then speak.
- Push, hold, speak, then release.
With press-and-release, implementations typically play a beep to indicate that the device is listening; when the end of speech is detected, a second acknowledgement sounds, followed by the response. This approach can end up listening for long periods, especially when there is background noise, while it waits for a quiet period to mark the end of speech.
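To make the background-noise problem concrete, here is a minimal sketch of silence-based endpointing for the press-and-release mode. Everything in it is an assumption for illustration: the function name, the per-frame energy values, the threshold, and especially the `max_frames` cap, which is one possible way to stop a noisy room from keeping the device listening indefinitely. It is not how AVS or any particular device does it.

```python
from typing import Iterable, Tuple


def find_end_of_speech(
    frame_energies: Iterable[float],
    silence_threshold: float = 0.1,    # hypothetical: energy below this counts as quiet
    trailing_silence_frames: int = 5,  # hypothetical: quiet frames needed to end the utterance
    max_frames: int = 100,             # hypothetical: hard cap on how long we listen
) -> Tuple[int, str]:
    """Return (frames consumed, reason listening stopped)."""
    quiet_run = 0
    heard_speech = False
    consumed = 0
    for energy in frame_energies:
        consumed += 1
        if energy >= silence_threshold:
            heard_speech = True
            quiet_run = 0
        else:
            quiet_run += 1
        # Normal case: speech followed by a long enough quiet period.
        if heard_speech and quiet_run >= trailing_silence_frames:
            return consumed, "silence"
        # Noisy case: no quiet period ever arrives, so give up at the cap.
        if consumed >= max_frames:
            return consumed, "timeout"
    return consumed, "stream_ended"


# Quiet room: speech, then silence, so the endpoint is found quickly.
quiet_room = [0.0] * 2 + [0.8] * 10 + [0.0] * 20
# Noisy room: the noise floor stays above the threshold, so only the cap stops us.
noisy_room = [0.8] * 10 + [0.3] * 200

print(find_end_of_speech(quiet_room))
print(find_end_of_speech(noisy_room))
```

In the quiet room the sketch stops a few frames after the speech ends; in the noisy room it never sees a quiet period and only stops because of the cap.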
With push-and-hold, the benefit is that there’s no chance of the endpointer missing the end of speech, because releasing the button marks it. The downside is that the user needs to hold down the button while speaking. However, this could also be spun in a positive way: it means that the user must stay within arm’s reach in order to speak to the device.
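The push-and-hold mode is simple enough to sketch in a few lines, because the button itself defines where the utterance starts and ends. The class and method names below are hypothetical, and a real device would stream audio to the service instead of buffering it, so treat this as an illustration of the idea only.

```python
from typing import List


class HoldToTalkSession:
    """Capture audio only while the button is held down (hypothetical helper)."""

    def __init__(self) -> None:
        self.listening = False
        self._frames: List[bytes] = []

    def on_button_down(self) -> None:
        # Start of speech: begin a fresh capture.
        self._frames = []
        self.listening = True

    def on_audio_frame(self, frame: bytes) -> None:
        # Frames that arrive while the button is up are ignored.
        if self.listening:
            self._frames.append(frame)

    def on_button_up(self) -> bytes:
        # End of speech: the release is the endpoint, no silence detection needed.
        self.listening = False
        return b"".join(self._frames)


session = HoldToTalkSession()
session.on_audio_frame(b"xx")   # button not pressed yet, dropped
session.on_button_down()
session.on_audio_frame(b"ab")
session.on_audio_frame(b"cd")
utterance = session.on_button_up()
session.on_audio_frame(b"zz")   # button released, dropped
```

Note that there is no threshold or timeout to tune here, which is exactly the appeal of this mode.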
Personally, I prefer push-and-hold, but it ultimately depends on the interface. If the device uses a soft button (one that doesn’t depress significantly when pushed), then the user is more likely to press the speak button once and then release. However, if the device has a physical, spring-loaded button with a lot of travel, then a push-and-hold interaction makes sense, similar to a CB radio.
Over and out!