Over the next year, more people will start to recognize the threat of spoofed voices. While Lyrebird’s current demo sounds mechanical, once you adjust prosody and expression, add cell-call distortion, and pair it with a spoofed caller ID, these voices can be convincing enough to create havoc in the right social engineering contexts.
First, check out Business Insider’s post…
The reporter does a great demo of trying a few lines on his mother. Within 30 seconds, the recordings could be made far more convincing by adding pauses or even random “uh huh” or “yeah” interjections.
Even random statements and interruptions can create the illusion of speaking to a real person. Don’t believe it? Check out these Jolly Roger calls and how their bots string along telemarketers and scammers.
The amount of brain matter devoted to auditory processing in humans is far smaller than that devoted to vision. The result is that it’s much easier to dupe us when we can’t see the person speaking. It’s why we can tell CGI faces look “a little off” but find it much harder to discern a faked voice.
However, the next technology we will need to deploy is AI-based voice authentication: systems that will soon detect a fake better than we can. Part of that technology may need to establish a chain of authority for all audio capture.
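A chain of authority for audio capture could, in a minimal sketch, mean hash-chaining audio chunks as they are recorded and signing the result with a key held by the capture device, so any later edit is detectable. Everything here is hypothetical: `DEVICE_KEY`, `sign_capture`, and `verify_capture` are illustrative names, and a real system would use asymmetric signatures anchored in secure hardware rather than a shared HMAC secret.

```python
import hashlib
import hmac

# Hypothetical per-device secret; in practice this would live in secure hardware.
DEVICE_KEY = b"hypothetical-device-secret"

def sign_capture(audio_chunks, key=DEVICE_KEY):
    """Hash-chain the audio chunks in order, then sign the final digest."""
    chain = hashlib.sha256(b"capture-start").digest()
    for chunk in audio_chunks:
        # Each link commits to all previous audio, so reordering or
        # editing any chunk changes the final digest.
        chain = hashlib.sha256(chain + chunk).digest()
    tag = hmac.new(key, chain, hashlib.sha256).hexdigest()
    return tag

def verify_capture(audio_chunks, tag, key=DEVICE_KEY):
    """Recompute the chain and check the signature in constant time."""
    expected = sign_capture(audio_chunks, key)
    return hmac.compare_digest(expected, tag)

chunks = [b"frame-001", b"frame-002", b"frame-003"]
tag = sign_capture(chunks)
print(verify_capture(chunks, tag))                        # True: untampered
print(verify_capture([b"frame-00X"] + chunks[1:], tag))   # False: edited audio
```

The design choice to chain the hashes (rather than hash each chunk independently) means a verifier can prove not just that each frame is authentic but that the whole recording is complete and in its original order.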