About a year ago, our oldest daughter was learning her colours and picked up a bad association. While reading Scooby-Doo’s Color Mystery, she confused the words “hat” and “yellow”. The result was that anything yellow was now called the colour “hat”: “The bird has a hat beak” or “the banana is hat”.
This reminded me of a few instances from when we were first creating our NLU engine for the Ubi. We had a bad set of training data that led to even worse performance than we had expected. We were building a classifier for timers and alarms: “Remind me in 10 minutes to check on the pasta” or “Set an alarm for 2 minutes from now”. Figuring out and catching the intent of an utterance could be very difficult, but extracting the entity even more so.
In the examples above, depending on the amount of training, the system might take the reminder entity to be “to check on the pasta” or “check on the pasta”. Or, it could misidentify “from now” as the reminder entity rather than resolving the timing entity as “[present time] + 00:02:00.00”.
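To make the entity-extraction problem concrete, here is a minimal sketch of a rule-based parser for such utterances. This is a hypothetical illustration, not the Ubi engine (which used a trained classifier); the pattern, function names, and entity labels are all made up for this example. Notice how easy it is to get the span boundaries wrong, e.g. keeping the leading “to” or swallowing “from now” into the reminder text.

```python
import re
from datetime import datetime, timedelta

# Hypothetical, simplified extractor for timer/reminder utterances.
# A real NLU engine would use trained intent and entity models, not a regex.
TIMER_PATTERN = re.compile(
    r"(?:remind me in|set an alarm for)\s+(\d+)\s+(second|minute|hour)s?"
    r"(?:\s+from now)?"          # consume "from now" so it isn't mistaken for a reminder
    r"(?:\s+to\s+(.*))?",        # optional task, captured without the leading "to"
    re.IGNORECASE,
)

def parse_utterance(utterance, now=None):
    """Return (intent, entities) for a timer-style utterance, or (None, {})."""
    now = now or datetime.now()
    match = TIMER_PATTERN.search(utterance)
    if not match:
        return None, {}
    amount, unit, task = match.groups()
    delta = timedelta(**{unit.lower() + "s": int(amount)})
    # The timing entity resolves to [present time] + the requested offset.
    entities = {"trigger_time": now + delta}
    if task:
        # Store "check on the pasta", not "to check on the pasta".
        entities["reminder"] = task.strip()
    return "set_timer", entities
```

A rule-based parser like this is brittle in the opposite way to a poorly trained classifier: it never guesses, but it also never generalizes beyond its patterns.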
Machine learning has huge potential for time savings, shortcutting tedious processes through automation; however, it’s only as good as its training. Bad teaching makes for bad learning, but even with good teaching, wrong knowledge can be acquired.