Have you ever looked at a turtle and thought it was a rifle? I’m willing to bet that most of you have not. This may sound like an absurd case, but it is exactly what happened when researchers at MIT (Massachusetts Institute of Technology) were trying to find vulnerabilities in machine learning systems developed by Google. They subtly altered the pattern on the surface of a 3D-printed plastic turtle, and the seemingly unchanged and harmless turtle was classified by Google’s image classifier as a rifle. To a human eye it was still plainly a plastic turtle, so where did the algorithm go wrong?
Hallucinations like this have given the field of artificial intelligence (AI) a few teething pains. Adversarial machine learning is a relatively new area of machine learning research that studies them. Its researchers try to break machine learning algorithms by altering the input very slightly, so that the algorithm mistakes one thing for another or misses a signal completely. By seeing what fools the system and what doesn’t, they hope to reverse engineer what a neural network is actually learning from the data it is trained on.
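To make the idea of "very slightly altering the input" concrete, here is a minimal sketch in Python. It is a toy and rests on several assumptions: the "classifier" is a made-up linear model with random weights, not Google’s system and not the method the MIT researchers used, and the names `score`, `classify` and `perturb` are invented for illustration. Each pixel is nudged by at most 0.01 (on a 0 to 1 scale) in whichever direction raises the "rifle" score, an idea in the spirit of what the research literature calls the fast gradient sign method.

```python
import random

def score(weights, bias, pixels):
    """Toy linear model: a positive score means 'rifle', otherwise 'turtle'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias

def classify(weights, bias, pixels):
    return "rifle" if score(weights, bias, pixels) > 0 else "turtle"

def perturb(weights, pixels, epsilon):
    """Move every pixel by at most epsilon in the direction that raises the score."""
    adversarial = []
    for w, p in zip(weights, pixels):
        step = epsilon if w > 0 else -epsilon
        adversarial.append(min(1.0, max(0.0, p + step)))  # keep pixels within [0, 1]
    return adversarial

random.seed(0)
n_pixels = 10_000  # a small "image"; real photos have far more pixels

# Hypothetical weights and image, chosen at random purely for illustration.
weights = [random.uniform(-1.0, 1.0) for _ in range(n_pixels)]
image = [random.uniform(0.2, 0.8) for _ in range(n_pixels)]

# Choose the bias so that the clean image sits on the 'turtle' side (score of about -1).
bias = -1.0 - sum(w * p for w, p in zip(weights, image))

adversarial = perturb(weights, image, epsilon=0.01)
largest_change = max(abs(a - p) for a, p in zip(adversarial, image))

print("clean image:    ", classify(weights, bias, image))
print("perturbed image:", classify(weights, bias, adversarial))
print("largest change to any single pixel:", largest_change)
```

In this toy setup the label flips from turtle to rifle even though no pixel moves by more than about one percent of its range. Each individual change is tiny, but thousands of them all push the score the same way, and together they are enough to cross the decision boundary.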
Most machine learning systems behave like a black box. They are powerful because they learn how to perform a task directly from data sets: during training, the model’s internal parameters are adjusted again and again until its outputs match the examples as closely as possible. What happens between input and output, however, is relatively opaque, and adversarial machine learning researchers are trying to open that black box.
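What "tweaking the parameters" means can be sketched with a deliberately tiny example. The assumptions here are mine: the model has just two parameters (`w` and `b`), the data are four made-up points lying on the line y = 2x + 1, and the tweaking rule is plain gradient descent, which is one common choice rather than the only one. Real image classifiers work on the same principle but with millions of parameters, which is where the opacity comes from.

```python
def train(points, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly nudging w and b to reduce the squared error."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Step each parameter a little in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Made-up training data generated from y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(points)
print(w, b)  # close to 2 and 1
```

Nobody tells the program that the answer is "2 and 1"; it arrives there only by reducing its error on the examples. With two parameters we can read off what was learned, but with millions of them we cannot, at least not directly.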
Researchers know that these mistakes happen, but do not yet understand exactly why or how. The main speculation is that the visual world is very complex and an image is usually made of millions of pixels. Given this large amount of information per image, the algorithm has to find patterns within each image and then across images before it can come up with a classification (in the original case, whether it’s a turtle or a rifle). Some datasets may therefore simply not be large enough to capture the essence of objects and the differences between them. Another hypothesis is that humans and computers may be looking at different things when trying to classify an image. In other words, humans might just be missing out on certain pixel-level details which are essential for computer classification.
AI is being given new responsibilities, from self-driving cars to the personal assistants in our pockets. This newfound responsibility highlights why we need to know how machine learning systems work and where their blind spots may lie. We cannot afford to have our machines hallucinate whilst driving and miss the stop signs.
Matsakis, L. (2019). Artificial Intelligence May Not ‘Hallucinate’ After All. Wired. Retrieved 27 October 2020, from https://www.wired.com/story/adversarial-examples-ai-may-not-hallucinate/.
Metz, C. (2016). How To Fool AI Into Seeing Something That Isn’t There. Wired. Retrieved 27 October 2020, from https://www.wired.com/2016/07/fool-ai-seeing-something-isnt/.
Simonite, T. (2018). AI Has a Hallucination Problem That’s Proving Tough to Fix. Wired. Retrieved 27 October 2020, from https://www.wired.com/story/ai-has-a-hallucination-problem-thats-proving-tough-to-fix/.