Editor's Note: This is the second in a series of four articles featured on the UMGC Global Media Center during Cybersecurity Awareness Month.
Artificial intelligence (AI), used across numerous industries to drive decision-making, efficiency and process improvement, is the future of computing. For all its great potential in sectors such as health care, education, finance, agriculture and automotive, to name just a few, AI does have vulnerabilities that affect its intended use and security.
One such vulnerability, hard to detect but easy to exploit, is "poisoning" of the data used to train AI models. The term "poisoning" is somewhat misleading: it does not disable the AI. Instead, the aim of poisoning is to misdirect the AI, behaving more like smoke and mirrors than like cyanide.
One recent example of this vulnerability affected Google's Gmail. In 2018, email spammers used a poisoning attack to bias spam filters into misidentifying malicious email as legitimate. As a result, messages that should have been marked as dangerous and routed to the spam folder landed in inboxes instead. Another example, also affecting a Google platform, was the use of carefully crafted viruses as poisoned data inputs to Google's VirusTotal system, an aggregator of antivirus programs. Here, the attack caused ordinary computer files to be marked as malware and potentially quarantined or even deleted.
A common misconception about poisoning is that the algorithm is the victim. It is not: the algorithm is fixed in program source and cannot be modified at runtime. Nor will a poisoned AI necessarily generate errors. In fact, without dedicated defense mechanisms in place, we may never realize we have been poisoned.
Poisoning corrupts the model that the AI algorithm learns from its training data. It does so by targeting specific elements within that data and skewing the outcome. If we picture AI-based decision-making as a decision tree, the features and labels (the branches and nodes of the tree) are the affected model components. For example, a model trained to predict that students who took an introductory programming class will like the follow-on course could, once poisoned, inaccurately predict the opposite.
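The programming-class example above can be sketched in a few lines of code. This is a minimal, synthetic illustration (assuming scikit-learn is available), in which an attacker flips the labels of only the targeted group of training examples; the data and feature names are invented for the demonstration.

```python
# A minimal sketch of label-flipping poisoning against a decision tree.
# Feature: whether a student took the intro class (1) or not (0).
# Label: whether they like the follow-on course. All data is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Clean training data: students who took the intro class (feature = 1)
# like the follow-on class (label = 1); others do not.
X = rng.integers(0, 2, size=(200, 1))
y = X.ravel().copy()

clean_model = DecisionTreeClassifier().fit(X, y)

# Poisoning: the attacker flips labels for the targeted group only,
# so "took the intro class" now maps to "dislikes the follow-on class".
y_poisoned = y.copy()
y_poisoned[X.ravel() == 1] = 0

poisoned_model = DecisionTreeClassifier().fit(X, y_poisoned)

student = np.array([[1]])  # a student who took the intro class
print(clean_model.predict(student)[0])     # 1: predicts they like the course
print(poisoned_model.predict(student)[0])  # 0: the poisoned model predicts the opposite
```

Note that nothing about the training algorithm changed between the two models; only the data did, which is exactly why poisoning is hard to spot from the code alone.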
Fortunately, validated methods to protect AI implementations from data poisoning threats are available, and they focus on the data. For example, we can sanitize training data before allowing it into our algorithm, much as a web application sanitizes user input to avoid SQL injection. Further, we can attempt to generalize the training data, so that the specific elements a poison targets lose their ability to influence the prediction model.
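One simple way to sanitize training data is to drop any example whose label disagrees with the majority vote of its nearest neighbors, since a mislabeled (possibly poisoned) point tends to sit among points of the opposite label. The sketch below is one illustrative approach, not the only sanitization method; it assumes scikit-learn and uses invented synthetic data.

```python
# A minimal sketch of training-data sanitization via nearest-neighbor voting.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def sanitize(X, y, k=5):
    """Keep only points whose label matches the vote of their k nearest neighbors."""
    votes = np.empty_like(y)
    for i in range(len(X)):
        # Fit on everything except point i, then let its neighbors vote on its label.
        mask = np.arange(len(X)) != i
        knn = KNeighborsClassifier(n_neighbors=k).fit(X[mask], y[mask])
        votes[i] = knn.predict(X[i:i + 1])[0]
    keep = votes == y
    return X[keep], y[keep]

# Synthetic data: a cluster near 0 labeled 0, a cluster near 5 labeled 1,
# plus a handful of poisoned points near 0 that are mislabeled as 1.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0, 0.3, (50, 1)),
                    rng.normal(5, 0.3, (50, 1)),
                    rng.normal(0, 0.3, (5, 1))])
y = np.array([0] * 50 + [1] * 50 + [1] * 5)

X_clean, y_clean = sanitize(X, y)
print(len(X), "->", len(X_clean))  # the mislabeled points are largely filtered out
```

Like input sanitization in web applications, this filter runs before the data ever reaches the learning algorithm, which is what makes it a static defense in the sense discussed below.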
These defenses are static, however: each time new training data is collected, we must rerun the sanitization or generalization process. Effective as they are, they cannot adapt on their own. Yet we can build dynamic defenses, ones that learn and adapt to evolving threats, if we treat the poisoning vulnerability as a problem for AI itself to solve.
In the future, newer forms of AI may hold a clue to such a dynamic defense, for example generative adversarial networks (GANs), in which one AI trains another. GANs are a powerful form of learning because one network evaluates what the other generates more rapidly and precisely than any human could. A GAN-style evaluator could detect when predictions based on the trained model diverge from expected outcomes.
As AI continues to evolve, so, too, will its vulnerabilities. Poisoning is just one example. While no one knows for certain how prevalent poisoning is, we do know how expansive the AI landscape has become: over 90 percent of businesses have some form of AI in their technology portfolios. Other vulnerabilities, such as cryptographic backdoors and language models corrupted through text prompts, are possible, and more are surely yet to be discovered.
Unfortunately, if industry does not begin to adjust its cybersecurity defenses to cover AI infrastructure, the future may not be as bright as we would hope. The negative effects of poisoning could be far worse than spam email or quarantined files, reaching into stock trading, autonomous vehicles, or even weapons systems trying to identify targets.
One way forward would be private-public partnerships that co-develop defensive technologies. Academia has a strong grasp of AI vulnerabilities, and the research literature holds a variety of potential countermeasures to threats such as poisoning. Partnerships could carry that theory and concept through to an applied-science conclusion.
Jason Pittman is a collegiate associate professor at University of Maryland Global Campus in the School of Cybersecurity and Information Technology.