AI and cybersecurity have been inextricably linked for many years. The good guys use AI to analyze incoming data packets and help block malicious activity while the bad guys use AI to find and create gaps in their targets’ security. AI has contributed to the ever-escalating arms race.
AI has been used to strengthen defense systems by analyzing vast amounts of incoming traffic at machine speed, and identifying known and emergent patterns. As criminals, hackers, and nation-states deploy more and more sophisticated attacks, AI tools are used to block some of those attacks, and aid human defenders by escalating only the most critical or complex attack behaviors.
Also: How AI can improve cybersecurity by harnessing diversity
But attackers also have access to AI systems, and they have become more sophisticated both in finding exploits and in using technologies like AI to force-multiply their cadre of criminal masterminds. That sounds hyperbolic, but the bad guys seem to have no shortage of very talented programmers who — motivated by money, fear, or ideology to cause damage — are using their talents to attack infrastructure.
None of this is new, and it’s been an ongoing challenge for years. Here’s what is new: There’s a new class of targets — the business value AI system (we mostly call them chatbots). In this article, I’ll provide some background on how — using firewalls — we’ve protected business value in the past, and how a new breed of firewall is just now being developed and tested to protect challenges unique to operating and relying on AI chatbots in the commercial arena.
Understanding firewalls
The kinds of attacks and defenses practiced by traditional (yes, it’s been long enough that we can call it “traditional”) AI-based cybersecurity occurs in the network and transport layers of the network stack. The OSI model is a conceptual framework developed by the International Organization for Standardization for understanding and communicating the various operational layers of a modern network.
The network layer routes packets across networks, while the transport layer manages data transmission, ensuring reliability and flow control between end systems.
Also: Want to work in AI? How to pivot your career in 5 steps
Occurring in layers 3 and 4 respectively of the OSI network model, traditional attacks have been fairly close to the hardware and wiring of the network and fairly far from layer 7, the application layer. It’s way up in the application layer that most of the applications we humans rely on daily get to do their thing. Here’s another way to think about this: The network infrastructure plumbing lives in the lower layers, but business value lives in layer 7.
The network and transport layers are like the underground chain of interconnecting caverns and passageways connecting buildings in a city, serving as conduits for deliveries and waste disposal, among other things. The application layer is like those pretty storefronts, where the customers do their shopping.
In the digital world, network firewalls have long been on the front lines, defending against layer 3 and 4 attacks. They can scan data as it arrives, determine if there’s a payload hidden in a packet, and block activity from locations deemed particularly troubling.
Also: Employees input sensitive data into generative AI tools despite the risks
But there’s another kind of firewall that’s been around for a while, the web application firewall, or WAF. Its job is to block activity that occurs at the web application level.
A WAF monitors, filters, and blocks malicious HTTP traffic; prevents SQL injection and cross-site scripting (XSS) attacks, injection flaws, broken authentication, and sensitive data exposure; provides custom rule sets for application-specific protections; and mitigates DDoS attacks, among other protections. In other words, it keeps bad people from doing bad things to good web pages.
We’re now starting to see AI firewalls that protect level 7 data (the business value) at the AI chatbot level. Before we can discuss how firewalls might protect that data, it’s useful to understand how AI chatbots can be attacked.
When bad people attack good AI chatbots
In the past year or so, we have seen the rise of practical, working generative AI. This new variant of AI doesn’t just live in ChatGPT. Companies are deploying it everywhere, but especially in customer-facing front ends to user support, self-driven sales assistance, and even in medical diagnostics.
Also: AI is transforming organizations everywhere. How these 6 companies are leading the way
There are four approaches to attacking AI chatbots. Because these AI solutions are so new, these approaches are still mostly theoretical, but expect real-life hackers to go down these paths in the next year or so.
Adversarial attacks: The journal ScienceNews discusses how exploits can attack the ways AI models work. Researchers are constructing phrases or prompts that seem valid to an AI model but are designed to manipulate its responses or cause some kind of error. The goal is to cause the AI model to potentially reveal sensitive information, break security protocols, or respond in a way that could be used to embarrass its operator.
I discussed a very simplistic variation of this sort of attack when a user fed misleading prompts into the unprotected chatbot interface for Chevrolet of Watsonville. Things did not go well.
Indirect prompt injection: More and more chatbots will now read active web pages as part of their conversations with users. Those web pages can contain anything. Normally, when an AI system scrapes a website’s content, it is smart enough to distinguish between human-readable text containing knowledge to process, and supporting code and directives for formatting the web page.
Also: We’re not ready for the impact of generative AI on elections
But attackers can attempt to embed instructions and formatting into those webpages that fool whatever is reading them, which could manipulate an AI model into divulging personal or sensitive information. This is a potentially huge danger, because AI models rely heavily on data sourced from the wide, wild internet. MIT researchers have explored this problem and have concluded that “AI chatbots are a security disaster.”
Data poisoning: This is where — I’m fairly convinced — that developers of large language models (LLMs) are going out of their way to shoot themselves in their virtual feet. Data poisoning is the practice of inserting bad training data into language models during development, essentially the equivalent of taking a geography class about the spherical nature of the planet from the Flat Earth Society. The idea is to push in spurious, erroneous, or purposely misleading data during the formation of the LLM so that it later spouts incorrect information.
My favorite example of this is when Google licensed Stack Overflow’s content for its Gemini LLM. Stack Overflow is one of the largest online developer-support forums with more than 100 million developers participating. But as any developer who has used the site for more than five minutes knows, for every one lucid and helpful answer, there are five to 10 ridiculous answers and probably 20 more answers arguing the validity of all the answers.
Also: The best VPN services of 2024: Expert tested
Training Gemini using that data means that not only will Gemini have a trove of unique and valuable answers to all kinds of programming problems, but it will also have an enormous collection of answers that result in terrible results.
Now, imagine if hackers know that Stack Overflow data will be regularly used to train Gemini (and they do because it’s been covered by ZDNET and other tech outlets): They can construct questions and answers deliberately designed to mislead Gemini and its users.
Distributed denial of service: If you didn’t think a DDoS could be used against an AI chatbot, think again. Every AI query requires an enormous amount of data and compute resources. If a hacker is flooding a chatbot with queries, they could potentially slow down or freeze its responses.
Additionally, many vertical chatbots license AI APIs from vendors like ChatGPT. A high rate of spurious queries could increase the cost for those licensees if they’re paying using metered access. If a hacker artificially increases the number of API calls used, the API licensee may exceed their licensed quota or face substantially increased charges from the AI provider.
Defending against AI attacks
Because chatbots are becoming critical components of business value infrastructure, their continued operation is essential. The integrity of the business value provided must also be protected. This has given rise to a new form of firewall, one specifically designed to protect AI infrastructure.
Also: How does ChatGPT actually work?
We’re just beginning to see generative AI firewalls like the Firewall for AI service announced by edge network security firm Cloudflare. Cloudflare’s firewall sits between the chatbot interface in the application and the LLM itself, intercepting API calls from the application before they reach the LLM (the brain of the AI implementation). The firewall also intercepts responses to the API calls, validating those responses against malicious activity.
Among the protections provided by this new form of firewall is sensitive data detection (SDD). SDD is not new to web application firewalls, but the potential for a chatbot to surface unintended sensitive data is considerable, so enforcing data protection rules between the AI model and the business application adds an important layer of security.
Additionally, this prevents people using the chatbot — for example, employees internal to a company — from sharing sensitive business information with an AI model provided by an external company like OpenAI. This security mode helps prevent information from going into the overall knowledge base of the public model.
Also: Is AI in software engineering reaching an ‘Oppenheimer moment’? Here’s what you need to know
Cloudflare’s AI firewall, once deployed, is also intended to manage model abuses, a form of prompt injection and adversarial attack intended to corrupt the output from the model. Cloudflare specifically calls out this use case:
A common use case we hear from customers of our AI Gateway is that they want to avoid their application generating toxic, offensive, or problematic language. The risks of not controlling the outcome of the model include reputational damage and harm to the end user by providing an unreliable response.
There are other ways that a web application firewall can mitigate attacks, particularly when it comes to a volumetric attack like query bombing, which effectively becomes a special-purpose DDoS. The firewall employs rate-limiting features that slow down the speed and volume of queries, and filter out those that appear to be designed specifically to break the API.
Not entirely ready for prime time
According to Cloudflare, protections against volumetric DDoS-style attacks and sensitive data detection can be deployed now by customers. However, the prompt validation features — basically, the heavily AI-centric features of the AI firewall — are still under development and will enter beta in the coming months.
Also: Generative AI filled us with wonder in 2023 – but all magic comes with a price
Normally, I wouldn’t want to talk about a product at this early stage of development, but I think it’s important to showcase how AI has entered mainstream business application infrastructure use to the point where it’s both a subject of attack, and where substantial work is being done to provide AI-based defenses.
Stay tuned. We’ll be keeping track of AI deployments and how they change the contours of the business application world. We’ll also be looking at the security issues and how companies can keep those deployments safe.
IT has always been an arms race. AI just brings a new class of arms to deploy and defend.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter on Substack, and follow me on Twitter at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.