Bolstering LLM Security to Protect Against OWASP Top Ten Threats

We’re all witness to the buzz around Generative AI (GenAI). These offerings, which convert user prompts into model-generated text, images, and videos, have permeated our professional and personal lives as they become increasingly accessible.

One incredibly prevalent type of GenAI is the Large Language Model (LLM), which includes the likes of ChatGPT and Google Gemini. You may already have used an LLM to write threatening letters of complaint, ghost-write your hilarious yet endearing wedding speeches, or rephrase the impolite things you would like to say to your colleagues, vendors, or customers to sound somewhat more professional. Whether for personal or business reasons, the use of LLMS is more popular than ever – and even more important to how businesses operate.

Regardless of how you’ve used LLMs, you’re not alone. Respondents to the 2024 State of Data Security Report reported that 88% of their employees are using GenAI, whether the company has officially adopted it or not. It’s easy to understand why – LLMs can provide fast, consumable access to information, shorten menial or manual tasks, and help users expand on new ideas, amongst other benefits.

The risks associated with LLMs, however, are often misunderstood or overlooked. And on top of this, about half of today’s data leaders claim that their data security strategy is failing to keep up with the pace of AI evolution. Over my twelve years in technical consulting and implementation, I’ve navigated the security and privacy implications of many exciting, powerful new technologies. In this blog, we’ll look at some of the top threats you need to consider when implementing or using LLMs, and how they can be proactively combated.

[Read More]: DBTA Report: Data Governance & Security for the Cloud & AI Era

OWASP Top 10 for Large Language Model Applications

The Open Worldwide Application Security Project (OWASP) is a community-led nonprofit organization with a mission to foster secure software application development and deployment. Since 2003, OWASP has produced its “Top Ten,” an awareness document that presents a consensus view of the 10 most critical security risks for web applications.

At the end of 2023, OWASP published the Top 10 for Large Language Model Applications to specifically address the challenges facing modern LLM development. By raising awareness of these agreed-upon vulnerabilities, the organization aims to improve the way we develop, deploy, and utilize LLMs moving forward.

Most Pressing Threats to LLM Security

While each of the risks in OWASP’s top 10 list poses a legitimate threat to LLM development and use, we’ll focus on two of what we believe are the most dangerous – model poisoning and prompt injection. By examining these in detail, we can better understand their relevance – and how best to mitigate them.

Model Poisoning Threats

Let’s first consider how LLMs are created. As with any AI or Machine Learning (ML) model, LLMs are trained on large data sets. These data sets are the food that’s fed to the models, and they completely determine how the model behaves and responds to user prompts. A “healthy” diet will result in a model that returns relevant, well-written results. Eat something bad, however, and you might just end up with food poisoning.

Once a model has been trained, it’s not possible to interrogate it to determine which data was used in its training. But, by manipulating the input data, it is possible to alter or skew the outputs. This form of attack is called model poisoning, and its goal is to reduce the model’s accuracy by injecting incorrect, biased, or deceptive data into the training process.

It is extremely difficult to detect model poisoning attacks, especially when the data being used to train the model appears plausible. In addition, LLMs often draw data from non-curated sources, such as the open Internet. An adversary could deliberately plant mass amounts of false information in these public data sources, or modify this data, to poison the models that are trained on them.

To mitigate model poisoning threats, it is imperative to:

  • Properly secure access to training data
  • Verify the veracity of the training data’s lineage
  • Simplify continuous security monitoring and auditing
  • Automate where possible to detect changes to source training data

These measures will ensure there is no unauthorized access to training data, and that any modifications to these data sets are detected and logged, allowing for a retrospective analysis of the data on which a model was trained.

Prompt Injection Threats

Another threat to LLMs comes in the form of prompt injection, which occurs after the model has been trained and deployed. Similar to SQL injection – one of the most common web hacking techniques – LLMs can be manipulated  by injecting malicious inputs disguised as legitimate prompts. An adversary is able to craft an input that appears legitimate, but is designed to elicit a different, often confidential response.

Prompt injection is a particularly concerning threat for Retrieval-Augmented Generation (RAG)-assisted LLMs. RAG-based AI models use a combination of LLMs and external data sources to generate grounded, accurate, and contextually relevant responses. This means that the model has access to both training data and real-time information, making it more vulnerable to a prompt injection attack.

To manipulate the LLM, a malicious user could craft a prompt and directly input it into the LLM, upload files containing harmful instructions, or ask the LLM to summarize documents containing the hostile prompt. LLMs by nature do not distinguish between user instructions and external data – making prompt injection much more difficult to protect against than SQL injection, where inputs can be more easily sanitized.

There are a number of methods for mitigating prompt injection attacks, including:

  • Monitoring for and detecting  malicious prompt attempts
  • Treating your LLMs as you would untrusted users in your data ecosystem
  • Enforcing data security and privacy controls on both training data and real-time data sources

This multifaceted approach to protecting all data the LLM may access will significantly reduce the risk of disclosing sensitive data or harmful information posed by malicious prompts.

Enforcing LLM Security at Every Layer

To properly mitigate threats and empower your organization to leverage GenAI – whether in the form of an LLM, RAG, or any other form of AI/ML – we must consider and three layers of model security:

  1. The storage layer, where the training and retrieval data is stored at rest.
  2. The data layer, where data is transformed or “chunked” to be used for data engineering and model training.
  3. The prompt layer, where users interact directly with the models.

[Read More]: Immuta Introduces Multi-Layered Security for RAG-Based GenAI

As evidenced by the craftiness of model poisoning and prompt injection attacks, protecting solely the data storage layer is no longer sufficient for LLM security. Instead, you need to maintain comprehensive and consistent controls on each layer in order to ensure that the data being ingested and shared by the model is accurate, secure, and appropriate for user consumption.

When enforcing LLM security, there are three core facets to consider:

  1. Data Discovery and Classification – You must understand which data is sensitive to treat and protect it appropriately. This may include manual cataloging, ideally with automated sensitive data discovery and classification, which changes depending on the context in which the data resides.
  2. Data Access-Control – You must enforce fine-grained access control in a way that is transparent, federated, and scalable. Consider using modern access control techniques such as attribute-based access control (ABAC) or purpose-based access control (PBAC), which each add a level of extra flexibility and contextual information to your ability to grant or deny data access.
  3. Monitoring and Auditing – It is essential to ensure that both user and LLM activity are monitored for any suspicious behavior, especially where prompt injection and model poisoning are involved. Keep accurate, available, and transparent records of activity on your data, ideally including simplified review/search and automatic alerting of anomalous actions.

The Immuta Data Security Platform gives you each of these dynamic capabilities, helping to enforce consistent security across the various layers of your RAG-based GenAI and LLM tools. To learn more about enforcing holistic LLM security and mitigating the risk of these attacks, request a demo from our team. For insight into how more than 700 data professionals are leveraging and protecting their AI models, check out The AI Security & Governance Report.

The AI Security & Governance Report

See how 700+ data professionals are leveraging and protecting their AI models.

Read More
Blog

Related stories