What’s the Worst That Could Happen? A Guide to AI Risks

While generative artificial intelligence (AI), foundation models, and large language models (LLMs) are often described as the future of AI as we know it, their mass adoption is not necessarily straightforward. The emergence of these types of AI models has sparked recent concerns, leading to a series of open letters, enforcement orders (against OpenAI in Italy), and legal actions, including suits against Stability AI in the UK and the US, and against OpenAI in the US.

This post is the second in a series intended to discuss AI and data security, including some of the regulatory challenges and risks associated with today’s AI space. We’ll unpack the main security and privacy risks associated with AI and highlight regulatory trends.

AI Risks: The Big Picture

Despite the excitement that AI often triggers within the data science community, many things can go wrong. This next generation of AI does not make privacy and security harms disappear. Quite the contrary – it makes them more likely.

As customers move to the cloud to increase computing power and build or leverage AI to reduce inefficiencies, they are confronted with a list of significant challenges. From how to strengthen their overall security posture, to how to ensure that biases have been properly mitigated, data security, platform, and governance teams often find themselves in uncharted territory.

The Immuta team has been observing and assessing AI practices for a few years now. In a 2019 white paper, in an attempt to unpack the future of privacy and security in the age of Machine Learning (ML), we distinguished between two types of harm:

  • Informational harms relate to the unintended or unanticipated leakage of information, specifically that which is contained within the training data. This could be the result of various attacks performed through ML model querying, such as membership inference, model inversion, and model extraction. ML models’ ability to memorize information about the training data set makes this possibility more likely.
  • Behavioral harms, on the other hand, relate to manipulating the behavior of the model itself, impacting its predictions or outcomes. This can be done through poisoning, in which malicious data is inserted into the training data, or evasion, in which input data is crafted to intentionally cause the system to misclassify it.
Since ML falls within the AI umbrella, these harms are relevant in assessing potential threats of modern AI models. These harms can come about as a result of certain failure modes in model development and deployment, as we’ll explore in the next section.
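To make the informational harm concrete, here is a deliberately simplified sketch of a membership inference attack. The "model" is a stand-in function (our own invention, not any real system) whose confidence scores are higher on memorized training records; the attacker, seeing only those scores, thresholds them to guess membership:

```python
import random

def make_toy_model(training_set):
    """Return a black-box scoring function that simulates memorization:
    confidently high scores on training members, lower scores elsewhere."""
    def confidence(record):
        rng = random.Random(record)       # deterministic per record, for illustration
        if record in training_set:        # memorized -> high confidence
            return rng.uniform(0.85, 0.99)
        return rng.uniform(0.50, 0.75)    # unseen -> lower confidence
    return confidence

def infer_membership(query, record, threshold=0.80):
    """Attacker's view: only the model's confidence score is available."""
    return query(record) > threshold

model = make_toy_model({"alice", "bob", "carol"})
print(infer_membership(model, "alice"))    # True  (record was in the training data)
print(infer_membership(model, "mallory"))  # False (record was not)
```

Real attacks are statistical rather than this clean, but the mechanism is the same: the model’s behavior on a record leaks whether that record was used to train it.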

AI Failure Modes

Failure modes refer to the potential ways in which something is liable to fail. In the context of AI risks, it’s important to investigate AI failure modes so that we can proactively understand how to mitigate them.

In our white paper on Data Protection by Process, we unpacked a comprehensive list of failure modes associated with these two types of high-level harm, such as overly large training set sizes and unauthorized access to training data. These modes are relevant for ML-based models, including AI applications.

The primary failure modes to consider are:

  • Confidentiality: Confidentiality requires that data be accessed only by authorized users, whether it is in the form of database tables, documents, models, or other assets. Access controls need to be in place for each of these in order to mitigate the risk of confidentiality failures.
  • Data Minimization: Data minimization requires that data be strictly necessary, sufficiently narrow, and timely. This ties objectives and data together – broad objectives often call for significantly more data than narrow ones. By implication, general purpose models tend to need more data.
  • Integrity: Integrity requires that data remain unaltered unless modification is expressly authorized. Failure occurs when data has been purposefully or inadvertently altered or corrupted, for instance through data poisoning. This presents a significant challenge to the existing AI workflow, particularly with regard to foundation models, since they are trained on bulk data collected from the open web. Data curation and lineage will take on renewed significance in avoiding failures of integrity.
  • Fairness: Although there are many definitions of fairness, the focus is usually on the impact of the model output on an individual addressee. Failure occurs when a model returns results that violate an individual’s rights, for instance through inaccurate results or unfounded predictions.
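The confidentiality failure mode reduces to a simple invariant: every asset (table, document, model) has an explicit allow-list, and anything not on it is denied. A minimal deny-by-default sketch, with hypothetical asset and role names of our own choosing:

```python
# Hypothetical allow-list mapping each asset (table, document, model)
# to the roles permitted to read it.
ACL = {
    "customers_table":  {"data_engineer", "analyst"},
    "training_corpus":  {"data_engineer"},
    "fine_tuned_model": {"ml_engineer"},
}

def can_access(role, asset):
    """Deny by default: access is granted only if the asset is known
    and the role appears on its allow-list."""
    return role in ACL.get(asset, set())

print(can_access("analyst", "customers_table"))  # True
print(can_access("analyst", "training_corpus"))  # False: not on the list
print(can_access("analyst", "unknown_asset"))    # False: deny by default
```

Production systems layer attribute- and purpose-based policies on top of this, but the deny-by-default default is what keeps an unlisted asset, including a model artifact, from leaking.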
AI-as-a-Service

With the democratization of AI, a new business model is emerging: AI-as-a-service. Large operations like OpenAI host models that can either be directly queried for general purposes or fine-tuned for more specific customer use cases. But it’s important to think about this business through the lens of the failure modes and harms discussed above.

Let’s consider the case where models are queried directly. Large models are largely focused on satisfying very broad use cases, which requires consuming vast amounts of publicly available data. This breadth creates tension among three failure modes: data minimization, data integrity, and fairness.

1. Data Minimization: The volume of data needed in this case can flirt with data minimization failure modes, because it’s easy to gorge on more data than needed.
2. Data Integrity: At this scale, assuring that the data is of sufficient quality is extremely difficult, inviting failure modes associated with loss of integrity. The awareness that open sources are used to train models makes these public data repositories tempting targets for data poisoning attacks. In addition, the high cost of training a foundation model can limit how frequently training occurs, so the data used to train the model can quickly become stale.
3. Fairness: No model can support innumerable use cases, making it likely that some will not be well modeled despite the vast amounts of data used. In these cases, fairness failures are likely to occur.
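The data minimization point above can be enforced mechanically: tie each objective to an explicit field allow-list and drop everything else before the data reaches a training pipeline. A sketch, with a hypothetical objective and field names invented for illustration:

```python
# Hypothetical field allow-list per objective: a narrow objective
# justifies far fewer fields than "collect everything."
OBJECTIVE_FIELDS = {
    "churn_model": {"tenure_months", "monthly_spend", "support_tickets"},
}

def minimize(records, objective):
    """Keep only the fields justified by the stated objective."""
    allowed = OBJECTIVE_FIELDS[objective]
    return [{k: v for k, v in r.items() if k in allowed} for r in records]

raw = [{"name": "Alice", "email": "a@example.com",
        "tenure_months": 14, "monthly_spend": 42.0, "support_tickets": 1}]
print(minimize(raw, "churn_model"))
# [{'tenure_months': 14, 'monthly_spend': 42.0, 'support_tickets': 1}]
```

A general purpose model has no narrow objective to key this filter on, which is exactly why data minimization is in tension with the AI-as-a-service model.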

The fine-tuning workflow can solve some of these challenges, but it introduces others. Fine-tuning can shrink the set of supported use cases, allowing for better scoping of the model, the type of data that is consumed, and the provenance of that data. But it is not a panacea for avoiding AI failure modes.

While LLMs are typically trained on data retrieved from the broad internet, fine-tuned models often require domain-specific data. This information is more likely to be sensitive or confidential in nature, and its use invites greater risk of confidentiality failures. The likelihood of such a failure is compounded by how the model is trained and where it is hosted. A customer will need to know how their data is shared as part of fine-tuning or prompt augmentation processes in order to control and enhance their security posture. Additionally, where the fine-tuned model lives depends partially on which input model is used as a starting point. OpenAI, for example, supports fine-tuning over its API, with the resulting model living within OpenAI’s infrastructure and outside the model owner’s direct control.
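One mitigation for this confidentiality risk is to scrub likely sensitive values from fine-tuning examples before they leave the customer's control. A minimal sketch, assuming a couple of illustrative regex patterns; a real deployment would use a vetted PII detection service rather than hand-rolled expressions:

```python
import re

# Hypothetical patterns for illustration only; real PII detection
# needs far broader coverage than two regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text):
    """Replace likely PII with placeholder tokens before the record
    is shipped to an external fine-tuning or prompt-augmentation API."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

example = "Contact jane.doe@example.com, SSN 123-45-6789, about her claim."
print(scrub(example))
# Contact [EMAIL], SSN [SSN], about her claim.
```

Scrubbing reduces what a memorizing model can leak, but it does not answer the second question raised above: where the fine-tuned model lives and who controls it.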

AI Regulation: What To Watch For

Regulators across jurisdictions have been carefully monitoring trends, and a handful are considering adopting new rules meant to frame AI practices and prevent harm. The EU is taking a particularly proactive stance towards AI and input models, while the US and UK have been more hands-off (so far).

That said, local legislatures in the US are coming up with their own innovative rules. For example, New York City Local Law 144 requires employers to perform bias audits on automated employment decision tools and to notify employees and candidates about the use of such tools. Outside the US, Canada is pushing for a more pragmatic stance on AI design, development, and use through its light-touch Bill C-27, which includes a section dedicated to AI. But the most noteworthy legislation to date is the EU AI Act.

What Is the EU AI Act?

The proposed EU AI Act is a horizontal framework that distinguishes the risk levels of different AI applications and requires more stringent controls as risk increases. The act’s primary risk levels are:

  • Limited/minimal risk: Broadly available and widely used applications, such as spam filters and inventory management systems.
  • High risk: More complicated applications that require careful safety verifications and “conformity assessments.” These are often related to individuals’ rights and daily lives, and may include tools that scan resumes, evaluate test scores, or determine creditworthiness.
  • Unacceptable risk: Applications that are used to manipulate individuals, exploit vulnerabilities, or engage in social scoring that impacts quality of life. Examples may include predictive policing, and deepfakes that are used to spread disinformation.

These tiers track the impact of model failure: a failed spam filter causes annoyance, a failed applicant screener could cause loss of income, and a failed predictive policing bot could cause loss of freedom. The initial version of the AI Act prohibits practices that exploit vulnerable groups, such as children or persons with disabilities; social scoring by public authorities; and, with some exceptions, the use of ‘real time’ remote biometric identification systems by law enforcement in public spaces.

The EU AI Act’s latest versions include a set of provisions expressly targeting general purpose or foundation models and imposing transparency obligations on their providers, which should find their way into model cards. While providers and deployers are the legislation’s main targets, obligations like data governance have implications for the development phase, be it pre-training or fine-tuning. In addition, given the democratization and general-purpose nature of today’s AI, deployers and users will want to make sure actual usage aligns with the model’s intended purpose in order to control their risk levels.

This legislative effervescence means that informed customers will try to anticipate changes in the regulatory environment and assess their options. Organizations developing or leveraging AI should stay vigilant about their security practices and up to speed on how these and other regulations evolve.

What’s Next?

There is a growing awareness of, and effort to temper, the promises of LLMs and other generative AI with their risks. As we’ve explained in this blog, the risks are real and the failure modes are liable to cause harm. But they are similar to what we have seen in the past. Now, we need to understand the potential scale of harm and how to mitigate these risks in the rapidly changing world of AI.

To get a more in-depth look at failure modes, check out Data Protection by Process.

Missed the first blog in our AI security series, Types of AI & Input Models? Get it here.
