For more than four years, Sophie Stalla-Bourdillon has been the Senior Privacy Counsel and Legal Engineer at Immuta. She’s also a member of the Brussels Privacy Hub’s executive team at VUB University and a visiting professor at the University of Southampton Law School (UK) where she held the chair in IT law and Data Governance from 2018 to 2022.
Additionally, she’s the author and co-author of several legal articles, chapters, and books on data protection and privacy, as well as the editor-in-chief of the Computer Law and Security Review, a leading international journal of technology law. She’s served as a legal and data privacy expert for the European Commission, the Council of Europe, the Organization for Security and Co-operation in Europe, and the Organisation for Economic Co-operation and Development, and holds a Master’s degree in English and North American Business Law from Panthéon-Sorbonne University (Paris I, France), in addition to an LLM degree from Cornell Law School (NY, USA).
Sophie is a frequent contributor for Immuta; her work includes articles such as The Next Generation of CCPA Compliance Requirements and 5 Steps for Building a GDPR-Compliant Data Strategy.
With artificial intelligence (AI) and child data both recently making waves due to controversial misuse, I thought Sophie was the ideal colleague to sit down with to discuss what these subjects mean for data and privacy law.
In our chat, we delve into data security compliance challenges and how to ensure that data from new technologies is used legally and responsibly.
Alec Gannon: As our Senior Privacy Counsel and Legal Engineer at Immuta, I thought this would be a great opportunity to discuss what data security challenges organizations are facing, especially in terms of compliance with evolving privacy laws. To start, what are some pressing data security challenges at the moment?
Sophie Stalla-Bourdillon: The first ones I’d note are data discovery and data classification. It’s a challenge knowing where your data is, how to classify it, and how to assign a sensitivity level to it. These decisions are hard to make if you don’t have an overview of the data flows.
Next, you need to be able to protect the data. Data policy authoring is a challenge as it’s not always obvious how high-level guidelines should be translated into operational rules. Data engineers can get lost in translation. That’s why we need interdisciplinary teams to set the rules and define enforcement mechanisms. And then you must detect malicious, unauthorized behavior that may jeopardize the status of your data.
Data is often meant to be reused by different teams within and across organizations, so you want to strike the right balance between reuse and protection. This means avoiding lengthy workflows, but at the same time, making sure that you’ve got proper safeguards in place.
Finally, detecting unauthorized behavior is a challenge. For example, if you have not implemented the least privilege or data minimisation principles, it’s hard to monitor whether someone has the authorization to access the data. Therefore, it’d be difficult to detect malicious behavior as early as possible and react to the insider threat.
AG: You mentioned the importance of classifying data based on levels of sensitivity. Why is child data at the forefront of the sensitive data conversation at the moment?
SSB: There is a lot of attention on child data, and there are several attempts to further regulate processing activities targeting child data. Understanding whether you’ve got child data is a challenge if you don’t have a process in place to detect whether a user is a child. This is a hot topic with fines, such as the recent TikTok fine in the U.K.
Also, this is the reason why ChatGPT got banned in Italy. The services weren’t able to distinguish between a child user and an adult user, and therefore could not do their risk assessment properly.
AG: What are your views on the Online Safety Bill that is in development at the moment? Is enough being done to protect children’s data?
SSB: I have some mixed feelings on this topic. It’s one thing to acknowledge the need to protect children, but it’s another thing to find the right instruments to do so. By imposing monitoring obligations on platforms, you are somehow giving them the incentive to do more processing activities and more profiling, which could lead to a variety of harms, including discrimination. It’s hard to find the right balance between protection and free flow.
I don’t think that the Online Safety Bill strikes the right balance at the moment, because it essentially relies on tech companies, effectively making them the content regulators and gatekeepers. There is not much transparency, and they are pushed to use age verification and age estimation techniques.
We also need more privacy-enhancing technologies (PETs) to do these things. Otherwise, we risk actually pushing for more intensive processing activities, which can be a problem if we end up profiling individuals by trying to observe every minute of their online activity. There is, however, a debate about the maturity of these PETs, and not all data protection regulators necessarily share the same views. Once we acknowledge that each PET has its own set of assumptions, constraints, and limits, it becomes easier to understand what is really achieved when it is implemented.
AG: You mentioned ChatGPT, which is another hot topic at the moment due to the European Commission proposing the European Union Artificial Intelligence (EU AI) Act. What are your thoughts on the development of this Act and how effective it might be?
SSB: It’s a complex piece of legislation whose primary aim is to govern the deployment of high-risk AI, though the development of high-risk AI will also be impacted. There is an attempt to govern the development and deployment of general-purpose AI which may be used for high-risk applications, but there are also exclusions, which could undermine the approach. This is why the GDPR remains fully relevant, both at the training and service provision stages. The EU has also been adopting a whole regulatory package recently, including the Digital Services Act, which targets, among other things, search engines. So it’s a matter of understanding what the service behind the model is, to fully understand how it is regulated.
AG: Do you think that AI is developing more quickly than the regulations can keep up with?
SSB: That has always been the challenge when it comes to law and technology, although the law should have the capability to adapt because it’s meant to be technology-neutral. So in theory, you should be able to apply the law when the technology is evolving. That requires a bit of creativity because you cannot anticipate the technology and the legalese might feel odd at times, but with a principle-based approach as we have in the General Data Protection Regulation (GDPR), the law should still be able to frame practices, even before the deployment of the technology.
That said, the big challenge remains enforcement. You must have the strength to enforce the law, which is challenging because the supervisory authorities in the data protection space don’t always have the means to enforce it. There’s lots of lobbying and pressure, while the authorities don’t have all the necessary resources.
AG: Are there any data regulations that could shift the way organizations will need to think about managing their data?
SSB: Again, it’s about enforcement. A good example is regulators asking for a sudden change of practices, which is what happened with ChatGPT in Italy. They said, “From now on, you need to either comply with the law or stop what you’re doing.” And because ChatGPT couldn’t comply with the law, they decided not to deliver the service. Since then, however, the Italian Supervisory Authority has given more details about what compliance could look like in this context, and we will see whether OpenAI can meet the request.
If supervisory authorities use these kinds of orders more often, then that could become a game changer. There is also a potentially very powerful remedy that has already been used by the FTC in the US, called algorithmic disgorgement. The idea is that not only does the organization have to stop the unlawful processing and delete the improperly obtained data, but it also has to destroy the algorithm trained with that data. So if this is used across sectors more often, it could impact practices significantly.
AG: How might the Immuta Data Security Platform help protect customer data that is reused for analytics and AI training purposes?
SSB: Immuta is a tool to govern access to data and create the audit trail. It makes it possible to operationalise the least privilege, purpose limitation, and data minimisation principles. It’s also possible to break down processing activities by phase and refine data access requirements over time, which is particularly useful when privacy-preserving techniques are used. Immuta offers a set of privacy-preserving techniques that can be used for machine learning purposes, such as randomized response and k-anonymization. But the use cases go beyond that, so any data-driven business unit setting data access rules can use Immuta to govern access and usage. However, it needs to be clear that Immuta is only a piece of software; fundamental questions, such as whether a processing purpose is legitimate or whether the residual risk generated by the processing is acceptable, can’t be answered by the software.
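To give a flavour of one of the techniques Sophie mentions, here is a minimal, hypothetical sketch of randomized response in Python. This illustrates the general technique only, not Immuta’s implementation: each respondent reports their true answer only with a given probability and a random answer otherwise, which gives individuals plausible deniability while still letting an analyst estimate the aggregate rate.

```python
import random

def randomized_response(true_answer: bool, p_truth: float = 0.5) -> bool:
    """With probability p_truth, report the true answer;
    otherwise report a uniformly random answer."""
    if random.random() < p_truth:
        return true_answer
    return random.random() < 0.5

def estimate_true_rate(responses, p_truth: float = 0.5) -> float:
    """Recover an unbiased estimate of the true 'yes' rate.
    Observed rate = p_truth * true_rate + (1 - p_truth) * 0.5,
    so we invert that relationship."""
    observed = sum(responses) / len(responses)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Simulate 100,000 respondents, ~30% of whom would truthfully answer "yes".
random.seed(0)
truths = [random.random() < 0.3 for _ in range(100_000)]
noisy = [randomized_response(t) for t in truths]
print(round(estimate_true_rate(noisy), 2))  # close to 0.30
```

No individual noisy answer reveals the respondent’s true answer with certainty, yet the population-level estimate remains accurate; the `p_truth` parameter trades off privacy against estimation accuracy.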