Machine Learning’s Dirty Secret

Underneath all the hype and the headlines and the money pouring into artificial intelligence and machine learning, there’s a dirty secret.

That secret? Almost no one knows how to use the technology at scale.

More precisely, only a small handful of organizations truly understand how to manage the risks of machine learning (ML) when it is deployed widely. Those risks include the legal, reputational, and ethical issues ML can create – from wildly offensive chatbots and image classifiers to furthering racial disparities across zip codes, and much more. And that’s not even taking into account the deceptively complex requirement of predicting how ML models will behave over long periods of time, or the impact on ML of new laws like the EU’s GDPR.

That’s why we’re thrilled to partner with the Future of Privacy Forum to release the first-ever guide to managing risk in machine learning, written specifically for practitioners.

You can download that whitepaper at

We partnered with the FPF because they’re one of the leading non-profit organizations focused on providing concrete guidance on privacy and data governance issues. Their expertise runs extremely deep, and their members represent some of the biggest companies on the planet, which made their input invaluable. That input helped us tailor our recommendations to the barriers standing in the way of ML adoption.

All of which is to say, this whitepaper is the culmination of a huge amount of work and collaboration.

On Immuta’s side, our legal engineering team spent months researching existing risk management frameworks for ML, drawing both on regulatory and compliance requirements and on best practices in data science. We dissected regulations like the Federal Reserve Board’s Supervision and Regulation Letter 11-7 and the European Central Bank’s Guide for the Targeted Review of Internal Models, among many others. We spent countless hours talking to practitioners about how they comply with these regulations and manage the risks of ML more generally.

What surprised us over and over again was how widespread that “dirty secret” is, and how hard it is to find practical guidance on the risks of deploying ML at scale, even as the technology receives growing attention from enterprises, the research community, and the media.

More often than not, data scientists create and deploy machine learning models in relative isolation and in one-off projects within a line of business. There’s frequently no common framework or system to replicate their experiments, or to transition responsibility for the model if they switch teams. (For more on the replicability crisis, I recommend Pete Warden’s fantastic overview of the subject on his blog.) Meanwhile, oversight processes are frequently plagued with inconsistencies. All these problems make it incredibly difficult for organizations to effectively govern those models over extended periods of time.

And so we’ve learned, both from our research and from our direct experience, how big the void is when it comes to practical guidance for managing the risks of ML. We hope this whitepaper helps to fill that void. And, in the process, we hope it allows both data scientists and compliance personnel to create better, more powerful, and more compliant models.

Click here to download the whitepaper, and email us at with any feedback. We’d love to hear from you!