Anonymization and the Future of Data Science

Managing data privacy is an increasingly difficult challenge for massive corporations littered with data silos. New data regulations, from the EU to the US to China, illustrate that this challenge is really just beginning.

This trend underscores the importance of anonymization, one of the most important tools in a data scientist’s “privacy toolbox.” Data anonymization is a technique for protecting private information in your data while preserving, to varying degrees, the utility of that data. However, as we’ll see, this tool is best used in combination with others, not as a standalone strategy to protect your data.

What’s this thesis based upon? In a few words, the very real limits to anonymization. And, of course, Judd Apatow.

