What is Sensitive Data and How Should I Protect It?

If you’ve done any work with data, you’ve likely heard the term “sensitive data.” Data compliance regulations like the GDPR, HIPAA, CCPA, and more make repeated mentions of “sensitive personal data” and “sensitive personal information,” and many privacy statements include similar language. A quick search of the term “sensitive data” returns more than 1 billion search results.

It’s clear that this term is relevant to modern data practices, across operations like data collection, storage, analysis, information sharing, and overall data usage. But what does it actually mean? We routinely provide our names, addresses, credit card numbers, and phone numbers to third parties, but how much of this data is considered sensitive? In this post, we’ll define sensitive data, describe the difference between sensitive and personal data, and share how organizations can secure it without inhibiting its value.

What is Sensitive Data?

Sensitive data is information that, if exposed, could jeopardize the security, privacy, or integrity of the individuals or organizations it describes. If unauthorized users or bad actors were to access this data, the person or group to which it belongs would very likely be harmed.

Examples of sensitive data include things like:

  • Demographic Information: gender, race, ethnic origin, religion
  • Financial Information: credit card numbers, bank account numbers, account balances
  • Health Data: biometric information, genetic data, health plan numbers

If exposed, these types of information could lead to dangerous repercussions for the individuals to whom they are attributed. This is why sensitive data necessitates a higher bar of data security and privacy than other types of information.

Sensitive Data vs. Personal Data: What’s the Difference?

We know that sensitive data can reveal information about an individual that they would likely not want publicly available. How, then, does it differ from personal data? Where is the line between information that is simply “personal” and that which is vulnerable enough to be deemed “sensitive”?

NIST defines personally identifiable information (PII), its term for personal data, as “Information that can be used to distinguish or trace an individual’s identity, either alone or when combined with other information that is linked or linkable to a specific individual.” Similarly, the GDPR defines personal data as “any information relating to an identified or identifiable natural person.” In each case, we see that personal data is a broad category of information about an individual that could be used to identify them.

Sensitive data is a more refined categorization of personal information. Think of it this way: geometrically speaking, every square also technically fits the definition of a rectangle. However, not every rectangle can be categorized as a square. In the same way, sensitive data can fall under the broader umbrella of personal data, but not all personal data is going to be delicate enough to be called sensitive.

How to Identify Sensitive Data

With the delineation between sensitive data and personal data in mind, how should data security and governance teams identify their most sensitive data? Besides the various legal definitions of the term, there are standard qualifiers that organizations use to determine which of their data is sensitive. The CIA triad, a model used throughout NIST’s security guidance, outlines these qualifiers:

  1. Confidentiality: How confidential is the nature of the data? Confidentiality is akin to data privacy, as it is focused primarily on preventing data from being accessed by or disclosed to those who have no right to see and/or use it. However, it should not limit availability unnecessarily to those who do have the rights to access it. Information considered highly confidential can aptly be described as sensitive data.
  2. Integrity: How well has the original nature of the data been preserved? Integrity is measured by the data’s authenticity and by how well it is guarded against modification and destruction. It must be maintained throughout the data’s lifecycle, as the data is stored, accessed, and analyzed. While this should apply to any data use, the integrity of sensitive data is paramount to its usefulness, as well as to the personal safety of the data subject.
  3. Availability: How available is the data to users within a data ecosystem? Availability is critical to data use and analysis. This is where platform and security teams need to strike an effective balance between data access and data security, making sure users can access the data they need while ensuring that it is sufficiently protected from a leak or breach. Sensitive data can be assessed by how available it should be within an ecosystem – the more sensitive the information, the less widely available it should be.

With these qualifiers in mind, teams can more effectively determine which of their data is the most sensitive and, therefore, the most crucial to protect.
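
As a rough illustration of how these qualifiers might be applied, the sketch below rates a dataset by the worse of its confidentiality and integrity impact, then maps that rating to how widely the data should be available. The three-level scale, the audience labels, and the function names are all assumptions made for this example, not part of any standard or product.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Illustrative three-level scale; the levels are an assumption."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Hypothetical mapping from sensitivity to how widely data should be available.
AUDIENCE = {
    Impact.LOW: "all employees",
    Impact.MODERATE: "approved teams",
    Impact.HIGH: "named individuals",
}


def sensitivity(confidentiality: Impact, integrity: Impact) -> Impact:
    """Rate a dataset by the worse of its confidentiality and integrity impact."""
    return max(confidentiality, integrity)


def recommended_audience(confidentiality: Impact, integrity: Impact) -> str:
    """The more sensitive the data, the narrower the audience."""
    return AUDIENCE[sensitivity(confidentiality, integrity)]


if __name__ == "__main__":
    # Health records: disclosure is very harmful, tampering is harmful too.
    print(recommended_audience(Impact.HIGH, Impact.MODERATE))
    # Public product catalog: little harm either way.
    print(recommended_audience(Impact.LOW, Impact.LOW))
```

Taking the maximum of the two ratings is a deliberately conservative choice: a dataset is treated as sensitive if either disclosure or tampering would cause serious harm.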

How to Protect Sensitive Data

Given the impact that its leakage could have on an individual’s safety and security, protecting sensitive data is at the core of virtually all modern data security initiatives. Organizations need to know that the sensitive data they collect is not at risk, both to preserve their data subjects’ safety and avoid the penalties and reputational damage associated with a security incident. To identify and proactively protect their sensitive data, data teams should take the following steps:

Discover Sensitive Data

You cannot effectively protect that which you don’t know about. When data is collected and ingested into a modern data stack, it should be reviewed as soon as possible to determine its level of sensitivity. Sensitive data discovery automatically scans, tags, and classifies data to provide a comprehensive view of an organization’s assets. This visibility is crucial to protecting sensitive data. Identifying and tracking sensitive data with consistent classifiers will set a baseline level of knowledge on which to build measures to secure it.
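
To make the idea of scanning and tagging concrete, here is a minimal sketch of pattern-based discovery. It assumes a table is simply a dictionary of column names to string values, and it tags a column when most of its values match a pattern. The patterns, the tag names, the 80% threshold, and the function names are illustrative assumptions; production discovery tools use far richer classifiers.

```python
import re

# Illustrative patterns only; real classifiers are considerably more thorough.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "credit_card": re.compile(r"^(?:\d[ -]?){13,19}$"),
}


def luhn_valid(number: str) -> bool:
    """Check a card number with the Luhn checksum to reduce false positives."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def matches(tag: str, value: str) -> bool:
    """Return True if a single value looks like the given kind of data."""
    if not PATTERNS[tag].match(value):
        return False
    return tag != "credit_card" or luhn_valid(value)


def classify_column(values, threshold=0.8):
    """Tag a column when at least `threshold` of its non-empty values match."""
    sample = [v.strip() for v in values if v and v.strip()]
    if not sample:
        return set()
    tags = set()
    for tag in PATTERNS:
        hits = sum(1 for v in sample if matches(tag, v))
        if hits / len(sample) >= threshold:
            tags.add(tag)
    return tags


def scan_table(table):
    """Map every column name to the set of sensitivity tags found in it."""
    return {name: classify_column(values) for name, values in table.items()}


if __name__ == "__main__":
    table = {
        "contact": ["a@example.com", "b@example.org"],
        "card": ["4111 1111 1111 1111"],
        "notes": ["hello", "call back"],
    }
    print(scan_table(table))
```

Tagging by column rather than by cell keeps the result stable when a few values are malformed, and the resulting tags can feed the access policies described in the next step.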

Secure Sensitive Data

Once sensitive data is properly identified, systems can be built to ensure its long-term security, confidentiality, and integrity. Writing and enforcing dynamic attribute-based access controls will help keep your sensitive data secured against unauthorized access. On top of access controls, privacy-enhancing technologies (PETs) like data masking, k-anonymization, and differential privacy further protect data against external and internal threats.
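
The following sketch shows one way attribute-based access control and data masking can work together: a user sees raw values only when their attributes satisfy the policy for every sensitive tag on a column, and sees a masked value otherwise. The `User` and `Column` types, the tag and attribute names, and the policy table are hypothetical and exist only for this example; this is not how any particular platform implements its policies.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class User:
    name: str
    attributes: FrozenSet[str]


@dataclass(frozen=True)
class Column:
    name: str
    tags: FrozenSet[str]


# Hypothetical policy: sensitivity tag -> user attribute needed to see raw values.
REQUIRED_ATTRIBUTE = {
    "financial": "finance_team",
    "health": "clinical_staff",
}


def can_view_raw(user: User, column: Column) -> bool:
    """Allow raw access only if the user holds the attribute for every governed tag."""
    return all(
        REQUIRED_ATTRIBUTE[tag] in user.attributes
        for tag in column.tags
        if tag in REQUIRED_ATTRIBUTE
    )


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def read_value(user: User, column: Column, value: str) -> str:
    """Return the raw value to authorized users and a masked value to everyone else."""
    return value if can_view_raw(user, column) else mask(value)


if __name__ == "__main__":
    card = Column("card_number", frozenset({"financial"}))
    analyst = User("analyst", frozenset({"marketing"}))
    accountant = User("accountant", frozenset({"finance_team"}))
    print(read_value(analyst, card, "4111111111111111"))
    print(read_value(accountant, card, "4111111111111111"))
```

Because the decision is computed from attributes at read time, changing a user's team or a column's tags changes what they can see without rewriting the policy itself.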

Monitor Sensitive Data

Even while protected by security controls, sensitive data needs to be consistently monitored for suspicious activity. Continuous data monitoring and anomaly detection can help teams track and log all data activity within their environment. This activity includes things like user queries, access behaviors, configuration and classification changes, and more, providing insight into who is doing what with which sensitive data. With detection capabilities, any sort of anomalous behavior can be identified and shared with security and governance teams in order to execute a timely and effective response.
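
As a simple illustration of anomaly detection on access behavior, the sketch below compares each user's count of sensitive-data queries today against their own history and flags large deviations using a z-score. The data layout, the threshold of three standard deviations, the minimum history length, and the function names are assumptions for this example; real monitoring systems typically combine many more signals.

```python
from statistics import mean, pstdev


def is_anomalous(history, today, z_threshold=3.0, min_history=5):
    """Flag today's query count if it sits far above the user's own baseline."""
    if len(history) < min_history:
        return False  # not enough data to establish a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase at all is a deviation.
        return today > mu
    return (today - mu) / sigma > z_threshold


def flag_users(daily_counts, today_counts):
    """Return the sorted names of users whose activity today looks anomalous.

    daily_counts: user -> list of past daily counts of sensitive-data queries
    today_counts: user -> count of sensitive-data queries so far today
    """
    return sorted(
        user
        for user, today in today_counts.items()
        if is_anomalous(daily_counts.get(user, []), today)
    )


if __name__ == "__main__":
    history = {
        "alice": [10, 12, 11, 9, 10, 11, 12],
        "bob": [3, 4, 2, 3, 5, 4, 3],
    }
    today = {"alice": 12, "bob": 60}
    print(flag_users(history, today))
```

Comparing each user against their own baseline, rather than a global one, avoids flagging users whose normal workload is simply heavier than their colleagues'.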

To see how the Immuta Data Security Platform gives teams the power to easily identify and secure their sensitive data, without inhibiting access for the users who need it most, schedule a demo with one of our security experts. If you’d like to learn more about protecting your sensitive information, download our Best Practices for Securing Sensitive Data eBook today!
