Immuta Enhances Starburst & Databricks Integrations, and Introduces Dynamic Query Classification 

We are excited to announce Immuta’s latest product updates, including:

  1. General availability of our v2 integration with Starburst (Trino)
  2. Support for Databricks runtime 11.3
  3. Immuta’s Data Security Framework with Dynamic Query Classification

Let’s explore these features in more detail:

General Availability: Enhancements for Starburst & Databricks

Starburst (Trino) Integration

Immuta’s Starburst (Trino) integration is now generally available for all customers. Our goal has always been to make policy orchestration seamless for our customers, and this new integration advances that goal by bypassing the need to go through an Immuta catalog, while optimizing workflows and making it easier to explore existing catalogs.

Key benefits of Immuta’s Starburst integration include:

  • Simplified operations and improved data governance
    With Immuta’s proven policy engine, complex policy statements can be modeled and applied at scale. The integration with Starburst allows users to extend these policies to their queries and ensure data is being used in compliance with regulations, as well as internal rules and policies.
  • Enhanced data security
    Data teams can easily apply dynamic data masking, row-level security, and other security controls to their Trino queries. This ensures that sensitive data is always protected and accessible only by authorized users.
  • Increased agility and faster data access
    Data teams need to move at the pace of business. By providing secured and governed access to data on Trino, which allows data to be queried regardless of where it is stored, Immuta allows users to quickly access the data they need for exploration and analysis. This ultimately accelerates value delivery and innovation.

Databricks Runtime 11.3 LTS Support

Immuta now supports Databricks Runtime version 11.3 LTS and enables customers to use Immuta’s Databricks Spark integration with Unity Catalog.

Private Preview: Immuta Data Security Framework with Dynamic Query Classification

As part of our Immuta Detect offering, we added a new feature for data sensitivity classification called Dynamic Query Classification. Out of the box, this feature automatically classifies data according to the Immuta Data Security Framework (DSF), attaching sensitivity levels to classifications informed by current global best practices.

The Immuta DSF is designed to cover a wide variety of compliance postures, including those consistent with major data privacy regulations and security standards: GDPR, CCPA, GLBA, HIPAA, and PCI. The classification feature can be customized and extended to match evolving regulations, industry standards such as ISO 27001 or NIST, and bespoke corporate policies.

Data classification is an essential component of a data security platform, and Dynamic Query Classification makes it easier for users to categorize their data based on various dimensions such as entity type, data subject-matter, identifiability level, and data subject type, while taking context into account. Data classification requires two discrete steps: entity identification and sensitivity classification. Immuta’s existing sensitive data discovery (SDD) feature comprises the entity identification layer, while the new Dynamic Query Classification feature builds upon the SDD output to power the sensitivity classification.

The Immuta DSF can classify data at rest and powers Dynamic Query Classification, which classifies the sensitivity of executed queries. For example, a table of addresses may not be sensitive on its own, but when joined with patient data, the addresses would also be classified as PHI.

Let’s take a look at a hypothetical scenario in which a data team in the healthcare industry may implement a simple framework to solve this complicated HIPAA data classification problem.

To define this HIPAA Framework, the team creates a new framework called HIPAA, introduces a classification tag called HIPAA.PHI, and sets its sensitivity level at “2” – above other forms of non-medical personal information. Next, they introduce the following rules:

  1. Tag columns as HIPAA.PHI that are directly tagged as Immuta DSF.Health
  2. Tag columns as HIPAA.PHI that have neighboring columns tagged as HIPAA.PHI

Using these rules, Immuta’s data classification understands that HIPAA.PHI is any data which is already understood by the DSF to be health data (Rule 1), as well as any other data stored or accessed in conjunction with HIPAA.PHI (Rule 2). This addresses a common over-tagging problem under HIPAA: an address column next to a column tagged HIPAA.PHI must also be considered protected health information, while addresses outside of the medical data context should not be treated as such.
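To make the two rules concrete, here is a minimal Python sketch of how they might behave on a single table. It is an illustration only and not Immuta's API: the function name, the dictionary layout, and the "Address" and "Person Name" tags are hypothetical, and "neighboring columns" is assumed to mean columns in the same table.

```python
# Hypothetical illustration only: none of these names are Immuta's API.
DSF_HEALTH = "Immuta DSF.Health"
HIPAA_PHI = "HIPAA.PHI"


def apply_hipaa_rules(columns: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return a copy of the column -> tags mapping with HIPAA.PHI applied."""
    tagged = {name: set(tags) for name, tags in columns.items()}

    # Rule 1: columns directly tagged Immuta DSF.Health become HIPAA.PHI.
    for tags in tagged.values():
        if DSF_HEALTH in tags:
            tags.add(HIPAA_PHI)

    # Rule 2: columns with a neighbor tagged HIPAA.PHI become HIPAA.PHI.
    # Assumption: every column in the same table counts as a neighbor.
    if any(HIPAA_PHI in tags for tags in tagged.values()):
        for tags in tagged.values():
            tags.add(HIPAA_PHI)

    return tagged


# An address stored next to health data versus an address stored elsewhere.
patients = {"diagnosis": {DSF_HEALTH}, "address": {"Address"}}
mailing_list = {"name": {"Person Name"}, "address": {"Address"}}

patients_tagged = apply_hipaa_rules(patients)
mailing_tagged = apply_hipaa_rules(mailing_list)
```

In this sketch the address column in the patients table picks up HIPAA.PHI, while the address column in the mailing list table does not, which is the distinction the two rules are meant to draw.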

What happens if the hospital uses a proprietary medical record number format that is not recognized by Immuta? There are many possible approaches, but here the hospital simply uses a custom pattern under SDD to tag columns matching the format with Hospital.Medical Record Number, and edits the Immuta DSF to include columns tagged as Hospital.Medical Record Number in the definition of Immuta DSF.Health.
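The idea behind a custom pattern can be sketched in Python with a regular expression. Everything specific here is an assumption for illustration: the record number format ("MRN-" followed by eight digits), the 90% match threshold, and the function names are hypothetical and do not describe how Immuta's SDD is configured or implemented.

```python
import re

# Hypothetical proprietary format: "MRN-" followed by eight digits.
MRN_PATTERN = re.compile(r"MRN-\d{8}")
MRN_TAG = "Hospital.Medical Record Number"


def matches_mrn(values: list[str], threshold: float = 0.9) -> bool:
    """True when at least `threshold` of the non-empty sample values match."""
    sample = [v for v in values if v]
    if not sample:
        return False
    hits = sum(1 for v in sample if MRN_PATTERN.fullmatch(v))
    return hits / len(sample) >= threshold


def tag_columns(samples: dict[str, list[str]]) -> dict[str, set[str]]:
    """Tag each column whose sampled values match the record number format."""
    return {
        col: ({MRN_TAG} if matches_mrn(vals) else set())
        for col, vals in samples.items()
    }


column_samples = {
    "record_id": ["MRN-00012345", "MRN-99887766"],
    "zip": ["02110", "10001"],
}
column_tags = tag_columns(column_samples)
```

Here only the record_id column receives the Hospital.Medical Record Number tag. Once that tag is included in the definition of Immuta DSF.Health, Rule 1 of the HIPAA framework would treat such columns as HIPAA.PHI.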

Now, whenever PHI is joined to address data in a query, Immuta’s Dynamic Query Classification will classify the address data as PHI and present the occurrence in Immuta Detect.
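The query-time behavior can be sketched in the same spirit. This Python example is a simplified assumption, not Immuta's implementation: the table and column names are hypothetical, and every column read by a single query is treated as a neighbor of the others.

```python
HIPAA_PHI = "HIPAA.PHI"

# Hypothetical at-rest tags, keyed by (table, column).
AT_REST_TAGS = {
    ("patients", "diagnosis"): {HIPAA_PHI},
    ("patients", "patient_id"): {HIPAA_PHI},
    ("addresses", "street"): set(),
    ("addresses", "patient_id"): set(),
}


def classify_query(columns_read):
    """Classify the columns a query reads, treating them all as neighbors."""
    # Copy the at-rest tags so query-time results never alter them.
    tags = {col: set(AT_REST_TAGS.get(col, set())) for col in columns_read}
    if any(HIPAA_PHI in t for t in tags.values()):
        for t in tags.values():
            t.add(HIPAA_PHI)
    return tags


# Reading addresses alone versus joining them to patient data.
alone = classify_query([("addresses", "street")])
joined = classify_query([("patients", "diagnosis"), ("addresses", "street")])
```

In the sketch, the street column carries no HIPAA.PHI tag when queried alone but is classified as HIPAA.PHI in the joined query, while its at-rest tags stay unchanged.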

Dynamic Query Classification can also detect accidental re-identification of de-identified data. For example, suppose de-identified HIV status information is kept as part of a drug trial study and is stored separately from patient contact information. While neither the tables containing patient contact information nor the de-identified HIV status data is particularly sensitive alone, the join of the two creates highly sensitive medical information warranting additional audit and control.

The Immuta DSF is available by default when the feature is enabled, providing customers with an out-of-the-box framework pre-populated with rules to classify data discovered and tagged by Immuta’s native SDD.

Overall, Immuta’s Dynamic Query Classification enhances Immuta’s Data Security Platform by providing customers with a scalable and automated way to label their data according to sensitivity levels and meet regulatory compliance requirements.

Self-Managed Updates

For customers managing their own instance of Immuta, we have released the following updates:

  • Snowflake table grants to GA
  • Support for Databricks runtime 11.3 LTS
  • Starburst (Trino) integration to GA
  • Subscription policy default to GA
  • Low row access policy mode for Snowflake to Public Preview

Next Steps

Immuta is committed to revolutionizing data security management by removing the complexities of data access control and governance. Our platform incorporates the latest cutting-edge features that enable data teams to streamline processes and optimize ROI for cloud data platforms such as Starburst, BigQuery, Snowflake, and Databricks. With Immuta, you can easily manage and secure your data, saving time and resources while achieving maximum efficiency.

See for yourself how it works – schedule a demo with our team today.
