The Immuta team is excited to announce our latest 2.2 release. Below are the highlights of the new major features included in the release.
Prior to Immuta 2.2, all policies were tightly coupled to the data source in Immuta. For example, if you exposed an Oracle table in Immuta, you could mask specifically named columns in that table through our natural language policy builder: “mask the phone number column for everyone”. These policies are typically built by the user exposing the data, our persona known as a “Data Owner”.
However, in many cases the Data Owner knows a good deal about the data but very little about the policies that should be enforced on it. We introduced Global Policies to solve this problem. Global Policies allow users with the “Governance” permission, typically your legal and/or compliance users, to build policies at a global level that semantically reference where they should be enforced. Going back to our example, rather than the Data Owner having to mask the “phone number” column by name, a Global Policy can say “mask anywhere there’s a phone number”, and that policy is propagated down to all the appropriate Immuta data sources. To do this, Immuta must understand where all “phone numbers” exist, which is why we introduced “Curated Tags”.
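To make the propagation idea concrete, here is a minimal Python sketch. It is not Immuta's API or implementation: the names `DataSource`, `GlobalPolicy`, and `apply_global_policy` are hypothetical, the data source names are made up, and the sketch assumes the columns have already been tagged.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical stand-in for a data source: column name -> set of tags."""
    name: str
    column_tags: dict[str, set[str]]
    masked_columns: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class GlobalPolicy:
    """Hypothetical policy: mask every column that carries `tag`."""
    tag: str


def apply_global_policy(policy: GlobalPolicy, sources: list[DataSource]) -> None:
    """Propagate the policy to each source by matching on tags, not column names."""
    for source in sources:
        for column, tags in source.column_tags.items():
            if policy.tag in tags:
                source.masked_columns.add(column)


# Two illustrative sources whose phone columns have different names.
customers = DataSource("oracle.customers", {"cust_phone": {"phone number"}, "city": set()})
employees = DataSource("hive.employees", {"mobile": {"phone number"}, "emp_id": set()})

apply_global_policy(GlobalPolicy("phone number"), [customers, employees])
print(customers.masked_columns)  # {'cust_phone'}
print(employees.masked_columns)  # {'mobile'}
```

The point of the sketch is that the policy author never names `cust_phone` or `mobile`; the tag is the only link between the policy and the columns it ends up masking.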
Curated tags are a way to mark entity types across data in Immuta. You can build complex hierarchies of entity ontologies as tags, and then tag data sources or columns with them. Why would you want to do this? First, tags drive the Global Policies described above, but they also make search and discovery of data easy and intuitive. We also found that many of our customers have already done this work, either manually or through tools like Atlas, so we’ve added an interface to Immuta for plugging in external catalogs that contain entity information.
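As a rough illustration of why a tag hierarchy is useful, the Python sketch below matches a tag against every tag beneath it. The dotted path notation (for example `PII.Contact.Phone`), the column names, and the helpers `tag_matches` and `columns_for_tag` are assumptions made for this example only; they are not Immuta's tag format or API.

```python
def tag_matches(column_tag: str, wanted_tag: str) -> bool:
    """True if column_tag equals wanted_tag or sits below it in the hierarchy."""
    return column_tag == wanted_tag or column_tag.startswith(wanted_tag + ".")


def columns_for_tag(column_tags: dict[str, list[str]], wanted_tag: str) -> list[str]:
    """Return, in sorted order, the columns carrying wanted_tag or any descendant."""
    return sorted(
        column
        for column, tags in column_tags.items()
        if any(tag_matches(tag, wanted_tag) for tag in tags)
    )


# Hypothetical tagged columns for a single data source.
column_tags = {
    "cust_phone": ["PII.Contact.Phone"],
    "email": ["PII.Contact.Email"],
    "ssn": ["PII.Identifier.SSN"],
    "city": [],
}

print(columns_for_tag(column_tags, "PII.Contact"))  # ['cust_phone', 'email']
print(columns_for_tag(column_tags, "PII"))          # ['cust_phone', 'email', 'ssn']
```

Under these assumptions, a search or policy written against a broad tag such as `PII` picks up every narrower entity type automatically, while one written against `PII.Contact` stays scoped to contact details.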
Let’s see this in action. Below is a quick demonstration of using curated tags to drive global policies:
As mentioned above, information on data can now be sourced from external catalogs and systems through a pluggable interface in Immuta. This enables search and global policies and expands Immuta’s partner ecosystem to data catalog and governance systems. More on those partnerships coming soon!
To drive home the external catalog functionality, below is a quick video that demonstrates an integration with Apache Atlas, where tags are synced between Atlas and Immuta and used to drive global policies:
Support for Spark 2.x
In Immuta 2.1 we released Spark 1.6 support; now we support Spark 2.x sessions as well. Why is this important? There’s no limit to how much data you can process, secure, and audit with Immuta policies. With this feature, the policies are enforced natively in SparkSQL during file access. In other words, SparkSQL is still doing its raw read from the HDFS files AND enforcing the Immuta controls. No out-of-the-box Hadoop tools exist to dynamically enforce this type of control and enable Spark in highly regulated environments.
You can see our prior Immuta 2.1 / Spark 1.6 demonstration of this feature here:
Database / Cloud Support
We also added more database support:
- Amazon Athena
- Azure SQL Data Warehouse
Athena, Presto/Starburst, and Azure SQL Data Warehouse are quite interesting because they allow SQL queries against data in S3 or Azure Blob Store. Integrating Immuta with them allows complex Immuta policy enforcement on your data while keeping compute and storage separate. There’s also potential for hybrid cloud workloads, where Immuta can unify on-premises data with data stored in S3 or Azure SQL Data Warehouse.
We hope you’re as excited as we are about our new 2.2 features, which further bridge the gap between data owners, data scientists, and lawyers, enabling them to work together in unison to accelerate their data science programs. Also, stay tuned for partner announcements about our external catalog support.