This blog was co-authored with Deepak Nelli, Director of Sales Engineering at Alation.
The modern data environment is changing. Cloud data platforms provide data-driven organizations with advanced analytics capabilities alongside much-needed simplicity; yet, greater demand for data, more data consumers and use cases, and a growing body of data use rules and regulations often work against the core simplicity ethos. This in turn has contributed to a vague and loosely defined concept of data governance.
For data-driven organizations, however, providing access to the right data at the right time for the right reason is paramount to maximizing data’s value and staying competitive. Therefore, implementing a data governance framework rooted in overarching data policies and standards is key. Alation and Immuta are at the forefront of delivering such a solution, providing best-in-class data access controls for Snowflake data sets. Together, Alation and Immuta enable you to discover, classify, and protect data in Snowflake, and to enforce and audit the data policies that govern it.
In this blog, we will walk through a real-world example that highlights how Alation and Immuta work together, and why both platforms are necessary for allowing users to get access to the right data for the right reasons at the right time in modern data stacks.
The Scenario: Masking PII with Alation and Immuta for Snowflake
Imagine it is your first day at a company and you’re getting access to analytics dashboards, including a financial dashboard containing sensitive transactional data. As a data governor, you have three questions:
- Where does this data come from?
- What data use agreements or policies are enforced on the data?
- How do I make changes to the classification and policies as rules change?
As you begin to explore the dashboard, you notice that it includes personally identifiable information (PII) that has already been redacted. How is this possible? At your last company, PII was always left in the clear because managing data policies for hundreds of users was unscalable!
You also notice that there are only two countries listed on the dashboard. You joined this company because of its broad reach in the market and know it operates in more than just the United States and Canada. So, why can you only see those two countries?
Leveraging Alation Metadata for Sensitive Data Classification
Your manager informs you that all the information around data assets in the company is curated in Alation, the enterprise data catalog. So, after receiving the URL, you log into Alation and search for credit card transactions.
Alation scans all of your organization’s source metadata and brings relevant results to the foreground. The top table listed contains the information from Tableau, so you can start to investigate the data.
The data has been classified in Alation to denote how it can and should be used. You can also quickly identify the current data steward, whom you can ask for permission to maintain the catalog, and verify that this table is “Immuta Protected.”
Next, you want to ensure that the credit card number in this table will not be exposed to anyone who should not have access. Drilling into the definition for this column, you can see that its classification marks the data as PII and sensitive.
You also notice that this column is specifically protected by a legal definition determining when users in the organization can see this information.
Now that you’ve seen how the data has been classified, you can click the link on the landing page for a description of the policy that has been applied.
Building Data Access Control Policies in Immuta with Alation Metadata
Immuta pulls in the classification we just reviewed in Alation. This matters as we start to build data policies, because Immuta uses classifications to create them.
Data access control policies are critical because they define:
- Why someone can access a data source
- What data they can see if their access request is approved
In Immuta, you can see the policies that have been applied based on the classification discovered in Alation.
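At the bottom of the page, you’ll see that there are two “global” policies protecting this data set. The first filters rows to the country codes you’re authorized to see, which is why the dashboard showed only the United States and Canada. The second masks two columns in the data set that are classified as PII, which is why the PII on the dashboard was already redacted.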
Modifying Data Policies to Mask PII in Immuta
Now you’re able to see what policies are applied to this data set – but how do you modify the policy should the rules change?
This can be done by clicking on the “Policies” shield on the left side to pull up the global policy window, which shows policies that are applied across data sets in your organization.
Here, you can see which policies were added to the credit card transactions data source and why:
- The segmentation policy applies to any table that has a “Country” code that has been classified in Alation.
- The PII masking, however, is specific to credit card tables and will only apply to tables that have a “Credit Card Number” classification in Alation.
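To make the two rules above concrete, here is a minimal Python sketch of tag-driven policies. It is not Immuta's implementation or API: the column names, the `COLUMN_TAGS` mapping, the `mask` and `apply_policies` helpers, and the sample rows are all hypothetical, and hashing is just one assumed way to mask a value.

```python
import hashlib

# Hypothetical classifications, standing in for tags pulled from the catalog.
COLUMN_TAGS = {
    "credit_card_number": {"PII", "Credit Card Number"},
    "cardholder_name": {"PII"},
    "country": {"Country"},
    "amount": set(),
}


def mask(value):
    """Replace a value with a truncated SHA-256 digest (one assumed masking choice)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]


def apply_policies(rows, column_tags, allowed_countries):
    """Filter rows on Country-tagged columns, then mask PII-tagged columns."""
    # Segmentation applies to any table with a column classified as "Country".
    country_cols = [col for col, tags in column_tags.items() if "Country" in tags]
    # Masking applies only to tables carrying a "Credit Card Number" classification.
    is_card_table = any("Credit Card Number" in tags for tags in column_tags.values())
    pii_cols = (
        {col for col, tags in column_tags.items() if "PII" in tags}
        if is_card_table
        else set()
    )
    visible = []
    for row in rows:
        if any(row[col] not in allowed_countries for col in country_cols):
            continue  # the user is not authorized for this country
        visible.append(
            {col: mask(val) if col in pii_cols else val for col, val in row.items()}
        )
    return visible


# Made-up sample rows for illustration only.
rows = [
    {"credit_card_number": "4111111111111111", "cardholder_name": "A. Example",
     "country": "US", "amount": 25.0},
    {"credit_card_number": "5500000000000004", "cardholder_name": "B. Example",
     "country": "CO", "amount": 40.0},
]
visible = apply_policies(rows, COLUMN_TAGS, allowed_countries={"US", "CA"})
```

Because the rules key off classifications rather than table names, a newly cataloged table with the same tags would pick up the same behavior in this sketch without any policy change.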
Let’s open this policy to see how you can modify it or create a new one in the future.
Immuta’s policy builder allows you to easily build these policies in plain language. As a data governor, you don’t have to sort through complex SQL logic, decipher complex role names, or talk to DBAs to determine how this policy is enforced. With Immuta, changing the policy is as easy as writing a sentence.
Once the policies are applied, you can test queries using Snowflake’s native query editor. After you log in, you notice that these policies are enforced on the actual tables.
Monitoring Data Use with Purpose-Based Access Control in Immuta Projects
Imagine you receive an alert from your fraud department about a potentially fraudulent incident, tasking you with investigating and identifying the fraud while ensuring no sensitive data is leaked. You can start by finding any credit cards that have transactions occurring in more than one country.
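A first-pass query of this kind might look like the sketch below. It uses Python's built-in SQLite as a stand-in for Snowflake so that it runs anywhere; the table name, column names, and sample rows are assumptions for illustration, not the schema used in this walkthrough.

```python
import sqlite3

# In-memory SQLite database standing in for the Snowflake table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE credit_card_transactions "
    "(card_hash TEXT, country TEXT, transaction_time TEXT)"
)
# Made-up sample data; card numbers appear only as masked placeholders.
conn.executemany(
    "INSERT INTO credit_card_transactions VALUES (?, ?, ?)",
    [
        ("card_a", "US", "2024-01-05 09:00:00"),
        ("card_a", "CA", "2024-01-05 09:40:00"),
        ("card_b", "US", "2024-01-05 10:00:00"),
        ("card_b", "US", "2024-01-06 11:00:00"),
        ("card_c", "CA", "2024-01-01 08:00:00"),
        ("card_c", "US", "2024-01-03 08:00:00"),
    ],
)

# Cards with transactions in more than one distinct country.
multi_country_cards = conn.execute(
    """
    SELECT card_hash, COUNT(DISTINCT country) AS country_count
    FROM credit_card_transactions
    GROUP BY card_hash
    HAVING COUNT(DISTINCT country) > 1
    ORDER BY card_hash
    """
).fetchall()
```

Since the query groups on the masked value, the investigation never needs the card number in the clear.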
Two credit cards in the data set show transactions in more than one country. So, you must run a query that determines whether the transactions on either card occurred in different countries within an unreasonably short time frame.
Based on the query results, it appears that no credit card transactions occurred in the region where fraudulent activity was detected. In order to investigate further, however, you need access to more data.
Fortunately, a purpose has been assigned to this data set that allows you to act under a specific rationale in order to finish this job. This is done by switching to the “Fraud Detection” purpose in Immuta.
To switch contexts, you must agree to an attestation statement associated with the “Fraud Detection” project.
After agreeing to these terms, which are defined by your organization’s compliance team, you can begin acting under this project’s purpose.
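The idea behind purpose-based access can be sketched in a few lines of Python. This is a conceptual model only, not Immuta's API: the `Project` class, its fields, the user name, the second table name, and the attestation wording are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Project:
    """A purpose-scoped workspace that users must attest to before querying."""
    name: str
    attestation: str
    tables: frozenset
    acknowledged_by: dict = field(default_factory=dict)

    def acknowledge(self, user):
        # Record who agreed and when, so the agreement can be audited later.
        self.acknowledged_by[user] = datetime.now(timezone.utc)

    def can_query(self, user, table):
        # Access requires both the attestation and a table within the project's scope.
        return user in self.acknowledged_by and table in self.tables


project = Project(
    name="Fraud Detection",
    attestation="I will use this data only to investigate suspected fraud.",  # made-up wording
    tables=frozenset({"Credit Card Transactions"}),
)

before = project.can_query("new_governor", "Credit Card Transactions")
project.acknowledge("new_governor")
after = project.can_query("new_governor", "Credit Card Transactions")
out_of_scope = project.can_query("new_governor", "Customer Profiles")
```

In this model, access depends on why the user is acting as well as who they are, and the stored acknowledgment time gives auditors a record of when each user accepted the terms.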
While operating under this project, you will only be able to access the “Credit Card Transactions” table, which helps prevent data leakage. If you go back into Snowflake, you’ll be able to run a query to see whether any card has transactions in different countries within an unreasonably short amount of time.
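The logic of such a check can be sketched in Python as follows. The field names, the one-hour window, the `find_impossible_travel` helper, and the sample records are assumptions for illustration; the actual query would run in Snowflake against the protected table.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter


def find_impossible_travel(transactions, window=timedelta(hours=1)):
    """Flag cards with back-to-back transactions in different countries within `window`."""
    flagged = []
    ordered = sorted(transactions, key=itemgetter("card_hash", "time"))
    for card, group in groupby(ordered, key=itemgetter("card_hash")):
        txns = list(group)
        # Comparing consecutive transactions is enough: any two transactions in
        # different countries within the window imply a consecutive pair that
        # also changes country within the window.
        for earlier, later in zip(txns, txns[1:]):
            if (earlier["country"] != later["country"]
                    and later["time"] - earlier["time"] <= window):
                flagged.append((card, earlier["country"], later["country"]))
    return flagged


# Made-up sample data; card numbers appear only as masked placeholders.
transactions = [
    {"card_hash": "card_a", "country": "Colombia", "time": datetime(2024, 1, 5, 9, 0)},
    {"card_hash": "card_a", "country": "Indonesia", "time": datetime(2024, 1, 5, 9, 40)},
    {"card_hash": "card_c", "country": "United States", "time": datetime(2024, 1, 1, 8, 0)},
    {"card_hash": "card_c", "country": "Canada", "time": datetime(2024, 1, 3, 8, 0)},
]
suspicious = find_impossible_travel(transactions)
```

With this sample data, only the card whose two transactions fall 40 minutes apart is flagged; the card with two days between countries is not.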
After running the query, you see that transactions on the same card occurred in two countries within one hour: the first took place in Colombia, and the second took place in Indonesia. This is clearly fraudulent because the cardholder could not have been in both locations within one hour. With this information, you can send the hashed credit card number to the compliance team so they can alert the cardholder and begin their fraud response.
In this blog, you saw how Alation and Immuta fit into a Snowflake user’s data governance workflow by coordinating metadata to enforce data policies and standards. We walked through how to find data, see its classification, and understand the policy that was transparently protecting it. Snowflake users are able to access and use this protected data because of the policies that utilize Alation’s robust catalog, Immuta’s policy engine, and Snowflake’s simple, scalable compute. Alation and Immuta together allow Snowflake users to get access to the right data for the right reasons at the right time.
To see more, check out this demo on simplifying privacy controls for Snowflake.