Cloud data platforms such as Snowflake provide native security and access controls designed for platform administrators to manage roles across the organization. Snowflake draws on concepts from both the Discretionary Access Control (DAC) and Role-Based Access Control (RBAC) models. By managing a hierarchy of roles and privileges, administrators gain flexibility and control over how users access securable objects.
However, administration by a centralized data operations and engineering team can become a bottleneck when scaling access across different tenants, or business lines, each with its own domain knowledge of its data, policies, and users. This makes governing data access a significant challenge in complex data landscapes, which can include additional cloud data platforms such as Databricks, Amazon Redshift, or Azure Synapse.
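To make the role hierarchy concrete, here is a minimal Python sketch of how privileges can flow through a hierarchy in which a role inherits the privileges of the roles granted to it. It is an illustrative model only, not Snowflake's implementation; the `Role` class and the role, privilege, and object names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical model of a role in an RBAC hierarchy."""
    name: str
    # (privilege, securable object) pairs granted directly to this role
    privileges: set = field(default_factory=set)
    # roles granted to this role; their privileges are inherited
    granted_roles: list = field(default_factory=list)

    def effective_privileges(self):
        """Return direct privileges plus those inherited from granted roles."""
        seen, result, stack = set(), set(), [self]
        while stack:
            role = stack.pop()
            if role.name in seen:
                continue  # guard against visiting a role twice
            seen.add(role.name)
            result |= role.privileges
            stack.extend(role.granted_roles)
        return result


# Illustrative hierarchy: PLATFORM_ADMIN > ENGINEER > ANALYST
analyst = Role("ANALYST", {("SELECT", "SALES.ORDERS")})
engineer = Role("ENGINEER", {("INSERT", "SALES.ORDERS")}, [analyst])
admin = Role("PLATFORM_ADMIN", {("OWNERSHIP", "SALES")}, [engineer])
```

In this sketch, `admin.effective_privileges()` includes the analyst's `SELECT` privilege, while the analyst gains nothing from the roles above it. Every new combination of users and data needs its own place in the hierarchy, which is why hierarchies of this kind become costly for a central team to maintain.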
Below are examples of dimensions that contribute to a complex data landscape:
Managing such a landscape requires a diverse set of roles and stakeholders to implement and manage data access:
- Data Platform Architecture
- Data Engineering & Operations
- Data Owners (technical stewards organized by function or lines of business)
- Data Consumers (data scientists and analysts)
- Legal / Compliance Teams
Key challenges to managing Snowflake access control
These challenges are not specific to Snowflake; they apply to any centralized cloud data platform and span people, process, and technology. Automated data management puts technology in a unique position to improve the people and process sides as well.
The challenges are most common in large organizations where a centralized platform team manages Snowflake, and include:
- Centralized platform teams’ lack of domain expertise over the data
- Manual and slow processes to fulfill data access requests for internal customers
- Difficulty for business and security stakeholders to understand who can access what data and why
Recommended approach to scale data access management
Many organizations have proven value with a specific use case and set of users, and are now looking to expand to more use cases in the cloud and to scale user adoption. However, the data platform team faces bottlenecks when supporting this expansion in complex data landscapes. Removing them requires decentralized management, similar to concepts in the emerging data mesh architecture, built on centralized access control infrastructure.
These are the common bottlenecks to user adoption and key approaches you should consider:
|Bottleneck|Key approach|
|---|---|
|Manual Role Management|Automate global security & privacy controls|
|Static Access Control|Leverage attribute-based access control using policy variables|
|Distributed Domain Stewardship|Empower each data owner to manage roles using data domain knowledge specific to a business line|
Let’s walk through an example:
|Name|Role|Responsibility|
|---|---|---|
|Claire|Data Architect|Builds out analytical capabilities on Snowflake|
|Glen|Data Engineer|Responsible for operations such as managing data access requests|
|John|Data Owner|Has domain knowledge of data for the booming vegan fish business unit|
|Ryan|Data Scientist|1 of 20 data scientists for the vegan fish business unit in Germany|
|Nicole|Data Owner|Has domain knowledge of data for the emerging mushroom coffee business unit|
|Deirdre|Senior Counsel|Ensures compliant data use as part of the legal/compliance team|
Step 1 – Implement global controls for new use case
Claire (Data Architect) will work with business stakeholders on the analytical use case to collect the requirements for using data in the cloud. With these inputs, it’s important to set up global data access policies so that users can see only data from their own country and SSNs are masked for everyone except privileged HR users.
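The two global rules described above can be sketched in a few lines of Python. This is a simplified illustration of the logic, not how Snowflake or Immuta enforce policies; the function name, the `hr_privileged` group, the mask format, and the sample records are all assumptions made for the example.

```python
MASK = "***-**-****"  # placeholder mask format, chosen for illustration


def apply_global_policies(rows, user):
    """Apply two global rules to a list of row dicts:
    1. Row-level: a user sees only rows from their own country.
    2. Column-level: SSNs are masked unless the user is privileged HR.
    """
    can_see_ssn = "hr_privileged" in user["groups"]
    visible = []
    for row in rows:
        if row["country"] != user["country"]:
            continue  # rule 1: drop rows from other countries
        out = dict(row)  # copy so the source data is never modified
        if not can_see_ssn:
            out["ssn"] = MASK  # rule 2: mask the SSN column
        visible.append(out)
    return visible


# Made-up sample data and users
customers = [
    {"name": "A. Schmidt", "country": "DE", "ssn": "123-45-6789"},
    {"name": "B. Jones", "country": "US", "ssn": "987-65-4321"},
]
ryan = {"name": "Ryan", "country": "DE", "groups": ["data_science"]}
hr_user = {"name": "HR user", "country": "DE", "groups": ["hr_privileged"]}
```

Calling `apply_global_policies(customers, ryan)` returns only the German row with the SSN masked, while the same call for `hr_user` returns that row with the SSN intact. Because the rules depend on user attributes rather than on a role per country, they apply unchanged as new users and countries are added.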
Auditors may ask Deirdre (Senior Counsel) basic questions about what PII is accessible for this promotion and whether that access complies with the rules for data collection. These requests usually arrive in legal language that is unfamiliar to Claire and Glen (Data Engineer). In response, Claire and Glen can share the explainable policies and audit reports with Deirdre in real time. These policies and data audit trails are easy to understand without background knowledge of Snowflake access control modules and declarative commands.
Step 2 – Implement line of business specific controls
Build out a data access control and security model. At a minimum, we need to distinguish users by geography and department. Rather than setting up a separate decision point for each geography/department combination and then assigning privileges to each one, consider decoupling the decision points so that access is controlled at runtime by checking policies against those geographical and departmental attributes.
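The difference between the two approaches can be illustrated with a short Python sketch. It contrasts a static setup, with one role per geography/department combination, against a single attribute check evaluated at request time. The geography and department values, role naming scheme, and `is_allowed` function are hypothetical and only meant to show the shape of the idea.

```python
from itertools import product

# Illustrative attribute values
geographies = ["DE", "US", "FR"]
departments = ["vegan_fish", "mushroom_coffee"]

# Static approach: one role per geography/department combination.
# The role count is the product of the attribute value counts.
static_roles = [f"{g}_{d}".upper() for g, d in product(geographies, departments)]


def is_allowed(user_attrs, data_tags):
    """Attribute-based approach: one policy, evaluated at runtime,
    compares the user's attributes with the tags on the data."""
    return (
        user_attrs["geography"] == data_tags["geography"]
        and user_attrs["department"] == data_tags["department"]
    )


ryan = {"geography": "DE", "department": "vegan_fish"}
german_fish_data = {"geography": "DE", "department": "vegan_fish"}
us_coffee_data = {"geography": "US", "department": "mushroom_coffee"}
```

With three geographies and two departments, the static approach already needs six roles, and each new geography or department multiplies that number. The attribute-based check stays a single policy: onboarding a new country means setting an attribute on users and tagging the data, not creating and granting another batch of roles.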
Step 3 – Scale use cases and user adoption
Each user and stakeholder will need different interfaces to contribute to the data and analytical operations.
Glen (Data Engineer) also supports Databricks and Starburst and will want to manage the Snowflake controls as code, integrated with his existing data operations toolchain.
John (Data Owner) and Nicole (Data Owner) are not Snowflake experts and will want Claire (Data Architect) to isolate their data as separate tenants in the data lake. They will then need to be able to understand and author data access policies based on specific business needs. For example, Ryan (Data Scientist) might build a prediction model for the vegan fish business unit, using data from Germany, to identify which mushroom coffee customers would be open to a joint offer.
What this looks like in Immuta
To implement this in a scalable way in complex environments, Claire (Data Architect) needs to architect a flexible data access governance solution, such as Immuta, to enable the rest of the users and stakeholders.
No one starts with a fully automated and scalable solution, but the technology you choose should be flexible enough to scale adoption and maximize the return on your investment in cloud data analytics.
Let’s map the personas to the three steps in the example above, along with each interface and experience mapping using Immuta:
This more specific example shows the experiences and interfaces necessary to scale access to Snowflake across cross-functional stakeholders, each with different domain knowledge.
Data access governance in complex data landscapes requires flexible and innovative user experiences. Immuta was engineered to protect the world’s most sensitive data, and its product team includes legal engineers and UX researchers so that more stakeholders can participate in creating value from analytics with fewer bottlenecks.
If your organization is expanding its use of the cloud, request a demo to learn how other organizations accelerate migration to the cloud using Immuta.