How Immuta Simplifies Sensitive Data Tagging for Snowflake Data Lineage

Data is essential to modern business because it improves insights and drives competitive decision-making. Yet while many organizations are adopting cloud data platforms to simplify onboarding of new data sources, managing access to that data can significantly slow time-to-value. Should this stop you from adding more data sources? In short, no.

Instead, organizations need a scalable strategy for managing who can access data, when, and why. This strategy should include capturing the business context of the data, as well as detecting sensitive data and tagging it in metadata so data teams can build access policies. Two key questions when applying tags to data are “What is in this data?” and “Where did it come from?” The key to answering these questions is data lineage.

What is Data Lineage?

Data lineage is the process of mapping a “family tree,” so to speak, of data tables and views. You can think of it as a directed graph that connects tables and columns to downstream tables and columns. In many cases, this process is repeated every time the organization creates downstream data assets from existing sources. This has become an extremely common practice with the shift from the extract-transform-load (ETL) model to the extract-load-transform (ELT) model, since all transformation (i.e., the creation of derived data products) now happens in-database using the power of the cloud and Snowflake.
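
To make the directed-graph view concrete, here is a minimal sketch in Python. The table names and the LINEAGE mapping are hypothetical and purely illustrative; this is not how Snowflake or Immuta stores lineage, only one simple way to represent it.

```python
from collections import deque

# Hypothetical lineage edges: each key is a table, and each value lists the
# tables derived directly from it. All names are illustrative only.
LINEAGE = {
    "raw.customers": ["staging.customers_clean"],
    "raw.orders": ["staging.orders_clean"],
    "staging.customers_clean": ["analytics.customer_orders"],
    "staging.orders_clean": ["analytics.customer_orders"],
    "analytics.customer_orders": [],
}


def downstream(graph, source):
    """Return every table reachable from `source`, in breadth-first order."""
    seen, order = {source}, []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def roots(graph):
    """Return the tables that are not derived from any other table."""
    children = {child for kids in graph.values() for child in kids}
    return sorted(table for table in graph if table not in children)
```

Calling downstream(LINEAGE, "raw.customers") walks the “family tree” from a root table to everything built on top of it, while roots(LINEAGE) lists the tables at the top of the tree.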

Data lineage is important because just as you inherit traits from your parents and grandparents, downstream tables can inherit metadata from root tables. As discussed, this metadata can include sensitive data tags, which are critical to data policy enforcement. If lineage keeps a mapping of those traits that are passed down to child tables, then those child tables can be more authoritatively validated and policies based on that metadata can be more accurate.
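
The inheritance idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the column names, the tag names, and the propagate_tags helper are hypothetical, and the sketch does not reflect Immuta's or Snowflake's actual implementation.

```python
from collections import deque

# Hypothetical column-level lineage: (table, column) -> columns derived from it.
COLUMN_LINEAGE = {
    ("raw.customers", "email"): [("staging.customers_clean", "email")],
    ("raw.customers", "signup_date"): [("staging.customers_clean", "signup_date")],
    ("staging.customers_clean", "email"): [("analytics.customer_orders", "contact_email")],
}

# Tags that were validated once, on root columns only (illustrative values).
ROOT_TAGS = {("raw.customers", "email"): {"PII", "EMAIL"}}


def propagate_tags(lineage, root_tags):
    """Copy each column's tags to every column derived from it."""
    tags = {column: set(values) for column, values in root_tags.items()}
    queue = deque(root_tags)
    while queue:
        parent = queue.popleft()
        for child in lineage.get(parent, []):
            before = len(tags.setdefault(child, set()))
            tags[child] |= tags[parent]
            # Revisit the child only if it gained tags, so the loop
            # terminates even if the lineage contains a cycle.
            if len(tags[child]) > before:
                queue.append(child)
    return tags
```

In this sketch, propagate_tags(COLUMN_LINEAGE, ROOT_TAGS) assigns the root column's tags to the derived contact_email column, while untagged columns such as signup_date stay untagged.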

Enabling Snowflake Data Lineage Tagging with Immuta

Snowflake Data Cloud offers best-of-breed cloud data management so you can maximize the value of your data. Its recently announced data lineage capabilities help track your data to its source and see what has happened to the data over time.

Immuta has worked closely with the Snowflake team to leverage the platform’s lineage capabilities to make access control management simpler and more accurate. Using Snowflake lineage, Immuta automatically propagates data tags defined in the ancestry tables and columns to downstream tables and columns without manual effort.

This provides three key benefits to the data platform team:

  1. Sensitive data tags need only be validated on the root tables rather than on all tables, significantly reducing validation effort.
  2. Because the tags are propagated to downstream tables and columns, Immuta no longer needs to run sensitive data discovery on those child data sources and can focus on the root tables instead. This significantly reduces Snowflake compute costs and, as noted in the first benefit, improves accuracy.
  3. Lastly, and potentially most importantly, Immuta policies are built on tags. Because those tags are now propagated automatically, policies have better guarantees of accuracy at the time of table creation.
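
To illustrate the third benefit, the following Python sketch shows why a tag-based policy can apply to a derived table as soon as its tags arrive through lineage. The policy structure, the attribute name, and the apply_policy helper are hypothetical; this is not Immuta's policy syntax or enforcement mechanism.

```python
# Hypothetical policy: mask any column carrying one of these tags unless the
# user holds the exempting attribute. Not Immuta's actual policy format.
POLICY = {"mask_tags": {"PII"}, "exempt_attribute": "pii_approved"}


def apply_policy(row, column_tags, user_attributes, policy=POLICY):
    """Return a copy of `row` with tagged columns masked for this user."""
    if policy["exempt_attribute"] in user_attributes:
        return dict(row)
    return {
        column: "***MASKED***"
        if column_tags.get(column, set()) & policy["mask_tags"]
        else value
        for column, value in row.items()
    }


# Tags on a newly created downstream table, as inherited through lineage.
derived_tags = {"contact_email": {"PII", "EMAIL"}, "order_total": set()}
row = {"contact_email": "a@example.com", "order_total": 42.5}

masked = apply_policy(row, derived_tags, user_attributes=set())
cleared = apply_policy(row, derived_tags, user_attributes={"pii_approved"})
```

Because the policy refers to tags rather than to specific tables, no policy change is needed when a new derived table appears: once the inherited tags are in place, the same rule covers it.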

Conclusion

Immuta ensures that users have proper and authorized access to the right data at the right time, without impacting performance. With Immuta, data teams are able to speed up data access by 100x, decrease the number of policies required by 75x, and achieve provable compliance goals. Immuta further facilitates fast and reliable access to data by leveraging Snowflake’s data lineage support, reducing both validation efforts and compute costs for data platform teams.

Want to learn more about Immuta’s support for Snowflake data lineage and see how it works for yourself? Schedule a briefing with our team today.
