Power in the Modern Data Stack: Taming the Data Dragon

Dragons have long been a topic of fascination for humans. With origins stretching back to the myth of Mušḫuššu in ancient Mesopotamian culture and Apep in ancient Egypt, these mythical creatures still capture contemporary imaginations to the point of breaking television premiere records. But what is it that makes dragons so fascinating?

For Dave DeWalt, Founder and Managing Director of advisory firm (and Immuta investor) NightDragon, these beasts have a compelling dual nature. Balancing power with danger, dragons contain immense potential strength that can (quite literally) go up in flames if not managed properly. This same power/danger duality is evident in modern data use.

DeWalt; John Cordo, Principal at NightDragon; Mike Holmberg, Data Privacy Tech Leader at HP; and Matthew Carroll, CEO & Co-Founder of Immuta, discussed this concept in a recent webinar, Taming the Data Dragon: Why Managing Access Is Critical for the Future of Data Use. In this blog post, we’ll highlight these data-driven leaders’ three key insights about taming the “Data Dragon” and making its power work for you.

Modern Data Stack Challenges: The “Data Dragon”

Before sharing solutions for optimizing data’s power and minimizing risk, the webinar speakers outlined the factors contributing to the dangerous “Data Dragon” that organizations are facing. These challenges start with the sheer volume of data involved in modern use and analytics. In the early days of data use, information was collected and stored in smaller quantities and often kept separate from other technology by firewalls and other defensive mechanisms. “Now we’ve just had this massive explosion of data,” noted DeWalt. Holmberg echoed this sentiment, observing that there is simply more data now than ever before.

On top of the increasing amount of data, there is a wider array of locations in which it is being stored, accessed, and utilized. This stems from the increasingly common migration to the cloud, as the promises of cloud storage and computing persuade more organizations to make the move. In addition, the new “work from anywhere” model of business spurred by the Covid-19 pandemic means that data is accessed from far more locations than ever before. Add in the growing number of data privacy regulations impacting data use, and you’ve got a troublesome stew of sensitive information, users who need to access it, and laws that govern its utilization.

The combination of extensive data points and burgeoning technologies creates a situation where analytical needs are outpacing traditional approaches to managing data access. Carroll highlighted this challenge, noting that “these are nascent technologies that are being adopted at rates that we didn’t see in any [other] time in technology and computing infrastructure.” This is all occurring with “no real cohesive single standard [for] data privacy” according to DeWalt, which presents a massive challenge for data teams who need to respond to increasing pressure from both data users and regulatory requirements. Describing this growing concern, Carroll noted that “more users are able to access more data than ever before, and…typically you have controls in place. We just don’t have those controls yet.”

This is the ultimate risk of the “Data Dragon.” The overwhelming shift to cloud-based data storage and analytics is outpacing organizations’ ability to properly control data access. The forecasted benefits of leveraging the data’s potential are overshadowed by the risks of data leakage, breach, or misuse. As the attack surface for malicious actors widens, organizations must work to ensure that their data is being accessed efficiently, without compromising its security.

How to Tame the “Data Dragon”

With such immediate risk surrounding the future of data and obscuring the abundant benefits of leveraging this resource, what steps can be taken to effectively address these concerns and tame the “Data Dragon”? Here’s what the experts had to say:

1. Separate Policy from Platform

The number of cloud data platform providers, such as Snowflake, Databricks, Starburst, and Google BigQuery, is only going to continue growing and diversifying. As organizations adopt multiple technologies to best suit their needs, there remains one constant: data must be governed wherever it lives. If data access control policies are created and maintained in each of these individual platforms, then the data will not be subject to a consistent standard of protection.

Rather than attaching policies to individual platforms, data teams must implement tools that allow for consistent policy authoring and enforcement across their data ecosystems. Speaking to this necessity, Carroll emphasized that “in this age of modern data security and privacy…you need to be able to separate the policy from those platforms in order to dynamically control [the data] at scale.” To elaborate on this point, he laid out his “three key pieces of modern data security in cloud data infrastructure,” spotlighting the importance of:

  1. Separating data rather than keeping it all in one bucket
  2. Making data de-identification a standard practice
  3. Building and implementing next-gen monitoring capabilities

Keeping policy separate from platform enhances modern data stacks’ versatility and allows data teams to control access wherever the data travels. By choosing technologies that permit this, data teams can build and apply universal policies and head off the challenges of controlling widely-distributed data.
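To make the idea of separating policy from platform more concrete, here is a minimal Python sketch. It is not drawn from the webinar or from any vendor's product: the `MaskingPolicy` class, the `apply_policies` function, the attribute names, and the sample rows are all hypothetical. It only illustrates one way a single policy definition could be authored once and then enforced on results coming from more than one platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    """Hypothetical platform-independent rule.

    Hide `column` unless the user holds `required_attribute`.
    """
    column: str
    required_attribute: str


def apply_policies(rows, user_attributes, policies):
    """Return copies of `rows` with protected columns masked for this user."""
    masked_columns = {
        policy.column
        for policy in policies
        if policy.required_attribute not in user_attributes
    }
    return [
        {key: ("***" if key in masked_columns else value)
         for key, value in row.items()}
        for row in rows
    ]


# One policy set, authored once (illustrative only).
POLICIES = [MaskingPolicy(column="email", required_attribute="pii_reader")]

# Made-up rows standing in for results from two different platforms.
warehouse_rows = [{"id": 1, "email": "a@example.com"}]
lakehouse_rows = [{"id": 2, "email": "b@example.com", "region": "EU"}]

analyst = {"analyst"}
privacy_officer = {"analyst", "pii_reader"}

# The same policies are enforced regardless of where the rows came from.
masked_warehouse = apply_policies(warehouse_rows, analyst, POLICIES)
masked_lakehouse = apply_policies(lakehouse_rows, analyst, POLICIES)
unmasked_warehouse = apply_policies(warehouse_rows, privacy_officer, POLICIES)
```

In this sketch, changing who may see email addresses means editing `POLICIES` in one place rather than updating rules separately in each platform.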

2. Make Data Security a Priority from the Start

“If you don’t have security, you do not get privacy,” expressed Holmberg, while discussing HP’s data security and privacy strategy. The sheer variability of data use in modern environments adds a complex layer to data security. Addressing this complexity, Holmberg noted that “we seem to invent new uses of data [that] have not only jurisdictional variation, but then customer variation [and] product entitlement variation,” almost constantly. Continuously identifying new uses for enterprise data generates a complex data landscape that, while useful for business objectives, still needs to be effectively secured.

While this variability cannot be predicted, it can be accounted for as organizations build their modern data stacks. As data teams take the “lift and shift” approach to cloud migration, security measures should be baked in from the start. DeWalt described this mentality as “designing in” security capabilities, asking “as we lift and shift that data, are we designing in our access rights [and] policies…to really come together as a team to manage and reduce risk?” Scalable security and privacy methods must be built into the foundation of any modern data stack.

Proactively determining access control and security policies allows teams to avert risk rather than reactively dealing with it. Holmberg referenced HP’s cloud migration story to describe their approach to “designing in” security while building their data stack.

“That is the heart of privacy by design,” Holmberg claimed, “saying ‘Ok, we’re building something new, how do we design this in and make these capabilities fundamental at the start?’”

By asking these questions, and finding the tools and techniques to make foundational security possible, organizations can approach the “Data Dragon” with assurance that their modern data stack is secure and built to minimize risk.

3. Structure Your Data Teams for Success

While tools and technologies are the organs that keep the modern data stack operating, they can’t be set up and managed without the right people. The final, and most important, front in taming the “Data Dragon” is the people who manage organizations’ data stacks and control the flow of data. Taking on the multifaceted challenge of wrangling secure data use requires teams to be organized and aligned towards the right goals.

“It all starts with org structure,” said Carroll. “Scale starts with people [who] are put into a position where their responsibility is ‘How are we going to scale our data?’”

Automation and proper tooling can do wonders, but it is the people making data access control decisions who have the last word in scaling organizational data success. Roche, a Swiss healthcare company with global operations, implemented such an organizational structure in order to reach its data-driven objectives. Carroll noted that Roche added data product managers into each business line to focus specifically on the company’s data assets. These managers are explicitly responsible for data utilization and security, working internally with a range of relevant players (CSOs, CISOs, etc.) to keep data viable and safe.

“I think people underestimate the amount of people that need to be working together to…execute a program to not only deliver value of data, but oversee the integrity of it and make sure it’s protected,” remarked Carroll. By creating roles that solve for this complexity, as well as the intricacy of the organization’s data architecture, teams can strengthen their personnel and fend off the dangers of the “Data Dragon” with confidence.

Facing Modern Data Stack Challenges Head-On

“This is not a roadblock, we’re actually the roadbuilders,” said Holmberg on building security into the data stack. “We’re trying to pave the highway [so] you can go on as fast as you want, we just don’t want you to go off-road.” This is the ultimate goal of “taming the Data Dragon.”

While the multifarious challenges of increasing data use cases, fast-moving technologies, and widespread data accessibility are unavoidable, they can be tamed with a proactive approach to data security and privacy. By separating policy from platform, proactively “designing in” security for data stacks, and intentionally structuring personnel, you can avert the danger of the “Data Dragon” and safely optimize the power of data.

Choosing the right technologies to achieve these goals is also key to success. Immuta’s Data Access Platform allows users to separate policy creation and orchestration from platform and write comprehensive plain-language access policies that are automatically applied across an entire data ecosystem. These policies can be written, maintained, and understood by any stakeholder, so that personnel throughout the organization have the power and visibility they need to protect data assets. With dynamic attribute-based access controls, these policies can scale and maintain security as new platforms, users, and data are added to the modern data stack.
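As a rough illustration of how attribute-based access control can scale without new rules for each user or data set, consider the Python sketch below. It does not represent Immuta's implementation or API. The `can_access` function, the attribute names, and the sample tags are all assumptions made for the example.

```python
def can_access(user_attributes, resource_tags):
    """Hypothetical attribute-based check.

    Grant access only when the resource is tagged and the user holds a
    matching value for every tag. Untagged resources are denied by default.
    """
    if not resource_tags:
        return False
    return all(
        value in user_attributes.get(key, set())
        for key, value in resource_tags.items()
    )


# Illustrative user attributes and resource tags (made up).
finance_user = {"department": {"finance"}, "region": {"EU", "US"}}

finance_eu_table = {"department": "finance", "region": "EU"}
hr_eu_table = {"department": "hr", "region": "EU"}

# A table added later is covered by the same rule as soon as it is tagged.
new_finance_us_table = {"department": "finance", "region": "US"}

decisions = {
    "finance_eu": can_access(finance_user, finance_eu_table),
    "hr_eu": can_access(finance_user, hr_eu_table),
    "new_finance_us": can_access(finance_user, new_finance_us_table),
    "untagged": can_access(finance_user, {}),
}
```

Because the decision depends on attributes rather than on named users or named tables, adding a user or a data set in this sketch requires only assigning attributes or tags, not writing another rule.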

To see how simple policy creation and implementation can be, try our self-guided walkthrough demo today. And if you’d like to hear more from these data-centric experts, you can watch the full webinar here.
