Moving from concept to execution can be a complicated process – especially with data mesh architectures. But by understanding and preparing for obstacles, you can make the implementation process as streamlined as possible.
In our recent webinar Data Mesh vs. Data Security: Can You Have Both?, Immuta Senior Product Manager Claude Zwicker, Snowflake Security Field CTO Seth Youssef, and phData Senior Cloud Solution Architect Love Malhotra discussed the data mesh challenges they’ve seen data teams face when turning this concept into a reality. What are the common roadblocks to a successful data mesh implementation, and how can they be overcome?
Data Mesh Implementation Challenges
Amongst the myriad challenges a team may face when implementing a data mesh, Malhotra identified three main categories: architectural challenges, technological challenges, and operational challenges.
Let’s see what makes each of these a frequent blocker to data mesh adoption.
Architectural Challenges
One expected benefit of a data mesh implementation is enhanced self-service control over your data. By leveraging unique data domains for project- or purpose-specific needs, you can make accessing data more seamless than ever before – all while maintaining proper data access and security controls.
The issue is that not every organization will have the same requirements for its data mesh architecture. This leads to concept-execution friction, as expectations of the data mesh may differ from implementation. Factors like company size, growth rate, data maturity level, and number of data sources and users can have a major impact on the feasibility of a successful data mesh.
The biggest challenge boils down to how an organization applies access controls across a distributed domain-based architecture. A top-down, centralized approach to data governance can ensure that policy is consistently applied, but it doesn’t work for companies looking to scale beyond a handful of domains.
“[Centralized governance] doesn’t scale very well,” said Youssef. “If you go and work with large companies where they have many data domains, scale is very hard. Every business unit might have different requirements. And these units understand the data more than a central entity in that perspective, so the central entity doesn’t have enough knowledge or skills to actually set policies at that level.”
Ultimately, centralized teams might not have enough context about their growing number of data domains to create appropriate policies. If you can’t govern access to domain data at scale, then a data mesh implementation will cause more risks than benefits for your organization.
Technological Challenges
The original concept of the data mesh sprouted from a desire for an alternative to monolithic data architectures that limited flexibility and opportunities for growth. By distributing data governance across separate domains, you gain more flexibility to scale data-driven initiatives.
This scalability can easily be limited by inadequate tools. If the foundational data platforms that you are building your data mesh on do not offer dynamic and flexible capabilities, they will not support a distributed architecture at scale. Your platforms must be able to support a distributed user base across domains without sacrificing data mesh security, something that many legacy solutions simply do not have the capacity to do.
“The only way for you to grow domains is to work with a proper data platform,” said Youssef. “If you start doing your data in single data clouds – data cloud one, data cloud two, data cloud three – you will end up with some issues that will need to be solved.”
Your cloud platforms must have cross-functional capabilities that enable holistic data governance, data sharing, data monitoring, and observability across domains in order to support a scalable and secure data mesh.
Operational Challenges
By providing more self-service access to data resources, organizations expect to improve operational efficiency through widespread data democratization. Democratization is the process of extending access to more of your data users, eliminating complicated frameworks and bottlenecks in the data access pipeline.
“We want to empower domains, data product teams, and data owners to own and manage their own data,” said Zwicker. “We want to have them interact directly with the consumers, shortening communication cycles, having fast feedback cycles, and driving value.”
Ideally, a domain-based architecture provides a wider range of users with self-service access to the data they need to achieve value-based goals.
The challenge? Data mesh is too often viewed as only a technical change, not a cultural one. In reality, the success of a data mesh architecture relies just as much on engaged data users as it does on the effectiveness of the tools and technology on which it is built.
“The fact is, [data mesh] is not like a plug-n-play,” said Malhotra. “You cannot just switch over one day. It requires a lot of commitment from that organization-level culture to succeed.”
Without cultural buy-in, changing to a data mesh architecture could easily cause organizational strife and confusion.
Solving Common Data Mesh Implementation Challenges
Bearing in mind these architectural, technological, and operational hurdles, how can teams go about achieving a realistic data mesh implementation? Our webinar panelists had a few key suggestions:
Federating Data Governance
Rather than taking a purely centralized approach to data governance – which we’ve established cannot scale effectively with a growing business – you should apply a form of federated data governance, which is one of the four main pillars of the original data mesh concept.
Governance should be applied at different levels of the data architecture, with each level maintained by the teams who work most closely with the data in question.
“Typically in modern democracies, you have three levels: a federal level, a state level, and a local level. And each of these levels, they have different responsibilities,” said Zwicker. “We have a constitution, we have laws, and we manage things at different levels – but it can never be against what’s said at the higher level. That ensures that from this federal perspective, we have consistency across the board and can ensure that we are compliant.”
How does this apply to a data mesh? Governance responsibilities can be assigned to different teams at the domain, regional, and even global levels. Global policies are the most general, applying controls that you’d want in effect throughout every domain. The regional- and domain-level controls would only be applied at those specific levels of the architecture. This helps create a hierarchical structure that removes the bottleneck of centralized controls, while still ensuring that the proper policies are applied where necessary within the distributed system.
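To make the hierarchy concrete, here is a minimal Python sketch of how layered policies might be combined. It is an illustration only, not how Immuta or Snowflake implement federated governance: the `Policy` class, the `LEVELS` ordering, the `conflicts` and `effective_masked` functions, and the column names are all hypothetical. The assumption is that each level can add masking rules, and that a lower level may never expose a column that a higher level requires to be masked.

```python
from dataclasses import dataclass

# Hypothetical governance levels, ordered from most general to most specific.
LEVELS = ("global", "regional", "domain")


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: columns this level masks, and columns it tries to expose."""
    level: str
    masked: frozenset = frozenset()
    exposed: frozenset = frozenset()


def conflicts(policies):
    """Return (level, column) pairs where a lower level exposes a column
    that a higher level requires to be masked."""
    ordered = sorted(policies, key=lambda p: LEVELS.index(p.level))
    required = set()  # columns masked by the levels processed so far
    found = []
    for policy in ordered:
        for column in sorted(policy.exposed & required):
            found.append((policy.level, column))
        required |= policy.masked
    return found


def effective_masked(policies):
    """Combine all levels: the result is the union of every level's masked columns.
    Raises ValueError if a lower level contradicts a higher one."""
    problems = conflicts(policies)
    if problems:
        raise ValueError(f"lower-level policy contradicts a higher level: {problems}")
    masked = set()
    for policy in policies:
        masked |= policy.masked
    return masked


# Example: each level contributes its own rule without weakening the others.
global_policy = Policy("global", masked=frozenset({"ssn"}))
regional_policy = Policy("regional", masked=frozenset({"postal_code"}))
domain_policy = Policy("domain", masked=frozenset({"salary"}))
print(effective_masked([global_policy, regional_policy, domain_policy]))
```

In this sketch, domain teams can tighten controls on their own data without waiting on a central team, while the conflict check plays the role of the "constitution" in Zwicker's analogy: a domain-level rule that tried to expose `ssn` would be rejected rather than silently overriding the global policy.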
Choosing Supportive Data Platforms
Understanding your current tech stack, as well as platforms you may adopt in the future, is an important step in the data mesh implementation process.
“When selecting and implementing supporting tools, we have to be cognizant of which tools are already there and what kind of budget the team has,” said Malhotra. “We also need to be cognizant of introducing new tools, and how we can see these new tools integrat[ing] with those existing tools for the automation of data discovery, quality, self-service access, and more.”
This process should not have to be a complete lift-and-shift from legacy tools to brand new platforms. Each organization will start with its own established tech stack, and should treat its implementation like a home renovation. Some pieces already exist, others might need to go, and new platforms can be added to meet any unfulfilled needs. You don’t need to knock down the whole house, just make the necessary adjustments within the existing infrastructure.
Dynamic data storage and analytics platforms like the Snowflake Data Cloud can provide the support teams need for the creation and upkeep of interconnected data domains. Paired with a data security platform that can automate and federate access control policies across domains, your team can ensure that your tech stack is prepared to handle the creation and expansion of a data mesh ecosystem.
Engaging and Enabling Key Stakeholders
“It’s very hard to assemble a team with a federated level mentality. That needs a little bit of cultural change,” said Youssef. “But the good thing [is], if you want to go with data mesh, it’s not a big bang. You can start with a single domain, and then you can grow it.”
Building out the technology and then springing it on your team will not garner successful results. Instead, start small and logically by identifying key users and use cases, and establishing some initial purpose-based domains. Once these domains are up and running, you can point to their success as both inspiration for further adoption and a model for new users to reference.
For anyone concerned with the risk of a data mesh, namely IT and security/compliance teams, Zwicker suggests focusing on the positive aspects of a domain-based delegation of responsibilities.
“It’s key that you have an approach to essentially show that, whilst delegating access management towards different levels in your platform, you can a) provide transparency, b) provide consistency, and c) guarantee that you have policies applied at these different levels that ensure that certain standards are met,” he said.
Essentially, meeting various organizational stakeholders where they’re at – rather than forcing data mesh upon them as a new required architecture – can help ensure users’ education, understanding, and overall buy-in to this new paradigm.
The Road to a Secure Data Mesh
Ultimately, a successful data mesh requires reflection, observation, collaboration, and empowerment. As long as you’re unified in setting realistic implementation goals, understand the changes needed in your tech stack, and empower your users with information and technical enablement, you can facilitate a streamlined and practical implementation.
To learn more about how the Snowflake Data Cloud and Immuta Data Security Platform help enable these kinds of distributed data architectures – without adding unnecessary risk or sacrificing robust security – watch the full Data Mesh vs. Data Security: Can You Have Both? webinar on-demand. You can also dig deeper into Snowflake data security and access controls in Immuta & phData’s new white paper How Do I Integrate Snowflake Security With My Enterprise Security Strategy?