A data mesh is a relatively new data platform architecture that moves away from the ‘monolithic’ approach of data warehouses and data lakes in favor of a more decentralized method of data management.
The data mesh architecture is designed to address three key problems inherent in data warehouses and data lakes, namely:
- Who owns data – the team that is the source of the data or the team providing the infrastructure for housing the data?
- Who is responsible for data quality? Typically the infrastructure team assumes this responsibility, but they are not necessarily deeply familiar with the data itself and may therefore require additional resources.
- What happens when an organization needs to scale while avoiding data bottlenecks, a common problem with data warehouses or data lakes?
Data Mesh Architecture
A data mesh treats data as a product within the organization, assigning each data source its own product manager or lead. Data as a product is a commonly used term when referring to the architecture of a data mesh.
Another term frequently associated with data mesh is data infrastructure as a platform. This refers to the shared storage, transfer pipelines, cataloging, and access control that all of the data domains within the organization rely on.
The most effective domains are easily discoverable, self-service, and secured through global access control.
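To make the "data as a product" idea concrete, here is a minimal Python sketch of a shared catalog in which each domain registers its own products. The `DataProduct` fields, the `Catalog` class, and the sample domains are all hypothetical and chosen for illustration; they do not represent any particular tool's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """One data product, owned by a single domain team (hypothetical schema)."""
    name: str
    domain: str
    owner: str  # the product manager or lead for this data source
    description: str
    tags: frozenset = field(default_factory=frozenset)


class Catalog:
    """Shared, platform-level catalog that makes every domain's products discoverable."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        # Each (domain, name) pair may be registered only once.
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.domain}/{product.name} is already registered")
        self._products[key] = product

    def search(self, tag):
        """Return all products carrying the given tag, sorted by domain then name."""
        matches = [p for p in self._products.values() if tag in p.tags]
        return sorted(matches, key=lambda p: (p.domain, p.name))


catalog = Catalog()
catalog.register(DataProduct("orders", "sales", "sales-data-lead",
                             "Confirmed customer orders",
                             frozenset({"revenue", "customer"})))
catalog.register(DataProduct("campaigns", "marketing", "marketing-data-lead",
                             "Campaign spend and reach",
                             frozenset({"revenue"})))

for product in catalog.search("revenue"):
    print(f"{product.domain}/{product.name} (owner: {product.owner})")
```

In this sketch the platform supplies the catalog, while each domain team stays accountable for the products it registers, which mirrors the split between data infrastructure as a platform and data as a product.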
Why Is Data Mesh Important?
Many consider data mesh to be the future of data management as companies move away from a single data warehouse (traditional or cloud-based) or data lake. These centralized platforms generally rely on technical specialists, tend to incur technical debt, and give teams less control over growing data pools. As unique data use cases proliferate, leaning heavily on a single, centralized platform becomes less appealing.
Organizations also increasingly deal with inefficient data supply chains – data producers that are out of the loop, data consumers who are constantly waiting on data bottlenecks, and data teams that don’t have the resources to keep up with massive volumes of data and the fast-paced data demands of their business.
A data mesh has the potential to solve all of these problems by combining a centralized data hub with a series of spokes (domains) that are each responsible for a data pipeline. When executed correctly, this theoretically allows for improved data scalability over time.
That said, for all its potential benefits, the data mesh architecture also has its share of potential drawbacks and unique challenges.
Challenges & Benefits of Data Mesh
Let’s compare some of the potential data mesh implementation challenges with the benefits of a successful switch from a centralized data lake or data warehouse to a data mesh.
Data mesh is not necessarily right for all organizations. For instance, it’s generally not well suited for organizations that do not have large data domains. Organizations with smaller-scale data needs and less potential confusion about who is in control of data may find that implementing a data mesh architecture is not worth the effort of transitioning from existing processes.
Meanwhile, the tools available for enabling universal data access control in a data mesh architecture are currently limited. Immuta is one of the few solutions that can provide dynamic access controls consistently across the data mesh.
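As a rough illustration of what dynamic, attribute-based access control across domains involves, the Python sketch below applies one global policy to a row from any domain and masks the columns a user is not entitled to see. The policy table, tag names, and user attributes are assumptions made for this example; this is not Immuta's API or policy model.

```python
def mask(value):
    """Replace a sensitive value with a fixed placeholder."""
    return "***"


# Hypothetical global policy: column tag -> user attribute required to see it unmasked.
GLOBAL_POLICY = {
    "pii": "pii_cleared",
    "financial": "finance_role",
}


def apply_policy(row, column_tags, user_attributes):
    """Return a copy of row with every column the user may not see masked."""
    result = {}
    for column, value in row.items():
        # Collect the attributes this column demands under the global policy.
        required = {
            GLOBAL_POLICY[tag]
            for tag in column_tags.get(column, ())
            if tag in GLOBAL_POLICY
        }
        # Untagged columns require nothing, so they always pass through.
        result[column] = value if required <= user_attributes else mask(value)
    return result


# A row and its column tags, as a sales domain might publish them.
sales_row = {"order_id": 1001, "email": "ana@example.com", "amount": 250.0}
sales_tags = {"email": {"pii"}, "amount": {"financial"}}

finance_analyst = {"finance_role"}
print(apply_policy(sales_row, sales_tags, finance_analyst))
```

Because the decision is made at query time from tags and user attributes, the same policy can follow the data into every domain without each team writing its own rules.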
Finally, data virtualization is a process still rife with potential challenges that are outside the scope of this article, but which could stand in the way of a successful data mesh implementation.
Despite a few potential challenges, a data mesh can deliver significant benefits and results. Chief among these is the ability to avoid the politics of ‘data sovereignty,’ or confusion over who is responsible for the data coming in from a range of different sources.
Because a data mesh puts control of data into a series of domains based on those domains’ specific needs, there’s no single ‘owner’ of the data. This not only helps streamline operations and decision-making, but can also reduce confusion and frustration within your organization.
Under a data mesh approach, domain teams enjoy greater autonomy when creating and using relevant data, while data users benefit from global interoperability standards and independent data products with their own value.
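One way to picture a global interoperability standard is as a small contract that every data product must satisfy, whichever domain publishes it. The following Python sketch checks product metadata against such a contract; the required field names and the sample products are hypothetical.

```python
# Hypothetical global standard: metadata every data product must publish.
REQUIRED_METADATA = {"owner", "schema_version", "update_frequency"}


def contract_violations(product_metadata):
    """Return the sorted list of required metadata fields that are missing or empty."""
    return sorted(
        name for name in REQUIRED_METADATA
        if not product_metadata.get(name)
    )


sales_orders = {
    "owner": "sales-data-lead",
    "schema_version": "1.2",
    "update_frequency": "hourly",
}
ops_shipments = {"owner": "ops-data-lead", "schema_version": "0.9"}

print(contract_violations(sales_orders))   # []
print(contract_violations(ops_shipments))  # ['update_frequency']
```

Domain teams remain free to build their products however they like, as long as what they publish passes the shared check.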
A data mesh can also help reduce data management bottlenecks by removing the burden of placing all responsibility on a single team, and instead allowing individual data product managers to control their specific data domains.
Plus, a data mesh architecture helps ensure that your data ecosystem can scale as data sources, use cases, and data access models multiply.
When to Consider a Data Mesh
When determining whether or not you should move to a data mesh architecture, consider the following questions.
- How many sources of data does your company deal with on a regular basis?
- How large is your data team, including analysts, engineers, product managers, and other roles?
- How many non-data teams (sales, operations, marketing) at your organization use data to make key decisions?
- How many products does your company offer?
- How many products or features currently exist or are being built that will be heavily data-driven?
- How often do bottlenecks slow the momentum of implementing new data products?
- Are data access control and data security major priorities at your organization?
When managing large data sources, various data consumers, and a range of use cases, a data mesh architecture could provide substantial benefits that bypass inefficient processes and allow you to get more out of your data. To find out how Immuta integrates with Starburst to enable secure data mesh architectures, check out our blog.
Looking for a cloud data access control tool that can help effectively and securely implement a data mesh architecture across your organization? Immuta provides self-service data access with automated, dynamic access control that can be applied consistently across all of your cloud data platforms, and easily audited to monitor data access and use across domains. Our features include:
- Universal Cloud Compatibility
- Attribute-Based Access Control
- Sensitive Data Discovery & Classification
- Dynamic Data Masking
- Data Policy Enforcement & Auditing
Ready to learn more? Request an Immuta demo today!