SQL: The Forgotten API

Data mesh has become a hot topic as organizations look to foster self-service analytics to make evidence-based decisions. There are plenty of articles discussing data mesh, including Zhamak Dehghani’s original 2019 blog post introducing the topic.

To get back to basics, data mesh is a socio-technical approach to creating a decentralized analytical data architecture. It organizes data by domains to make it more accessible by connecting data owners, producers, and consumers. Grounding the data mesh-enabling architecture are four guiding principles:

  • Domain Ownership
  • Data as a Product
  • Self-Service Data Infrastructure
  • Federated Computational Governance

Data products could be one of the most confusing pillars of the data mesh because they force organizations to rethink how they deliver data to customers using product thinking. Product thinking requires a cultural change: being externally focused rather than internally focused, solving problems rather than taking orders, and building value rather than breaking down tasks.

This change requires data product owners to have deep knowledge and understanding of the customer to deliver the right data product. It also forces data product owners to own the product over the entire lifecycle, adapting it to the customer’s needs and not just ending the project when the data is delivered. In other words, the data product must achieve product-market fit.

With this in mind, how do organizations deliver data as a product for business analysts and data scientists building dashboards and models? To meet the needs of these users, data must be delivered in a way that lets them keep using their existing tools and processes to produce the analytics they need. To do that, data products should be delivered using SQL.

What Is a Data Product?

Dehghani calls a data product an “architectural quantum,” and cites Evolutionary Architecture’s definition of the term: “The smallest unit of an architecture that can be independently deployed with high functional cohesion, and includes all the structural elements required for its function.”

If you’re like me, that feels too technical. Put simply, a data product is a valuable set of data for analytical consumption. While many would categorize dashboards, Excel spreadsheets, and ML/AI models as data products, I would argue that they are not, because you cannot combine them with other data to create new insights. At this point, it may sound like a data product is essentially just a data set.

What makes data products different is that they are created using product thinking. In other words, the data consumer is at the center, and the product is treated as an ever-changing entity meant to delight them. This is a mindset shift: simply providing data is no longer good enough; you now need to show that the data provides value to the organization.

To do this, new metrics such as net promoter score provide better insights than traditional technical metrics, such as the number of times the data was accessed. In addition, data products must have the following usability characteristics:

  • Discoverable through a central catalog;
  • Addressable through a unique global identifier (see the sketch after this list);
  • Trustworthy and truthful, providing service level objectives, data integrity/quality, lineage, and other key aspects;
  • Self-describing semantics and syntax, meaning they can be discovered, understood, and consumed without hand-holding;
  • Interoperable and governed by global standards; and
  • Secure and governed by global access control.
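
To make the first two of these characteristics concrete, here is a minimal sketch of how a consumer might discover and address a data product directly on a modern data platform. The schema and table names (sales, orders_daily) are hypothetical, and the exact catalog syntax varies by platform:

  -- Discoverable: browse the platform's information schema (or data catalog)
  -- to find the data products a domain has published.
  SELECT table_name
  FROM information_schema.tables
  WHERE table_schema = 'sales';

  -- Addressable: each data product has a stable, fully qualified identifier
  -- that consumers reference directly in their queries.
  SELECT order_id, order_date, order_total
  FROM sales.orders_daily
  LIMIT 10;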

These characteristics enable consumers to easily join, aggregate, and filter data for analysis using data from different domains. The question then becomes, how does a domain deliver a data product?

What’s Wrong with Web Services for Data Mesh?

Having worked with customers on their journey to creating a data mesh, I often see that their first instinct is to use a web service-based delivery model. Usually, the data product is delivered over a REST interface using JSON, but some customers have looked at using GraphQL or even providing an AWS S3 endpoint for a Parquet or CSV file.

Organizations fall back on web services because they are typically the way we deliver data via applications in a distributed architecture, like a data mesh. While these are perfectly valid options that meet the characteristics of a data product, they are not the most efficient way to deliver a data product. The number one reason is that a data mesh is built for analytics use cases, not operational systems.

The issue with using web services is that the data is difficult to join, filter, and aggregate, operations that are essential to delivering analytical results. If one data product delivers data via JSON and another via XML (yes, it is still used), then the data needs to be wrangled into a normalized form for analysis. Even if all of the data is in JSON, it must be brought into memory to perform the analysis.

This is inefficient because it becomes difficult to scale the BI tools and servers doing the work to produce a result. In most cases, the data ends up stored in a database as a duplicate copy that must then be kept up to date through additional data engineering work. In other words, delivering data products through APIs pushes teams into exactly the anti-pattern a data mesh implementation is meant to avoid.

Why SQL Is the Future of Data Product Delivery

History says that SQL will win out over web services for data product delivery in data mesh architectures. Why?

Enhanced Data Platform Capabilities

The first release of Hadoop in 2006 ushered in the “big data” and batch processing analytics revolution. Hadoop was exciting because it did not force analysts into the strict SQL constructs of traditional relational data warehouses.

As users worked with the data, they discovered the need to add structure to it. This led to the release of Apache Pig in 2008 and ultimately, Apache Hive in 2010. Hive brought the batch processing movement full circle by introducing SQL to the Hadoop platform.

Organizations forget that SQL can be used for data products because they are stuck in traditional RDBMS data warehouse thinking. Data platforms have evolved significantly since Hadoop was released, and today’s cloud technologies provide the perfect medium to deliver data products.

These new platforms decouple the compute layer from the storage layer, reducing the cost of owning and running an analytical data platform. Additionally, modern open table formats such as Apache Iceberg, Delta Lake, and Apache Hudi let you take advantage of cheap object storage while providing RDBMS features like ACID compliance and capabilities such as time travel. This data platform revolution can be seen in Snowflake, Databricks, open source Trino and its commercial version Starburst, Google BigQuery, and others.
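
As a rough illustration, this is the kind of time travel query those table formats make possible. The example uses Trino’s Iceberg syntax and the same hypothetical table name as above; other platforms expose the feature through slightly different clauses:

  -- Query the current state of a data product...
  SELECT count(*) FROM sales.orders_daily;

  -- ...or the state it had at an earlier point in time (Iceberg time travel in Trino).
  SELECT count(*)
  FROM sales.orders_daily FOR TIMESTAMP AS OF TIMESTAMP '2023-01-01 00:00:00 UTC';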

Organizations often get stuck in legacy thinking that says you cannot grant access to directly query data. This too has changed, as modern data platforms provide security controls to deliver data products through direct access.

Query Language & Tool Integration

SQL is the industry standard for analytical workloads and an open standard developed by ISO/IEC Joint Technical Committee 1 (JTC 1) for Information Technology. Since all SQL systems base their query language on ISO/IEC 9075, the fear of vendor lock-in is reduced, because queries can be reused across systems with little to no change. This removes the need to use APIs to shield data product consumers from the underlying data platform technology.

SQL also checks all the boxes of a data product. Its integration with existing programming languages and BI tools allows analysts and data scientists to work within their preferred tools and workflows. And, since SQL lets analysts query exactly what they want from the underlying platform through joins, filters, and aggregates, it avoids the duplication and additional data engineering work that a data mesh is meant to prevent.
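
As a simple sketch of what that looks like in practice, the following query joins, filters, and aggregates hypothetical data products from two different domains in plain SQL, with no intermediate copies or wrangling code:

  -- Join, filter, and aggregate data products from two domains in a single query.
  -- All names (sales.orders_daily, marketing.campaigns) are illustrative.
  SELECT c.campaign_name,
         count(o.order_id)  AS orders,
         sum(o.order_total) AS revenue
  FROM sales.orders_daily o
  JOIN marketing.campaigns c ON o.campaign_id = c.campaign_id
  WHERE o.order_date >= DATE '2024-01-01'
  GROUP BY c.campaign_name
  ORDER BY revenue DESC;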

Most importantly, using tools such as Immuta to manage privacy and security policies allows you to securely query data products directly in your data platform of choice.

How Immuta Can Help

Immuta is a key component of a data mesh architecture, providing the computational governance to deliver data products as SQL through modern data platforms. Because Immuta enforces the data controls necessary to meet organizational and regulatory requirements, users can safely query data products using their existing tools, such as business intelligence tools or Jupyter notebooks. Instead of relying on in-house or custom solutions for data access control, Immuta’s policy automation unlocks the potential of data mesh for self-service analytics.

Because users query the data directly, rather than through an API, Immuta improves cybersecurity by logging query activity and enabling zero trust architectures at the data platform level. Combining user query information with network and application log data creates a complete picture of user activity to stop potential threats.

Immuta continually scans the data platform for sensitive data that needs to be protected, and automates access policy enforcement on that data. As new data products are onboarded, organizations can be assured that any sensitive data can be discovered and protected, lessening the possibility of data leakage.

Leveraging SQL to deliver data products within a data mesh architecture is the most practical choice for organizations aiming to achieve scalability and flexibility while unlocking true value from their cloud platforms. Layering Immuta into this equation helps ensure that no sensitive information slips through the cracks. With streamlined, efficient, and secure workflows, organizations will be better able to reap the full benefits of a data mesh architecture.

Simplify Data Mesh Implementation

Securing a data mesh doesn't have to be complicated. See how in our eBook.
