If there’s one lesson from the constant advancement of data use and technology, it’s that there’s always room for improvement. Even the most sophisticated data infrastructure can be tuned up with new and improved capabilities that streamline data access and use.
This was the case for Booking.com, which recently decided it was time to move from a hybrid data infrastructure to a more scalable and self-service cloud-native ecosystem. Luca Falsina, Principal Software Engineer at Booking.com, sat down with Immuta Director of Product Management Zachary Friedman and Amazon Web Services Senior Product Manager Huey Han to discuss his organization’s shift to a simple, secure, and scalable data ecosystem. In this recap, we’ll share how Falsina’s team met their requirements for an accessible cloud-native infrastructure through dynamic platforms and simplified security.
Booking.com’s Data Governance & Access Management Requirements
As one of the largest travel and accommodation services in the world, Booking.com needed to make the updates necessary to enable a more scalable, secure, and accessible data ecosystem. Falsina outlined these required changes, which included:
Distributed Data Ownership
Falsina shared that Booking.com’s existing model of data access management was too centralized, with a single infrastructure team that had little contextual knowledge assigning and granting access permissions. Their team needed a way to distribute data ownership responsibilities and remove data access bottlenecks without impacting their holistic data security capabilities.
“We want to run a model of distributed ownership with guardrails. Initially, we were in a state [where] every single request…required that the infrastructure team provision access. And we have so many different pieces of data that most of the time, we don’t know if we should allow it,” said Falsina. “So the owner of the business piece of data is really the person, or team, that should make that call. It shouldn’t be the infrastructure team.”
To distribute ownership, the Booking.com team enacted a tiered data access system using Immuta’s dynamic policies, which can be written in plain language and applied consistently across platforms using attribute-based access controls. This system is helmed by data governors, who write global access policies to govern user access to data sets throughout the ecosystem, as well as review audit logs to ensure compliance.
Below the governors are the data asset owners, who own specific data sets, and are responsible for writing and applying asset-specific access policies. They are also in charge of reviewing and granting ad hoc consumer requests on the assets.
Lastly, data asset consumers discover and request access to these data sets, and are only granted access based on their specific user attributes.
“The owner of the data set is the one that makes the decision on who gets access,” said Falsina. “This really helps us to scale this up, because we don’t need to manage all of those access requests ourselves.”
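The tiered model above can be sketched in a few lines of Python. This is only an illustration of attribute-based access control in general, not Immuta's API: the `DataAsset` class, the `can_access` function, and the attribute names (`employment_status`, `department`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Attributes = Dict[str, str]
Policy = Callable[[Attributes], bool]  # returns True when the policy is satisfied


@dataclass
class DataAsset:
    name: str
    owner: str
    # Asset-specific policies, written by the data asset owner (hypothetical).
    policies: List[Policy] = field(default_factory=list)


# Global policies, written by data governors, apply to every asset (hypothetical).
GLOBAL_POLICIES: List[Policy] = [
    lambda attrs: attrs.get("employment_status") == "active",
]


def can_access(user_attrs: Attributes, asset: DataAsset) -> bool:
    """Grant access only if every global and asset-level policy passes."""
    return all(policy(user_attrs) for policy in GLOBAL_POLICIES + asset.policies)


bookings = DataAsset(
    name="bookings",
    owner="reservations-team",
    policies=[lambda attrs: attrs.get("department") == "analytics"],
)

analyst = {"employment_status": "active", "department": "analytics"}
marketer = {"employment_status": "active", "department": "marketing"}

print(can_access(analyst, bookings))
print(can_access(marketer, bookings))
```

The design point is that the infrastructure team appears nowhere in the decision: governors contribute the global rules, owners contribute the asset rules, and the consumer's attributes determine the outcome.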
Future-Proof Client Coverage
Booking.com’s data ecosystem is supported by a range of tools, including Snowflake, Kubernetes, Amazon SageMaker, and other AWS native services. With data stored, accessed, and analyzed across these different parts of the ecosystem, Falsina’s team needed to guarantee that their data governance capabilities would provide coverage for both existing and future platforms.
“We have a quite large spectrum of clients that will need to access our data platform. We have a vast amount of different services that need to access and read data. So we really need to have quite a lot of flexibility and be future-proof,” said Falsina. “We cannot afford to have dedicated client integrations for each one of these platforms, we need to have something foundational that can work across multiple services.”
To meet these client coverage needs, Booking.com leverages Immuta’s range of cloud-native integrations. For example, Immuta’s Snowflake integration allows the team to write access policies in Immuta and apply them natively to Snowflake data sets, without requiring any sort of proxy.
Similarly, Immuta’s new native integration with Amazon S3 Access Grants allows the team to apply Immuta policies across all structured and unstructured AWS data, including in their Kubernetes and SageMaker instances.
“If you use an S3 client, you can actually have access grants provided everywhere,” said Falsina. “So we can enforce the same set of controls across all of these different services.”
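To see why prefix-scoped grants give the same result from every service, consider a toy model of the idea. This sketch does not call AWS or Immuta. The `Grant` class, the `is_allowed` function, the bucket name, and the user names are hypothetical; only the permission levels (READ, WRITE, READWRITE) mirror those used by S3 Access Grants.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Grant:
    grantee: str     # individual user identity (hypothetical)
    prefix: str      # S3 prefix the grant covers
    permission: str  # "READ", "WRITE", or "READWRITE"


def is_allowed(grants: List[Grant], grantee: str, s3_uri: str, permission: str) -> bool:
    """Return True if any grant for this user covers the URI with enough permission."""
    for grant in grants:
        covers = s3_uri.startswith(grant.prefix)
        sufficient = grant.permission in (permission, "READWRITE")
        if grant.grantee == grantee and covers and sufficient:
            return True
    return False


GRANTS = [Grant("alice", "s3://example-bucket/bookings/", "READ")]
OBJECT_URI = "s3://example-bucket/bookings/2024/part-0.parquet"

# The check takes no "which service is calling" argument, so a notebook,
# a Kubernetes job, or any other S3 client gets the same answer.
print(is_allowed(GRANTS, "alice", OBJECT_URI, "READ"))
print(is_allowed(GRANTS, "alice", OBJECT_URI, "WRITE"))
print(is_allowed(GRANTS, "bob", OBJECT_URI, "READ"))
```

Because the decision depends only on the user, the S3 location, and the requested permission, no per-client integration is needed.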
Limited Data Access Overhead
Falsina emphasized how integral this requirement is to his team’s data access and governance approach. Their architecture needs to speed up access for data users, not slow time-to-data. If users encounter too many delays or performance issues with up-to-the-moment data, they won’t be able to draw timely insights or make informed, data-driven decisions. This can undermine the business objectives they are pursuing and lead to missed opportunities or uninformed actions.
“We don’t want access management to slow down runtime query performance. So we really want to avoid proxy architectures, where in order for the end user to access data, they need to go through Immuta in this case to get something at runtime. We really don’t want this pattern. We really want to avoid it as much as possible.”
By using Immuta as a control plane to build and apply policies across their ecosystem, the Booking.com team ensured that no proxy would sit in the query path and slow their users’ time-to-data. Instead, Immuta pushes policies down to the target systems, where they are enforced at query time.
“Thanks to that, there’s no overhead for us as clients that are connecting, and we don’t need to change how people connect to the target storage system,” said Falsina. “It’s completely transparent. We just get the grants that we need, and we can just go and query the data.”
Comprehensive Auditing & Logging
To adhere to evolving compliance laws and regulations, the Booking.com team needed comprehensive auditing and logging of all user activity in their data ecosystem. Falsina emphasized that this had to go beyond records of the roles or identities that multiple users might share. Instead, Booking.com requires every individual user’s access and usage to be monitored and logged for audit purposes.
“Compliance, logging, and auditability are very important. Understanding who is accessing which data – and again, which specific user is really accessing the data, not which AWS IAM role, because that doesn’t really tell us much.”
By logging user activity across Booking.com’s data stack, Immuta provides their team with a holistic and auditable history of all data access and use across their ecosystem. This includes activity within platforms like Snowflake and Amazon S3. With this unified trail of user activity, Falsina shared that the team can “get the granularity and the kind of reports that we need to prove to auditors who has accessed data in our platform across all the different storage technologies.”
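The difference between role-level and user-level audit trails is easy to show with a small example. The event records below are invented for illustration and do not reflect Immuta's actual audit log format; `access_by_user` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical audit events. Every event carries the same shared role,
# so the role alone cannot tell an auditor who touched the data.
AUDIT_EVENTS = [
    {"user": "alice", "role": "analytics-role", "platform": "snowflake", "asset": "bookings"},
    {"user": "bob", "role": "analytics-role", "platform": "s3", "asset": "s3://example-bucket/bookings/"},
    {"user": "alice", "role": "analytics-role", "platform": "s3", "asset": "s3://example-bucket/reviews/"},
]


def access_by_user(events):
    """Group (platform, asset) pairs by the individual user who accessed them."""
    report = defaultdict(list)
    for event in events:
        report[event["user"]].append((event["platform"], event["asset"]))
    return dict(report)


for user, accesses in access_by_user(AUDIT_EVENTS).items():
    print(user, accesses)
```

Grouping by `role` instead would collapse all three events into a single `analytics-role` entry, which is the loss of detail Falsina describes.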
Fine-Grained Data Access
In order to provide users with only the information they require for specific purposes – and nothing more – the Booking.com team desired more fine-grained access controls. This would allow them to approve or deny access on a granular level, instead of a simple “yes” or “no” for an entire data set.
The team also understood that users will likely want to be able to access both structured and unstructured data. This requires a system of governance and access management that can apply the same policies to both types of data.
“We do have interest in providing not just access to entire data sets, but have a bit of finer granularity in terms of row-level filtering and column masking on top of structural tabular data. We first want to focus on structured data, but we do see a lot of use cases of people wanting to ingest into the platform and eventually use unstructured data,” said Falsina. “So we want to have a clear plan to move into supporting unstructured data with the same type of access management framework.”
At the moment, the Booking.com team can use Immuta’s native integration with Snowflake to enforce fine-grained access controls across tabular Snowflake data sets, achieving the granular access permissions they were looking for.
As for the AWS portion of their data stack, Immuta’s Amazon S3 integration currently enables protection at the “object” level. To achieve additional granularity, the team registers all of their structured S3 prefixes as external tables in Snowflake. Another benefit of Immuta’s S3 integration is that it sets Booking.com up with the capacity to scale to supporting unstructured data in the future.
“Even if the data is unstructured, we are very comfortable that Immuta will just work pretty much in the same way, which gives us confidence that we have the plan moving forward,” said Falsina.
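For readers unfamiliar with the row-level filtering and column masking Falsina mentions, here is a minimal sketch of what those two controls do to a tabular result. It is plain Python over in-memory rows, not Snowflake or Immuta syntax, and the column names and the `pii_cleared` attribute are assumptions made for the example.

```python
ROWS = [
    {"booking_id": 1, "region": "EU", "email": "a@example.com"},
    {"booking_id": 2, "region": "US", "email": "b@example.com"},
]


def apply_policies(rows, user_attrs):
    """Return only the rows and column values this user is allowed to see."""
    visible = []
    for row in rows:
        # Row-level filter: users only see rows for their own region.
        if row["region"] != user_attrs["region"]:
            continue
        masked = dict(row)  # copy so the source rows are never modified
        # Column mask: hide the email unless the user has the
        # (hypothetical) pii_cleared attribute.
        if not user_attrs.get("pii_cleared", False):
            masked["email"] = "***"
        visible.append(masked)
    return visible


print(apply_policies(ROWS, {"region": "EU"}))
print(apply_policies(ROWS, {"region": "EU", "pii_cleared": True}))
```

The same data set yields different views for different users, which is what distinguishes fine-grained access from a single yes-or-no decision on the whole data set.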
How Booking.com Streamlines Snowflake & AWS Data Security
Booking.com’s successful solutions to their cloud-native requirements can be grouped into three core categories:
1. Establishing a Cloud-First Tech Stack with Integrated Tools
By integrating the Immuta Data Security Platform with industry-leading cloud storage and compute platforms like the Snowflake Data Cloud and Amazon S3, Booking.com created a cloud-based data stack with the capacity to securely store and access multiple petabytes of data across platform clients. This removed many of the on-premises constraints of their legacy system, and provides their team with the future-proof capacity to scale securely in the cloud.
2. Implementing Dynamic, Fine-Grained Access Controls
By leveraging Immuta’s dynamic fine-grained access controls, the Booking.com team is able to support their hierarchical data governance structure and allow for more self-service data ownership and use – without sacrificing security. This gives them the ability to apply fine-grained controls at query time, keeping controls consistent across platforms without inhibiting time-to-data for users in Snowflake, S3, and beyond.
3. Facilitating an Internal Launch & Enablement
By designing and internally launching a cloud-first architecture that streamlines the efforts of their data users, Booking.com has built a growing ecosystem that already boasts over 1,500 users and more than 500 data sources. Coupling user-first designs with internal enablement allowed the team to get users up and running right away, avoiding implementation confusion and driving near-immediate results.
To learn more about Booking.com’s journey to a cloud-native data infrastructure – as well as the powerful capabilities of Amazon S3 and Immuta that support this ecosystem – you can watch the full Beyond Compute Layers: Simplifying Data Security with Amazon S3 and Immuta webinar today.