Our 3 Key Takeaways from Databricks’ Data + AI Summit

Databricks’ Data + AI Summit was held virtually over just one week, but judging by the amount of content on offer, you would guess the event ran much longer. From breakout sessions to training classes, demos, networking, and more, it would have been impossible to experience everything offered to attendees.

Fortunately, our team has you covered. This blog article will get you up to speed on the key themes and can’t-miss details from the Data + AI Summit. 

1. The Lakehouse Architecture Is Going Mainstream

During an event keynote, Rohan Dhupelia, Senior Data Management Manager at Atlassian, shared his experience migrating from multiple data warehouses to a single data lakehouse with multiple petabytes of data and over 3,000 monthly active internal users. The findings of a recent data engineering survey showed that Atlassian isn’t alone: 75% of respondents reported that they expected to be cloud-based within the next two years, citing the time, cost, and innovation benefits offered by cloud data platforms. Given the recent popularity of cloud data platforms, Atlassian’s decision to migrate its data from PostgreSQL and Amazon Redshift data warehouses to a single Databricks data lakehouse architecture comes as no surprise.

Before this migration, Atlassian struggled with a series of internal challenges, such as data copying, different syntaxes across warehouses, difficulty scaling, and blocked use cases. These resulted in siloed data and outdated architecture, which added complexity to data analytics, delayed time to data access, and even blocked downstream analysis. 

Atlassian partnered with Databricks to build an internal data architecture that allowed for faster queries with self-managed clusters, unified data, simple warehouse-style commands, and improved experiences for BI use cases. This new architecture significantly reduced costs with cheaper storage, focused data engineering efforts, simplified data governance, and unlocked self-service data access and use for data scientists.

Don’t Miss: Rohan also shared the upcoming projects that he’s most excited to start, including introducing Immuta’s data access and privacy control layer on top of Atlassian’s data lake to enable more sensitive data use cases; moving business intelligence workflows to Databricks SQL; and adopting Databricks Delta Sharing to further reduce data governance complexity.

2. Lakehouse Architectures Require Data Access Controls

The world of data is exploding, but organizations often struggle to adapt and take full advantage of the latest data innovations and products. A recent Gartner report found that 62% of data teams consider overcoming siloed data use the most challenging aspect of data and analytics governance. 

To maximize ROI on cloud data platform investments, organizations are increasingly exploring lakehouse architectures that leverage Databricks’ innovation and workload support. Immuta’s breakout session at Databricks’ Data + AI Summit outlined best practices for overcoming common data access control challenges across heterogeneous lakehouse environments, highlighting our integrations with Databricks and other leading cloud technologies.  

In one breakout session, Zachary Friedman, Senior Product Manager at Immuta, shared his experience building data access control solutions across lakehouse architectures, touching on the evolution of the most important and valuable capabilities of modern data lakehouse architectures. Zachary recommended adopting a framework for managing access controls, and walked through the unique value of attribute-based access controls and other fine-grained privacy and security controls.
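To make the idea of attribute-based access control concrete, here is a minimal Python sketch. It is an illustration only and does not represent Immuta’s product or API: the `User`, `Table`, and `Policy` classes, the `can_read` function, and the attribute and tag names are all hypothetical. The point is simply that access is decided by matching user attributes against data tags, rather than by maintaining a role for every user-and-table combination.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    attributes: dict = field(default_factory=dict)  # e.g. {"department": "finance"}


@dataclass
class Table:
    name: str
    tags: dict = field(default_factory=dict)  # e.g. {"sensitivity": "pii"}


@dataclass
class Policy:
    """Hypothetical rule: allow when all listed user attributes and table tags match."""
    user_must_have: dict
    table_must_have: dict

    def allows(self, user: User, table: Table) -> bool:
        user_ok = all(user.attributes.get(k) == v for k, v in self.user_must_have.items())
        table_ok = all(table.tags.get(k) == v for k, v in self.table_must_have.items())
        return user_ok and table_ok


def can_read(user: User, table: Table, policies: list) -> bool:
    # Deny by default; grant access only if at least one policy allows it.
    return any(policy.allows(user, table) for policy in policies)


# Example data (all names are made up for illustration).
policies = [Policy({"department": "finance"}, {"sensitivity": "pii"})]
analyst = User("analyst", {"department": "finance"})
intern = User("intern", {"department": "marketing"})
payroll = Table("payroll", {"sensitivity": "pii"})

print(can_read(analyst, payroll, policies))  # True
print(can_read(intern, payroll, policies))   # False
```

Because the single policy refers to attributes and tags instead of named users or tables, it keeps working as new people join the finance department or new tables are tagged as PII.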

Don’t Miss: If you’re interested in learning more about the importance of automated data access control, check out this video to watch an Immuta customer explain the value that it brings to their team. 

3. Cross-Organization Data Sharing Is the Next Frontier

Speaking sessions weren’t the only source of buzz during the event, with both Databricks and Immuta enhancing product capabilities. 

Databricks unveiled Delta Sharing, an open protocol for securely sharing data across organizations in real time, completely independent of the platform on which the data resides. Delta Sharing allows internal data monetization by enabling data teams to share their data externally with third parties for stronger and better aggregated insights, and can help organizations gain data-driven competitive advantages.
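For readers curious what consuming a shared table might look like, the sketch below uses the open source `delta-sharing` Python connector. It was not shown in the sessions we attended, and it rests on assumptions: the profile file name `config.share` and the share, schema, and table names are hypothetical placeholders, and the profile file itself would be supplied by the data provider.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame (requires network access)."""
    # Imported lazily so table_url() works without the package installed.
    import delta_sharing  # pip install delta-sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))


def list_shared_tables(profile_path: str):
    """List every table the provider has shared with this recipient."""
    import delta_sharing

    return delta_sharing.SharingClient(profile_path).list_all_tables()


# Hypothetical names, for illustration only.
url = table_url("config.share", "sales_share", "retail", "orders")
print(url)  # config.share#sales_share.retail.orders
```

The recipient needs only the profile file and a client library, not an account on the provider’s platform, which is what makes the protocol platform-independent.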

Immuta integrates with Databricks to power this new capability, providing automated, advanced anonymization techniques to maximize the utility of your data while protecting its privacy. Immuta for Databricks allows for immediate third-party data sharing with the extra flexibility for ad hoc queries on your company’s own Databricks resources. 

The introduction of Delta Sharing is a testament to Databricks’ progress towards fostering an open, democratized data and AI ecosystem, as well as a future with more collaborative, secure data use.   

Immuta also recently announced our native integration with Databricks SQL Analytics. This means that data engineering and operations teams are better equipped than ever before to centralize data access control across lakehouse architectures, maximize the full value of their cloud investments, and meet contractual and regulatory SLAs for data access and usage.

Where most other access control solutions rely on each data platform’s specific data access control capabilities, which can lead to substantial data security and protection risks, Immuta provides centralized, universal data access control for cross-platform and native cloud data platforms. This unlocks streamlined data access and use throughout the data lakehouse, without manual processes, data copying, proliferation of views, or inconsistent policy enforcement.

Don’t Miss: Don’t take our word for it! Watch this short video to hear Databricks’ Senior Product Marketing Manager explain why she recommends Immuta’s cloud data access controls to all data-driven organizations.  

Databricks’ Data + AI Summit was not only chock-full of interesting presentations, but it also served as the stage for unveiling the latest industry innovations. With advancements in their cloud data platform, Databricks continues to drive innovation in data and AI. Their recent product updates go beyond improving utility by democratizing data use and actively incentivizing collaboration.

The need for strong data governance and access controls also continues to emerge as a recurring theme across industry content. As organizations move to the cloud and adopt heterogeneous data lakehouse architectures, their strategies for data security and governance need to change. Immuta’s universal cloud data access control provides the flexibility needed to successfully protect data stored across diverse data architectures, without sacrificing data security or utility. 

Request a personalized demo of our product to learn more about how we empower successful data use across hybrid, cloud, and multi-compute data platforms. 
