The Databricks Data + AI Summit converged on downtown San Francisco for its 10th year, bringing an estimated 25,000 data leaders and practitioners from around the world together for a series of keynotes, breakout sessions, trainings, and networking events. While the audience and industries were diverse, one term echoed throughout every keynote and conversation: generative AI.
This year’s sessions undeniably proved that generative AI isn’t just a fad – it’s driving the future of data use and, by extension, how we work and live. How, exactly? In this blog, we’ll recap our top takeaways from the summit about how AI is shaping the way we interact with, manage, and protect data.
1. AI Democratization Will Fundamentally Change How We Work
AI has already impacted seemingly mundane aspects of our lives, from how we unlock our phones to how we deposit checks. But until recently, its influence on our jobs has generally been less obvious. AI was often regarded as something that was confined to the realms of research labs and tech giants, not accessible to just anyone.
Generative AI tools like ChatGPT, Jasper.ai, and Bard have flipped that notion on its head. As Databricks Co-Founder and CEO Ali Ghodsi pointed out in his keynote, the internet existed long before the web browser made it mainstream. Similarly, deep learning has been around for decades, but generative AI technologies have just started to democratize it. Why is that important?
In short, it removes barriers. Tasks that once required specialized knowledge and skill sets now only require an understanding of natural language. Ghodsi quoted OpenAI’s Andrej Karpathy, saying, “The hottest new programming language is English” (or your preferred language, as Ghodsi was quick to add). This sentiment was repeated in many forms across different speaking sessions, with the general takeaway being that coding expertise is no longer required to query, analyze, and make sense of data.
It’s important to look realistically at what this means – and what it doesn’t. With more reliance on data and fewer barriers to using it thanks to AI, organizations can relieve bottlenecks and enable more efficient, continuous workflows. But that doesn’t translate to technology replacing humans in the workplace.
“Really good programmers will still need to know everything about their role, but they will be much more productive,” said Marc Andreessen, co-founder and general partner at Andreessen Horowitz. “There are always more programs to write, but what they run out of is time and resources. [AI] will free people up to do more productive work.”
This benefit offers potentially massive efficiencies for organizations, so building AI into tools and workflows – like Databricks’ just-released LakehouseIQ – is likely to become more common as AI development continues to mature. Andreessen anticipates that AI tools will become more specialized to address the needs of specific roles, and that workers will have multiple AI “co-pilots” that help increase personal productivity. Still, like managers and their direct reports, these co-pilots will require humans’ time, effort, and attention in order to truly provide a benefit.
2. …But How to Address AI Privacy Issues Remains Unclear
On the other side of every discussion about the evolution and democratization of generative AI, there was one inevitable caveat: Privacy concerns are at odds with AI innovation, and there’s not yet a clear strategy for mediation. Speakers ranging from business leaders to academic researchers voiced concerns about the potential for AI to:
- Accelerate the spread of misinformation and disinformation
- Expose organizations to more privacy and compliance risks
- Generate outputs using models trained on sensitive or proprietary information
- Cause negative outcomes that we haven’t yet anticipated or considered
Eric Schmidt, former CEO and Chairman of Google, stated in a fireside chat with Databricks Co-Founder and SVP of Field Engineering Arsalan Tavakoli that “there’s a general fear that these systems will have an emergent behavior that we’re not aware of yet,” and said he expects regulations to be based on extreme potential risks. Larry Feinsmith, Managing Director and Head of Global Tech Strategy, Innovation, and Partnerships at JPMorgan Chase, acknowledged the value of AI but does not plan to fully “roll out generative AI until we can mitigate all the risks.”
With so many unknowns about generative AI and the breakneck speed at which it’s evolving, determining an approach to managing data privacy and security is complicated, to say the least. Daniela Rus, Director of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), said that part of the answer is education: organizations need to teach their data users and the public about the strengths and shortcomings of generative AI so that people understand the tools that are being put in their hands.
But AI pioneers need to ensure their systems have robust data security capabilities to proactively mitigate threats as well. Since regulations are typically slow to follow technological advances, taking control of how sensitive data is protected and used may help stave off inadvertent exposure via AI-assisted workflows. Immuta’s native integration with Databricks and support for Unity Catalog simplify enforcement of complex security controls across any Databricks workspace, including interactive clusters and Databricks SQL. This allows joint customers to run and train AI models, and develop AI applications, while complying with existing and forthcoming regulations – all without exposing raw data to third parties or unauthorized users.
You can read more about this in Immuta CEO Matt Carroll’s blog about Databricks’ investment in Immuta.
3. Security & Monitoring Are Driving Unity Catalog Optimizations
Databricks announced several enhancements for Unity Catalog during the Data + AI Summit, indicating a greater focus on how data is secured, governed, and monitored within AI-driven workflows. As a core Databricks partner, Immuta is now the first data security platform to natively and fully integrate with Unity Catalog, seamlessly orchestrating access controls across Databricks clusters and Databricks SQL.
For Databricks SQL users, not needing to rely on view-based policy enforcement will greatly reduce the amount of time it takes to author and enforce controls, making it easier to accelerate secure operations across the Databricks Lakehouse Platform. And customers are already reaping the benefits of this seamless, non-invasive orchestration.
“Immuta goes in, takes those [access] rules that we care about, and actually converts them into the source system primitives that we need to gate access in Unity itself,” said Instacart Senior Software Engineer Kieran Taylor. “The end user is going to be going into a Databricks environment, interacting with tables in the same way that they would have previously, and they are unaware that Immuta is the thing which is gating access.”
Building on Unity Catalog’s foundational capabilities, Immuta allows users to:
- Discover and tag sensitive data, centralize and enrich metadata, and in the future, leverage Unity Catalog lineage for tag propagation
- Orchestrate attribute-based access control (ABAC) for table- and row-level security, as well as column masking, via policies that are reflected as native Unity Catalog controls
- Monitor data usage and manage risks through audit logs of queries, policy enforcement, backing storage, changes, and access summaries
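To make the attribute-based model in the list above more concrete, here is a minimal sketch in plain Python. It is an illustration only and does not use Immuta’s or Unity Catalog’s actual APIs: the `User` class, the attribute names (`regions`, `pii_cleared`), and the sample rows are all hypothetical. The idea it shows is that access decisions depend on attributes attached to the user rather than on a per-user or per-view rule.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record; attributes drive every access decision."""
    name: str
    attributes: dict = field(default_factory=dict)


# Hypothetical sample table.
ROWS = [
    {"order_id": 1, "region": "US", "email": "ana@example.com"},
    {"order_id": 2, "region": "EU", "email": "ben@example.com"},
]


def row_visible(user: User, row: dict) -> bool:
    # Row-level security: a row is visible only if its region is
    # listed in the user's "regions" attribute.
    return row["region"] in user.attributes.get("regions", ())


def mask_columns(user: User, row: dict) -> dict:
    # Column masking: redact the email column unless the user
    # carries the (assumed) "pii_cleared" attribute.
    if user.attributes.get("pii_cleared", False):
        return dict(row)
    return {**row, "email": "REDACTED"}


def query(user: User, rows: list) -> list:
    # Filter rows first, then mask columns in whatever remains.
    return [mask_columns(user, r) for r in rows if row_visible(user, r)]


analyst = User("analyst", {"regions": ["US"]})
auditor = User("auditor", {"regions": ["US", "EU"], "pii_cleared": True})

analyst_view = query(analyst, ROWS)
auditor_view = query(auditor, ROWS)
```

In this sketch the analyst sees only the US row with the email redacted, while the auditor sees both rows unmasked. Granting someone new access means changing their attributes, not writing another policy or view.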
As Unity Catalog continues to evolve, Immuta’s support for it will also expand so that customers like Instacart can optimize data workflows, securely share data, and capitalize on AI-driven opportunities that will unlock innovation.
Get the details about our new features for Unity Catalog here.
What’s Ahead for AI, Databricks, & Immuta
It’s difficult to distill all the insights from Data + AI Summit down to just three, but it was impossible to ignore the focus on generative AI, its implications for data privacy, and the growing role of Unity Catalog for Databricks customers. Over the coming months, we expect that innovations in these areas will continue to advance at a rapid – and potentially accelerated – clip. We’re excited to grow our partnership with Databricks so customers can unlock more value from their Databricks data and AI initiatives.
To find out more about how Immuta integrates with Databricks Unity Catalog, check out our joint eBook from Immuta Co-Founder and CTO Steve Touw and Databricks Senior Director of Product Management Jonathan Keller.
Immuta + Unity Catalog: The Next Frontier of Scalable Data Security
Get an in-depth look at Immuta's integration with Databricks Unity Catalog.