How To: Reporting & Auditing in Databricks Using Immuta

As a data engineer or architect, you are the person your organization’s security and compliance teams turn to for answers about who is accessing the company’s data. In the wake of data privacy regulations like the CCPA and Europe’s GDPR, not to mention a growing array of other state, federal, and industry rules, requests for data use reports have become all the more urgent. To remain compliant in a highly regulated world, companies have no choice but to enforce stricter, more robust internal data rules and security policies, and to be able to prove that they work as intended.

Practically speaking, this heightened level of regulatory scrutiny means that you must maintain an accurate, reliable, and highly detailed data audit trail. And while auditing has long been a part of basic data governance hygiene, the reality is that in today’s diverse cloud compute environments, it is often significantly more difficult than it used to be. That’s because when you have different compute layers, security and privacy controls can be enforced in any given layer, passed through to storage, or even implemented in analytical applications. Further complicating matters is the fact that there can be a variety of different people accessing your data, each with their own unique set of data access permissions.

To help overcome these challenges, Immuta offers Databricks users native data auditing and reporting capabilities designed to keep data use transparent while speeding up access and preserving your data’s privacy and utility.

Data Auditing on Demand

Immuta’s automated auditing and reporting capabilities make monitoring data usage easier and more efficient for Databricks users. By gathering real-time insights and creating detailed reports, you always know exactly what data was accessed, who accessed it, when they did so, and for what purpose. These capabilities include:

  • Scalable global policies. Data teams often face the challenge of having to completely recreate policies as their data sets, data users, and data platforms evolve. Immuta solves this problem by allowing you to author policies that reference metadata catalogs or business glossaries instead of physical tables and columns. That way, you create a policy just once, and then enforce it, and scale it dynamically, across Databricks and all of your data sources. Our policies also do not require code, so non-technical stakeholders can more easily understand how rules are being enforced. Additionally, automated data tags map to starter policies for HIPAA and the CCPA, reducing the burden on data engineers.
  • Unified audit logs. Immuta automatically monitors and logs all of the actions that happen in your data ecosystem. That gives you a centralized way to track requests and access to data, policy changes, how data is being used, specific queries users are executing, and more. With unified audit logs, you are able to see data usage across Databricks workspaces, as well as any other cloud data platform in your stack, without having to search multiple policy tiers.
  • Automated reports. You need to be able to tell legal and compliance teams which data consumers accessed specific data sources or exactly how sensitive data was used. However, this can often be a long, complicated process with many stakeholders involved. Immuta avoids this with automated reports and audit logs that can be shared across teams, allowing you to quickly answer these questions and provide total transparency into how data changes over time. The ability to track data sets and how they have changed over time also allows you to see how changes affect data use and output.

Immuta’s automated reporting and auditing for Databricks allow you to gather real-time insights on how data is being used so that you always have total cloud data access control and oversight.
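
To make the idea of a policy that references tags rather than physical columns more concrete, here is a minimal conceptual sketch in Python. It is not Immuta’s API or policy syntax: the CATALOG dictionary, the TagPolicy class, the resolve function, and every table and tag name are hypothetical and exist only to illustrate the pattern.

```python
from dataclasses import dataclass

# Hypothetical catalog: maps each physical column to the tags attached to it.
# In a real deployment these tags would come from a metadata catalog.
CATALOG = {
    "sales.customers.email": {"PII", "contact"},
    "sales.customers.region": set(),
    "hr.employees.ssn": {"PII", "government_id"},
    "hr.employees.title": set(),
}


@dataclass(frozen=True)
class TagPolicy:
    """A rule written against a tag rather than a table or column."""
    tag: str
    action: str  # e.g. "mask"


def resolve(policy, catalog):
    """Return every column the policy currently applies to, in sorted order."""
    return sorted(col for col, tags in catalog.items() if policy.tag in tags)


mask_pii = TagPolicy(tag="PII", action="mask")
covered_before = resolve(mask_pii, CATALOG)

# A new table arrives; tagging its column is enough for the policy to cover it.
CATALOG["support.tickets.requester_email"] = {"PII"}
covered_after = resolve(mask_pii, CATALOG)

print(covered_before)
print(covered_after)
```

Because the policy object never changes, the only work needed when a data set is added is tagging its columns. That is the sense in which a policy written once can scale as data sources evolve.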

Generating Data Audit Trails Using Immuta

Immuta’s unified access and control layer creates a single, intuitive place to manage all of your data. Even governance personnel can author and apply complex policies on all of your data without having to write memos or code, and can run reports to verify policies’ effectiveness at a glance. Here’s a quick look at how your team can view audit logs and generate reports in Databricks using Immuta:

1. To view all audit logs, click on the Audit icon displayed in the left side panel. Use the Filter box to view audit logs specific to purpose, query ID, user, record type, project, data source, and more. 

https://www.immuta.com/wp-content/uploads/2021/04/audit-sidebar-2.jpg

2. To build a report, click the select entity field and choose the basis for the report from the dropdown menu. Options include User, Group, Project, Data Source, Purpose, Policy Type, Connection, or Tag.

https://www.immuta.com/wp-content/uploads/2021/04/reports-select-entity-1.png

3. Type your entity name in the enter name field and select the name from the dropdown menu that appears. Once the entity name is selected, reports will populate the center window.

https://www.immuta.com/wp-content/uploads/2021/04/reports-enter-name-1.png

4. To run a report, click a tile with the appropriate description of the report.

https://www.immuta.com/wp-content/uploads/2021/04/reports-options-1.png

5. Once the report has been created, download it using the Export to CSV button. 

https://www.immuta.com/wp-content/uploads/2021/04/reports-export-csv1-1.png
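
Once a report has been exported, the CSV can be summarized with a few lines of Python to answer the kinds of questions compliance teams typically ask. The sketch below uses only the standard library. The column names (user, data_source, purpose, timestamp) and the sample rows are assumptions made for illustration, not the documented layout of an Immuta export, so check the header row of your own file and adjust accordingly.

```python
import csv
import io
from collections import Counter

# Stand-in for a file produced by the Export to CSV button. The column names
# and rows are assumptions for illustration, not a real Immuta export.
SAMPLE_EXPORT = """user,data_source,purpose,timestamp
alice,claims,fraud_review,2021-04-01T09:15:00
bob,claims,billing,2021-04-01T10:02:00
alice,patients,fraud_review,2021-04-02T14:30:00
alice,claims,fraud_review,2021-04-03T08:45:00
"""


def load_rows(handle):
    """Read the export into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def accesses_by_user(rows, data_source):
    """Count how many rows each user has for one data source."""
    return Counter(r["user"] for r in rows if r["data_source"] == data_source)


def purposes_for_user(rows, user):
    """Return the distinct purposes recorded for one user, sorted."""
    return sorted({r["purpose"] for r in rows if r["user"] == user})


# For a real export, replace io.StringIO(...) with open("report.csv", newline="").
rows = load_rows(io.StringIO(SAMPLE_EXPORT))
claims_counts = accesses_by_user(rows, "claims")
alice_purposes = purposes_for_user(rows, "alice")

print(dict(claims_counts))
print(alice_purposes)
```

Reading the file with csv.DictReader keeps the analysis tied to column names rather than positions, so the script keeps working if the export gains extra columns.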

Breathe Easier with Data Access Governance

If you’re a data engineer or architect, one of your top priorities is ensuring that all of your analytics data and data use complies with a growing array of complex regulatory and business rules. Without the right help, that can be a tall order. 

Immuta’s auditing and reporting capabilities allow you to see and understand who is allowed access to your data, why, and for what purpose, across your Databricks workspaces. For administrators and non-technical stakeholders on your company’s legal and compliance teams, this capability provides the transparency they are looking for. Plus, Immuta shows you how data has changed over time so that you can always demonstrate your organization’s compliance with federal, state, industry, and contractual requirements. For Databricks users, this is all done natively, streamlining data access governance and improving your efficiency and impact.

To learn more, check out our eBook, A Guide to Data Access Governance with Immuta and Databricks, which specifies exactly how you can use Immuta to unlock more value from your data.

Are you a Databricks user interested in trying Immuta for yourself? Get in touch with us today.
