Immuta Data Blueprints

Step-by-step guides for implementing dynamic data security for data science and analytics use cases.

As data use rules and regulations become more stringent, it is incumbent upon DataOps, legal, and compliance teams to ensure data use can be monitored and audited on-demand. For Snowflake users, unexpected obstacles can often make this a difficult or convoluted process. For instance, if a user does not have Snowflake admin privileges, they may be unable to access data source queries.

Immuta’s integration with Snowflake streamlines this process by providing the ability to store audit information for native Snowflake queries, allowing authorized users to view audit records whether or not they have admin-level permissions. This is in addition to collecting detailed policy auditing data tracked in Immuta and required for compliant data use or investigations.

In this blog, we’ll walk through how the process works step-by-step. It is important to note that the capabilities we’ll cover are currently available exclusively for customers using Immuta’s native integration with Snowflake’s Enterprise or higher editions.

Enabling Native Snowflake Query Audits

When enabling a native integration in Immuta, click the “Enable Native Query Audit” box. This is not enabled by default but can easily be turned on for any native integration.

Once enabled, the following grant is run during set-up.

GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE "Immuta System Role"

Running Snowflake Query Audits

Next, we’ll run a query against Snowflake to retrieve the query histories on our native data sources. It is important to note that no audit ingest is done immediately after enabling the native integration. Once the native integration with native query audit is enabled, the first time Immuta attempts to ingest audits is during an automated nightly job or via a manual “Native Query Audit” button. The nightly job gathers any new queries from the previous 24 hours, while the manual approach initiates the same job, but only searches for new queries created since the latest native query audit.

If there is no existing native Snowflake audit record, then it will query the past 24 hours. It is worth noting that to generate the audit record, we query the Snowflake QUERY_HISTORY and ACCESS_HISTORY tables – the former has a latency of up to 45 minutes and the latter has a latency of up to three hours, so it is possible that queries will not appear in the Immuta audit for 24 hours.

Once queried, the audit records are stored with a recordType of nativeQuery.

Executing a Snowflake Query Audit

Here is an example audit query:

{

“id”: “c4faa567-6dbe-4adc-9cee-92ba036546df”,
“dateTime”: “1632159600391”,
“month”: 1460,
“profileId”: 1,
“userId”: “999111224”,
“dataSourceId”: 18,
“dataSourceName”: “Public Customer”,
“count”: 1,
“recordType”: “nativeQuery”,
“success”: true,
“component”: “nativeSql”,
“accessType”: “query”,
“query”: “select * from IMMUTA.PUBLIC.customer;”,
“queryId”: “019f1464-0401-23fd-0040-d2030014a1be”,
“extra”: {

“handler”: “Snowflake”,
“startTime”: “2021-09-20 17:40:00.391000000 +0000”,
“endTime”: “2021-09-20 17:40:02.421000000 +0000”,
“duration”: “2030”,
“nativeObject”: “IMMUTA.PUBLIC.CUSTOMER”,
“nativeObjectType”: “view”,
“host”: “immuta-product_engineering.snowflakecomputing.com”,
“database”: “IMMUTA”

},
“sqlUser”: “[email protected]”,
“createdAt”: “2021-09-20T17:48:33.441Z”,
“updatedAt”: “2021-09-20T17:48:33.441Z”

}

The dateTime field refers to when the query started in UTC – it is stored as the epoch value and is referenced on the Audit page for the Timestamp column. When ingested, the new nativeQuery audit records will appear in the table, sorted along with all the other audit records.

The createdAt and updatedAt fields respectively reflect when the audit record was created and updated in Immuta. queryText is the query that was run and queryId is the value of the QUERY_ID column in the query_history table in Snowflake. Finally, sqlUser is the Snowflake account that ran the query.

Within the extra field, we’re able to store extra information specific to the native query that was run, such as:

    • handler – The native integration type
    • startTime – When the query started in UTC
    • endTime – When the query completed in UTC
    • duration – How long the query took in milliseconds
    • nativeObject – The fully qualified object that was queried
    • nativeObjectType – The type of the object that was queried (e.g. view or table)
    • host – The host to which the native integration is connected
    • database – The database in which the native object resides

Conclusion

For data teams that need to store audit information for policy management and native Snowflake data source queries, Immuta’s integration with Snowflake helps avoid bottlenecks and save time. In addition, our enhanced Snowflake integration allows users to automate access control for scalable policy management; write policies in plain language for transparency with legal and compliance teams; and audit all policy changes, versions, and query histories. These functions are done from an abstracted layer that’s transparent to Snowflake users, for a seamless user experience.

To explore these integrations and begin storing audit information across policy management and native Snowflake queries, connect to Immuta through Snowflake Partner Connect or start a free trial.