Skip to main content

Snowflake

NOTE: Datafold needs permissions in your Snowflake dataset to read your table data. You will need to be a Snowflake Admin in order to grant the required permissions.

Steps to complete:

Create a user and role for Datafold

A full script can be found at the bottom of this page.

It is best practice to create a separate role for the Datafold integration (e.g., DATAFOLDROLE):

CREATE ROLE DATAFOLDROLE;
CREATE USER DATAFOLD DEFAULT_ROLE = "DATAFOLDROLE" MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE DATAFOLDROLE TO USER DATAFOLD;

To provide column-level lineage, Datafold needs to read & parse all SQL statements executed in your Snowflake account:

GRANT MONITOR EXECUTION ON ACCOUNT TO ROLE DATAFOLDROLE;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE DATAFOLDROLE;

Set up password-based authentication

Datafold supports username/password authentication, but also key-pair authentication.

ALTER USER DATAFOLD SET PASSWORD = 'SomethingSecret';

You can set the username/password in the Datafold web UI.

Use key-pair authentication

If you want to use key-pair authentication, please follow the steps of Snowflake. The public key will be set to the Snowflake user:

ALTER USER DATAFOLD SET rsa_public_key='abc..'

The private key needs to be uploaded to Datafold, and the optional passphrase of the private-key can be set to the user.

Create schema for Datafold

Datafold utilizes a temporary dataset to materialize scratch work and keep data processing in the your warehouse.

CREATE SCHEMA <database_name>.DATAFOLD_TMP;
GRANT ALL ON SCHEMA <database_name>.DATAFOLD_TMP TO DATAFOLDROLE;

Give the Datafold role access

Datafold will only scan the tables that it has access to. The snippet below will give Datafold read access to a database. If you have more than one database that you want to use in Datafold, rerun the script below for each one.

/* Repeat for every DATABASE to be usable in Datafold. This allows Datafold to
correctly discover, profile & diff each table */
GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE DATAFOLDROLE;
GRANT USAGE ON DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT USAGE ON ALL SCHEMAS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL TABLES IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE TABLES IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL MATERIALIZED VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE MATERIALIZED VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

Full Script

--Step 1: Create a user and role for Datafold
CREATE ROLE DATAFOLDROLE;
CREATE USER DATAFOLD DEFAULT_ROLE = "DATAFOLDROLE" MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE DATAFOLDROLE TO USER DATAFOLD;

GRANT MONITOR EXECUTION ON ACCOUNT TO ROLE DATAFOLDROLE;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE DATAFOLDROLE;

--Step 2a: Use password-based authentication
ALTER USER DATAFOLD SET PASSWORD = 'SomethingSecret';
--OR
--Step 2b: Use key-pair authentication
--ALTER USER DATAFOLD SET rsa_public_key='abc..'

--Step 3: Create schema for Datafold
CREATE SCHEMA <database_name>.DATAFOLD_TMP;
GRANT ALL ON SCHEMA <database_name>.DATAFOLD_TMP TO DATAFOLDROLE;

--Step 4: Give the Datafold role access to your Data Source
/*
Repeat for every DATABASE to be usable in Datafold. This allows Datafold to
correctly discover, profile & diff each table
*/
GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE DATAFOLDROLE;
GRANT USAGE ON DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT USAGE ON ALL SCHEMAS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL TABLES IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE TABLES IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

GRANT SELECT ON ALL MATERIALIZED VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE MATERIALIZED VIEWS IN DATABASE <database_name> TO ROLE DATAFOLDROLE;

Validate Snowflake Grants for Datafold

Run these queries to validate that the grants have been set up correctly:

Note: More results may be returned than shown in the screenshots below if you have granted access to multiple roles/users

Example Placeholders:

  • <database_name> = DEV
  • <warehouse_name> = DEMO
-- Validate database usage for the DATAFOLDROLE
SHOW GRANTS ON DATABASE <database_name>;

grants_on_database

-- Validate warehouse usage for the DATAFOLDROLE
SHOW GRANTS ON WAREHOUSE <warehouse_name>;

grants_on_database

-- Validate schema permissions for the DATAFOLDROLE
SHOW GRANTS ON SCHEMA <database_name>.DATAFOLD_TMP;

grants_on_schema

A note on future grants

The above database grants will be insufficient if any future grants have been defined at the schema level, because schema-level grants will override database-level grants. In that case, you will need to execute future grants for every existing schema that Datafold will operate on.

GRANT SELECT ON FUTURE TABLES IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON FUTURE MATERIALIZED VIEWS IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;
GRANT SELECT ON ALL MATERIALIZED VIEWS IN SCHEMA <database_name>.<schema_name> TO ROLE DATAFOLDROLE;

Configure in Datafold

Field NameDescription
NameA name given to the data source within Datafold
Account identifierThe Org name-Account name pair for your Snowflake account. This can be found in the browser address string. It may look like https://orgname-accountname.snowflakecomputing.com or https://app.snowflake.com/orgname/accountname. In the setup form, enter <orgname>-<accountname>.
UserThe username set in the Setup password-based authentication section
PasswordThe password set in the Setup password-based authentication section
Key Pair fileThe key file generated in the Use key-pair authentication section
WarehouseThe Snowflake warehouse name
Schema for temporary tablesThe schema name you created with our script (<database_name>.DATAFOLD_TMP)
RoleThe role you created for Datafold (Typically DATAFOLDROLE)
Default DBA database the role above can access. If more than one database was added, whichever you prefer to be the default

Note: Please review the documentation for the account name. Datafold uses Format 1 (Preferred): https://docs.snowflake.com/en/user-guide/admin-account-identifier#using-an-account-locator-as-an-identifier

Click Create. Your data source is ready!