Snowflake
NOTE: Datafold needs permissions in your Snowflake dataset to read your table data. You will need to be a Snowflake Admin in order to grant the required permissions.
Steps to complete:
- Create a user and role for Datafold
- Setup password-based or Use key-pair authentication
- Create a temporary schema
- Give the Datafold role access to your warehouse
- Configure your data connection in Datafold
Create a user and role for Datafold
A full script can be found at the bottom of this page.
It is best practice to create a separate role for the Datafold integration (e.g., DATAFOLDROLE
):
To provide column-level lineage, Datafold needs to read & parse all SQL statements executed in your Snowflake account:
Set up password-based authentication
Datafold supports username/password authentication, but also key-pair authentication.
You can set the username/password in the Datafold web UI.
Use key-pair authentication
If you want to use key-pair authentication, please follow the steps of Snowflake. The public key will be set to the Snowflake user:
The private key needs to be uploaded to Datafold, and the optional passphrase of the private-key can be set to the user.
Create schema for Datafold
Datafold utilizes a temporary dataset to materialize scratch work and keep data processing in the your warehouse.
Give the Datafold role access
Datafold will only scan the tables that it has access to. The snippet below will give Datafold read access to a database. If you have more than one database that you want to use in Datafold, rerun the script below for each one.
Full Script
Validate Snowflake Grants for Datafold
Run these queries to validate that the grants have been set up correctly:
Note: More results may be returned than shown in the screenshots below if you have granted access to multiple roles/users
Example Placeholders:
<database_name>
=DEV
<warehouse_name>
=DEMO
A note on future grants
The above database grants will be insufficient if any future grants have been defined at the schema level, because schema-level grants will override database-level grants. In that case, you will need to execute future grants for every existing schema that Datafold will operate on.
Configure in Datafold
Field Name | Description |
---|---|
Name | A name given to the data connection within Datafold |
Account identifier | The Org name-Account name pair for your Snowflake account. This can be found in the browser address string. It may look like https://orgname-accountname.snowflakecomputing.com or https://app.snowflake.com/orgname/accountname. In the setup form, enter <orgname>-<accountname>. |
User | The username set in the Setup password-based authentication section |
Password | The password set in the Setup password-based authentication section |
Key Pair file | The key file generated in the Use key-pair authentication section |
Warehouse | The Snowflake warehouse name |
Schema for temporary tables | The schema name you created with our script (<database_name>.DATAFOLD_TMP) |
Role | The role you created for Datafold (Typically DATAFOLDROLE) |
Default DB | A database the role above can access. If more than one database was added, whichever you prefer to be the default |
Note: Please review the documentation for the account name. Datafold uses Format 1 (Preferred): https://docs.snowflake.com/en/user-guide/admin-account-identifier#using-an-account-locator-as-an-identifier
Click Create. Your data connection is ready!