Integrate Datafold with dbt Cloud to automate Data Diffs in your CI pipeline, leveraging dbt jobs to detect changes and ensure data quality before merging.
Note: You will need a dbt Team account or higher to access the dbt Cloud API that Datafold uses to connect the accounts.
In dbt Cloud, set up dbt Cloud CI so that your Pull Request job runs when you open or update a Pull Request. This job will provide Datafold information about the changes included in the PR.
The Artifacts job generates the production manifest.json on merge to main/master, giving Datafold information about the state of production. The simplest method is to set up a dbt Cloud job that executes the dbt ls command on merge to main/master.
Note: dbt ls is preferred over dbt compile, as it runs faster and data diffing does not require fully compiled models to work.
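To check that the Artifacts job produced a usable artifact, you can inspect manifest.json directly. The following is a minimal sketch, not part of Datafold or dbt: the helper name list_model_ids is hypothetical, and it relies only on the manifest's top-level "nodes" mapping and each node's "resource_type" field.

```python
import json
from pathlib import Path


def list_model_ids(manifest_path):
    """Return the sorted unique IDs of all model nodes in a dbt manifest.

    Hypothetical helper for illustration; not a Datafold or dbt API.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    # dbt keeps models, tests, seeds, and snapshots under the "nodes" key,
    # keyed by unique ID (for example "model.my_project.orders").
    nodes = manifest.get("nodes", {})
    return sorted(
        unique_id
        for unique_id, node in nodes.items()
        if node.get("resource_type") == "model"
    )
```

For a local run, the manifest is typically written to target/manifest.json, so list_model_ids("target/manifest.json") would list the models the artifact describes. An empty result suggests the job did not parse the project as expected.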
Example dbt Cloud artifact job settings and successful run:
Continuous Deployment
If you are interested in continuous deployment, you can use a Merge Trigger Production Job instead of the Artifacts Job listed above.
You will need your dbt Cloud access URL to connect Datafold to your dbt Cloud account.
To connect Datafold to your dbt Cloud account, you will need to use a Service Token.
Note: User API Keys are no longer recommended for this purpose due to a recent security update in dbt Cloud. Details follow below.
When creating the Service Token, set its permission set to Member or Developer, and grant access to All Projects, or check only the projects you intend to use with Datafold.
dbt Cloud is transitioning away from the use of User API Keys for authentication. The User API Key will be replaced by account-scoped Personal Access Tokens (PATs).
This update will affect the functionality of certain API endpoints. Specifically, /v2/accounts, /v3/accounts, and /whoami (undocumented API) will no longer return information about all the accounts tied to a user. Instead, the response will be filtered to include only the context of the specific account in the request.
dbt Cloud users have until April 30, 2024, to implement this change. After this date, all user API keys will be scoped to an account. New customers are required to use the new account-scoped PATs.
For more information, please refer to the dbt Cloud API Documentation.
If you have any questions or require further assistance, please don’t hesitate to contact our support team.
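Before pasting the Service Token into Datafold, it can help to confirm that the token and access URL work against an account-scoped endpoint. The sketch below only builds the request. The function name build_account_request, the example host, and the account ID are assumptions for illustration, and the "Token" authorization scheme should be checked against the dbt Cloud API documentation for your plan and region.

```python
import urllib.request


def build_account_request(access_url, account_id, service_token):
    """Build (but do not send) a GET request for a single dbt Cloud account.

    Hypothetical helper; access_url is your dbt Cloud access URL,
    with or without the https:// prefix.
    """
    base = access_url.rstrip("/")
    if not base.startswith("http"):
        base = "https://" + base
    # Account-scoped path: the request names one account explicitly
    # instead of relying on a listing of every account tied to a user.
    url = f"{base}/api/v2/accounts/{account_id}/"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {service_token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

To send it, pass the result to urllib.request.urlopen, for example urllib.request.urlopen(build_account_request("cloud.getdbt.com", 12345, token)). An HTTP 401 or 403 response would suggest the token or its permission set needs another look.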
Click save, and that’s it!
Now that you’ve set up a dbt Cloud integration, Datafold will diff your impacted tables whenever you push commits to a PR. A summary of the diff will appear in GitHub, and detailed results will appear in the Datafold app.