
# dbt Cloud

> Integrate Datafold with dbt Cloud to automate Data Diffs in your CI pipeline, leveraging dbt jobs to detect changes and ensure data quality before merging.

<Note>
  You will need a dbt **Team** account or higher to access the dbt Cloud API that Datafold uses to connect the accounts.
</Note>

## Prerequisites

### Set up dbt Cloud CI

In dbt Cloud, [set up dbt Cloud CI](https://docs.getdbt.com/docs/deploy/cloud-ci-job) so that your Pull Request job runs when you open or update a Pull Request. This job will provide Datafold information about the changes included in the PR.

### Create an Artifacts Job in dbt Cloud

The Artifacts job generates production `manifest.json` on merge to main/master, giving Datafold information about the state of production. The simplest method is to set up a dbt Cloud job that executes the `dbt ls` command on merge to main/master.

> Note: `dbt ls` is preferred over `dbt compile` as it runs faster and data diffing does not require fully compiled models to work.

Example dbt Cloud artifact job settings and successful run:

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_artifacts_select_merge_job-590292c72209454e660444ea1a78fb5f.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=cf58a3156778571d811e995c186a60ab" width="1592" height="916" data-path="images/dbt_cloud_artifacts_select_merge_job-590292c72209454e660444ea1a78fb5f.png" />
</Frame>

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_artifacts_job_settings-939f1ce3f456698459c9045115706775.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=6e21263f5fa0b9317a58d335f59d02ec" width="2010" height="1854" data-path="images/dbt_cloud_artifacts_job_settings-939f1ce3f456698459c9045115706775.png" />
</Frame>

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_ls_artifacts_job_example-2839e9104f9a64ca2833966db3900131.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=5889e462f19a3cbba05fd60f7f1a26bf" width="1841" height="1210" data-path="images/dbt_ls_artifacts_job_example-2839e9104f9a64ca2833966db3900131.png" />
</Frame>

<Accordion title="Continuous Deployment">
  If you are interested in continuous deployment, you can use a Merge Trigger Production Job instead of the Artifacts Job listed above.
</Accordion>
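
To illustrate why a production `manifest.json` is useful, the sketch below compares two manifests and lists the models that are new or changed in a Pull Request. It is only an illustration and not Datafold's actual implementation: the function names (`model_checksums`, `changed_models`) and the sample model IDs are hypothetical, and it assumes the manifest layout where each entry under `nodes` has a `resource_type` and a `checksum.checksum` field.

```python
def model_checksums(manifest: dict) -> dict[str, str]:
    """Map each model's unique_id to its file checksum (empty string if absent)."""
    return {
        unique_id: node.get("checksum", {}).get("checksum", "")
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
    }


def changed_models(prod_manifest: dict, pr_manifest: dict) -> list[str]:
    """Return models that are new in the PR or whose checksum differs from production."""
    prod = model_checksums(prod_manifest)
    pr = model_checksums(pr_manifest)
    return sorted(uid for uid, checksum in pr.items() if prod.get(uid) != checksum)


if __name__ == "__main__":
    # Hypothetical, heavily trimmed manifests; real ones come from target/manifest.json.
    production = {
        "nodes": {
            "model.shop.orders": {"resource_type": "model", "checksum": {"checksum": "aaa"}},
            "model.shop.customers": {"resource_type": "model", "checksum": {"checksum": "bbb"}},
        }
    }
    pull_request = {
        "nodes": {
            "model.shop.orders": {"resource_type": "model", "checksum": {"checksum": "aaa"}},
            "model.shop.customers": {"resource_type": "model", "checksum": {"checksum": "ccc"}},
            "model.shop.payments": {"resource_type": "model", "checksum": {"checksum": "ddd"}},
        }
    }
    print(changed_models(production, pull_request))
```

Without the production manifest there is nothing to compare the Pull Request against, which is why the Artifacts job needs to run on every merge to main/master.
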

### dbt Cloud Access URL

You will need your [access URL](https://docs.getdbt.com/docs/cloud/about-cloud/regions-ip-addresses) to connect Datafold to your dbt Cloud account.

### Add dbt Cloud Service Account Token

To connect Datafold to your dbt Cloud account, you will need to use a [Service Token](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens).

<Note>
  User API Keys are no longer recommended for this purpose because of a [security update](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens) in dbt Cloud. [Learn more below](/integrations/orchestrators/dbt-cloud#deprecating-user-tokens).
</Note>

1. Navigate to **Account Settings → Service Tokens → + New Token**.

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_add_service_token-2367d19382e6d25416b452ec5378bbfb.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=a1853f25fa9a05cd5346385fe9de836b" width="2023" height="832" data-path="images/dbt_cloud_add_service_token-2367d19382e6d25416b452ec5378bbfb.png" />
</Frame>

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_add_service_token_permission-9fbdbb501c79f8a0bdee4abbf7483270.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=84146d65087fe8a89d0037d5b158018d" width="1322" height="864" data-path="images/dbt_cloud_add_service_token_permission-9fbdbb501c79f8a0bdee4abbf7483270.png" />
</Frame>

2. Add a Permission Set and select `Member` or `Developer`.
3. Select `All Projects`, or check only the projects you intend to use with Datafold.
4. Save your changes.

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_service_token-5a4c080cb6b778f030eaf02988c36978.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=87fc12ff14d1a74898d821e87b935c77" width="1308" height="886" data-path="images/dbt_cloud_service_token-5a4c080cb6b778f030eaf02988c36978.png" />
</Frame>

5. Navigate to **Your Profile → API Access** and copy the token.
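
Before pasting the token into Datafold, it can help to confirm that the token and access URL work together. The sketch below builds (but does not send) a request to list jobs through the dbt Cloud v2 API. The helper name `build_jobs_request`, the account ID, and the token value are placeholders, and the `Authorization: Token ...` header scheme is an assumption you should check against the dbt Cloud API documentation for your account.

```python
import urllib.request


def build_jobs_request(
    access_url: str, account_id: int, service_token: str
) -> urllib.request.Request:
    """Build a GET request that lists the jobs in a dbt Cloud account."""
    base = access_url.strip().rstrip("/")
    # Accept either a bare hostname or a full URL as the access URL.
    if not base.startswith(("http://", "https://")):
        base = f"https://{base}"
    url = f"{base}/api/v2/accounts/{account_id}/jobs/"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {service_token}",  # assumed header scheme
            "Accept": "application/json",
        },
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own access URL, account ID, and token.
    request = build_jobs_request("cloud.getdbt.com", 12345, "dbtc_example_token")
    print(request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

If the request succeeds, the response should include the CI job and the Artifacts job you created earlier. An authorization error usually points to a wrong access URL or a token with insufficient permissions.
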

#### Deprecating User Tokens

dbt Cloud is transitioning away from the use of User API Keys for authentication. The User API Key will be replaced by account-scoped Personal Access Tokens (PATs).

This update will affect the functionality of certain API endpoints. Specifically, `/v2/accounts`, `/v3/accounts`, and `/whoami` (undocumented API) will no longer return information about all the accounts tied to a user. Instead, the response will be filtered to include only the context of the specific account in the request.

dbt Cloud users have until April 30, 2024, to implement this change. After this date, all user API keys will be scoped to an account. New customers are required to use the new account-scoped PATs.

For more information, please refer to the [dbt Cloud API Documentation](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens).

If you have any questions or require further assistance, please don't hesitate to contact our support team.

## Create a dbt Cloud Integration in the Datafold app

* Navigate to **Settings → Integrations → CI** and create a new dbt Cloud integration.

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_setup-b9dab8af8ca813283d0aaa3b99556eb0.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=f7e0c8fb8d7fd554c4fdc36adf746cb7" width="2306" height="496" data-path="images/dbt_cloud_setup-b9dab8af8ca813283d0aaa3b99556eb0.png" />
</Frame>

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_api_key-f3e2f3669695bdedf80f47fa1ccc91b3.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=52c6d10f5b06085543f09dcff9106f97" width="1436" height="640" data-path="images/dbt_cloud_api_key-f3e2f3669695bdedf80f47fa1ccc91b3.png" />
</Frame>

## Configuration

### Basic Settings

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_basic_settings-022522ea2690dcc55c4bc7d3b1e4a411.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=cffb2b8c4bc7893601e6268b88fcbdc3" width="2294" height="1354" data-path="images/dbt_cloud_basic_settings-022522ea2690dcc55c4bc7d3b1e4a411.png" />
</Frame>

* **Repository**: Select a repository that you set up in [the Code Repositories setup step](/integrations/code-repositories).
* **Data Connection**: Select a connection that you set up in [the Data Connections setup step](/integrations/databases).
* **Name**: This can be anything!
* **Primary key tag**: This is a text string that you may use to tag primary keys in your dbt project yaml. Note that to avoid the need for tagging, [primary keys can be inferred from dbt uniqueness tests](/deployment-testing/configuration/primary-key).
* **Account name**: This will be autofilled using your dbt API key.
* **Job that creates dbt artifacts**: This will be [the Artifacts Job that you created](#create-an-artifacts-job-in-dbt-cloud). Or, if you have a dbt production job that runs on each merge to main, select that job.
* **Job that builds pull requests**: This is the dbt CI job that is triggered when you open a Pull Request or Merge Request.

### Advanced Settings

<Frame>
  <img src="https://mintcdn.com/datafold/6zQ11m2yiOVjYXTT/images/dbt_cloud_advanced_settings-c862158fc664963c51377f0daaadaca3.png?fit=max&auto=format&n=6zQ11m2yiOVjYXTT&q=85&s=ac03933bcc83efe52d9fae35874ee500" width="2306" height="1432" data-path="images/dbt_cloud_advanced_settings-c862158fc664963c51377f0daaadaca3.png" />
</Frame>

* **Enable Datafold in CI/CD**: High-level switch to turn Datafold off or on in CI (but we hope you'll leave it on!).
* **Import dbt tags and descriptions**: Populate our Lineage tool with dbt metadata. ⚠️ This feature is in development. ⚠️
* **Slim Diff**: Only diff modified models in CI, instead of all models. Slim Diff is highly configurable using dbt yaml, and each organization will need to choose a strategy based on its data environment. [Read more about Slim Diff](/deployment-testing/best-practices/slim-diff).
  * Downstream Hightouch models will be diffed even when Slim Diff is turned on.
* **Diff Hightouch Models**: Hightouch customers can see diffs of downstream Hightouch assets in Pull Requests.
* **CI fails on primary key issues**: The existence of null or duplicate primary keys causes the Datafold CI check to fail.
* **Pull Request Label**: For when you want Datafold to *only* run in CI when a label is manually applied in GitHub/GitLab.
* **CI Diff Threshold**: For when you want Datafold to *only* run automatically if the number of diffs doesn't exceed this threshold for a given CI run.
* **Files to ignore**: If at least one modified file doesn’t match the ignore pattern, Datafold CI diffs all changed models in the PR. If every modified file matches the ignore pattern, Datafold CI does not run in the PR. ([Additional details.](/deployment-testing/configuration/datafold-ci/on-demand))
* **Custom base branch**: For when you want Datafold to **only** run in CI when a PR is opened against a specific base branch. You might need this if you have multiple environments built from different branches. See [Custom branch](https://docs.getdbt.com/faqs/Environments/custom-branch-settings) in dbt Cloud docs.
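
The **Files to ignore** rule above can be sketched as a small decision function. This is a minimal illustration, not Datafold's implementation: the function name `should_run_datafold_ci` is hypothetical, and shell-style matching via `fnmatch` is an assumption, so the pattern syntax Datafold actually accepts may differ.

```python
from fnmatch import fnmatch


def should_run_datafold_ci(modified_files: list[str], ignore_patterns: list[str]) -> bool:
    """Return True when at least one modified file matches no ignore pattern."""

    def is_ignored(path: str) -> bool:
        return any(fnmatch(path, pattern) for pattern in ignore_patterns)

    # CI runs as soon as a single file falls outside the ignore patterns;
    # it is skipped only when every modified file is ignored.
    return any(not is_ignored(path) for path in modified_files)


if __name__ == "__main__":
    patterns = ["docs/*", "*.md"]
    print(should_run_datafold_ci(["docs/readme.md", "models/orders.sql"], patterns))
    print(should_run_datafold_ci(["docs/readme.md", "CHANGELOG.md"], patterns))
```

In the first call a model file changed alongside documentation, so CI would run and diff all changed models. In the second call only ignored files changed, so CI would be skipped.
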

Click save, and that's it! <Icon icon="party-horn" />

Now that you've set up a dbt Cloud integration, Datafold will diff your impacted tables whenever you push commits to a PR. A summary of the diff will appear in GitHub, and detailed results will appear in the Datafold app.
