Datafold Cloud

Datafold Cloud users can kick off diffs from the CLI using data-diff --dbt --cloud, then use a rich UI to explore value-level differences and distribution shifts and to share diff results with their team and stakeholders.

Getting started

1. Sign up for Datafold Cloud

2. Set up your dbt project

Install the open source data-diff CLI and configure your dbt project as described here.

3. Configure the data source

To connect to your data warehouse, navigate to Settings > Integrations > Data warehouses, click Add new integration, and follow the prompts. For more information, check out our Data Source configuration guides.

After you Test and Save, add the Data Source ID (which can be found on the Data warehouses page) to your dbt_project.yml.

# dbt_project.yml
vars:
  data_diff:
    ...
    datasource_id: <DATA_SOURCE_ID>
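
As a quick sanity check before running a diff, the sketch below confirms that the project file contains an indented datasource_id entry with a value. The function name has_datasource_id is hypothetical and not part of data-diff or dbt; it only pattern-matches the line and does not verify that the key is nested under vars and data_diff, nor that the placeholder has been replaced with a real ID.

```shell
# Hypothetical helper (not part of data-diff or dbt).
# Returns 0 if the given file has an indented "datasource_id: <value>" line,
# non-zero otherwise. It does not parse YAML or check nesting.
has_datasource_id() {
  project_file="${1:-dbt_project.yml}"
  grep -Eq '^[[:space:]]+datasource_id:[[:space:]]*[^[:space:]]+' "$project_file"
}
```

For example, running has_datasource_id dbt_project.yml || echo "datasource_id is missing" >&2 from the project root prints a warning when the entry is absent.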

4. Generate an API key

To generate a personal API key, navigate to Settings > Account and click Create API Key.

Copy your API key and export it as an environment variable. We suggest storing it in a file like .zshrc or .bash_profile, but you can also run the command below directly in your project.

export DATAFOLD_API_KEY=XXXXXXXXX

If your Datafold instance runs in your company's VPC, you should set an environment variable specifying the Datafold app URL.

export DATAFOLD_HOST=https://datafold.domain.tld
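
A missing API key is easy to overlook when switching shells, so a small pre-flight check can help. The sketch below is an assumption-laden illustration: check_datafold_env is a hypothetical function name, and it only tests whether the two variables from this step are set. It does not contact Datafold or validate the key.

```shell
# Hypothetical pre-flight check (not part of data-diff).
# Fails if DATAFOLD_API_KEY is unset or empty; reports DATAFOLD_HOST if set.
check_datafold_env() {
  if [ -z "${DATAFOLD_API_KEY:-}" ]; then
    echo "DATAFOLD_API_KEY is not set" >&2
    return 1
  fi
  if [ -n "${DATAFOLD_HOST:-}" ]; then
    echo "Using Datafold host: $DATAFOLD_HOST"
  fi
  return 0
}
```

Calling check_datafold_env in the shell where you plan to run the diff shows whether the exports above took effect in that session.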

5. Run data-diff --dbt --cloud

After you execute dbt run in your local environment, run data-diff --dbt --cloud to see how your model changes affected the data.

dbt run --select <MODEL> && data-diff --dbt --cloud
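
If you run this pair of commands often, one option is to wrap it in a shell function. The sketch below uses a hypothetical name, diff_model, and assumes that dbt and data-diff are both on your PATH. It runs the same two commands shown above and starts the diff only when dbt run succeeds.

```shell
# Hypothetical wrapper around the two commands from this step.
# Usage: diff_model <MODEL>
diff_model() {
  if [ -z "${1:-}" ]; then
    echo "usage: diff_model <MODEL>" >&2
    return 2
  fi
  # && ensures the diff is skipped when dbt run fails.
  dbt run --select "$1" && data-diff --dbt --cloud
}
```

Quoting "$1" keeps the selector intact as a single argument even if it contains spaces or shell metacharacters.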

Development Testing with Datafold Cloud

Datafold Cloud CLI Summary

Clicking the link in the terminal opens the Datafold Cloud app, where you can see the diff results in more detail than the terminal summary provides.

Datafold Cloud Diff Overview

Datafold Cloud diff results are stored for easy sharing with your team and stakeholders.

Value-level insights

Column profiles to understand distribution of changes

Column-level lineage to understand the impact of changes