Open Source Data Diff

Three Use Cases

Our Open Source Data Diff package, data-diff, has three functions, each designed for a different use case:

  • dbt for comparing dbt models within the same data source
  • joindiff for comparing tables within the same data source
  • hashdiff for comparing tables across different data sources (e.g., Postgres and Snowflake)
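To make the difference between the last two use cases concrete, here is a minimal conceptual sketch using only Python's standard library. It is not data-diff's implementation: the table and column names, the helper `make_table`, and the functions `join_diff`, `segment_checksum`, and `hash_diff` are all hypothetical, and SQLite stands in for a real warehouse. The assumption it illustrates is that tables in the same data source can be compared with a single join, while tables in different data sources cannot be joined, so they are compared by exchanging checksums of key ranges and narrowing down the ranges that disagree.

```python
import hashlib
import sqlite3


def make_table(conn, name, rows):
    """Create a two-column table (hypothetical schema) and load rows into it."""
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)


def join_diff(conn):
    """Ids whose rows differ between tables `a` and `b` in ONE database."""
    # Two LEFT JOINs plus UNION emulate a full outer join on the key.
    query = """
        SELECT a.id FROM a LEFT JOIN b ON a.id = b.id
        WHERE b.id IS NULL OR a.name IS NOT b.name
        UNION
        SELECT b.id FROM b LEFT JOIN a ON a.id = b.id
        WHERE a.id IS NULL
        ORDER BY 1
    """
    return [row[0] for row in conn.execute(query)]


def segment_checksum(conn, table, lo, hi):
    """MD5 over all rows of `table` with lo <= id < hi, read in key order."""
    digest = hashlib.md5()
    rows = conn.execute(
        f"SELECT id, name FROM {table} WHERE id >= ? AND id < ? ORDER BY id",
        (lo, hi),
    )
    for row in rows:
        digest.update(repr(row).encode())
    return digest.hexdigest()


def hash_diff(conn_a, conn_b, lo, hi, table="t", min_size=2):
    """Ids in [lo, hi) whose rows differ between two SEPARATE databases."""
    if segment_checksum(conn_a, table, lo, hi) == segment_checksum(conn_b, table, lo, hi):
        return []  # checksums agree, so no rows from this range are fetched
    if hi - lo <= min_size:
        # Small range: fetch the rows themselves and compare them locally.
        sql = f"SELECT id, name FROM {table} WHERE id >= ? AND id < ?"
        rows_a = set(conn_a.execute(sql, (lo, hi)))
        rows_b = set(conn_b.execute(sql, (lo, hi)))
        return sorted({row_id for row_id, _ in rows_a ^ rows_b})
    # Large range: split it in half and check each half separately.
    mid = (lo + hi) // 2
    return (hash_diff(conn_a, conn_b, lo, mid, table, min_size)
            + hash_diff(conn_a, conn_b, mid, hi, table, min_size))
```

In this sketch, `join_diff` needs both tables reachable from one connection, whereas `hash_diff` only ever sends checksums and small row ranges between the two connections, which is why the second style suits cross-database comparisons.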

Getting Started

Install

To get started with any of the use cases above, install data-diff and the relevant database connector(s):

pip install data-diff 'data-diff[snowflake]' -U

Run

Once you've installed data-diff, you can run it from the command line (see below) or via the Python API (see docs).
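As a rough illustration of the Python route, the sketch below uses `connect_to_table` and `diff_tables` from the `data_diff` package. The wrapper function `print_differences` and its parameter names are hypothetical, and the connection strings in the commented example call are placeholders rather than values from this guide. Treat it as a starting point and check the docs for the options your database needs.

```python
def print_differences(uri_a, uri_b, table_name, key_column="id"):
    """Print every row that differs between the same-named table in two databases."""
    # Imported inside the function so this module loads even where
    # data-diff has not been installed yet.
    from data_diff import connect_to_table, diff_tables

    # Each table is identified by a connection URI, a table name, and a key column.
    table_a = connect_to_table(uri_a, table_name, key_column)
    table_b = connect_to_table(uri_b, table_name, key_column)

    # diff_tables yields pairs of a sign ("+" or "-") and the row's column values.
    for sign, row in diff_tables(table_a, table_b):
        print(sign, row)


# Example call with placeholder connection strings (replace with your own):
# print_differences("postgresql:///", "mysql:///", "table_name")
```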

Caution: If you are a dbt user, check out our docs on Development Testing with Open Source.