What types of data can data diffs compare?
Diffs can compare data in tables, views, SQL queries (in relational databases and data lakes), and even files (e.g. CSV, Excel, Parquet). Datafold facilitates data diffing by supporting a wide range of basic data types across major database systems like Snowflake, Databricks, BigQuery, Redshift, PostgreSQL, and many more.
Creating data diffs
Diffs can be created in several ways:
- Interactively through the Datafold app
- Programmatically via our REST API
- As part of a Continuous Integration (CI) workflow for Deployment Testing
How in-database diffing works
When diffing data within the same physical database or data lake namespace, a diff compares the data by executing SQL queries in the target database. It uses several JOIN-type queries and various aggregate queries to provide detailed insights into differences at the row, value, and column levels, and to calculate differences in metrics and distributions.
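To make the idea concrete, here is a minimal sketch of a JOIN-plus-aggregate diff, run against an in-memory SQLite database so that it is self-contained. It is an illustration only and not the queries Datafold actually generates: the `diff_tables` helper, the `orders_prod` and `orders_dev` tables, and their columns are all hypothetical names chosen for this example.

```python
import sqlite3


def diff_tables(conn, left, right, key, columns):
    """Count row-level and value-level differences between two tables.

    Hypothetical helper for illustration. Table and column names are
    interpolated directly into SQL, so they must come from trusted input.
    """
    cur = conn.cursor()

    def count_missing(a, b):
        # Anti-join: rows of `a` whose key has no match in `b`.
        sql = (
            f"SELECT COUNT(*) FROM {a} a LEFT JOIN {b} b "
            f"ON a.{key} = b.{key} WHERE b.{key} IS NULL"
        )
        return cur.execute(sql).fetchone()[0]

    # Aggregate over matched rows: one mismatch counter per column.
    # SQLite's IS NOT is a NULL-safe inequality that yields 0 or 1.
    counters = ", ".join(
        f"COALESCE(SUM(l.{c} IS NOT r.{c}), 0)" for c in columns
    )
    sql = (
        f"SELECT {counters} FROM {left} l "
        f"JOIN {right} r ON l.{key} = r.{key}"
    )
    mismatches = cur.execute(sql).fetchone()

    return {
        "only_in_left": count_missing(left, right),
        "only_in_right": count_missing(right, left),
        "value_diffs": dict(zip(columns, mismatches)),
    }


# Example data: two versions of the same (hypothetical) table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders_prod (id INTEGER PRIMARY KEY, status TEXT, amount INTEGER);
    CREATE TABLE orders_dev  (id INTEGER PRIMARY KEY, status TEXT, amount INTEGER);
    INSERT INTO orders_prod VALUES (1, 'paid', 10), (2, 'open', 20), (3, 'paid', 30);
    INSERT INTO orders_dev  VALUES (1, 'paid', 10), (2, 'open', 25), (4, 'open', 40);
    """
)

result = diff_tables(conn, "orders_prod", "orders_dev", "id", ["status", "amount"])
print(result)
```

In this sketch the two anti-joins give the row-level differences (keys present on only one side), while the inner join with per-column aggregates gives the value-level and column-level differences for rows that exist on both sides. Because all of the work happens inside the database, only the small summary result is returned to the client.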