How does cross-database diffing work?
What kind of information does Datafold output?
- High-Level Summary:
  - Total number of different rows
  - Total number of rows (primary keys) that are present in one database, but not the other
  - Aggregate schema differences
- Schema Differences: Per-column mapping of data types, column order, etc.
- Primary Key Differences: Sample of specific rows that are present in one database, but not the other
- Value-Level Differences: Sample of differing values for each column with identified discrepancies; full dataset of differences can be downloaded or materialized to the warehouse
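To make the summary counts above concrete, here is a minimal Python sketch of how such numbers could be derived from two small in-memory tables keyed by primary key. It is an illustration only, not Datafold's implementation; the helper name `summarize_diff` and the sample rows are hypothetical.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def summarize_diff(source: Dict[int, Row], target: Dict[int, Row]) -> Dict[str, Any]:
    """Compute summary counts for two tables keyed by primary key (hypothetical helper)."""
    source_keys, target_keys = set(source), set(target)

    # Primary keys present on one side only.
    only_in_source = sorted(source_keys - target_keys)
    only_in_target = sorted(target_keys - source_keys)

    # For rows present on both sides, count differing rows and, per column,
    # how many rows carry a differing value. Only shared columns are compared.
    differing_rows = 0
    column_diffs: Dict[str, int] = {}
    for key in source_keys & target_keys:
        src, tgt = source[key], target[key]
        changed = [col for col in src.keys() & tgt.keys() if src[col] != tgt[col]]
        if changed:
            differing_rows += 1
        for col in changed:
            column_diffs[col] = column_diffs.get(col, 0) + 1

    return {
        "differing_rows": differing_rows,
        "only_in_source": only_in_source,
        "only_in_target": only_in_target,
        "column_diffs": column_diffs,
    }


# Made-up sample data: row 2 differs in "amount", row 3 and row 4 are exclusive.
source = {
    1: {"name": "Ada", "amount": 10},
    2: {"name": "Bob", "amount": 20},
    3: {"name": "Cy", "amount": 30},
}
target = {
    1: {"name": "Ada", "amount": 10},
    2: {"name": "Bob", "amount": 25},
    4: {"name": "Di", "amount": 40},
}
summary = summarize_diff(source, target)
```

Here `summary["differing_rows"]` corresponds to the total number of different rows, `only_in_source` and `only_in_target` to the primary key differences, and `column_diffs` to a per-column count behind the value-level differences.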
How does a user run a data diff?
- Via Datafold’s interactive UI
- Via the Datafold API
- On a schedule (as a monitor) with optional alerting via Slack, email, PagerDuty, etc.
Can I run multiple data diffs at the same time?
How can I ensure accurate data comparison if my data is changing and being replicated in real-time?
updated_at timestamp).
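One possible reading of the `updated_at` reference is to compare both sides only up to a fixed cutoff, so that rows still in flight during replication do not show up as differences. The sketch below assumes every row carries an `updated_at` value; the helper `rows_as_of` and the sample rows are hypothetical and do not represent Datafold's API.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

Row = Dict[str, Any]


def rows_as_of(rows: List[Row], cutoff: datetime) -> List[Row]:
    """Keep only rows last updated at or before the cutoff (hypothetical helper)."""
    return [row for row in rows if row["updated_at"] <= cutoff]


# Made-up data: row 2 was written to the source at 12:30 and is not replicated yet.
source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)},
]
target_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
]

# Apply the same cutoff to both sides before comparing.
cutoff = datetime(2024, 1, 1, 12, 15, tzinfo=timezone.utc)
source_snapshot = rows_as_of(source_rows, cutoff)
target_snapshot = rows_as_of(target_rows, cutoff)
```

With the shared cutoff, both snapshots contain only row 1 and compare as equal. This simple filter does not cover rows that were updated on the source after the cutoff but whose older version still sits in the target; those would still surface as differences.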
What if the data types do not match between source and target?
VARCHAR type with STRING type. When automatic type casting without information loss is not possible, the user can define type casting manually using diffing in Query mode.
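As a rough illustration of the idea of matching equivalent types such as `VARCHAR` and `STRING`, the sketch below maps declared type names to a canonical family and checks whether two columns land in the same one. The mapping `CANONICAL_TYPES` and both helper functions are hypothetical and deliberately tiny; they are not Datafold's casting rules.

```python
import re
from typing import Optional

# Hypothetical mapping from database-specific type names to a canonical family.
# Illustrative only; real warehouses have many more types and subtler rules.
CANONICAL_TYPES = {
    "VARCHAR": "text",
    "STRING": "text",
    "TEXT": "text",
    "INT": "integer",
    "INTEGER": "integer",
}


def canonical_type(declared: str) -> Optional[str]:
    """Map a declared type such as 'varchar(255)' to its canonical family, if known."""
    base = re.sub(r"\(.*\)", "", declared).strip().upper()
    return CANONICAL_TYPES.get(base)


def can_compare_automatically(source_type: str, target_type: str) -> bool:
    """True when both types are known and belong to the same canonical family."""
    src, tgt = canonical_type(source_type), canonical_type(target_type)
    return src is not None and src == tgt


same_family = can_compare_automatically("VARCHAR(255)", "STRING")
needs_manual_cast = not can_compare_automatically("VARCHAR", "INTEGER")
```

In this sketch, a pair that fails the check is the case where a manual cast would be needed, for example by writing the cast explicitly in the queries being compared.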
Can data diff help if the source and target datasets have a different shape/schema/column naming?
How can data diffs be provisioned at scale, e.g., when we need to create hundreds or thousands of data diffs?
