Use the 'Authorization' header with the format 'Key
Successful Response
ID of the first data source (Dataset A).
ID of the second data source (Dataset B). Can be the same as data_source1_id.
in_db, cross_db
Column names that uniquely identify rows, e.g. ['id'] or ['tenant_id', 'order_id']. Must match actual column names in both datasets.
Diff algorithm. 'join' for same-database diffs, 'fetch_and_join' for cross-database or file diffs. Auto-selected if omitted: 'join' when both data sources are the same, 'fetch_and_join' otherwise.
join, hash, hash_v2_alpha, fetch_and_join
Map columns with different names between datasets. List of [column_in_A, column_in_B] pairs.
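The algorithm default and the column-mapping format described above can be sketched in a few lines. This is an illustration only, not the service's implementation: the function names and the `data_source2_id` parameter name are hypothetical, and the renaming direction (Dataset B names rewritten to Dataset A names) is an assumption.

```python
from typing import List, Optional, Sequence


def select_algorithm(data_source1_id: int, data_source2_id: int,
                     algorithm: Optional[str] = None) -> str:
    """Return the explicit algorithm, or apply the documented default."""
    if algorithm is not None:
        return algorithm
    # Documented rule: 'join' when both data sources are the same,
    # 'fetch_and_join' otherwise.
    return "join" if data_source1_id == data_source2_id else "fetch_and_join"


def map_b_to_a(columns_b: Sequence[str],
               column_mapping: Sequence[Sequence[str]]) -> List[str]:
    """Rename Dataset B columns to their Dataset A counterparts.

    column_mapping is a list of [column_in_A, column_in_B] pairs.
    Columns without a mapping keep their name (assumption).
    """
    b_to_a = {b: a for a, b in column_mapping}
    return [b_to_a.get(col, col) for col in columns_b]


print(select_algorithm(7, 7))
print(map_b_to_a(["id", "total_amount"], [["amount", "total_amount"]]))
```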
Columns to compare between datasets. If set, only these columns are diffed (primary key columns are always included). Column names must match the dataset schema.
Compare rows with duplicate primary keys. Defaults to true.
Snowflake session parameters for Dataset A, e.g. {"QUERY_TAG": "datadiff", "WAREHOUSE": "COMPUTE_WH"}.
Snowflake session parameters for Dataset B.
Datetime precision for comparison. 0=seconds, 1=tenths, 2=hundredths, 3=milliseconds, 4=tenth-ms, 5=hundredth-ms, 6=microseconds.
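To make the precision scale concrete, the sketch below compares two timestamps after reducing them to a given number of fractional-second digits. It assumes truncation; the source does not say whether the service truncates or rounds, and the function name is hypothetical.

```python
from datetime import datetime


def truncate_to_precision(ts: datetime, precision: int) -> datetime:
    """Keep `precision` fractional-second digits (0 = seconds, 6 = microseconds)."""
    if not 0 <= precision <= 6:
        raise ValueError("precision must be between 0 and 6")
    step = 10 ** (6 - precision)  # microseconds per unit at this precision
    return ts.replace(microsecond=ts.microsecond // step * step)


a = datetime(2024, 1, 15, 12, 0, 0, 123456)
b = datetime(2024, 1, 15, 12, 0, 0, 123999)

# Same millisecond, so equal at precision 3; different at precision 6.
print(truncate_to_precision(a, 3) == truncate_to_precision(b, 3))
print(truncate_to_precision(a, 6) == truncate_to_precision(b, 6))
```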
Default tolerance for float comparisons. In absolute mode: values within this distance are equal. In relative mode: fraction of difference allowed.
Per-column tolerance overrides. Each entry: {column_name, tolerance_value (>= 0), tolerance_mode: 'absolute'|'relative'}.
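The two tolerance modes and the per-column overrides can be illustrated as follows. This is a sketch under stated assumptions: relative mode is taken to mean "difference divided by the larger magnitude of the two values", which the source does not specify, and the function names are hypothetical. The override keys (`column_name`, `tolerance_value`, `tolerance_mode`) are the ones listed above.

```python
from typing import Dict, List, Tuple


def within_tolerance(a: float, b: float, tolerance: float,
                     mode: str = "absolute") -> bool:
    """Return True when a and b count as equal under the given tolerance."""
    if tolerance < 0:
        raise ValueError("tolerance must be >= 0")
    diff = abs(a - b)
    if mode == "absolute":
        return diff <= tolerance
    if mode == "relative":
        # Assumption: the allowed fraction is relative to the larger magnitude.
        return diff <= tolerance * max(abs(a), abs(b))
    raise ValueError("mode must be 'absolute' or 'relative'")


def resolve_tolerance(column: str, default_tolerance: float, default_mode: str,
                      overrides: List[Dict]) -> Tuple[float, str]:
    """Pick the per-column override if one exists, else the defaults."""
    for entry in overrides:
        if entry["column_name"] == column:
            return entry["tolerance_value"], entry["tolerance_mode"]
    return default_tolerance, default_mode


overrides = [{"column_name": "price", "tolerance_value": 0.05,
              "tolerance_mode": "relative"}]
tol, mode = resolve_tolerance("price", 1.0, "absolute", overrides)
print(within_tolerance(100.0, 101.0, tol, mode))
```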
Columns to exclude from comparison. Ignored if include_columns is set.
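The interaction between the include list, the exclude list, and the primary key can be sketched as a small resolver. The function and the `pk_columns` and `exclude_columns` parameter names are hypothetical, and keeping primary key columns even when they appear in the exclude list is an assumption.

```python
from typing import List, Optional, Sequence


def columns_to_compare(schema: Sequence[str], pk_columns: Sequence[str],
                       include_columns: Optional[Sequence[str]] = None,
                       exclude_columns: Optional[Sequence[str]] = None) -> List[str]:
    """Resolve which columns are diffed, preserving schema order."""
    if include_columns is not None:
        # include_columns wins; exclude_columns is ignored.
        selected = set(include_columns) | set(pk_columns)
    else:
        selected = (set(schema) - set(exclude_columns or [])) | set(pk_columns)
    return [col for col in schema if col in selected]


schema = ["id", "name", "amount", "updated_at"]
print(columns_to_compare(schema, ["id"], include_columns=["amount"]))
print(columns_to_compare(schema, ["id"], exclude_columns=["updated_at"]))
```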
File URL for Dataset A (s3://, gs://, abfss://, https://). Mutually exclusive with table1 and query1. Requires file1_options.
File format options for file1 (file_type, delimiter, sheet, skip rows).
File URL for Dataset B (s3://, gs://, abfss://, https://). Mutually exclusive with table2 and query2. Requires file2_options.
File format options for file2 (file_type, delimiter, sheet, skip rows).
SQL WHERE clause for Dataset A (omit the WHERE keyword), e.g. 'status = 1'.
SQL WHERE clause for Dataset B (omit the WHERE keyword), e.g. 'status = 1'.
Explicit list of columns to compare. If set, only these columns are diffed.
Data source ID where materialized diff results are stored.
Materialize Dataset A before diffing. Improves speed for heavy queries, filtered non-indexed columns, or transformed primary keys.
Materialize Dataset B before diffing. Same use cases as materialize_dataset1.
Skip sampling when materializing results.
ok, alert, error, learning, checking, created, skipped, cancelled
SQL query for Dataset A. Mutually exclusive with table1 and file1.
SQL query for Dataset B. Mutually exclusive with table2 and file2.
error, bad-pks, different, missing-pks, identical, empty
Run column profiling on diff results.
Sampling confidence level, between 0 and 100 exclusive. Common values: 90, 95, 99, 99.5, 99.9. Use with sampling_tolerance.
Maximum number of rows to sample (absolute count). Alternative to tolerance+confidence and sampling_ratio.
Sample this fraction of rows. Value between 0 and 1 exclusive (e.g. 0.1 = 10% of rows). Alternative to tolerance+confidence.
Minimum row count to activate sampling. Sampling is disabled if the largest table has fewer rows than this.
Sampling tolerance: max fraction of rows with PK errors before sampling is disabled. Value between 0 and 1 exclusive (e.g. 0.001 = 0.1%). Use with sampling_confidence.
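The range rules for the sampling parameters can be checked client-side before a request is sent. The sketch below is one possible validator, not the service's own logic. Assumptions: `sampling_max_rows` is a hypothetical name for the maximum-row-count parameter, and "alternative to" is read as "mutually exclusive with", which the source does not state outright.

```python
from typing import Optional


def sampling_mode(sampling_confidence: Optional[float] = None,
                  sampling_tolerance: Optional[float] = None,
                  sampling_ratio: Optional[float] = None,
                  sampling_max_rows: Optional[int] = None) -> Optional[str]:
    """Validate sampling parameters and report which strategy they select."""
    if sampling_confidence is not None and not 0 < sampling_confidence < 100:
        raise ValueError("sampling_confidence must be between 0 and 100 exclusive")
    for name, value in (("sampling_tolerance", sampling_tolerance),
                        ("sampling_ratio", sampling_ratio)):
        if value is not None and not 0 < value < 1:
            raise ValueError(f"{name} must be between 0 and 1 exclusive")
    # Tolerance and confidence are documented as a pair.
    if (sampling_confidence is None) != (sampling_tolerance is None):
        raise ValueError("sampling_tolerance and sampling_confidence go together")

    chosen = []
    if sampling_tolerance is not None:
        chosen.append("tolerance+confidence")
    if sampling_ratio is not None:
        chosen.append("ratio")
    if sampling_max_rows is not None:
        chosen.append("max_rows")
    if len(chosen) > 1:
        # Assumption: the three strategies cannot be combined.
        raise ValueError("choose only one sampling strategy")
    return chosen[0] if chosen else None


print(sampling_mode(sampling_confidence=99, sampling_tolerance=0.001))
print(sampling_mode(sampling_ratio=0.1))
```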
interactive, demo_signup, manual, api, ci, schedule, auto
needs_confirmation, needs_authentication, waiting, processing, done, failed, cancelled
Table path for Dataset A as a list of path components, e.g. ['schema', 'table'] or ['database', 'schema', 'table']. Mutually exclusive with query1 and file1.
Table path for Dataset B as a list of path components, e.g. ['schema', 'table'] or ['database', 'schema', 'table']. Mutually exclusive with query2 and file2.
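Each dataset is described by a table path, a SQL query, or a file URL, and the three are mutually exclusive. A minimal client-side check might look like the sketch below. The function name is hypothetical, and requiring exactly one of the three (rather than at most one) is an assumption.

```python
from typing import Any, Dict


def dataset_source(side: int, params: Dict[str, Any]) -> str:
    """Return which of table/query/file is set for dataset 1 or 2."""
    keys = [f"table{side}", f"query{side}", f"file{side}"]
    present = [key for key in keys if params.get(key) is not None]
    if len(present) != 1:
        raise ValueError(f"set exactly one of {keys}, got {present}")
    # A file source also needs its format options.
    if present[0] == f"file{side}" and params.get(f"file{side}_options") is None:
        raise ValueError(f"file{side} requires file{side}_options")
    return present[0]


params = {
    "table1": ["analytics", "orders"],
    "file2": "s3://bucket/orders.csv",
    "file2_options": {"file_type": "csv", "delimiter": ","},
}
print(dataset_source(1, params))
print(dataset_source(2, params))
```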
Table-level modifiers. Use ['case_insensitive_strings'] to ignore string case.
case_insensitive_strings
Tags for organizing and filtering diffs.
Time aggregation level when using time_column.
minute, hour, day, week, month, year
Column name used for time-based filtering or aggregation.
Time travel point for Dataset A. Accepts: negative integer offset (e.g. -130), UTC timestamp (e.g. '2024-01-15T00:00:00'), or a time point hash. Only supported by Snowflake and Databricks.
Time travel point for Dataset B. Same format as time_travel_point1.
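The three accepted forms of a time travel point can be told apart with a simple classifier. This is a sketch only: the source does not describe the hash format, so any string that does not parse as an ISO 8601 timestamp is treated as a hash here, and the function name is hypothetical.

```python
from datetime import datetime
from typing import Union


def classify_time_travel_point(value: Union[int, str]) -> str:
    """Return 'offset', 'timestamp', or 'hash' for a time travel point."""
    if isinstance(value, bool):
        raise TypeError("time travel point must be an int or a str")
    if isinstance(value, int):
        if value >= 0:
            raise ValueError("offset must be a negative integer")
        return "offset"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)
            return "timestamp"
        except ValueError:
            # Assumption: anything that is not a timestamp is a time point hash.
            return "hash"
    raise TypeError("time travel point must be an int or a str")


print(classify_time_travel_point(-130))
print(classify_time_travel_point("2024-01-15T00:00:00"))
```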
How diff_tolerance is applied: 'absolute' or 'relative'.
absolute, relative