SQL filters can be helpful in two scenarios:

  1. When the Production and Staging environments are not built from the same data. For example, when Staging is built from a subset of Production data, filters can be applied so that both environments cover the same rows and can be diffed.
  2. To improve Datafold CI performance by reducing the volume of data compared, e.g., to only the last 3 months.

SQL filters are an effective technique to speed up diffs by narrowing the data diffed. A SQL filter adds a WHERE clause that lets you filter data on both sides of the diff using standard SQL filter expressions. SQL filters can be added to dbt YAML under the meta.datafold.datadiff.filter tag:

models:
  - name: users
    meta:
      datafold:
        datadiff:
          filter: "user_id > 2350 AND source_timestamp >= current_date() - 7"
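
To make the first scenario concrete, here is a minimal conceptual sketch of what applying the same WHERE clause to both sides of a diff achieves. It uses Python's built-in sqlite3 module; the tables prod_users and staging_users and the helper diff_primary_keys are hypothetical names made up for illustration, and this is not how Datafold implements diffing internally.

```python
import sqlite3


def diff_primary_keys(conn, table_a, table_b, where=None):
    """Return (ids only in table_a, ids only in table_b).

    The same optional WHERE clause is applied to both tables,
    mirroring how a SQL filter narrows both sides of a diff.
    Hypothetical helper for illustration only; table names and the
    filter are interpolated directly, so pass trusted values only.
    """
    clause = f" WHERE {where}" if where else ""
    ids_a = {row[0] for row in conn.execute(f"SELECT user_id FROM {table_a}{clause}")}
    ids_b = {row[0] for row in conn.execute(f"SELECT user_id FROM {table_b}{clause}")}
    return ids_a - ids_b, ids_b - ids_a


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod_users (user_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE staging_users (user_id INTEGER PRIMARY KEY, name TEXT)")

# Production holds user_id 2340..2359.
conn.executemany(
    "INSERT INTO prod_users VALUES (?, ?)",
    [(i, f"user{i}") for i in range(2340, 2360)],
)
# Staging is built from a subset of production: only user_id > 2350.
conn.executemany(
    "INSERT INTO staging_users VALUES (?, ?)",
    [(i, f"user{i}") for i in range(2351, 2360)],
)

# Without a filter, the rows missing from staging show up as differences.
unfiltered = diff_primary_keys(conn, "prod_users", "staging_users")
# With the same filter on both sides, the two environments line up.
filtered = diff_primary_keys(conn, "prod_users", "staging_users", where="user_id > 2350")

print(len(unfiltered[0]), len(filtered[0]))  # prints "11 0"
```

Without the filter, the 11 production rows that were never loaded into staging are reported as differences. With the filter, only rows present in both environments are compared, so any remaining differences reflect actual changes rather than the sampling.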