Metric Monitors

info

Please contact support@datafold.com if you'd like to enable this feature for your organization.

Metric monitors let you detect anomalies, either automatically using our ML-based algorithm or by setting manual thresholds, on the following metric types:

  1. Standard metrics (e.g. row count, freshness, and cardinality)
  2. Custom metrics (e.g. sales volume per region)

Create a Metric monitor

There are two ways to create a Metric monitor:

  1. Open the Monitors page, select Create new monitor, and then choose Metric.
  2. Clone an existing Metric monitor by clicking Actions and then Clone. This will pre-fill the form with the existing monitor configuration.

Set up your monitor

Select your data connection, then choose the type of metric you'd like: Table, Column, or Custom.

If you select Table or Column, you can add a SQL filter to restrict the rows the metric is computed over. For example, the following filter implements a 7-day rolling time window: timestamp >= dateadd(day, -7, current_timestamp). Make sure the SQL is compatible with your selected data connection.
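Because date arithmetic differs between SQL dialects, the same rolling window may need to be written differently for each data connection. The sketch below collects a few equivalent filter expressions; the helper `rolling_window_filter`, the template dictionary, and the column name `created_at` are hypothetical and not part of Datafold.

```python
# Illustrative only: equivalent "last N days" filters in a few SQL dialects.
# The dictionary and helper are hypothetical names, not Datafold APIs.
ROLLING_WINDOW_TEMPLATES = {
    "snowflake": "{column} >= dateadd(day, -{days}, current_timestamp)",
    "bigquery": "{column} >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)",
    "databricks": "{column} >= current_timestamp() - INTERVAL {days} DAYS",
    "postgres": "{column} >= now() - interval '{days} days'",
}


def rolling_window_filter(column: str, dialect: str, days: int = 7) -> str:
    """Return a SQL filter expression for a rolling window in the given dialect."""
    template = ROLLING_WINDOW_TEMPLATES[dialect]
    return template.format(column=column, days=days)


print(rolling_window_filter("created_at", "snowflake"))
print(rolling_window_filter("created_at", "bigquery"))
```

Paste only the resulting expression into the SQL filter field, and check it against your warehouse before relying on it.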

Metric types

Table metrics

| Metric | Definition | Additional Notes |
|---|---|---|
| Freshness | Time since table was last updated | Measured in minutes. Derived from INFORMATION_SCHEMA. Only supported for Snowflake, BigQuery, and Databricks. |
| Row Count | Total number of rows | |

Column metrics

| Metric | Definition | Supported Column Types | Additional Notes |
|---|---|---|---|
| Cardinality | Number of distinct values | All types | |
| Uniqueness | Proportion of distinct values | All types | Proportion between 0 and 1 |
| Minimum | Lowest numeric value | Numeric columns | |
| Maximum | Highest numeric value | Numeric columns | |
| Average | Mean value | Numeric columns | |
| Median | Median value (50th percentile) | Numeric columns | |
| Sum | Sum of all values | Numeric columns | |
| Standard Deviation | Measure of data spread | Numeric columns | |
| Fill Rate | Proportion of non-null values | All types | Proportion between 0 and 1 |
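
To make the proportion-based definitions concrete, here is a minimal sketch that computes cardinality, uniqueness, and fill rate for a small column. It assumes nulls are excluded from the distinct count and that both proportions are taken over the total row count; this page does not state how Datafold treats nulls, so read it as an illustration of the definitions rather than the exact computation. The function name `column_metrics` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def column_metrics(values: List[Optional[Any]]) -> Dict[str, Optional[float]]:
    """Compute illustrative column metrics; None stands in for SQL NULL."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)  # assumption: NULL is not counted as a distinct value
    return {
        "cardinality": len(distinct),
        # Both proportions fall between 0 and 1.
        "uniqueness": len(distinct) / total if total else None,
        "fill_rate": len(non_null) / total if total else None,
    }


metrics = column_metrics([10, 20, 20, None, 40])
print(metrics)
```

In this example the column has five rows, four non-null values, and three distinct values.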

Custom metrics

Custom metrics support several approaches to defining a metric. Depending on the approach you choose, your query should return some combination of the following columns:

  • Metric value (required): a numeric column containing your metric values
  • Timestamp (optional): a date/time column containing timestamps corresponding to your metric values
  • Group (optional): a string column containing groups/dimensions for your metric

info

The names and order of your columns don't matter. Datafold will automatically infer their meaning based on data type.
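
As an illustration of the simplest grouped pattern (one metric value per group per run, no timestamp column), the sketch below runs such a query against an in-memory SQLite database. The `orders` table, its columns, and the sample rows are hypothetical, and SQLite only stands in for your actual data connection.

```python
import sqlite3

# Hypothetical `orders` table in an in-memory SQLite database (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("US", 100.0), ("US", 50.0), ("DE", 70.0)],
)

# The result has one string column (the group) and one numeric column
# (the metric value). There is no timestamp column, so each monitor run
# would contribute a single data point per group.
GROUPED_METRIC_QUERY = """
SELECT country, SUM(amount) AS sales_volume
FROM orders
GROUP BY country
ORDER BY country
"""

rows = conn.execute(GROUPED_METRIC_QUERY).fetchall()
for country, sales_volume in rows:
    print(country, sales_volume)
```

Only the SQL inside `GROUPED_METRIC_QUERY` corresponds to what you would enter as the custom metric query.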

The following questions will help you decide which approach is best for you:

  1. Do you want to group your metric by the value of a column in your query? For example, if your metric is sales volume per day, rather than looking at a single metric that encompasses all sales globally, it might be more informative to group by country. In this case, Datafold will automatically compute sales volume separately for each country to assist with root cause analysis when there’s an unexpected change.
  2. Will your query return a single metric value (per group, if relevant) on every monitor run, or an entire time series? We generally recommend starting with the simpler approach of providing a single metric value (per group) per monitor run. However, if you’ve already defined a time series elsewhere (e.g. in your BI tool) and simply want to copy/paste that query into Datafold, then you may prefer the latter approach.

info

Datafold will only log a single data point per timestamp per group, which means you should only send data for a particular time period once that period is complete.

  3. If your query returns a single value per monitor run, will you provide your own timestamps or use the timestamps of monitor runs? We generally recommend letting Datafold provide timestamps based on monitor runs unless you have a compelling reason to provide your own. For example, if your metric always lags by one day, you could explicitly associate yesterday's date with each observation.
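
For the time series approach, the query has to leave out the period that is still in progress, since only one data point is logged per timestamp per group. The sketch below shows one way to do that, again using an in-memory SQLite database as a stand-in. The `orders` table, the sample rows, and the fixed `today` parameter are hypothetical; in a warehouse you would compare against the current date and use a real date/time column instead of ISO-formatted text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("US", 100.0, "2024-01-01"),
        ("US", 50.0, "2024-01-02"),
        ("DE", 70.0, "2024-01-02"),
        ("US", 20.0, "2024-01-03"),  # "today": this day is not complete yet
    ],
)

# Timestamp, group, and metric value columns. The WHERE clause keeps only
# days strictly before "today", so incomplete periods are never sent.
TIME_SERIES_QUERY = """
SELECT created_at AS day, country, SUM(amount) AS sales_volume
FROM orders
WHERE created_at < :today
GROUP BY created_at, country
ORDER BY created_at, country
"""

rows = conn.execute(TIME_SERIES_QUERY, {"today": "2024-01-03"}).fetchall()
for day, country, sales_volume in rows:
    print(day, country, sales_volume)
```

The row dated 2024-01-03 is excluded, so its value would first be logged on a later run, once that day is complete.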

As you're writing your query, Datafold will let you know if the result set doesn't match one of the accepted patterns. If you have questions, please contact us and we'll be happy to help.

Configure anomaly detection

Enable anomaly detection to get the most out of metric monitors. You have several options:

  • Automatic: our automated anomaly detection uses machine learning to flag metric values that are out of the ordinary. Dial the sensitivity up or down depending on how many alerts you'd like to receive.
  • Manual: set specific thresholds beyond which the monitor triggers an alert. Fixed Values are specific minimum and/or maximum values, while Percent Change measures the magnitude of change from one observation to the next.
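
The two manual threshold types can be sketched as follows. This illustrates the idea only and is not Datafold's implementation: the function names are hypothetical, and whether a value exactly on a boundary triggers an alert is an assumption (here it does not).

```python
from typing import Optional


def breaches_fixed_values(
    value: float,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
) -> bool:
    """Return True if value falls below minimum or above maximum."""
    if minimum is not None and value < minimum:
        return True
    if maximum is not None and value > maximum:
        return True
    return False


def percent_change(previous: float, current: float) -> float:
    """Percent change from the previous observation to the current one."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous value is 0")
    return (current - previous) / abs(previous) * 100.0


def breaches_percent_change(previous: float, current: float, limit_pct: float) -> bool:
    """Return True if the magnitude of change exceeds limit_pct."""
    return abs(percent_change(previous, current)) > limit_pct


print(breaches_fixed_values(5, minimum=10))      # below the minimum
print(breaches_percent_change(100, 125, 20))     # 25% change against a 20% limit
```

Fixed values suit metrics with a known acceptable range, while percent change suits metrics whose absolute level drifts over time.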

Add a schedule

Customize the frequency and timing of monitor runs. You can choose a specific hourly or daily time in UTC, or enter a cron expression for more complex scheduling.
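
For reference, a few cron expressions and their meanings are listed below, along with a small helper that names each field. This assumes the standard five-field cron format (minute, hour, day of month, month, day of week) evaluated in UTC; the exact syntax Datafold accepts is not specified on this page, and `split_cron` is a hypothetical helper.

```python
# Standard five-field cron layout (assumed, not confirmed by this page).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

EXAMPLES = {
    "0 6 * * *": "every day at 06:00",
    "0 6 * * 1-5": "Monday to Friday at 06:00",
    "*/15 * * * *": "every 15 minutes",
    "0 0 1 * *": "first day of each month at 00:00",
}


def split_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


for expression, meaning in EXAMPLES.items():
    print(expression, "->", meaning, split_cron(expression))
```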

Add notifications

Send notifications via Slack or email when your monitor exceeds a threshold, whether automatic or manual.

Need help?

If you have any questions about how to use Metric monitors, please reach out to our team via Slack, in-app chat, or email us at support@datafold.com.