Metric monitors detect anomalies in your data using ML-based algorithms or manual thresholds, supporting standard and custom metrics for tables or columns.
INFO
Please contact support@datafold.com if you’d like to enable this feature for your organization.
Metric monitors allow you to perform anomaly detection—either automatically using our ML-based algorithm or by setting manual thresholds—on the following metric types:
There are two ways to create a Metric Monitor:
Select your data connection, then choose the type of metric you’d like: Table, Column, or Custom.
If you select Table or Column, you can add a SQL filter to refine your dataset. For example, you could implement a 7-day rolling time window with the following: `timestamp >= dateadd(day, -7, current_timestamp)`. Please ensure the SQL is compatible with your selected data connection.
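To make the semantics of the rolling-window filter concrete, here is a minimal Python sketch of the same comparison. It is only an illustration, not how Datafold evaluates the filter: the helper name `within_rolling_window`, the sample rows, and the fixed `now` value are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def within_rolling_window(ts, now, days=7):
    """Return True if ts is no older than `days` days before `now`.

    Mirrors the SQL predicate: timestamp >= dateadd(day, -7, current_timestamp)
    """
    return ts >= now - timedelta(days=days)


# Hypothetical "current timestamp" and sample rows, for illustration only.
now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
rows = [
    {"id": 1, "timestamp": datetime(2024, 1, 14, 9, 0, tzinfo=timezone.utc)},
    # Exactly 7 days old: kept, because the comparison is inclusive (>=).
    {"id": 2, "timestamp": datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)},
    {"id": 3, "timestamp": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)},
]

kept = [r["id"] for r in rows if within_rolling_window(r["timestamp"], now)]
print(kept)
```

Because the window is anchored to the current timestamp rather than to midnight, the set of rows it keeps shifts with every run.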
Metric | Definition | Additional Notes |
---|---|---|
Freshness | Time since table was last updated | Measured in minutes. Derived from INFORMATION_SCHEMA. Only supported for Snowflake, BigQuery, and Databricks. |
Row Count | Total number of rows | |
Metric | Definition | Supported Column Types | Additional Notes |
---|---|---|---|
Cardinality | Number of distinct values | All types | |
Uniqueness | Proportion of distinct values | All types | Proportion between 0 and 1 |
Minimum | Lowest numeric value | Numeric columns | |
Maximum | Highest numeric value | Numeric columns | |
Average | Mean value | Numeric columns | |
Median | Median value (50th percentile) | Numeric columns | |
Sum | Sum of all values | Numeric columns | |
Standard Deviation | Measure of data spread | Numeric columns | |
Fill Rate | Proportion of non-null values | All types | Proportion between 0 and 1 |
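The column metric definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Datafold's implementation: the function name `column_metrics` is hypothetical, NULLs are assumed to be ignored when counting distinct values (as SQL's `COUNT(DISTINCT ...)` does), and Uniqueness is assumed to be computed over all rows, a denominator the table does not specify.

```python
import statistics


def column_metrics(values):
    """Compute illustrative column metrics for a list that may contain None (NULL)."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)  # assumption: NULLs are not counted as a distinct value
    has_data = bool(non_null)
    return {
        "cardinality": len(distinct),
        # Assumption: proportion is taken over all rows, NULLs included.
        "uniqueness": len(distinct) / total if total else None,
        "fill_rate": len(non_null) / total if total else None,
        "minimum": min(non_null) if has_data else None,
        "maximum": max(non_null) if has_data else None,
        "average": statistics.mean(non_null) if has_data else None,
        "median": statistics.median(non_null) if has_data else None,
        "sum": sum(non_null) if has_data else None,
    }


metrics = column_metrics([10, 20, 20, None, 50])
print(metrics)
```

Standard Deviation is left out of the sketch because the table does not say whether the sample or the population formula applies.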
Our custom metric framework supports several approaches to defining metrics. Depending on the approach you choose, your query should return some combination of the following columns:
INFO
The names and order of your columns don’t matter. Datafold will automatically infer their meaning based on data type.
The following questions will help you decide which approach is best for you:
INFO
Datafold will only log a single data point per timestamp per group, which means you should only send data for a particular time period once that period is complete.
As you’re writing your query, Datafold will let you know if the result set doesn’t match one of the accepted patterns. If you have questions, please contact us and we’ll be happy to help.
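Since only one data point is logged per timestamp per group, a custom metric should leave out the period that is still in progress. The Python sketch below shows that idea for daily aggregation. The function name `completed_daily_points`, the `(timestamp, group, amount)` event shape, and the sample data are all assumptions made for illustration; in practice the same logic would live in your SQL query.

```python
from collections import defaultdict
from datetime import datetime, timezone


def completed_daily_points(events, now):
    """Sum amounts per (day, group), skipping the current, incomplete day."""
    today = now.date()
    totals = defaultdict(float)
    for ts, group, amount in events:
        day = ts.date()
        if day < today:  # only days that have fully elapsed
            totals[(day, group)] += amount
    # One row per (day, group), sorted for stable output.
    return [(day, group, value) for (day, group), value in sorted(totals.items())]


def utc(*args):
    """Build a UTC datetime; a small helper for the sample data."""
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical events: (timestamp, group, amount).
events = [
    (utc(2024, 1, 1, 8), "us", 5.0),
    (utc(2024, 1, 1, 20), "us", 7.0),
    (utc(2024, 1, 1, 9), "eu", 3.0),
    (utc(2024, 1, 2, 11), "us", 4.0),
    (utc(2024, 1, 3, 9), "us", 9.0),  # current day, still in progress
]

points = completed_daily_points(events, now=utc(2024, 1, 3, 10))
for point in points:
    print(point)
```

The event on the in-progress day is held back until a later run, so its day is first reported with the complete total.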
Enable anomaly detection to get the most out of metric monitors. You have several options:
You can choose to run your monitor daily, hourly, or even input a cron expression for more complex scheduling:
Send notifications via Slack or email when your monitor exceeds a threshold (automatic or manual):
If you have any questions about how to use Metric monitors, please reach out to our team via Slack, in-app chat, or email us at support@datafold.com.