How it works

Datafold Cloud's column-level lineage UI provides a visual graph of your data workflows. Use it to track how individual columns and specific data points are used, and to assess the downstream impact of changes made upstream.

How lineage is computed

Datafold computes column-level lineage by:

  1. Ingesting, parsing and analyzing SQL logs from your databases and data warehouses. This allows Datafold to infer dependencies between SQL statements, including those that create, modify, and read data.

  2. Augmenting the metadata graph with data from various sources. This includes metadata from orchestration tools (e.g., dbt), BI tools, and user-provided documentation.
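As a rough illustration of the first step, the Python sketch below infers table-level dependencies from a small SQL log. It is not Datafold's implementation: the regular expressions, the `extract_edges` function, and the table names are all hypothetical, and a production system would rely on a full SQL parser and resolve individual columns rather than whole tables.

```python
import re

# Hypothetical, deliberately naive patterns; a real SQL parser handles far more
# (CTEs, subqueries, quoted identifiers, MERGE/UPDATE statements, and so on).
TARGET_RE = re.compile(
    r"\b(?:insert\s+into|create\s+(?:or\s+replace\s+)?(?:table|view))\s+([\w.]+)",
    re.IGNORECASE,
)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_edges(sql_log):
    """Return a set of (source_table, target_table) pairs found in the log."""
    edges = set()
    for statement in sql_log:
        target = TARGET_RE.search(statement)
        if target is None:
            # Pure reads write nothing, so they add no edge in this sketch.
            continue
        for source in SOURCE_RE.findall(statement):
            edges.add((source.lower(), target.group(1).lower()))
    return edges


# Made-up log entries, for illustration only.
sql_log = [
    "CREATE TABLE analytics.orders_clean AS "
    "SELECT id, amount, currency FROM raw.orders",
    "INSERT INTO analytics.revenue "
    "SELECT o.id, o.amount * fx.rate FROM analytics.orders_clean o "
    "JOIN raw.fx_rates fx ON o.currency = fx.currency",
    "SELECT * FROM analytics.revenue",
]

edges = extract_edges(sql_log)
for source, target in sorted(edges):
    print(f"{source} -> {target}")
```

Each pair is one edge of a dependency graph. The second step described above would then attach further metadata (owners, tags, BI assets) to the nodes of that graph.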

Overview of lineage

On the Lineage page, you can filter data assets by Data Sources, Tags, Data Owners, and Asset Types (e.g., tables, columns, and BI-created assets such as views, reports, and syncs). You can also search directly to find specific data assets for lineage analysis.

After selecting a table or data asset, the UI will display a graph of table-level lineage by default. You can toggle between Upstream and Downstream perspectives and customize the lineage view by adjusting the Max Depth parameter to your preference.
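To make the Upstream/Downstream toggle and the Max Depth parameter concrete, here is a minimal Python sketch of a depth-limited breadth-first traversal over a lineage graph. The `lineage` function, its parameters, and the asset names are hypothetical and only illustrate the idea; this is not how Datafold implements the view.

```python
from collections import deque


def lineage(edges, start, direction="downstream", max_depth=2):
    """Return {asset: depth} for assets within max_depth hops of start.

    edges is an iterable of (source, target) pairs. direction selects whether
    edges are followed forward ("downstream") or backward ("upstream").
    """
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")

    neighbors = {}
    for source, target in edges:
        if direction == "downstream":
            neighbors.setdefault(source, []).append(target)
        else:
            neighbors.setdefault(target, []).append(source)

    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand past the requested depth
        for nxt in neighbors.get(node, []):
            if nxt not in depths:
                depths[nxt] = depths[node] + 1
                queue.append(nxt)

    # The selected asset itself is not part of its own lineage.
    return {node: depth for node, depth in depths.items() if node != start}


# Made-up assets, for illustration only.
example_edges = [
    ("raw_orders", "stg_orders"),
    ("stg_orders", "fct_orders"),
    ("fct_orders", "revenue_report"),
]

print(lineage(example_edges, "stg_orders", "downstream", max_depth=1))
# {'fct_orders': 1}
print(lineage(example_edges, "revenue_report", "upstream", max_depth=2))
# {'fct_orders': 1, 'stg_orders': 2}
```

Raising `max_depth` widens the graph one hop at a time, which is why a small value keeps the view readable for heavily connected assets.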

Column-level lineage

Datafold's column-level lineage helps users trace and document the history, transformations, and dependencies of a specific column, along with the upstream and downstream processes that touch it, across an organization's data assets. This makes it easier to pinpoint the origin of data validation issues and to identify every downstream process and application that a column feeds.

To view column-level lineage, click on the "Columns" dropdown menu of the selected asset.

To highlight the column path between assets, click the specific column. To reset the view, click the "Exit the selected path" button.

Tabular lineage

Datafold Cloud also offers a tabular lineage view.

Tabular lineage allows users to sort lineage information by depth, asset type, identifier, and owner.

Click on the Actions button to:

  • Focus lineage on current node: Drill down into the data node or column of interest
  • Show SQL query: Access the SQL query associated with the selected column to understand how the data was queried from the source
  • Show usage details: Access detailed information about the column's read count, write count, and cumulative read count (the column's own read count plus the read counts of its downstream columns) for the previous 7 days
  • Show profile: View a data profile that summarizes key table-level and column-level statistics, and any upstream dependencies
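The cumulative read figure under "Show usage details" can be sketched in Python as follows. This is an illustration under stated assumptions, not Datafold's implementation: the `cumulative_reads` function, the column names, and the counts are hypothetical, and the sketch assumes each downstream column is counted once even when it is reachable through several paths.

```python
def cumulative_reads(column, reads, downstream):
    """Sum a column's own read count and the read counts of all columns
    downstream of it.

    reads maps column -> read count for the period of interest.
    downstream maps column -> list of columns derived directly from it.
    Assumption: every downstream column is counted exactly once.
    """
    seen = set()
    stack = [column]
    total = 0
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        total += reads.get(current, 0)
        stack.extend(downstream.get(current, []))
    return total


# Made-up read counts and column-level edges, for illustration only.
reads = {
    "orders.amount": 5,
    "fct_orders.amount": 12,
    "revenue.total": 30,
    "finance.total": 3,
}
downstream = {
    "orders.amount": ["fct_orders.amount"],
    "fct_orders.amount": ["revenue.total", "finance.total"],
    "revenue.total": ["finance.total"],  # second path to finance.total
}

print(cumulative_reads("orders.amount", reads, downstream))      # 50
print(cumulative_reads("fct_orders.amount", reads, downstream))  # 45
```

A column with few direct reads can still have a high cumulative read count, which signals that changing it would affect heavily used downstream data.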

Search and filters

Datafold Cloud's search and filters help users quickly locate specific data assets and isolate data sources of interest.

In both the graphical and tabular lineage views, you can filter by tables or columns within tables, allowing you to go as granular as needed.

Table Filtering: Simply enter the table's name in the search bar to filter and display all relevant information associated with that table.

Column Filtering: To focus specifically on columns, you can search using a combination of keywords. For instance, searching "column table" will display columns associated with a table, while a query like "column dim customer" narrows the search to columns within the "dim customer" table.
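One plausible way to read the keyword behavior described above is that every keyword in the query must match either the asset type or part of the asset's name. The Python sketch below implements that interpretation only; the matching rule, the `search` function, and the sample assets are assumptions made for illustration, and Datafold's actual search logic may differ.

```python
import re


def tokens(text):
    """Split text into lowercase alphanumeric keywords (underscores separate words)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(assets, query):
    """Return the assets whose type and names contain every keyword in query."""
    wanted = tokens(query)
    results = []
    for asset in assets:
        haystack = tokens(
            f"{asset['type']} {asset['table']} {asset.get('column', '')}"
        )
        if wanted <= haystack:  # all query keywords are present
            results.append(asset)
    return results


# Made-up assets, for illustration only.
assets = [
    {"type": "table", "table": "dim_customer"},
    {"type": "column", "table": "dim_customer", "column": "email"},
    {"type": "column", "table": "fct_orders", "column": "customer_id"},
]

for match in search(assets, "column dim customer"):
    print(match["table"], match["column"])
# dim_customer email
```

Under this reading, dropping the "column" keyword from the query would also return the dim_customer table itself, not just its columns.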