# Documentation Index

Fetch the complete documentation index at: https://docs.datafold.com/llms.txt
Use this file to discover all available pages before exploring further.
## What is SQL Proxy?

SQL Proxy is a middleware that routes SQL queries to different compute resources based on query characteristics. It works with any tool that connects via ODBC/JDBC, including BI tools and dbt.

Supported platforms:

- Databricks (available now)
- Snowflake (coming soon)
## The Problem

Without intelligent routing, each system typically connects to a dedicated warehouse sized for peak load. Oversized warehouses waste compute on small queries; undersized warehouses spill to disk on large ones, hurting performance.

## With Datafold SQL Proxy

Queries are routed to appropriately sized compute: large warehouses spin up only for truly large queries, while small workloads run on cheaper compute.

Key points:

- Supports passthrough auth (your Databricks credentials) or managed auth (proxy tokens)
- Datafold uses a separate admin account to manage infrastructure
## Routing Modes

| Mode | Description |
|---|---|
| Explicit Routing | Control routing via @datafold: SQL comments |
| Smart Routing | ML-based automatic warehouse selection |
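With explicit routing, the client annotates each query with a `@datafold:` SQL comment that the proxy reads before forwarding. A minimal sketch of adding such an annotation client-side is below; the directive name and value (`warehouse=large`) are assumptions for illustration, not the proxy's documented syntax.

```python
# Sketch: prepend a "@datafold:" routing comment so SQL Proxy can
# route the statement explicitly. The directive value used here
# ("warehouse=large") is hypothetical -- check the SQL Proxy docs
# for the actual directive vocabulary.

def annotate_query(sql: str, directive: str) -> str:
    """Prepend a @datafold: routing comment to a SQL statement."""
    return f"-- @datafold: {directive}\n{sql}"

query = annotate_query(
    "SELECT order_id, amount FROM sales.orders",
    "warehouse=large",  # hypothetical directive
)
print(query.splitlines()[0])  # -- @datafold: warehouse=large
```

Because the annotation is an ordinary SQL comment, unannotated tools are unaffected and the proxy can fall back to Smart Routing when no directive is present.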
## Authentication

SQL Proxy supports two authentication modes:

| Method | Mode | Description |
|---|---|---|
| PAT | Passthrough | Databricks Personal Access Token forwarded to Databricks |
| M2M OAuth | Passthrough | Databricks service principal credentials forwarded to Databricks |
| Proxy Token | Managed | Token issued by proxy for registered principals |
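The practical difference between the two modes is which credential the client presents: in passthrough mode the proxy forwards your Databricks credential (PAT or M2M OAuth), while in managed mode it validates a token it issued itself. A small sketch of mapping the table above to client-side connection settings follows; the field names and the endpoint host are placeholders, not the proxy's actual API.

```python
# Sketch: choose connection settings per auth method. The
# ProxyConnection fields and the endpoint hostname below are
# illustrative assumptions, not SQL Proxy's real interface.
from dataclasses import dataclass


@dataclass
class ProxyConnection:
    host: str   # SQL Proxy endpoint (placeholder value below)
    token: str  # credential forwarded or validated by the proxy
    mode: str   # "passthrough" or "managed"


def connect_params(method: str, token: str) -> ProxyConnection:
    """Map an auth method from the table above to connection settings."""
    passthrough = {"PAT", "M2M OAuth"}  # forwarded to Databricks
    mode = "passthrough" if method in passthrough else "managed"
    return ProxyConnection(
        host="sql-proxy.example.datafold.com",  # placeholder endpoint
        token=token,
        mode=mode,
    )


print(connect_params("PAT", "dapi-example").mode)         # passthrough
print(connect_params("Proxy Token", "pxt-example").mode)  # managed
```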
## Networking Requirements

SQL Proxy is deployed in Datafold's infrastructure. Your data platform must allow inbound connections from Datafold's IP ranges.

Required Databricks access:

- SQL Warehouse connectivity (port 443)
- Jobs API access (for jobs compute routing)
- Unity Catalog access (if used)
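A quick way to sanity-check TCP reachability of a warehouse endpoint on port 443 is sketched below. Note that this only tests connectivity from the machine running the script, which approximates but does not prove that Datafold's IP ranges are allowlisted; the hostname is a placeholder.

```python
# Sketch: verify a warehouse host accepts TCP connections on 443.
# This checks reachability from *this* machine only; allowlisting
# of Datafold's IPs must still be confirmed in your workspace.
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder hostname -- substitute your Databricks workspace host.
print(can_reach("example-workspace.cloud.databricks.com"))
```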
## Quick Start

- Configure networking - allowlist Datafold IPs in your Databricks workspace
- Register warehouses - add your Databricks warehouses via the Admin API, optionally configure jobs compute
- Register principals - create users/service principals with their Databricks credentials via the Admin API
- Generate tokens - create proxy tokens for principals via the Tokens API
- Update connection - point dbt/BI tools to SQL Proxy endpoint
- Test connectivity - run `dbt debug` or a simple query
- Add annotations - use `@datafold:` directives for routing control
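For the "Update connection" step, pointing dbt at the proxy typically means swapping the host in your Databricks profile for the SQL Proxy endpoint and using a proxy token (managed mode). A hedged sketch of a `profiles.yml` is below; the hostname, `http_path`, and environment variable name are placeholders, and the field layout follows the standard dbt-databricks adapter profile.

```yaml
# Sketch of a dbt profiles.yml targeting SQL Proxy (managed auth).
# Host, http_path, and env var names are placeholders.
my_project:
  target: proxy
  outputs:
    proxy:
      type: databricks
      host: sql-proxy.example.datafold.com        # placeholder proxy endpoint
      http_path: /sql/1.0/warehouses/abc123       # placeholder warehouse path
      token: "{{ env_var('DATAFOLD_PROXY_TOKEN') }}"
      schema: analytics
```

After updating the profile, `dbt debug` from the "Test connectivity" step verifies the new endpoint end to end.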
