**Beta Product**: SQL Proxy is currently in beta. Features and APIs may change.
Connectivity Setup
dbt Core
Update your profile to point to SQL Proxy. Use your existing Databricks credentials:
```yaml
# profiles.yml
my_project:
  target: prod
  outputs:
    prod:
      type: databricks
      host: sqlproxy.your-company.datafold.com
      http_path: /sql/1.0/warehouses/proxy
      token: "{{ env_var('DATABRICKS_TOKEN') }}"  # Your existing PAT
      catalog: your_catalog
      schema: your_schema
```
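Because the profile reads the token with `env_var`, export it in your shell before invoking dbt. The value below is a placeholder, not a real token:

```bash
# Placeholder; substitute your actual Databricks personal access token
export DATABRICKS_TOKEN="dapi-example-token"
```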
Or with machine-to-machine (M2M) OAuth:
```yaml
# profiles.yml
my_project:
  target: prod
  outputs:
    prod:
      type: databricks
      host: sqlproxy.your-company.datafold.com
      http_path: /sql/1.0/warehouses/proxy
      auth_type: oauth
      client_id: "{{ env_var('DATABRICKS_CLIENT_ID') }}"
      client_secret: "{{ env_var('DATABRICKS_CLIENT_SECRET') }}"
      catalog: your_catalog
      schema: your_schema
```
dbt Cloud
In your dbt Cloud environment settings:
| Setting | Value |
|---|---|
| Host | `sqlproxy.your-company.datafold.com` |
| HTTP Path | `/sql/1.0/warehouses/proxy` |
| Credentials | Your existing Databricks PAT or service principal |
Then use **Test Connection** in the environment settings to verify that dbt Cloud can reach the proxy.
Annotations
Add `@datafold:` directives in SQL comments to control routing.
Warehouse Size
```sql
-- @datafold:warehouse_size=L
SELECT * FROM {{ ref('large_source') }}
```
Jobs Compute
Use jobs compute for long-running transformations. Classic compute with spot instances is typically more cost-effective than serverless.
```sql
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4
{{ config(materialized='table') }}
SELECT * FROM {{ ref('source') }}
```
See Routing Modes for the full parameter reference.
Environment-Specific Routing
```sql
{%- if target.name == 'prod' -%}
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=8 cluster_policy_id=PROD_POLICY
{%- else -%}
-- @datafold:warehouse_size=S
{%- endif -%}
SELECT * FROM {{ ref('source') }}
```
Setting Defaults
Create a macro for default annotations:
```sql
-- macros/datafold_defaults.sql
{% macro datafold_default() %}
{%- if target.name == 'prod' -%}
-- @datafold:jobs_compute type=classic cluster_policy_id={{ var('datafold_policy_id') }}
{%- endif -%}
{% endmacro %}
```
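Then call the macro at the top of any model that should inherit the defaults. This is a sketch assuming the proxy reads annotations from the compiled SQL; the model name is illustrative:

```sql
-- models/my_model.sql (illustrative)
{{ datafold_default() }}
{{ config(materialized='table') }}
SELECT * FROM {{ ref('source') }}
```

In prod the call renders the annotation comment; in other targets it renders nothing, so models fall back to warehouse routing.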
Validation Steps
After setup, verify connectivity and routing:
1. Basic Connectivity
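Assuming dbt Core, `dbt debug` confirms the profile resolves and a connection can be opened through the proxy (dbt Cloud users can rely on **Test Connection** instead):

```bash
dbt debug
```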
2. Simple Query
```bash
dbt run --select my_simple_model
```
3. Jobs Compute (Serverless)
```sql
-- models/test/test_jobs_serverless.sql
-- @datafold:jobs_compute
{{ config(materialized='table') }}
SELECT 1 as test_column
```
4. Jobs Compute (Classic)
```sql
-- models/test/test_jobs_classic.sql
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=2
{{ config(materialized='table') }}
SELECT 1 as test_column
```
5. Verify Tagging
```sql
-- models/test/test_jobs_tagged.sql
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=2 custom_tags={"test":"validation"}
{{ config(materialized='table') }}
SELECT 1 as test_column
```
Check Databricks Jobs UI to confirm tags are applied.
6. Job Policy
```sql
-- models/test/test_jobs_policy.sql
-- @datafold:jobs_compute type=classic cluster_policy_id=YOUR_POLICY_ID
{{ config(materialized='table') }}
SELECT 1 as test_column
```
Required Permissions
Since SQL Proxy uses passthrough authentication, the identity you connect with (user PAT or service principal) needs:
| Resource | Permission | Required For |
|---|---|---|
| SQL Warehouse | CAN USE | Running queries |
| Jobs | CAN MANAGE RUN | Jobs compute routing |
| Unity Catalog | USE CATALOG, USE SCHEMA, SELECT | Data access |
If routing to multiple warehouses, grant CAN USE on each.
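As a sketch, the Unity Catalog grants can be applied in SQL. The catalog, schema, and principal names below are placeholders:

```sql
-- Placeholders: replace your_catalog, your_schema, and sp-sqlproxy
GRANT USE CATALOG ON CATALOG your_catalog TO `sp-sqlproxy`;
GRANT USE SCHEMA ON SCHEMA your_catalog.your_schema TO `sp-sqlproxy`;
GRANT SELECT ON SCHEMA your_catalog.your_schema TO `sp-sqlproxy`;
```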