Beta Product: SQL Proxy is currently in beta. Features and APIs may change.

Connectivity Setup

dbt Core

Update your profile to point to SQL Proxy. Use your existing Databricks credentials:
# profiles.yml
my_project:
  target: prod
  outputs:
    prod:
      type: databricks
      host: sqlproxy.your-company.datafold.com
      http_path: /sql/1.0/warehouses/proxy
      token: "{{ env_var('DATABRICKS_TOKEN') }}"  # Your existing PAT
      catalog: your_catalog
      schema: your_schema
Or with M2M OAuth:
# profiles.yml
my_project:
  target: prod
  outputs:
    prod:
      type: databricks
      host: sqlproxy.your-company.datafold.com
      http_path: /sql/1.0/warehouses/proxy
      auth_type: oauth
      client_id: "{{ env_var('DATABRICKS_CLIENT_ID') }}"
      client_secret: "{{ env_var('DATABRICKS_CLIENT_SECRET') }}"
      catalog: your_catalog
      schema: your_schema
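
Both profiles read credentials from environment variables via env_var(). As a minimal sketch (variable names match the calls above; the values are placeholders), export them in the shell that runs dbt:

export DATABRICKS_TOKEN="<your-existing-pat>"                   # PAT profile
export DATABRICKS_CLIENT_ID="<service-principal-client-id>"     # OAuth profile
export DATABRICKS_CLIENT_SECRET="<service-principal-secret>"    # OAuth profile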

dbt Cloud

In your dbt Cloud environment settings:
Setting       Value
Host          sqlproxy.your-company.datafold.com
HTTP Path     /sql/1.0/warehouses/proxy
Credentials   Your existing Databricks PAT or service principal
Test the connection:
dbt debug

Annotations

Add @datafold: directives in SQL comments to control routing.

Warehouse Size

-- @datafold:warehouse_size=L
SELECT * FROM {{ ref('large_source') }}

Jobs Compute

Use jobs compute for long-running transformations. Classic compute with spot instances is more cost-effective than serverless.
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4
{{ config(materialized='table') }}
SELECT * FROM {{ ref('source') }}
See Routing Modes for full parameter reference.

Environment-Specific Routing

{%- if target.name == 'prod' -%}
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=8 cluster_policy_id=PROD_POLICY
{%- else -%}
-- @datafold:warehouse_size=S
{%- endif -%}

SELECT * FROM {{ ref('source') }}

Setting Defaults

Create a macro for default annotations:
-- macros/datafold_defaults.sql
{% macro datafold_default() %}
  {%- if target.name == 'prod' -%}
    -- @datafold:jobs_compute type=classic cluster_policy_id={{ var('datafold_policy_id') }}
  {%- endif -%}
{% endmacro %}
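
To apply the default, call the macro at the top of a model so the annotation is rendered into the compiled SQL. A minimal sketch; the model name is illustrative, and the datafold_policy_id var used by the macro must be defined in dbt_project.yml or passed with --vars:

-- models/my_model.sql
{{ datafold_default() }}
{{ config(materialized='table') }}
SELECT * FROM {{ ref('source') }}

In non-prod targets the macro renders nothing, so the model falls back to the proxy's default routing.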

Validation Steps

After setup, verify connectivity and routing:

1. Basic Connectivity

dbt debug

2. Simple Query

dbt run --select my_simple_model

3. Jobs Compute (Serverless)

-- models/test/test_jobs_serverless.sql
-- @datafold:jobs_compute
{{ config(materialized='table') }}
SELECT 1 as test_column

4. Jobs Compute (Classic)

-- models/test/test_jobs_classic.sql
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=2
{{ config(materialized='table') }}
SELECT 1 as test_column

5. Verify Tagging

-- models/test/test_jobs_tagged.sql
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=2 custom_tags={"test":"validation"}
{{ config(materialized='table') }}
SELECT 1 as test_column
Check Databricks Jobs UI to confirm tags are applied.

6. Job Policy

-- models/test/test_jobs_policy.sql
-- @datafold:jobs_compute type=classic cluster_policy_id=YOUR_POLICY_ID
{{ config(materialized='table') }}
SELECT 1 as test_column

Required Permissions

Since SQL Proxy uses passthrough authentication, your service principal needs:
Resource        Permission                        Required For
SQL Warehouse   CAN USE                           Running queries
Jobs            CAN MANAGE RUN                    Jobs compute routing
Unity Catalog   USE CATALOG, USE SCHEMA, SELECT   Data access
If routing to multiple warehouses, grant CAN USE on each.
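
Warehouse and jobs permissions are assigned in the Databricks UI or via the Permissions API; the Unity Catalog privileges can be granted in SQL. A minimal sketch, assuming catalog your_catalog, schema your_schema, and a service principal referenced by its application ID (placeholder):

GRANT USE CATALOG ON CATALOG your_catalog TO `<service-principal-application-id>`;
GRANT USE SCHEMA ON SCHEMA your_catalog.your_schema TO `<service-principal-application-id>`;
GRANT SELECT ON SCHEMA your_catalog.your_schema TO `<service-principal-application-id>`;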