Beta Product: SQL Proxy is currently in beta. Features and APIs may change.

Explicit Routing

Control routing using @datafold: directives in SQL comments.

Warehouse Size

-- @datafold:warehouse_size=L
SELECT * FROM large_table
Sizes: 2XS, XS, S, M, L, XL, 2XL, 3XL, 4XL

Specific Warehouse

-- @datafold:warehouse=prod-analytics
SELECT * FROM sales.transactions
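
When queries are generated by application code, the directive can be prepended programmatically. The following is a minimal sketch, not part of the product: the helper names `with_warehouse_size` and `with_warehouse` are hypothetical, and the rule that a warehouse name contains no whitespace is an assumption made so that the directive stays on one comment line.

```python
# Hypothetical client-side helpers; only the directive text itself comes from the docs.
VALID_SIZES = ("2XS", "XS", "S", "M", "L", "XL", "2XL", "3XL", "4XL")


def with_warehouse_size(sql: str, size: str) -> str:
    """Prepend a warehouse_size directive comment to a SQL statement."""
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"-- @datafold:warehouse_size={size}\n{sql}"


def with_warehouse(sql: str, name: str) -> str:
    """Prepend a directive that targets one named warehouse."""
    # Assumption: names are non-empty and contain no whitespace.
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("warehouse name must be non-empty and contain no whitespace")
    return f"-- @datafold:warehouse={name}\n{sql}"


if __name__ == "__main__":
    print(with_warehouse_size("SELECT * FROM large_table", "L"))
    print(with_warehouse("SELECT * FROM sales.transactions", "prod-analytics"))
```

Validating the size on the client side catches a typo such as `XXL` before the query is sent.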

Jobs Compute

Use jobs compute for long-running transformations. Classic compute with spot instances is typically more cost-effective than serverless, which is billed by DBU-hours.

Serverless

-- @datafold:jobs_compute
CREATE TABLE result AS SELECT ...

Classic

-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4
CREATE TABLE result AS SELECT ...

Full Parameter Reference

All Databricks Jobs API cluster parameters are supported:

Parameter          Description                       Example
type               serverless (default) or classic   type=classic
node_type          EC2 instance type                 node_type=i3.xlarge
workers            Number of workers (fixed)         workers=4
autoscale_min      Min workers (autoscaling)         autoscale_min=2
autoscale_max      Max workers (autoscaling)         autoscale_max=10
cluster_policy_id  Databricks cluster policy ID      cluster_policy_id=ABC123
custom_tags        JSON object for resource tagging  custom_tags={"team":"data"}
runtime_engine     standard or photon                runtime_engine=photon
spark_version      Databricks runtime version        spark_version=14.3.x-scala2.12
aws_attributes     AWS config (spot, EBS)            See AWS Spot Instances below
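
The parameter list above can be checked on the client before a directive is written. This is an illustrative sketch only: `validate_jobs_compute` is a hypothetical helper, the allowed values for `type` and `runtime_engine` are taken from the table, and two rules are assumptions rather than documented behavior (that `autoscale_min` and `autoscale_max` are given together, and that a fixed `workers` count is an alternative to an autoscaling range).

```python
def validate_jobs_compute(params: dict) -> list[str]:
    """Return a list of problems found in a jobs_compute parameter dict."""
    problems = []

    # serverless is the documented default when type is omitted.
    compute_type = params.get("type", "serverless")
    if compute_type not in ("serverless", "classic"):
        problems.append(f"type must be serverless or classic, got {compute_type!r}")

    engine = params.get("runtime_engine")
    if engine is not None and engine not in ("standard", "photon"):
        problems.append(f"runtime_engine must be standard or photon, got {engine!r}")

    low = params.get("autoscale_min")
    high = params.get("autoscale_max")
    if (low is None) != (high is None):
        # Assumption: the two autoscaling bounds are always set as a pair.
        problems.append("autoscale_min and autoscale_max must be set together")
    elif low is not None and low > high:
        problems.append("autoscale_min must not exceed autoscale_max")

    # Assumption: fixed sizing and autoscaling are mutually exclusive.
    if "workers" in params and (low is not None or high is not None):
        problems.append("use either workers or autoscale_min/autoscale_max, not both")

    return problems


if __name__ == "__main__":
    print(validate_jobs_compute({"type": "classic", "autoscale_min": 2, "autoscale_max": 10}))
    print(validate_jobs_compute({"type": "classic", "workers": 4, "autoscale_min": 2}))
```

An empty list means no problem was found by these checks; the proxy itself may apply different or stricter rules.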

Using Job Policies

Databricks cluster policies auto-apply defaults (tagging, instance types, etc.):
-- @datafold:jobs_compute type=classic cluster_policy_id=0123456789ABCDEF

Tagging

-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4 custom_tags={"cost_center":"12345","team":"analytics"}

AWS Spot Instances

-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4 aws_attributes={"availability":"SPOT_WITH_FALLBACK","spot_bid_price_percent":100}
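
Directives with JSON values such as `custom_tags` and `aws_attributes` are easy to mistype by hand. The sketch below assembles a `jobs_compute` directive from keyword arguments and serializes dictionaries as compact JSON, matching the spacing used in the examples above. The function name `jobs_compute_directive` is hypothetical, and whether the proxy also accepts JSON containing spaces is not stated here, so the compact form is an assumption made to stay on the safe side.

```python
import json


def jobs_compute_directive(**params) -> str:
    """Build a one-line jobs_compute directive comment from key=value parameters."""
    parts = ["-- @datafold:jobs_compute"]
    for key, value in params.items():
        if isinstance(value, dict):
            # Compact separators keep the JSON free of spaces, as in the documented examples.
            value = json.dumps(value, separators=(",", ":"))
        parts.append(f"{key}={value}")
    return " ".join(parts)


if __name__ == "__main__":
    directive = jobs_compute_directive(
        type="classic",
        node_type="i3.xlarge",
        workers=4,
        aws_attributes={"availability": "SPOT_WITH_FALLBACK", "spot_bid_price_percent": 100},
    )
    print(directive)
    print("CREATE TABLE result AS SELECT ...")
```

Called with no arguments, the helper returns the bare `-- @datafold:jobs_compute` directive, which selects the serverless default.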

Smart Routing

When enabled, smart routing uses ML to predict the optimal warehouse size for queries without explicit directives. Contact your Datafold administrator to enable it.

Default Behavior

Queries without routing directives use the default warehouse configured for your connection.
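
To see which queries in a codebase would fall back to the default warehouse, a client-side scan for directive comments can help. This is a rough sketch under stated assumptions: `routing_directives` is a hypothetical helper, and the pattern assumes a directive is a `--` line comment whose text starts with `@datafold:`, as in the examples on this page. The proxy's actual parsing rules may differ.

```python
import re

# Assumption: a directive is a "--" line comment beginning with "@datafold:".
_DIRECTIVE = re.compile(r"^[ \t]*--[ \t]*@datafold:(\w+)", re.MULTILINE)


def routing_directives(sql: str) -> list[str]:
    """Return the names of the @datafold: directives found in a SQL string."""
    return _DIRECTIVE.findall(sql)


if __name__ == "__main__":
    routed = "-- @datafold:warehouse_size=L\nSELECT * FROM large_table"
    unrouted = "SELECT * FROM large_table"
    print(routing_directives(routed))
    print(routing_directives(unrouted))
```

An empty result suggests the query carries no directive and would use the default warehouse, or smart routing if it is enabled.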