Beta Product: SQL Proxy is currently in beta. Features and APIs may change.
Explicit Routing
Control routing using @datafold: directives in SQL comments.
Warehouse Size
-- @datafold:warehouse_size=L
SELECT * FROM large_table
Sizes: 2XS, XS, S, M, L, XL, 2XL, 3XL, 4XL
Specific Warehouse
-- @datafold:warehouse=prod-analytics
SELECT * FROM sales.transactions
Jobs Compute
Use jobs compute for long-running transformations. Classic compute with spot instances is typically more cost-effective than serverless, which is billed by DBU-hours.
Serverless
-- @datafold:jobs_compute
CREATE TABLE result AS SELECT ...
Classic (Recommended for Cost)
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4
CREATE TABLE result AS SELECT ...
Full Parameter Reference
All Databricks Jobs API cluster parameters are supported:
| Parameter | Description | Example |
|---|---|---|
| type | serverless (default) or classic | type=classic |
| node_type | EC2 instance type | node_type=i3.xlarge |
| workers | Number of workers (fixed) | workers=4 |
| autoscale_min | Min workers (autoscaling) | autoscale_min=2 |
| autoscale_max | Max workers (autoscaling) | autoscale_max=10 |
| cluster_policy_id | Databricks cluster policy ID | cluster_policy_id=ABC123 |
| custom_tags | JSON object for resource tagging | custom_tags={"team":"data"} |
| runtime_engine | standard or photon | runtime_engine=photon |
| spark_version | Databricks runtime version | spark_version=14.3.x-scala2.12 |
| aws_attributes | AWS config (spot, EBS) | See below |
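The autoscaling and runtime parameters have no worked example elsewhere on this page, so here is a minimal sketch. It assumes that autoscale_min and autoscale_max take the place of workers, and that parameters are space-separated as in the other jobs_compute directives. The table daily_totals and the columns transaction_date and amount are hypothetical names used only for illustration.

```sql
-- Assumption: autoscale_min/autoscale_max are used instead of a fixed workers count.
-- @datafold:jobs_compute type=classic node_type=i3.xlarge autoscale_min=2 autoscale_max=10
CREATE TABLE daily_totals AS
SELECT transaction_date, SUM(amount) AS total_amount  -- hypothetical columns
FROM sales.transactions
GROUP BY transaction_date;

-- Assumption: runtime_engine and spark_version can be set in the same directive.
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4 runtime_engine=photon spark_version=14.3.x-scala2.12
CREATE TABLE daily_totals_photon AS
SELECT transaction_date, SUM(amount) AS total_amount  -- hypothetical columns
FROM sales.transactions
GROUP BY transaction_date;
```

If the proxy rejects a combination of parameters, fall back to the single-purpose directives shown in the sections that follow.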
Using Job Policies
Databricks cluster policies auto-apply defaults (tagging, instance types, etc.):
-- @datafold:jobs_compute type=classic cluster_policy_id=0123456789ABCDEF
Tagging
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4 custom_tags={"cost_center":"12345","team":"analytics"}
AWS Spot Instances
-- @datafold:jobs_compute type=classic node_type=i3.xlarge workers=4 aws_attributes={"availability":"SPOT_WITH_FALLBACK","spot_bid_price_percent":100}
Smart Routing
When enabled, smart routing uses a machine-learning model to predict the optimal warehouse size for queries that have no explicit directive. Contact your Datafold administrator to enable it.
Default Behavior
Queries without routing directives use the default warehouse configured for your connection.