Transforms
Define SQL models that build on your raw data and each other.
What are Transforms?
Transforms are named SQL SELECT statements that rawquery materializes into
tables. They work like dbt models: you write SELECT queries, reference other
transforms with {{ ref('name') }}, and rawquery handles
execution order, dependency resolution, and materialization.
Creating a Transform
Each transform has:
- Name - unique identifier (alphanumeric + underscores)
- SQL - a SELECT statement, optionally referencing other transforms
- Schedule - cron expression for automatic runs (optional)
- Materialization -
table(full refresh),view(virtual), orincremental - Unique Key - required for incremental, used for deduplication
Referencing Other Transforms
Use the {{ ref('transform_name') }} syntax to reference
other transforms in your SQL. rawquery will:
- Resolve the reference to the actual table name
- Build a dependency graph (DAG) from all references
- Execute transforms in the correct order
-- staging_orders: clean raw dataSELECT id, customer_id, amount / 100.0 as amount_dollars, created_atFROM my_stripe.chargesWHERE status = 'succeeded'
-- monthly_revenue: aggregate (references staging_orders)SELECT date_trunc('month', created_at) as month, count(*) as order_count, sum(amount_dollars) as revenueFROM {{ ref('staging_orders') }}GROUP BY 1-- staging_orders: clean raw dataSELECT id, customer_id, amount / 100.0 as amount_dollars, created_atFROM my_stripe.chargesWHERE status = 'succeeded'
-- monthly_revenue: aggregate (references staging_orders)SELECT date_trunc('month', created_at) as month, count(*) as order_count, sum(amount_dollars) as revenueFROM {{ ref('staging_orders') }}GROUP BY 1Materialization Strategies
Table (default)
Drops and recreates the table on every run. Use this for aggregations, snapshots, or any query where you want fresh results.
View (virtual)
Runs the SQL at query time without materializing any data. Use this for lightweight transformations or aliases. Views have no schedule and are not visible in the Lakehouse since they don't produce physical tables.
Incremental
Inserts new rows and deduplicates on unique_key. Use this for
append-heavy tables where you only want to process new data.
Execution
Transforms can be run in three ways:
- Manual - click "Run" on a single transform or "Run All"
- Scheduled - set a cron expression and rawquery runs it automatically
- API - POST to the transforms/run endpoint
When running multiple transforms, rawquery resolves the DAG and executes in dependency order. If a transform fails, execution stops.
DAG Visualization
The Transforms page includes a dependency graph view. Toggle it with the "Show DAG" button. Nodes are color-coded by last run status:
- Green border - last run succeeded
- Red border - last run failed
- Gray border - never run
Each node shows its execution order number and dependencies.
Querying Transform Results
Table and Incremental transforms are persisted to Iceberg and appear in the
Lakehouse under the transforms schema. You can query them directly:
SELECT * FROM "transforms"."monthly_revenue" LIMIT 100SELECT * FROM "transforms"."monthly_revenue" LIMIT 100View transforms are not materialized and do not appear in the Lakehouse.
They are evaluated at query time when referenced by other transforms
via {{ ref('name') }}.
API Reference
GET /api/v1/workspaces/{workspace_id}/transforms # List transformsPOST /api/v1/workspaces/{workspace_id}/transforms # Create transformGET /api/v1/workspaces/{workspace_id}/transforms/dag # Get DAGPOST /api/v1/workspaces/{workspace_id}/transforms/run # Run all/selectedGET /api/v1/workspaces/{workspace_id}/transforms/{id} # Get transformPATCH /api/v1/workspaces/{workspace_id}/transforms/{id} # Update transformDELETE /api/v1/workspaces/{workspace_id}/transforms/{id} # Delete transformPOST /api/v1/workspaces/{workspace_id}/transforms/{id}/run # Run singleGET /api/v1/workspaces/{workspace_id}/transforms/{id}/runs # Run historyGET /api/v1/workspaces/{workspace_id}/transforms # List transformsPOST /api/v1/workspaces/{workspace_id}/transforms # Create transformGET /api/v1/workspaces/{workspace_id}/transforms/dag # Get DAGPOST /api/v1/workspaces/{workspace_id}/transforms/run # Run all/selectedGET /api/v1/workspaces/{workspace_id}/transforms/{id} # Get transformPATCH /api/v1/workspaces/{workspace_id}/transforms/{id} # Update transformDELETE /api/v1/workspaces/{workspace_id}/transforms/{id} # Delete transformPOST /api/v1/workspaces/{workspace_id}/transforms/{id}/run # Run singleGET /api/v1/workspaces/{workspace_id}/transforms/{id}/runs # Run history