No credit cardStart free

Transforms

Define SQL models that build on your raw data and each other.

What are Transforms?

Transforms are named SQL SELECT statements that rawquery materializes into tables. They work like dbt models: you write SELECT queries, reference other transforms with {{ ref('name') }}, and rawquery handles execution order, dependency resolution, and materialization.

Creating a Transform

Each transform has:

  • Name - unique identifier (alphanumeric + underscores)
  • SQL - a SELECT statement, optionally referencing other transforms
  • Schedule - cron expression for automatic runs (optional)
  • Materialization - table (full refresh), view (virtual), or incremental
  • Unique Key - required for incremental, used for deduplication

Referencing Other Transforms

Use the {{ ref('transform_name') }} syntax to reference other transforms in your SQL. rawquery will:

  1. Resolve the reference to the actual table name
  2. Build a dependency graph (DAG) from all references
  3. Execute transforms in the correct order
sql
-- staging_orders: clean raw data
SELECT
id,
customer_id,
amount / 100.0 as amount_dollars,
created_at
FROM my_stripe.charges
WHERE status = 'succeeded'
-- monthly_revenue: aggregate (references staging_orders)
SELECT
date_trunc('month', created_at) as month,
count(*) as order_count,
sum(amount_dollars) as revenue
FROM {{ ref('staging_orders') }}
GROUP BY 1

Materialization Strategies

Table (default)

Drops and recreates the table on every run. Use this for aggregations, snapshots, or any query where you want fresh results.

View (virtual)

Runs the SQL at query time without materializing any data. Use this for lightweight transformations or aliases. Views have no schedule and are not visible in the Lakehouse since they don't produce physical tables.

Incremental

Inserts new rows and deduplicates on unique_key. Use this for append-heavy tables where you only want to process new data.

Execution

Transforms can be run in three ways:

  • Manual - click "Run" on a single transform or "Run All"
  • Scheduled - set a cron expression and rawquery runs it automatically
  • API - POST to the transforms/run endpoint

When running multiple transforms, rawquery resolves the DAG and executes in dependency order. If a transform fails, execution stops.

DAG Visualization

The Transforms page includes a dependency graph view. Toggle it with the "Show DAG" button. Nodes are color-coded by last run status:

  • Green border - last run succeeded
  • Red border - last run failed
  • Gray border - never run

Each node shows its execution order number and dependencies.

Querying Transform Results

Table and Incremental transforms are persisted to Iceberg and appear in the Lakehouse under the transforms schema. You can query them directly:

sql
SELECT * FROM "transforms"."monthly_revenue" LIMIT 100

View transforms are not materialized and do not appear in the Lakehouse. They are evaluated at query time when referenced by other transforms via {{ ref('name') }}.

API Reference

text
GET /api/v1/workspaces/{workspace_id}/transforms # List transforms
POST /api/v1/workspaces/{workspace_id}/transforms # Create transform
GET /api/v1/workspaces/{workspace_id}/transforms/dag # Get DAG
POST /api/v1/workspaces/{workspace_id}/transforms/run # Run all/selected
GET /api/v1/workspaces/{workspace_id}/transforms/{id} # Get transform
PATCH /api/v1/workspaces/{workspace_id}/transforms/{id} # Update transform
DELETE /api/v1/workspaces/{workspace_id}/transforms/{id} # Delete transform
POST /api/v1/workspaces/{workspace_id}/transforms/{id}/run # Run single
GET /api/v1/workspaces/{workspace_id}/transforms/{id}/runs # Run history