
Custom Connectors

Connect any HTTP or GraphQL API with a declarative JSON spec. rawquery handles auth, pagination, rate limits, and schema evolution - no code required.

Quick Start

Create a JSON spec describing the API, then run one command:

json
{
  "version": "1",
  "base_url": "https://api.example.com/v1",
  "auth": { "type": "bearer" },
  "streams": [
    {
      "name": "users",
      "endpoint": "/users",
      "data_path": "data",
      "primary_key": ["id"]
    }
  ]
}

bash
# Connect with an API token
rq connect my-api --spec spec.json --auth-token $API_TOKEN

# With a sync schedule
rq connect my-api --spec spec.json --auth-token $API_TOKEN --schedule daily

# Update the spec later (no need to recreate)
rq connections update my-api --spec updated-spec.json

After creation, manage it like any other connection:

bash
rq connections sync my-api # trigger sync
rq connections status my-api # per-stream status
rq connections delete my-api # remove

Spec Format

The spec is a JSON document with these top-level fields:

| Field | Required | Description |
| --- | --- | --- |
| version | Yes | Always "1" |
| base_url | Yes | API base URL |
| type | No | "rest" (default) or "graphql" |
| auth | Yes | Authentication configuration |
| streams | Yes | Array of endpoints/resources to sync |
| rate_limit | No | Rate limiting and retry configuration |
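
As a rough illustration of the table above, the following Python sketch checks a spec's top-level fields before it is passed to the CLI. The `validate_spec` helper is hypothetical and not part of rawquery; it encodes only the rules listed in the table.

```python
REQUIRED_TOP_LEVEL = ("version", "base_url", "auth", "streams")


def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems found in the top-level fields (empty if none)."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_TOP_LEVEL
        if name not in spec
    ]
    if "version" in spec and spec["version"] != "1":
        problems.append('version must be "1"')
    # "type" is optional and defaults to "rest".
    if spec.get("type", "rest") not in ("rest", "graphql"):
        problems.append('type must be "rest" or "graphql"')
    if "streams" in spec and not isinstance(spec["streams"], list):
        problems.append("streams must be an array")
    return problems
```

An empty list means the top-level shape matches the table; it says nothing about whether rawquery will accept the individual streams.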

Each stream defines an endpoint to extract data from:

| Field | Description |
| --- | --- |
| name | Table name in rawquery |
| endpoint | URL path (REST); defaults to /graphql for GraphQL specs |
| data_path | JSON path to the array of records (e.g. data.results) |
| primary_key | Array of field names for deduplication |
| cursor_field | Field for incremental sync (e.g. updatedAt) |
| pagination | Pagination configuration (see below) |
| params | Static query parameters |
| query | GraphQL query string (GraphQL specs only) |
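
To make `data_path` and `primary_key` concrete, here is a minimal Python sketch of how a dotted path might be resolved and how records could be deduplicated. Both helpers are hypothetical, and the "last record wins" rule is an assumption rather than documented rawquery behavior.

```python
def resolve_path(payload, path):
    """Follow a dotted path such as "data.results" through nested dicts."""
    node = payload
    for key in path.split("."):
        node = node[key]
    return node


def dedupe(records, primary_key):
    """Keep the last record seen for each primary-key tuple (assumed rule)."""
    latest = {}
    for rec in records:
        latest[tuple(rec[k] for k in primary_key)] = rec
    return list(latest.values())


response = {"data": {"results": [{"id": 1, "v": "old"}, {"id": 1, "v": "new"}]}}
records = dedupe(resolve_path(response, "data.results"), ["id"])
```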

Authentication

API Key / Bearer / Basic

These schemes use static credentials. Pass them via CLI flags; they are stored encrypted server-side.

json
// API key in header
{"auth": {"type": "api_key", "header": "X-API-Key"}}
// API key in query parameter
{"auth": {"type": "api_key", "query_param": "api_key"}}
// Bearer token (Authorization: Bearer <token>)
{"auth": {"type": "bearer"}}
// Basic auth (username:password)
{"auth": {"type": "basic"}}

bash
rq connect my-api --spec spec.json --auth-token $TOKEN # api_key or bearer
rq connect my-api --spec spec.json --auth-user me --auth-pass $PASS # basic
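
The sketch below shows what each static auth type plausibly turns into on the wire. `apply_auth` is a hypothetical helper; the Bearer and Basic header formats are the standard HTTP ones, while the rest is an assumption about how the spec is interpreted.

```python
import base64


def apply_auth(auth, token=None, user=None, password=None):
    """Return (headers, query_params) implied by an auth block."""
    headers, params = {}, {}
    kind = auth["type"]
    if kind == "bearer":
        headers["Authorization"] = f"Bearer {token}"
    elif kind == "basic":
        # Basic auth is base64("username:password").
        raw = f"{user}:{password}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    elif kind == "api_key":
        if "header" in auth:
            headers[auth["header"]] = token
        else:
            params[auth["query_param"]] = token
    else:
        raise ValueError(f"unsupported auth type: {kind}")
    return headers, params
```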

OAuth2

rawquery refreshes the access token automatically before each sync and persists the refreshed tokens.

json
{
  "auth": {
    "type": "oauth2",
    "token_url": "https://provider.com/oauth/token",
    "refresh_url": "https://provider.com/oauth/token",
    "grant_type": "refresh_token"
  }
}

bash
rq connect my-api --spec spec.json \
  --client-id $CLIENT_ID --client-secret $CLIENT_SECRET \
  --refresh-token $REFRESH_TOKEN
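
For orientation, a refresh-token grant (RFC 6749, section 6) is a form-encoded POST to the token endpoint. The Python sketch below only builds that request and does not send it. `build_refresh_request` is a hypothetical helper, and sending the client credentials in the body is an assumption, since some providers expect them in a Basic auth header instead.

```python
from urllib.parse import urlencode


def build_refresh_request(auth, client_id, client_secret, refresh_token):
    """Return (url, form_body) for a refresh-token grant."""
    body = urlencode({
        "grant_type": auth["grant_type"],
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
    # Fall back to token_url when no separate refresh_url is given (assumption).
    return auth.get("refresh_url", auth["token_url"]), body


url, body = build_refresh_request(
    {
        "type": "oauth2",
        "token_url": "https://provider.com/oauth/token",
        "refresh_url": "https://provider.com/oauth/token",
        "grant_type": "refresh_token",
    },
    client_id="id1",
    client_secret="s1",
    refresh_token="r1",
)
```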

Custom Auth

For APIs that require multi-step auth (session tokens, HMAC, custom headers):

json
{
  "auth": {
    "type": "custom",
    "pre_request": {
      "url": "https://api.example.com/auth/session",
      "method": "POST",
      "body": {"username": "{{auth.username}}", "password": "{{auth.password}}"},
      "extract": {"session_token": "response.token"}
    },
    "headers": {
      "X-Session-Token": "{{session_token}}"
    }
  }
}
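
One way to read the spec above: placeholders in `body` are filled from stored credentials, `extract` pulls values out of the pre-request response, and those values are then substituted into `headers`. The Python sketch below illustrates that reading; `render` and `extract` are hypothetical helpers, and the exact template semantics are an assumption.

```python
import re


def render(template, context):
    """Replace each {{name}} or {{a.b}} placeholder with a value from context."""
    def lookup(match):
        node = context
        for key in match.group(1).split("."):
            node = node[key]
        return str(node)
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


def extract(extract_spec, response):
    """Resolve paths like "response.token" against the pre-request response body."""
    values = {}
    for name, path in extract_spec.items():
        node = {"response": response}
        for key in path.split("."):
            node = node[key]
        values[name] = node
    return values


creds = {"auth": {"username": "me", "password": "pw"}}
login_body = {
    "username": render("{{auth.username}}", creds),
    "password": render("{{auth.password}}", creds),
}
# Pretend the session endpoint answered with {"token": "abc123"}.
session = extract({"session_token": "response.token"}, {"token": "abc123"})
headers = {"X-Session-Token": render("{{session_token}}", session)}
```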

Pagination

Pagination is configured per stream. If omitted, rawquery assumes a single-page response.

json
// Offset-based
{"pagination": {"type": "offset", "param": "offset", "limit_param": "limit", "page_size": 100}}
// Cursor-based (read cursor from response, pass as query param)
{"pagination": {"type": "cursor", "cursor_path": "meta.next_cursor", "param": "cursor"}}
// Page number
{"pagination": {"type": "page_number", "param": "page", "page_size_param": "per_page", "page_size": 50, "start_page": 1}}
// Link header (RFC 8288, formerly RFC 5988; GitHub-style)
{"pagination": {"type": "link_header"}}
// Next URL in response body
{"pagination": {"type": "next_url", "next_url_path": "paging.next"}}
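
To show what the cursor variant amounts to, here is a small Python sketch of a cursor pagination loop driven by the `cursor_path` and `param` settings. The `fake_fetch` function stands in for an HTTP call, and all helper names are hypothetical. Stopping when the cursor is missing or empty is an assumption about the stop condition.

```python
def get_path(payload, path):
    """Follow a dotted path; return None if any segment is missing."""
    node = payload
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def paginate_cursor(fetch, pagination, data_path):
    """Yield records page by page until the response carries no next cursor."""
    params = {}
    while True:
        page = fetch(params)
        yield from get_path(page, data_path) or []
        cursor = get_path(page, pagination["cursor_path"])
        if not cursor:
            break
        params = {pagination["param"]: cursor}


# Stand-in for an HTTP call: two pages keyed by the cursor parameter.
PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "meta": {"next_cursor": "p2"}},
    "p2": {"data": [{"id": 3}], "meta": {"next_cursor": None}},
}


def fake_fetch(params):
    return PAGES[params.get("cursor")]


records = list(paginate_cursor(
    fake_fetch,
    {"type": "cursor", "cursor_path": "meta.next_cursor", "param": "cursor"},
    "data",
))
```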

Data Transformation

API responses are rarely flat tables. These options let you shape the data before it lands in Iceberg.

Flatten Nested Objects

json
{
  "flatten": {
    "address": "prefix",   // {"address": {"city": "Paris"}} -> {"address_city": "Paris"}
    "metadata": "lift",    // {"metadata": {"key": "val"}} -> {"key": "val"}
    "raw_json": "json"     // keeps as JSON string (default)
  }
}
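
The three modes can be sketched in a few lines of Python. `flatten_record` is a hypothetical helper that handles only one level of nesting, and letting lifted keys overwrite existing ones is an assumption.

```python
import json


def flatten_record(record, flatten):
    """Apply "prefix", "lift", or "json" to each nested object in a record."""
    out = {}
    for key, value in record.items():
        mode = flatten.get(key, "json")  # "json" is the documented default
        if not isinstance(value, dict):
            out[key] = value
        elif mode == "prefix":
            for sub, v in value.items():
                out[f"{key}_{sub}"] = v
        elif mode == "lift":
            out.update(value)
        else:
            out[key] = json.dumps(value)
    return out


row = flatten_record(
    {
        "id": 1,
        "address": {"city": "Paris"},
        "metadata": {"key": "val"},
        "raw_json": {"a": 1},
    },
    {"address": "prefix", "metadata": "lift", "raw_json": "json"},
)
```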

Array Expansion

Expand nested arrays into separate child tables with foreign keys:

json
{
  "expand": {
    "line_items": {"primary_key": "id", "parent_key": "order_id"}
  }
}
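
A plausible reading of this config, sketched in Python: each element of `line_items` becomes a row in a child table, and `parent_key` names the column that holds the parent record's primary key. `expand_arrays` is a hypothetical helper and this interpretation of `parent_key` is an assumption.

```python
def expand_arrays(record, expand, parent_pk="id"):
    """Split nested arrays out of a parent record into child rows."""
    parent = dict(record)
    children = {}
    for field, cfg in expand.items():
        rows = parent.pop(field, None) or []
        # Stamp every child row with the parent's primary key as a foreign key.
        children[field] = [
            {**row, cfg["parent_key"]: record[parent_pk]} for row in rows
        ]
    return parent, children


order = {
    "id": 7,
    "total": 30,
    "line_items": [{"id": "a", "qty": 1}, {"id": "b", "qty": 2}],
}
parent, children = expand_arrays(
    order, {"line_items": {"primary_key": "id", "parent_key": "order_id"}}
)
```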

Field Selection and Renaming

json
{
  "fields": {
    "include": ["id", "name", "email", "created_at"],
    "rename": {"created_at": "created_date", "id": "external_id"}
  }
}
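
Because `include` in the example lists the original names (`created_at`, `id`), it seems that selection happens before renaming. The Python sketch below follows that assumed order; `select_and_rename` is a hypothetical helper.

```python
def select_and_rename(record, fields):
    """Keep only the included fields, then apply the rename map."""
    include = fields.get("include")
    rename = fields.get("rename", {})
    kept = record if include is None else {
        k: v for k, v in record.items() if k in include
    }
    return {rename.get(k, k): v for k, v in kept.items()}


row = select_and_rename(
    {"id": 1, "name": "Ada", "email": "a@x.io", "created_at": "2024-01-01", "phone": "555"},
    {
        "include": ["id", "name", "email", "created_at"],
        "rename": {"created_at": "created_date", "id": "external_id"},
    },
)
```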

Type Coercion

Override auto-inferred types when needed:

json
{
  "types": {
    "created_at": "timestamp",
    "amount": "float",
    "is_active": "boolean",
    "metadata": "json"
  }
}

Supported types: string, integer, float, boolean, timestamp, date, json.
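
As an illustration of what a type override does to a record, the Python sketch below maps each supported type name to a converter. The converters (ISO 8601 parsing, the set of strings treated as true) are assumptions, and `coerce` is a hypothetical helper.

```python
import json
from datetime import date, datetime


def to_bool(value):
    """Treat common truthy strings as True; fall back to bool() otherwise."""
    if isinstance(value, str):
        return value.strip().lower() in ("true", "1", "yes")
    return bool(value)


COERCERS = {
    "string": str,
    "integer": int,
    "float": float,
    "boolean": to_bool,
    "timestamp": datetime.fromisoformat,
    "date": date.fromisoformat,
    "json": json.dumps,
}


def coerce(record, types):
    """Convert the listed fields; leave nulls and unlisted fields untouched."""
    out = dict(record)
    for field, type_name in types.items():
        if out.get(field) is not None:
            out[field] = COERCERS[type_name](out[field])
    return out


row = coerce(
    {"created_at": "2024-01-15T10:30:00", "amount": "19.90",
     "is_active": "true", "metadata": {"k": 1}, "name": "x"},
    {"created_at": "timestamp", "amount": "float",
     "is_active": "boolean", "metadata": "json"},
)
```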

GraphQL

Set "type": "graphql" at the top level. Each stream supplies a query field; endpoint can be omitted and defaults to /graphql. Relay cursor pagination is supported natively.

json
{
  "version": "1",
  "base_url": "https://api.github.com",
  "type": "graphql",
  "auth": {"type": "bearer"},
  "streams": [
    {
      "name": "repositories",
      "query": "query($cursor: String) { viewer { repositories(first: 100, after: $cursor) { nodes { id name createdAt stargazerCount } pageInfo { hasNextPage endCursor } } } }",
      "data_path": "data.viewer.repositories.nodes",
      "pagination": {
        "type": "cursor",
        "cursor_variable": "cursor",
        "cursor_path": "data.viewer.repositories.pageInfo.endCursor",
        "has_more_path": "data.viewer.repositories.pageInfo.hasNextPage"
      },
      "primary_key": ["id"]
    }
  ]
}
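
The Relay-style loop implied by this spec can be sketched as follows: send the query with the cursor variable, read records from `data_path`, and continue while `has_more_path` is true. `fake_post` stands in for the HTTP POST and all helper names are hypothetical. The request body shape (`query` plus `variables`) is the usual GraphQL-over-HTTP convention.

```python
def get_path(payload, path):
    """Follow a dotted path through nested dicts."""
    node = payload
    for key in path.split("."):
        node = node[key]
    return node


def paginate_graphql(post, stream):
    """Yield records from every page of a Relay-style connection."""
    pg = stream["pagination"]
    cursor = None
    while True:
        body = {"query": stream["query"], "variables": {pg["cursor_variable"]: cursor}}
        response = post(body)
        yield from get_path(response, stream["data_path"])
        if not get_path(response, pg["has_more_path"]):
            break
        cursor = get_path(response, pg["cursor_path"])


def fake_post(body):
    """Stand-in for the API: two pages, selected by the cursor variable."""
    if body["variables"]["cursor"] is None:
        nodes, info = [{"id": "r1"}, {"id": "r2"}], {"hasNextPage": True, "endCursor": "c1"}
    else:
        nodes, info = [{"id": "r3"}], {"hasNextPage": False, "endCursor": "c2"}
    return {"data": {"viewer": {"repositories": {"nodes": nodes, "pageInfo": info}}}}


stream = {
    "query": "query($cursor: String) { viewer { repositories(first: 100, after: $cursor) { nodes { id } pageInfo { hasNextPage endCursor } } } }",
    "data_path": "data.viewer.repositories.nodes",
    "pagination": {
        "type": "cursor",
        "cursor_variable": "cursor",
        "cursor_path": "data.viewer.repositories.pageInfo.endCursor",
        "has_more_path": "data.viewer.repositories.pageInfo.hasNextPage",
    },
}
repos = list(paginate_graphql(fake_post, stream))
```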

Rate Limiting

The rate_limit block is optional. It configures request throttling, retries, and backoff:

json
{
  "rate_limit": {
    "requests_per_second": 10,
    "retry_on": [429, 500, 502, 503],
    "backoff": "exponential",
    "max_retries": 5,
    "respect_retry_after": true
  }
}

The connector respects Retry-After headers, uses exponential backoff with jitter, and isolates failures per stream.
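
The retry policy described above can be approximated with the Python sketch below. The 1-second base, the 60-second cap, and the "full jitter" variant are assumptions and not documented rawquery values. The sketch also handles only the seconds form of Retry-After, not the HTTP-date form.

```python
import random


def should_retry(status, attempt, rate_limit):
    """Retry only listed status codes, up to max_retries attempts."""
    return status in rate_limit["retry_on"] and attempt <= rate_limit["max_retries"]


def retry_delay(attempt, retry_after=None, respect_retry_after=True,
                base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (1-based)."""
    if respect_retry_after and retry_after is not None:
        return float(retry_after)
    # Exponential growth: base, 2*base, 4*base, ... capped at `cap`.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: pick a random point between 0 and the ceiling.
    return ceiling * rng()


config = {"retry_on": [429, 500, 502, 503], "max_retries": 5}
```

Passing `rng` explicitly keeps the function deterministic in tests; in normal use the default `random.random` spreads retries out so that parallel streams do not retry in lockstep.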

Schema Evolution

rawquery detects schema changes on every sync:

  • New fields - automatically added as nullable columns. Existing rows get nulls.
  • Removed fields - columns kept in the table, new rows get nulls.
  • Type conflicts - logged as warnings, existing types preserved.
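
The three rules above can be expressed as a small merge function. This Python sketch is illustrative only: `evolve_schema` is a hypothetical helper, and representing a schema as a column-to-type dict is an assumption.

```python
def evolve_schema(table_schema, incoming_schema):
    """Merge the schema seen in a sync into the table schema.

    Returns (merged_schema, warnings). Columns are never dropped, so fields
    that disappear from the API simply receive nulls in new rows.
    """
    merged = dict(table_schema)
    warnings = []
    for column, col_type in incoming_schema.items():
        if column not in merged:
            merged[column] = col_type  # new field: added as a nullable column
        elif merged[column] != col_type:
            warnings.append(
                f"type conflict on {column}: keeping {merged[column]}, saw {col_type}"
            )
    return merged, warnings


merged, warnings = evolve_schema(
    {"id": "integer", "name": "string", "legacy": "string"},
    {"id": "string", "name": "string", "email": "string"},
)
```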

Schema changes are reported in sync results. To update the spec (add endpoints, change fields):

bash
rq connections update my-api --spec updated-spec.json

Full Example

A complete spec with OAuth2, cursor pagination, data transformation, and incremental sync:

json
{
  "version": "1",
  "base_url": "https://api.hubspot.com",
  "auth": {
    "type": "oauth2",
    "token_url": "https://api.hubspot.com/oauth/v1/token",
    "refresh_url": "https://api.hubspot.com/oauth/v1/token",
    "grant_type": "refresh_token"
  },
  "rate_limit": {
    "requests_per_second": 10,
    "retry_on": [429, 500, 502, 503],
    "backoff": "exponential",
    "max_retries": 5
  },
  "streams": [
    {
      "name": "contacts",
      "endpoint": "/crm/v3/objects/contacts",
      "params": {"limit": 100, "properties": "firstname,lastname,email"},
      "pagination": {
        "type": "cursor",
        "cursor_path": "paging.next.after",
        "param": "after"
      },
      "data_path": "results",
      "primary_key": ["id"],
      "cursor_field": "updatedAt",
      "sync_modes": ["full_refresh", "incremental"],
      "flatten": {"properties": "lift"},
      "fields": {
        "include": ["id", "firstname", "lastname", "email", "updatedAt"]
      }
    }
  ]
}