Connections

Connect your data sources to start syncing and querying. Use built-in connectors, connect any HTTP or GraphQL API with a JSON spec, or push data from files or stdin. PostgreSQL and MySQL also support Live mode.

Supported Sources

Custom HTTP / GraphQL

Connect any API with a declarative JSON spec - any auth type, auto-pagination, schema evolution

PostgreSQL

Connect to any PostgreSQL database - Sync or Live mode

MySQL

Connect to any MySQL or MariaDB database - Sync or Live mode

Stripe

Customers, charges, invoices, subscriptions

HubSpot

Contacts, companies, deals from your CRM

Salesforce

Accounts, contacts, opportunities, and more (Experimental)

Shopify

Orders, products, customers from your store

Google Sheets

Sync data from Google Sheets spreadsheets

Push Data (rq push)

Push JSON, JSONL, or CSV from files or stdin - one-time imports, scripts, migrations
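
For the scripted use case, a program that feeds rq push through stdin only has to emit one JSON object per line. The sketch below shows that step alone: the record fields are made up for illustration, and the rq push arguments are left out because this page does not list them.

```python
import json
import sys
from typing import Iterable, Mapping


def to_jsonl(records: Iterable[Mapping[str, object]]) -> str:
    """Serialize records as JSONL: one compact JSON object per line."""
    return "".join(
        json.dumps(dict(record), separators=(",", ":")) + "\n"
        for record in records
    )


if __name__ == "__main__":
    # Hypothetical sample rows; a real script would load them from its own source.
    rows = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "b@example.com"},
    ]
    # Write to stdout so the stream can be piped into another command.
    sys.stdout.write(to_jsonl(rows))
```

Writing to stdout keeps the script independent of where the data ends up, so the same output can be redirected to a file and pushed later.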

Setup Wizard

Adding a connection follows a guided wizard:

Select type - Pick your data source (Stripe, Postgres, etc.)
Enter credentials - API keys, database credentials, or OAuth
Choose mode - Sync or Live (Postgres and MySQL only)
Test connection - Verify that credentials work before going further
Discover tables - See all available tables/streams and select which ones to sync (sync mode only)
Schedule - Choose sync frequency: manual, hourly, every 6h, daily, or weekly (sync mode only)
First sync - Trigger the initial data sync and start querying (sync mode only)

If the connection test fails, you can fix your credentials and retry without leaving the wizard. If you cancel mid-setup, the connection is cleaned up automatically.
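
To make the Schedule step concrete, the frequency options can be read as fixed intervals between syncs. This is a minimal sketch under that assumption; the names SCHEDULE_INTERVALS and next_sync are hypothetical, and the real scheduler may align runs to clock boundaries instead.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumed interval for each schedule option offered by the wizard.
# "manual" has no interval because it only runs when triggered by hand.
SCHEDULE_INTERVALS: Dict[str, Optional[timedelta]] = {
    "manual": None,
    "hourly": timedelta(hours=1),
    "every 6h": timedelta(hours=6),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def next_sync(last_sync: datetime, frequency: str) -> Optional[datetime]:
    """Return when the next sync would be due, or None for a manual schedule."""
    interval = SCHEDULE_INTERVALS[frequency]  # raises KeyError for unknown options
    return None if interval is None else last_sync + interval
```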

How Syncing Works

Click Sync on any connection to open the sync modal. You can select which tables to sync, see previous sync stats, and track progress in real time.

Pick tables - Select which tables/streams to sync (or select all)
Submit - The sync job runs in the background; you can close the modal
Track progress - The modal shows elapsed time and auto-updates when done
Review results - See per-table record counts and any errors

Syncs are asynchronous - large datasets won't block the UI. For each table, rawquery uses either a full refresh or incremental sync depending on what the source supports.
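
The difference between the two strategies can be illustrated with a generic sketch. It is not rawquery's implementation: the `id` key, the `updated_at` cursor field, and both function names are assumptions chosen for the example.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


def full_refresh(source_rows: List[Row]) -> List[Row]:
    """Replace the destination with a complete copy of the source."""
    return list(source_rows)


def incremental_sync(
    dest_rows: List[Row],
    source_rows: List[Row],
    cursor: Optional[str],
    key: str = "id",
    cursor_field: str = "updated_at",
) -> Tuple[List[Row], Optional[str]]:
    """Upsert only rows changed after `cursor`; return the rows and the new cursor.

    Cursor values are compared as strings, which works for ISO 8601 timestamps.
    """
    by_key = {row[key]: row for row in dest_rows}
    new_cursor = cursor
    for row in source_rows:
        value = str(row[cursor_field])
        if cursor is None or value > cursor:
            by_key[row[key]] = row  # insert, or overwrite the row with the same key
            if new_cursor is None or value > new_cursor:
                new_cursor = value
    return list(by_key.values()), new_cursor
```

A full refresh needs nothing from the source beyond the ability to list rows, while the incremental path depends on a field that only moves forward, which is why the choice depends on what each source exposes.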

Data is stored in the open Apache Iceberg format (Parquet under the hood). This means:

No vendor lock-in - export your data anytime
Efficient columnar storage - fast queries on large datasets
Standard format - works with any tool that reads Iceberg or Parquet

Table Naming

Tables are namespaced by the connection name you choose. For example, if you name your Stripe connection "my_stripe", the tables will be:

my_stripe.customers
my_stripe.charges
my_stripe.invoices
my_stripe.subscriptions
...
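
The naming rule above amounts to prefixing each table or stream with the connection name. A small sketch, where the helper name qualified_table_names is hypothetical:

```python
from typing import List


def qualified_table_names(connection_name: str, streams: List[str]) -> List[str]:
    """Prefix each stream with the connection name, as in my_stripe.customers."""
    return [f"{connection_name}.{stream}" for stream in streams]


names = qualified_table_names("my_stripe", ["customers", "charges"])
```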

After syncing, browse all your tables in the Lakehouse view, where you can see schemas, row counts, and column types, and preview data.