Vector Search

Turn any text column into a semantic search engine. rawquery embeds your data, creates a search endpoint, and optionally publishes a public search page. No vector database, no OpenAI key, no infrastructure.

How It Works

1. Connect and sync your data source (any connector works).
2. Embed a text column: rawquery generates a vector embedding for each row.
3. Create a search: this binds your data to a search configuration.
4. Search: text queries are embedded and matched against your data using vector similarity.

End users never deal with vectors, models, or dimensions: they type a search query and get results.
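To make the matching step concrete, here is a minimal Python sketch of a similarity search over embedded rows. It is an illustration, not rawquery's implementation: the use of cosine similarity is an assumption (the text above only says "vector similarity"), and the 3-dimensional toy vectors stand in for real 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, rows, k=10):
    """Score every row against the query and return the k best (score, row_id) pairs."""
    scored = [(cosine_similarity(query_vec, vec), row_id) for row_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors (assumption: real embeddings would come from the embedding model).
rows = [
    ("headphones", [0.9, 0.1, 0.0]),
    ("running shoes", [0.1, 0.9, 0.0]),
    ("earbuds", [0.8, 0.2, 0.1]),
]
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of the query text

results = top_k(query_vec, rows, k=2)
```

In this toy data the two audio products rank above the shoes because their vectors point in nearly the same direction as the query vector.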

Quick Start

bash

# 1. Embed a text column (creates {table}_embedded with FLOAT[768] vectors)
rq embed my_schema.my_table --column description

# 2. Create a search configuration
rq searches create product-search \
  --table my_schema.my_table \
  --column description \
  --display "name,category,price"

# 3. Search
rq search product-search "lightweight wireless headphones"

# 4. Publish a public search page (optional)
rq searches publish product-search

Embedding

rq embed reads a table, generates vector embeddings for one or more text columns, and writes a new table with an embedding column (FLOAT[768]).

bash

# Single column
rq embed crm.contacts --column notes

# Multiple columns (concatenated)
rq embed crm.contacts --column first_name --column company --column notes

# Custom output table name
rq embed crm.contacts --column notes --output crm.contacts_vectors
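For the multi-column case, the selected columns are concatenated into one text per row before embedding. The Python sketch below shows one plausible way to do that. The separator and the skipping of empty values are assumptions, since the exact rule is not described here, and `build_embedding_text` is a hypothetical helper name.

```python
def build_embedding_text(row, columns, separator=" "):
    """Join the non-empty values of the chosen columns into one string."""
    parts = []
    for col in columns:
        value = row.get(col)
        if value is None:
            continue  # assumption: NULL columns contribute nothing
        text = str(value).strip()
        if text:
            parts.append(text)
    return separator.join(parts)

# Made-up row for illustration only.
row = {"first_name": "Ada", "company": "Acme", "notes": None}
text = build_embedding_text(row, ["first_name", "company", "notes"])
```

Column order matters in a scheme like this, because the embedding model sees the values as one continuous piece of text.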

The embedding runs as an async job. The CLI polls and shows progress:

Embedding crm.contacts_embedded... 4500/12000 (37%)
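A polling loop of this kind can be sketched in Python as follows. This is a hedged illustration, not the CLI's code: `fetch_status` is a hypothetical caller-supplied function returning `(done, total)`, and truncating the percentage is an assumption inferred from the sample line above (4500/12000 shown as 37%).

```python
import time

def format_progress(table, done, total):
    """Render a progress line in the same shape as the CLI output above."""
    percent = done * 100 // total if total else 0  # integer division truncates
    return f"Embedding {table}... {done}/{total} ({percent}%)"

def wait_for_job(fetch_status, table, interval=2.0, sleep=time.sleep):
    """Poll fetch_status() until every row is processed; return the progress lines.

    fetch_status is a hypothetical callable returning (done, total). It stands
    in for whatever status call the real CLI makes.
    """
    lines = []
    while True:
        done, total = fetch_status()
        lines.append(format_progress(table, done, total))
        if done >= total:
            return lines
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.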

Model: nomic-embed-text-v1.5 (768 dimensions, Apache 2.0 license). Good quality, runs on CPU.

Embedding Quotas

Each plan has a separate embedding quota (independent from SQL queries):

Plan	Embeddings/month	Rate limit
Free	50	5/min
Team	5,000	30/min
Business	50,000	60/min

Check your usage with rq usage.
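If you call the embedding API from a script, you may want to pace requests to stay under the per-minute limit. The Python sketch below derives a minimum gap between requests from the table above. It assumes the limit counts individual requests spread evenly over a minute, which the table does not spell out, and `min_interval_seconds` is a hypothetical helper.

```python
# Per-minute rate limits copied from the table above.
RATE_LIMIT_PER_MIN = {"Free": 5, "Team": 30, "Business": 60}

def min_interval_seconds(plan):
    """Smallest gap between requests that stays within the plan's per-minute limit."""
    return 60.0 / RATE_LIMIT_PER_MIN[plan]

free_gap = min_interval_seconds("Free")
```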

Search Configuration

A search config binds your embedded table to a search endpoint:

bash

rq searches create my-search \
  --table my_schema.my_table \
  --column description \
  --display "name,category,price" \
  --limit 10 \
  --title "Product Search"

This auto-creates:

An embedded table reference ({table}_embedded)
A saved query with an embedding parameter (visible in Saved Queries with a blue "search" badge)
A search endpoint at /api/v1/workspaces/{workspace_id}/searches/{name}/run

Managing Searches

bash

rq searches                          # List all searches
rq searches show my-search           # Show details
rq searches delete my-search         # Delete (also deletes the saved query)

Running Searches

CLI

bash

rq search my-search "lightweight wireless headphones"

API

bash

curl -X POST \
  https://api.rawquery.dev/api/v1/workspaces/{workspace_id}/searches/{name}/run \
  -H "Authorization: Bearer rq_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "lightweight wireless headphones", "limit": 10}'
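The same request can be built from Python with only the standard library. This sketch mirrors the curl call above and deliberately stops short of parsing the response, since the response shape is not documented here. `build_search_request` and the workspace ID `ws_123` are placeholders for illustration.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.rawquery.dev/api/v1"

def build_search_request(workspace_id, name, api_key, query, limit=10):
    """Build (but do not send) the POST request for a search run."""
    safe_name = urllib.parse.quote(name, safe="")
    url = f"{API_BASE}/workspaces/{workspace_id}/searches/{safe_name}/run"
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# "ws_123" is a placeholder workspace ID, not a real one.
req = build_search_request(
    "ws_123", "my-search", "rq_your_api_key", "lightweight wireless headphones"
)
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```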

Dashboard

Go to Search in the sidebar, expand a search, and use the test input.

Public Search Pages

Publish a search to get a public URL with a search input and results table:

bash

rq searches publish my-search
# -> https://rawquery.dev/s/{token}

# With password protection
rq searches publish my-search --password "secret123"

# Unpublish
rq searches unpublish my-search

Public pages are server-rendered HTML with no JavaScript framework. They work on any device and can be embedded in iframes.

Use Cases

Customer search — "find contacts similar to enterprise SaaS decision maker"
Knowledge base — "what's our refund policy for enterprise clients?"
Product catalog — "lightweight running shoes under $100"
Support tickets — "authentication errors after password reset"
RAG retrieval — use the search API as the retrieval step in a RAG pipeline

RAG Integration

rawquery can be the retrieval backend in a RAG stack. The search API returns text results that you feed into your LLM:

bash

# Your app calls rawquery for retrieval
curl -X POST .../searches/knowledge-base/run \
  -d '{"query": "refund policy enterprise"}'
# -> returns top 10 matching text chunks

# Your app stuffs chunks into LLM prompt
# -> LLM generates grounded answer

One rawquery workspace replaces the five components (ingestion, chunking, embedding, vector store, retrieval API) that would otherwise come from separate vendors.
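The "stuff chunks into the prompt" step might look like the Python sketch below. It assumes your app has already extracted the matching text chunks from the search response as a list of strings (the response format is not specified here). The prompt wording, the `build_rag_prompt` helper, and the sample chunks are all made up for illustration.

```python
def build_rag_prompt(question, chunks, max_chunks=10):
    """Assemble a grounded prompt from retrieved text chunks."""
    numbered = [
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(chunks[:max_chunks], start=1)
    ]
    context = "\n".join(numbered)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

# Made-up chunks standing in for real search results.
chunks = [
    "Enterprise clients can request a refund within 60 days.",
    "Refunds are issued to the original payment method.",
]
prompt = build_rag_prompt("What is the refund policy for enterprise clients?", chunks)
```

Numbering the chunks lets the LLM cite which passage supports each part of its answer.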

Known Limitations

Embedding data processing location. Text is sent to our self-hosted EU embedding server for vector generation. The resulting vectors are stored in your EU-hosted lakehouse. All processing is EU-only. No data is retained by the embedding service after processing.

Search performance at scale. Vector search currently uses a brute-force similarity scan (no index), so query latency grows linearly with row count. Queries stay sub-second up to ~500,000 rows and slow down proportionally beyond that. We plan to add HNSW indexing for larger datasets in a future release.
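A rough back-of-the-envelope calculation shows why a full scan scales linearly. The Python sketch below counts the multiply-adds and bytes touched per query. It assumes each FLOAT is a 32-bit value and ignores everything except the vector comparison itself, so treat the numbers as an order-of-magnitude guide rather than a benchmark.

```python
DIMENSIONS = 768      # embedding width stated above
BYTES_PER_FLOAT = 4   # assumption: FLOAT is a 32-bit float

def scan_cost(rows, dims=DIMENSIONS):
    """Return (multiply-adds per query, bytes of vector data read) for a full scan."""
    multiply_adds = rows * dims
    bytes_scanned = rows * dims * BYTES_PER_FLOAT
    return multiply_adds, bytes_scanned

ops, size = scan_cost(500_000)
ops_double, size_double = scan_cost(1_000_000)
```

At 500,000 rows this comes to 384 million multiply-adds over about 1.5 GB of vector data per query, and both figures double when the row count doubles.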