Vector Search
Turn any text column into a semantic search engine. rawquery embeds your data, creates a search endpoint, and optionally publishes a public search page. No vector database, no OpenAI key, no infrastructure.
How It Works
- Connect and sync your data source (any connector works)
- Embed a text column — rawquery generates vector embeddings for each row
- Create a search — binds your data to a search configuration
- Search — text queries are embedded and matched against your data using vector similarity
Users never see vectors, models, or dimensions. You type a search query, you get results.
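To make step 4 concrete, here is a minimal sketch of similarity matching in plain Python. It assumes cosine similarity as the metric (this page only says "vector similarity"), and the three-dimensional toy vectors and the function names are illustrative stand-ins, not rawquery internals.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as no match
    return dot / (norm_a * norm_b)

def rank_rows(query_vector, rows, limit=10):
    """Score every row against the query and return the best row ids first."""
    scored = [(cosine_similarity(query_vector, vec), row_id) for row_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row_id for _, row_id in scored[:limit]]

# Toy 3-dimensional vectors stand in for real 768-dimensional embeddings.
rows = [
    ("headphones", [0.9, 0.1, 0.0]),
    ("running shoes", [0.0, 0.2, 0.9]),
    ("earbuds", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # pretend this is the embedded search text
top = rank_rows(query, rows, limit=2)
```

Rows whose vectors point in nearly the same direction as the query vector rank first, which is why semantically related text can match even without shared keywords.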
Quick Start
    # 1. Embed a text column (creates {table}_embedded with FLOAT[768] vectors)
    rq embed my_schema.my_table --column description

    # 2. Create a search configuration
    rq searches create product-search \
      --table my_schema.my_table \
      --column description \
      --display "name,category,price"

    # 3. Search
    rq search product-search "lightweight wireless headphones"

    # 4. Publish a public search page (optional)
    rq searches publish product-search

Embedding
rq embed reads a table, generates vector embeddings for one or more text columns, and writes a new table with an embedding column (FLOAT[768]).
    # Single column
    rq embed crm.contacts --column notes

    # Multiple columns (concatenated)
    rq embed crm.contacts --column first_name --column company --column notes

    # Custom output table name
    rq embed crm.contacts --column notes --output crm.contacts_vectors

The embedding runs as an async job. The CLI polls and shows progress:

    Embedding crm.contacts_embedded... 4500/12000 (37%)
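If you drive an embedding job from your own tooling, the poll-and-report loop might look like the sketch below. The `fetch_status` callable is hypothetical (this page does not document a job-status API), and flooring the percentage is an assumption that happens to agree with the 37% shown for 4500/12000.

```python
import time

def format_progress(table, done, total):
    """Render a progress line in the style of the CLI output above."""
    percent = done * 100 // total if total else 0  # integer division rounds down
    return f"Embedding {table}... {done}/{total} ({percent}%)"

def wait_for_job(fetch_status, table, interval_seconds=2.0, sleep=time.sleep):
    """Poll until the job reports that every row is embedded.

    fetch_status is a hypothetical callable returning (rows_done, rows_total).
    Returns the progress lines that were produced along the way.
    """
    lines = []
    while True:
        done, total = fetch_status()
        lines.append(format_progress(table, done, total))
        if done >= total:
            return lines
        sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.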
Model: nomic-embed-text-v1.5 (768 dimensions, Apache 2.0 license). Good quality, runs on CPU.
Embedding Quotas
Each plan has a separate embedding quota (independent from SQL queries):
| Plan | Embeddings/month | Rate limit |
|---|---|---|
| Free | 50 | 5/min |
| Team | 5,000 | 30/min |
| Business | 50,000 | 60/min |
Check your usage with rq usage.
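For client-side planning, the table above can be turned into a few small helpers. This is only a sketch: the dictionary mirrors the published numbers, the helper names are made up for illustration, and it assumes one request consumes one unit of quota, which this page does not spell out.

```python
import math

# Values copied from the quota table above.
PLANS = {
    "free": {"monthly_quota": 50, "per_minute": 5},
    "team": {"monthly_quota": 5_000, "per_minute": 30},
    "business": {"monthly_quota": 50_000, "per_minute": 60},
}

def min_interval_seconds(plan):
    """Smallest even spacing between requests that stays under the rate limit."""
    return 60.0 / PLANS[plan]["per_minute"]

def remaining(plan, used_this_month):
    """Quota left this month, never negative."""
    return max(PLANS[plan]["monthly_quota"] - used_this_month, 0)

def minutes_to_run(plan, requests):
    """Lower bound on wall-clock minutes needed to issue this many requests."""
    return math.ceil(requests / PLANS[plan]["per_minute"])
```

For example, on the Free plan `min_interval_seconds("free")` gives 12 seconds between requests.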
Search Configuration
A search config binds your embedded table to a search endpoint:
    rq searches create my-search \
      --table my_schema.my_table \
      --column description \
      --display "name,category,price" \
      --limit 10 \
      --title "Product Search"

This auto-creates:
- An embedded table reference ({table}_embedded)
- A saved query with an embedding parameter (visible in Saved Queries with a blue "search" badge)
- A search endpoint at /api/v1/workspaces/{workspace_id}/searches/{name}/run
Managing Searches
    rq searches                     # List all searches
    rq searches show my-search      # Show details
    rq searches delete my-search    # Delete (also deletes the saved query)

Running Searches
CLI
    rq search my-search "lightweight wireless headphones"

API
    curl -X POST \
      https://api.rawquery.dev/api/v1/workspaces/{workspace_id}/searches/{name}/run \
      -H "Authorization: Bearer rq_your_api_key" \
      -H "Content-Type: application/json" \
      -d '{"query": "lightweight wireless headphones", "limit": 10}'

Dashboard
Go to Search in the sidebar, expand a search, and use the test input.
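For programmatic use, the API request from this section can also be built with Python's standard library. The sketch below only constructs the request; the URL, headers, and body fields come from the curl example, while the function name and the `ws_123` workspace id are placeholders. The response format is not documented on this page, so parsing is left out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.rawquery.dev"

def build_search_request(workspace_id, name, query, api_key, limit=10):
    """Build (but do not send) a POST request for a search endpoint."""
    path = "/api/v1/workspaces/{}/searches/{}/run".format(
        urllib.parse.quote(workspace_id, safe=""),
        urllib.parse.quote(name, safe=""),
    )
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# "ws_123" is a placeholder workspace id, not a real value.
request = build_search_request(
    "ws_123", "my-search", "lightweight wireless headphones", "rq_your_api_key"
)
# To send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect or log the call before it touches the network.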
Public Search Pages
Publish a search to get a public URL with a search input and results table:
    rq searches publish my-search
    # -> https://rawquery.dev/s/{token}

    # With password protection
    rq searches publish my-search --password "secret123"

    # Unpublish
    rq searches unpublish my-search

Public pages are server-rendered HTML with no JavaScript framework. They work on any device and can be embedded in iframes.
Use Cases
- Customer search — "find contacts similar to enterprise SaaS decision maker"
- Knowledge base — "what's our refund policy for enterprise clients?"
- Product catalog — "lightweight running shoes under $100"
- Support tickets — "authentication errors after password reset"
- RAG retrieval — use the search API as the retrieval step in a RAG pipeline
RAG Integration
rawquery can be the retrieval backend in a RAG stack. The search API returns text results that you feed into your LLM:
    # Your app calls rawquery for retrieval
    curl -X POST .../searches/knowledge-base/run \
      -d '{"query": "refund policy enterprise"}'
    # -> returns top 10 matching text chunks

    # Your app stuffs chunks into LLM prompt
    # -> LLM generates grounded answer

Five vendors (ingestion, chunking, embedding, vector store, retrieval API) replaced by one rawquery workspace.
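The "stuff chunks into the prompt" step could look like the following sketch. It assumes your app has already extracted the retrieved text chunks into a list of strings (the response format is not documented on this page), and the function name, prompt wording, and character budget are illustrative choices rather than anything rawquery prescribes.

```python
def build_prompt(question, chunks, max_chars=4000):
    """Assemble a grounded prompt from retrieved text chunks.

    Chunks are numbered in retrieval order and added until the
    character budget would be exceeded.
    """
    context_parts = []
    used = 0
    for index, chunk in enumerate(chunks, start=1):
        entry = f"[{index}] {chunk.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before the context grows past the budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks lets the model cite which passage supported its answer, and the budget keeps the prompt within whatever context window your LLM allows.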
Known Limitations
Embedding data processing location. Text is sent to our self-hosted EU embedding server for vector generation. The resulting vectors are stored in your EU-hosted lakehouse. All processing is EU-only. No data is retained by the embedding service after processing.
Search performance at scale. Vector search currently uses a brute-force similarity scan (no index). Queries stay sub-second up to roughly 500,000 rows. Beyond that, query latency increases linearly with row count. We plan to add HNSW indexing for larger datasets in a future release.
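A rough back-of-the-envelope calculation shows why a full scan scales linearly. The sketch assumes each of the 768 vector components is a 4-byte float and counts one multiply-add per component per row; actual storage and execution details are not described on this page.

```python
DIMENSIONS = 768
BYTES_PER_FLOAT = 4  # assumption: FLOAT is 32-bit single precision

def scan_cost(row_count, dimensions=DIMENSIONS):
    """Estimate the work and data volume of one brute-force query."""
    multiply_adds = row_count * dimensions
    vector_bytes = row_count * dimensions * BYTES_PER_FLOAT
    return multiply_adds, vector_bytes

ops, size = scan_cost(500_000)
# ops  == 384_000_000 multiply-adds per query
# size == 1_536_000_000 bytes of vector data scanned (about 1.5 GB)
```

Doubling the row count doubles both numbers, which is the linear growth described above. An approximate index such as HNSW avoids visiting every row.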