
Push Data

Push JSON, JSONL, CSV, or Parquet data into Iceberg tables from files or stdin. There is no connector setup and no scheduling: just pipe data in.

Quick Start

bash
# Push a JSON file
rq push --table crm.contacts --file data.json
# With a primary key (for deduplication on subsequent pushes)
rq push --table crm.contacts --file data.json --pk id
# Replace the table entirely
rq push --table crm.contacts --file data.json --mode overwrite
# Push a Parquet file (up to 2 GB, streaming)
rq push --table raw.migration --file dump.parquet

Supported Formats

Format is auto-detected from the file extension. Override with --input-format.

bash
# JSON array (auto-detected from .json)
rq push --table raw.users --file users.json
# JSON lines - one object per line (auto-detected from .jsonl)
rq push --table raw.events --file events.jsonl
# CSV with headers (auto-detected from .csv)
rq push --table raw.products --file products.csv
# Parquet (auto-detected from .parquet) — typed, streaming, up to 2 GB
rq push --table raw.orders --file orders.parquet
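
As a rough illustration of extension-based detection, the function below maps a file name to the value you would pass to --input-format. It is a sketch, not rq's actual logic: the name detect_format is hypothetical, and the treatment of unrecognised or upper-case extensions (no match, non-zero exit) is an assumption.

```bash
# Hypothetical helper: map a file name to an --input-format value.
# Assumption: matching is case-sensitive and unknown extensions fail.
detect_format() {
  case "$1" in
    *.json)    echo json ;;
    *.jsonl)   echo jsonl ;;
    *.csv)     echo csv ;;
    *.parquet) echo parquet ;;
    *)         return 1 ;;   # no known extension: caller must choose
  esac
}

# Usage sketch: pass the format explicitly when the extension is unknown.
#   fmt=$(detect_format "$file") || fmt=jsonl
#   rq push --table raw.events --file "$file" --input-format "$fmt"
```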

Parquet

Parquet takes a dedicated streaming upload path. The file is sent as-is, with no JSON serialisation and no 100k-rows-per-request ceiling, and the server writes it out as a single Iceberg snapshot that may contain several data files.

  • Max size: 2 GB per push. Split larger files and push in sequence.
  • Stdin not supported: Parquet requires --file <path>.
  • Type fidelity: column types are taken from the Parquet schema, not inferred.
  • Schema evolution: new columns are added automatically on append; conflicting types are rejected (use --mode overwrite to replace the table instead).

Typical use cases:

bash
# Migrate from another warehouse (BigQuery / Snowflake unload → Parquet)
rq push --table analytics.events --file bq_export.parquet
# Bulk-load a pandas dataframe
# (df.to_parquet("out.parquet"))
rq push --table raw.scored --file out.parquet
# Replace a staging table
rq push --table staging.orders --file latest.parquet --mode overwrite
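
Because a push is limited to 2 GB, it can help to check the file size before uploading. The sketch below is one way to do that; the name parquet_fits is hypothetical, and it assumes "2 GB" means 2 × 1024³ bytes, which the limit above does not specify.

```bash
# Assumption: 2 GB is read as 2 * 1024^3 bytes; adjust if the limit is decimal.
PARQUET_LIMIT_BYTES=2147483648

# Hypothetical helper: succeed if the file is no larger than the limit.
# $1 = path, $2 = optional limit in bytes (defaults to PARQUET_LIMIT_BYTES).
parquet_fits() {
  [ -f "$1" ] || return 2                  # missing file: distinct exit code
  size=$(( $(wc -c < "$1") ))              # byte count, whitespace stripped
  [ "$size" -le "${2:-$PARQUET_LIMIT_BYTES}" ]
}

# Usage sketch:
#   parquet_fits dump.parquet && rq push --table raw.migration --file dump.parquet
```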

Stdin

Pipe data from any command (text formats only). Specify --input-format when reading from stdin.

bash
# Pipe from curl
curl -s https://api.example.com/users | rq push --table raw.users --input-format json
# Pipe from a script
python generate_data.py | rq push --table raw.generated --input-format jsonl
# Pipe CSV
cat export.csv | rq push --table imports.q4 --input-format csv

Write Modes

| Mode | Behaviour |
| --- | --- |
| append | Add rows to the table (default) |
| overwrite | Replace the entire table contents |

Schema Inference

For JSON / JSONL / CSV, types are inferred from the data:

  • Integers and floats detected from values
  • Mixed int + float fields promote to float
  • Mixed types fall back to string
  • Nested objects stored as JSON strings
  • Null values are handled gracefully
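
The promotion rules above can be sketched for a single column of values as follows. This is an illustration only, not rq's inference code: classify and infer_type are hypothetical names, empty lines stand in for nulls and are skipped, an all-null column falling back to string is an assumption, and only plain decimal notation is recognised as numeric.

```bash
# Hypothetical helper: classify one value as integer, float, or string.
classify() {
  v=${1#-}                                  # ignore one leading minus sign
  case "$v" in
    ''|*[!0-9.]*) echo string ;;            # empty or contains a non-numeric char
    *.*.*)        echo string ;;            # more than one decimal point
    *.*) case "$v" in
           .*|*.) echo string ;;            # needs digits on both sides of the point
           *)     echo float ;;
         esac ;;
    *)            echo integer ;;
  esac
}

# Hypothetical helper: read one value per line on stdin, print the column type.
infer_type() {
  kind=""
  while IFS= read -r value || [ -n "$value" ]; do
    [ -z "$value" ] && continue             # assumption: empty line = null, skipped
    t=$(classify "$value")
    if [ -z "$kind" ]; then kind=$t         # first non-null value sets the type
    elif [ "$kind" = "$t" ]; then :         # same type: nothing to change
    elif [ "$kind" != string ] && [ "$t" != string ]; then
      kind=float                            # mixed int + float promotes to float
    else
      kind=string                           # any other mix falls back to string
    fi
  done
  echo "${kind:-string}"                    # assumption: all-null column is string
}

# Usage sketch:
#   printf '1\n2.5\n' | infer_type
```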

Text formats are auto-chunked into 5,000-record batches for upload. Parquet is streamed row-group by row-group; no batching is needed on the client side.
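
To estimate how many upload batches a JSONL file will produce, divide the record count by 5,000 and round up. The sketch below does that arithmetic; batch_count is a hypothetical name, and it assumes one record per line with a trailing newline on the last line.

```bash
BATCH_SIZE=5000   # batch size stated above for text formats

# Hypothetical helper: print the number of batches for a JSONL file.
# Assumption: every line is one record and the file ends with a newline.
batch_count() {
  records=$(( $(wc -l < "$1") ))
  echo $(( (records + BATCH_SIZE - 1) / BATCH_SIZE ))   # ceiling division
}

# Usage sketch:
#   batch_count events.jsonl
```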

Flags

| Flag | Description |
| --- | --- |
| --table | Target table (schema.name format, required) |
| --file | Path to data file (omit for stdin; required for Parquet) |
| --pk | Primary key column for deduplication (text formats only) |
| --input-format | json, jsonl, csv, parquet (auto-detected from extension) |
| --mode | append (default) or overwrite |