# q — Run SQL Directly on CSV, TSV, and Log Files

> q (harelba/q) runs full SQLite-compatible SQL queries against CSVs, TSVs, and any delimited text file on disk, turning ad-hoc log digging into a one-line SQL statement.

## Install

```bash
brew install q
```

## Quick Use

```bash
# Sum a CSV column
q -H -d , "SELECT SUM(amount) FROM orders.csv WHERE status = 'paid'"

# Join two CSVs
q -H -d , "SELECT o.id, c.email FROM orders.csv o JOIN customers.csv c ON o.cid = c.id"

# Query stdin
cat access.log | q "SELECT c1, COUNT(*) FROM - GROUP BY c1 ORDER BY 2 DESC LIMIT 10"
```

## Introduction

`q` bridges the gap between awk and a full SQLite database: it loads your CSV/TSV/delimited file into an in-memory SQLite table on the fly, runs your SQL, and prints the result. It is perfect for when you have a log or an export and would rather write a `SELECT ... GROUP BY` than compose a pipeline of `cut | sort | uniq -c | sort -rn | head`.

## What q Does

- Treats any CSV/TSV/delimited file as a SQL table.
- Supports SQLite's full SQL dialect, including JOINs, window functions, and aggregates.
- Reads from stdin (`-`) for pipeline use.
- Treats the first row as a header with `-H`.
- Caches parsed data in `.qsql` files for repeat queries.

## Architecture Overview

q is a single Python script. It parses the SQL, identifies file references, loads each file into an in-memory SQLite table (with type inference), rewrites the query against the in-memory table names, and executes it. Optionally, it persists the in-memory database to disk as a `.qsql` file for instant reuse.

## Self-Hosting & Configuration

- Install via brew, apt, dnf, or pip.
- `-d ,` sets the input delimiter; `-D ,` sets the output delimiter.
- `-t` and `-T` select tab-delimited input and output, respectively.
- `-C readwrite` enables `.qsql` caching.
- A `.qrc` file in your home directory stores defaults.

## Key Features

- Full SQL against flat files, with no `awk` gymnastics.
- JOINs across multiple files.
- Stdin-friendly (`FROM -`).
- Caches parsed data for fast repeated queries.
- Type inference for numeric columns.

## Comparison with Similar Tools

- **csvkit**: a Python-based collection of single-purpose tools rather than one SQL engine.
- **miller**: verb-based grammar; q gives you real SQL instead.
- **duckdb CLI**: columnar SQL engine; faster on huge files, but a heavier install.
- **sqlite3 + .import**: the DIY version of what q automates.
- **xsv**: fast Rust CSV operations; no SQL.

## FAQ

**Q: How big a file can it handle?**
A: Up to available RAM; switch to duckdb for files of 10 GB and above.

**Q: How are numeric types handled?**
A: They are auto-inferred; override with SQL `CAST`s.

**Q: Does it work with gzipped files?**
A: Yes, via a pipe: `zcat file.csv.gz | q -H -d , "..."`.

**Q: Can tables persist between runs?**
A: Save them to `.qsql` files with `-C readwrite`.

## Sources

- https://github.com/harelba/q
- https://harelba.github.io/q
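The core loop described in the Architecture Overview (read a delimited file, create an in-memory SQLite table, run the SQL) can be sketched in a few lines of Python. This is a minimal illustration, not q's actual implementation; the `load_csv` helper, the table name, and the sample data are all hypothetical, and real q adds type inference, query rewriting, and `.qsql` caching on top.

```python
import csv
import io
import sqlite3

def load_csv(conn, name, text, delimiter=","):
    """Load delimited text into an in-memory SQLite table (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    placeholders = ", ".join("?" * len(header))
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', data)

conn = sqlite3.connect(":memory:")
load_csv(conn, "orders", "id,status,amount\n1,paid,10\n2,paid,5\n3,open,7\n")

# Columns are stored as text here, so we CAST explicitly; q infers types for you.
total = conn.execute(
    "SELECT SUM(CAST(amount AS REAL)) FROM orders WHERE status = 'paid'"
).fetchone()[0]
print(total)  # 15.0
```

Because everything ends up in an ordinary SQLite database, the full SQLite dialect (JOINs, window functions, aggregates) comes for free, which is exactly the property q exploits.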