# q — Run SQL Directly on CSV, TSV, and Log Files

> q (harelba/q) runs full SQLite-compatible SQL queries against CSVs, TSVs, and any delimited text file on disk, turning ad-hoc log digging into a one-line SQL statement.

## Install

```bash
brew install q
```

## Quick Use

```bash
# Sum a CSV column
q -H -d , "SELECT SUM(amount) FROM orders.csv WHERE status = 'paid'"

# Join two CSVs
q -H -d , "SELECT o.id, c.email FROM orders.csv o JOIN customers.csv c ON o.cid = c.id"

# Query stdin
cat access.log | q "SELECT c1, COUNT(*) FROM - GROUP BY c1 ORDER BY 2 DESC LIMIT 10"
```

## Introduction

`q` bridges the gap between awk and a full SQLite database: it loads your CSV/TSV/delimited file into an in-memory SQLite table on the fly, runs your SQL, and prints the result. It is perfect for when you have a log or an export and would rather write a `SELECT ... GROUP BY` than compose a pipeline of `cut | sort | uniq -c | sort -rn | head`.

## What q Does

- Treats any CSV/TSV/delimited file as a SQL table.
- Supports SQLite's full SQL dialect, including JOINs, window functions, and aggregates.
- Reads from stdin (`-`) for pipeline use.
- Treats the first row as a header with `-H`.
- Caches parsed data in `.qsql` files for repeat queries.

## Architecture Overview

q is a single Python script. It parses the SQL, identifies file references, loads each file into an in-memory SQLite table (with type inference), rewrites the query against the in-memory table names, and executes it. Optionally, it persists the in-memory database to disk as a `.qsql` file for instant reuse.

## Self-Hosting & Configuration

- Install via brew, apt, dnf, or pip.
- `-d ,` sets the input delimiter; `-D ,` sets the output delimiter.
- `-t` and `-T` select tab-delimited input and output, respectively.
- `-C readwrite` enables `.qsql` caching.
- A `.qrc` file in your home directory stores defaults.

## Key Features

- Full SQL against flat files, with no `awk` gymnastics.
- JOINs across multiple files.
- Stdin-friendly (`FROM -`).
- Caches parsed data for fast repeated queries.
- Type inference for numeric columns.

## Comparison with Similar Tools

- **csvkit**: a Python-based collection of single-purpose tools rather than one SQL engine.
- **miller**: verb-based grammar; q gives you real SQL instead.
- **duckdb CLI**: columnar SQL engine; faster on huge files, but a heavier install.
- **sqlite3 + .import**: the DIY version of what q automates.
- **xsv**: fast Rust CSV operations; no SQL.

## FAQ

**Q: How big a file can it handle?**
A: Up to available RAM; switch to duckdb for files of 10 GB and above.

**Q: How are numeric types handled?**
A: They are auto-inferred; override with SQL `CAST`s.

**Q: Does it work with gzipped files?**
A: Yes, via a pipe: `zcat file.csv.gz | q -H -d , "..."`.

**Q: Can tables persist between runs?**
A: Save them to `.qsql` files with `-C readwrite`.

## Sources

- https://github.com/harelba/q
- https://harelba.github.io/q
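The core loop described in the Architecture Overview (read a delimited file, create an in-memory SQLite table, run the SQL) can be sketched in a few lines of Python. This is a minimal illustration, not q's actual implementation; the `load_csv` helper, the table name, and the sample data are all hypothetical, and real q adds type inference, query rewriting, and `.qsql` caching on top.

```python
import csv
import io
import sqlite3

def load_csv(conn, name, text, delimiter=","):
    """Load delimited text into an in-memory SQLite table (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    placeholders = ", ".join("?" * len(header))
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', data)

conn = sqlite3.connect(":memory:")
load_csv(conn, "orders", "id,status,amount\n1,paid,10\n2,paid,5\n3,open,7\n")

# Columns are stored as text here, so we CAST explicitly; q infers types for you.
total = conn.execute(
    "SELECT SUM(CAST(amount AS REAL)) FROM orders WHERE status = 'paid'"
).fetchone()[0]
print(total)  # 15.0
```

Because everything ends up in an ordinary SQLite database, the full SQLite dialect (JOINs, window functions, aggregates) comes for free, which is exactly the property q exploits.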