q — Run SQL Directly on CSV, TSV, and Log Files
q (harelba/q) runs full SQLite-compatible SQL queries against CSVs, TSVs, and any delimited text file on disk — turning ad-hoc log digging into a one-line SQL statement.
Agent 可直接安装
这个资产可安装;Agent 先选择当前运行时、检查安装计划,再运行匹配命令。
npx -y tokrepo@latest install 4475af2f-38c4-11f1-9bc6-00163e2b0d79 --target codex先 dry-run 确认安装计划,再运行此命令。
What it is
q is a command-line tool that lets you run full SQLite-compatible SQL queries directly against CSV, TSV, and any delimited text file on disk. Instead of importing data into a database or writing awk/sed pipelines, you write SQL and q handles parsing, column detection, and query execution. It supports JOIN operations across multiple files, aggregations, and subqueries.
q targets data analysts, SREs, and developers who frequently analyze structured text files. It is particularly useful for one-off log analysis, CSV exploration, and quick data investigations where setting up a database would be overkill.
How it saves time or tokens
q eliminates the database import step for ad-hoc queries. Instead of creating a table, loading data, querying, and cleaning up, you write a single SQL statement against the file. For log analysis during incidents, this means getting answers in seconds rather than minutes. JOIN operations across multiple CSV files replace multi-step shell pipelines with a single readable query.
How to use
- Install q:
brew install q
- Run a SQL query against a CSV file:
q -H -d , "SELECT SUM(amount) FROM orders.csv WHERE status = 'paid'"
- Join multiple files:
q -H -d , "SELECT o.id, c.email FROM orders.csv o JOIN customers.csv c ON o.cid = c.id"
Example
Analyze web server access logs with SQL:
# Top 10 IPs by request count
q "SELECT c1 as ip, COUNT(*) as hits FROM access.log GROUP BY c1 ORDER BY hits DESC LIMIT 10"
# 404 errors in the last hour
q "SELECT c7 as path, COUNT(*) FROM access.log WHERE c9 = '404' GROUP BY c7 ORDER BY 2 DESC"
# Join access logs with a blocklist CSV
q -H "SELECT a.c1, b.reason FROM access.log a JOIN blocklist.csv b ON a.c1 = b.ip"
Columns without headers are referenced as c1, c2, c3, etc.
Related on TokRepo
- AI Tools for Database — Database tools including SQL clients and query optimizers
- AI Tools for DevOps — DevOps tools for log analysis and infrastructure management
Common pitfalls
- Without the
-Hflag, q treats the first row as data, not headers. Always use-Hfor CSV files with header rows. - The
-dflag sets the delimiter. Forgetting it for CSV files (default is whitespace) produces incorrect column parsing. - q loads the entire file into memory for query execution. Files larger than available RAM will cause out-of-memory errors. Use streaming tools for very large files.
常见问题
q uses SQLite's SQL dialect. This includes SELECT, WHERE, GROUP BY, HAVING, ORDER BY, JOIN, subqueries, and most SQLite functions. Window functions and CTEs are supported in newer versions.
Yes. q supports JOIN operations across multiple files. Use standard SQL JOIN syntax with file paths as table names. This replaces multi-step pipelines of grep, awk, and paste with a single SQL statement.
No. q only reads files and outputs query results. It never modifies the source files. All operations are read-only, making it safe to use on production log files.
When files lack headers (or when -H is not used), q assigns automatic column names: c1, c2, c3, etc. You reference these in your SQL queries. For files with headers, use -H to enable named columns.
For one-off queries, yes. q eliminates the import step entirely. For repeated queries on the same data, a proper database with indexes will be faster. q is designed for ad-hoc analysis, not recurring workloads.
引用来源 (3)
- q GitHub— q runs SQL queries directly on CSV and text files
- SQLite Documentation— SQLite-compatible SQL dialect
- q Documentation— Command-line data analysis tools
讨论
相关资产
watchexec — Run Commands When Files Change, with Smart Defaults
watchexec watches a directory tree and runs a command when anything changes. Cross-platform, respects .gitignore, debounces, restarts long-running processes — the swiss-army knife of file-watching.
sqlc — Generate Type-Safe Go Code from SQL
sqlc generates fully type-safe Go code from your SQL queries. Write SQL, run sqlc generate, and get Go functions with proper types for parameters and results — no ORM, no reflection, just compile-time safe database access.
WebLLM — Run Large Language Models Directly in the Browser
WebLLM is an MLC project that brings LLM inference to web browsers using WebGPU. It runs models like LLaMA, Mistral, and Phi entirely client-side with no server required, enabling private AI chat and text generation from any modern browser.
PostgresML — Machine Learning Inside PostgreSQL
PostgresML brings machine learning directly into PostgreSQL, allowing you to train models, run inference, and manage embeddings using SQL. No separate ML infrastructure needed — your database is your ML engine.