CLI ToolsMay 11, 2026·3 min read

sqlite-utils — Python + CLI for ETL Into SQLite

Simon Willison's Python library + CLI for getting messy CSV/JSON/YAML into SQLite. Auto-schema, upserts, joins, FTS indexing one-liners.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes the CLI command, metadata JSON, install plan, and raw content links so agents can judge fit, risk, and next actions.

Stage only · 15/100Stage only
Target
Claude Code, Codex, Gemini CLI
Kind
CLI Tool
Install
Single
Trust
Trust: New
Entrypoint
Asset
CLI install command
npx tokrepo install 19225473-fe66-4ace-9440-b908b703af81 --target codex
Intro

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Quick Use

  1. pip install sqlite-utils
  2. sqlite-utils insert data.db tablename file.csv --csv (or .json)
  3. sqlite-utils enable-fts data.db tablename col1 col2 --create-triggers for full-text

Intro

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Source & Thanks

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

🙏

Source & Thanks

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

Discussion

Sign in to join the discussion.
No comments yet. Be the first to share your thoughts.

Related Assets