Cette page est affichée en anglais. Une traduction française est en cours.
CLI ToolsMay 11, 2026·3 min de lecture

sqlite-utils — Python + CLI for ETL Into SQLite

Simon Willison's Python library + CLI for getting messy CSV/JSON/YAML into SQLite. Auto-schema, upserts, joins, FTS indexing one-liners.

Prêt pour agents

Cet actif peut être lu et installé directement par les agents

TokRepo expose une commande CLI universelle, un contrat d'installation, le metadata JSON, un plan selon l'adaptateur et le contenu raw pour aider les agents à juger l'adaptation, le risque et les prochaines actions.

Stage only · 15/100Stage only
Surface agent
Tout agent MCP/CLI
Type
CLI Tool
Installation
Single
Confiance
Confiance : New
Point d'entrée
Asset
Commande CLI universelle
npx tokrepo install 19225473-fe66-4ace-9440-b908b703af81
Introduction

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Quick Use

  1. pip install sqlite-utils
  2. sqlite-utils insert data.db tablename file.csv --csv (or .json)
  3. sqlite-utils enable-fts data.db tablename col1 col2 --create-triggers for full-text

Intro

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Source & Thanks

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

🙏

Source et remerciements

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

Fil de discussion

Connectez-vous pour rejoindre la discussion.
Aucun commentaire pour l'instant. Soyez le premier à partager votre avis.

Actifs similaires