Esta página se muestra en inglés. Una traducción al español está en curso.
CLI ToolsMay 11, 2026·3 min de lectura

sqlite-utils — Python + CLI for ETL Into SQLite

Simon Willison's Python library + CLI for getting messy CSV/JSON/YAML into SQLite. Auto-schema, upserts, joins, FTS indexing one-liners.

Listo para agents

Este activo puede ser leído e instalado directamente por agents

TokRepo expone el comando CLI, metadata JSON, plan de instalación y contenido raw para que los agents evalúen compatibilidad, riesgo y próximos pasos.

Stage only · 15/100Stage only
Destino
Claude Code, Codex, Gemini CLI
Tipo
CLI Tool
Instalación
Single
Confianza
Confianza: New
Entrada
Asset
Comando CLI de instalación
npx tokrepo install 19225473-fe66-4ace-9440-b908b703af81 --target codex
Introducción

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Quick Use

  1. pip install sqlite-utils
  2. sqlite-utils insert data.db tablename file.csv --csv (or .json)
  3. sqlite-utils enable-fts data.db tablename col1 col2 --create-triggers for full-text

Intro

sqlite-utils is the Python library + CLI that Simon Willison pairs with Datasette — get messy CSV / JSON / YAML / JSONL into SQLite with auto-schema inference, then enrich, transform, upsert, and index for full-text search in one-line commands. Best for: data journalism imports, log ingestion, agent memory stores, any 'I have data, I want it queryable' task. Works with: any modern Python + SQLite 3.31+. Setup time: 2 minutes.


Install

pip install sqlite-utils

CSV → SQLite (one command)

# Auto-detect column types from data, including dates and integers
sqlite-utils insert data.db articles articles.csv --csv

# JSON array of objects
curl https://api.example.com/articles | sqlite-utils insert data.db articles -

# JSONL stream
sqlite-utils insert data.db logs logs.jsonl --nl

Add indexes + full-text search

# B-tree index on a column
sqlite-utils create-index data.db articles category

# FTS5 full-text search index across columns
sqlite-utils enable-fts data.db articles title body --create-triggers
# Now this is fast:
sqlite-utils search data.db articles "machine learning"

Python API (richer than CLI)

import sqlite_utils
db = sqlite_utils.Database("data.db")

# Upsert with composite PK
db["articles"].upsert_all([
    {"id": 1, "title": "Hello", "category": "intro"},
    {"id": 2, "title": "World", "category": "intro"},
], pk="id")

# Add a computed column
db["articles"].add_column("word_count", int)
for row in db["articles"].rows:
    db["articles"].update(row["id"], {"word_count": len(row["body"].split())})

# Transform schema in place
db["articles"].transform(
    drop={"old_column"},
    rename={"body": "content"},
    column_order=["id", "title", "content"],
)

Pair with Datasette

sqlite-utils insert data.db log access.csv --csv
datasette serve data.db   # instant web UI + JSON API for the imported data

FAQ

Q: Why not just use pandas + to_sql? A: pandas is fine if you already use it. sqlite-utils is built for shell-first workflows and unfamiliar data — auto-schema, upserts, FTS, and JSON line streams are first-class CLI commands, not flags. Most journalism / log-wrangling tasks finish faster from the shell.

Q: Can it handle multi-million-row imports? A: Yes — SQLite handles billions of rows. sqlite-utils streams JSONL/CSV without loading the full file into memory. For very large imports, use --batch-size 1000 to control transaction size.

Q: Does it work with sqlite3 in WAL mode? A: Yes — sqlite-utils respects existing pragmas. For Datasette-served DBs, WAL mode is recommended so readers don't block writers during import.


Source & Thanks

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

🙏

Fuente y agradecimientos

Built by Simon Willison. Licensed under Apache-2.0.

simonw/sqlite-utils — ⭐ 1,700+

Discusión

Inicia sesión para unirte a la discusión.
Aún no hay comentarios. Sé el primero en compartir tus ideas.

Activos relacionados