# files-to-prompt — Concat Files Into LLM-Ready Prompts

> Simon Willison's CLI that walks a directory and concats files into one LLM-ready prompt with path markers. Pipes straight into Claude or the LLM CLI.

## Install

Install from PyPI: `pip install files-to-prompt`

## Quick Use

1. `pip install files-to-prompt`
2. `files-to-prompt src/ --cxml > prompt.xml`
3. Pipe into Claude: `files-to-prompt src/ | llm -m claude-3-5-sonnet 'your question'`

---

## Intro

files-to-prompt is Simon Willison's CLI that walks a directory tree and concatenates every file into one LLM-ready prompt, with path markers, gitignore awareness, and Claude-style XML wrapping options. It pipes naturally into Simon's LLM CLI or any model that reads stdin.

Best for: "paste my whole repo into Claude" workflows, codebase Q&A, refactor briefings, custom RAG ingestion.
Works with: any shell, Python 3.10+.
Setup time: 1 minute.

---

### Install + basic use

```bash
pip install files-to-prompt

# Concat a whole repo
files-to-prompt . > prompt.txt

# Specific extensions only
files-to-prompt . --extension .py --extension .ts > code.txt

# .gitignore is honored by default; no flag needed
files-to-prompt . > clean.txt

# Include gitignored files too
files-to-prompt . --ignore-gitignore > everything.txt
```

### Pipe into Claude / LLM CLI

```bash
# Via Simon's llm CLI
files-to-prompt src/ | llm -m claude-3-5-sonnet "Where is the auth bug?"

# Via plain curl to the Anthropic API
files-to-prompt src/ \
  | jq -Rs '{model:"claude-3-5-sonnet-20241022",max_tokens:4096,messages:[{role:"user",content:.}]}' \
  | curl -X POST https://api.anthropic.com/v1/messages \
    -H "anthropic-version: 2023-06-01" \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "content-type: application/json" \
    -d @-
```

### Claude-friendly XML wrapping

```bash
files-to-prompt src/ --cxml > prompt.xml
# Wraps each file in <document> tags with a <source>path</source> marker
# This format gets cited cleanly by Claude
```

### Exclude noise

```bash
files-to-prompt . \
  --extension .py \
  --ignore "*test*" \
  --ignore "*.pyc" \
  --ignore "venv/" \
  --ignore "node_modules/" \
  > prompt.txt
```

### Output format

```
path: src/main.py
---
def main(): ...

path: src/utils.py
---
def helper(): ...
```

LLMs handle this format reliably: every chunk carries its origin file path inline.

---

### FAQ

**Q: How big a repo can I dump?**

A: Bounded by your model's context window. With Claude's 200K-token context, roughly 500K characters fit. Models with 1M+ contexts (Grok-3, Gemini) hold entire mid-size repos. For anything larger, chunk + RAG, or use Claude's prompt caching.

**Q: Versus repomix / aider?**

A: Simpler. repomix adds token counting and AI-aware splitting; aider runs as an interactive coding agent. files-to-prompt does one thing: concatenation. Pipe its output anywhere. Pick repomix when you need token math, aider when you want an actual coding agent.

**Q: Binary files?**

A: Auto-skipped. files-to-prompt excludes files it cannot decode as text and prints a warning, so binaries never pollute the prompt.

---

## Source & Thanks

> Built by [Simon Willison](https://github.com/simonw). Licensed under Apache-2.0.
>
> [simonw/files-to-prompt](https://github.com/simonw/files-to-prompt) — ⭐ 750+

---

Source: https://tokrepo.com/en/workflows/files-to-prompt-concat-files-into-llm-ready-prompts
Author: Simon Willison
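
### Estimate prompt size before sending

The FAQ bounds repo size by the model's context window using a rough 4-characters-per-token rule of thumb. That check is easy to script. This is a minimal sketch, not part of files-to-prompt: the real count depends on the model's tokenizer, and the 200K / 4,096 figures are illustrative defaults.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate: ~4 characters per token for English text and code."""
    return len(text) // chars_per_token


def fits_context(text: str, context_tokens: int = 200_000, reply_reserve: int = 4_096) -> bool:
    """Check whether a prompt fits a context window, leaving room for the reply."""
    return estimate_tokens(text) <= context_tokens - reply_reserve


# Gate a generated prompt before piping it to a model
prompt = "x" * 400_000          # stand-in for open("prompt.txt").read()
print(estimate_tokens(prompt))  # 100000
print(fits_context(prompt))     # True: ~100K tokens fits a 200K window
```

For an exact count, run the text through the target model's own tokenizer instead of the heuristic.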
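
### Parse the output back into files

Because every chunk carries its origin path, the concatenated prompt can be split back into per-file pieces, which is handy for custom RAG ingestion. A sketch assuming the `path:`-marker layout shown in the output-format section; the tool's exact separators may differ between its format options.

```python
import re


def split_prompt(prompt: str) -> dict[str, str]:
    """Split 'path: <file>' delimited output back into {path: contents}."""
    files: dict[str, str] = {}
    current: str | None = None
    body: list[str] = []
    for line in prompt.splitlines():
        m = re.match(r"^path: (.+)$", line)
        if m:
            if current is not None:
                files[current] = "\n".join(body).strip()
            current, body = m.group(1), []
        elif line == "---" and not body:
            continue  # the separator line directly under a path marker
        else:
            body.append(line)
    if current is not None:
        files[current] = "\n".join(body).strip()
    return files
```

For example, `split_prompt(open("prompt.txt").read())` yields one dict entry per source file.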
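
### How binary detection typically works

The FAQ says binaries are auto-skipped. A common heuristic for this (a generic sketch, not files-to-prompt's own code) is to sniff the first few KB for NUL bytes or undecodable UTF-8:

```python
def looks_binary(path: str, sniff_bytes: int = 8192) -> bool:
    """Heuristic text/binary check: NUL bytes or undecodable UTF-8 in the
    first few KB usually mean the file is not text.

    Caveat: decoding a fixed-size chunk can split a multi-byte character
    at the boundary; acceptable for a heuristic, not strict validation.
    """
    with open(path, "rb") as f:
        chunk = f.read(sniff_bytes)
    if b"\x00" in chunk:
        return True
    try:
        chunk.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

A check like this is why `.pyc` files, images, and archives never reach the prompt even without an explicit `--ignore` pattern.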