Configs · Apr 15, 2026 · 3 min read

Tree-sitter — Incremental Parser Generator for Editors

Tree-sitter is a general-purpose parser generator and incremental parsing library. It powers fast, robust syntax highlighting, code folding, and structural editing in editors such as Neovim, Zed, and Helix, as well as code views on GitHub.

Introduction

Tree-sitter generates fast, incremental parsers and a small C runtime you can embed in editors. Given a declarative grammar.js, it produces an LR(1) parser with a GLR fallback that, after each edit, re-parses only the changed ranges; this is the key trick that makes language-aware editing feel instant. It powers syntax highlighting and structural navigation in Neovim, Zed, Helix, Atom (where it was born), GitHub's code view, and Sourcegraph. The project has over 24,000 GitHub stars.

What Tree-sitter Does

  • Generates a parser in C from a JavaScript grammar definition — deterministic and reproducible.
  • Parses source incrementally: a tiny edit triggers re-parsing of just the affected subtree.
  • Ships a query language (S-expressions) for extracting highlights, locals, folds, and text objects.
  • Handles error recovery gracefully so you always get a (partial) tree, even for syntactically broken code.
  • Runs the same parser in editors (C runtime), browsers (WASM build), and CLIs.
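
Incremental parsing, as described in the list above, depends on the parser being told which bytes changed. The following is a minimal sketch in plain Python, with no Tree-sitter dependency, of deriving such an edit by comparing the old and new buffers. The three field names mirror the byte offsets in Tree-sitter's C TSInputEdit struct; the Edit class and the compute_edit helper are hypothetical and exist only for illustration. Real editors usually know the edit directly from the keystroke and do not need to diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    start_byte: int    # first byte that differs between the buffers
    old_end_byte: int  # end of the replaced span in the old text
    new_end_byte: int  # end of the replacement span in the new text


def compute_edit(old: bytes, new: bytes) -> Edit:
    """Derive one edit from the common prefix and suffix of two buffers."""
    limit = min(len(old), len(new))

    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    suffix = 0
    # The shared suffix must not overlap the shared prefix in either buffer.
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1

    return Edit(prefix, len(old) - suffix, len(new) - suffix)


old_src = b"let x = 1;"
new_src = b"let x = 42;"
edit = compute_edit(old_src, new_src)
print(edit)  # Edit(start_byte=8, old_end_byte=9, new_end_byte=10)
```

Only the span from byte 8 to byte 9 of the old text ("1") was replaced, so everything outside that span is a candidate for reuse.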

Architecture Overview

Tree-sitter compiles an LR(1) table with GLR fallback for ambiguous productions, emitted as a single C file. At runtime the parser walks the input, producing a compact concrete syntax tree (CST) stored as packed node structs. When the text changes, the editor reports the edited range, which tells the parser which nodes can be reused; the parser then stitches the old tree together with a fresh parse of the dirty range. Queries are compiled to bytecode and matched against subtrees to produce highlight captures or drive structural edits. There are over 200 community grammars, including C, Rust, Go, Python, TypeScript, HTML, Bash, and TOML.
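
The reuse step can be illustrated with a toy model. The sketch below is not Tree-sitter's actual algorithm, which reuses whole subtrees while checking parser state; it only shows the bookkeeping idea on a flat list of leaf nodes. The Node class and the split_reusable function are hypothetical names, and the node kinds are made up for the example. Nodes before the edit are kept as-is, nodes after it are shifted by the length change, and nodes touching the edit are marked dirty.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    kind: str
    start: int  # start byte (inclusive)
    end: int    # end byte (exclusive)


def split_reusable(
    nodes: List[Node], start_byte: int, old_end_byte: int, new_end_byte: int
) -> Tuple[List[Node], List[Node]]:
    """Partition old leaf nodes into reusable (shifted if needed) and dirty."""
    delta = new_end_byte - old_end_byte
    reusable: List[Node] = []
    dirty: List[Node] = []
    for n in nodes:
        if n.end < start_byte:
            # Entirely before the edit: offsets are unchanged.
            reusable.append(n)
        elif n.start > old_end_byte:
            # Entirely after the edit: same node, shifted offsets.
            reusable.append(Node(n.kind, n.start + delta, n.end + delta))
        else:
            # Overlaps or directly touches the edit: must be re-parsed.
            # Adjacent nodes count as dirty because a token could grow into them.
            dirty.append(n)
    return reusable, dirty


# Leaves for the old text "x = 1 + y"; the edit replaces "1" with "42".
old_leaves = [
    Node("identifier", 0, 1),
    Node("=", 2, 3),
    Node("number", 4, 5),
    Node("+", 6, 7),
    Node("identifier", 8, 9),
]
reusable, dirty = split_reusable(old_leaves, start_byte=4, old_end_byte=5, new_end_byte=6)
print(dirty)     # [Node(kind='number', start=4, end=5)]
print(reusable)  # the other four leaves, with "+" and "y" shifted right by one byte
```

In this toy case only the number literal needs a fresh parse; the two nodes after it keep their identity and simply move one byte to the right.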

Self-Hosting & Configuration

  • Use the tree-sitter CLI (written in Rust, installable via cargo or npm) to author grammars; tree-sitter generate is the only required command.
  • Embed the generated C parser in your editor or tool with the ~100 KB runtime (C or WASM).
  • Keep highlight queries in queries/highlights.scm — most editors load them automatically.
  • Use tree-sitter test with corpus fixtures to verify grammar changes don't regress.
  • Publish grammars to npm or crates.io as prebuilt binaries for fast installs.

Key Features

  • Truly incremental parsing — sub-millisecond re-parse on keystrokes.
  • Rich query language for highlights, folds, indents, and textobjects.
  • Error-resilient trees for editor workflows where code is constantly incomplete.
  • C, Rust, Python, Go, Node, Swift, and WASM bindings.
  • Massive grammar ecosystem maintained by language communities.
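
To make the query idea concrete, here is a toy sketch in plain Python of what a one-level pattern such as (call (identifier) @function) asks for: every identifier that is a direct child of a call node, tagged with a capture name. It does not use Tree-sitter or parse real S-expressions; the Node class, the walk and captures helpers, and the hand-built tree are all hypothetical, and the node kinds are chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def captures(root: Node, parent_kind: str, child_kind: str, name: str) -> List[Tuple[str, str]]:
    """Toy stand-in for the pattern (parent_kind (child_kind) @name)."""
    found = []
    for node in walk(root):
        if node.kind == parent_kind:
            for child in node.children:
                if child.kind == child_kind:
                    found.append((name, child.text))
    return found


# Hand-built tree for the source text: print(x)
tree = Node("module", children=[
    Node("call", children=[
        Node("identifier", "print"),
        Node("argument_list", children=[Node("identifier", "x")]),
    ]),
])

result = captures(tree, "call", "identifier", "function")
print(result)  # [('function', 'print')]
```

The identifier x is not captured because it sits under argument_list, not directly under call. An editor would map the capture name (here "function") to a highlight style.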

Comparison with Similar Tools

  • Pygments/TextMate grammars — regex-based tokenizers; fast but fragile, no real AST.
  • LSP servers (rust-analyzer, etc.) — give semantic info, but heavier and language-specific; tree-sitter complements them.
  • ANTLR — powerful parser generator, better for compilers; no incremental parsing story.
  • Lezer (CodeMirror 6) — similar incremental parser in TypeScript; tighter CodeMirror integration.
  • Lark/Parsec — general parsers for tooling; neither targets editor latency.

FAQ

Q: Do I need to write a grammar to use tree-sitter? A: No — if a grammar already exists for your language, just install it. Writing grammars is for when you need new language support.

Q: Can tree-sitter replace an LSP? A: It replaces the lexer and parser pieces but not semantic analysis. Most editors run both side by side.

Q: How big is the runtime? A: Around 100 KB of C; each grammar adds 100-500 KB depending on complexity.

Q: Is it usable in the browser? A: Yes — tree-sitter.wasm runs in all modern browsers and Node.
