Knowledge · May 14, 2026 · 3 min read

Awesome-Agent-Harness — Survey of Agent Harnesses

A curated survey-style list for agent harness evaluation and tooling, useful as a reading map; verified 199★, pushed 2026-05-14.

Agent ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Native · 94/100 · Policy: allow

Agent surface: Any MCP/CLI agent

Kind: Memory

Install: None

Trust: Established

Entrypoint: Open README

Universal CLI install command:

npx tokrepo install 80ea0ac1-ca1a-5377-83ed-406b4e47c497

Intro

Best for: Agent builders who want a grounded reading list for harness design and evaluation

Works with: Any stack; this is a survey/awesome list you can browse and cite in docs

Setup time: 2-5 minutes

Key facts (verified)

  • GitHub: 199 stars · 6 forks · pushed 2026-05-14.
  • License: CC-BY-4.0 · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: Open README.

Main

  • Use it as a sourcing index: jump from the list to primary papers/repos, then build your own benchmark set.

  • Extract evaluation dimensions: turn repeated criteria into a checklist for your harness (context, tools, memory, safety).

  • Keep a local notes file: for each referenced harness, record setup time, supported tools, and failure modes.

  • Prefer primary citations: when copying claims into docs, link to the original repo/paper, not a secondary summary.
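
The checklist and notes-file steps above could be kept in a small structured record. The following is a minimal sketch, not part of the upstream repo: the `HarnessNote` class, its field names, the `to_jsonl` helper, and the example harness name and URL are all hypothetical. Only the four dimensions (context, tools, memory, safety) and the three recorded items (setup time, supported tools, failure modes) come from the list above.

```python
import json
from dataclasses import asdict, dataclass, field

# The four evaluation dimensions named in the list above; extend as needed.
DIMENSIONS = ("context", "tools", "memory", "safety")


@dataclass
class HarnessNote:
    """One local note per harness found through the survey (hypothetical schema)."""

    name: str
    source_url: str  # link to the primary repo or paper, not a secondary summary
    setup_minutes: float
    supported_tools: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)
    # One yes/no entry per dimension, all unchecked until you have reviewed it.
    checklist: dict[str, bool] = field(
        default_factory=lambda: {d: False for d in DIMENSIONS}
    )

    def coverage(self) -> float:
        """Fraction of checklist dimensions marked as satisfied."""
        return sum(self.checklist.values()) / len(self.checklist)


def to_jsonl(notes: list[HarnessNote]) -> str:
    """Serialize notes as JSON Lines, one harness per line."""
    return "\n".join(json.dumps(asdict(n), sort_keys=True) for n in notes)


# Placeholder values for illustration only; not taken from the survey.
note = HarnessNote(
    name="example-harness",
    source_url="https://example.com/primary-source",
    setup_minutes=12,
    supported_tools=["shell"],
    failure_modes=["loses context on long tasks"],
)
note.checklist["tools"] = True
print(note.coverage())  # 0.25
print(to_jsonl([note]))
```

Writing one JSON line per harness keeps the notes easy to diff and to extend when you add a dimension to the checklist.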

README (excerpt)

Agent Harness for Large Language Model Agents: A Survey

H=(E,T,C,S,L,V) Six-Component Architecture

This repo is actively maintained. If you find it useful, please star the repo to stay updated and help others find it.


The agent execution harness — not the model — is the primary determinant of agent reliability at scale.
This survey formalizes the harness as a first-class architectural object H = (E, T, C, S, L, V), surveys 110+ papers, blogs and reports across 23 systems, and maps 9 open technical challenges.
📄 Read the Paper
🌐 Preprints Version (v3)
✉️ Corrections & suggestions: gloriamenng@gmail.com (Qianyu Meng); wangyanan@mail.dlut.edu.cn (Yanan Wang); chenliyi@xiaohongshu.com (Liyi Chen)

If you find this survey useful, please cite:

@article{meng2026agentharness,
  title     = {Agent Harness for Large Language Model Agents: A Survey},
  author    = {Meng, Qianyu and Wang, Yanan and Chen, Liyi and Wu, Wei and
               Li, Yihang and Jiang, Wenyuan and Wang, Qimeng and
               Lu, Chengqiang and Gao, Yan and Wu, Yi and Hu, Yao},
  year      = {2026},
  doi       = {10.20944/preprints202604.0428.v3},
  url       = {https://www.preprints.org/manuscript/202604.0428/v3},
}


### Source-backed notes

- The repo is CC-BY-4.0 licensed (verified via GitHub API).
- GitHub API verification confirms the repo URL and recent push date.
- README functions as a curated survey/reading map (content is primarily links and structure).

### FAQ

- **Is it an implementation?** No. It is primarily a survey/awesome list that helps you find harness tools and papers.
- **Can I reuse content?** Yes. The license is CC-BY-4.0; attribute appropriately when reusing text.
- **How do I turn it into action?** Pick 3–5 harnesses, run the same questions/tasks on each, and record the results as your baseline benchmark.
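
The last answer can be sketched as a small loop that runs identical tasks against each harness and records one row per pair. This is an illustration under stated assumptions, not a method from the survey: it assumes each harness can be wrapped as a Python callable that takes a task string and returns an output string, and the `run_baseline` function, the row fields, and the two stub harnesses are hypothetical.

```python
import time
from typing import Callable, Optional

# Assumption: each harness is wrapped as a function from task text to output text.
Harness = Callable[[str], str]


def run_baseline(harnesses: dict[str, Harness], tasks: list[str]) -> list[dict]:
    """Run every task on every harness and record one result row per pair."""
    rows = []
    for name, run in harnesses.items():
        for task in tasks:
            output: Optional[str] = None
            error: Optional[str] = None
            start = time.perf_counter()
            try:
                output = run(task)
            except Exception as exc:  # record the failure instead of aborting the run
                error = repr(exc)
            rows.append(
                {
                    "harness": name,
                    "task": task,
                    "output": output,
                    "error": error,
                    "seconds": time.perf_counter() - start,
                }
            )
    return rows


# Stub harnesses for illustration only; replace with wrappers around real ones.
def echo_harness(task: str) -> str:
    return f"echo: {task}"


def failing_harness(task: str) -> str:
    raise RuntimeError("tool call timed out")


results = run_baseline(
    {"echo": echo_harness, "failing": failing_harness},
    ["summarize the README", "list open issues"],
)
for row in results:
    print(row["harness"], row["task"], row["error"])
```

Catching exceptions per task keeps one failing harness from hiding results for the others, and the recorded error text can feed the failure-mode notes.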

Source & Thanks

Created by Gloriaameng. Licensed under CC-BY-4.0.

Gloriaameng/Awesome-Agent-Harness — ⭐ 199

Thanks to the upstream maintainers and contributors for publishing this work under an open license.
