[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"workflow-coze-loop-agent-prompt-eval-and-observability-hub-68d7a657":3,"seo:featured-workflow:68d7a657-2e23-506c-8963-368882308d34:zh":39,"workflow-related-coze-loop-agent-prompt-eval-and-observability-hub-68d7a657-68d7a657-2e23-506c-8963-368882308d34":82},{"id":4,"uuid":5,"slug":6,"title":7,"description":8,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":14,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":16,"files":23,"tags":24,"has_voted":30,"visibility":19,"share_token":13,"is_featured":12,"content_hash":31,"asset_kind":28,"target_tools":32,"install_mode":36,"entrypoint":37,"risk_profile":38,"dependencies":40,"verification":45,"agent_metadata":48,"agent_fit":59,"trust":70,"provenance":78,"created_at":80,"updated_at":81},3275,"68d7a657-2e23-506c-8963-368882308d34","coze-loop-agent-prompt-eval-and-observability-hub","Coze Loop — Agent Prompt, Eval, and Observability Hub","Coze Loop unifies prompt iteration, evaluation, and trace observability, helping agent teams debug workflows without jumping across separate tools.","8a910fec-3180-11f1-9bc6-00163e2b0d79","Agent Toolkit","https:\u002F\u002Ftokrepo.com\u002Fapple-touch-icon.png",0,"",12,"en",[17],{"id":18,"step_order":19,"title":20,"description":13,"prompt_template":21,"variables":13,"depends_on":22,"expected_output":13},3838,1,"Asset","## Quick Use\n\n1. Clone and enter the repo:\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fcoze-dev\u002Fcoze-loop.git\n   cd coze-loop\n   ```\n2. Configure `release\u002Fdeployment\u002Fdocker-compose\u002Fconf\u002Fmodel_config.yaml`, then boot locally:\n   ```bash\n   make compose-up\n   ```\n3. Verify:\n   - Open `http:\u002F\u002Flocalhost:8082` and confirm prompt playground, evaluation, and trace views load.\n\n## Intro\n\nCoze Loop unifies prompt iteration, evaluation, and trace observability, helping agent teams debug workflows without jumping across separate tools.\n\n- **Best for:** teams that want prompt debugging, evals, and traces in a single operator console\n- **Works with:** Docker Compose, Helm\u002FKubernetes, model config files, browser-based ops UI\n- **Setup time:** 30-60 minutes\n\n## Practical Notes\n\n- Quant: the README exposes two deployment paths out of the box: Docker Compose for local trials and Helm for Kubernetes.\n- Quant: the default local entry lands on `http:\u002F\u002Flocalhost:8082`, which makes environment verification straightforward.\n\n## Why it matters\n\nCoze Loop is useful when agent teams are already iterating prompts and evaluations, but the evidence is scattered across notebooks, ad-hoc dashboards, and model vendor consoles.\n\n- Prompt development, evaluation, and observability are described as connected modules, which matches how real agent incidents are usually debugged.\n- The repo documents both Compose and Helm flows, so it can graduate from a local lab to a shared cluster without switching products.\n- The README includes explicit security warnings for public deployments, a good sign that the project understands exposure risks.\n\n## Rollout pattern\n\n- Start in Docker Compose with one model configuration and one evaluation set before you touch Helm.\n- Promote only after you can tie a bad answer back to a prompt diff or trace record inside the same system.\n- Treat internet exposure as a separate security review because the maintainers call out SSRF and privilege risks directly.\n\n## Watchouts\n\nThis is not a no-config SaaS toy: you need to own model credentials, deployment topology, and public-network hardening before treating it as a shared operations plane.\n\n### FAQ\n\n**Q: Can I try it without Kubernetes?**\nA: Yes. The README puts Docker Compose first for local deployment and Helm second for cluster rollout.\n\n**Q: What problem does it solve best?**\nA: It centralizes prompt iteration, evaluation data, and traces so debugging is less fragmented.\n\n**Q: What is the main risk?**\nA: Publishing it on a public network without hardening; the maintainers explicitly warn about security review first.\n\n## Source & Thanks\n\n> Source: https:\u002F\u002Fgithub.com\u002Fcoze-dev\u002Fcoze-loop\n> License: Apache-2.0\n> GitHub stars: 5,452 · forks: 755\n\n---\n\n\u003C!-- ZH -->\n\n## 快速使用\n\n1. 克隆并进入仓库：\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fcoze-dev\u002Fcoze-loop.git\n   cd coze-loop\n   ```\n2. 配置 `release\u002Fdeployment\u002Fdocker-compose\u002Fconf\u002Fmodel_config.yaml`，然后本地启动：\n   ```bash\n   make compose-up\n   ```\n3. 验证：\n   - 打开 `http:\u002F\u002Flocalhost:8082`，确认 Prompt Playground、评测与 Trace 页面都能访问。\n\n## 简介\n\nCoze Loop 把提示词调试、评测实验与全链路观测放进同一个开源平台，适合希望在同一控制台里同时定位 Agent 工作流问题、比较实验结果并长期保留执行证据、排障线索和复盘沉淀资料的团队。\n\n- **适合谁：** 希望把提示词调试、评测和执行链路观测合并到一个控制台里的团队\n- **可搭配：** Docker Compose、Helm\u002FKubernetes、模型配置文件与浏览器运维界面\n- **准备时间：** 30-60 分钟\n\n## 实战建议\n\n- 量化信息：README 直接给了两种部署路径，分别是本地 Docker Compose 和面向 Kubernetes 的 Helm。\n- 量化信息：本地默认入口是 `http:\u002F\u002Flocalhost:8082`，环境验证路径清晰。 \n\n## 为什么值得收录\n\n如果你的团队已经在做提示词迭代和 Agent 评测，但证据散落在多个控制台和脚本里，Coze Loop 就会显得很有价值。\n\n- 它把 Prompt 开发、评测与观测写成互相关联的模块，更贴近真实 Agent 故障定位流程。\n- 同时提供 Compose 与 Helm 路径，意味着它既能本地试用，也能逐步迁移到共享集群。\n- README 对公网部署风险写得很直接，说明项目对安全暴露面有清醒认识。\n\n## 落地路径\n\n- 先在 Docker Compose 中固定 1 套模型配置与 1 套评测集，不要一开始就上 Helm。\n- 只有当你能在同一个系统里把坏结果追溯到 prompt diff 或 trace 记录时，再考虑推广。\n- 公网暴露必须当成单独的安全审查事项，因为维护者明确提到了 SSRF 与权限风险。 \n\n## 注意事项\n\n它不是“零配置即用”的玩具控制台，模型密钥、部署拓扑和公网加固都需要你自己负责。\n\n### FAQ\n\n**没有 Kubernetes 也能试吗？**\n答：可以。README 先给了 Docker Compose，本地跑通后再考虑 Helm。\n\n**它最适合解决什么问题？**\n答：把提示词迭代、评测数据和执行链路放回同一个系统里，减少排障碎片化。\n\n**最大的风险是什么？**\n答：未经加固就直接公网部署；维护者已经明确提醒先做安全评估。\n\n## 来源与感谢\n\n> Source: https:\u002F\u002Fgithub.com\u002Fcoze-dev\u002Fcoze-loop\n> License: Apache-2.0\n> GitHub stars: 5,452 · forks: 755\n","0",[],[25],{"id":26,"name":27,"slug":28,"icon":29},11,"Scripts","script","📜",false,"e0bbbc4113b5a8dd3aea2a5429bb0ba26cc72455c170d4f877d2b4842f19642d",[33,34,35],"claude_code","codex","gemini_cli","single","make compose-up",{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},null,{"npm":41,"pip":42,"brew":43,"system":44},[],[],[],[],{"commands":46,"expected_files":47},[],[20],{"asset_kind":28,"target_tools":49,"install_mode":36,"entrypoint":37,"risk_profile":50,"dependencies":51,"content_hash":31,"verification":56},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":52,"pip":53,"brew":54,"system":55},[],[],[],[],{"commands":57,"expected_files":58},[],[20],{"target":34,"score":60,"status":61,"policy":61,"why":62,"asset_kind":28,"install_mode":36},29,"stage_only",[63,64,65,66,67,68,69],"target_tools includes codex","asset_kind script","install_mode single","markdown-only","policy stage_only","asset_kind script is not activated directly for Codex","trust established",{"author_trust_level":71,"verified_publisher":30,"asset_signed_hash":31,"signature_status":72,"install_count":12,"report_count":12,"dangerous_capability_badges":73,"review_status":74,"signals":75},"established","hash_only",[28],"unreviewed",[76,77],"author has published assets","content hash available",{"owner_uuid":9,"owner_name":10,"source_url":79,"content_hash":31,"visibility":19,"created_at":80,"updated_at":81},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fcoze-loop-agent-prompt-eval-and-observability-hub","2026-05-12 22:02:43","2026-05-14 00:40:01",[83,133,176,219],{"id":84,"uuid":85,"slug":86,"title":87,"description":88,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":89,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":90,"files":39,"tags":91,"has_voted":30,"visibility":19,"share_token":13,"is_featured":12,"content_hash":93,"asset_kind":28,"target_tools":94,"install_mode":36,"entrypoint":95,"risk_profile":96,"dependencies":97,"verification":102,"agent_metadata":105,"agent_fit":116,"trust":118,"provenance":121,"created_at":123,"updated_at":124,"__relatedScore":125,"__relatedReasons":126,"__sharedTags":131},3236,"ee57174a-de3d-4b53-85c4-34bb754e90d1","judgeval-tracing-evaluation-for-agent-apps","Judgeval — Tracing + Evaluation for Agent Apps","Judgeval adds tracing and evaluation to agent apps, helping teams score behavior and monitor live traffic with a small SDK and dashboard workflow.",18,[],[92],{"id":26,"name":27,"slug":28,"icon":29},"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",[33,34,35],"judgeval",{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":98,"pip":99,"brew":100,"system":101},[],[],[],[],{"commands":103,"expected_files":104},[],[],{"asset_kind":28,"target_tools":106,"install_mode":36,"entrypoint":95,"risk_profile":107,"dependencies":108,"content_hash":93,"verification":113},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":109,"pip":110,"brew":111,"system":112},[],[],[],[],{"commands":114,"expected_files":115},[],[],{"target":34,"score":60,"status":61,"policy":61,"why":117,"asset_kind":28,"install_mode":36},[63,64,65,66,67,68,69],{"author_trust_level":71,"verified_publisher":30,"asset_signed_hash":93,"signature_status":72,"install_count":12,"report_count":12,"dangerous_capability_badges":119,"review_status":74,"signals":120},[28],[76,77],{"owner_uuid":9,"owner_name":10,"source_url":122,"content_hash":93,"visibility":19,"created_at":123,"updated_at":124},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fjudgeval-tracing-evaluation-for-agent-apps","2026-05-12 16:06:05","2026-05-14 10:51:51",90.91813040142924,[127,128,129,130],"topic-match","same-kind","same-target","same-author",[28,132],"scripts",{"id":134,"uuid":135,"slug":136,"title":137,"description":138,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":26,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":139,"files":39,"tags":140,"has_voted":30,"visibility":19,"share_token":13,"is_featured":12,"content_hash":93,"asset_kind":28,"target_tools":142,"install_mode":36,"entrypoint":143,"risk_profile":144,"dependencies":145,"verification":150,"agent_metadata":153,"agent_fit":164,"trust":166,"provenance":169,"created_at":171,"updated_at":172,"__relatedScore":173,"__relatedReasons":174,"__sharedTags":175},3216,"2414f9d2-b727-454b-9613-f45278226743","agents-cli-agent-build-eval-deploy-skills-for-coders","agents-cli — Agent Build\u002FEval\u002FDeploy Skills for Coders","agents-cli installs a CLI + skills so your coding assistant can scaffold, evaluate, and deploy production agents on Google Cloud with repeatable commands.",[],[141],{"id":26,"name":27,"slug":28,"icon":29},[33,34,35],"README.md",{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":146,"pip":147,"brew":148,"system":149},[],[],[],[],{"commands":151,"expected_files":152},[],[],{"asset_kind":28,"target_tools":154,"install_mode":36,"entrypoint":143,"risk_profile":155,"dependencies":156,"content_hash":93,"verification":161},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":157,"pip":158,"brew":159,"system":160},[],[],[],[],{"commands":162,"expected_files":163},[],[],{"target":34,"score":60,"status":61,"policy":61,"why":165,"asset_kind":28,"install_mode":36},[63,64,65,66,67,68,69],{"author_trust_level":71,"verified_publisher":30,"asset_signed_hash":93,"signature_status":72,"install_count":12,"report_count":12,"dangerous_capability_badges":167,"review_status":74,"signals":168},[28],[76,77],{"owner_uuid":9,"owner_name":10,"source_url":170,"content_hash":93,"visibility":19,"created_at":171,"updated_at":172},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fagents-cli-agent-build-eval-deploy-skills-for-coders","2026-05-12 13:29:47","2026-05-14 10:50:39",89.61877186907144,[127,128,129,130],[28,132],{"id":177,"uuid":178,"slug":179,"title":180,"description":181,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":182,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":183,"files":39,"tags":184,"has_voted":30,"visibility":19,"share_token":13,"is_featured":12,"content_hash":93,"asset_kind":28,"target_tools":186,"install_mode":36,"entrypoint":143,"risk_profile":187,"dependencies":188,"verification":193,"agent_metadata":196,"agent_fit":207,"trust":209,"provenance":212,"created_at":214,"updated_at":215,"__relatedScore":216,"__relatedReasons":217,"__sharedTags":218},3153,"73cd67c3-9db6-48ed-8a31-c082f618168e","agent-evaluation-test-virtual-agents-in-ci","Agent Evaluation — Test Virtual Agents in CI","Agent Evaluation is a Python framework that runs repeatable, scored tests for virtual agents, so teams can catch regressions automatically in CI.",14,[],[185],{"id":26,"name":27,"slug":28,"icon":29},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":189,"pip":190,"brew":191,"system":192},[],[],[],[],{"commands":194,"expected_files":195},[],[],{"asset_kind":28,"target_tools":197,"install_mode":36,"entrypoint":143,"risk_profile":198,"dependencies":199,"content_hash":93,"verification":204},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":200,"pip":201,"brew":202,"system":203},[],[],[],[],{"commands":205,"expected_files":206},[],[],{"target":34,"score":60,"status":61,"policy":61,"why":208,"asset_kind":28,"install_mode":36},[63,64,65,66,67,68,69],{"author_trust_level":71,"verified_publisher":30,"asset_signed_hash":93,"signature_status":72,"install_count":12,"report_count":12,"dangerous_capability_badges":210,"review_status":74,"signals":211},[28],[76,77],{"owner_uuid":9,"owner_name":10,"source_url":213,"content_hash":93,"visibility":19,"created_at":214,"updated_at":215},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fagent-evaluation-test-virtual-agents-in-ci","2026-05-12 07:08:04","2026-05-14 08:17:15",87.76413688858352,[127,128,129,130],[28,132],{"id":220,"uuid":221,"slug":222,"title":223,"description":224,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":14,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":225,"files":39,"tags":226,"has_voted":30,"visibility":19,"share_token":13,"is_featured":12,"content_hash":93,"asset_kind":28,"target_tools":228,"install_mode":36,"entrypoint":143,"risk_profile":229,"dependencies":230,"verification":235,"agent_metadata":238,"agent_fit":249,"trust":251,"provenance":254,"created_at":256,"updated_at":257,"__relatedScore":258,"__relatedReasons":259,"__sharedTags":260},3138,"d0c42c72-3d97-4d61-8b02-a98f6a54e9c3","ag2-open-source-agentos-for-multi-agent-systems","AG2 — Open-Source AgentOS for Multi-Agent Systems","AG2 (formerly AutoGen) is an open-source framework for building cooperating AI agents with tool use, human-in-the-loop workflows, and patterns.",[],[227],{"id":26,"name":27,"slug":28,"icon":29},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":231,"pip":232,"brew":233,"system":234},[],[],[],[],{"commands":236,"expected_files":237},[],[],{"asset_kind":28,"target_tools":239,"install_mode":36,"entrypoint":143,"risk_profile":240,"dependencies":241,"content_hash":93,"verification":246},[33,34,35],{"executes_code":30,"modifies_global_config":30,"requires_secrets":39,"uses_absolute_paths":30,"network_access":30},{"npm":242,"pip":243,"brew":244,"system":245},[],[],[],[],{"commands":247,"expected_files":248},[],[],{"target":34,"score":60,"status":61,"policy":61,"why":250,"asset_kind":28,"install_mode":36},[63,64,65,66,67,68,69],{"author_trust_level":71,"verified_publisher":30,"asset_signed_hash":93,"signature_status":72,"install_count":12,"report_count":12,"dangerous_capability_badges":252,"review_status":74,"signals":253},[28],[76,77],{"owner_uuid":9,"owner_name":10,"source_url":255,"content_hash":93,"visibility":19,"created_at":256,"updated_at":257},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fag2-open-source-agentos-for-multi-agent-systems","2026-05-12 04:56:52","2026-05-14 00:51:12",87.67091502846026,[127,128,129,130],[28,132]]