# CKA-Agent — Trojan Knowledge Attack Agent (Research) > Research code for studying trojan-knowledge attacks in agent systems, with reproducible scripts and configs; verified 203★, pushed 2026-05-13. ## Install Copy the content below into your project: ## Quick Use ```bash python -m venv .venv && source .venv/bin/activate # README uses uv for deps; follow the repo for exact pins: uv pip install accelerate fastchat nltk pandas google-genai httpx[socks] anthropic ``` ## Intro Research code for studying trojan-knowledge attacks in agent systems, with reproducible scripts and configs; verified 203★, pushed 2026-05-13. **Best for:** Researchers and security-minded agent builders evaluating knowledge poisoning risks **Works with:** Python tooling; README includes `uv pip` installs and experiment dependencies **Setup time:** 20-45 minutes ### Key facts (verified) - GitHub: 203 stars · 45 forks · pushed 2026-05-13. - License: AGPL-3.0 · owner avatar + repo URL verified via GitHub API. - README-backed entrypoint: `uv pip install ...`. ## Main - Treat it as a security lab: run experiments in an isolated environment and record the exact dependency set used. - Use it to build test cases: trojan knowledge scenarios can become unit/regression tests for your retrieval + tool pipeline. - Map the attack surface: separate poisoning in static docs vs retrieval corpora vs tool outputs so mitigations are targeted. - Export results as artifacts: logs, prompts, and configs are as important as code when reproducing agent-security claims. ### README (excerpt) **[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search**

## 🛡️ Defense towards CKA [TurnGate](https://github.com/Graph-COM/TurnGate) is a response-aware defense mechanism designed to detect and mitigate hidden malicious intent in multi-turn dialogue systems. Defending state-of-the-art multi-turn malicious attacks like CKA-Agent, achieving great defense performance while avoiding overrefusal. ## 🔥 Latest Results on Frontier Models (Dec 2025) CKA-Agent demonstrates consistent high attack success rates against the latest frontier models, including **GPT-5.2**, **Gemini-3.0-Pro**, and **Claude-Haiku-4.5**. The results are summarized below: ### Source-backed notes - README shows `uv pip install` commands for installing experiment dependencies. - Repo is AGPL-3.0 licensed (verified via GitHub API). - The repository positions itself as a reproducible implementation for agent-security research (per README wording). ### FAQ - **Is this meant for production use?**: It’s primarily research code; use it to evaluate and harden your own systems. - **How do I install dependencies?**: Follow the README `uv pip install ...` instructions and keep versions pinned for reproducibility. - **What license applies?**: AGPL-3.0 (verified via GitHub license metadata). ## Source & Thanks > Created by [Graph-COM](https://github.com/Graph-COM). Licensed under AGPL-3.0. > > [Graph-COM/CKA-Agent](https://github.com/Graph-COM/CKA-Agent) — ⭐ 203 Thanks to the upstream maintainers and contributors for publishing this work under an open license. --- ## Quick Use ```bash python -m venv .venv && source .venv/bin/activate # README uses uv for deps; follow the repo for exact pins: uv pip install accelerate fastchat nltk pandas google-genai httpx[socks] anthropic ``` ## Intro CKA-Agent 提供针对 agent 系统的“木马知识”攻击研究代码与复现实验脚本/配置，适合做安全评估、对抗实验与基准复现并总结防护要点；已验证 203★，更新于 2026-05-13。 **Best for:** 想评估知识投毒/木马知识风险的研究者与安全敏感型 agent 开发者 **Works with:** Python 工具链；README 包含 `uv pip` 安装与实验依赖说明 **Setup time:** 20-45 minutes ### Key facts (verified) - GitHub：203 stars · 45 forks；最近更新 2026-05-13。 - 许可证：AGPL-3.0；作者头像与仓库链接均已通过 GitHub API 复核。 - README 中可对照的入口：`uv pip install ...`。 ## Main - 把它当安全实验室：在隔离环境里跑实验，并记录精确依赖版本，保证可复现。 - 用它生成测试用例：把木马知识场景沉淀成检索/工具链的回归测试。 - 拆分攻击面：区分静态文档、检索语料、工具输出三类污染来源，针对性加固。 - 把结果导出成可审计工件：日志、prompts、configs 与代码同等重要。 ### README (excerpt) **[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search**

Model	HarmBench				StrongREJECT
Model	FS ↑	PS ↑	V ↓	R ↓	FS ↑	PS ↑	V ↓	R ↓
🟢 GPT-5.2	0.889	0.079	0.024	0.008	0.932	0.056	0.006	0.006
🟣 Gemini-3.0-Pro	0.881	0.087

### Source-backed notes - README 给出 `uv pip install` 命令用于安装实验依赖。 - 仓库为 AGPL-3.0 许可证（已通过 GitHub API 复核）。 - 仓库 README 将其定位为可复现实验实现，用于 agent 安全研究。 ### FAQ - **适合直接上生产吗？**：主要面向研究；更适合用来评估并加固你自己的系统。 - **依赖怎么装？**：按 README 的 `uv pip install ...` 说明安装，并建议锁定版本以保证复现。 - **许可证是什么？**：AGPL-3.0（已通过 GitHub 许可证元数据复核）。 ## Source & Thanks > Created by [Graph-COM](https://github.com/Graph-COM). Licensed under AGPL-3.0. > > [Graph-COM/CKA-Agent](https://github.com/Graph-COM/CKA-Agent) — ⭐ 203 --- Source: https://tokrepo.com/en/workflows/cka-agent-trojan-knowledge-attack-agent-research Author: Agent Toolkit

Model	HarmBench				StrongREJECT
Model	FS ↑	PS ↑	V ↓	R ↓	FS ↑	PS ↑	V ↓	R ↓
🟢 GPT-5.2	0.889	0.079	0.024	0.008	0.932	0.056	0.006	0.006
🟣 Gemini-3.0-Pro	0.881	0.087