# CKA-Agent — Trojan Knowledge Attack Agent (Research)
> Research code for studying trojan-knowledge attacks in agent systems, with reproducible scripts and configs; verified 203★, pushed 2026-05-13.
## Install
Copy the content below into your project:
## Quick Use
```bash
python -m venv .venv && source .venv/bin/activate
# README uses uv for deps; follow the repo for exact pins:
uv pip install accelerate fastchat nltk pandas google-genai httpx[socks] anthropic
```
## Intro
Research code for studying trojan-knowledge attacks in agent systems, with reproducible scripts and configs; verified 203★, pushed 2026-05-13.
**Best for:** Researchers and security-minded agent builders evaluating knowledge poisoning risks
**Works with:** Python tooling; README includes `uv pip` installs and experiment dependencies
**Setup time:** 20-45 minutes
### Key facts (verified)
- GitHub: 203 stars · 45 forks · pushed 2026-05-13.
- License: AGPL-3.0 · owner avatar + repo URL verified via GitHub API.
- README-backed entrypoint: `uv pip install ...`.
## Main
- Treat it as a security lab: run experiments in an isolated environment and record the exact dependency set used.
- Use it to build test cases: trojan knowledge scenarios can become unit/regression tests for your retrieval + tool pipeline.
- Map the attack surface: separate poisoning in static docs vs retrieval corpora vs tool outputs so mitigations are targeted.
- Export results as artifacts: logs, prompts, and configs are as important as code when reproducing agent-security claims.
### README (excerpt)
**[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search**
## 🛡️ Defense towards CKA
[TurnGate](https://github.com/Graph-COM/TurnGate) is a response-aware defense mechanism designed to detect and mitigate hidden malicious intent in multi-turn dialogue systems. Defending state-of-the-art multi-turn malicious attacks like CKA-Agent, achieving great defense performance while avoiding overrefusal.
## 🔥 Latest Results on Frontier Models (Dec 2025)
CKA-Agent demonstrates consistent high attack success rates against the latest frontier models, including **GPT-5.2**, **Gemini-3.0-Pro**, and **Claude-Haiku-4.5**. The results are summarized below:
| Model |
HarmBench |
StrongREJECT |
| FS ↑ |
PS ↑ |
V ↓ |
R ↓ |
FS ↑ |
PS ↑ |
V ↓ |
R ↓ |
| 🟢 GPT-5.2 |
0.889 |
0.079 |
0.024 |
0.008 |
0.932 |
0.056 |
0.006 |
0.006 |
| 🟣 Gemini-3.0-Pro |
0.881 |
0.087 |
### Source-backed notes
- README shows `uv pip install` commands for installing experiment dependencies.
- Repo is AGPL-3.0 licensed (verified via GitHub API).
- The repository positions itself as a reproducible implementation for agent-security research (per README wording).
### FAQ
- **Is this meant for production use?**: It’s primarily research code; use it to evaluate and harden your own systems.
- **How do I install dependencies?**: Follow the README `uv pip install ...` instructions and keep versions pinned for reproducibility.
- **What license applies?**: AGPL-3.0 (verified via GitHub license metadata).
## Source & Thanks
> Created by [Graph-COM](https://github.com/Graph-COM). Licensed under AGPL-3.0.
>
> [Graph-COM/CKA-Agent](https://github.com/Graph-COM/CKA-Agent) — ⭐ 203
Thanks to the upstream maintainers and contributors for publishing this work under an open license.
---
## Quick Use
```bash
python -m venv .venv && source .venv/bin/activate
# README uses uv for deps; follow the repo for exact pins:
uv pip install accelerate fastchat nltk pandas google-genai httpx[socks] anthropic
```
## Intro
CKA-Agent 提供针对 agent 系统的“木马知识”攻击研究代码与复现实验脚本/配置,适合做安全评估、对抗实验与基准复现并总结防护要点;已验证 203★,更新于 2026-05-13。
**Best for:** 想评估知识投毒/木马知识风险的研究者与安全敏感型 agent 开发者
**Works with:** Python 工具链;README 包含 `uv pip` 安装与实验依赖说明
**Setup time:** 20-45 minutes
### Key facts (verified)
- GitHub:203 stars · 45 forks;最近更新 2026-05-13。
- 许可证:AGPL-3.0;作者头像与仓库链接均已通过 GitHub API 复核。
- README 中可对照的入口:`uv pip install ...`。
## Main
- 把它当安全实验室:在隔离环境里跑实验,并记录精确依赖版本,保证可复现。
- 用它生成测试用例:把木马知识场景沉淀成检索/工具链的回归测试。
- 拆分攻击面:区分静态文档、检索语料、工具输出三类污染来源,针对性加固。
- 把结果导出成可审计工件:日志、prompts、configs 与代码同等重要。
### README (excerpt)
**[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search**
## 🛡️ Defense towards CKA
[TurnGate](https://github.com/Graph-COM/TurnGate) is a response-aware defense mechanism designed to detect and mitigate hidden malicious intent in multi-turn dialogue systems. Defending state-of-the-art multi-turn malicious attacks like CKA-Agent, achieving great defense performance while avoiding overrefusal.
## 🔥 Latest Results on Frontier Models (Dec 2025)
CKA-Agent demonstrates consistent high attack success rates against the latest frontier models, including **GPT-5.2**, **Gemini-3.0-Pro**, and **Claude-Haiku-4.5**. The results are summarized below:
| Model |
HarmBench |
StrongREJECT |
| FS ↑ |
PS ↑ |
V ↓ |
R ↓ |
FS ↑ |
PS ↑ |
V ↓ |
R ↓ |
| 🟢 GPT-5.2 |
0.889 |
0.079 |
0.024 |
0.008 |
0.932 |
0.056 |
0.006 |
0.006 |
| 🟣 Gemini-3.0-Pro |
0.881 |
0.087 |
### Source-backed notes
- README 给出 `uv pip install` 命令用于安装实验依赖。
- 仓库为 AGPL-3.0 许可证(已通过 GitHub API 复核)。
- 仓库 README 将其定位为可复现实验实现,用于 agent 安全研究。
### FAQ
- **适合直接上生产吗?**:主要面向研究;更适合用来评估并加固你自己的系统。
- **依赖怎么装?**:按 README 的 `uv pip install ...` 说明安装,并建议锁定版本以保证复现。
- **许可证是什么?**:AGPL-3.0(已通过 GitHub 许可证元数据复核)。
## Source & Thanks
> Created by [Graph-COM](https://github.com/Graph-COM). Licensed under AGPL-3.0.
>
> [Graph-COM/CKA-Agent](https://github.com/Graph-COM/CKA-Agent) — ⭐ 203
---
Source: https://tokrepo.com/en/workflows/cka-agent-trojan-knowledge-attack-agent-research
Author: Agent Toolkit