What is CKA-Agent — Trojan Knowledge Attack Agent (Research)?

Research code for studying trojan-knowledge attacks in agent systems, with reproducible scripts and configs; verified 203★, pushed 2026-05-13.

Is CKA-Agent — Trojan Knowledge Attack Agent (Research) free to use?

Yes. CKA-Agent — Trojan Knowledge Attack Agent (Research) is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install CKA-Agent — Trojan Knowledge Attack Agent (Research)?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

CKA-Agent — Trojan Knowledge Attack Agent (Research)

Main

Treat it as a security lab: run experiments in an isolated environment and record the exact dependency set used.
Use it to build test cases: trojan knowledge scenarios can become unit/regression tests for your retrieval + tool pipeline.
Map the attack surface: separate poisoning in static docs vs retrieval corpora vs tool outputs so mitigations are targeted.
Export results as artifacts: logs, prompts, and configs are as important as code when reproducing agent-security claims.

README (excerpt)

[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search

🛡️ Defense towards CKA

TurnGate is a response-aware defense mechanism designed to detect and mitigate hidden malicious intent in multi-turn dialogue systems. Defending state-of-the-art multi-turn malicious attacks like CKA-Agent, achieving great defense performance while avoiding overrefusal.

🔥 Latest Results on Frontier Models (Dec 2025)

CKA-Agent demonstrates consistent high attack success rates against the latest frontier models, including GPT-5.2, Gemini-3.0-Pro, and Claude-Haiku-4.5. The results are summarized below:

Source-backed notes

README shows uv pip install commands for installing experiment dependencies.
Repo is AGPL-3.0 licensed (verified via GitHub API).
The repository positions itself as a reproducible implementation for agent-security research (per README wording).

FAQ

Is this meant for production use?: It’s primarily research code; use it to evaluate and harden your own systems.
How do I install dependencies?: Follow the README uv pip install ... instructions and keep versions pinned for reproducibility.
What license applies?: AGPL-3.0 (verified via GitHub license metadata).

Model	HarmBench				StrongREJECT
Model	FS ↑	PS ↑	V ↓	R ↓	FS ↑	PS ↑	V ↓	R ↓
🟢 GPT-5.2	0.889	0.079	0.024	0.008	0.932	0.056	0.006	0.006
🟣 Gemini-3.0-Pro	0.881	0.087

CKA-Agent — Trojan Knowledge Attack Agent (Research)

This asset can be read and installed directly by agents

Key facts (verified)

Main

README (excerpt)

🛡️ Defense towards CKA

🔥 Latest Results on Frontier Models (Dec 2025)

Source-backed notes

FAQ

Source & Thanks

Discussion

Related Assets

Koog — Kotlin/JVM Agent Framework with MCP Tools

Swarmclaw — Self-Hosted Multi-Agent Runtime + MCP

Obsidian Agent Client — Bring Agents into Obsidian

AgentKit Samples — Hello-World Agent (uv + SSE)