This asset can be read and installed directly by agents
TokRepo exposes a universal CLI command, an installation contract, JSON metadata, an adapter-specific plan, and the raw content to help agents assess fit, risk, and next steps.
Treat it as a security lab: run experiments in an isolated environment and record the exact dependency set used.
Use it to build test cases: trojan knowledge scenarios can become unit/regression tests for your retrieval + tool pipeline.
Map the attack surface: separate poisoning in static docs vs retrieval corpora vs tool outputs so mitigations are targeted.
Export results as artifacts: logs, prompts, and configs are as important as code when reproducing agent-security claims.
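The advice above about recording the exact dependency set and exporting results as artifacts can be sketched as a small run manifest. This is a minimal illustration, not part of the repository: the function names (`dependency_snapshot`, `build_manifest`, `write_manifest`) and the manifest layout are assumptions made for this example. It uses only the Python standard library.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata
from pathlib import Path


def dependency_snapshot():
    """Return {distribution name: version} for installed distributions."""
    snapshot = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip broken installs that report no name
            snapshot[name] = dist.version
    return dict(sorted(snapshot.items()))


def sha256_of(path):
    """Hash a file so a later run can confirm the artifact is unchanged."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(artifact_paths):
    """Collect interpreter, platform, dependency, and artifact details."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "dependencies": dependency_snapshot(),
        "artifacts": {str(p): sha256_of(p) for p in artifact_paths},
    }


def write_manifest(artifact_paths, out_path):
    """Write the manifest as JSON (hypothetical layout) and return it."""
    manifest = build_manifest(artifact_paths)
    Path(out_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

Storing one such manifest next to each experiment's logs, prompts, and configs makes it easier to tell whether two runs used the same environment and inputs.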
README (excerpt)
[ICML 2026] CKA-Agent: Bypassing LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search
🛡️ Defense towards CKA
TurnGate is a response-aware defense mechanism designed to detect and mitigate hidden malicious intent in multi-turn dialogue systems. It defends against state-of-the-art multi-turn attacks such as CKA-Agent, achieving strong defense performance while avoiding over-refusal.
🔥 Latest Results on Frontier Models (Dec 2025)
CKA-Agent demonstrates consistently high attack success rates against the latest frontier models, including GPT-5.2, Gemini-3.0-Pro, and Claude-Haiku-4.5. The results are summarized below:
Source-backed notes
README shows uv pip install commands for installing experiment dependencies.
Repo is AGPL-3.0 licensed (verified via GitHub API).
The repository positions itself as a reproducible implementation for agent-security research (per README wording).
FAQ
Is this meant for production use? It’s primarily research code; use it to evaluate and harden your own systems.
How do I install dependencies? Follow the README uv pip install ... instructions and keep versions pinned for reproducibility.
What license applies? AGPL-3.0 (verified via GitHub license metadata).
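One way to act on the "keep versions pinned" advice is a quick check that flags requirement lines without an exact `==` pin. This is a hedged sketch: the helper name `unpinned_requirements` and the `requirements.txt`-style input format are assumptions for illustration, the README's actual install commands are not reproduced here, and the parsing is deliberately simplistic (for example, it does not handle URLs that contain `#`).

```python
def unpinned_requirements(lines):
    """Return requirement lines that lack an exact '==' pin."""
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and option lines (-r, -e)
            continue
        if "==" not in line:
            loose.append(line)
    return loose
```

Running the check on the lines of a hypothetical requirements file before each experiment gives an early warning when a dependency could drift between runs.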