Practical Notes
- GitHub: 393 stars · 46 forks; pushed 2026-05-12 (verified via GitHub API).
- README requires cloning with submodules and running
uv syncto create.venvbeforeuv run boxpwnr …. - README documents hard limits like
--max-turns,--max-cost, and execution timeouts (default 30s, max 300s).
Main
A useful BoxPwnr pattern for teams:
- Define a target catalog (labs/benchmarks) and run with consistent flags (
--max-turns,--max-cost) so results are comparable. - Keep the executor boundary strict: everything runs inside the Docker environment; your host stays clean.
- Use
--generate-progress/--resume-fromto create handoffs between attempts instead of restarting from scratch. - When a task is “almost solved”, switch to manual follow-up (or keep the target running) and treat the LLM as a coordinator, not a miracle worker.
This keeps experimentation fast while still producing artifacts you can review later.
FAQ
Q: Do I need Docker? A: Yes. README says BoxPwnr requires Docker to be installed and running.
Q: How do I control cost/time?
A: Use --max-cost, --max-turns, and execution timeout flags described in the README.
Q: What’s the minimal run command?
A: After uv sync, run uv run boxpwnr --platform htb --target meow (example from README).