SWE-agent — Autonomous GitHub Issue Solver
SWE-agent lets LLMs autonomously fix GitHub issues and find security vulnerabilities. 18.9K+ GitHub stars. State-of-the-art on SWE-bench. MIT license.
What it is
SWE-agent is a tool that enables language models like Claude Sonnet or GPT-4o to autonomously fix GitHub issues, identify security vulnerabilities, and tackle custom coding challenges. It provides a specialized agent-computer interface (ACI) that gives the LLM efficient commands for navigating codebases, editing files, and running tests.
SWE-agent is for engineering teams and researchers who want to automate bug fixing and security scanning. It has achieved state-of-the-art results on SWE-bench, the standard benchmark for evaluating automated software engineering.
How it saves time or tokens
SWE-agent's custom ACI is designed to reduce the number of tokens an LLM needs to understand and navigate a codebase. Instead of dumping entire files into the context, the ACI provides targeted commands for searching, viewing specific line ranges, and editing precisely.
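To make the token savings concrete, here is a minimal sketch of the idea behind a windowed file view. The function names and the 4-characters-per-token estimate are illustrative assumptions, not SWE-agent's actual ACI commands or tokenizer.

```python
# Illustrative comparison: sending a whole file to the model vs. an
# ACI-style windowed view. Names here are hypothetical.

def view_window(text: str, start: int, end: int) -> str:
    """Return only lines start..end (1-indexed, inclusive), numbered."""
    lines = text.splitlines()
    window = lines[start - 1:end]
    return "\n".join(f"{start + i}: {line}" for i, line in enumerate(window))

def rough_tokens(s: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(s) // 4)

# A synthetic 500-line source file.
source = "\n".join(f"def func_{i}(): pass" for i in range(500))

full_cost = rough_tokens(source)                       # dump everything
window_cost = rough_tokens(view_window(source, 100, 120))  # 21-line view
```

Under this rough estimate, the windowed view costs a small fraction of the tokens needed to dump the whole file into context, which is the core efficiency argument for an ACI.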
For recurring bug categories (null pointer exceptions, missing error handling, type mismatches), SWE-agent can resolve issues without human intervention. This frees developers to focus on design decisions and complex architecture work.
How to use
- Install SWE-agent:
pip install sweagent
- Fix a GitHub issue:
sweagent run \
  --agent.model.name=claude-sonnet-4-20250514 \
  --problem_statement.github_url=https://github.com/owner/repo/issues/123
- Run on SWE-bench for evaluation:
sweagent run \
  --problem_statement.swe_bench_id=django__django-11905
Example
Running SWE-agent with a custom problem statement:
# Create a problem description file
cat > problem.md << 'EOF'
The `parse_date` function in `utils/dates.py` raises a ValueError
when the input string contains a timezone offset like '+05:30'.
Expected behavior: parse the offset and return a timezone-aware datetime.
EOF
# Run SWE-agent
sweagent run \
  --agent.model.name=claude-sonnet-4-20250514 \
  --problem_statement.file_path=problem.md \
  --repo.path=/path/to/your/repo
SWE-agent navigates to the relevant file, understands the bug, writes a fix, and validates it by running tests.
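For the problem statement above, a fix of the kind the agent might produce could look like the sketch below. This `parse_date` is a hypothetical stand-in, not the real `utils/dates.py` implementation, and the supported format is assumed for illustration.

```python
# Hypothetical fix for the bug described in problem.md: accept a
# trailing timezone offset like '+05:30' instead of raising ValueError.
import re
from datetime import datetime, timedelta, timezone

def parse_date(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS[+HH:MM]' into a datetime.

    Strings ending in an offset such as '+05:30' return a
    timezone-aware datetime; offset-free strings stay naive.
    """
    match = re.fullmatch(r"(.+?)([+-]\d{2}):(\d{2})", value)
    if match:
        base, hours, minutes = match.groups()
        sign = 1 if hours.startswith("+") else -1
        offset = timezone(
            sign * timedelta(hours=abs(int(hours)), minutes=int(minutes))
        )
        return datetime.strptime(base.strip(), "%Y-%m-%d %H:%M:%S").replace(
            tzinfo=offset
        )
    return datetime.strptime(value.strip(), "%Y-%m-%d %H:%M:%S")
```

After writing a patch like this, the agent's validation step would run the repository's existing tests against it.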
Related on TokRepo
- Coding AI tools -- AI-powered development tools
- Security AI tools -- automated vulnerability detection
Common pitfalls
- SWE-agent works best on well-defined issues with clear reproduction steps. Vague issues like 'improve performance' produce poor results because the agent cannot verify its fix.
- The agent modifies files in the repository. Run it in a separate branch or Docker container to avoid unintended changes to your working tree.
- Token costs add up quickly. A single complex issue can consume thousands of tokens as the agent explores the codebase. Set token budgets to prevent runaway costs.
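The budget-guard idea in the last point can be sketched as a small wrapper of the kind you might put around any agent loop. The class and limits are illustrative assumptions; SWE-agent ships its own cost-limit settings, so check its documentation for the real configuration.

```python
# Minimal token-budget guard: abort once cumulative usage passes a cap.

class TokenBudget:
    """Track cumulative token usage and fail fast past a hard cap."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.used += prompt_tokens + completion_tokens
        if self.used > self.max_tokens:
            raise RuntimeError(
                f"Token budget exceeded: {self.used} > {self.max_tokens}"
            )

budget = TokenBudget(max_tokens=50_000)
budget.charge(prompt_tokens=12_000, completion_tokens=3_000)  # within budget

aborted = False
try:
    # A later, expensive exploration step trips the guard.
    budget.charge(prompt_tokens=40_000, completion_tokens=5_000)
except RuntimeError:
    aborted = True
```

Failing fast like this turns a runaway exploration into a bounded, predictable cost per issue.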
Frequently Asked Questions
What is SWE-bench?
SWE-bench is a benchmark for evaluating automated software engineering tools. It consists of real GitHub issues from popular Python repositories (Django, Flask, scikit-learn, etc.) with verified fixes. SWE-agent has achieved state-of-the-art results on this benchmark.
Which models does SWE-agent support?
SWE-agent supports any model available through standard API providers. It has been tested with Claude Sonnet, GPT-4o, and other models. You specify the model via the --agent.model.name flag.
Can SWE-agent find security vulnerabilities?
Yes. SWE-agent can be configured to scan codebases for common vulnerability patterns. It uses the same ACI to navigate code and identify issues like injection flaws, improper input validation, and insecure defaults.
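As a toy illustration of pattern-based scanning in general (not SWE-agent's actual detection logic, and with made-up pattern names), a naive scanner might flag lines like this; a real agent would combine such heuristics with LLM reasoning over the surrounding code.

```python
# Toy pattern-based vulnerability scan over Python source text.
import re

SUSPICIOUS_PATTERNS = {
    "use of eval on dynamic input": re.compile(r"\beval\s*\("),
    "SQL query built with an f-string": re.compile(r"execute\s*\(\s*f[\"']"),
    "shell=True subprocess call": re.compile(
        r"subprocess\.\w+\(.*shell\s*=\s*True"
    ),
}

def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, finding) pairs for suspicious lines."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings

sample = (
    "import subprocess\n"
    "subprocess.run(cmd, shell=True)\n"
    "result = eval(user_input)\n"
)
findings = scan_source(sample)
```

Pattern matching alone produces false positives; the value of an agent is that it can read the surrounding code and decide whether a flagged line is actually exploitable.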
Does SWE-agent run tests to validate its fixes?
Yes. When the repository has a test suite, SWE-agent runs relevant tests to validate its fix. If tests fail, the agent iterates on the solution. Test availability significantly improves fix quality.
How is SWE-agent different from GitHub Copilot?
Copilot assists with code completion in the editor. SWE-agent operates autonomously on entire issues -- it reads the problem, navigates the codebase, writes a fix, and verifies it. They serve different use cases: inline assistance vs. autonomous problem solving.
Citations (3)
- SWE-agent GitHub -- SWE-agent with 18.9K+ GitHub stars and SWE-bench results
- SWE-bench -- SWE-bench benchmark for automated software engineering
- SWE-agent Paper -- Agent-computer interface design for software engineering
Source & Thanks
SWE-agent/SWE-agent — 18,900+ GitHub stars
Related Assets
Claude-Flow — Multi-Agent Orchestration for Claude Code
Layers swarm and hive-mind multi-agent orchestration on top of Claude Code with 64 specialized agents, SQLite memory, and parallel execution.
ccusage — Real-Time Token Cost Tracker for Claude Code
CLI that reads ~/.claude logs and breaks down Claude Code token spend by day, session, and project — pluggable into your statusline.
SuperClaude — Workflow Framework for Claude Code
Adds 16+ slash commands, 9 cognitive personas, and a smart flag system to Claude Code in one pipx install.