Skills · Apr 1, 2026 · 1 min read

SWE-agent — Autonomous GitHub Issue Solver

SWE-agent lets LLMs autonomously fix GitHub issues and find security vulnerabilities. 18.9K+ GitHub stars, state-of-the-art results on SWE-bench, MIT licensed.

TL;DR
SWE-agent enables LLMs to autonomously fix GitHub issues and detect security vulnerabilities via custom agent interfaces.

§01

What it is

SWE-agent is a tool that enables language models like Claude Sonnet or GPT-4o to autonomously fix GitHub issues, identify security vulnerabilities, and tackle custom coding challenges. It provides a specialized agent-computer interface (ACI) that gives the LLM efficient commands for navigating codebases, editing files, and running tests.

SWE-agent is for engineering teams and researchers who want to automate bug fixing and security scanning. It has achieved state-of-the-art results on SWE-bench, the standard benchmark for evaluating automated software engineering.

§02

How it saves time or tokens

SWE-agent's custom ACI is designed to reduce the number of tokens an LLM needs to understand and navigate a codebase. Instead of dumping entire files into the context, the ACI provides targeted commands for searching, viewing specific line ranges, and editing precisely.
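
To make the idea concrete, here is a minimal shell sketch of two targeted commands of the kind described above. The helper names view_window and search_file_lines are hypothetical and are not SWE-agent's actual ACI implementation. They only illustrate why returning a numbered line range or a short list of matches puts far less text into the context than printing a whole file.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical helpers, not SWE-agent's real ACI commands.
set -euo pipefail

# Print lines START..END of FILE, each prefixed with its line number.
view_window() {
  local file=$1 start=$2 end=$3
  awk -v s="$start" -v e="$end" \
    'NR >= s && NR <= e { printf "%d: %s\n", NR, $0 }' "$file"
}

# Print "line_number:text" for every line of FILE containing the fixed string PATTERN.
search_file_lines() {
  local pattern=$1 file=$2
  grep -nF -- "$pattern" "$file" || true   # no match is not an error here
}

# Demo on a throwaway file: 200 filler lines plus one function definition.
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
for i in $(seq 1 200); do echo "line $i of the module"; done > "$demo"
echo "def parse_date(value):" >> "$demo"

search_file_lines "parse_date" "$demo"   # one matching line instead of 201
view_window "$demo" 199 201              # three lines of context around the match
```

In this sketch the model would see four short lines in total, where a plain cat of the same file would return all 201.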

For recurring bug categories (null pointer exceptions, missing error handling, type mismatches), SWE-agent can resolve issues without human intervention. This frees developers to focus on design decisions and complex architecture work.

§03

How to use

  1. Install SWE-agent:
pip install sweagent
  2. Fix a GitHub issue:
sweagent run \
  --agent.model.name=claude-sonnet-4-20250514 \
  --problem_statement.github_url=https://github.com/owner/repo/issues/123
  3. Run on SWE-bench for evaluation:
sweagent run \
  --problem_statement.swe_bench_id=django__django-11905

§04

Example

Running SWE-agent with a custom problem statement:

# Create a problem description file
cat > problem.md << 'EOF'
The `parse_date` function in `utils/dates.py` raises a ValueError
when the input string contains a timezone offset like '+05:30'.
Expected behavior: parse the offset and return a timezone-aware datetime.
EOF

# Run SWE-agent
sweagent run \
  --agent.model.name=claude-sonnet-4-20250514 \
  --problem_statement.path=problem.md \
  --env.repo.path=/path/to/your/repo

SWE-agent navigates to the relevant file, understands the bug, writes a fix, and validates it by running tests.

§05

Common pitfalls

  • SWE-agent works best on well-defined issues with clear reproduction steps. Vague issues like 'improve performance' produce poor results because the agent cannot verify its fix.
  • The agent modifies files in the repository. Run it in a separate branch or Docker container to avoid unintended changes to your working tree.
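  • Token costs add up quickly. A single complex issue can consume a large number of tokens, because every step the agent takes while exploring the codebase adds to the context. Set a per-issue budget (for example with the --agent.model.per_instance_cost_limit option) to prevent runaway costs.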
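
One way to follow the advice about isolating the agent's edits is a dedicated branch in a separate checkout, created with git worktree. The sketch below is an assumption-laden demo, not part of SWE-agent: it builds a throwaway repository in a temporary directory, and the branch name agent/fix-attempt and the simulated edit are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: give the agent its own branch and checkout via `git worktree`.
set -euo pipefail

# Throwaway repository standing in for your project (assumption for the demo).
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "initial commit"

# Create a separate checkout on a new branch; the agent would be pointed here.
sandbox="$repo-agent"
git -C "$repo" worktree add -q -b agent/fix-attempt "$sandbox"

# Simulate an agent edit inside the sandbox only.
echo "patched" > "$sandbox/fix.txt"

git -C "$repo" status --porcelain      # prints nothing: the main working tree is untouched
git -C "$sandbox" status --porcelain   # lists fix.txt as an untracked file
```

In a real project you would skip the setup lines, run git worktree add from your existing repository, and give the sandbox directory to SWE-agent as the repository path. Reviewing the result is then an ordinary diff between the two branches.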

Frequently Asked Questions

What is SWE-bench?

SWE-bench is a benchmark for evaluating automated software engineering tools. It consists of real GitHub issues from popular Python repositories (Django, Flask, scikit-learn, etc.) with verified fixes. SWE-agent has achieved state-of-the-art results on this benchmark.

Which LLMs does SWE-agent support?

SWE-agent supports any model available through standard API providers. It has been tested with Claude Sonnet, GPT-4o, and other models. You specify the model via the --agent.model.name flag.

Can SWE-agent find security vulnerabilities?

Yes. SWE-agent can be configured to scan codebases for common vulnerability patterns. It uses the same ACI to navigate code and identify issues like injection flaws, improper input validation, and insecure defaults.

Does SWE-agent run tests to verify fixes?

Yes. When the repository has a test suite, SWE-agent runs relevant tests to validate its fix. If tests fail, the agent iterates on the solution. Test availability significantly improves fix quality.
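
The iterate-until-tests-pass behaviour can be pictured as a bounded retry loop. The following is a sketch of that control flow only, not SWE-agent code: propose_fix and run_tests are hypothetical stand-ins, and the "fix" is rigged to land on the second attempt so the loop has something to show.

```shell
#!/usr/bin/env bash
# Sketch of a test-gated retry loop; the attempt logic is a stand-in, not SWE-agent code.
set -euo pipefail

workdir=$(mktemp -d)
echo "broken" > "$workdir/state"

# Hypothetical stand-in for "the agent proposes a fix": it only succeeds from attempt 2 on.
propose_fix() {
  if [ "$1" -ge 2 ]; then
    echo "fixed" > "$workdir/state"
  fi
}

# Hypothetical stand-in for the project's test suite: passes once the state is "fixed".
run_tests() {
  [ "$(cat "$workdir/state")" = "fixed" ]
}

max_attempts=3
attempt=0
status="unresolved"

while [ "$attempt" -lt "$max_attempts" ]; do
  attempt=$((attempt + 1))
  propose_fix "$attempt"
  if run_tests; then
    status="resolved"
    break
  fi
  echo "attempt $attempt: tests still failing, iterating"
done

echo "status=$status after $attempt attempt(s)"
```

The cap on attempts matters as much as the loop itself: without a test suite the gate has nothing to check, and without a limit a fix that never passes would keep consuming tokens.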

How does SWE-agent differ from GitHub Copilot?

Copilot assists with code completion in the editor. SWE-agent operates autonomously on entire issues: it reads the problem, navigates the codebase, writes a fix, and verifies it. They serve different use cases: inline assistance versus autonomous problem solving.

Source & Thanks

SWE-agent/SWE-agent — 18,900+ GitHub stars
