Skills · Apr 28, 2026 · 2 min read

oncall-guide — Incident Response Subagent

Open-source Claude Code subagent for incident response — walks the oncall checklist autonomously: deploys, errors, rollback. Inspired by Boris Cherny.

Intro

oncall-guide is a Claude Code subagent for incident response. When a page lands at 3am, the worst part is not the bug — it is reconstructing context: which deploy went out, which dashboard to look at, whether to roll back, who to call. This subagent automates that opening checklist so you can focus on the actual fix.

Boris Cherny mentions oncall-style automation in his Claude Code setup as one of the workflows he hands to subagents. This page packages that pattern as an open-source, community-written subagent.

Works with: Claude Code 1.x. Optional integrations: Sentry MCP, Slack MCP, GitHub MCP. Setup: under 2 minutes.


How oncall-guide Works

Save to .claude/agents/oncall-guide.md:

---
name: oncall-guide
description: Walk the oncall opening checklist — recent deploys, error correlation, runbook lookup, rollback decision. Use when paged.
tools: Bash, Read, Grep, Glob, mcp__sentry__*, mcp__slack__*
---

You are the oncall-guide subagent. You do not fix incidents — you accelerate the first 5 minutes by gathering context and proposing the next action.

## Workflow

1. Parse the alert: service, severity, metric, threshold, time window.
2. Get up to 3 recent production deploys, using commits on `origin/main` from the last 6 hours as a proxy: `git log origin/main --oneline -3 --since='6 hours ago'`. Fewer than 3 may come back.
3. If Sentry MCP is available, fetch the top 3 new issues since the most recent deploy.
4. If Slack MCP is available, search the #incidents channel for related chatter in the last hour.
5. Check `RUNBOOKS/<service>.md` or `docs/runbooks/<service>.md` for a matching playbook.
6. Decide a recommendation:
   - **Rollback** if a deploy < 1h ago correlates with the metric spike
   - **Investigate** if no deploy correlation but new errors visible
   - **Wait** if metric is recovering on its own (last 2 datapoints trending down)
7. Emit the report below.

## Output format

oncall-guide — <service>
========================
Alert: <metric> > <threshold> for <duration>
Severity: <P0|P1|P2>

Last deploys:
- <hash> <subject> (<time ago>)

Sentry (since last deploy):
- <issue> (N events)

Runbook: <path or "not found">

Recommendation: ROLLBACK | INVESTIGATE | WAIT
Why: <one-line>

Next step: <specific command or action>

## Boundaries

- Do not actually roll back, restart, or page anyone — only recommend.
- Do not silence the alert.
- If you cannot find a runbook, say so explicitly.
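
The agent file ends with the Boundaries list above. As a companion sketch that is not part of oncall-guide.md, step 2 of the Workflow can also be done in a machine-readable way, so that the "deploy < 1h ago" check in step 6 has an age to compare against. The names `Deploy`, `parse_git_log`, and `recent_deploys` are hypothetical. The format string `%h%x09%ct%x09%s` (short hash, tab, commit time as a Unix timestamp, tab, subject) is a choice made for this example and replaces `--oneline`. Like the original command, it assumes commits on `origin/main` stand in for deploys.

```python
import subprocess
import time
from dataclasses import dataclass


@dataclass
class Deploy:
    sha: str
    subject: str
    age_minutes: float


def parse_git_log(output: str, now: float) -> list[Deploy]:
    """Parse lines shaped like '<short sha>TAB<unix commit time>TAB<subject>'."""
    deploys = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split at most twice so a tab inside the subject is preserved.
        sha, committed_at, subject = line.split("\t", 2)
        deploys.append(Deploy(sha, subject, (now - int(committed_at)) / 60))
    return deploys


def recent_deploys(ref="origin/main", limit=3, since="6 hours ago"):
    """Run the git log from step 2 with a tab-separated format."""
    result = subprocess.run(
        ["git", "log", ref, f"-{limit}", f"--since={since}",
         "--format=%h%x09%ct%x09%s"],
        capture_output=True, text=True, check=True,
    )
    return parse_git_log(result.stdout, time.time())
```

Keeping the parsing separate from the `git` call makes it possible to check the age arithmetic against a fixed sample without a repository.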

When to use

  • The moment a page lands.
  • After someone hands you an incident at handoff.
  • During a postmortem to reconstruct what was visible at the time.

When not to use

  • For chronic capacity/cost issues — those need analysis, not triage.
  • For non-prod environments unless they affect users.

Example session

You:    "Use oncall-guide. Alert: payments-api error rate > 5% for 3min."
Claude: -> last deploys: a1b2c3d "Migrate Stripe SDK" (12m ago)
        -> Sentry: NEW issue PaymentError x 412 events since 11m ago
        -> runbook: RUNBOOKS/payments-api.md (matched section: SDK errors)
        -> Recommendation: ROLLBACK
        -> Why: deploy 12m ago directly correlates with error spike
        -> Next step: gh workflow run rollback.yml -f sha=<previous>
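
Step 1 of the Workflow (parse the alert) is left to the model, but the alert text in this session is regular enough to illustrate what "parsed" means. The sketch below is one possible reading, assuming alerts look like `<service> <metric> <comparator> <threshold> for <duration>`. The `Alert` class, the `parse_alert` function, and the accepted units are assumptions made for this example, not a format that oncall-guide defines. Severity is not present in this alert string, so it is not extracted.

```python
import re
from dataclasses import dataclass


@dataclass
class Alert:
    service: str
    metric: str
    comparator: str
    threshold: float
    unit: str
    duration_minutes: float


# Hypothetical grammar: "<service> <metric> <cmp> <threshold><unit> for <n><unit>"
_ALERT_RE = re.compile(
    r"^(?P<service>\S+)\s+(?P<metric>.+?)\s*(?P<cmp>>=|<=|>|<)\s*"
    r"(?P<threshold>\d+(?:\.\d+)?)\s*(?P<unit>%|ms|s)?\s+for\s+"
    r"(?P<amount>\d+(?:\.\d+)?)\s*(?P<dur_unit>sec|min|s|m|h)$"
)

_MINUTES_PER_UNIT = {"s": 1 / 60, "sec": 1 / 60, "m": 1, "min": 1, "h": 60}


def parse_alert(text: str) -> Alert:
    """Split an alert line into service, metric, threshold, and time window."""
    match = _ALERT_RE.match(text.strip().rstrip("."))
    if match is None:
        raise ValueError(f"unrecognized alert format: {text!r}")
    return Alert(
        service=match["service"],
        metric=match["metric"],
        comparator=match["cmp"],
        threshold=float(match["threshold"]),
        unit=match["unit"] or "",
        duration_minutes=float(match["amount"]) * _MINUTES_PER_UNIT[match["dur_unit"]],
    )


alert = parse_alert("payments-api error rate > 5% for 3min.")
print(alert.service, alert.metric, alert.threshold, alert.duration_minutes)
```

Anything that does not match raises `ValueError`, which mirrors the subagent's rule of saying so explicitly when something cannot be found instead of guessing.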

FAQ

Q: Will it actually roll back the deploy? A: No — recommend only. The "Next step" line gives you the exact command, but you press enter.

Q: Does it require all the MCPs listed? A: No — graceful degradation. With no Sentry MCP it skips error correlation; with no Slack MCP it skips chatter search.

Q: How does it know what counts as a "spike"? A: It uses your alert's threshold and duration. The decision logic in the Workflow is intentionally explicit so you can tune it per service.

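Q: Can it page someone? A: No. Paging disrupts the whole team, so the Boundaries section tells the subagent to recommend only. Because its tool list still includes Bash, this is an instruction in the prompt, not a hard permission limit.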

Q: Is this Boris Cherny's actual subagent? A: No — community-written equivalent based on his public description.
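
As the FAQ notes, the decision logic in the Workflow is explicit so it can be tuned per service. A minimal sketch of what that tuning could look like follows, with the step 6 rules written as a plain function. The names `Tuning`, `is_recovering`, and `recommend` are hypothetical. Three points are assumptions, because the Workflow does not spell them out: the rules are checked in the order listed, "correlates" means the deploy falls inside the window and happened no later than the start of the spike, and INVESTIGATE is the fallback when no rule matches.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tuning:
    rollback_window_minutes: float = 60.0  # "deploy < 1h ago"
    recovery_points: int = 2               # "last 2 datapoints trending down"


def is_recovering(datapoints: Sequence[float], points: int = 2) -> bool:
    """True when each of the last `points` values is lower than the one before it."""
    tail = list(datapoints)[-points:]
    if len(tail) < 2 or len(tail) < points:
        return False
    return all(later < earlier for earlier, later in zip(tail, tail[1:]))


def recommend(
    deploy_age_minutes: Optional[float],  # None when no recent deploy was found
    spike_age_minutes: float,             # how long ago the metric started spiking
    new_errors: bool,                     # e.g. new Sentry issues since the deploy
    datapoints: Sequence[float],          # recent metric values, oldest first
    tuning: Tuning = Tuning(),
) -> str:
    deploy_correlates = (
        deploy_age_minutes is not None
        and deploy_age_minutes < tuning.rollback_window_minutes
        and deploy_age_minutes >= spike_age_minutes  # deploy preceded the spike
    )
    if deploy_correlates:
        return "ROLLBACK"
    if new_errors:
        return "INVESTIGATE"
    if is_recovering(datapoints, tuning.recovery_points):
        return "WAIT"
    return "INVESTIGATE"  # assumed fallback: no rule matched


# Defaults follow the Workflow; a slower-moving service might widen the window.
print(recommend(12, 11, True, [2.0, 4.0, 6.0]))
print(recommend(90, 11, False, [6.0, 4.0, 2.0], Tuning(rollback_window_minutes=120)))
```

In the subagent itself these rules live in the prompt, so tuning per service means editing the thresholds in step 6 of the agent file. The function only makes the same trade-offs visible in one place.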



Source & Thanks

Inspired by Boris Cherny's oncall workflow on howborisusesclaudecode.com.
