# MCP Latency Probe — tools/list p95 Runbook

> MCP tool calling latency runbook for agents. Measures tools/list p95, separates server latency from network delay, and defines pause rules.

## Install

Copy the content below into your project:

---
title: MCP Latency Probe — tools/list p95 Runbook
asset_kind: knowledge
target_tools: [codex, claude_code, cursor, gemini_cli]
install_mode: single
entrypoint: README.md
---

# MCP Latency Probe — tools/list p95 Runbook

Use this runbook when an agent needs to decide whether MCP tool calling latency actually comes from the hosted MCP endpoint or from the caller's network path. The core metric is simple: measure JSON-RPC `tools/list` repeatedly, compute p95, and keep the production-side probe separate from laptop or CI probes.

## Quick Use

Run the probe first from the same region as production, or from the production server itself:

```bash
# JSON-RPC request body for tools/list
body='{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

# Send 10 requests. Each one saves the response body to
# ./mcp-tools-list.json and prints curl's time_total (in seconds).
for i in $(seq 1 10); do
  curl -sS -o ./mcp-tools-list.json \
    -w "%{time_total}\n" \
    -X POST https://tokrepo.com/mcp \
    -H 'Content-Type: application/json' \
    -d "$body"
done | awk '{a[NR]=$1} END{
  # Sort the samples in ascending order
  for(i=1;i<=NR;i++) for(j=i+1;j<=NR;j++) if(a[i]>a[j]){t=a[i];a[i]=a[j];a[j]=t}
  # Nearest-rank p95: index = ceil(0.95 * NR), clamped to 1..NR
  idx=int(0.95*NR+0.999999); if(idx<1)idx=1; if(idx>NR)idx=NR
  print "p95=" a[idx]
}'
```

`time_total` is reported in seconds, so a result of `p95=0.800` corresponds to 800ms. With 10 samples, the nearest-rank index is 10, which makes p95 the slowest sample; raise the `seq` count to 20 or more if a single outlier should not decide the result.

Then repeat from an outside network and label the result separately:

```bash
# Single probe: prints the HTTP status code and total time in seconds
curl -sS -o /dev/null -w 'code=%{http_code} total=%{time_total}\n' \
  -X POST https://tokrepo.com/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
```

## What To Measure

Track four numbers every time:

| Signal | Why it matters |
|---|---|
| `tools/list` p95 from production | Proves the hosted MCP route and local reverse proxy are healthy. |
| `tools/list` p95 from the automation host | Shows what external agents may experience from that network. |
| HTTP status mix | Separates latency from auth, routing, or deploy failures. |
| Tool count | Detects a bad deploy that returns 200 with a truncated catalog. |

A practical threshold is `p95 < 800ms` for the production-hosted probe. If the production probe reads `20ms` but an overseas laptop sees `2200ms`, do not call the MCP server broken. Report the product as healthy and log the client-network penalty separately.

For search and triage, label the incident explicitly as **MCP tool calling latency** when the slow path affects `tools/list`, `tools/call`, or another JSON-RPC tool method. That wording keeps the runbook discoverable when an agent searches for "mcp tool calling latency" instead of "MCP endpoint p95".

## Common Failure Modes

- **Cold Nuxt process**: the first hit after a restart is slow and later hits are fast. Use at least 10 samples.
- **Reverse proxy buffering or TLS path**: localhost is fast, but the public domain is slow when called from the same server.
- **Caller geography**: the server-side probe is fast and the laptop probe is slow. This is a network-path issue, not a route-logic issue.
- **JSON-RPC body mismatch**: a GET or a malformed POST may exercise a different handler.
- **Tool catalog bloat**: large descriptions can make `tools/list` slow even when routing is fine.

## Decision Rule

1. If production-side p95 is under the target and the status is 200, keep the MCP service marked healthy.
2. If production-side p95 is over the target twice in a row, pause growth actions and inspect server logs, PM2 uptime, and route payload size.
3. If only the external p95 is high, report the geography/network caveat and continue product-quality work.
4. If the tool count changes unexpectedly, verify the deployed manifest before doing any promotion.

## Source & Thanks

This is an original TokRepo runbook by William Wang. It uses standard JSON-RPC MCP semantics from the [Model Context Protocol documentation](https://modelcontextprotocol.io/) and ordinary `curl` timing fields from the [curl write-out documentation](https://curl.se/docs/manpage.html#-w).
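
The first three steps of the Decision Rule above can be sketched as two small shell functions. This is a minimal illustration, not part of the original runbook: the function names `p95_of` and `classify`, the argument order, and the output labels are hypothetical, and the `check-status` and `recheck` outcomes are assumptions for cases the rule does not spell out (a non-200 status, or a single over-target reading). Latencies are in seconds, as printed by curl's `time_total`, so the 800ms target is `0.8`. The tool-count check (step 4) is left out.

```bash
# p95_of: read one latency in seconds per line on stdin and print the
# nearest-rank p95, using the same index formula as the Quick Use probe.
p95_of() {
  LC_ALL=C sort -n | awk '
    { a[NR] = $1 }
    END {
      if (NR == 0) exit 1                 # no samples: fail instead of guessing
      idx = int(0.95 * NR + 0.999999)     # ceil(0.95 * NR)
      if (idx < 1) idx = 1
      if (idx > NR) idx = NR
      print a[idx]
    }'
}

# classify PROD_P95 PREV_PROD_P95 EXT_P95 HTTP_STATUS [TARGET_SECONDS]
# Prints one hypothetical label per outcome of the Decision Rule.
classify() {
  prod=$1; prev=$2; ext=$3; status=$4; target=${5:-0.8}
  awk -v p="$prod" -v q="$prev" -v e="$ext" -v s="$status" -v t="$target" 'BEGIN {
    # Assumption: any non-200 status is triaged before latency is judged.
    if (s != 200)               { print "check-status"; exit }
    # Step 2: production p95 misses the target twice in a row.
    if (!(p < t) && !(q < t))   { print "pause"; exit }
    # Assumption: a single miss only triggers another probe run.
    if (!(p < t))               { print "recheck"; exit }
    # Step 3: only the external probe is slow.
    if (!(e < t))               { print "healthy-network-caveat"; exit }
    # Step 1: production p95 under target with status 200.
    print "healthy"
  }'
}
```

For the example in this runbook, a production p95 of 20ms with an overseas reading of 2200ms, `classify 0.02 0.02 2.2 200` prints `healthy-network-caveat`. Treating a value exactly at the target as a miss is a choice made here for simplicity.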
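# MCP Latency Probe: tools/list p95 Runbook

Use this runbook when an agent needs to decide whether the hosted MCP is really slow or whether the caller's network path is slow. The core metric is simple: measure JSON-RPC `tools/list` repeatedly, compute p95, and report the production-server probe separately from local or CI external probes.

## Quick Use

Run first from the production region or on the production server itself:

```bash
# JSON-RPC request body for tools/list
body='{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

# Prints one time_total value (in seconds) per request
for i in $(seq 1 10); do
  curl -sS -o ./mcp-tools-list.json \
    -w "%{time_total}\n" \
    -X POST https://tokrepo.com/mcp \
    -H 'Content-Type: application/json' \
    -d "$body"
done
```

Then run it once more from an outside network, and label that result separately. When production-side p95 is under 800ms, a slow cross-border connection on one local machine is not grounds for declaring the product's MCP broken.

## Decision Rule

- Production-side p95 meets the target and HTTP status is 200: the MCP product surface is healthy.
- Production-side p95 misses the target twice in a row: pause growth actions and check PM2, Nginx, and the route payload.
- Only the external probe is slow: record it as a network-side observation and do not treat it as a product fault.
- Tool count is abnormal: check the deployment and manifest first, then resume distribution.

---

Source: https://tokrepo.com/en/workflows/mcp-latency-probe-tools-list-p95-runbook-6bda6a2c

Author: henuwangkai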