# Spark History MCP — Investigate Jobs via Tools

> Kubeflow’s Spark History Server MCP + `shs` CLI for job analysis, failures, and comparisons; verified 168★, pushed 2026-05-13.

## Install

Merge the JSON below into your `.mcp.json`:
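
A minimal sketch of such an entry, assuming a client that reads the common `mcpServers` layout and launches servers over stdio. The server name `spark-history` and the scratch filename are hypothetical; the command and arguments mirror the README's `uvx` entrypoint. If the server is left on `streamable-http`, the client entry would need to point at the HTTP endpoint on port 18888 instead.

```bash
# Write the entry to a scratch file so it can be merged by hand.
# "spark-history" and the filename are hypothetical names; the command
# and args repeat the README's uvx entrypoint.
cat > spark-history.mcp.json <<'EOF'
{
  "mcpServers": {
    "spark-history": {
      "command": "uvx",
      "args": ["--from", "mcp-apache-spark-history-server", "spark-mcp"]
    }
  }
}
EOF
```

Copy the `spark-history` object under the `mcpServers` key of your existing `.mcp.json` rather than overwriting the file.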

## Quick Use

```bash
# MCP server (README):
uvx --from mcp-apache-spark-history-server spark-mcp
# Or install with pip:
pip install mcp-apache-spark-history-server
spark-mcp
# CLI quickstart (README):
shs setup config > config.yaml
```

## Intro

Kubeflow’s Spark History Server project pairs an MCP server (`spark-mcp`) with a standalone `shs` CLI for job analysis, failure investigation, and run comparisons. Verified at 168★, last pushed 2026-05-13.

**Best for:** Spark teams who want repeatable investigations from an agent (MCP) or scripts (CLI)

**Works with:** Spark History Server; the MCP server defaults to port 18888 and supports the `streamable-http` and `stdio` transports (README)

**Setup time:** 12-30 minutes

### Key facts (verified)

- GitHub: 168 stars · 59 forks · pushed 2026-05-13.
- License: Apache-2.0 · owner avatar + repo URL verified via GitHub API.
- README-backed entrypoint: `uvx --from mcp-apache-spark-history-server spark-mcp`.

## Main

- Use `shs` for quick, deterministic inspection; use MCP when you want an agent to run multi-step investigations across apps and stages.

- Keep config explicit: README uses `shs setup config > config.yaml` and expects you to set your History Server URL there.

- Choose transport by deployment: streamable HTTP is convenient for remote clients; stdio is simple for local setups (README).

- Use comparisons to avoid guesswork: README links a real-world example of comparing two benchmark runs and highlights failure investigation commands.
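
The config step above can be wrapped so that reruns never overwrite a hand-edited file. This is a minimal sketch, assuming `shs` is on `PATH`; the function name `ensure_config` is hypothetical, and only `shs setup config` comes from the README.

```bash
# ensure_config [path]: create the starter config once, keep later edits.
# Hypothetical helper; only `shs setup config` is taken from the README.
ensure_config() {
  config="${1:-config.yaml}"
  if [ -f "$config" ]; then
    echo "keeping existing $config"
    return 0
  fi
  if ! command -v shs >/dev/null 2>&1; then
    echo "shs not found on PATH" >&2
    return 1
  fi
  # Write to a temp file first so a failed run leaves no empty config behind.
  if shs setup config > "$config.tmp"; then
    mv "$config.tmp" "$config"
  else
    rm -f "$config.tmp"
    return 1
  fi
}
```

After `ensure_config` succeeds, set your History Server URL in the generated file before running any investigations.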

### Source-backed notes

- README says the project provides two interfaces: an MCP server (`spark-mcp`) and a standalone CLI (`shs`).
- README shows running the MCP server directly via `uvx --from mcp-apache-spark-history-server spark-mcp` and notes that the package is published on PyPI.
- README config shows an MCP port default of 18888 and transport options `streamable-http` or `stdio`.

### FAQ

- **Do I need MCP if I only want scripts?** No. Use the `shs` CLI directly; MCP is for agent-driven investigations (README positioning).
- **Where do I set the Spark History Server URL?** In `config.yaml`; the README generates it via `shs setup config > config.yaml`.
- **What port does the MCP server use?** The README's config defaults to port 18888, and the transport (`streamable-http` or `stdio`) is configurable.
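
When the server runs with the `streamable-http` transport, a quick reachability check on the default port can separate "server not running" from client-side misconfiguration. This is a minimal sketch, assuming `curl` is installed and the server listens on plain HTTP; the function name `mcp_port_up` is hypothetical, and the root path is probed only to see whether anything answers, not because the README documents it.

```bash
# mcp_port_up [host] [port]: succeed if anything answers HTTP on host:port.
# Hypothetical helper; 18888 is the default port the README's config shows.
mcp_port_up() {
  host="${1:-localhost}"
  port="${2:-18888}"
  # Without --fail, curl exits 0 for any HTTP response (even a 404);
  # a refused connection or a timeout yields a non-zero exit status.
  curl --silent --output /dev/null --max-time 2 "http://$host:$port/"
}
```

For example, `mcp_port_up && echo "listening" || echo "nothing on 18888"` reports whether the default port answers on the local machine.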

## Source & Thanks

> Source: https://github.com/kubeflow/mcp-apache-spark-history-server
> License: Apache-2.0
> GitHub stars: 168 · forks: 59

---

<!-- ZH -->

## Intro

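Kubeflow’s Spark History Server MCP + `shs` CLI for job analysis, failure troubleshooting, and run comparison, suited to turning troubleshooting into a reusable toolchain; verified 168★, updated 2026-05-13.

**Best for:** Teams that want to reuse Spark troubleshooting workflows through an agent (MCP) or scripts (CLI)

**Works with:** Requires Spark History Server; MCP defaults to port 18888 and supports streamable-http/stdio (README)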

**Setup time:** 12-30 minutes

### Key facts (verified)

- GitHub: 168 stars · 59 forks; last updated 2026-05-13.
- License: Apache-2.0; owner avatar and repo URL both re-verified via the GitHub API.
- Entrypoint command that can be checked against the README: `uvx --from mcp-apache-spark-history-server spark-mcp`.

## Main

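- Use `shs` for quick, deterministic checks; use MCP to let an agent orchestrate the tools when you need multi-step reasoning and aggregated conclusions.

- Keep config explicit: the README generates the config with `shs setup config > config.yaml` and requires you to fill in the History Server URL.

- Choose the transport by deployment: prefer streamable HTTP for remote access or debugging; stdio works for local integration (README).

- Use comparisons to cut down on guesswork: the README gives an example of comparing two runs and points to commands for failure investigation.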

### Source-backed notes

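- The README states that the project provides two interfaces: an MCP server (`spark-mcp`) and a standalone CLI (`shs`).
- The README shows how to run the server directly with `uvx --from mcp-apache-spark-history-server spark-mcp` and notes that it is published on PyPI.
- The README's config example includes the MCP default port 18888 and the `streamable-http`/`stdio` transport options.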

### FAQ

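- **Do I still need MCP if I only want to write scripts?** No; the README positions `shs` for engineers and scripts, while MCP suits agent orchestration.
- **Where does the Spark History Server URL go?** In `config.yaml`; the README generates it with `shs setup config > config.yaml`.
- **What is the default MCP port?** The README defaults to 18888, and the transport and port can be changed in the config.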


---
Source: https://tokrepo.com/en/workflows/spark-history-mcp-investigate-jobs-via-tools
Author: MCP Hub