MCP Configs · May 14, 2026 · 2 min read

Spark History MCP — Investigate Jobs via Tools

Kubeflow’s Spark History Server MCP server and `shs` CLI for job analysis, failure investigation, and run comparisons. Verified: 168 GitHub stars, last pushed 2026-05-13.

Safe staging for this asset

This asset is staged first. The copied prompt tells the agent to inspect the staged files and ask before activating scripts, MCP config, or global config.

Stage only · 17/100 · Policy: stage
Agent surface: Any MCP/CLI agent
Kind: MCP config
Install: Stage only
Trust: Established
Entrypoint: Asset
Safe staging command
npx -y tokrepo@latest install d450ff8d-c902-5131-8d86-29411fd27301 --target codex

Stages files first; activation requires review of the staged README and plan.

Intro

Best for: Spark teams who want repeatable investigations from an agent (MCP) or scripts (CLI)

Works with: Spark History Server. Per the README, the MCP server defaults to port 18888 and supports either streamable-http or stdio transport.

Setup time: 12-30 minutes

Key facts (verified)

  • GitHub: 168 stars · 59 forks · pushed 2026-05-13.
  • License: Apache-2.0 · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: uvx --from mcp-apache-spark-history-server spark-mcp.
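
The README-backed entrypoint above can be wired into an MCP client that launches servers over stdio. The sketch below builds such a client entry. The `mcpServers` JSON layout, the server name `spark-history`, and the helper name `build_stdio_entry` are assumptions for illustration, not taken from the README; check your client's documentation for its actual config format.

```python
import json

# Command and arguments taken from the README-backed entrypoint.
ENTRYPOINT = ["uvx", "--from", "mcp-apache-spark-history-server", "spark-mcp"]


def build_stdio_entry(name="spark-history"):
    """Return a client-side config dict for a stdio-launched MCP server.

    The `mcpServers` layout and the default name are assumptions.
    """
    command, *args = ENTRYPOINT
    return {"mcpServers": {name: {"command": command, "args": args}}}


if __name__ == "__main__":
    # Print the entry so it can be reviewed before it is copied anywhere.
    print(json.dumps(build_stdio_entry(), indent=2))
```

This covers only the client side. The server's own transport and History Server URL still come from its config file, so review that file before activating anything.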

Main

  • Use shs for quick, deterministic inspection; use MCP when you want an agent to run multi-step investigations across apps and stages.

  • Keep config explicit: README uses shs setup config > config.yaml and expects you to set your History Server URL there.

  • Choose transport by deployment: streamable HTTP is convenient for remote clients; stdio is simple for local setups (README).

  • Use comparisons to avoid guesswork: README links a real-world example of comparing two benchmark runs and highlights failure investigation commands.
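
To make the idea of comparing two runs concrete, here is a minimal sketch that diffs per-stage executor run time between a baseline and a candidate run. It is not how `shs` or the MCP tools implement comparison. It assumes stage records shaped like the Spark REST API's stage data (`name`, `executorRunTime` in milliseconds), and the function name and sample values are made up for illustration.

```python
def compare_stage_runtimes(baseline, candidate):
    """Match stages by name and return runtime deltas, largest regression first.

    Assumes stage names are unique within a run, which real applications
    do not guarantee.
    """
    base_by_name = {s["name"]: s["executorRunTime"] for s in baseline}
    rows = []
    for stage in candidate:
        name = stage["name"]
        if name not in base_by_name:
            continue  # stage exists only in the candidate run; skip it
        before = base_by_name[name]
        after = stage["executorRunTime"]
        rows.append({"name": name, "before_ms": before,
                     "after_ms": after, "delta_ms": after - before})
    return sorted(rows, key=lambda r: r["delta_ms"], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, not real benchmark results.
    run_a = [{"name": "scan", "executorRunTime": 1000},
             {"name": "join", "executorRunTime": 4000}]
    run_b = [{"name": "scan", "executorRunTime": 900},
             {"name": "join", "executorRunTime": 6500}]
    for row in compare_stage_runtimes(run_a, run_b):
        print(row["name"], row["delta_ms"])
```

Sorting by delta puts the stage that slowed down most at the top, which is usually the first place to look.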

Source-backed notes

  • README says the project provides two interfaces: an MCP server (spark-mcp) and a standalone CLI (shs).
  • README shows running the MCP server directly via uvx --from mcp-apache-spark-history-server spark-mcp and mentions PyPI publishing.
  • README config shows an MCP port default of 18888 and transport options streamable-http or stdio.
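
When the server runs with the streamable-http transport, a quick reachability check on the configured port can rule out basic connectivity problems before an agent is pointed at it. The sketch below only tests that something accepts TCP connections. The host `localhost` and the helper name `is_port_open` are assumptions; 18888 is the default port noted above.

```python
import socket


def is_port_open(host="localhost", port=18888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("MCP port reachable:", is_port_open())
```

A successful connection does not prove the listener is the MCP server, only that the port is open. With stdio transport there is no port to probe.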

FAQ

  • Do I need MCP if I only want scripts? No. Use the shs CLI directly; MCP is for agent-driven investigations (README positioning).
  • Where do I set the Spark History Server URL? In config.yaml, which the README generates via shs setup config > config.yaml.
  • What port does the MCP server use? The README config defaults to port 18888, and the transport is configurable (streamable-http or stdio).
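
Before writing a History Server URL into config.yaml, it can help to confirm that the URL actually answers. The sketch below calls the Spark History Server REST endpoint `/api/v1/applications` directly, independent of `shs` and the MCP server. The base URL `http://localhost:18080` (Spark's default History Server port) and both helper names are assumptions for illustration.

```python
import json
import urllib.request


def applications_url(base_url, limit=5):
    """Build the REST URL that lists applications known to the History Server."""
    return f"{base_url.rstrip('/')}/api/v1/applications?limit={limit}"


def list_applications(base_url, limit=5, timeout=10):
    """Return (id, name) pairs for the applications the server reports."""
    url = applications_url(base_url, limit)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return [(app["id"], app["name"]) for app in json.load(resp)]


if __name__ == "__main__":
    # Placeholder URL; replace with your own History Server address.
    for app_id, name in list_applications("http://localhost:18080"):
        print(app_id, name)
```

If this call fails or returns an empty list, the tools on top of the same server will have nothing to analyze either.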
Source & Thanks

Source: https://github.com/kubeflow/mcp-apache-spark-history-server · License: Apache-2.0 · GitHub stars: 168 · forks: 59
