{"version":"1.0","workflow_uuid":"62c59e45-5cea-11f1-9bc6-00163e2b0d79","workflow_title":"LM Evaluation Harness — Few-Shot Language Model Benchmarking","recommended_install":{"schema_version":1,"status":"direct","policy":"allow","install_mode":"single","target":"codex","command":"npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target codex","dry_run_command":"npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target codex --dry-run --json","runtime_commands":{"codex":"npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target codex","gemini":"npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target gemini","project":"npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79"},"prompt":"Install and make this TokRepo asset usable: \"LM Evaluation Harness — Few-Shot Language Model Benchmarking\".\n\nPage: https://tokrepo.com/en/workflows/asset-62c59e45\nUUID: 62c59e45-5cea-11f1-9bc6-00163e2b0d79\nTarget adapter: codex\nPolicy/mode: allow / single\n\nRead these machine-facing entry points first:\n- metadata: https://tokrepo.com/metadata/asset-62c59e45.json\n- install contract: https://tokrepo.com/install-contract/asset-62c59e45.json\n- install plan: https://api.tokrepo.com/api/v1/tokenboard/workflows/install-plan?uuid=62c59e45-5cea-11f1-9bc6-00163e2b0d79&target=codex\n- raw content: https://tokrepo.com/raw/asset-62c59e45\n\nCommands:\n- choose the current runtime; do not use unsupported CLI targets.\n- Codex: npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target codex\n- Gemini CLI: npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 --target gemini\n- Claude Code / project agent: run npx -y tokrepo@latest install 62c59e45-5cea-11f1-9bc6-00163e2b0d79 from the project root; add `--dry-run --json` first to inspect the plan.\n- add `--dry-run --json` to the chosen command first; after the plan allows it, run the chosen command.\n\nExecution rules:\n1. Run the dry-run command first to inspect the plan and risks.\n2. Run the install command after the plan allows it.\n3. Run the post_verify checks from the install plan.\n\nReport back with changed files, verification result, and how to use the asset next.","next_steps":["Run the dry-run command first to inspect the plan and risks.","Run the install command after the plan allows it.","Run the post_verify checks from the install plan."],"success_check":["The asset is installed into the target agent or project location.","The agent can load the entrypoint and explain how to use it."]},"install_contract":{"version":"1.0","installReady":false,"title":"LM Evaluation Harness — Few-Shot Language Model Benchmarking","summary":"A unified framework for evaluating language models across hundreds of benchmarks with reproducible few-shot testing.","assetType":"Configs","pageUrl":"https://tokrepo.com/en/workflows/asset-62c59e45","sourceUrl":"https://github.com/EleutherAI/lm-evaluation-harness","intendedFor":[],"firstActions":[],"agentFirstSteps":[],"targetPaths":[],"verification":[],"startingPoints":[],"example":"","successOutcome":"","boundaries":[],"askUserIf":["the current workspace stack cannot be matched to a safe upstream template","the target path is not the project root, or an existing file should be merged instead of overwritten"]}}