Apr 9, 2026 · 2 min read

Latitude — AI Agent Engineering Platform

Open-source platform for building, evaluating, and monitoring AI agents in production. Observability, prompt playground, LLM-as-judge evals, experiment comparison. LGPL-3.0, 4,000+ stars.

AI Open Source · Community
Quick Use

Use it first, then decide how deep to go

Copy and run the steps below to get Latitude working before deciding how deeply to integrate it.

  1. Try Latitude Cloud at latitude.so

  2. Or self-host with Docker:

git clone https://github.com/latitude-dev/latitude-llm.git
cd latitude-llm
docker compose up -d
  3. Connect your AI application and start monitoring prompts, responses, and performance.

Intro

Latitude is an open-source AI agent engineering platform with 4,000+ GitHub stars. It provides full observability for LLM applications — capturing prompts, inputs/outputs, tool calls, and performance metrics. It features a prompt playground for iteration, dataset curation for testing, LLM-as-judge evaluations, experiment comparison across models, and automated evaluation guards. It is best suited for teams building production AI applications who need visibility into how their agents perform and tools to improve them.

Explore more AI development tools on TokRepo AI Open Source.


Latitude — Build, Evaluate, Monitor AI Agents

The Problem

Building AI agents is easy. Making them work reliably in production is hard. You need to see what prompts are sent, what responses come back, how tool calls behave, and whether quality is improving or degrading over time.

The Solution

Latitude gives you full visibility into your AI pipeline with tools to evaluate and improve agent performance.

Key Features

Observability: Capture prompts, I/O, tool calls, latency, and costs
Prompt Playground: Iterate on prompts with instant feedback
Datasets: Curate test data for consistent evaluation
Evaluations: LLM-as-judge, custom metrics, automated grading
Experiments: Compare performance across models and providers
Annotations: Label and cluster issues in agent responses
Guards: Automated evaluation checks before responses ship
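To make the Guards idea concrete, here is a minimal, SDK-independent sketch of a pre-ship check: block any response whose evaluation score falls below a threshold. The names `runGuard`, `EvalScore`, and `GuardResult` are our own illustrations, not part of the Latitude API.

```typescript
// Hypothetical guard sketch: gate a response on per-criterion eval scores.
// Not the Latitude SDK API; shapes and names are assumptions.
type EvalScore = { criterion: string; score: number }; // score in [0, 1]

interface GuardResult {
  pass: boolean;
  failing: string[]; // criteria that fell below the threshold
}

function runGuard(scores: EvalScore[], threshold = 0.7): GuardResult {
  const failing = scores
    .filter((s) => s.score < threshold)
    .map((s) => s.criterion);
  return { pass: failing.length === 0, failing };
}

// Example: a response that scored low on accuracy gets blocked
const verdict = runGuard([
  { criterion: "relevance", score: 0.9 },
  { criterion: "accuracy", score: 0.4 },
]);
// verdict.pass === false, verdict.failing === ["accuracy"]
```

The same shape generalizes to any pre-ship check: swap the threshold rule for regex filters, PII detectors, or a second judge call.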

Integration

import { Latitude } from "@latitude-data/sdk";

const latitude = new Latitude("your-api-key");

// Log a prompt-response pair
await latitude.log({
  prompt: "Summarize this document...",
  response: "The document discusses...",
  model: "claude-sonnet-4-20250514",
  duration_ms: 1200,
  tokens: { input: 500, output: 150 }
});
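The `duration_ms` field above has to be measured somewhere. A minimal, SDK-independent way is to wrap the model call in a timing helper; `withTiming` here is our own sketch, not part of the Latitude SDK.

```typescript
// Generic timing wrapper: runs an async model call and reports its
// wall-clock duration, ready to pass into a logging call. Illustrative only.
async function withTiming<T>(
  fn: () => Promise<T>
): Promise<{ result: T; duration_ms: number }> {
  const start = Date.now();
  const result = await fn();
  return { result, duration_ms: Date.now() - start };
}

// Usage: time a (stubbed) model call, then forward result and duration
withTiming(async () => "The document discusses...").then(
  ({ result, duration_ms }) => {
    console.log(duration_ms, result);
  }
);
```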

Evaluation Example

// Run LLM-as-judge evaluation
const result = await latitude.evaluate({
  input: userQuery,
  output: agentResponse,
  criteria: [
    "relevance",
    "accuracy",
    "helpfulness"
  ],
  judge_model: "gpt-4o"
});
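What you do with the judge's per-criterion scores is up to you. One common pattern, shown here as a generic sketch (the `scores` shape is an assumption, not Latitude's actual result schema), is averaging them into a single pass/fail verdict:

```typescript
// Aggregate per-criterion judge scores (each in [0, 1]) into one verdict.
// The input shape is an assumption, not Latitude's result schema.
function aggregateScores(scores: Record<string, number>, passAt = 0.75) {
  const values = Object.values(scores);
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return { mean, pass: mean >= passAt };
}

const overall = aggregateScores({
  relevance: 0.9,
  accuracy: 0.8,
  helpfulness: 0.7,
});
// mean ≈ 0.8, pass === true
```

A weighted mean or a "fail if any criterion fails" rule are equally valid; the right aggregation depends on how costly a bad response is in your application.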

FAQ

Q: What is Latitude? A: An open-source platform for building, evaluating, and monitoring AI agents in production. It provides observability, prompt management, LLM evaluations, and experiment comparison.

Q: Is Latitude free? A: The self-hosted version is free under LGPL-3.0. Latitude Cloud has a free tier for smaller projects.

Q: How is Latitude different from Langfuse? A: Latitude focuses on the full agent engineering lifecycle — from prompt iteration to evaluation to monitoring — with built-in LLM-as-judge capabilities and experiment comparison.


🙏

Source & Thanks

Created by Latitude. Licensed under LGPL-3.0.

latitude-llm — ⭐ 4,000+

Thanks to the Latitude team for making AI agent engineering more transparent and reliable.
