Scripts · May 13, 2026 · 2 min read

Arize Phoenix — Open Source AI Observability and Evaluation

Arize Phoenix is an open-source platform for monitoring, evaluating, and debugging AI applications, providing tracing, experiment tracking, and automated evaluation for LLM and ML pipelines.

Introduction

Arize Phoenix is an open-source observability platform for AI applications. It provides tracing, evaluation, and experiment tracking for LLM apps, RAG pipelines, and traditional ML models, helping teams understand model behavior, catch regressions, and iterate on prompt quality.

What Arize Phoenix Does

  • Traces LLM calls, retrieval steps, and tool usage in AI pipelines
  • Evaluates outputs with built-in and custom LLM-as-judge evaluators
  • Visualizes embedding spaces to detect data drift and clustering issues
  • Tracks experiments across prompt versions and model configurations
  • Integrates with OpenTelemetry for standardized instrumentation

Architecture Overview

Phoenix runs as a local web server backed by a trace store. It collects OpenTelemetry spans from instrumented applications, storing them for analysis and visualization. The evaluation engine runs LLM-based judges or custom scoring functions against collected traces. A React-based UI provides interactive exploration of traces, evaluations, and embedding projections.
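The span-collection flow described above can be sketched in miniature: spans arrive from instrumented applications, are grouped by trace ID, and are then available for analysis. This is a simplified pure-Python illustration, not Phoenix's implementation; the class and field names are made up, and real OpenTelemetry spans also carry timing, status, and parent-child links.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """One step in a pipeline: an LLM call, a retrieval, a tool use."""
    trace_id: str
    name: str
    attributes: dict = field(default_factory=dict)

class TraceStore:
    """Groups incoming spans by trace ID, the way a trace store
    organizes OpenTelemetry spans for later visualization."""
    def __init__(self):
        self._traces = {}

    def ingest(self, span: Span) -> None:
        # Append each span to the trace it belongs to.
        self._traces.setdefault(span.trace_id, []).append(span)

    def trace(self, trace_id: str) -> list:
        return self._traces.get(trace_id, [])

store = TraceStore()
store.ingest(Span("t1", "retrieve", {"top_k": 4}))
store.ingest(Span("t1", "llm_call", {"model": "gpt-4o"}))
print(len(store.trace("t1")))  # → 2
```

The evaluation engine and UI then read from this store; the key design point is that ingestion is decoupled from analysis, so instrumented apps only need to emit spans.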

Self-Hosting & Configuration

  • Install via pip and launch with phoenix serve
  • Instrument your app with the OpenTelemetry-based Phoenix SDK
  • Supports auto-instrumentation for LangChain, LlamaIndex, OpenAI, and more
  • Configure storage backend (SQLite default, PostgreSQL for production)
  • Deploy via Docker for team-wide access
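The storage choice in the list above (SQLite by default, PostgreSQL for production) is typically driven by an environment variable. The sketch below shows the fallback logic; `PHOENIX_SQL_DATABASE_URL` is the variable the Phoenix docs describe for this, but verify the name and the default SQLite path against the current documentation before relying on them.

```python
def resolve_database_url(env: dict) -> str:
    """Pick the trace-store backend: PostgreSQL when a URL is set,
    otherwise fall back to a local SQLite file (illustrative path)."""
    default_sqlite = "sqlite:///phoenix.db"
    return env.get("PHOENIX_SQL_DATABASE_URL", default_sqlite)

# No variable set → local SQLite, suitable for a single developer.
print(resolve_database_url({}))  # → sqlite:///phoenix.db

# Production: point at a shared PostgreSQL instance.
print(resolve_database_url(
    {"PHOENIX_SQL_DATABASE_URL": "postgresql://user:pw@db:5432/phoenix"}
))  # → postgresql://user:pw@db:5432/phoenix
```

In a Docker deployment the same variable would be passed into the container, so the image itself stays storage-agnostic.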

Key Features

  • OpenTelemetry-native tracing for LLM applications
  • Built-in LLM evaluators for relevance, hallucination, and toxicity
  • Embedding visualization with UMAP dimensionality reduction
  • Experiment tracking for A/B testing prompt and model changes
  • Works with any LLM provider (OpenAI, Anthropic, local models)
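The LLM-as-judge pattern behind the built-in evaluators can be sketched as follows: a judge model is prompted to label each output, and the labels are aggregated into a score. The judge below is a deterministic stub standing in for a real LLM call to any provider; the prompt wording and function names are illustrative, not Phoenix's.

```python
def stub_judge(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call; a real judge would send
    # the prompt to a model and parse its reply into a label.
    return "relevant" if "Paris" in prompt else "irrelevant"

def evaluate_relevance(question: str, answer: str, judge) -> bool:
    """Ask the judge whether the answer addresses the question."""
    prompt = (
        "Is the answer relevant to the question?\n"
        f"Question: {question}\nAnswer: {answer}\n"
        "Reply 'relevant' or 'irrelevant'."
    )
    return judge(prompt) == "relevant"

results = [
    evaluate_relevance("Capital of France?", "Paris", stub_judge),
    evaluate_relevance("Capital of France?", "The moon", stub_judge),
]
print(sum(results) / len(results))  # → 0.5 (fraction judged relevant)
```

Custom evaluators follow the same shape: anything that maps a trace's inputs and outputs to a label or score can run over the collected traces.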

Comparison with Similar Tools

  • Langfuse — open-source LLM observability; Phoenix adds embedding analysis and richer evaluation
  • LangSmith — LangChain's hosted tracing platform; Phoenix is fully open-source and self-hosted
  • Weights & Biases — general ML experiment tracking; Phoenix is purpose-built for LLM observability
  • Helicone — LLM proxy with logging; Phoenix provides deeper trace analysis and evaluation

FAQ

Q: Does Phoenix work with non-LLM models? A: Yes, it supports embedding visualization and evaluation for traditional ML models as well.

Q: Can I run Phoenix in production? A: Yes, deploy with PostgreSQL storage and Docker for persistent, team-accessible observability.

Q: How does tracing work? A: Phoenix uses OpenTelemetry-compatible instrumentation. Add a few lines of code or use auto-instrumentors for popular frameworks.
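The "few lines of code" in the answer above usually take the shape of a wrapper that records a span around each call, which is also what auto-instrumentors do to framework internals. This pure-stdlib sketch captures that shape; Phoenix's real SDK emits OpenTelemetry spans to a collector rather than appending to a list.

```python
import functools
import time

SPANS = []  # stand-in for an exporter that would ship spans to Phoenix

def traced(name):
    """Record a span (name, duration) around each call of the
    decorated function, even when it raises."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                SPANS.append({
                    "name": name,
                    "duration_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator

@traced("llm_call")
def fake_llm(prompt: str) -> str:
    return prompt.upper()  # stand-in for a real model call

fake_llm("hello")
print(SPANS[0]["name"])  # → llm_call
```

Auto-instrumentation applies this same wrapping to a framework's own call sites, which is why LangChain or LlamaIndex apps can be traced without touching application code.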

Q: Is there a cloud-hosted version? A: Arize offers a commercial cloud platform, but Phoenix itself is fully open-source and self-hostable.
