# Agenta — Open-Source LLMOps Platform

> Prompt playground, evaluation, and observability in one platform. Compare prompts, run evals, trace production calls. 4K+ stars.

## Install

```bash
pip install agenta
```

## Quick Use

```python
import agenta as ag

ag.init()

@ag.instrument()
def generate_response(prompt: str, model: str = "gpt-4o"):
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Calls are automatically traced and logged
result = generate_response("Explain RAG in 2 sentences")
```

Launch the full platform:

```bash
agenta init
agenta serve
```

Open `http://localhost:3000` — the LLMOps dashboard is ready.

---

## Intro

Agenta is an open-source LLMOps platform with 4,000+ GitHub stars that combines prompt engineering, evaluation, and observability in a single tool. It provides a visual prompt playground for iterating on prompts, automated evaluation pipelines for measuring quality, A/B testing for comparing prompt variants, and production tracing for monitoring live applications. Instead of juggling separate tools for each stage of the LLM development lifecycle, Agenta unifies them into one self-hostable platform.

Works with: OpenAI, Anthropic, Google, Mistral, local models, LangChain, LlamaIndex.

Best for teams iterating on LLM applications who need prompt management, evaluation, and observability together. Setup time: under 5 minutes.

---

## Agenta LLMOps Workflow

### 1. Prompt Playground

Visual interface for iterating on prompts:

- Side-by-side prompt comparison
- Variable injection for testing with different inputs
- Model parameter tuning (temperature, max_tokens, etc.)
- Version history with full diff view

### 2. Evaluation

```python
import agenta as ag

# Define a custom evaluator
@ag.evaluator()
def check_accuracy(output: str, reference: str) -> float:
    # Custom scoring logic
    return 1.0 if reference.lower() in output.lower() else 0.0

# Run evaluation on a dataset
results = ag.evaluate(
    app="my-chatbot",
    dataset="test-questions",
    evaluators=["check_accuracy", "coherence", "relevance"],
)
print(f"Accuracy: {results['check_accuracy']:.2%}")
```

Built-in evaluators:

- Faithfulness (factual accuracy)
- Relevance (answer matches question)
- Coherence (logical flow)
- Toxicity detection
- Custom Python evaluators

### 3. A/B Testing

```
Variant A: "You are a helpful assistant. Answer concisely."
Variant B: "You are an expert. Provide detailed explanations."
```

|           | Accuracy | Latency | Cost   |
|-----------|----------|---------|--------|
| Variant A | 82%      | 1.2s    | $0.003 |
| Variant B | 91%      | 2.8s    | $0.008 |
| Winner    | B        | A       | A      |

### 4. Production Observability

```python
import agenta as ag

ag.init(api_key="ag-...", host="https://agenta.yourdomain.com")

@ag.instrument()
def rag_pipeline(query: str):
    # Each step is traced
    docs = retrieve_documents(query)
    context = format_context(docs)
    answer = generate_answer(query, context)
    return answer

# Dashboard shows:
# - Request/response for each call
# - Latency breakdown by step
# - Token usage and costs
# - Error rates and patterns
```

### Self-Hosting

```bash
# Docker Compose deployment
git clone https://github.com/Agenta-AI/agenta.git
cd agenta
docker compose up -d
```

---

## FAQ

**Q: What is Agenta?**
A: Agenta is an open-source LLMOps platform with 4,000+ GitHub stars that unifies a prompt playground, evaluation, A/B testing, and production observability in a single self-hostable tool.

**Q: How is Agenta different from Langfuse or LangSmith?**
A: Langfuse focuses on observability/tracing. LangSmith is LangChain-specific.
Agenta uniquely combines prompt engineering (playground), evaluation (automated evals), and observability (production tracing) in one platform, covering the full LLM development lifecycle.

**Q: Is Agenta free?**
A: The open-source version is free to self-host. Agenta also offers a managed cloud service.

---

## Source & Thanks

> Created by [Agenta AI](https://github.com/Agenta-AI). Licensed under Apache-2.0.
>
> [agenta](https://github.com/Agenta-AI/agenta) — ⭐ 4,000+

---

Source: https://tokrepo.com/en/workflows/fc2ccb05-1663-4a13-9e47-0d496d058aa2
Author: TokRepo精选