Agenta LLMOps Workflow
1. Prompt Playground
Visual interface for iterating on prompts:
- Side-by-side prompt comparison
- Variable injection for testing with different inputs
- Model parameter tuning (temperature, max_tokens, etc.)
- Version history with full diff view
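The playground's variable injection can be pictured as plain template substitution: the same prompt skeleton is rendered against different test inputs while model parameters are tuned alongside it. A minimal plain-Python sketch (not the Agenta SDK; all names here are illustrative):

```python
from string import Template

# A prompt template with injectable variables, as in the playground
prompt_template = Template("You are a $role. Answer the question: $question")

# Different test inputs exercising the same template
test_inputs = [
    {"role": "helpful assistant", "question": "What is LLMOps?"},
    {"role": "domain expert", "question": "Why version prompts?"},
]

# Model parameters you would tune alongside the prompt text
params = {"temperature": 0.7, "max_tokens": 256}

for values in test_inputs:
    print(prompt_template.substitute(values), params)
```

The playground does this interactively; the point is that prompt text, input variables, and model parameters are three independent axes you can vary.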
2. Evaluation
```python
import agenta as ag

# Define a custom evaluator
@ag.evaluator()
def check_accuracy(output: str, reference: str) -> float:
    # Custom scoring logic: substring match against the reference answer
    return 1.0 if reference.lower() in output.lower() else 0.0

# Run evaluation on a dataset
results = ag.evaluate(
    app="my-chatbot",
    dataset="test-questions",
    evaluators=["check_accuracy", "coherence", "relevance"],
)
print(f"Accuracy: {results['check_accuracy']:.2%}")
```
Built-in evaluators:
- Faithfulness (factual accuracy)
- Relevance (answer matches question)
- Coherence (logical flow)
- Toxicity detection
- Custom Python evaluators
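A custom evaluator is just a scoring function: output in, score out. As an illustration of that contract, a hedged sketch of a keyword-based toxicity check (illustrative only; Agenta's built-in toxicity detection is not implemented this way, and a real detector would use a classifier):

```python
# Illustrative custom evaluator: flags outputs containing blocked terms.
# The blocklist is a toy example; only the evaluator contract matters here.
BLOCKED_TERMS = {"idiot", "stupid", "hate"}

def toxicity_score(output: str) -> float:
    words = set(output.lower().split())
    # 1.0 = a blocked term was found, 0.0 = clean
    return 1.0 if words & BLOCKED_TERMS else 0.0

print(toxicity_score("I hate this product"))  # 1.0
print(toxicity_score("This works great"))     # 0.0
```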
3. A/B Testing
Variant A: "You are a helpful assistant. Answer concisely."
Variant B: "You are an expert. Provide detailed explanations."
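The per-metric comparison that follows can be reproduced with a small harness. A sketch assuming hypothetical, hand-entered metric values (not Agenta's API; the numbers are the ones reported in the table below):

```python
# Pick the better variant per metric: higher is better for accuracy,
# lower is better for latency and cost.
results = {
    "A": {"accuracy": 0.82, "latency_s": 1.2, "cost_usd": 0.003},
    "B": {"accuracy": 0.91, "latency_s": 2.8, "cost_usd": 0.008},
}

def winner(metric: str, higher_is_better: bool) -> str:
    pick = max if higher_is_better else min
    return pick(results, key=lambda v: results[v][metric])

print(winner("accuracy", True))    # B
print(winner("latency_s", False))  # A
print(winner("cost_usd", False))   # A
```

Which metric wins depends on what you optimize for; there is rarely a single best variant.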
| Variant | Accuracy | Latency | Cost |
|---|---|---|---|
| Variant A | 82% | 1.2s | $0.003 |
| Variant B | 91% | 2.8s | $0.008 |
| Winner | B | A | A |

4. Production Observability
```python
import agenta as ag

ag.init(api_key="ag-...", host="https://agenta.yourdomain.com")

@ag.instrument()
def rag_pipeline(query: str):
    # Each step is traced
    docs = retrieve_documents(query)
    context = format_context(docs)
    answer = generate_answer(query, context)
    return answer

# Dashboard shows:
# - Request/response for each call
# - Latency breakdown by step
# - Token usage and costs
# - Error rates and patterns
```
Self-Hosting
```shell
# Docker Compose deployment
git clone https://github.com/Agenta-AI/agenta.git
cd agenta
docker compose up -d
```
FAQ
Q: What is Agenta?
A: Agenta is an open-source LLMOps platform with 4,000+ GitHub stars that unifies a prompt playground, evaluation, A/B testing, and production observability in a single self-hostable tool.

Q: How is Agenta different from Langfuse or LangSmith?
A: Langfuse focuses on observability/tracing, and LangSmith is LangChain-specific. Agenta combines prompt engineering (playground), evaluation (automated evals), and observability (production tracing) in one platform, covering the full LLM development lifecycle.

Q: Is Agenta free?
A: The open-source version is free to self-host. Agenta also offers a managed cloud service.