# LlamaIndex — Data Framework for LLM Applications

> Leading data framework for connecting LLMs to external data. LlamaIndex handles ingestion, indexing, retrieval, and query engines for building production RAG applications.

## Install

```bash
pip install llama-index
```

## Quick Use

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load documents
documents = SimpleDirectoryReader("./docs").load_data()

# Build index (auto-embeds and stores)
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the refund policy?")
print(response)
```

## What is LlamaIndex?

LlamaIndex is a data framework that connects LLMs to your data. It handles the entire RAG pipeline — data ingestion from 160+ sources, chunking, embedding, indexing, retrieval, and response synthesis. While LangChain focuses on chains and agents, LlamaIndex focuses on making your data queryable by LLMs. The two are complementary and often used together.

**Answer-Ready**: LlamaIndex is a data framework for LLM applications. It handles RAG end-to-end: 160+ data connectors, automatic chunking/embedding, multiple index types, and query engines. Used for production document Q&A, chatbots, and knowledge bases. LlamaCloud offers managed RAG. 38k+ GitHub stars.

**Best for**: Teams building document Q&A and RAG applications.

**Works with**: OpenAI, Claude, any LLM; 20+ vector stores.

**Setup time**: Under 3 minutes.

## Core Features
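Before embedding, the quick-start pipeline above splits documents into overlapping chunks. As a rough, dependency-free illustration of what that step does (a hypothetical helper — not LlamaIndex's actual `SentenceSplitter`, which is token- and sentence-aware), fixed-size chunking with overlap looks like:

```python
def chunk_text(text: str, chunk_size: int = 64, overlap: int = 16) -> list[str]:
    """Split text into overlapping word-window chunks.

    Illustrative only -- windows over whitespace-separated words,
    stepping forward by (chunk_size - overlap) each time.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the tail
    return chunks


doc = " ".join(f"word{i}" for i in range(100))
chunks = chunk_text(doc, chunk_size=40, overlap=10)
print(len(chunks))           # 3
print(chunks[1].split()[0])  # word30 -- second window starts 30 words in
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, which is why retrieval quality usually improves with a modest overlap.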
### 1. Data Connectors (160+)

```python
# Local files
from llama_index.core import SimpleDirectoryReader
docs = SimpleDirectoryReader("./data").load_data()

# Web
from llama_index.readers.web import SimpleWebPageReader
docs = SimpleWebPageReader().load_data(["https://docs.example.com"])

# Database
from llama_index.readers.database import DatabaseReader
reader = DatabaseReader(uri="postgresql://...")
docs = reader.load_data(query="SELECT * FROM articles")

# APIs: Notion, Slack, Google Drive, GitHub, Confluence, etc.
```

### 2. Index Types

| Index | Best For |
|-------|----------|
| VectorStoreIndex | Semantic search (default) |
| SummaryIndex | Summarization tasks |
| TreeIndex | Hierarchical data |
| KeywordTableIndex | Keyword-based retrieval |
| KnowledgeGraphIndex | Entity relationships |

### 3. Query Engines

```python
# Simple Q&A
query_engine = index.as_query_engine()

# Chat (with memory)
chat_engine = index.as_chat_engine()

# With reranking
from llama_index.postprocessor.cohere_rerank import CohereRerank
query_engine = index.as_query_engine(
    node_postprocessors=[CohereRerank(top_n=3)],
)

# Sub-question decomposition
from llama_index.core.query_engine import SubQuestionQueryEngine
query_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=[...])
```

### 4. Agents

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

tools = [
    QueryEngineTool.from_defaults(query_engine=policy_engine, name="policy", description="Company policies"),
    QueryEngineTool.from_defaults(query_engine=product_engine, name="product", description="Product documentation"),
]

agent = ReActAgent.from_tools(tools)
response = agent.chat("What is the return policy for electronics?")
```
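Under the hood, VectorStoreIndex retrieval reduces to nearest-neighbor search over embeddings. A toy, self-contained sketch of top-k retrieval by cosine similarity — hand-rolled 3-d vectors stand in for real model embeddings, and the function names are hypothetical, not library API:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], doc_vecs: dict, k: int = 2) -> list[str]:
    """Rank document ids by cosine similarity to the query vector."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy "embeddings" -- a real index would store a model's output vectors.
docs = {
    "refunds":  [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.9, 0.0],
    "warranty": [0.5, 0.5, 0.1],
}
print(top_k([1.0, 0.0, 0.0], docs, k=2))  # ['refunds', 'warranty']
```

A real vector store replaces the linear scan with an approximate-nearest-neighbor structure, but the ranking principle — highest cosine similarity wins — is the same one the query engines above rely on.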
### 5. Evaluation

```python
from llama_index.core.evaluation import FaithfulnessEvaluator, RelevancyEvaluator

faithfulness = FaithfulnessEvaluator()
relevancy = RelevancyEvaluator()

result = faithfulness.evaluate_response(query="...", response=response)
print(f"Faithful: {result.passing}")
```

## LlamaIndex vs LangChain

| Aspect | LlamaIndex | LangChain |
|--------|------------|-----------|
| Focus | Data + RAG | Chains + Agents |
| Strength | Data ingestion, indexing | Orchestration, tool use |
| RAG quality | Advanced (reranking, sub-questions) | Basic |
| Learning curve | Moderate | Steep |
| Best for | Document Q&A | Complex agent workflows |

## FAQ

**Q: Can I use Claude with LlamaIndex?**
A: Yes, set `llm = Anthropic(model="claude-sonnet-4-20250514")` as the LLM backend.

**Q: What is LlamaCloud?**
A: Managed RAG infrastructure by LlamaIndex. It handles parsing, indexing, and retrieval as a service.

**Q: Can I use it with LangChain?**
A: Yes, LlamaIndex query engines can be used as LangChain tools. They are complementary.

## Source & Thanks

> Created by [LlamaIndex](https://github.com/run-llama). Licensed under MIT.
>
> [run-llama/llama_index](https://github.com/run-llama/llama_index) — 38k+ stars

---

Source: https://tokrepo.com/en/workflows/06bf6906-8f31-45d4-b0ae-008f3acb4d14
Author: Prompt Lab
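One last illustration of the Evaluation feature above: `FaithfulnessEvaluator` uses an LLM judge, which needs a configured model and API key. A crude, dependency-free stand-in — lexical term overlap between response and retrieved context — conveys the idea of a groundedness score (illustrative only; this is not the library's method):

```python
def overlap_score(response: str, context: str) -> float:
    """Fraction of response terms that also appear in the context.

    A crude lexical proxy for faithfulness -- real evaluators ask an
    LLM whether each claim in the response is supported by the context.
    """
    resp_terms = set(response.lower().split())
    ctx_terms = set(context.lower().split())
    if not resp_terms:
        return 0.0
    return len(resp_terms & ctx_terms) / len(resp_terms)

context = "refunds are accepted within 30 days with a receipt"
grounded = "refunds accepted within 30 days"
ungrounded = "refunds take six weeks to process"
print(overlap_score(grounded, context))    # 1.0 -- every term appears in context
print(round(overlap_score(ungrounded, context), 2))  # 0.17 -- mostly unsupported
```

In practice you would threshold the LLM judge's verdict (`result.passing`) rather than a lexical score, but the shape of the check — compare what was answered against what was retrieved — is the same.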