
Skills · Mar 29, 2026 · 1 min read

LlamaIndex — Data Framework for LLM Applications

Connect your data to large language models. The leading framework for RAG, document indexing, knowledge graphs, and structured data extraction.

Script Depot · Community

Agent ready

Agent installation ready

This asset can be installed after choosing the runtime, verifying the plan, and running the appropriate command.

Native · 98/100 · Policy: allow

Agent surface

Any MCP/CLI agent

Type

Skill

Installation

Single

Trust

Trust: Established

Entry point

LlamaIndex — Data Framework for LLM Applications

Direct install command

npx -y tokrepo@latest install 1bd234e2-5c10-459f-91f4-00675625103b --target codex

Run after confirming the plan in a dry run.

TL;DR

LlamaIndex connects private data to LLMs through indexing, retrieval, and structured extraction.

§01

What it is

LlamaIndex is a data framework that connects your private data to large language models. It provides tools for document ingestion, indexing, retrieval-augmented generation (RAG), knowledge graph construction, and structured data extraction. You feed it documents, databases, or APIs, and it creates queryable indexes that LLMs use to answer questions grounded in your data.

LlamaIndex targets developers building AI applications that need to work with private or domain-specific data. Instead of fine-tuning a model, you use LlamaIndex to retrieve relevant context at query time and inject it into the LLM prompt.

§02

Why it saves time or tokens

Naive approaches stuff entire documents into the prompt, wasting tokens on irrelevant content. LlamaIndex's indexing and retrieval pipeline passes only the most relevant chunks to the model, which reduces per-query token consumption. Chunking strategies, reranking, and metadata filtering further improve the quality of the retrieved context, so the LLM can produce better answers with fewer tokens.
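
To make the token argument concrete, here is a toy sketch in plain Python. It is not how LlamaIndex retrieves: word overlap stands in for embedding similarity, word counts stand in for tokens, and the sample document, the query, and the helper names (`chunk_words`, `overlap_score`, `top_k_chunks`) are invented for illustration.

```python
# Toy illustration only: word-overlap scoring stands in for the embedding
# similarity search a real retrieval pipeline would use.


def chunk_words(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i : i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def overlap_score(query: str, chunk: str) -> int:
    """Count distinct query words that also appear in the chunk."""
    query_terms = set(query.lower().split())
    chunk_terms = set(chunk.lower().split())
    return len(query_terms & chunk_terms)


def top_k_chunks(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks with the highest overlap score."""
    return sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)[:k]


# Hypothetical document: one relevant sentence among unrelated ones.
document = (
    "The quarterly report covers revenue growth in Europe. "
    "Office plants were watered every Tuesday by the facilities team. "
    "Key findings show churn dropped after the onboarding redesign. "
    "The cafeteria menu rotated weekly during the reporting period."
)
query = "key findings churn"

chunks = chunk_words(document, chunk_size=9)
selected = top_k_chunks(query, chunks, k=1)

# Compare the size of the two candidate prompt contexts.
full_prompt_words = len(document.split())
retrieved_prompt_words = sum(len(c.split()) for c in selected)
print(f"whole document: {full_prompt_words} words")
print(f"retrieved context: {retrieved_prompt_words} words")
```

Only the selected chunk would go into the prompt, so the context shrinks to a fraction of the document while still containing the sentence that answers the query.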

§03

How to use

1. Install LlamaIndex: pip install llama-index
2. Load documents using one of the 300+ data connectors (PDF, web, database, API)
3. Create an index and query it in natural language

§04

Example

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# By default LlamaIndex uses OpenAI for both the LLM and the embedding
# model, so this example expects OPENAI_API_KEY to be set.

# Load documents from a directory
documents = SimpleDirectoryReader('./data').load_data()

# Create a vector index (chunks and embeds the documents)
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query('What are the key findings?')
print(response)

Component	Purpose
Data Connectors	Ingest from 300+ sources
Index	Store and organize data for retrieval
Query Engine	Natural language interface to data
Response Synthesizer	Compose LLM answers from chunks
Agent	Autonomous data interaction
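
The components in the table can also be wired together explicitly instead of through `index.as_query_engine()`. The sketch below is one possible arrangement, assuming `llama-index` is installed, a `./data` directory exists, and the default OpenAI models are available. The function name `build_query_engine` and its parameters are hypothetical.

```python
def build_query_engine(data_dir: str = "./data", top_k: int = 3):
    """Wire a retriever and a response synthesizer into a query engine by hand."""
    # Imported inside the function so this module can be loaded even
    # before llama-index is installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        VectorStoreIndex,
        get_response_synthesizer,
    )
    from llama_index.core.query_engine import RetrieverQueryEngine
    from llama_index.core.retrievers import VectorIndexRetriever

    # Data connector: read local files into Document objects.
    documents = SimpleDirectoryReader(data_dir).load_data()

    # Index: chunk and embed the documents for similarity search.
    index = VectorStoreIndex.from_documents(documents)

    # Retriever: fetch the top_k most similar chunks for a query.
    retriever = VectorIndexRetriever(index=index, similarity_top_k=top_k)

    # Response synthesizer: compose one LLM answer from the retrieved chunks.
    synthesizer = get_response_synthesizer()

    # Query engine: retriever plus synthesizer behind a single query() call.
    return RetrieverQueryEngine(
        retriever=retriever,
        response_synthesizer=synthesizer,
    )


# Example use (needs documents in ./data and a configured LLM):
# engine = build_query_engine(top_k=5)
# print(engine.query("What are the key findings?"))
```

Building the pieces separately makes it easier to swap one of them later, for example a different retriever, without touching the rest.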

§05

Related on TokRepo

AI tools for RAG — RAG frameworks and tools on TokRepo
AI tools for documents — document processing and extraction tools

§06

Common pitfalls

Default chunking settings may split content at bad boundaries; customize chunk_size and chunk_overlap for your document type
Using the default in-memory vector store works for prototyping but does not persist across restarts; switch to a persistent store like ChromaDB or Weaviate for production
LlamaIndex's many abstractions can be confusing for newcomers; start with the simple VectorStoreIndex and add complexity only when needed
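
The first two pitfalls can be addressed in a few lines. This is a minimal sketch, assuming `llama-index` is installed. It uses LlamaIndex's built-in on-disk persistence rather than ChromaDB or Weaviate, which need additional integration packages. The chunk values, directory names, and the function name `build_or_load_index` are illustrative assumptions, not recommendations.

```python
import os


def build_or_load_index(data_dir: str = "./data", persist_dir: str = "./storage"):
    """Load a persisted index if one exists; otherwise build and persist it."""
    # Imported inside the function so this module can be loaded even
    # before llama-index is installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )
    from llama_index.core.node_parser import SentenceSplitter

    if os.path.isdir(persist_dir):
        # Reuse the index saved by an earlier run instead of re-embedding.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    documents = SimpleDirectoryReader(data_dir).load_data()

    # Illustrative values; tune chunk_size and chunk_overlap per document type.
    splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
    index = VectorStoreIndex.from_documents(documents, transformations=[splitter])

    # Write the index to disk so it survives restarts.
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

Delete the persist directory after changing the chunk settings or the source documents; otherwise the stale index is loaded again.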

Frequently asked questions

How does LlamaIndex compare to LangChain?

LlamaIndex specializes in data indexing and retrieval, making it the better choice for RAG-heavy applications. LangChain provides broader agent orchestration, tool use, and chain composition. Many projects use both: LlamaIndex for the data layer and LangChain for the orchestration layer. They are complementary rather than competing.

What data sources does LlamaIndex support?

LlamaIndex has 300+ data connectors via LlamaHub, including PDF, Notion, Slack, Google Drive, databases (PostgreSQL, MySQL), APIs, web scraping, and more. Each connector handles authentication and data extraction, outputting standardized Document objects that LlamaIndex can index.

Can LlamaIndex work with local LLMs?

Yes. LlamaIndex supports local models through Ollama, Hugging Face, llama.cpp, and other local inference providers. You configure the LLM and embedding model independently, so you can use a local embedding model with a cloud LLM or vice versa.
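
A minimal configuration sketch for a fully local setup follows. It assumes the `llama-index-llms-ollama` and `llama-index-embeddings-huggingface` packages are installed and an Ollama server is running locally with the model already pulled. The model names and the function name `use_local_models` are placeholders chosen for illustration.

```python
def use_local_models(
    llm_name: str = "llama3",
    embed_name: str = "BAAI/bge-small-en-v1.5",
) -> None:
    """Point LlamaIndex's global settings at a local LLM and embedding model."""
    # Imported inside the function so this module can be loaded even
    # before the optional integration packages are installed.
    from llama_index.core import Settings
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.llms.ollama import Ollama

    # The LLM and the embedding model are configured independently.
    Settings.llm = Ollama(model=llm_name, request_timeout=120.0)
    Settings.embed_model = HuggingFaceEmbedding(model_name=embed_name)


# Call once at startup, before building or querying an index:
# use_local_models()
```

To mix local and cloud models, set only one of the two `Settings` attributes and leave the other at its default.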

What is the difference between VectorStoreIndex and other index types?

VectorStoreIndex stores document chunks as vector embeddings for semantic search. SummaryIndex stores full documents and iterates through them. TreeIndex organizes chunks hierarchically. KeywordTableIndex uses keyword extraction. Choose VectorStoreIndex for most use cases; other types optimize for specific access patterns.

How do I improve RAG accuracy with LlamaIndex?

Tune chunk size for your content, add metadata filtering to narrow results, use reranking to improve retrieval precision, and experiment with hybrid search (vector plus keyword). LlamaIndex provides all these components as configurable modules. Start with default settings and iterate based on evaluation results.
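
To show what hybrid search means in practice, here is a toy sketch of reciprocal rank fusion, one common way to merge a vector ranking with a keyword ranking. It is written in plain Python and is not LlamaIndex's own implementation. The document IDs and the function name `reciprocal_rank_fusion` are made up for the example, and `k=60` is a conventional default, taken here as an assumption.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1; a higher total score ranks first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical semantic results
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical keyword results

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents found by both searches rise to the top, while documents found by only one search are kept but ranked lower.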

Cited sources (3)

LlamaIndex GitHub: LlamaIndex is a data framework for LLM applications
LlamaHub: LlamaIndex provides 300+ data connectors via LlamaHub
LlamaIndex Docs: Retrieval-augmented generation improves LLM accuracy on private data


Source and acknowledgments

Created by LlamaIndex. Licensed under MIT. run-llama/llama_index — 38K+ GitHub stars

