Scripts · March 31, 2026 · 1 min read

Llama Stack — Meta Official LLM App Framework

Official Meta framework for building LLM applications with Llama models. Inference, safety, RAG, agents, evals, and tool use. Standardized APIs. 8.3K+ stars.

TokRepo Picks · Community
Quick Start

Try it first, then decide whether to dig deeper.

Everything you (or your agent) need for step one: what to copy, what to install, and where it goes.

pip install llama-stack
llama stack build --template ollama --image-type conda
llama stack run ollama

Or use the client:

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)

Introduction

Llama Stack is Meta's official framework for building LLM applications with Llama models. It provides standardized APIs for inference, safety (Llama Guard), RAG, agentic workflows, evaluations, tool use, and memory — all designed to work seamlessly with Llama 3, 3.1, and 3.2 models. Deploy locally, in the cloud, or on-device. 8,300+ GitHub stars, MIT licensed.

Best for: Developers building production apps with Meta's Llama models
Works with: Llama 3/3.1/3.2, Ollama, Together, Fireworks, AWS Bedrock, NVIDIA NIM


Core APIs

API        Description
Inference  Chat completion, text generation, embeddings
Safety     Content moderation with Llama Guard / Prompt Guard
Agents     Multi-step agentic workflows with tool use and memory
RAG        Document ingestion, vector search, contextual retrieval
Eval       Benchmark and evaluate model quality
Memory     Persistent memory banks for agent context
Tool Use   Web search, code execution, Wolfram Alpha, custom tools
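To make the Safety row concrete, here is a minimal sketch of screening a user message with a shield before handing it to inference. This is an assumption-laden illustration, not official sample code: the `run_shield` call shape and the `meta-llama/Llama-Guard-3-8B` shield id are based on the published client and may differ in your installed `llama-stack-client` version.

```python
def as_user_messages(*texts):
    # Pure helper: wrap raw strings in the chat-message shape
    # that both the safety and inference APIs accept.
    return [{"role": "user", "content": t} for t in texts]


def moderated_chat(client, model_id, shield_id, text):
    # Assumed flow: run the shield first; only call inference
    # when no violation is flagged.
    messages = as_user_messages(text)
    verdict = client.safety.run_shield(
        shield_id=shield_id, messages=messages, params={}
    )
    if verdict.violation is not None:
        return f"blocked: {verdict.violation.user_message}"
    reply = client.inference.chat_completion(model_id=model_id, messages=messages)
    return reply.completion_message.content


if __name__ == "__main__":
    # Requires a running Llama Stack server (see Quick Start above)
    # with a Llama Guard shield registered.
    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url="http://localhost:8321")
    print(moderated_chat(
        client,
        model_id="meta-llama/Llama-3.1-8B-Instruct",
        shield_id="meta-llama/Llama-Guard-3-8B",
        text="Hello!",
    ))
```

The helper keeps the message-building logic separate from the network calls, so the same shape can be reused for agents or RAG turns.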

Distribution Providers

Run anywhere with pluggable backends:

  • Local: Ollama, vLLM, TGI
  • Cloud: Together, Fireworks, AWS Bedrock, NVIDIA NIM
  • On-device: Qualcomm, MediaTek, PyTorch ExecuTorch

FAQ

Q: What is Llama Stack?
A: Meta's official framework for building LLM apps with Llama models. Provides standardized APIs for inference, safety, RAG, agents, and evals. 8.3K+ stars, MIT licensed.

Q: Can I use Llama Stack with non-Llama models?
A: Llama Stack is designed for Llama models, but inference providers like Ollama and vLLM can serve other models through the same API.
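A hedged sketch of that last answer: once your provider (e.g. Ollama) is serving another model, only the `model_id` in the request changes; the call shape stays the same. The model id below is illustrative, not guaranteed to be registered on your server.

```python
def chat_request(model_id, prompt):
    # Pure helper: the request shape is identical regardless of
    # which model the underlying provider happens to serve.
    return {
        "model_id": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }


if __name__ == "__main__":
    # Requires a running Llama Stack server with an Ollama provider;
    # "mistral-7b-instruct" here is an assumed example id.
    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url="http://localhost:8321")
    response = client.inference.chat_completion(
        **chat_request("mistral-7b-instruct", "Hello!")
    )
    print(response.completion_message.content)
```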



Source & Credits

Created by Meta. Licensed under MIT. meta-llama/llama-stack — 8,300+ GitHub stars
