Smolagents — Lightweight AI Agent Framework
Hugging Face's minimalist agent framework. Build AI agents in ~1,000 lines of code with code-based actions, tool calling, and multi-step reasoning.
What it is
Smolagents is Hugging Face's lightweight agent framework. The core library is approximately 1,000 lines of code. Agents write and execute Python code as their actions instead of outputting JSON tool calls, making them more flexible and easier to debug.
The framework targets researchers and developers who want to prototype agents quickly without the overhead of larger frameworks. It supports OpenAI, Anthropic, Hugging Face, and local models.
How it saves time or tokens
Smolagents uses code-based actions where the agent writes Python directly. This avoids the overhead of defining JSON schemas for every tool and parsing structured outputs. The agent can compose tools, use variables, and write loops -- capabilities that JSON-based tool calling cannot express.
The minimal codebase means fewer abstractions to learn and debug. You can read the entire source in an afternoon.
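The advantage of code actions over JSON tool calls can be illustrated without the library itself. The sketch below is a hand-written stand-in (the tool and the agent-emitted snippet are hypothetical): a JSON tool-calling agent would need one round trip per city, while a code agent can emit a single snippet that loops and aggregates.

```python
# Hypothetical tool the agent has access to (illustrative, not from smolagents).
def get_weather(city: str) -> str:
    return f"Sunny, 22C in {city}"

# The kind of snippet a code agent might emit: one action covering
# three tool invocations, using a loop and a variable -- something a
# single JSON tool call cannot express.
agent_snippet = """
results = {}
for city in ["Paris", "Berlin", "Tokyo"]:
    results[city] = get_weather(city)
"""

# Execute the agent's code with the tool in scope, as a code-action
# executor conceptually does.
namespace = {"get_weather": get_weather}
exec(agent_snippet, namespace)
print(namespace["results"]["Paris"])  # Sunny, 22C in Paris
```

Three tool calls resolve in one model turn instead of three, which is where the token savings come from.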
How to use
- Install with `pip install smolagents`.
- Define tools using the `@tool` decorator with type hints and docstrings.
- Create a `CodeAgent` with your tools and model, then call `agent.run()` with a natural language query.
Example
```python
from smolagents import CodeAgent, tool, HfApiModel

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Sunny, 22C in {city}"

agent = CodeAgent(
    tools=[get_weather],
    model=HfApiModel(),
)

result = agent.run("What is the weather in Paris?")
print(result)
```
Related on TokRepo
- AI tools for agents -- Agent frameworks and orchestration tools
- Multi-agent frameworks -- Compare multi-agent approaches
Common pitfalls
- Code agents execute Python code generated by the LLM. This is powerful but carries security risks. Run agents in sandboxed environments when processing untrusted input.
- The `@tool` decorator requires a docstring. Without one, the agent has no description of what the tool does and will not use it correctly.
- Smolagents is intentionally minimal. If you need built-in memory, RAG, or team orchestration, consider pairing it with a larger framework or building those components yourself.
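The sandboxing advice above can be made concrete without depending on smolagents' own executors. This is a crude, hand-rolled sketch: it isolates LLM-generated code in a separate process with a timeout, but a real sandbox (Docker, gVisor, a remote executor) also restricts filesystem and network access, which a plain subprocess does not.

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Run generated Python in a separate process with a hard timeout.

    Illustration only: this stops infinite loops and keeps crashes out
    of your process, but does NOT restrict file or network access.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.stdout

print(run_untrusted("print(2 + 2)").strip())  # 4
```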
Frequently Asked Questions
How is Smolagents different from LangChain?
Smolagents agents write Python code as actions, while LangChain agents use JSON-based tool calling. Code actions are more expressive -- the agent can use loops, variables, and compose tools in ways JSON cannot. Smolagents is also much smaller: ~1,000 lines vs LangChain's thousands of files.
Which models does Smolagents support?
Smolagents supports OpenAI, Anthropic, the Hugging Face Inference API, and local models. You can use any model that accepts text input and returns text output by implementing a simple model interface.
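The "simple model interface" mentioned above can be sketched as follows, assuming it boils down to messages in, text out. The class name and call signature here are illustrative, not smolagents' exact API -- check the library docs for the real base class.

```python
# Hedged sketch: a custom backend as a callable that takes chat-style
# messages and returns text. Any real backend (local server, custom
# API) would slot into the same shape.

class CannedModel:
    """Stand-in backend that echoes the last user message."""

    def __call__(self, messages: list[dict]) -> str:
        last_user = next(
            m["content"] for m in reversed(messages) if m["role"] == "user"
        )
        return f"echo: {last_user}"

model = CannedModel()
reply = model([{"role": "user", "content": "hello"}])
print(reply)  # echo: hello
```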
Is it safe to run code agents in production?
Code execution agents carry inherent risks since the LLM generates arbitrary Python code. For production use, run agents in Docker containers or sandboxed environments. Smolagents provides a local Python executor by default, so any code the agent writes runs in your process.
Can I use Smolagents with local models?
Yes. Smolagents works with local models through the Hugging Face Transformers library or any OpenAI-compatible API endpoint. Point the model configuration to your local server address.
Does Smolagents support multi-agent workflows?
Smolagents supports a managed agent pattern where one agent can call another agent as a tool. This enables simple multi-agent delegation without requiring a separate orchestration framework.
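The agent-as-tool idea behind the managed agent pattern can be sketched independently of smolagents' actual API (the names below are illustrative): a sub-agent is exposed to a manager simply as a function it can call.

```python
class EchoAgent:
    """Stand-in sub-agent; a real one would call an LLM inside run()."""

    def run(self, task: str) -> str:
        return f"[sub-agent result for: {task}]"

def make_agent_tool(agent: EchoAgent):
    """Wrap an agent so a manager agent can invoke it like any tool."""
    def delegate(task: str) -> str:
        return agent.run(task)
    return delegate

search_tool = make_agent_tool(EchoAgent())
print(search_tool("find smolagents docs"))  # [sub-agent result for: find smolagents docs]
```

The manager never needs to know it is talking to another agent; delegation stays inside the ordinary tool-calling path.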
Citations (3)
- Smolagents GitHub — Smolagents is Hugging Face's lightweight agent framework
- Smolagents Documentation — Agents write and execute Python code as actions
- Hugging Face Blog — Core library is approximately 1,000 lines of code
Source & Thanks
Created by Hugging Face. Licensed under Apache 2.0. huggingface/smolagents
Related Assets
Flax — Neural Network Library for JAX
A high-performance neural network library built on JAX, providing a flexible module system used extensively across Google DeepMind and the JAX research community.
PyCaret — Low-Code Machine Learning in Python
An open-source AutoML library that wraps scikit-learn, XGBoost, LightGBM, CatBoost, and other ML libraries into a unified low-code interface for rapid experimentation.
DGL — Deep Graph Library for Scalable Graph Neural Networks
A high-performance framework for building graph neural networks on top of PyTorch, TensorFlow, or MXNet, designed for both research prototyping and production-scale graph learning.