smolagents
Minimalist HuggingFace agent framework where agents write actions as Python code. ~1,000 lines of core logic. Released December 2024. Current version is 1.24.0 (Mar 2026). Releases frequently — weekly cadence. Two agent types: CodeAgent (writes Python code) and ToolCallingAgent (JSON tool calls).
Warnings
- breaking HfApiModel is removed in smolagents >= 1.0. Use InferenceClientModel instead. Causes ImportError: cannot import name 'HfApiModel'.
- breaking DuckDuckGoSearchTool renamed to WebSearchTool in recent versions; importing the old name fails on newer installs.
- gotcha TransformersModel requires torch and transformers installed separately. pip install smolagents alone is not enough for local model inference.
- gotcha huggingface-hub is pinned to <1.0.0 in smolagents. Installing huggingface-hub>=1.0.0 separately breaks the tokenizer build.
- gotcha ManagedAgent is deprecated. Multi-agent orchestration now uses managed_agents parameter directly on CodeAgent.
- gotcha CodeAgent executes LLM-generated Python code locally by default. Use E2B, Docker, or Modal sandboxing in production.
- gotcha API marked experimental — breaking changes ship without major version bumps. Weekly release cadence. Pin version in production.
Install
- pip install smolagents
- pip install 'smolagents[transformers]'
- pip install 'smolagents[litellm]'
- pip install 'smolagents[openai]'
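Given the weekly release cadence and the huggingface-hub pin noted in Warnings, production installs should pin both packages. A sketch of a requirements.txt fragment (1.24.0 is the version cited above; adjust to the version you tested against):

```
smolagents==1.24.0        # pin exactly; breaking changes ship without major bumps
huggingface-hub<1.0.0     # matches smolagents' own pin; >=1.0.0 breaks the build
```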
Imports
- CodeAgent
from smolagents import CodeAgent, InferenceClientModel
model = InferenceClientModel()
agent = CodeAgent(tools=[], model=model)
agent.run('What is 2+2?')
- LiteLLMModel
from smolagents import LiteLLMModel
model = LiteLLMModel(model_id='anthropic/claude-sonnet-4-5', api_key='...')
- ToolCallingAgent
from smolagents import ToolCallingAgent, InferenceClientModel
model = InferenceClientModel()
agent = ToolCallingAgent(tools=[], model=model)
Quickstart
from smolagents import CodeAgent, InferenceClientModel, WebSearchTool
# Uses HF_TOKEN env var automatically
model = InferenceClientModel()
agent = CodeAgent(
tools=[WebSearchTool()],
model=model
)
result = agent.run('How many seconds would it take a leopard at full speed to cross the Pont des Arts?')
print(result)
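Unlike a ToolCallingAgent, the CodeAgent answers by writing and executing Python. A sketch of the kind of code it might generate for this prompt; the leopard speed (~58 km/h) and bridge length (~155 m) are illustrative figures, not part of the library:

```python
# Assumed figures: leopard top speed ~58 km/h, Pont des Arts length ~155 m.
leopard_speed_kmh = 58
bridge_length_m = 155

speed_ms = leopard_speed_kmh * 1000 / 3600  # convert km/h to m/s
seconds = bridge_length_m / speed_ms
print(round(seconds, 1))  # crossing time in seconds
```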