Foundry Local Integration for Microsoft Agent Framework
agent-framework-foundry-local is a Python package that provides an integration client for the Microsoft Agent Framework to run large language models (LLMs) locally using Foundry Local. It allows developers to build and orchestrate AI agents that leverage local inference capabilities without needing cloud API keys. The package is currently in beta, while the broader Microsoft Agent Framework recently reached version 1.0.0, indicating active development and a rapid release cadence for the ecosystem.
Common errors
- agent_framework.exceptions.ServiceResponseException: <class 'agent_framework_foundry_local._foundry_local_client.FoundryLocalClient'> service failed to complete the prompt: Error code: 500 - { 'type': 'https://tools.ietf.org/html/rfc9110#section-15.6.1', 'title': 'Failed to handle openAI completion', 'status': 500, 'detail': "JsonTypeInfo metadata for type 'System.Collections.Generic.List`1[Betalgo.Ranul.OpenAI.ObjectModels.RequestModels.MessageContent]' was not provided by TypeInfoResolver... Path: $.ContentCalculated." }
  cause: The Foundry Local service (which is .NET based) may not correctly handle the multi-modal message content array format (e.g., `Message(contents=[...])`) sent by the Python Agent Framework, leading to a JSON serialization error on the server side.
  fix: This is a known issue, likely a mismatch in how the Python client and .NET server serialize messages. Ensure both the `agent-framework` and `agent-framework-foundry-local` packages are updated to the latest beta/stable versions. If the issue persists, try simplifying message content or report the issue on the Microsoft Agent Framework GitHub repository.
- Service connection errors (Request to local service failed. Uri: http://127.0.0.1:0/foundry/list)
  cause: The Foundry Local service is either not running, or port binding issues prevent the Python client from connecting. (A port of 0 in the URI suggests the client never discovered the service's dynamically assigned port.)
  fix: Start the Foundry Local service with `foundry service start`. If it is already running, try `foundry service restart` to resolve potential port conflicts or service accessibility problems.
- Slow inference or unexpected CPU-only model usage despite GPU availability
  cause: The selected local model may be a CPU-only variant, Foundry Local may not be correctly detecting or utilizing the available GPU hardware, or the model may simply be large relative to the hardware.
  fix: Use `foundry model list` to check for GPU-optimized model variants (e.g., `phi-4-mini-instruct-cuda-gpu`). Keep GPU drivers up to date. Stop other AI inference sessions (e.g., AI Toolkit for VS Code) that may be holding GPU resources. Consider more heavily quantized model variants (e.g., INT8) for better performance.
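For the connection errors above, a quick TCP reachability probe can help distinguish "service not running" from a client-side configuration problem. This is a generic sketch, not part of the package; Foundry Local assigns its port dynamically, so the real endpoint must come from `foundry service status` rather than being hard-coded:

```python
import socket
from urllib.parse import urlparse

def service_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host:port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "127.0.0.1"
    # Keep an explicit port of 0 (as in the error message) instead of
    # silently substituting a default, so the failure is visible.
    port = parsed.port if parsed.port is not None else (
        443 if parsed.scheme == "https" else 80
    )
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# A URI with port 0, like the one in the error above, can never be reached:
print(service_reachable("http://127.0.0.1:0/foundry/list"))
```

If this returns False for the address reported by `foundry service status`, the fix belongs on the service side (`foundry service start` / `restart`), not in the Python code.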
Warnings
- breaking The broader Microsoft Agent Framework (which `agent-framework-foundry-local` integrates with) underwent significant architectural changes in its 1.0.0 release. Specifically, the `Message` constructor's `text` parameter was removed in favor of `contents=[...]`, and provider designs shifted. Existing code relying on older patterns of the `agent-framework` with `FoundryLocalClient` may break and require migration.
- gotcha This package requires the Foundry Local CLI and its runtime components to be installed and running separately. `pip install` only installs the Python client library, not the local LLM server.
- gotcha The first time a model is used with Foundry Local, it needs to be downloaded, which can take a significant amount of time and consume considerable disk space depending on the model size.
- gotcha Function/tool calling and structured output capabilities of the agent are dependent on the specific local model chosen and its support within Foundry Local. Not all local models support these advanced features.
- gotcha Foundry Local is designed for on-device inference and local development. It is not intended for distributed, containerized, or multi-machine production deployments.
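The `Message` breaking change noted above can be adapted to mechanically. The classes below are minimal stand-ins for illustration only (not the real `agent_framework` types), showing the old `text=` pattern being replaced by the new `contents=[...]` pattern:

```python
from dataclasses import dataclass, field

# Stand-in types; the real ones live in agent_framework and may differ.
@dataclass
class TextContent:
    text: str

@dataclass
class Message:
    role: str
    contents: list = field(default_factory=list)  # 1.0.0 style: no 'text' kwarg

def make_user_message(text: str) -> Message:
    """Pre-1.0.0 code wrote Message(role='user', text=...);
    post-1.0.0, plain text is wrapped in a contents list instead."""
    return Message(role="user", contents=[TextContent(text=text)])

msg = make_user_message("Tell me a fun fact about Python.")
print(msg.contents[0].text)
```

Wrapping message construction in a helper like this keeps the migration localized to one place when the framework's message API shifts again.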
Install
- pip install agent-framework-foundry-local --pre
- winget install Microsoft.FoundryLocal (Windows) OR brew tap microsoft/foundrylocal && brew install foundrylocal (macOS)
Imports
- FoundryLocalClient
from agent_framework.foundry import FoundryLocalClient
- Agent
from agent_framework import Agent
Quickstart
import asyncio
from agent_framework import Agent
from agent_framework.foundry import FoundryLocalClient

async def main():
    # Set the default local model (e.g., phi-4-mini, qwen2.5) as an environment
    # variable or pass it explicitly. Ensure the Foundry Local CLI is installed
    # and 'foundry service start' has been run.
    # Optionally: import os; os.environ['FOUNDRY_LOCAL_MODEL'] = 'phi-4-mini'
    client = FoundryLocalClient(model='phi-4-mini')  # 'phi-4-mini' must already be downloaded via the Foundry Local CLI
    agent = Agent(
        client=client,
        name='LocalAssistant',
        instructions='You are a helpful local assistant that runs entirely on my machine.',
    )
    print("Local agent created. Asking a question...")
    result = await agent.run("Tell me a fun fact about Python.")
    print(f"Agent response: {result}")

if __name__ == '__main__':
    asyncio.run(main())