{"id":8811,"library":"agent-framework-foundry-local","title":"Foundry Local Integration for Microsoft Agent Framework","description":"agent-framework-foundry-local is a Python package that provides an integration client for the Microsoft Agent Framework to run large language models (LLMs) locally using Foundry Local. It allows developers to build and orchestrate AI agents that leverage local inference capabilities without needing cloud API keys. The package is currently in beta, while the broader Microsoft Agent Framework recently reached version 1.0.0, indicating active development and a rapid release cadence for the ecosystem.","status":"active","version":"1.0.0b260409","language":"en","source_language":"en","source_url":"https://github.com/leestott/agentframework--foundrylocal","tags":["AI Agents","Local Inference","Microsoft Agent Framework","LLM","Foundry Local","Offline AI"],"install":[{"cmd":"pip install agent-framework-foundry-local --pre","lang":"bash","label":"Install beta package"},{"cmd":"winget install Microsoft.FoundryLocal","lang":"bash","label":"Install Foundry Local CLI (prerequisite, Windows)"},{"cmd":"brew tap microsoft/foundrylocal && brew install foundrylocal","lang":"bash","label":"Install Foundry Local CLI (prerequisite, macOS)"}],"dependencies":[{"reason":"This package is an integration for the Microsoft Agent Framework, which provides the core agent abstractions and runtime. 
FoundryLocalClient is typically paired with a standard Agent from the core framework.","package":"agent-framework","optional":false},{"reason":"Foundry Local is a separate CLI tool and runtime that manages and serves local language models, providing an OpenAI-compatible API that agent-framework-foundry-local connects to.","package":"Foundry Local CLI/runtime","optional":false}],"imports":[{"note":"FoundryLocalClient is the primary class for connecting to the local Foundry service.","symbol":"FoundryLocalClient","correct":"from agent_framework.foundry import FoundryLocalClient"},{"note":"Used to create and manage the agent instance, typically configured with a FoundryLocalClient.","symbol":"Agent","correct":"from agent_framework import Agent"}],"quickstart":{"code":"import asyncio\nfrom agent_framework import Agent\nfrom agent_framework.foundry import FoundryLocalClient\n\nasync def main():\n    # Prerequisites: the Foundry Local CLI must be installed, the service\n    # started ('foundry service start'), and the chosen model (here\n    # 'phi-4-mini') already downloaded via the Foundry Local CLI.\n    # The default model can also be set via the FOUNDRY_LOCAL_MODEL\n    # environment variable instead of being passed explicitly.\n    client = FoundryLocalClient(model='phi-4-mini')\n    agent = Agent(\n        client=client,\n        name='LocalAssistant',\n        instructions='You are a helpful local assistant that runs entirely on my machine.'\n    )\n\n    print(\"Local agent created. Asking a question...\")\n    result = await agent.run(\"Tell me a fun fact about Python.\")\n    print(f\"Agent response: {result}\")\n\nif __name__ == '__main__':\n    asyncio.run(main())\n","lang":"python","description":"This quickstart demonstrates how to create and run a simple AI agent using FoundryLocalClient to connect to a locally hosted LLM via the Foundry Local runtime. 
Ensure the Foundry Local CLI is installed, the service is running (`foundry service start`), and the specified model (e.g., `phi-4-mini`) is available locally."},"warnings":[{"fix":"Update `agent-framework` to 1.0.0 or later and revise message construction to use `Message(contents=[...])`. Consult the official migration guide for Agent Framework 1.0.0.","message":"The broader Microsoft Agent Framework (which `agent-framework-foundry-local` integrates with) underwent significant architectural changes in its 1.0.0 release. Specifically, the `Message` constructor's `text` parameter was removed in favor of `contents=[...]`, and provider designs shifted. Existing code that uses `FoundryLocalClient` with older `agent-framework` patterns may break and require migration.","severity":"breaking","affected_versions":"agent-framework < 1.0.0 (when migrating to >=1.0.0)"},{"fix":"Before running Python code, install Foundry Local via `winget install Microsoft.FoundryLocal` (Windows) or `brew tap microsoft/foundrylocal && brew install foundrylocal` (macOS), then start the service with `foundry service start`.","message":"This package requires that the Foundry Local CLI and its runtime components be installed separately and actively running. `pip install` only installs the Python client library, not the local LLM server.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Plan for initial model download time. Use `foundry model list` to see available models and `foundry model run <model_name>` or `foundry model download <model_name>` to pre-download models if desired.","message":"The first time a model is used with Foundry Local, it must be downloaded, which can take a significant amount of time and consume considerable disk space depending on the model size.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consult Foundry Local documentation for specific models' capabilities. 
The `FoundryLocalClient.manager` helper can be used to inspect the local catalog and supported features.","message":"Function/tool calling and structured output capabilities of the agent are dependent on the specific local model chosen and its support within Foundry Local. Not all local models support these advanced features.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For production deployments requiring scalability and managed services, consider using `agent-framework-foundry` with Azure AI Foundry's hosted services rather than `agent-framework-foundry-local`.","message":"Foundry Local is designed for on-device inference and local development. It is not intended for distributed, containerized, or multi-machine production deployments.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"This is a known issue, potentially indicating a mismatch in how the Python client and .NET server handle message serialization. Ensure both the `agent-framework` and `agent-framework-foundry-local` packages are updated to the latest beta/stable versions. 
If the issue persists, try simplifying message content or report it on the Microsoft Agent Framework GitHub repository.","cause":"The Foundry Local service (which is .NET-based) might not correctly handle the multi-modal message content array format (e.g., `Message(contents=[...])`) sent by the Python Agent Framework, leading to a JSON serialization error on the server side.","error":"agent_framework.exceptions.ServiceResponseException: <class 'agent_framework_foundry_local._foundry_local_client.FoundryLocalClient'> service failed to complete the prompt: Error code: 500 - { 'type': 'https://tools.ietf.org/html/rfc9110#section-15.6.1', 'title': 'Failed to handle openAI completion', 'status': 500, 'detail': \"JsonTypeInfo metadata for type 'System.Collections.Generic.List`1[Betalgo.Ranul.OpenAI.ObjectModels.RequestModels.MessageContent]' was not provided by TypeInfoResolver... Path: $.ContentCalculated.\" }"},{"fix":"Ensure the Foundry Local service is started by running `foundry service start` in your terminal. If it is already running, try `foundry service restart` to resolve potential port conflicts or service accessibility problems.","cause":"The Foundry Local service is either not running, or port binding issues are preventing the Python client from connecting.","error":"Service connection errors (Request to local service failed. Uri:http://127.0.0.1:0/foundry/list)"},{"fix":"Use `foundry model list` to check for GPU-optimized model variants (e.g., `phi-4-mini-instruct-cuda-gpu`). Ensure GPU drivers are up-to-date. Stop other AI inference sessions (e.g., AI Toolkit for VS Code) that may be competing for GPU resources. Consider more aggressively quantized model variants (e.g., INT8) for better performance.","cause":"The selected local model might be a CPU-only variant, or Foundry Local is not correctly detecting/utilizing the available GPU hardware. 
Large models can also naturally be slow on less powerful hardware.","error":"Slow inference or unexpected CPU-only model usage despite GPU availability."}]}