{"id":6708,"library":"llama-stack-client","title":"Llama Stack Client Python Library","description":"The official Python library for the Llama Stack API, providing convenient access to its REST API. It includes comprehensive type definitions for request parameters and response fields, and offers both synchronous and asynchronous clients. The library is generated with Stainless and targets Python 3.12+. It is currently in active alpha development, with frequent releases.","status":"active","version":"0.7.2","language":"en","source_language":"en","source_url":"https://github.com/llamastack/llama-stack-client-python","tags":["AI","LLM","API Client","Meta","Large Language Models"],"install":[{"cmd":"pip install --pre llama-stack-client","lang":"bash","label":"Install pre-release version"}],"dependencies":[{"reason":"Used as the underlying HTTP client for both synchronous and asynchronous operations.","package":"httpx"}],"imports":[{"symbol":"LlamaStackClient","correct":"from llama_stack_client import LlamaStackClient"}],"quickstart":{"code":"import os\nfrom llama_stack_client import LlamaStackClient\n\n# Ensure a Llama Stack server is running, e.g., locally at http://localhost:8321.\n# Authentication typically uses the LLAMA_STACK_CLIENT_API_KEY environment variable:\n#   export LLAMA_STACK_CLIENT_API_KEY=\"your_api_key\"\n# The base URL can likewise be set via the LLAMA_STACK_BASE_URL environment variable.\nclient = LlamaStackClient(\n    base_url=os.environ.get(\"LLAMA_STACK_BASE_URL\", \"http://localhost:8321\"),\n    api_key=os.environ.get(\"LLAMA_STACK_CLIENT_API_KEY\", \"dummy_key_for_testing_if_not_set\"),\n)\n\ntry:\n    # List the models registered on the server\n    models = client.models.list()\n    print(\"Available models:\", [model.id for model in models.data])\n\n    # Perform simple inference using the Responses API\n    if models.data:\n        response = client.responses.create(\n            model=models.data[0].id,  # use the first available model\n            input=\"Write a haiku about coding.\",\n        )\n        print(\"\\nHaiku from Llama Stack:\", response.output_text)\n    else:\n        print(\"\\nNo models found on the Llama Stack server.\")\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\n    print(\"Ensure your Llama Stack server is running and accessible (e.g., via Docker) and that the API key is set correctly.\")","lang":"python","description":"This quickstart shows how to initialize the LlamaStackClient, list the available models, and run a basic inference request. It assumes a Llama Stack server is already running and accessible, for instance locally via Docker."},"warnings":[{"fix":"Update all calls from `client.agents.*` to `client.responses.*`.","message":"The `agents` API was renamed to the `responses` API in `v0.7.0-alpha.1`. Code using `client.agents` will no longer work.","severity":"breaking","affected_versions":">=0.7.0-alpha.1"},{"fix":"Review and update code that calls the `client.chat.completions.retrieve()` and `client.files.retrieve()` methods.","message":"Breaking changes were introduced to the `GET /chat/completions/{completion_id}` and `GET /files/{file_id}` endpoints in `v0.7.0-alpha.1` to eliminate conformance issues.","severity":"breaking","affected_versions":">=0.7.0-alpha.1"},{"fix":"Review the documentation and update code using the post-training APIs accordingly.","message":"Consistency improvements were made to the post-training API endpoints in `v0.6.1-alpha.1`, which may involve API surface changes.","severity":"breaking","affected_versions":">=0.6.1-alpha.1"},{"fix":"Pin your `llama-stack-client` version in `requirements.txt` or `pyproject.toml` (e.g., `llama-stack-client==0.7.2`) to avoid unexpected breakage.","message":"The `llama-stack-client` library is currently in alpha (`--pre`) release status, so breaking changes and API instability are frequent and expected across minor versions. Pin exact versions.","severity":"gotcha","affected_versions":"All alpha versions"},{"fix":"Ensure a Llama Stack server is deployed and running (e.g., via Docker) before using this client.","message":"This library is a client for the Llama Stack API and requires a separate, accessible Llama Stack server instance; the server itself is not included.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Set the `LLAMA_STACK_CLIENT_API_KEY` environment variable to your actual API key before initializing the client, or pass the key explicitly at instantiation.","message":"For authentication, the client relies primarily on the `LLAMA_STACK_CLIENT_API_KEY` environment variable. If it is not set, API calls may fail to authenticate.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Refer to the official Llama Stack documentation for the most up-to-date information on Responses API capabilities and known limitations.","message":"The Responses API, the central feature for server-side agentic orchestration, is still under active development; parts of its OpenAI-compatible surface may remain unimplemented.","severity":"gotcha","affected_versions":"All alpha versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}