LangChain LiteLLM Integration
langchain-litellm is an integration package that connects LangChain with LiteLLM, a library that provides a unified API for calling 100+ Large Language Models across providers (e.g., Anthropic, Azure, Hugging Face). Within LangChain it exposes chat models, embeddings, and OCR document loading. The package follows semantic versioning, ships frequent patch and minor releases, and is currently at version 0.6.4.
Warnings
- breaking A critical supply-chain attack hit `litellm` versions 1.82.7 and 1.82.8 in March 2026; these releases contained credential-stealing malware. `langchain-litellm` 0.6.2 and above explicitly excludes the compromised `litellm` versions from its dependency range.
- gotcha When using Claude models, `tool_choice` may be silently downgraded to `auto` if extended thinking is enabled, which can alter expected tool-use behavior.
- gotcha A common error is `litellm.BadRequestError: LLM Provider NOT provided`. It occurs when LiteLLM cannot infer the provider from the model string, typically because the `model` parameter lacks a provider prefix (e.g., `openai/gpt-4o`) or the expected API-key environment variable is missing.
- gotcha Integrating `langchain-litellm` with a LiteLLM Proxy often requires special handling for authentication headers (e.g., `Authorization: Bearer <token>`). LangChain's internal HTTP request mechanisms might not easily expose the ability to inject these custom headers, leading to integration difficulties.
- gotcha Version `0.6.4` included a fix to `extract reasoning tokens and handle pydantic usage in metadata`. This could imply subtle changes or sensitivities related to Pydantic versions and how structured outputs or metadata are processed, which is a frequent source of issues in the LangChain ecosystem.
Install
pip install langchain-litellm
Imports
- ChatLiteLLM
from langchain_litellm import ChatLiteLLM
- ChatLiteLLMRouter
from langchain_litellm import ChatLiteLLMRouter
- LiteLLMEmbeddings
from langchain_litellm import LiteLLMEmbeddings
- LiteLLMEmbeddingsRouter
from langchain_litellm import LiteLLMEmbeddingsRouter
- LiteLLMOCRLoader
from langchain_litellm import LiteLLMOCRLoader
Quickstart
import os
from langchain_litellm import ChatLiteLLM
from langchain_core.messages import HumanMessage
# Set your API key for LiteLLM's underlying provider (e.g., OpenAI)
# For a real application, use a secure method to manage API keys.
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "sk-your-openai-key")
# Instantiate ChatLiteLLM, specifying the model in LiteLLM's format
# (e.g., 'openai/gpt-3.5-turbo' for OpenAI)
chat_model = ChatLiteLLM(model="openai/gpt-3.5-turbo")
# Invoke the chat model
response = chat_model.invoke([HumanMessage(content="Hello, how are you?")])
print(response.content)
# Example for LiteLLMEmbeddings
from langchain_litellm import LiteLLMEmbeddings
# Note: API key can be passed explicitly if not in environment for embeddings
embeddings = LiteLLMEmbeddings(
model="openai/text-embedding-3-small",
api_key=os.environ.get("OPENAI_API_KEY", "sk-your-openai-key")
)
text = "This is a test document."
embedding = embeddings.embed_query(text)
print(f"Embedding length: {len(embedding)}")