Groq Python SDK
Official Python SDK for GroqCloud API. OpenAI-compatible interface for ultra-low-latency LLM inference on Groq LPU hardware. Model IDs change frequently as models are deprecated and replaced with no versioned aliases.
Common errors
- groq.GroqError: The api_key client option must be set either by passing api_key to the client or by setting the GROQ_API_KEY environment variable
  cause: The Groq API key was not provided to the client, either directly in code or via the `GROQ_API_KEY` environment variable.
  fix: Set the `GROQ_API_KEY` environment variable to your API key, or pass it to the `Groq` constructor: `import os; from groq import Groq; client = Groq(api_key=os.environ.get("GROQ_API_KEY"))` or `client = Groq(api_key="YOUR_API_KEY")`.
- ModuleNotFoundError: No module named 'groq'
  cause: The `groq` Python library is not installed in your environment or is not on your current Python path.
  fix: Install it with `pip install groq`.
- The model xxx does not exist or you do not have access to it.
  cause: The specified model ID is incorrect, has been deprecated, or your account lacks permission to use it. Groq model IDs change frequently.
  fix: Verify the exact model ID in the Groq console or documentation and confirm it is still available to your account. For example, use a currently available model such as `llama-3.1-8b-instant` or `llama-3.3-70b-versatile`.
- groq.APIConnectionError: Connection error
  cause: The client failed to establish a network connection to the Groq API, possibly due to network issues, a timeout, or an SSL certificate problem.
  fix: Check your internet connection and proxy settings, and ensure no firewall rule blocks access to `api.groq.com`. If the issue persists, review your SSL certificate configuration or retry later; it may be a transient network problem.
- groq.APIStatusError: Error code: 429 - {'error': {'message': 'You are sending requests too quickly. Please retry your request later.'}}
  cause: You have exceeded the Groq API rate limits for the number of requests allowed in a given timeframe.
  fix: Implement exponential backoff and retry logic, reduce the frequency of your API calls, or upgrade your Groq plan for higher rate limits.
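The 429 fix above can be sketched as a small generic retry helper. This is a minimal sketch, not part of the SDK; with the Groq client you would pass `groq.RateLimitError` (the SDK's OpenAI-style rate-limit exception) as the exception to retry on.

```python
import random
import time

def call_with_backoff(fn, retry_on, max_retries=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff plus jitter
    whenever an exception of type `retry_on` is raised."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # delay doubles each attempt; jitter avoids synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# With the Groq client (assuming `client` is an initialized Groq instance):
# response = call_with_backoff(
#     lambda: client.chat.completions.create(
#         model="llama-3.3-70b-versatile",
#         messages=[{"role": "user", "content": "Hello"}],
#     ),
#     retry_on=groq.RateLimitError,
# )
```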
Warnings
- breaking groq.cloud.core (pre-release API) is fully removed. ChatCompletion class no longer exists. Any code from early 2024 tutorials is broken.
- breaking Models are deprecated and removed with no versioned aliases. gemma-7b-it and mixtral-8x7b-32768 removed. llama-guard-3-8b decommissioned. Hardcoded model IDs break silently.
- breaking max_tokens is deprecated in favor of max_completion_tokens. Still works but may be removed.
- breaking functions and function_call parameters are deprecated in favor of tools and tool_choice respectively.
- breaking exclude_domains and include_domains parameters deprecated for agentic tooling. Use search_settings parameter instead.
- gotcha n parameter (number of completions) only supports n=1. Passing any other value returns a 400 error.
- gotcha logprobs, presence_penalty, and frequency_penalty are listed in the API but not supported by any current models. Passing them does not error but has no effect.
- gotcha Preview models can be discontinued at short notice. Do not use in production.
- gotcha Rate limits are per-model and vary significantly. Free tier limits are very low. 429s happen frequently in dev without a paid plan.
- gotcha If no API key is supplied, client initialization raises groq.GroqError, and code that reads os.environ['GROQ_API_KEY'] directly raises KeyError. Set the GROQ_API_KEY environment variable or pass api_key explicitly.
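The deprecation warnings above can be enforced before each request with a small guard. The helper below is hypothetical (not part of the SDK); the deprecated-to-replacement mapping comes directly from the warnings list.

```python
# Hypothetical helper: flag chat-completion kwargs that the warnings
# above mark as deprecated or unsupported, before sending the request.
DEPRECATED = {
    "max_tokens": "max_completion_tokens",
    "functions": "tools",
    "function_call": "tool_choice",
    "exclude_domains": "search_settings",
    "include_domains": "search_settings",
}

def check_kwargs(kwargs: dict) -> list[str]:
    """Return human-readable warnings for deprecated or unsupported parameters."""
    problems = [
        f"'{old}' is deprecated; use '{new}' instead"
        for old, new in DEPRECATED.items()
        if old in kwargs
    ]
    if kwargs.get("n", 1) != 1:
        problems.append("only n=1 is supported; other values return a 400")
    return problems
```

Running `check_kwargs` on request kwargs in CI catches hardcoded deprecated parameters before they reach the API.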
Install
-
pip install groq -
uv add groq
Imports
- Groq
from groq import Groq
- AsyncGroq
from groq import AsyncGroq
- aiohttp backend
from groq import DefaultAioHttpClient
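A minimal async usage sketch for the AsyncGroq import above. The client reads GROQ_API_KEY from the environment by default and supports `async with`; the model ID is an assumption and should be verified in the console, since IDs change frequently.

```python
import asyncio
import os

async def ask(prompt: str) -> str:
    from groq import AsyncGroq  # imported lazily; requires `pip install groq`
    # AsyncGroq picks up GROQ_API_KEY from the environment by default
    async with AsyncGroq() as client:
        resp = await client.chat.completions.create(
            model="llama-3.3-70b-versatile",  # assumed ID; verify it is still live
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

if os.environ.get("GROQ_API_KEY"):  # only call the API when a key is configured
    print(asyncio.run(ask("Hello")))
```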
Quickstart
import os
from groq import Groq
client = Groq(api_key=os.environ.get('GROQ_API_KEY'))
response = client.chat.completions.create(
model='llama-3.3-70b-versatile',
messages=[{'role': 'user', 'content': 'Hello'}]
)
print(response.choices[0].message.content)
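The quickstart can be extended to stream tokens as they are generated. This sketch assumes the OpenAI-compatible streaming shape (`stream=True`, chunks with `choices[0].delta.content`); the model ID should be verified in the console before use.

```python
import os

def stream_reply(prompt: str) -> None:
    from groq import Groq  # imported lazily; requires `pip install groq`
    client = Groq(api_key=os.environ.get("GROQ_API_KEY"))
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # assumed ID; verify it is still live
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # yields chunks as tokens are generated
    )
    for chunk in stream:
        # delta.content can be None on some chunks (e.g. the final one)
        print(chunk.choices[0].delta.content or "", end="")
    print()

if os.environ.get("GROQ_API_KEY"):  # only call the API when a key is configured
    stream_reply("Hello")
```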