Groq Python SDK
0.18.0 · verified Tue May 12 · auth: no · python install: verified · quickstart: verified
Official Python SDK for GroqCloud API. OpenAI-compatible interface for ultra-low-latency LLM inference on Groq LPU hardware. Model IDs change frequently as models are deprecated and replaced with no versioned aliases.
pip install groq
Common errors
error groq.GroqError: The api_key client option must be set either by passing api_key to the client or by setting the GROQ_API_KEY environment variable ↓
cause The Groq API key is not being provided to the client, either directly in the code or via the `GROQ_API_KEY` environment variable.
fix Set the GROQ_API_KEY environment variable with your actual API key, or pass it directly to the Groq client constructor:
import os
from groq import Groq
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))
# or: client = Groq(api_key="YOUR_API_KEY")
error ModuleNotFoundError: No module named 'groq' ↓
cause The `groq` Python library has not been installed in your environment or is not accessible within your current Python path.
fix Install the groq library with pip: pip install groq
error The model xxx does not exist or you do not have access to it. ↓
cause The specified model ID (xxx) is either incorrect, has been deprecated, or your account does not have permissions to access it. Groq model IDs can change frequently.
fix Verify the exact model ID against the Groq console or documentation, and confirm it is still available and accessible to your account. For example, use a currently available model such as llama-3.1-8b-instant or llama-3.3-70b-versatile.
error groq.APIConnectionError: Connection error ↓
cause The client failed to establish a network connection to the Groq API, possibly due to network issues, a timeout, or an SSL certificate problem.
fix Check your internet connection and proxy settings, and make sure no firewall rules block access to api.groq.com. If the issue persists, review your SSL certificate configuration or retry later, as it may be a transient network issue.
error groq.APIStatusError: Error code: 429 - {'error': {'message': 'You are sending requests too quickly. Please retry your request later.'}} ↓
cause You have exceeded the rate limits imposed by the Groq API for the number of requests you can make within a given timeframe.
fix
Implement exponential backoff and retry logic in your application. Reduce the frequency of your API calls or consider upgrading your Groq plan for higher rate limits.
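A minimal backoff sketch for the retry logic described above. The retry wrapper below is generic, illustrative code, not part of the SDK; with the real client you would pass `retry_on=(groq.RateLimitError,)` and a callable wrapping `client.chat.completions.create(...)`.

```python
import random
import time

def with_backoff(call, retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Retry `call` with exponential backoff plus jitter.

    With the Groq SDK, pass retry_on=(groq.RateLimitError,)
    so only 429 responses trigger a retry.
    """
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            # delays grow 1x, 2x, 4x, ... of base_delay, plus jitter
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
```

Usage with the SDK would look like `with_backoff(lambda: client.chat.completions.create(...), retry_on=(groq.RateLimitError,))`.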
Warnings
breaking groq.cloud.core (pre-release API) is fully removed. ChatCompletion class no longer exists. Any code from early 2024 tutorials is broken. ↓
fix Replace entire client setup with: from groq import Groq; client = Groq(api_key=...)
breaking Models are deprecated and removed with no versioned aliases. gemma-7b-it and mixtral-8x7b-32768 removed. llama-guard-3-8b decommissioned. Hardcoded model IDs break silently. ↓
fix Never hardcode model IDs in production. Query https://api.groq.com/openai/v1/models to get current active models. Check https://console.groq.com/docs/deprecations before each release.
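One way to act on this: resolve the model ID at startup from a preference list instead of hardcoding it. The selection helper below is an illustrative sketch; the commented live-client lines assume `client.models.list()`, the SDK call backed by the /openai/v1/models endpoint, and the preference list is only an example.

```python
def pick_model(available_ids, preferred):
    """Return the first model ID from `preferred` that is currently available."""
    available = set(available_ids)
    for model_id in preferred:
        if model_id in available:
            return model_id
    raise RuntimeError(f"None of {preferred} are currently available")

# With a live client (untested sketch):
# client = Groq()
# ids = [m.id for m in client.models.list().data]
# model = pick_model(ids, ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"])
```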
breaking max_tokens is deprecated in favor of max_completion_tokens. Still works but may be removed. ↓
fix Replace max_tokens= with max_completion_tokens= in all chat.completions.create() calls
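If many call sites still pass max_tokens, a small compatibility shim (a hypothetical helper, not part of the SDK) can normalize the kwarg in one place:

```python
def normalize_token_param(kwargs):
    """Rename the deprecated max_tokens kwarg to max_completion_tokens."""
    if "max_tokens" in kwargs and "max_completion_tokens" not in kwargs:
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    return kwargs

# e.g. client.chat.completions.create(**normalize_token_param(params))
```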
breaking functions and function_call parameters are deprecated in favor of tools and tool_choice respectively. ↓
fix Migrate function_call pattern to tools=[{type: 'function', function: {...}}] pattern
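A sketch of that migration. The get_weather schema is purely illustrative; the point is the wrapping shape of the new tools parameter:

```python
# Deprecated shape:
#   functions=[schema, ...], function_call="auto"
# Current shape: wrap each schema under {"type": "function", "function": schema}

get_weather = {
    "name": "get_weather",  # hypothetical tool, for illustration only
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

tools = [{"type": "function", "function": get_weather}]
tool_choice = "auto"  # replaces function_call="auto"
```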
breaking exclude_domains and include_domains parameters deprecated for agentic tooling. Use search_settings parameter instead. ↓
fix Move domain filtering into search_settings={include_domains: [...]} or search_settings={exclude_domains: [...]}
gotcha n parameter (number of completions) only supports n=1. Passing any other value returns a 400 error. ↓
fix Do not use n > 1. Run multiple requests instead.
gotcha logprobs, presence_penalty, and frequency_penalty are listed in the API but not supported by any current models. Passing them does not error but has no effect. ↓
fix Do not rely on these parameters for model behavior control
gotcha Preview models can be discontinued at short notice. Do not use in production. ↓
fix Use only production-tier models. Check model status at console.groq.com/docs/models.
gotcha Rate limits are per-model and vary significantly. Free tier limits are very low. 429s happen frequently in dev without a paid plan. ↓
fix Check current limits at console.groq.com/settings/limits. Implement exponential backoff on groq.RateLimitError.
gotcha Client initialization fails when no API key is provided: Groq() raises groq.GroqError, and indexing os.environ['GROQ_API_KEY'] directly raises KeyError if the variable is unset. ↓
fix Ensure the GROQ_API_KEY environment variable is set before running the application, for example with `export GROQ_API_KEY='your_api_key'` or by loading it from a .env file.
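A defensive startup check can turn a missing key into a clear error instead of a bare KeyError deep in client setup. require_api_key is a hypothetical helper, not part of the SDK:

```python
import os

def require_api_key(var="GROQ_API_KEY"):
    """Fail fast with a clear message if the API key env var is unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(
            f"{var} is not set; export it or add it to your .env file"
        )
    return key

# client = Groq(api_key=require_api_key())
```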
Install
uv add groq
Install compatibility (verified; last tested: 2026-05-12)
python os / libc status wheel install import disk
3.9 alpine (musl) - - 0.67s 32.6M
3.9 slim (glibc) - - 0.62s 32M
3.10 alpine (musl) - - 0.71s 33.3M
3.10 slim (glibc) - - 0.54s 33M
3.11 alpine (musl) - - 1.02s 36.4M
3.11 slim (glibc) - - 0.86s 36M
3.12 alpine (musl) - - 1.19s 27.9M
3.12 slim (glibc) - - 1.22s 28M
3.13 alpine (musl) - - 1.10s 27.6M
3.13 slim (glibc) - - 1.02s 27M
Imports
- Groq
  wrong: from groq.cloud.core import ChatCompletion
  correct: from groq import Groq
- AsyncGroq
  from groq import AsyncGroq
- aiohttp backend
  from groq import DefaultAioHttpClient
Quickstart (verified; last tested: 2026-05-11)
import os
from groq import Groq
client = Groq(api_key=os.environ['GROQ_API_KEY'])
response = client.chat.completions.create(
model='llama-3.3-70b-versatile',
messages=[{'role': 'user', 'content': 'Hello'}]
)
print(response.choices[0].message.content)