Groq Python SDK
Official Python SDK for the GroqCloud API. OpenAI-compatible interface for ultra-low-latency LLM inference on Groq LPU hardware. Model IDs change frequently: models are deprecated and replaced with no versioned aliases.
Warnings
- breaking The pre-release groq.cloud.core API is fully removed and the ChatCompletion class no longer exists. Any code from early-2024 tutorials is broken.
- breaking Models are deprecated and removed with no versioned aliases. gemma-7b-it and mixtral-8x7b-32768 have been removed; llama-guard-3-8b is decommissioned. Hardcoded model IDs break without warning.
- breaking max_tokens is deprecated in favor of max_completion_tokens. Still works but may be removed.
- breaking functions and function_call parameters are deprecated in favor of tools and tool_choice respectively.
- breaking The exclude_domains and include_domains parameters are deprecated for agentic tooling. Use the search_settings parameter instead.
- gotcha n parameter (number of completions) only supports n=1. Passing any other value returns a 400 error.
- gotcha logprobs, presence_penalty, and frequency_penalty are listed in the API but not supported by any current models. Passing them does not error but has no effect.
- gotcha Preview models can be discontinued at short notice. Do not use in production.
- gotcha Rate limits are per-model and vary significantly. Free tier limits are very low. 429s happen frequently in dev without a paid plan.
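Given how often 429s show up on the free tier, a small retry wrapper with exponential backoff is a common pattern. Below is a minimal sketch; the helper is generic, and the `groq.RateLimitError` exception name mentioned in the comments is an assumption based on the SDK's OpenAI-style error hierarchy (verify against your installed version).

```python
import random
import time

def with_backoff(call, *, retries=5, base=1.0, retry_on=(Exception,)):
    # Retry `call` with exponential backoff plus jitter.
    # In real use, retry_on would be (groq.RateLimitError,) -- the
    # exception class name is assumed, not confirmed by this doc.
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # sleep ~1s, ~2s, ~4s, ... (scaled by `base`) with jitter
            time.sleep(base * (2 ** attempt + random.random()))
```

Usage would look like `with_backoff(lambda: client.chat.completions.create(...), retry_on=(RateLimitError,))`. The client may also take a `max_retries` option for built-in retries (again, assumed from the OpenAI-compatible client design), but an explicit wrapper gives you control over timing.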
Install
- pip install groq
- uv add groq
Imports
- Groq
from groq import Groq
- AsyncGroq
from groq import AsyncGroq
- aiohttp backend
from groq import DefaultAioHttpClient
Quickstart
import os
from groq import Groq
client = Groq(api_key=os.environ['GROQ_API_KEY'])
response = client.chat.completions.create(
    model='llama-3.3-70b-versatile',
    messages=[{'role': 'user', 'content': 'Hello'}],
)
print(response.choices[0].message.content)
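Since model IDs are removed without versioned aliases, one way to harden the quickstart is to resolve the model from a preference list at startup instead of hardcoding a single ID. A minimal sketch, assuming the set of available IDs comes from a models-list endpoint with an OpenAI-compatible response shape (e.g. `{m.id for m in client.models.list().data}` -- verify locally):

```python
def pick_model(preferred, available_ids):
    # Return the first model in `preferred` that the API still serves.
    # `available_ids` is a set of model ID strings, e.g. built from a
    # models-list call (response shape assumed, not confirmed here).
    for model in preferred:
        if model in available_ids:
            return model
    raise RuntimeError(f'none of {preferred} are currently available')
```

Then the quickstart's `model=` argument becomes something like `pick_model(['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'], ids)`, failing loudly at startup instead of at the first request after a deprecation.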