Google Cloud BigQuery
Official Python client for Google BigQuery. Current version: 3.40.1 (Mar 2026). v3.0 made google-cloud-bigquery-storage and pyarrow required dependencies. Authentication uses Application Default Credentials (ADC) — no explicit API key. query() returns a QueryJob — must call .result() to wait for completion. to_dataframe() dtype behavior changed in v3 (nullable pandas dtypes). Python 3.9+ required as of v3.x.
Warnings
- breaking google-cloud-bigquery-storage and pyarrow are now required dependencies in v3.x (previously optional). Installing v3 without them causes ImportError on to_dataframe().
- breaking to_dataframe() dtype mappings changed in v3. INT64 → Int64 (nullable), BOOLEAN → boolean (nullable), DATE → dbdate. Code checking dtype == 'int64' or 'bool' will fail silently or raise.
- gotcha client.query() returns a QueryJob — NOT results. It starts the job asynchronously. Must call .result() to block and wait for completion before iterating rows.
- gotcha Authentication uses Application Default Credentials (ADC) — no API key. Locally: run 'gcloud auth application-default login'. In production: use service account key or Workload Identity. Missing credentials raises DefaultCredentialsError.
- gotcha BigQuery table references use backtick syntax in SQL: `project.dataset.table`. Using regular quotes raises BadRequest syntax error.
- gotcha StandardSqlDataType and related types moved from google.cloud.bigquery_v2 to google.cloud.bigquery in v3. Old imports raise ImportError.
- gotcha Python 3.7 and 3.8 support was dropped in later v3.x releases (Q4 2024); Python 3.9+ is now required.
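The dtype change above can be reproduced without touching BigQuery at all: the sketch below builds a DataFrame with the same nullable extension dtypes that to_dataframe() returns in v3 (the column names are illustrative) and contrasts the brittle v2-era string comparison with pandas' dtype-inspection helpers, which accept both the old and new dtypes.

```python
import pandas as pd

# Simulate the nullable dtypes that to_dataframe() returns in v3
# (column names are assumptions for illustration).
df = pd.DataFrame({
    "count": pd.array([10, None, 3], dtype="Int64"),
    "active": pd.array([True, None, False], dtype="boolean"),
})

# The v2-era check no longer matches: the dtype is "Int64", not "int64"
legacy_match = str(df["count"].dtype) == "int64"

# Version-agnostic checks that accept both int64 and Int64
is_int = pd.api.types.is_integer_dtype(df["count"].dtype)
is_bool = pd.api.types.is_bool_dtype(df["active"].dtype)
print(legacy_match, is_int, is_bool)  # False True True
```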
Install
- pip install google-cloud-bigquery
- pip install 'google-cloud-bigquery[pandas]'
- pip install 'google-cloud-bigquery[pandas,pyarrow]'
Imports
- Client (basic query)
from google.cloud import bigquery

# ADC auth — uses GOOGLE_APPLICATION_CREDENTIALS env var
# or gcloud CLI credentials locally
client = bigquery.Client(project='my-project')

# query() returns a QueryJob — NOT results
query = """
SELECT name, COUNT(*) as count
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'TX'
GROUP BY name
ORDER BY count DESC
LIMIT 10
"""
query_job = client.query(query)  # starts job
rows = query_job.result()        # BLOCKS until complete
for row in rows:
    print(row.name, row.count)
- to_dataframe
from google.cloud import bigquery

client = bigquery.Client(project='my-project')
query = 'SELECT id, name, created_at FROM `myproject.mydataset.mytable` LIMIT 100'

# to_dataframe() requires pandas + pyarrow + db-dtypes
df = client.query(query).result().to_dataframe()
print(df.dtypes)
# id: Int64 (nullable) — not int64
# name: object
# created_at: dbdate
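Because table references must be backtick-quoted (regular quotes raise BadRequest, per the warning above), a tiny helper can keep query strings consistent. `table_ref` is a hypothetical convenience function, not part of the client library:

```python
# Hypothetical helper (not part of google-cloud-bigquery) that builds a
# fully qualified, backtick-quoted table reference for Standard SQL.
def table_ref(project: str, dataset: str, table: str) -> str:
    return f"`{project}.{dataset}.{table}`"

query = (
    "SELECT name FROM "
    + table_ref("bigquery-public-data", "usa_names", "usa_1910_2013")
    + " LIMIT 5"
)
print(query)
# SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` LIMIT 5
```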
Quickstart
# pip install 'google-cloud-bigquery[pandas]'
# Set up ADC: gcloud auth application-default login
# or set GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
from google.cloud import bigquery
client = bigquery.Client(project='my-gcp-project')
# Query public dataset
query = """
SELECT
  name,
  SUM(number) as total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'CA'
GROUP BY name
ORDER BY total DESC
LIMIT 5
"""
# Run query and wait for results
query_job = client.query(query)
rows = query_job.result() # blocks until done
for row in rows:
    print(f'{row.name}: {row.total}')
# As DataFrame — rows is a one-shot iterator (already consumed above),
# so fetch the DataFrame from the job instead
df = query_job.to_dataframe()
print(df.head())
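If downstream code expects classic NumPy dtypes rather than the nullable extension dtypes that to_dataframe() returns, convert explicitly after deciding how to handle NULLs. A pandas-only sketch (the `total` column mirrors the quickstart query; the values are made up for illustration):

```python
import pandas as pd

# Stand-in for the quickstart DataFrame: 'total' comes back as nullable Int64
df = pd.DataFrame({
    "name": ["Mary", "John", "Robert"],
    "total": pd.array([362831, 272451, None], dtype="Int64"),
})

# Drop NULL rows first, then downcast to the classic NumPy dtype
clean = df.dropna(subset=["total"]).astype({"total": "int64"})
print(clean["total"].dtype)  # int64
```

Converting before handling NULLs raises, since NumPy's int64 has no missing-value representation.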