pandas-gbq
pandas-gbq is a Python library that provides a convenient interface to connect pandas DataFrames with Google BigQuery. It simplifies reading data from BigQuery into a pandas.DataFrame and writing DataFrames to BigQuery tables. The current version is 0.34.1, released on 2026-03-26, and the library maintains a regular release cadence, typically with monthly or bi-monthly updates for new features and bug fixes.
Warnings
- breaking Python 2 support was officially dropped on January 1, 2020. Recent releases require Python 3.9 or higher.
- gotcha Authentication is critical. Without proper credentials or a `project_id`, `pandas-gbq` will raise errors (e.g., `ValueError: Could not determine project ID`). Common authentication methods include Application Default Credentials (ADC), service account keys, or user-based OAuth.
- breaking The `to_gbq` function has breaking changes in how it infers BigQuery data types for certain pandas dtypes. Naive (timezone-unaware) datetime columns are now loaded as BigQuery `DATETIME` instead of `TIMESTAMP`. Object columns containing boolean or dictionary values are loaded as `BOOLEAN` or `STRUCT` respectively, instead of `STRING`. `UInt8` columns are now `INT64`.
- gotcha When using `to_gbq`, the default `if_exists` parameter is 'fail', meaning the operation will fail if the destination table already exists.
- deprecated The `auth_local_webserver` parameter's default changed from `False` to `True` in pandas-gbq 0.18.0 (mirrored in `pandas.read_gbq` as of pandas 1.5.0). This is due to Google deprecating the 'out-of-band' (copy-paste) OAuth flow.
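The `DATETIME` vs. `TIMESTAMP` gotcha above hinges on whether a datetime column is timezone-aware. A minimal pandas-only sketch of the distinction, and of localizing a naive column so `to_gbq` would infer `TIMESTAMP`:

```python
import pandas as pd

# A naive (timezone-unaware) datetime column: to_gbq infers DATETIME.
df = pd.DataFrame({"ts": pd.to_datetime(["2026-01-01", "2026-01-02"])})
print(df["ts"].dt.tz)  # -> None (naive)

# Localize to UTC to make the column timezone-aware: to_gbq infers TIMESTAMP.
df["ts"] = df["ts"].dt.tz_localize("UTC")
print(df["ts"].dt.tz)  # -> UTC (aware)
```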
Install
pip install pandas-gbq
Imports
- read_gbq
import pandas_gbq
df = pandas_gbq.read_gbq(query, project_id=project_id)
- to_gbq
import pandas_gbq
pandas_gbq.to_gbq(df, destination_table, project_id=project_id)
- pd.io.gbq (legacy pandas wrapper; `pd.read_gbq` delegates to pandas-gbq and was deprecated in pandas 2.2)
import pandas as pd
df = pd.read_gbq(query, project_id=project_id)
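Both `read_gbq` and `to_gbq` accept a `credentials` parameter for explicit authentication instead of relying on Application Default Credentials. A hedged sketch, where `load_credentials` is a hypothetical helper (not part of pandas-gbq) that falls back to ADC when no service-account key is available:

```python
import os

def load_credentials(key_path):
    """Hypothetical helper: return service-account credentials if key_path
    points to an existing key file, else None (pandas-gbq then uses ADC)."""
    if not key_path or not os.path.exists(key_path):
        return None  # no key file: let pandas-gbq fall back to ADC
    # Import lazily so the helper works even without google-auth installed.
    from google.oauth2 import service_account
    return service_account.Credentials.from_service_account_file(key_path)

creds = load_credentials(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""))
# df = pandas_gbq.read_gbq(sql, project_id="my-project", credentials=creds)
```

Passing `credentials=None` is equivalent to omitting the argument, so the same call works in both environments.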
Quickstart
import os
import pandas as pd
import pandas_gbq
# Set your Google Cloud Project ID
# It's recommended to set this as an environment variable or via credentials
project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')
# --- Reading data from BigQuery ---
# Example query from a public dataset
sql_query = """
SELECT country_name, alpha_2_code
FROM `bigquery-public-data.utility_us.country_code_iso`
WHERE alpha_2_code LIKE 'U%'
LIMIT 5
"""
try:
    df_read = pandas_gbq.read_gbq(sql_query, project_id=project_id)
    print("\n--- Data read from BigQuery ---")
    print(df_read)
except Exception as e:
    print(f"Error reading from BigQuery: {e}")
    print("Please ensure GOOGLE_CLOUD_PROJECT is set and you have authenticated (e.g., `gcloud auth application-default login`).")
# --- Writing data to BigQuery ---
# Create a sample DataFrame to upload
data = {
'col1': [1, 2, 3],
'col2': ['A', 'B', 'C'],
'timestamp_col': pd.to_datetime(['2026-01-01', '2026-01-02', '2026-01-03'])
}
df_write = pd.DataFrame(data)
# Define destination table (dataset.tablename)
destination_table = 'my_test_dataset.my_test_table'
# To avoid errors, you might want to replace the table if it exists for testing
# In production, consider 'append' or 'fail' with proper checks
try:
    pandas_gbq.to_gbq(
        df_write,
        destination_table,
        project_id=project_id,
        if_exists='replace'  # Options: 'fail', 'replace', 'append'
    )
    print(f"\n--- DataFrame successfully written to {destination_table} in project {project_id} ---")
except Exception as e:
    print(f"Error writing to BigQuery: {e}")
    print("Ensure the dataset 'my_test_dataset' exists and your credentials can create tables; `to_gbq` expects a 'dataset.table' destination.")
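When the inferred types from the Warnings section are not what you want, `to_gbq` accepts a `table_schema` argument: a list of `{"name": ..., "type": ...}` dicts. A sketch that builds such a list from a DataFrame's dtypes; `infer_table_schema` and its dtype mapping are illustrative assumptions, not part of pandas-gbq:

```python
import pandas as pd

def infer_table_schema(df):
    """Hypothetical helper: map pandas dtypes to BigQuery type names
    in the list-of-dicts shape that to_gbq's table_schema expects."""
    mapping = {
        "int64": "INTEGER",
        "float64": "FLOAT",
        "bool": "BOOLEAN",
        "datetime64[ns]": "DATETIME",  # naive datetimes; see Warnings above
        "object": "STRING",
    }
    return [{"name": col, "type": mapping.get(str(dtype), "STRING")}
            for col, dtype in df.dtypes.items()]

df = pd.DataFrame({"col1": [1, 2], "col2": ["A", "B"]})
schema = infer_table_schema(df)
# pandas_gbq.to_gbq(df, "my_test_dataset.my_test_table",
#                   project_id=project_id, if_exists="replace",
#                   table_schema=schema)
```

Explicit schemas are mainly useful when the default inference would pick the wrong type, e.g. forcing a `STRING` column that happens to contain only digits.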