pandas-gbq

0.34.1 · verified Tue May 12 · auth: no · python install: verified · quickstart: reviewed

pandas-gbq is a Python library that provides a convenient interface to connect pandas DataFrames with Google BigQuery. It simplifies reading data from BigQuery into a pandas.DataFrame and writing DataFrames to BigQuery tables. The current version is 0.34.1, released on 2026-03-26, and the library maintains a regular release cadence, typically with monthly or bi-monthly updates for new features and bug fixes.

pip install pandas-gbq
error ModuleNotFoundError: No module named 'pandas_gbq'
cause The `pandas-gbq` library has not been installed or is not accessible in the current Python environment.
fix
Install the library using pip: pip install pandas-gbq. If the error persists, confirm that the pip you ran belongs to the interpreter executing your script (python -m pip install pandas-gbq avoids the mismatch).
error google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or provide credentials explicitly.
cause The `pandas-gbq` library could not find valid Google Cloud credentials in the environment to authenticate with BigQuery.
fix
Provide credentials explicitly, for example with a service account key file:
    from google.oauth2 import service_account
    credentials = service_account.Credentials.from_service_account_file('path/to/key.json')
    df = pandas_gbq.read_gbq(sql, project_id='your-project-id', credentials=credentials)
Alternatively, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your service account key file.
error pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.
cause The DataFrame being written to BigQuery has a schema (column names, order, or data types) that does not match the schema of the existing destination table.
fix
Ensure your DataFrame's column names, their order, and data types precisely match the BigQuery table's schema. You can also explicitly define the table_schema argument in pandas_gbq.to_gbq() to specify the BigQuery schema or use if_exists='replace' if you intend to overwrite the table and its schema.
error google.api_core.exceptions.NotFound: 404 Not Found: Table project_id:dataset.table_name not found.
cause The specified BigQuery project, dataset, or table does not exist, or the user lacks the necessary permissions to access it.
fix
Verify that the project_id, dataset, and table names are spelled correctly and exist in your Google Cloud Project. Additionally, ensure the authenticated user or service account has the bigquery.tables.get and bigquery.tables.getData permissions (granted, for example, by the roles/bigquery.dataViewer role) on the resource.
breaking Python 2 support was officially dropped as of January 1, 2020. Current releases require Python 3.9 or higher.
fix Upgrade to Python 3.9+ and ensure all project dependencies are compatible.
gotcha Authentication is critical. Without proper credentials or a `project_id`, `pandas-gbq` will raise errors (e.g., `ValueError: Could not determine project ID`). Common authentication methods include Application Default Credentials (ADC), service account keys, or user-based OAuth.
fix Set the `GOOGLE_CLOUD_PROJECT` environment variable. Authenticate using `gcloud auth application-default login`, provide a service account JSON file via the `credentials` parameter, or set `pandas_gbq.context.credentials` and `pandas_gbq.context.project` explicitly.
breaking The `to_gbq` function has breaking changes in how it infers BigQuery data types for certain pandas dtypes. Naive (timezone-unaware) datetime columns are now loaded as BigQuery `DATETIME` instead of `TIMESTAMP`. Object columns containing boolean or dictionary values are loaded as `BOOLEAN` or `STRUCT` respectively, instead of `STRING`. `UInt8` columns are now `INT64`.
fix Review and update BigQuery table schemas if necessary. For `datetime` columns, consider making them timezone-aware (`pd.to_datetime(..., utc=True)`) if `TIMESTAMP` is desired, or explicitly define `table_schema` in `to_gbq`.
gotcha When using `to_gbq`, the default `if_exists` parameter is 'fail', meaning the operation will fail if the destination table already exists.
fix Explicitly set `if_exists='replace'` to overwrite the table, or `if_exists='append'` to add data to an existing table. Always handle this parameter carefully to prevent unintended data loss or duplication.
deprecated The `auth_local_webserver` parameter's default behavior changed from `False` to `True` in `pandas-gbq` version 1.5.0. This is due to Google deprecating the 'out-of-band' (copy-paste) authentication flow.
fix Ensure your environment allows for the local webserver flow (e.g., a browser can open `localhost:808X`). If working in a headless environment, consider using service account authentication.
python  os / libc      status  wheel   install  disk
3.9     alpine (musl)  sdist   -       2.56s    386.3M
3.9     alpine (musl)  -       -       2.43s    385.0M
3.9     slim (glibc)   wheel   17.9s   2.41s    361M
3.9     slim (glibc)   -       -       2.11s    359M
3.10    alpine (musl)  sdist   -       3.16s    396.7M
3.10    alpine (musl)  -       -       2.57s    390.1M
3.10    slim (glibc)   wheel   15.0s   2.04s    362M
3.10    slim (glibc)   -       -       1.86s    357M
3.11    alpine (musl)  sdist   -       3.71s    424.0M
3.11    alpine (musl)  -       -       4.50s    417.3M
3.11    slim (glibc)   wheel   14.4s   2.91s    389M
3.11    slim (glibc)   -       -       2.90s    383M
3.12    alpine (musl)  sdist   -       3.78s    416.8M
3.12    alpine (musl)  -       -       4.60s    410.1M
3.12    slim (glibc)   wheel   14.7s   3.35s    382M
3.12    slim (glibc)   -       -       3.72s    376M
3.13    alpine (musl)  sdist   -       3.52s    415.9M
3.13    alpine (musl)  -       -       4.29s    409.1M
3.13    slim (glibc)   wheel   13.9s   3.02s    381M
3.13    slim (glibc)   -       -       4.31s    375M

This quickstart demonstrates how to read data from a public BigQuery dataset into a pandas DataFrame and write a pandas DataFrame to a new BigQuery table. It assumes you have a Google Cloud project set up and have authenticated (e.g., using `gcloud auth application-default login`). The `project_id` is retrieved from the `GOOGLE_CLOUD_PROJECT` environment variable for robustness.

import os
import pandas as pd
import pandas_gbq

# Set your Google Cloud Project ID
# It's recommended to set this as an environment variable or via credentials
project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')

# --- Reading data from BigQuery ---
# Example query from a public dataset
sql_query = """
    SELECT country_name, alpha_2_code
    FROM `bigquery-public-data.utility_us.country_code_iso`
    WHERE alpha_2_code LIKE 'U%'
    LIMIT 5
"""

try:
    df_read = pandas_gbq.read_gbq(sql_query, project_id=project_id)
    print("\n--- Data read from BigQuery ---")
    print(df_read)
except Exception as e:
    print(f"Error reading from BigQuery: {e}")
    print("Please ensure GOOGLE_CLOUD_PROJECT is set and you have authenticated (e.g., `gcloud auth application-default login`).")

# --- Writing data to BigQuery ---
# Create a sample DataFrame to upload
data = {
    'col1': [1, 2, 3],
    'col2': ['A', 'B', 'C'],
    # utc=True makes the column timezone-aware so it loads as TIMESTAMP, not DATETIME
    'timestamp_col': pd.to_datetime(['2026-01-01', '2026-01-02', '2026-01-03'], utc=True)
}
df_write = pd.DataFrame(data)

# Define destination table (dataset.tablename)
destination_table = 'my_test_dataset.my_test_table'

# To avoid errors, you might want to replace the table if it exists for testing
# In production, consider 'append' or 'fail' with proper checks
try:
    pandas_gbq.to_gbq(
        df_write,
        destination_table,
        project_id=project_id,
        if_exists='replace' # Options: 'fail', 'replace', 'append'
    )
    print(f"\n--- DataFrame successfully written to {destination_table} in project {project_id} ---")
except Exception as e:
    print(f"Error writing to BigQuery: {e}")
    print("Ensure the dataset 'my_test_dataset' exists in your project; pandas-gbq creates the destination table but not the dataset. Create it first, e.g. with 'bq mk my_test_dataset'.")