SQLAlchemy BigQuery Dialect

1.16.0 · active · verified Sat Mar 28

SQLAlchemy BigQuery Dialect (sqlalchemy-bigquery) is a Python library that provides a SQLAlchemy dialect for connecting to and interacting with Google BigQuery. It enables users to leverage SQLAlchemy's Core and ORM functionalities to query, manipulate, and manage BigQuery data, treating it like a traditional relational database. The library is actively maintained by Google Cloud, with regular releases, and its current version is 1.16.0.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to establish a connection to Google BigQuery using `sqlalchemy-bigquery` and perform basic queries using both raw SQL and SQLAlchemy Core expressions. It assumes Google Cloud authentication is configured, typically via the `GOOGLE_APPLICATION_CREDENTIALS` environment variable or `gcloud` CLI.

import os
from sqlalchemy import create_engine, text, Table, MetaData, Column, String, Integer

# Configure your Google Cloud Project ID. This can also be inferred from the environment.
# For local development, ensure GOOGLE_APPLICATION_CREDENTIALS points to your service account key file.
PROJECT_ID = os.environ.get('GCP_PROJECT_ID', 'your-gcp-project-id')
DATASET_ID = os.environ.get('BIGQUERY_DATASET_ID', 'your_dataset_id')
TABLE_NAME = os.environ.get('BIGQUERY_TABLE_NAME', 'your_table_name')

# Connection string format: 'bigquery://[PROJECT_ID]/[DATASET_ID]'
# If PROJECT_ID is omitted, it will try to infer from the environment.
# Authentication typically happens via GOOGLE_APPLICATION_CREDENTIALS env var or gcloud CLI.
engine = create_engine(f'bigquery://{PROJECT_ID}/{DATASET_ID}')

try:
    # Example 1: Execute a raw SQL query
    with engine.connect() as connection:
        result = connection.execute(text(f"SELECT 1 FROM `{PROJECT_ID}.{DATASET_ID}.{TABLE_NAME}` LIMIT 1"))
        print(f"Raw SQL query result: {result.scalar()}")

    # Example 2: Reflect table metadata and query using SQLAlchemy Core
    metadata = MetaData()
    my_table = Table(TABLE_NAME, metadata, autoload_with=engine)

    with engine.connect() as connection:
        # Select all columns from the table, limit to 5 rows
        select_stmt = my_table.select().limit(5)
        result = connection.execute(select_stmt)
        print("\nQuerying table via SQLAlchemy Core (first 5 rows):")
        for row in result:
            print(row)

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure GOOGLE_APPLICATION_CREDENTIALS is set, your project/dataset/table exist, and you have permissions.")

view raw JSON →