Google Cloud BigQuery Client (async)

version 7.1.0 | verified Tue May 12 | auth: no | python install: stale

gcloud-aio-bigquery is an asynchronous Python client for Google Cloud BigQuery, built on `asyncio` and `aiohttp`. It's part of the `gcloud-aio-*` family, providing an asynchronous HTTP implementation of Google Cloud client libraries. The current version is 7.1.0 and it maintains an active release cadence.

pip install gcloud-aio-bigquery
error ModuleNotFoundError: No module named 'gcloud-aio-bigquery'
cause The 'gcloud-aio-bigquery' package is not installed in the Python environment, or the code tried to import the hyphenated distribution name rather than the actual module path `gcloud.aio.bigquery`.
fix
pip install gcloud-aio-bigquery
error ModuleNotFoundError: No module named 'google.cloud'
cause The 'google-cloud-bigquery' package is not installed in the Python environment.
fix
pip install google-cloud-bigquery
error ImportError: cannot import name 'BigQuery' from 'gcloud.aio.bigquery'
cause Incorrect import statement; 'BigQuery' is not a valid import from 'gcloud.aio.bigquery'.
fix
from gcloud.aio.bigquery import Job, Table  # the library exports Job/Table/Dataset classes, not a single BigQuery client
error ModuleNotFoundError: No module named 'gcloud.aio.bigquery'
cause The 'gcloud-aio-bigquery' library is not installed in your Python environment or is not accessible in your current Python path.
fix
Install the library using pip: pip install gcloud-aio-bigquery
error ValueError: 'external_account' is not a valid Type
cause This error occurs when using Workload Identity Federation (WIF) credentials, as the `gcloud-aio-auth` library (a dependency of `gcloud-aio-bigquery`) does not yet natively support the 'external_account' credential type.
fix
A possible workaround is to obtain an access token using Google's official google-auth library and then pass this token manually to the gcloud-aio-bigquery client.
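A minimal sketch of that workaround: mint the token with google-auth (which does understand 'external_account' credential files), then call the BigQuery v2 REST endpoint yourself, e.g. with aiohttp, attaching the token as a bearer header. `build_query_request` and `fetch_wif_token` are illustrative helpers, not part of either library:

```python
def build_query_request(project: str, sql: str, token: str):
    """Assemble URL, headers and JSON body for a bigquery.jobs.query REST call."""
    url = ('https://bigquery.googleapis.com/bigquery/v2/'
           f'projects/{project}/queries')
    headers = {'Authorization': f'Bearer {token}'}
    body = {'query': sql, 'useLegacySql': False}
    return url, headers, body


def fetch_wif_token(scopes=('https://www.googleapis.com/auth/bigquery',)) -> str:
    """Mint an access token via google-auth, which supports Workload
    Identity Federation ('external_account') credentials."""
    import google.auth  # imported lazily; requires the google-auth package
    from google.auth.transport.requests import Request

    credentials, _ = google.auth.default(scopes=list(scopes))
    credentials.refresh(Request())
    return credentials.token
```

Handing the token back to the gcloud-aio client is also possible, but the exact token-injection interface varies by library version, so verify it against the installed source before relying on it.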
gotcha The `gcloud-aio-bigquery` library currently does not support deleting rows from a table via its API. Users requiring this functionality may need to use alternative methods like the standard `google-cloud-bigquery` library or BigQuery's DML statements directly.
fix Use BigQuery DML statements (e.g., `DELETE FROM ... WHERE ...`) submitted as a query job. Note that BigQuery itself exposes no row-level delete API, so DML is the standard route in the synchronous `google-cloud-bigquery` client as well.
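As a sketch, a row delete can be expressed as a parameterized DML statement in a jobs.query request body (field names follow the BigQuery REST API; `build_delete_request` and the table name below are illustrative). Identifiers cannot be bound as query parameters, so table and column names are interpolated while the value is parameterized to avoid SQL injection:

```python
def build_delete_request(table: str, column: str, value: str) -> dict:
    """Build a jobs.query request body running a parameterized DML DELETE."""
    return {
        # Identifiers can't be bound as parameters; only values can.
        'query': f'DELETE FROM `{table}` WHERE {column} = @value',
        'useLegacySql': False,  # DML requires standard SQL
        'queryParameters': [{
            'name': 'value',
            'parameterType': {'type': 'STRING'},
            'parameterValue': {'value': value},
        }],
    }
```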
gotcha There can be confusion between `gcloud-aio-*` (asynchronous) and `gcloud-rest-*` (synchronous) client libraries, as they share a codebase and similar naming conventions. Additionally, the main client classes within `gcloud.aio.<service_name>` modules may not be directly exposed at the top level of the service module (e.g., `gcloud.aio.bigquery`), requiring deeper imports.
fix Always explicitly import from the correct full path for the client class. For `gcloud-aio-*` clients, this often means `from gcloud.aio.<service_name>.<service_name> import <ClientClassName>` (e.g., `from gcloud.aio.bigquery.bigquery import BigqueryClient`), and for `gcloud-rest-*` clients, `from gcloud.rest.<service_name> import <ClientClassName>`. Consult the specific library's documentation or source code for exact import paths.
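A defensive pattern for these shifting import paths is to try candidate locations in order. `import_first` is a hypothetical helper built on generic importlib machinery, not part of any gcloud library:

```python
from importlib import import_module


def import_first(candidates):
    """Return the first attribute that resolves from a list of
    (module_path, attribute_name) pairs; raise ImportError if none do."""
    for module_path, name in candidates:
        try:
            return getattr(import_module(module_path), name)
        except (ImportError, AttributeError):
            continue
    raise ImportError(f'none of {candidates} could be imported')
```

For example, `import_first([('gcloud.aio.bigquery', 'Job'), ('gcloud.rest.bigquery', 'Job')])` would prefer the async client and fall back to the sync one.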
gotcha The `query_response_to_dict` utility may raise exceptions when processing nullable integer fields that contain `None` values. This can lead to data parsing errors for queries returning sparse data.
fix When dealing with nullable fields, especially integers, implement robust error handling or explicitly cast/check for `None` values before processing. Consider inspecting the raw `result` structure before using helper functions if this issue occurs.
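If the helper does choke on NULLs, converting the raw jobs.query response yourself is straightforward. `rows_to_dicts` below is a hypothetical None-safe converter for the REST wire format, in which every non-NULL cell value arrives as a string:

```python
def rows_to_dicts(schema: dict, rows: list) -> list:
    """Convert BigQuery REST 'schema'/'rows' structures to plain dicts,
    leaving NULL cells as None instead of crashing on int() conversion."""
    casts = {
        'INTEGER': int, 'INT64': int,
        'FLOAT': float, 'FLOAT64': float,
        'BOOLEAN': lambda v: v == 'true', 'BOOL': lambda v: v == 'true',
    }
    out = []
    for row in rows:
        record = {}
        for field, cell in zip(schema['fields'], row['f']):
            value = cell['v']  # the REST API serializes every cell as a string
            cast = casts.get(field['type'])
            record[field['name']] = cast(value) if cast and value is not None else value
        out.append(record)
    return out
```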
gotcha Overusing `SELECT *` in BigQuery queries can significantly increase query costs and execution time, as BigQuery charges based on the amount of data scanned. This is a fundamental BigQuery best practice that applies to `gcloud-aio-bigquery` as well.
fix Always specify only the columns you need in your `SELECT` statements. Use `LIMIT` and `WHERE` clauses effectively to reduce data scanned.
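The REST API's `dryRun` flag makes this cheap to check up front: a dry-run jobs.query returns `totalBytesProcessed` without executing anything. The helpers below are sketches; the $6.25/TiB figure is the current on-demand list price and should be verified against Google's pricing page:

```python
def dry_run_body(sql: str) -> dict:
    """jobs.query request body that estimates scanned bytes without running."""
    return {'query': sql, 'useLegacySql': False, 'dryRun': True}


def estimated_cost_usd(total_bytes: int, usd_per_tib: float = 6.25) -> float:
    """Rough on-demand cost from a dry run's totalBytesProcessed.
    The default rate is an assumption; check current BigQuery pricing."""
    return total_bytes / 2**40 * usd_per_tib
```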
gotcha The `BigQuery` client class cannot be imported directly using `from gcloud.aio.bigquery import BigQuery`. The primary client class might be named differently (e.g., `Client` or `BigqueryClient`) or reside in a deeper submodule within `gcloud.aio.bigquery`.
fix Verify the exact class name and import path for the BigQuery client. Common patterns for `gcloud-aio` libraries include importing `Client` (e.g., `from gcloud.aio.bigquery import Client`) or `BigqueryClient` from a submodule (e.g., `from gcloud.aio.bigquery.client import BigqueryClient`). Refer to the library's documentation for the correct import statement.
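When in doubt, interrogate the installed package at runtime instead of guessing. `public_names` is a small generic helper for that, not part of the library:

```python
import importlib


def public_names(module_path: str) -> list:
    """List a module's public attributes, to find the real client class name."""
    module = importlib.import_module(module_path)
    return sorted(name for name in dir(module) if not name.startswith('_'))
```

Running `public_names('gcloud.aio.bigquery')` against your installed version shows the actual exports to import from.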
python | os / libc     | status | wheel install | import | disk  | mem | side effects
3.9    | alpine (musl) | wheel  | -             | -      | 45.6M | -   | broken
3.9    | alpine (musl) | -      | -             | -      | -     | -   | -
3.9    | slim (glibc)  | wheel  | 6.9s          | -      | 48M   | -   | broken
3.9    | slim (glibc)  | -      | -             | -      | -     | -   | -
3.10   | alpine (musl) | wheel  | -             | -      | 60.1M | -   | broken
3.10   | alpine (musl) | -      | -             | -      | -     | -   | -
3.10   | slim (glibc)  | wheel  | 6.3s          | -      | 62M   | -   | broken
3.10   | slim (glibc)  | -      | -             | -      | -     | -   | -
3.11   | alpine (musl) | wheel  | -             | -      | 68.4M | -   | broken
3.11   | alpine (musl) | -      | -             | -      | -     | -   | -
3.11   | slim (glibc)  | wheel  | 5.1s          | -      | 71M   | -   | broken
3.11   | slim (glibc)  | -      | -             | -      | -     | -   | -
3.12   | alpine (musl) | wheel  | -             | -      | 62.8M | -   | broken
3.12   | alpine (musl) | -      | -             | -      | -     | -   | -
3.12   | slim (glibc)  | wheel  | 4.5s          | -      | 65M   | -   | broken
3.12   | slim (glibc)  | -      | -             | -      | -     | -   | -
3.13   | alpine (musl) | wheel  | -             | -      | 62.2M | -   | broken
3.13   | alpine (musl) | -      | -             | -      | -     | -   | -
3.13   | slim (glibc)  | wheel  | 4.6s          | -      | 64M   | -   | broken
3.13   | slim (glibc)  | -      | -             | -      | -     | -   | -

This quickstart authenticates implicitly through `gcloud-aio-auth` and executes a simple SQL query against a public BigQuery dataset. Note that, per the gotchas above, the library does not export a `BigQuery` class; its top-level exports are `Job`, `Table`, `Dataset`, and helpers such as `query_response_to_dict`. Ensure your `GOOGLE_CLOUD_PROJECT` environment variable is set to your Google Cloud project ID, and `GOOGLE_APPLICATION_CREDENTIALS` points to your service account key file for local execution; `gcloud-aio-auth` discovers these automatically.

import asyncio
import os

import aiohttp

from gcloud.aio.bigquery import Job, query_response_to_dict


async def main():
    # Credentials are discovered from GOOGLE_APPLICATION_CREDENTIALS
    # (or the metadata server) by gcloud-aio-auth; no manual Token handling.
    project = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')  # Replace with your project ID

    query = """
        SELECT name, SUM(number) AS total_babies
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        WHERE state = 'TX'
        GROUP BY name
        ORDER BY total_babies DESC
        LIMIT 5
    """

    async with aiohttp.ClientSession() as session:
        # Job wraps the bigquery.jobs.query REST endpoint.
        job = Job(project=project, session=session)

        print(f'Executing query for project: {project}')
        response = await job.query({'query': query, 'useLegacySql': False})

        print('Top 5 baby names in Texas (1910-2013):')
        for row in query_response_to_dict(response):
            print(f"- {row['name']}: {row['total_babies']}")


if __name__ == '__main__':
    asyncio.run(main())