Google Cloud BigQuery Client (async)
`gcloud-aio-bigquery` is an asynchronous Python client for Google Cloud BigQuery, built on `asyncio` and `aiohttp`. It is part of the `gcloud-aio-*` family, which provides asynchronous HTTP implementations of the Google Cloud client libraries. The current version is 7.1.0, and new releases are published regularly.
Warnings
- gotcha The `gcloud-aio-bigquery` library currently does not support deleting rows from a table via its API. Users requiring this functionality may need to use alternative methods like the standard `google-cloud-bigquery` library or BigQuery's DML statements directly.
- gotcha There can be confusion between `gcloud-aio-*` (asynchronous) and `gcloud-rest-*` (synchronous) client libraries, as they share a codebase and similar naming conventions. Ensure you are importing and using the correct `gcloud.aio.*` modules for asynchronous operations.
- gotcha The `query_response_to_dict` utility may raise exceptions when processing nullable integer fields that contain `None` values. This can lead to data parsing errors for queries returning sparse data.
- gotcha Overusing `SELECT *` can significantly increase query cost and execution time. Under on-demand pricing BigQuery bills by the bytes scanned, and because storage is columnar, `SELECT *` reads every column of the table. Select only the columns you need. This is a general BigQuery best practice that applies to `gcloud-aio-bigquery` as well.
Install
pip install gcloud-aio-bigquery
Imports
- BigQuery
from gcloud.aio.bigquery import BigQuery
- Token
from gcloud.aio.auth import Token
Quickstart
import asyncio
import os

import aiohttp
from gcloud.aio.auth import Token
from gcloud.aio.bigquery import BigQuery


async def main():
    # Ensure GOOGLE_CLOUD_PROJECT and GOOGLE_APPLICATION_CREDENTIALS
    # are set in your environment for authentication.
    project = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')  # Replace with your project ID

    async with aiohttp.ClientSession() as session:
        # Obtain Google Cloud credentials token
        token = await Token(session=session).get()

        # Initialize BigQuery client
        client = BigQuery(project=project, session=session, token=token)

        query = """
            SELECT name, SUM(number) as total_babies
            FROM `bigquery-public-data.usa_names.usa_1910_2013`
            WHERE state = 'TX'
            GROUP BY name
            ORDER BY total_babies DESC
            LIMIT 5
        """

        print(f"Executing query for project: {project}")
        job_id, result = await client.query_and_wait(query)
        print(f"Query Job ID: {job_id}")

        print("Top 5 baby names in Texas (1910-2013):")
        for row in result['rows']:
            print(f"- {row['name']}: {row['total_babies']}")


if __name__ == "__main__":
    asyncio.run(main())
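The quickstart awaits a single query. Because the client is built on `asyncio`, independent queries can in principle be issued concurrently. The sketch below shows that pattern with `asyncio.gather`. `run_query` is a hypothetical stand-in that only sleeps, so the real client call would have to be substituted, and the concurrency limit of 2 is an arbitrary assumption.

```python
import asyncio
from typing import List


async def run_query(sql: str, sem: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for a real query call; it only sleeps."""
    async with sem:  # cap the number of in-flight queries
        await asyncio.sleep(0.01)  # simulates network latency
        return f"done: {sql}"


async def run_all(queries: List[str], limit: int = 2) -> List[str]:
    """Run all queries concurrently, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)
    # gather returns results in the same order as its arguments.
    return await asyncio.gather(*(run_query(q, sem) for q in queries))


results = asyncio.run(run_all(["SELECT 1", "SELECT 2", "SELECT 3"]))
print(results)
```

The semaphore keeps the number of simultaneous requests bounded, which may help stay within API quotas. In real use, a single `aiohttp.ClientSession` would typically be shared by all the concurrent calls.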