PyAthena

3.30.1 · active · verified Sun Mar 29

PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena, enabling SQL queries on data stored in Amazon S3. It provides a familiar interface for database interactions, supports various cursor types (e.g., standard, Pandas, Arrow), SQLAlchemy integration, and asynchronous query execution. The library is actively maintained with frequent updates.
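Because the interface follows PEP 249, code written against the generic cursor API carries over. The sketch below turns fetched rows into dictionaries using only `cursor.description` and `fetchall()`. It uses the standard-library `sqlite3` module as a stand-in so that it runs without AWS access. The helper name `rows_as_dicts` is hypothetical, and the same function is expected (not verified here) to work with a PyAthena cursor.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Yield each fetched row as a dict keyed by column name.

    Hypothetical helper: relies only on PEP 249 attributes.
    """
    # PEP 249: description is a sequence of 7-item sequences; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    for row in cursor.fetchall():
        yield dict(zip(columns, row))


# sqlite3 stands in for Athena here so the sketch runs locally.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("SELECT 1 AS one, 'hello' AS greeting")
result = list(rows_as_dicts(cursor))
cursor.close()
conn.close()

print(result)
```

With a real Athena connection, the cursor passed in would come from `pyathena.connect(...).cursor()` instead of `sqlite3`.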

Quickstart

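This quickstart connects to Amazon Athena, runs a simple SQL query, and fetches the results using `pyathena`. AWS credentials are resolved by `boto3` from environment variables, an IAM role, or `~/.aws/credentials`. The example passes `s3_staging_dir` and `region_name` explicitly. Athena needs a result location for every query, so supply `s3_staging_dir` unless you connect with a `work_group` that already has an output location configured; `region_name` can also come from the standard `boto3` region configuration when it is not passed.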

import os
from pyathena import connect

# Configure these environment variables or replace with actual values
# AWS_S3_STAGING_DIR: S3 path for Athena query results (e.g., "s3://my-athena-results-bucket/")
# AWS_REGION_NAME: AWS region (e.g., "us-east-1")
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN will be picked up by boto3 if not explicitly passed
s3_staging_dir = os.environ.get('AWS_S3_STAGING_DIR', 's3://your-athena-query-results-bucket/')
region_name = os.environ.get('AWS_REGION_NAME', 'us-east-1')
aws_access_key_id = os.environ.get('AWS_ACCESS_KEY_ID')
aws_secret_access_key = os.environ.get('AWS_SECRET_ACCESS_KEY')
aws_session_token = os.environ.get('AWS_SESSION_TOKEN')

# Ensure mandatory parameters are set
if not s3_staging_dir.startswith('s3://') or not region_name:
    print("Error: AWS_S3_STAGING_DIR and AWS_REGION_NAME must be set correctly.")
else:
    conn = None
    try:
        # Connect to Athena
        conn = connect(
            s3_staging_dir=s3_staging_dir,
            region_name=region_name,
            aws_access_key_id=aws_access_key_id,  # Optional: boto3 usually handles this
            aws_secret_access_key=aws_secret_access_key,  # Optional
            aws_session_token=aws_session_token,  # Optional
        )
        cursor = conn.cursor()

        # Execute a sample query
        cursor.execute("SELECT 1 as one, 'hello' as greeting")

        # Fetch results; each row is a tuple in column order
        print("Query Results:")
        for row in cursor.fetchall():
            print(row)

        cursor.close()

    except Exception as e:
        print(f"An error occurred: {e}")
        print("Please ensure AWS credentials are configured (e.g., via environment variables, ~/.aws/credentials, or IAM role) and AWS_S3_STAGING_DIR and AWS_REGION_NAME are set correctly.")

    finally:
        # Close the connection even if connecting succeeded but the query failed
        if conn is not None:
            conn.close()
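The quickstart passes the credential arguments even when the corresponding environment variables are unset. One alternative is to build the keyword arguments first and include only the values that are actually present, leaving the rest to `boto3`. The following is a minimal sketch of that idea. The helper `build_connect_kwargs` and the mapping `ENV_TO_KWARG` are hypothetical names, and the environment variable names are the ones used in the quickstart, not names defined by PyAthena.

```python
import os

# Hypothetical mapping from the quickstart's environment variables
# to the keyword arguments passed to connect() in the quickstart.
ENV_TO_KWARG = {
    "AWS_S3_STAGING_DIR": "s3_staging_dir",
    "AWS_REGION_NAME": "region_name",
    "AWS_ACCESS_KEY_ID": "aws_access_key_id",
    "AWS_SECRET_ACCESS_KEY": "aws_secret_access_key",
    "AWS_SESSION_TOKEN": "aws_session_token",
}


def build_connect_kwargs(environ=os.environ):
    """Return connect() keyword arguments for the variables that are set."""
    # Skip unset or empty variables so boto3 can apply its own defaults.
    kwargs = {kw: environ[env] for env, kw in ENV_TO_KWARG.items() if environ.get(env)}
    # Fail early on a missing or malformed staging location.
    if not kwargs.get("s3_staging_dir", "").startswith("s3://"):
        raise ValueError("AWS_S3_STAGING_DIR must be set to an s3:// URI")
    return kwargs


# Example with an explicit mapping instead of the real process environment.
example_env = {
    "AWS_S3_STAGING_DIR": "s3://example-bucket/results/",
    "AWS_REGION_NAME": "us-east-1",
    "AWS_SESSION_TOKEN": "",  # empty values are treated as unset
}
kwargs = build_connect_kwargs(example_env)
print(kwargs)
```

With PyAthena installed, the result could then be passed along as `connect(**build_connect_kwargs())`.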
