{"id":1015,"library":"pyathena","title":"PyAthena","description":"PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena, enabling SQL queries on data stored in Amazon S3. It provides a familiar interface for database interactions, supports various cursor types (e.g., standard, Pandas, Arrow), SQLAlchemy integration, and asynchronous query execution. The library is actively maintained with frequent updates.","status":"active","version":"3.30.1","language":"python","source_language":"en","source_url":"https://github.com/pyathena-dev/PyAthena","tags":["aws","athena","database","sql","data-lake","etl"],"install":[{"cmd":"pip install pyathena","lang":"bash","label":"Core library"},{"cmd":"pip install \"pyathena[sqlalchemy,pandas,arrow,polars]\"","lang":"bash","label":"With common extras (SQLAlchemy, Pandas, Arrow, Polars)"}],"dependencies":[{"reason":"Required for AWS API interactions.","package":"boto3","optional":false},{"reason":"Required for AWS API interactions (a dependency of boto3).","package":"botocore","optional":false},{"reason":"Optional, for SQLAlchemy dialect support.","package":"SQLAlchemy","optional":true},{"reason":"Optional, for PandasCursor to fetch results as DataFrames.","package":"pandas","optional":true},{"reason":"Optional, for ArrowCursor to fetch results as Apache Arrow tables.","package":"pyarrow","optional":true},{"reason":"Optional, for PolarsCursor to fetch results as Polars DataFrames.","package":"polars","optional":true}],"imports":[{"symbol":"connect","correct":"from pyathena import connect"}],"quickstart":{"code":"import os\nfrom pyathena import connect\n\n# Configure these environment variables or replace with actual values\n# AWS_S3_STAGING_DIR: S3 path for Athena query results (e.g., \"s3://my-athena-results-bucket/\")\n# AWS_REGION_NAME: AWS region (e.g., \"us-east-1\")\n# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN will be picked up by boto3 if not explicitly passed\ns3_staging_dir = 
os.environ.get('AWS_S3_STAGING_DIR', 's3://your-athena-query-results-bucket/')\nregion_name = os.environ.get('AWS_REGION_NAME', 'us-east-1')\naws_access_key_id = os.environ.get('AWS_ACCESS_KEY_ID')\naws_secret_access_key = os.environ.get('AWS_SECRET_ACCESS_KEY')\naws_session_token = os.environ.get('AWS_SESSION_TOKEN')\n\n# Ensure mandatory parameters are set\nif not s3_staging_dir.startswith('s3://') or not region_name:\n    print(\"Error: AWS_S3_STAGING_DIR and AWS_REGION_NAME must be set correctly.\")\nelse:\n    try:\n        # Connect to Athena\n        conn = connect(\n            s3_staging_dir=s3_staging_dir,\n            region_name=region_name,\n            aws_access_key_id=aws_access_key_id, # Optional: boto3 usually handles this\n            aws_secret_access_key=aws_secret_access_key, # Optional\n            aws_session_token=aws_session_token # Optional\n        )\n        cursor = conn.cursor()\n\n        # Execute a sample query\n        cursor.execute(\"SELECT 1 as one, 'hello' as greeting\")\n\n        # Fetch results\n        print(\"Query Results:\")\n        for row in cursor.fetchall():\n            print(row)\n\n        # Close cursor and connection\n        cursor.close()\n        conn.close()\n\n    except Exception as e:\n        print(f\"An error occurred: {e}\")\n        print(\"Please ensure AWS credentials are configured (e.g., via environment variables, ~/.aws/credentials, or IAM role) and AWS_S3_STAGING_DIR and AWS_REGION_NAME are set correctly.\")","lang":"python","description":"This quickstart demonstrates how to establish a connection to Amazon Athena, execute a simple SQL query, and fetch results using `pyathena`. It expects AWS credentials to be configured via environment variables, IAM roles, or `~/.aws/credentials` (handled by `boto3`). 
A query result location and an AWS region are required to connect; this example passes `s3_staging_dir` and `region_name` explicitly."},"warnings":[{"fix":"If your code relies on the previous heuristic type inference for complex types, explicitly provide `result_set_type_hints` in your `connect` or `cursor.execute()` calls to specify the expected Athena DDL type signatures for affected columns. Otherwise, adapt your code to handle string values for complex type elements.","message":"Starting with PyAthena v3.30.0, the library no longer infers Python types for scalar values inside complex Athena types (e.g., '123' to 123 in structs/arrays). Values are kept as strings unless `result_set_type_hints` is provided.","severity":"breaking","affected_versions":">=3.30.0"},{"fix":"Pass `s3_staging_dir` (e.g., `s3://your-bucket/path/to/results/`) and `region_name` (e.g., `us-east-1`) to `pyathena.connect()` explicitly, or use a `work_group` that has an output location configured and let `boto3` resolve the region from its own configuration.","message":"A connection needs a query result location and an AWS region. Supply `s3_staging_dir` (or a `work_group` with a configured output location) and make sure a region is available, either through `region_name` or through `boto3` configuration. Without them, connecting or running a query fails.","severity":"gotcha","affected_versions":"All"},{"fix":"Consider using `PandasCursor` with the `chunksize` option (e.g., `cursor_class=PandasCursor, cursor_kwargs={'chunksize': 100000}`) to bound memory use, or, for massive datasets, download the result CSV that Athena writes to the S3 staging directory and process that file directly.","message":"The default cursor pages results through the Athena API in small batches, so fetching very large result sets can be slow and become a bottleneck for large-scale analysis.","severity":"gotcha","affected_versions":"All"},{"fix":"Verify `boto3`'s credential chain can find valid AWS credentials. 
For explicit control, you can pass `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token` directly to `pyathena.connect()`.","message":"Ensure your AWS environment is correctly configured for authentication (e.g., IAM role, `~/.aws/credentials`, or environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`). PyAthena relies on `boto3` for credential resolution.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-05-12T22:38:44.821Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Install the library using pip: `pip install PyAthena` or ensure the correct virtual environment is activated.","cause":"The `pyathena` library is not installed in the Python environment, or the Python interpreter cannot locate it in its search path.","error":"ModuleNotFoundError: No module named 'pyathena'"},{"fix":"Configure your Athena workgroup to use Athena engine version 3, or disable the feature causing the error (e.g., `result_reuse_enable=False`) if backward compatibility is required.","cause":"This error occurs when attempting to use a feature, such as result reuse, that requires Athena engine version 3, but the configured workgroup is using an older engine version.","error":"pyathena.error.DatabaseError: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: This functionality is not enabled in the selected engine version."},{"fix":"Grant the required AWS IAM permissions (`athena:StartQueryExecution`, `athena:GetQueryExecution`, `athena:GetQueryResults`, `s3:GetObject`, `s3:ListBucket`, `s3:PutObject`, `s3:DeleteObject` for the staging S3 bucket) to the IAM entity PyAthena is using.","cause":"The AWS IAM user or role configured for PyAthena lacks the necessary permissions to execute Athena queries or access the specified S3 staging directory.","error":"botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the 
StartQueryExecution operation: User: arn:aws:iam::... is not authorized to perform: athena:StartQueryExecution on resource: ..."},{"fix":"Review the PyAthena connection parameters and query arguments, ensuring they conform to the expected types and formats as per PyAthena and `boto3` documentation for the Athena service.","cause":"An invalid or incorrectly formatted parameter was passed to an underlying `boto3` call made by PyAthena, often due to a mismatch with expected types or values by the AWS API.","error":"botocore.exceptions.ParamValidationError: Parameter validation failed:"},{"fix":"Implement robust error handling and `None` checks around PyAthena API calls, especially for `connect()`, `cursor.execute()`, and result fetching methods, to identify and handle cases where these operations might fail and return `None`.","cause":"This generic Python error occurs in PyAthena when an operation attempts to access an attribute (like `get`) on an object that is `None`, usually because a preceding step (e.g., connection, query execution, or result fetching) failed to return a valid object and instead returned `None`.","error":"AttributeError: 'NoneType' object has no attribute 'get'"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"3.30.1","cli_name":"","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"609.5M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine 
(musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"570.8M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"54.1M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"53.9M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":15.9,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"575M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"538M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":4.5,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"55M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"54M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"630.0M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.03,"mem_mb":1.1,"disk_size":"591.2M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"57.8M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.03,"mem_mb":1.1,"disk_size":"57.6M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":14.3,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"595M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"557M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":4.1,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"58M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":1.1,"disk_size":"58M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"613.8M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":0.7,"disk_size":"575.0M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":0.7,"disk_size":"49.3M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.02,"mem_mb":0.7,"disk_size":"49.1M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":14.1,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"579M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"541M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.6,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"50M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"50M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"612.2M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"573.3M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"49.0M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.7,"disk_size":"48.7M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":13.6,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"577M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"539M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":3.2,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"49M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"49M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"387.1M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine 
(musl)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"387.0M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"52.8M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"52.7M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":16.4,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"363M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"sqlalchemy,pandas,arrow,polars","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"363M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":5.2,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"53M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.01,"mem_mb":0.5,"disk_size":"53M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":0},{"runtime":"python:3.9-slim","exit_code":0}]}}