{"id":6785,"library":"pyathenajdbc","title":"PyAthenaJDBC: Amazon Athena JDBC driver wrapper for Python DB API 2.0","description":"PyAthenaJDBC is a Python DB API 2.0 (PEP 249) compliant wrapper for Amazon Athena that uses the official JDBC driver via JPype. It lets you interact with Athena from Python using standard database connection patterns. The library is currently at version 3.0.1 and is updated frequently to support the latest Athena JDBC driver and to port features from its pure Python counterpart, PyAthena.","status":"active","version":"3.0.1","language":"en","source_language":"en","source_url":"https://github.com/laughingman7743/PyAthenaJDBC/","tags":["aws","athena","jdbc","database","sql","data-warehouse","db-api"],"install":[{"cmd":"pip install pyathenajdbc","lang":"bash","label":"Install PyAthenaJDBC"}],"dependencies":[{"reason":"Required for bridging Python with the Java JDBC driver.","package":"JPype1"},{"reason":"Used for AWS credential resolution (via DefaultAWSCredentialsProviderChain) and S3 interactions.","package":"boto3","optional":true},{"reason":"Optional, if using PyAthenaJDBC as a SQLAlchemy dialect.","package":"SQLAlchemy","optional":true},{"reason":"Optional, for integrations like `pandas.to_sql`.","package":"pandas","optional":true}],"imports":[{"note":"The primary entry point is the connect function, following DB API 2.0 standards.","wrong":"from pyathenajdbc.connection import Connection","symbol":"connect","correct":"from pyathenajdbc import connect"}],"quickstart":{"code":"import os\nfrom pyathenajdbc import connect\n\n# Ensure AWS credentials (e.g., AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)\n# and region (AWS_REGION) are set via environment variables or boto3 config.\n\n# Athena needs an S3 location to store query results.\n# This example reads it from an environment variable.\n# Example: export AWS_ATHENA_S3_OUTPUT_LOCATION='s3://your-athena-query-results-bucket/'\ns3_output_location = 
os.environ.get('AWS_ATHENA_S3_OUTPUT_LOCATION', '')\n\nif not s3_output_location:\n    raise ValueError(\"AWS_ATHENA_S3_OUTPUT_LOCATION environment variable must be set.\")\n\nconn = None  # Lets the finally block tell whether connect() succeeded\ntry:\n    conn = connect(\n        AwsRegion=os.environ.get('AWS_REGION', 'us-east-1'),\n        Schema='default',  # Your Athena database name\n        S3OutputLocation=s3_output_location,\n        # User=os.environ.get('AWS_ACCESS_KEY_ID'),  # Optional if using default credential chain\n        # Password=os.environ.get('AWS_SECRET_ACCESS_KEY')  # Optional if using default credential chain\n    )\n\n    with conn.cursor() as cursor:\n        cursor.execute(\"SELECT 1 AS one_value\")\n        row = cursor.fetchone()\n        print(f\"Result from SELECT 1: {row}\")\n\n        cursor.execute(\"SHOW TABLES\")\n        tables = cursor.fetchall()\n        print(f\"Tables in 'default' schema: {tables}\")\n\nfinally:\n    if conn is not None:\n        conn.close()","lang":"python","description":"Connects to Amazon Athena using default AWS credentials and executes a simple query. Athena needs an S3 location for query results, so this example passes `S3OutputLocation` explicitly. Credentials can be passed explicitly but are often resolved automatically by the `DefaultAWSCredentialsProviderChain`."},"warnings":[{"fix":"Upgrade to Python 3.6.1 or newer. Review custom formatter/converter implementations for compatibility with the new interfaces.","message":"Version 3.0.0 dropped support for Python 2.7 and Python 3.5. It also redesigned Formatter and Converter classes, which might affect custom type handling.","severity":"breaking","affected_versions":">=3.0.0"},{"fix":"Update `connect` call arguments in your code to use the new names. Refer to the official JDBC driver documentation or the library's README for the full list of argument changes.","message":"Version 2.1.0 changed the argument names for the `connect` method to align with the JDBC driver's Driver Configuration Options. 
For example, `access_key` became `User`, `secret_key` became `Password`, `region_name` became `AwsRegion`, `schema_name` became `Schema`, and `s3_staging_dir` became `S3OutputLocation`.","severity":"breaking","affected_versions":">=2.1.0"},{"fix":"Ensure that network access to the new S3 endpoint for the JDBC driver (e.g., `https://s3.amazonaws.com/athena-downloads/drivers/JDBC/SimbaAthenaJDBC-2.0.15.1000/AthenaJDBC42.jar`) is permitted.","message":"The Amazon Athena JDBC driver download URL changed in v3.0.0 (for driver 2.0.15). If you are behind a strict firewall or proxy that whitelists specific URLs, this change might prevent the library from automatically downloading the driver JAR.","severity":"gotcha","affected_versions":">=3.0.0"},{"fix":"If you encounter issues, consult the `pyathenajdbc` and `JPype1` release notes for known compatibility ranges. You might need to pin `JPype1` to a specific version or upgrade `pyathenajdbc`.","message":"PyAthenaJDBC relies on JPype1 for its Java bridge. Historically, there have been specific JPype1 version incompatibilities (e.g., v2.0.6 pinned JPype1 to <=0.7.1). While newer versions aim for broader compatibility, always check release notes when upgrading either library.","severity":"gotcha","affected_versions":"<3.0.0 (historically)"},{"fix":"Always ensure an `S3OutputLocation` is specified either in the `connect` call or through your Athena Workgroup configuration to avoid query execution errors.","message":"Although `S3OutputLocation` (formerly `s3_staging_dir`) has been optional in the `connect` method since v2.0.8, Athena queries fundamentally require an S3 location to store query results. If you omit it from `connect`, it must be configured at the Athena Workgroup level or through other default settings; otherwise, queries will fail.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}