Shillelagh
Shillelagh is a Python library and CLI tool that enables users to query various data sources, including APIs, files, and in-memory objects, using standard SQL. It implements the Python DB API 2.0 specification based on SQLite (using the APSW library) and also provides a SQLAlchemy dialect. The library is actively maintained with frequent releases, with the current version being 1.4.4.
Warnings
- breaking Shillelagh v1.3.2 introduced a minimum version requirement for `apsw` (3.43.2.0 or higher) due to an internal API change (`createscalarfunction` renamed to `create_scalar_function`). Older `apsw` versions will cause import errors or runtime failures.
- gotcha When querying resources requiring credentials, Shillelagh may embed sensitive information directly into the virtual table name within the SQL query string. This is not recommended for security reasons, especially when logging queries or exposing them.
- gotcha Prior to version 1.4.4, date parsing error messages could be unhelpful or lack context, making debugging data ingestion issues difficult.
Install
-
pip install shillelagh -
pip install 'shillelagh[multicorn]'
Imports
- connect
from shillelagh.backends.apsw.db import connect
- create_engine
from sqlalchemy.engine import create_engine; engine = create_engine('shillelagh://')
Quickstart
import os
from shillelagh.backends.apsw.db import connect
# Example using a public Google Sheet as a data source
# For sensitive data, consider the security warning about credentials in table names.
# A public sheet with some example data
SHEET_URL = "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
# Connect to the in-memory database
connection = connect(":memory:")
cursor = connection.cursor()
try:
# Query the Google Sheet directly via its URL
query = f'SELECT country, SUM(cnt) FROM "{SHEET_URL}" WHERE cnt > 0 GROUP BY country'
print(f"Executing query:\n{query}\n")
for row in cursor.execute(query):
print(row)
except Exception as e:
print(f"An error occurred: {e}")
finally:
connection.close()