{"id":3449,"library":"dataset","title":"Python Dataset Library (SQL Toolkit)","description":"The Python 'dataset' library (version 1.6.2) is a lightweight toolkit for simplified Python-based database access, abstracting away much of the direct SQL interaction. It enables reading and writing data in SQL data stores with an API designed to feel as straightforward as working with JSON files, offering features like implicit table and column creation, upserts, and convenient query helpers. Built on SQLAlchemy, it ensures compatibility with major databases such as SQLite, PostgreSQL, and MySQL. The library maintains a steady release cadence with bug fixes and feature enhancements.","status":"active","version":"1.6.2","language":"en","source_language":"en","source_url":"https://github.com/pudo/dataset","tags":["database","sql","orm","data","toolkit","sqlite","postgresql","mysql"],"install":[{"cmd":"pip install dataset","lang":"bash","label":"Install core library"},{"cmd":"pip install \"dataset[postgresql]\" # or \"dataset[mysql]\"","lang":"bash","label":"Install with database drivers (optional)"}],"dependencies":[{"reason":"Core dependency for database interaction.","package":"SQLAlchemy","optional":false},{"reason":"Required for PostgreSQL support.","package":"psycopg2","optional":true},{"reason":"Required for MySQL support.","package":"mysqlclient","optional":true},{"reason":"For data export features, extracted into a separate package as of dataset v1.0.","package":"datafreeze","optional":true}],"imports":[{"symbol":"dataset","correct":"import dataset"}],"quickstart":{"code":"import dataset\nimport os\n\n# Connect to an in-memory SQLite database\ndb = dataset.connect('sqlite:///:memory:')\n\n# Get a table reference; it will be created if it doesn't exist\ntable = db['users']\n\n# Insert new records; columns are created automatically\ntable.insert(dict(name='John Doe', age=30, city='New York'))\ntable.insert(dict(name='Jane Smith', age=25, city='London'))\n\n# Update an 
existing record\ntable.update(dict(name='John Doe', age=31), ['name'])\n\n# Find all records\nall_users = table.all()\nprint(\"All users:\")\nfor user in all_users:\n    print(user)\n\n# Find one record by a specific field\njohn = table.find_one(name='John Doe')\nprint(f\"\\nJohn Doe's updated age: {john['age']}\")\n\n# Find records with a filter\nlondon_users = table.find(city='London')\nprint(\"\\nUsers in London:\")\nfor user in london_users:\n    print(user)\n\n# Using environment variable for database URL (example)\n# os.environ['DATABASE_URL'] = 'sqlite:///mydb.db'\n# db_env = dataset.connect()\n# print(f\"\\nConnected via env var to: {db_env.url}\")\n","lang":"python","description":"This quickstart demonstrates how to connect to a database (using an in-memory SQLite database for simplicity), create a table implicitly, insert and update data, and query records using the `dataset` library."},"warnings":[{"fix":"Upgrade Python to 3.9+ and SQLAlchemy to a compatible version (>=1.4.0, preferably 2.0+). Consult the official Changelog and migration guides.","message":"Version 1.7.0 (released March 28, 2026, on GitHub, though not yet on PyPI at time of verification) introduces significant breaking changes. It requires Python 3.9+ and adds full support for SQLAlchemy 2.0+ (with backward compatibility to 1.4.0). The build system migrated to Hatchling, linting to Ruff, and testing to pytest. Users should review the Changelog for a full list of changes and potential migration steps.","severity":"breaking","affected_versions":">=1.7.0"},{"fix":"Install `datafreeze` separately (`pip install datafreeze`) and adjust code to use its API for data export functionality.","message":"As of `dataset` version 1.0, the data export features (e.g., freezing data to CSV or JSON) were extracted into a separate, standalone package named `datafreeze`. 
Projects relying on these export capabilities will need to install and use `datafreeze` in addition to `dataset`.","severity":"breaking","affected_versions":">=1.0"},{"fix":"Install the required database driver using `pip install <driver_package_name>` (e.g., `pip install psycopg2-binary` for PostgreSQL, `pip install mysqlclient` for MySQL), or use the optional installation syntax like `pip install \"dataset[postgresql]\"`.","message":"Database-specific drivers (e.g., `psycopg2` for PostgreSQL, `mysqlclient` for MySQL) are NOT automatically installed with the `dataset` package. You must install the appropriate driver separately for the database backend you intend to use. SQLite is built into Python and does not require an additional driver.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For production, use environment variables or a configuration management system to securely inject database credentials. For local development, pass the URL explicitly or ensure the environment variable is set safely.","message":"For configuring database connections, `dataset.connect()` can automatically use a database URL defined in the `DATABASE_URL` environment variable if no URL is explicitly passed. While convenient, ensure sensitive credentials in this environment variable are managed securely, especially in production environments.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}