{"library":"soda-core-duckdb","title":"Soda Core DuckDB Connector","type":"library","description":"soda-core-duckdb is a Python connector that lets Soda Core, an open-source data quality and data contract verification engine, connect to DuckDB databases and run data quality checks against them. You define data quality expectations in YAML (SodaCL) and run scans programmatically or via the CLI to validate data. The library is actively maintained as part of the broader Soda Core ecosystem, which sees frequent updates and new feature releases.","language":"python","status":"active","last_verified":"2026-05-16","install":{"commands":["pip install soda-core-duckdb"],"cli":{"name":"soda","version":"soda-core, version 3.5.6"}},"imports":["from soda.scan import Scan"],"auth":{"required":false,"env_vars":[]},"links":{"homepage":"https://www.soda.io","github":null,"docs":null,"changelog":null,"pypi":"https://pypi.org/project/soda-core-duckdb/","npm":null,"openapi_spec":null,"status_page":null,"smithery":null},"quickstart":{"code":"import os\nimport duckdb\nfrom soda.scan import Scan\n\n# 1. Create a dummy DuckDB database and a table\ncon = duckdb.connect(database=':memory:', read_only=False)\ncon.execute(\"CREATE TABLE my_table (id INTEGER, name VARCHAR);\")\ncon.execute(\"INSERT INTO my_table VALUES (1, 'Alice'), (2, 'Bob'), (3, NULL);\")\n\n# 2. Define a data source configuration (optional for in-memory, but good practice)\n# This would typically be in a configuration.yml file\n# ds_config_content = \"\"\"\n# data_source my_duckdb:\n#   type: duckdb\n#   connection:\n#     database: ':memory:'\n# \"\"\"\n\n# 3. Define SodaCL checks in a checks.yml file\nchecks_content = \"\"\"\nchecks for my_table:\n  - row_count > 0\n  - missing_count(name) = 1\n  - column_count = 2\n\"\"\"\n\nwith open('checks.yml', 'w') as f:\n    f.write(checks_content)\n\n# 4. 
Programmatically run a Soda scan\nscan = Scan()\n# Register the connection under the same name that the scan is told to use\nscan.add_duckdb_connection(duckdb_connection=con, data_source_name='my_duckdb_source')\nscan.set_data_source_name('my_duckdb_source') # Logical name for the data source\nscan.add_sodacl_yaml_file('checks.yml')\n\nprint('Running Soda scan...')\nscan.execute()\n\nif scan.has_check_fails():\n    print('Scan failed!')\n    # Optionally, you can assert or raise an error\n    # scan.assert_no_checks_fail()\nelse:\n    print('Scan successful: all checks passed or warned.')\n\nprint(scan.get_logs_text())\n\n# Clean up temporary files\nos.remove('checks.yml')\ncon.close()\n","lang":"python","description":"This quickstart demonstrates how to set up an in-memory DuckDB database, define data quality checks using SodaCL in a `checks.yml` file, and then execute a programmatic scan using the `soda.scan.Scan` class to validate the data. It checks for a positive row count, a specific number of missing values in a column, and the total column count.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-16","installed_version":"3.5.6","pypi_latest":"3.5.6","is_stale":false,"summary":{"python_range":"3.9–3.12","success_rate":80,"avg_install_s":7.2,"avg_import_s":1.98,"wheel_type":"wheel"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.02,"mem_mb":28,"disk_size":"118.5M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":7.9,"import_time_s":1.35,"mem_mb":26.6,"disk_size":"107M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":2.77,"mem_mb":30.9,"disk_size":"124.4M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":6.7,"import_time_s":2.16,"mem_mb":29.6,"disk_size":"113M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"114.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":5.5,"import_time_s":null,"mem_mb":null,"disk_size":"103M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"soda-core-duckdb","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"soda-core-duckdb","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":7.9,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":1.91,"mem_mb":27.7,"disk_size":"117.6M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"soda-core-duckdb","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":8.8,"import_time_s":1.69,"mem_mb":26.4,"disk_size":"106M"}]}}