Soda Core SQL Server

3.5.6 · active · verified Fri Apr 17

Soda Core SQL Server is the SQL Server data source plugin for Soda Core, a data quality monitoring framework. It lets Soda Core connect to SQL Server databases and run data quality scans against them, with checks defined in Soda's YAML-based configuration. Version 3.5.6 is current, and releases typically align with the active development cadence of the main `soda-core` library.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart shows how to connect Soda Core to a SQL Server database and run a simple data quality scan programmatically. The script writes `configuration.yml` and `checks.yml`, runs a scan against them, and then deletes both files. Before running it, set the environment variables `SQLSERVER_HOST`, `SQLSERVER_DATABASE`, `SQLSERVER_USERNAME`, and `SQLSERVER_PASSWORD` (`SQLSERVER_PORT` is optional and defaults to `1433`), and make sure an ODBC driver for SQL Server is installed on your system. Unset variables fall back to placeholder values, which you must replace with your actual connection details. The checks target a table named `demo_table` with an `id` column; point them at a table that exists in your database.
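The prerequisites above can be verified before a scan is attempted. The following is a minimal sketch, not part of the plugin: the helper names `missing_env_vars` and `sqlserver_odbc_drivers` are hypothetical, and the driver check assumes `pyodbc` is importable in the same environment (it is skipped gracefully when it is not).

```python
import os

# Variables the quickstart expects; SQLSERVER_PORT is optional and omitted here.
REQUIRED_VARS = (
    "SQLSERVER_HOST",
    "SQLSERVER_DATABASE",
    "SQLSERVER_USERNAME",
    "SQLSERVER_PASSWORD",
)


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


def sqlserver_odbc_drivers():
    """List installed ODBC drivers whose name mentions SQL Server.

    Returns None when pyodbc cannot be imported, so callers can tell
    "could not check" apart from "no driver found".
    """
    try:
        import pyodbc  # assumption: available alongside the plugin
    except ImportError:
        return None
    return [name for name in pyodbc.drivers() if "SQL Server" in name]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))

    drivers = sqlserver_odbc_drivers()
    if drivers is None:
        print("pyodbc is not importable; cannot list ODBC drivers.")
    elif not drivers:
        print("No ODBC driver for SQL Server found.")
    else:
        print("ODBC drivers found:", ", ".join(drivers))
```

Running this first separates environment problems (missing variables, missing driver) from genuine check failures in the scan output.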

import os
from soda.scan import Scan

# Read connection details from the environment.
# The fallback values are placeholders: for local testing, export the variables or edit the fallbacks.
host = os.environ.get('SQLSERVER_HOST', 'localhost')
port = os.environ.get('SQLSERVER_PORT', '1433')
database = os.environ.get('SQLSERVER_DATABASE', 'your_database')
username = os.environ.get('SQLSERVER_USERNAME', 'sa')
password = os.environ.get('SQLSERVER_PASSWORD', 'your_password')

# Write a throwaway configuration.yml and checks.yml for demonstration.
# In a real project these files would be kept alongside your code.
# Soda expects the header "data_source <name>:"; the scan refers to the data source by that name.
data_source_name = 'sqlserver_demo'

config_yaml_content = f"""
data_source {data_source_name}:
  type: sqlserver
  host: "{host}"
  port: "{port}"
  database: "{database}"
  username: "{username}"
  password: "{password}"
  # Optional connection settings; confirm names and defaults in the Soda SQL Server reference for your version:
  # driver: "ODBC Driver 18 for SQL Server"
  # encrypt: false
  # trust_server_certificate: true
  # trusted_connection: true  # Windows integrated authentication instead of username/password
"""

checks_yaml_content = """
checks for demo_table:
  - row_count > 0
  - missing_count(id) = 0
"""

config_path = 'configuration.yml'
checks_path = 'checks.yml'

with open(config_path, 'w') as f:
    f.write(config_yaml_content)
with open(checks_path, 'w') as f:
    f.write(checks_yaml_content)

print("Configuration and checks files created. Running Soda Scan...")

scan = Scan()
scan.set_data_source_name(data_source_name)  # Must match the name in the "data_source <name>:" header
scan.add_configuration_yaml_file(file_path=config_path)
scan.add_sodacl_yaml_file(checks_path)
scan.execute()

# Print the scan log so connection or parsing problems are visible
print(scan.get_logs_text())

if scan.has_error_logs():
    print("Scan did not run cleanly; see the log output above.")
elif scan.has_check_fails():
    print("Scan completed with failing checks.")
else:
    print("Scan completed successfully without failures.")

# Clean up generated files (optional, remove in persistent setups)
os.remove(config_path)
os.remove(checks_path)
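The quickstart writes the password into `configuration.yml` on disk. A variant that avoids this is sketched below, assuming that `Scan` accepts YAML strings through `add_configuration_yaml_str` and `add_sodacl_yaml_str`, and that Soda resolves `${VAR}` references in the configuration from environment variables at scan time. The data source name `sqlserver_demo` and the helper names `build_configuration_yaml` and `run_scan` are illustrative, not part of the library.

```python
# Checks are the same as in the quickstart above.
CHECKS_YAML = """\
checks for demo_table:
  - row_count > 0
  - missing_count(id) = 0
"""


def build_configuration_yaml(data_source_name="sqlserver_demo"):
    """Return configuration YAML that leaves credentials as ${VAR} references.

    Only the header line is an f-string; the ${...} lines are plain strings
    so the references reach Soda untouched.
    """
    return (
        f"data_source {data_source_name}:\n"
        "  type: sqlserver\n"
        "  host: ${SQLSERVER_HOST}\n"
        '  port: "1433"\n'
        "  database: ${SQLSERVER_DATABASE}\n"
        "  username: ${SQLSERVER_USERNAME}\n"
        "  password: ${SQLSERVER_PASSWORD}\n"
    )


def run_scan(data_source_name="sqlserver_demo"):
    """Run the scan entirely from in-memory YAML and return the Scan object."""
    # Imported lazily so the YAML helpers can be used without Soda installed.
    from soda.scan import Scan

    scan = Scan()
    scan.set_data_source_name(data_source_name)
    scan.add_configuration_yaml_str(build_configuration_yaml(data_source_name))
    scan.add_sodacl_yaml_str(CHECKS_YAML)
    scan.execute()
    return scan


if __name__ == "__main__":
    scan = run_scan()
    print(scan.get_logs_text())
    # Raises an AssertionError if any check failed, which suits CI pipelines.
    scan.assert_no_checks_fail()
```

Because nothing is written to disk, no cleanup step is needed, and the credentials never leave the process environment.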
