{"id":873,"library":"great-expectations","title":"Great Expectations","description":"Great Expectations (GX) is an open-source Python library for data quality. It helps data teams validate, document, and profile their data to ensure quality and consistency throughout data pipelines. It allows users to define 'Expectations' (assertions about data), run validation tests, and generate human-readable data quality reports called 'Data Docs'. The library is actively maintained with frequent releases and supports Python versions 3.10 through 3.13, with experimental support for 3.14.","status":"active","version":"1.15.1","language":"python","source_language":"en","source_url":"https://github.com/great-expectations/great_expectations","tags":["data quality","data validation","data profiling","data testing","etl","data engineering"],"install":[{"cmd":"pip install great_expectations","lang":"bash","label":"Install Great Expectations"}],"dependencies":[{"reason":"Great Expectations supports Python 3.10 through 3.13. 
Experimental support for Python 3.14 and later can be enabled via an environment variable during installation.","package":"python","optional":false},{"reason":"Commonly used for in-memory data validation and often provides more granular error details (like row identifiers) than SQL engines.","package":"pandas","optional":true},{"reason":"Used for validating data in Spark DataFrames.","package":"pyspark","optional":true},{"reason":"Used for connecting to and validating data in various SQL databases.","package":"sqlalchemy","optional":true}],"imports":[{"symbol":"gx","correct":"import great_expectations as gx"},{"note":"The `gx.get_context()` function is the recommended way to instantiate a Data Context, abstracting away the underlying DataContext class and handling initialization of file-based or ephemeral contexts.","wrong":"from great_expectations.data_context import DataContext; context = DataContext()","symbol":"get_context","correct":"context = gx.get_context()"}],"quickstart":{"code":"import great_expectations as gx\nimport pandas as pd\n\n# 1. Initialize a Data Context (or use an existing one)\n# For a quickstart, a temporary in-memory context is usually sufficient\n# For a persistent, file-backed context, use gx.get_context(mode=\"file\")\ncontext = gx.get_context()\n\n# 2. Connect to data (using a Pandas DataFrame for simplicity)\n# This example reads a CSV from a remote URL; if the URL is unavailable, load a local file instead\n# In a real scenario, you'd load your own data, e.g., from a file, database, or API\ndf = pd.read_csv(\"https://raw.githubusercontent.com/great-expectations/great_expectations/develop/tests/test_sets/taxi_trips.csv\")\n\n# Add a pandas Data Source, a DataFrame Data Asset, and a Batch Definition\n# In GX 1.x the DataFrame itself is supplied at run time via batch_parameters\ndata_source = context.data_sources.add_pandas(\"my_pandas_datasource\")\ndata_asset = data_source.add_dataframe_asset(name=\"my_dataframe_asset\")\nbatch_definition = data_asset.add_batch_definition_whole_dataframe(\"my_batch_definition\")\n\n# 3. Create an Expectation Suite and add Expectations to it\n# Adding the suite to the context also saves it\nsuite = context.suites.add(gx.ExpectationSuite(name=\"my_suite\"))\nsuite.add_expectation(gx.expectations.ExpectColumnToExist(column=\"passenger_count\"))\nsuite.add_expectation(gx.expectations.ExpectColumnValuesToBeBetween(column=\"passenger_count\", min_value=1, max_value=6))\nsuite.add_expectation(gx.expectations.ExpectColumnValuesToNotBeNull(column=\"pickup_datetime\"))\n\n# 4. Link the suite to the data with a Validation Definition\nvalidation_definition = context.validation_definitions.add(\n    gx.ValidationDefinition(name=\"my_validation_definition\", data=batch_definition, suite=suite)\n)\n\n# 5. Run validation through a Checkpoint\ncheckpoint = context.checkpoints.add(\n    gx.Checkpoint(name=\"my_checkpoint\", validation_definitions=[validation_definition])\n)\ncheckpoint_result = checkpoint.run(batch_parameters={\"dataframe\": df})\n\n# 6. Review validation results (e.g., in Data Docs)\n# To open Data Docs in your browser, uncomment the lines below after a run\n# context.build_data_docs()\n# context.open_data_docs()\n\nprint(\"Validation successful:\", checkpoint_result.success)\nif not checkpoint_result.success:\n    print(\"Validation failed. Check Data Docs for details.\")","lang":"python","description":"This quickstart demonstrates how to initialize a Data Context, connect to a sample Pandas DataFrame, define an Expectation Suite, run validation using a Checkpoint, and view the results. For a persistent setup, call `gx.get_context(mode=\"file\")` to create a filesystem-backed Data Context; the `great_expectations init` CLI command belongs to the 0.x releases and was removed in GX 1.0."},"warnings":[{"fix":"Consult the official migration guides in the Great Expectations documentation for detailed steps on upgrading your configurations and API calls.","message":"Breaking changes were introduced in the transition from V0 to V1 API and V2 to V3 API, requiring significant updates to configuration files (e.g., `expectation_suite_name` to `name`, `evaluation_parameters` to `suite_parameters`, `ge_cloud_id` to `id`). 
Validation Operators were deprecated in V3.","severity":"breaking","affected_versions":"<=0.12.x to >=0.13.x (V2 to V3), <=0.18.x to >=1.0.x (V0 to V1)"},{"fix":"Consider running Great Expectations in a Linux or macOS environment, or using a Linux-based Docker container on Windows.","message":"Windows support for the open-source Python version (GX OSS) is currently limited or unavailable. Users in Windows environments might encounter errors or performance issues.","severity":"gotcha","affected_versions":"All versions (GX OSS)"},{"fix":"For detailed row-level failure information, consider using a Pandas-backed data source, or implement custom logic to extract identifying information from your SQL query results before validation.","message":"When validating data from SQL data sources, it can be challenging to retrieve specific row identifiers (e.g., primary keys or row numbers) for failed expectations directly in the validation results. This often requires switching to a Pandas-based execution engine to obtain more granular details.","severity":"gotcha","affected_versions":"All versions (SQL Alchemy execution engine)"},{"fix":"Carefully review your Great Expectations and orchestrator configurations. Ensure checkpoints are correctly defined and that batch requests are optimized to prevent redundant computations. Consider isolated testing of expectation suites to diagnose performance bottlenecks.","message":"In complex data pipelines, particularly when integrating with orchestrators like Airflow, users have reported issues with Expectations executing multiple times or experiencing slow performance.","severity":"gotcha","affected_versions":"All versions, especially in orchestrated environments"},{"fix":"Verify the accessibility and correctness of the remote URL pointing to your data source. 
If the URL refers to a resource within the Great Expectations project repository, ensure you are using a current and valid path or consider downloading the data locally.","message":"When loading data from remote URLs (e.g., using `pandas.read_csv` with a URL), users may encounter `HTTP Error 404: Not Found` if the remote resource is unavailable, has moved, or the URL is incorrect. This prevents data from being loaded into the Great Expectations context.","severity":"gotcha","affected_versions":"All versions (when relying on external data sources from URLs)"},{"fix":"Verify the data source URL is correct and accessible. Check for changes in the repository path or file availability. Consider downloading the data locally or using a more stable data hosting solution if frequent changes occur.","message":"Data loading from remote URLs (e.g., raw GitHub links) may fail if the resource is moved, deleted, or if there are network issues, resulting in HTTP errors (e.g., 404 Not Found).","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-05-12T20:39:59.034Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Ensure Great Expectations is installed using pip: `pip install great-expectations`. If using a virtual environment or IDE, verify that the correct Python interpreter linked to the installation is selected and restart the kernel if necessary.","cause":"This error occurs when the 'great-expectations' package is not installed in the active Python environment or is not accessible by the interpreter being used.","error":"ModuleNotFoundError: No module named 'great_expectations'"},{"fix":"First, uninstall and reinstall the package: `pip uninstall great-expectations` followed by `pip install great-expectations`. 
If the issue persists, ensure your IDE or environment is using the correct Python interpreter where the package is installed and restart your kernel or IDE.","cause":"This error typically arises when the `great_expectations` package is not fully or correctly installed, or when the installed version is too old to expose the `get_context` function, a primary entry point, at the module level. It can also happen if a running kernel still holds a stale copy of the module, or if a local file named `great_expectations.py` shadows the installed package.","error":"AttributeError: module 'great_expectations' has no attribute 'get_context'"},{"fix":"Update your code to use `context.data_sources`, for example `context.data_sources.add_pandas(...)`. Refer to the GX 1.x documentation for the correct methods to configure Data Sources.","cause":"This error indicates that you are accessing Data Sources through the `context.sources` attribute, which belongs to the Fluent API of the 0.x releases. GX 1.0 renamed it to `context.data_sources`.","error":"AttributeError: 'EphemeralDataContext' object has no attribute 'sources'"},{"fix":"In GX 1.x, pass an Expectation object to `suite.add_expectation()`, for example `suite.add_expectation(gx.expectations.ExpectColumnValuesToNotBeNull(column=\"my_column\"))`. On 0.18.x, `suite.add_expectation(expectation_configuration=config)` accepts an `ExpectationConfiguration`. Consult the official documentation for the version of Great Expectations you are using.","cause":"This `AttributeError` occurs because the `add_expectation_configuration` method has been deprecated or removed in newer versions of Great Expectations. 
The correct method to add expectations to an `ExpectationSuite` is `add_expectation()`, or directly appending to the `suite.expectations` list.","error":"AttributeError: 'ExpectationSuite' object has no attribute 'add_expectation_configuration'"},{"fix":"Instead of subscripting the `Checkpoint` object directly, run the checkpoint to get a `CheckpointResult` object, and then access its attributes or methods, such as `checkpoint_result.run_results` or `checkpoint_result.list_validation_results()`.","cause":"This error typically arises when trying to access elements of a `Checkpoint` object using dictionary-like indexing (e.g., `checkpoint['batches']`), which is not supported for `Checkpoint` objects in current versions of Great Expectations. `Checkpoint` objects manage validation runs and return a `CheckpointResult` object, which then contains the validation results.","error":"TypeError: 'Checkpoint' object is not subscriptable"}],"ecosystem":"pypi","meta_description":null,"install_score":85,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"1.17.1","cli_name":"great_expectations","cli_version":"sh: 1: great_expectations: not found","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","installed_version":"1.8.1","pypi_latest":"1.17.1","is_stale":true,"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":9.54,"mem_mb":113.8,"disk_size":"372.8M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine 
(musl)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":8.57,"mem_mb":113.5,"disk_size":"371.4M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":18.5,"import_time_s":7.4,"mem_mb":113.8,"disk_size":"359M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":7.09,"mem_mb":113.5,"disk_size":"357M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":11.28,"mem_mb":130.2,"disk_size":"398.9M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":11.65,"mem_mb":129.9,"disk_size":"397.4M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":17.4,"import_time_s":10.45,"mem_mb":130.2,"disk_size":"383M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":9.42,"mem_mb":129.9,"disk_size":"381M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":10.79,"mem_mb":127.8,"disk_size":"380.3M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":11.33,"mem_mb":127.5,"disk_size":"378.7M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":16.1,"import_time_s":11.12,"mem_mb":127.8,"disk_size":"364M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":11.28,"mem_mb":127.5,"disk_size":"363M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":9.91,"mem_mb":129.3,"disk_size":"378.4M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":10.5,"mem_mb":128.9,"disk_size":"376.8M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"clean","install_time_s":16.7,"import_time_s":10.13,"mem_mb":129.3,"disk_size":"362M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim 
(glibc)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":10.9,"mem_mb":128.9,"disk_size":"361M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"noisy","install_time_s":null,"import_time_s":9.79,"mem_mb":102.2,"disk_size":"373.0M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":8.89,"mem_mb":102.2,"disk_size":"371.7M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"noisy","install_time_s":21.3,"import_time_s":9.03,"mem_mb":102.2,"disk_size":"364M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"great_expectations","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":7.8,"mem_mb":102.3,"disk_size":"363M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}