Acryl Great Expectations
version 0.15.50.1, verified Thu Apr 16, auth: no
Acryl Great Expectations is Acryl Data's opinionated flavor of the Great Expectations data validation library, providing a specific set of configurations and integrations primarily for use with the DataHub metadata platform. It currently pins to an older, V2 configuration API of Great Expectations (versions `<0.16.0`). Its release cadence is tied to DataHub releases.
pip install acryl-great-expectations
cli great_expectations
Common errors
error ModuleNotFoundError: No module named 'great_expectations.data_context.data_context'
cause Attempting to import `DataContext` from a V3-specific path, which might not be directly available or work as expected with the V2 API.
fix Use `from great_expectations.data_context import DataContext` (the common V2 import), or confirm that the import path is V2-compatible.

error AttributeError: 'DataContext' object has no attribute 'get_context'
cause Attempting to use the V3 API method `DataContext.get_context()` with a DataContext instance from the V2 API, which lacks this method.
fix Instantiate `DataContext` directly via its constructor (e.g., `DataContext('/path/to/project')`), as per the V2 API documentation.

error great_expectations.exceptions.exceptions.InvalidConfigurationBundleError: The great_expectations.yml file in /path/to/great_expectations is not valid. The following errors were found: - ... (schema validation errors related to V3 fields)
cause Using a `great_expectations.yml` file formatted for the V3 API with `acryl-great-expectations`, which expects the V2 configuration schema.
fix Revert to a `great_expectations.yml` configuration that adheres to the V2 API schema, or manually adjust the file to remove V3-specific elements such as `fluent_datasources`.

Warnings
breaking `acryl-great-expectations` explicitly pins its `great-expectations` dependency to versions `<0.16.0`. This means it *only* supports the legacy V2 API of Great Expectations, which is fundamentally incompatible with the V3 API introduced in `great-expectations>=0.17.0`.
fix Ensure all Great Expectations code and configuration used with `acryl-great-expectations` adheres to the V2 API patterns. Do not attempt to use V3-specific classes, methods, or configuration formats (e.g., `DataContext.get_context()` or V3 YML schema).
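One way to catch a mismatched install early is to compare the installed version against the `<0.16.0` pin before running any validation code. The following is a minimal sketch, not part of the library: the helper names are hypothetical, and the distribution name `great-expectations` is an assumption that may need changing for your environment.

```python
import re
from importlib import metadata

# Exclusive upper bound, taken from the `<0.16.0` pin described above.
V2_UPPER_BOUND = (0, 16, 0)


def parse_version(version):
    """Turn a string like '0.15.50' into (0, 15, 50).

    Only the leading digits of the first three components are used,
    so '0.16.0rc1' becomes (0, 16, 0).
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_within_pin(version):
    """Return True if the version string falls below the `<0.16.0` bound."""
    return parse_version(version) < V2_UPPER_BOUND


def installed_version_ok(dist_name="great-expectations"):
    """Check the installed distribution (name is an assumption).

    Returns None when the distribution is not installed at all.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return is_within_pin(installed)
```

Calling `installed_version_ok()` at application start-up gives a single place to fail fast if a V3-era release has been installed alongside V2-style code.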
gotcha Configuration files (e.g., `great_expectations.yml`) generated or modified for the V3 Great Expectations API will cause parsing errors or unexpected behavior when used with `acryl-great-expectations` due to its V2 API dependency. The structure and available parameters differ significantly.
fix Always refer to V2 API documentation for `great_expectations.yml` structure and best practices when working with `acryl-great-expectations`.
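Before loading a project, it can help to scan `great_expectations.yml` for top-level keys that belong to the V3 schema. The sketch below is a rough, dependency-free check rather than a real YAML parser: it only looks at unindented `key:` lines. The helper names are hypothetical, and the set of V3-only keys contains just `fluent_datasources`, the one example named on this page.

```python
# V3-only top-level keys; only the example named on this page is listed.
V3_ONLY_KEYS = {"fluent_datasources"}


def top_level_keys(yaml_text):
    """Collect unindented `key:` names from YAML text (naive line scan)."""
    keys = []
    for line in yaml_text.splitlines():
        # Skip blank lines, indented lines, comments, and list/document markers.
        if not line or line[0] in " \t#-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


def find_v3_keys(yaml_text):
    """Return the V3-only keys found at the top level, sorted."""
    return sorted(V3_ONLY_KEYS.intersection(top_level_keys(yaml_text)))


# Hypothetical file content, for illustration only.
sample = """\
config_version: 3.0
datasources: {}
fluent_datasources:
  my_ds:
    type: pandas
"""

print(find_v3_keys(sample))  # ['fluent_datasources']
```

An empty result does not prove the file is valid for the V2 schema; it only rules out the keys listed in `V3_ONLY_KEYS`.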
gotcha The primary intent of `acryl-great-expectations` is to provide Great Expectations functionality specifically for DataHub integrations. While it can be used standalone, users new to Great Expectations might find clearer guidance and more up-to-date examples using the main `great-expectations` library (which supports V3 API).
fix If not explicitly integrating with DataHub, consider using the main `pip install great-expectations` package to access the latest features and documentation (V3 API).
Imports
- PandasDataset
  from great_expectations.dataset import PandasDataset
- ExpectationSuite
  from great_expectations.core.expectation_suite import ExpectationSuite
- DataContext
  wrong: from great_expectations.data_context.data_context import DataContext
  correct: from great_expectations.data_context import DataContext
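The `DataContext` import above pairs with the constructor-based instantiation recommended under Common errors. A minimal sketch, assuming the project directory is the one that contains `great_expectations.yml`; the function name `open_v2_context` is hypothetical, and the import is deferred so the helper degrades to `None` when the package is missing.

```python
import os


def open_v2_context(project_dir):
    """Return a V2 DataContext for project_dir, or None if unavailable.

    Assumes project_dir is the directory holding great_expectations.yml.
    """
    config_path = os.path.join(project_dir, "great_expectations.yml")
    if not os.path.isfile(config_path):
        return None  # no Great Expectations project at this location
    try:
        # V2-compatible import path listed in the Imports section.
        from great_expectations.data_context import DataContext
    except ImportError:
        return None  # package not installed in this environment
    # Direct constructor call, as opposed to DataContext.get_context().
    return DataContext(project_dir)


context = open_v2_context("/path/to/project")
print("context loaded" if context is not None else "no context available")
```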
Quickstart
import pandas as pd
from great_expectations.dataset import PandasDataset
# Sample data
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 40, 50],
    "category": ["A", "B", "A", "C", "B"]
})
# Create a PandasDataset (V2 API style for in-memory validation).
# PandasDataset subclasses pandas.DataFrame and wraps the frame directly;
# it does not take a batch_spec argument.
batch = PandasDataset(df)
# Define and add expectations
batch.expect_column_to_exist("id")
batch.expect_column_values_to_be_between("value", min_value=0, max_value=100)
batch.expect_column_distinct_values_to_be_in_set("category", ["A", "B", "C"])
# Validate the batch
validation_result = batch.validate()
print(f"Validation successful: {validation_result.success}")
if not validation_result.success:
    print("Validation failed details:")
    for result in validation_result.results:
        if not result.success:
            print(f"  Expectation: {result.expectation_config.expectation_type}, Status: {result.success}")
# Expected output: Validation successful: True