pandas-schema
0.3.6 | verified Fri May 01 | auth: no | python | maintenance
A validation library for Pandas data frames using user-friendly schemas. Current version is 0.3.6, with infrequent releases.
pip install pandas-schema
Common errors
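error ImportError: cannot import name 'Check' from 'pandas_schema'
cause pandas_schema has no Check class, either at package level or in the validation submodule. Its validators are the classes ending in Validation (CanConvertValidation, InRangeValidation, CustomElementValidation, etc.) in pandas_schema.validation.
fix
Use a validator class instead, for example: from pandas_schema.validation import CustomElementValidation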
error AttributeError: 'DataFrame' object has no attribute 'validate'
cause pandas DataFrames have no validate method; validation is a method of the schema object, called as schema.validate(df).
fix
Instantiate Schema with a list of Column objects and call schema.validate(df), which returns a list of validation warnings.
Warnings
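gotcha Column takes the column name as its first argument, not a type. Type checks belong in validators: CanConvertValidation expects a Python type (e.g., int, float), not a numpy dtype string such as 'int64'.
fix Pass int, float, str, etc. to CanConvertValidation.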
gotcha Validators are in pandas_schema.validation; importing from pandas_schema directly won't give the validation classes.
fix Use from pandas_schema.validation import CanConvertValidation, InRangeValidation, etc.
gotcha InRangeValidation crashes on non-numeric text in versions <=0.3.5. This is fixed in 0.3.6.
fix Upgrade to 0.3.6 or ensure data is numeric before validation.
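For the second option in the last fix (ensuring data is numeric before validation), one possible approach is to coerce the column with plain pandas first. This is a sketch using only pandas, not part of pandas-schema; the column name and sample values are made up.

```python
import pandas as pd

df = pd.DataFrame({'age': ['25', 'abc', '30']})

# errors='coerce' turns text that is not a number into NaN instead of raising.
numeric_age = pd.to_numeric(df['age'], errors='coerce')

# Index labels of the rows that failed conversion; report or drop these
# before handing the column to any range check.
bad_rows = df.index[numeric_age.isna()].tolist()
print(bad_rows)

# Keep only the convertible rows, with 'age' replaced by its numeric form.
clean = df.loc[numeric_age.notna()].assign(age=numeric_age.dropna())
print(clean)
```

The cleaned frame contains only numeric values, so a range validator never sees raw text.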
Imports
- Schema
  wrong: from pandas_schema import DataFrameSchema
  correct: from pandas_schema import Schema
- Column
  from pandas_schema import Column
- Validators
  wrong: from pandas_schema import CanConvertValidation
  correct: from pandas_schema.validation import CanConvertValidation
Quickstart
import pandas as pd
from pandas_schema import Column, Schema
from pandas_schema.validation import CanConvertValidation

# One Column per DataFrame column: the name first, then its validators
schema = Schema([
    Column('age', [
        CanConvertValidation(int)  # every value must be accepted by int()
    ])
])

df = pd.DataFrame({'age': ['25', '30']})

# validate() returns a list of warnings; an empty list means no problems
errors = schema.validate(df)
if errors:
    for error in errors:
        print(error)
else:
    print('Validation passed')