PyCSVSchema
0.0.6 · verified Mon Apr 27 · auth: no · python
PyCSVSchema is a Python implementation of the CSV Schema specification (version 0.3). It allows you to validate CSV files against a schema defined in YAML or JSON. Currently at version 0.0.6, it is in early development with weekly commits.
pip install pycsvschema
Common errors
error TypeError: string indices must be integers
cause A raw schema string was passed to CSVSchema, which expects a dict.
fix Parse the schema with yaml.safe_load(schema_string) or json.loads(schema_json) first.
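The failure can be reproduced without the library, since indexing a str with a key such as 'fields' raises the same TypeError. The following is a minimal sketch using only the standard-library json module. The schema text is illustrative, and it is an assumption that CSVSchema indexes into the schema this way internally.

```python
import json

# Illustrative schema text; the field definition is made up for this sketch.
schema_json = '{"fields": [{"name": "id", "type": "integer"}]}'

# Indexing the raw string by key fails with the TypeError described above.
try:
    schema_json["fields"]
    raw_error = None
except TypeError as exc:
    raw_error = str(exc)

# Parsing first yields a dict that can be indexed by key.
schema = json.loads(schema_json)
field_names = [field["name"] for field in schema["fields"]]

print(raw_error)
print(field_names)
```

The same applies to YAML input: yaml.safe_load() returns a dict for a mapping document, which is the form the constructor is described as expecting.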
error AttributeError: module 'pycsvschema' has no attribute 'CSVValidator'
cause CSVValidator was renamed to CSVSchema in v0.0.6.
fix Use CSVSchema instead of CSVValidator.
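For code that has to run against both older releases and 0.0.6, one option is to look the class up by name rather than hard-coding the import. This is a sketch under stated assumptions: find_attr is a hypothetical helper, not part of pycsvschema, and it relies only on the rename described above.

```python
import importlib


def find_attr(module_name, candidates):
    """Return the first attribute in candidates found on the module, else None.

    Hypothetical helper for this sketch; not provided by pycsvschema.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is not installed or cannot be imported.
        return None
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    return None


# Prefer the new name and fall back to the old one.
# The result is None if pycsvschema is missing or exposes neither name.
Validator = find_attr("pycsvschema", ["CSVSchema", "CSVValidator"])
```

Pinning the dependency to a single version and using the new name directly is simpler when supporting older releases is not required.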
error FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'
cause A file path string was passed to validate(). This usage is deprecated, and a relative path is resolved against the current working directory, which may not be where the file lives.
fix Open the file with open('data.csv') and pass the file object to validate().
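Opening the file yourself still fails if the relative path is resolved against the wrong directory. The sketch below builds the path from an explicit base directory before opening. open_csv is a hypothetical helper, and a temporary directory stands in for the script's folder; in a real script the base would typically be Path(__file__).parent.

```python
import csv
import tempfile
from pathlib import Path


def open_csv(path, base_dir):
    """Open a CSV relative to base_dir instead of the current working directory."""
    full_path = Path(base_dir) / path
    # newline="" is the setting the csv module documentation recommends.
    return full_path.open("r", newline="", encoding="utf-8")


# Demo: create a small data.csv in a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data.csv").write_text("id,name\n1,Ada\n", encoding="utf-8")
    with open_csv("data.csv", tmp) as f:
        # f is the kind of open file object the fix above says to pass to validate().
        rows = list(csv.reader(f))

print(rows)
```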
Warnings
gotcha PyCSVSchema expects the schema in Python dict form (parsed from YAML/JSON), not a raw string. Passing a schema string directly will raise a TypeError.
fix Parse the schema using yaml.safe_load() or json.loads() before passing to CSVSchema.
breaking In v0.0.6, the CSVValidator class was renamed to CSVSchema. Importing CSVValidator from pycsvschema no longer works.
fix Use CSVSchema instead of CSVValidator. Change imports and constructor calls.
deprecated Calling validate() with a file path string is deprecated; always pass a file object (opened via open()).
fix Use with open('file.csv') as f: validator.validate(f) instead of validator.validate('file.csv').
Imports
- CSVSchema
  wrong: from pycsvschema.validator import CSVSchema
  correct: from pycsvschema import CSVSchema
- ValidationError
  wrong: from pycsvschema.exceptions import ValidationError
  correct: from pycsvschema import ValidationError
Quickstart
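import yaml
from pycsvschema import CSVSchema

# Schema as YAML text; it must be parsed into a dict before use.
schema_yaml = """
fields:
  - name: id
    type: integer
    constraints:
      required: true
  - name: name
    type: string
"""
schema = yaml.safe_load(schema_yaml)
validator = CSVSchema(schema)

# Pass an open file object, not a path string.
with open('data.csv', 'r') as f:
    errors = validator.validate(f)
    # Iterate while the file is still open, in case errors are produced lazily.
    for error in errors:
        print(error)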