PySHACL: Python SHACL Validator
PySHACL is a Python library that implements the W3C Shapes Constraint Language (SHACL) specification. It validates RDF data graphs against SHACL shapes, identifies data quality issues, and generates detailed validation reports. The current version is 0.31.0, and the project is actively maintained, with releases carrying significant updates and fixes every 1-3 months.
Warnings
- breaking PySHACL v0.25.0 dropped support for Python 3.7 and earlier. Users on older Python versions must either upgrade to Python 3.8.1+ or pin PySHACL to a version earlier than 0.25.0.
- breaking PySHACL v0.25.0 and later only support RDFLib versions 6.3.2 and higher. Older RDFLib versions (e.g., 6.2.0 or earlier) are no longer compatible.
- gotcha In PySHACL versions prior to v0.28.1, the library could aggressively overwrite the root Python logger, potentially removing other application-defined handlers. This was fixed in v0.28.1.
- breaking Starting with v0.31.0, the default behavior for validating multiple data graphs has changed. Passing multiple data graphs to the `validate` function now combines them into a single `Dataset` for validation. To validate each graph independently, use the new `validate_each()` entrypoint or the `--validate-each` CLI flag.
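The version constraints above can be checked at startup. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `installed_version`, `check_environment`) are hypothetical and not part of PySHACL, and the version parsing is deliberately simple (it reads only the leading integer of each dotted component).

```python
import sys
from importlib import metadata

# Minimum versions taken from the warnings above.
MIN_PYTHON = (3, 8, 1)
MIN_RDFLIB = (6, 3, 2)
PYSHACL_CUTOFF = (0, 25, 0)  # first release with the stricter requirements


def parse_version(text):
    """Turn '6.3.2' or '0.31.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version as a tuple, or None if not installed."""
    try:
        return parse_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    pyshacl_version = installed_version("pyshacl")
    if pyshacl_version is None:
        return ["pyshacl is not installed"]
    if pyshacl_version < PYSHACL_CUTOFF:
        return problems  # older releases are not subject to these minimums
    if sys.version_info[:3] < MIN_PYTHON:
        problems.append("Python 3.8.1 or newer is required")
    rdflib_version = installed_version("rdflib")
    if rdflib_version is not None and rdflib_version < MIN_RDFLIB:
        problems.append("RDFLib 6.3.2 or newer is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(f"Environment problem: {problem}")
```

In practice, pinning the versions in a requirements file achieves the same result at install time; a runtime check like this is mainly useful for diagnosing an environment that already exists.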
Install
pip install pyshacl
Imports
- validate
from pyshacl import validate
Quickstart
from pyshacl import validate
from rdflib import Graph
from rdflib.namespace import SH  # Only needed for the optional result-message loop at the end
# Define data graph and shapes graph in TTL format
data_graph_ttl = """
@prefix ex: <http://example.com/ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
ex:John a ex:Person ;
    ex:name "John Doe" .
ex:Jane a ex:Person . # This node will violate ex:PersonShape (missing ex:name)
"""
shapes_graph_ttl = """
@prefix ex: <http://example.com/ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
    ] .
"""
# Load graphs using RDFLib
data_graph = Graph().parse(data=data_graph_ttl, format="ttl")
shapes_graph = Graph().parse(data=shapes_graph_ttl, format="ttl")
# Validate the data graph against the shapes graph
conforms, results_graph, results_text = validate(
    data_graph,
    shacl_graph=shapes_graph,
    inference='rdfs',  # Perform RDFS inference before validation
    serialize_report_graph=False,  # Keep the report as an RDFLib Graph (True returns a serialized form instead)
)
print(f"Validation Conforms: {conforms}")
if not conforms:
    print("\nValidation Report:")
    print(results_text)
    # Optionally list individual messages (requires the report as an RDFLib Graph):
    # for s, p, o in results_graph.triples((None, SH.resultMessage, None)):
    #     print(f" - {o}")