PySHACL: Python SHACL Validator

0.31.0 · active · verified Wed Apr 15

PySHACL is a Python library that implements the W3C Shapes Constraint Language (SHACL) specification. It validates RDF data graphs against SHACL shapes, identifies data quality issues, and generates detailed validation reports. The current version is 0.31.0, and the project is actively maintained, with significant updates and fixes released every 1-3 months.

Warnings

Install

Imports

Quickstart

This example validates a simple RDF data graph against a SHACL shapes graph using `pyshacl.validate`. It defines two in-memory graphs: one containing sample data, and one defining a shape that requires every `ex:Person` to have exactly one `ex:name` of datatype `xsd:string`. Validation reports `ex:Jane` as non-conformant because she has no `ex:name`.

from pyshacl import validate
from rdflib import Graph
from rdflib.namespace import SH  # only needed for the optional report query at the end

# Define data graph and shapes graph in TTL format
data_graph_ttl = """
@prefix ex: <http://example.com/ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:John a ex:Person ;
  ex:name "John Doe" .

ex:Jane a ex:Person . # This node will violate ex:PersonShape (missing ex:name)
"""

shapes_graph_ttl = """
@prefix ex: <http://example.com/ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape a sh:NodeShape ;
  sh:targetClass ex:Person ;
  sh:property [
    sh:path ex:name ;
    sh:minCount 1 ;
    sh:maxCount 1 ;
    sh:datatype xsd:string ;
  ] .
"""

# Load graphs using RDFLib
data_graph = Graph().parse(data=data_graph_ttl, format="ttl")
shapes_graph = Graph().parse(data=shapes_graph_ttl, format="ttl")

# Validate the data graph against the shapes graph
conforms, results_graph, results_text = validate(
    data_graph,
    shacl_graph=shapes_graph,
    inference='rdfs', # Perform RDFS inference before validation
    serialize_report_graph=False # Default: return the report as an RDFLib Graph (True returns it serialized instead)
)

print(f"Validation Conforms: {conforms}")
if not conforms:
    print("\nValidation Report:")
    print(results_text)
    # for s, p, o in results_graph.triples((None, SH.resultMessage, None)):
    #     print(f" - {o}")
