{"id":7602,"library":"pyrdfa3","title":"pyRdfa3: RDFa Distiller/Parser","description":"pyRdfa3 is a Python RDFa 1.1 distiller and parser. It extracts RDFa 1.1 (and, if properly configured, RDFa 1.0) from several document types, including (X)HTML, SVG, and general XML, and returns either a serialized RDF graph or an RDFLib Graph object. The current version is 3.6.5. The original maintainer has archived the primary GitHub repository; version 3.6.5 is built and maintained under a new GitHub Pages project, so future releases may be community-driven and less frequent.","status":"maintenance","version":"3.6.5","language":"en","source_language":"en","source_url":"https://github.com/RDFLib/pyrdfa3","tags":["RDFa","RDF","parser","distiller","semantic web","HTML","XML","SVG","RDFLib"],"install":[{"cmd":"pip install pyRdfa3","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Required Python version.","package":"python","optional":false},{"reason":"Used for HTTP requests, e.g., fetching remote RDFa sources or vocabularies.","package":"requests","optional":false},{"reason":"Core dependency for RDF graph handling and serialization.","package":"rdflib","optional":false},{"reason":"Used for parsing HTML5 documents; it tolerates malformed HTML better than a strict XML parser.","package":"html5lib","optional":false}],"imports":[{"symbol":"pyRdfa","correct":"from pyRdfa import pyRdfa"}],"quickstart":{"code":"import os\n\nfrom pyRdfa import pyRdfa\n\n# Example HTML content with RDFa\nhtml_content = '''\n<div prefix=\"schema: http://schema.org/\">\n  <p typeof=\"schema:Person\">\n    <span property=\"schema:name\">Jane Doe</span>\n    <span property=\"schema:jobTitle\">Professor</span>\n    <a href=\"http://www.example.com/janedoe\" property=\"schema:url\">Homepage</a>\n  </p>\n</div>\n'''\n\n# Write the markup to a temporary file for the demonstration\nwith open(\"example.html\", \"w\") as f:\n    f.write(html_content)\n\n# Extract RDF as a serialized string (Turtle format by default)\nturtle_output = pyRdfa().rdf_from_source('example.html')\nprint(\"--- Turtle Output ---\")\nprint(turtle_output)\n\n# Extract RDF as an RDFLib Graph object\ngraph = pyRdfa().graph_from_source('example.html')\nprint(\"\\n--- RDFLib Graph (Triples) ---\")\nfor s, p, o in graph:\n    print(s, p, o)\n\n# Clean up the temporary file (optional)\nos.remove(\"example.html\")\n","lang":"python","description":"This quickstart demonstrates how to use `pyRdfa` to parse RDFa content from a source (a local HTML file in this case) and either obtain a serialized RDF string (defaulting to Turtle) or an RDFLib Graph object."},"warnings":[{"fix":"Monitor the new GitHub Pages project (prrvchr.github.io/pyrdfa3) for updates and be aware of potential delays in bug fixes or new features. Consider contributing to the project if active maintenance is critical for your use case.","message":"The original maintainer has retired and archived the primary GitHub repository. Version 3.6.5 is maintained by a new entity, so project leadership has changed and future updates or support may be irregular.","severity":"breaking","affected_versions":"All versions >= 3.6.5 (due to change in maintenance model)"},{"fix":"Ensure your environment uses Python 3.8 or a newer compatible version.","message":"`pyrdfa3` no longer supports Python 2.x. It explicitly requires Python 3.8 or higher.","severity":"breaking","affected_versions":"< 3.6.0 (before explicit Python 3.8+ requirement)"},{"fix":"Consult the RDFa 1.1 specifications (Core, XHTML+RDFa, HTML+RDFa) and the `pyRdfa` documentation to understand the exact parsing behavior for different RDFa versions. Ensure your markup explicitly uses `@version` if mixing RDFa 1.0 and 1.1 concepts.","message":"Some parsing behaviors, particularly around RDFa 1.0 vs. 1.1 specifics (e.g., `@property`, list handling, `@typeof`), differ from the earlier RDFa 1.0 distiller because this library is a rewrite of it.","severity":"gotcha","affected_versions":"All versions >= 3.0.0"},{"fix":"Use a recent version of RDFLib (>=7.0.0 as recommended by pyrdfa3). If encountering serialization issues or needing JSON-LD, ensure `pyRdfaExtras` is installed or verify `rdflib_jsonld` compatibility with your RDFLib version.","message":"The default RDF serialization formats rely on RDFLib's serializers. Older RDFLib releases might have issues with certain serialization formats, and formats like JSON-LD might require the `pyRdfaExtras` package or a compatible `rdflib_jsonld` package if not part of the core RDFLib distribution.","severity":"gotcha","affected_versions":"All versions, especially with older RDFLib installations"},{"fix":"Avoid using these scripts directly. Implement similar functionality by calling the `pyRdfa` library methods directly within your Python 3 application.","message":"The `CGI_RDFa.py` and `localRdfa.py` utility scripts included in the distribution have not been ported to Python 3.x and will not function correctly on modern Python environments.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure you are calling the `pyRdfa` class. Correct: `from pyRdfa import pyRdfa; distiller = pyRdfa()`. Incorrect: `import pyRdfa; distiller = pyRdfa()`.","cause":"Attempting to call the `pyRdfa` module directly instead of instantiating the `pyRdfa` class within the module.","error":"TypeError: 'module' object is not callable"},{"fix":"Install the package using pip: `pip install pyRdfa3`. Note that the distribution is named `pyRdfa3` while the importable module is `pyRdfa`. If it is already installed, confirm that you are running the interpreter or virtual environment it was installed into, and check `PYTHONPATH`.","cause":"The `pyrdfa3` package is not installed or the Python environment is not correctly configured.","error":"ModuleNotFoundError: No module named 'pyRdfa'"},{"fix":"Ensure you have a compatible `rdflib` version (>=7.0.0). If the issue persists, install `pyRdfaExtras` if your `pyrdfa3` version explicitly requires it for JSON-LD, or run `pip install rdflib-jsonld` if you are on an older RDFLib (before 6.0) that does not bundle the JSON-LD serializer.","cause":"The JSON-LD serializer is not available in the installed RDFLib version or `pyRdfaExtras` (or `rdflib-jsonld`) is not installed or detected.","error":"rdflib.exceptions.SerializerNotAvailable: No serializer for format 'json-ld' installed."}]}