{"library":"microdata-rdf-streaming-parser","title":"Microdata to RDF Streaming Parser","description":"microdata-rdf-streaming-parser is a JavaScript library designed for efficiently parsing HTML documents containing Microdata annotations and transforming them into RDFJS-compliant quads. Currently at version 3.0.0, the library prioritizes a streaming approach, allowing it to process documents larger than available memory and emit RDF triples as soon as possible, rather than waiting for the entire document to be loaded. It is 100% spec-compliant with the W3C Microdata to RDF transformation algorithm and integrates seamlessly with the RDFJS ecosystem for its data model representations. While a specific release cadence is not outlined, major version updates, like the current v3, typically introduce significant architectural changes, such as a shift towards ESM-first patterns. Its key differentiators include its robust streaming capability powered by `htmlparser2`, strict adherence to the Microdata to RDF specification, and its lightweight footprint, making it suitable for both Node.js environments and browser-based applications via bundlers.","language":"javascript","status":"active","last_verified":"Sun Apr 19","install":{"commands":["npm install microdata-rdf-streaming-parser"],"cli":null},"imports":["import { MicrodataRdfParser } from 'microdata-rdf-streaming-parser';","import type { IHtmlParseListener } from 'microdata-rdf-streaming-parser';","import type { MicrodataRdfParserOptions } from 'microdata-rdf-streaming-parser';"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import { MicrodataRdfParser } from 'microdata-rdf-streaming-parser';\nimport { Readable } from 'stream';\n// Any RDFJS-compliant DataFactory can be passed to the parser; rdf-data-factory is used here\nimport { DataFactory } from 'rdf-data-factory';\n\nasync function parseMicrodataStream(htmlString: string) {\n  const dataFactory = 
new DataFactory();\n  const defaultGraph = dataFactory.defaultGraph();\n  const baseIRI = 'http://example.org/document'; // Used to resolve relative IRIs\n\n  const parser = new MicrodataRdfParser({\n    dataFactory,\n    baseIRI, // The base IRI is passed as a plain string\n    defaultGraph,\n    xmlMode: false, // Set to true if parsing strict XHTML documents\n  });\n\n  const htmlStream = Readable.from([htmlString]);\n\n  console.log('Starting Microdata parsing...');\n  let quadCount = 0;\n  try {\n    await new Promise<void>((resolve, reject) => {\n      htmlStream\n        .pipe(parser)\n        .on('data', (quad) => {\n          // Log each emitted quad. Subject, predicate, object, and graph are RDFJS Term objects.\n          console.log(`  Quad: ${quad.subject.value} ${quad.predicate.value} ${quad.object.value} ${quad.graph.value}`);\n          quadCount++;\n        })\n        .on('end', () => {\n          console.log(`Microdata parsing finished. Emitted ${quadCount} quads.`);\n          resolve();\n        })\n        .on('error', (err) => {\n          console.error('Error during parsing:', err);\n          reject(err);\n        });\n    });\n  } catch (error) {\n    console.error(\"An unexpected error occurred:\", error);\n  }\n}\n\nconst microdataHtml = `\n<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n    <meta charset=\"UTF-8\">\n    <title>Microdata Example</title>\n</head>\n<body>\n    <div itemscope itemtype=\"http://schema.org/Person\">\n        <h1 itemprop=\"name\">John Doe</h1>\n        <p>Title: <span itemprop=\"jobTitle\">Professor</span></p>\n        <p>Works at: <span itemprop=\"worksFor\" itemscope itemtype=\"http://schema.org/Organization\"><span itemprop=\"name\">University of Example</span></span></p>\n        <p>Email: <a href=\"mailto:john.doe@example.com\" itemprop=\"email\">john.doe@example.com</a></p>\n    </div>\n    <div itemscope itemtype=\"http://schema.org/Article\">\n        <h2 itemprop=\"headline\">My Awesome Article</h2>\n        
<span itemprop=\"author\" itemscope itemtype=\"http://schema.org/Person\">\n            By <span itemprop=\"name\">Jane Smith</span>\n        </span>\n        <p itemprop=\"articleBody\">This is the body of the article.</p>\n    </div>\n</body>\n</html>\n`;\n\nparseMicrodataStream(microdataHtml).catch(console.error);\n","lang":"typescript","description":"Demonstrates how to initialize the `MicrodataRdfParser` with custom options, feed it an HTML string via a readable stream, and log the resulting RDFJS quads as they are emitted in a streaming fashion.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":null}