{"id":11981,"library":"sax","title":"SAX Streaming XML Parser","description":"sax is an evented streaming XML parser implemented in JavaScript, designed primarily for Node.js but also functional in browser environments and other CommonJS implementations. It provides a SAX-style API, invoking handlers such as `ontext`, `onopentag`, and `onattribute` as it encounters the corresponding XML constructs in the input. The version documented here is 1.6.0; the API is mature and stable, and new feature releases are infrequent. Key differentiators include its lightweight nature, efficient streaming capabilities, and its explicit focus on parsing XML rather than attempting to correct malformed HTML. It avoids the complexities associated with full DOM construction, XSLT transformations, or comprehensive schema/DTD validation, making it suitable for scenarios requiring simple, fast XML event processing. It offers both a direct parser interface for string input and a Node.js stream API for handling larger files efficiently.","status":"maintenance","version":"1.6.0","language":"javascript","source_language":"en","source_url":"ssh://git@github.com/isaacs/sax-js","tags":["javascript"],"install":[{"cmd":"npm install sax","lang":"bash","label":"npm"},{"cmd":"yarn add sax","lang":"bash","label":"yarn"},{"cmd":"pnpm add sax","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"The primary entry point is a CommonJS module, exposing the parser factory and stream creator. 
Direct ESM imports are not officially supported.","wrong":"import sax from 'sax';","symbol":"sax","correct":"const sax = require('sax');"},{"note":"The `parser` method is a factory function on the `sax` object that returns a parser instance, not a class constructor to be invoked with `new`.","wrong":"const parser = new sax.parser(strict, options);","symbol":"parser","correct":"const parser = sax.parser(strict, options);"},{"note":"The `createStream` function is exposed as a method on the default `sax` export, designed for Node.js stream piping. Like the parser, it's part of the CJS interface.","wrong":"import { createStream } from 'sax';","symbol":"createStream","correct":"const saxStream = sax.createStream(strict, options);"}],"quickstart":{"code":"const sax = require('sax');\nconst stream = require('stream');\n\n// --- Direct Parser Example ---\nconst strictMode = true; // set to false for html-mode\nconst directParser = sax.parser(strictMode);\n\ndirectParser.onerror = function (e) {\n  console.error(\"Direct Parser Error:\", e.message);\n  if (!strictMode) {\n    // In loose mode, clear the error and resume to try to continue parsing.\n    // Here `this` is the parser itself; only streams wrap it as `_parser`.\n    this.error = null;\n    this.resume();\n  }\n};\ndirectParser.ontext = function (t) {\n  const trimmedText = t.trim();\n  if (trimmedText) console.log(\"Direct Text:\", trimmedText);\n};\ndirectParser.onopentag = function (node) {\n  console.log(\"Direct Open Tag:\", node.name, \"Attributes:\", JSON.stringify(node.attributes));\n};\ndirectParser.onclosetag = function () {\n  console.log(\"Direct Close Tag\");\n};\ndirectParser.onend = function () {\n  console.log(\"Direct Parser End.\\n\");\n};\n\nconsole.log(\"--- Parsing XML directly ---\");\ndirectParser.write('<root><data name=\"example\">Hello</data>, <world/></root>').close();\n\n\n// --- Stream Parser Example ---\nconst streamMode = false; // loose mode for more forgiving parsing\nconst saxStream = sax.createStream(streamMode, {\n  trim: true,\n  normalize: 
true,\n  lowercase: true\n});\n\nsaxStream.on('error', function (e) {\n  console.error('Stream Error:', e.message);\n  // Crucial for stream to continue if you want to recover after non-fatal errors\n  this._parser.error = null;\n  this._parser.resume();\n});\n\nsaxStream.on('opentag', function (node) {\n  console.log('Stream Open Tag:', node.name, JSON.stringify(node.attributes));\n});\n\nsaxStream.on('text', function (t) {\n  const trimmedText = t.trim();\n  if (trimmedText) console.log('Stream Text:', trimmedText);\n});\n\nsaxStream.on('end', function () {\n  console.log('Stream End.');\n});\n\nconsole.log(\"--- Parsing XML via stream ---\");\n// Simulate a readable stream from a string buffer\nconst xmlContent = Buffer.from('<catalog><book id=\"1\"><title>The Great Book</title></book><book id=\"2\"></book></catalog>');\nconst readableStream = stream.Readable.from(xmlContent);\n\nreadableStream.pipe(saxStream);\n","lang":"javascript","description":"Demonstrates both direct parser usage with event listeners for string input and stream-based parsing, including error handling and common configuration options for processing XML."},"warnings":[{"fix":"Carefully choose the `strict` option based on your XML's compliance level. For parsing HTML or less rigidly formed XML, `strict: false` is often necessary. Always ensure attribute values are quoted if `strict: true`.","message":"The `strict` option significantly alters parsing behavior, especially regarding unquoted attribute values and unknown entities. Setting `strict: true` will cause parsing to fail on many documents that might parse successfully with `strict: false`. 
The default behavior for `unquotedAttributeValues` also depends on the `strict` setting, being `false` when `strict` is `true`, and `true` otherwise.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"For documents with custom DTD entities, assign an `ondoctype` handler that parses the DTD, extracts the entities, and adds them to `parser.ENTITIES` manually. Otherwise, ensure the input relies only on predefined entities, or parse in loose mode (pass `false` as the `strict` argument), where unrecognized entities are passed through as literal text instead of raising an error.","message":"The `sax` parser provides minimal support for XML entities. In strict mode, or when the `strictEntities` option is set, only the five predefined XML entities (`&amp;`, `&lt;`, `&gt;`, `&apos;`, `&quot;`) and numeric character references are decoded; loose mode additionally recognizes HTML named entities. Custom entities defined within DTDs are ignored unless manually processed and added to `parser.ENTITIES` by implementing custom logic within an `ondoctype` event handler.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"If you require robust HTML parsing with error correction or a DOM structure, use a dedicated HTML parser like `parse5` or `jsdom`. If you need to build a DOM from XML, you must manually construct it using the events emitted by the `sax` parser.","message":"`sax` is a pure XML parser, not an HTML parser, and does not automatically build a Document Object Model (DOM). It expects well-formed XML and will not attempt to correct malformed HTML or provide built-in DOM manipulation capabilities. Attempting to parse severely malformed HTML in strict mode will likely result in errors.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Always attach an `on('error', ...)` handler to `saxStream` instances. Within this handler, set `this._parser.error = null` and call `this._parser.resume()` if you intend for parsing to recover and continue after non-fatal errors.","message":"When using `sax.createStream()`, an `error` event with no listener attached is thrown, as with any Node.js EventEmitter, which halts processing. 
The internal parser's error state within the stream must be explicitly cleared (`this._parser.error = null`) and the parser resumed (`this._parser.resume()`) within the stream's `on('error', ...)` handler to allow processing of subsequent data.","severity":"gotcha","affected_versions":">=1.0.0"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"Either parse in loose mode by passing `false` as the first (`strict`) argument, or define the entity manually by handling `ondoctype` and adding it to `parser.ENTITIES`.","cause":"Attempting to parse an XML document containing an undefined custom entity (e.g., `&foo;`) while the parser is in `strict` mode.","error":"Invalid character entity"},{"fix":"Ensure the XML document adheres to strict XML rules, having a single root element and all text nodes properly enclosed within tags. Pass `false` as the `strict` argument if dealing with less rigid XML or HTML-like content.","cause":"Malformed XML where non-whitespace text appears before the opening root tag or after the closing root tag, which is not permitted in strict XML.","error":"Text data outside of root node."},{"fix":"Ensure all attribute values are properly quoted (e.g., `<tag foo=\"bar\">` or `<tag foo='bar'>`). If dealing with XML that intentionally uses unquoted attributes, parse in loose mode (pass `false` as the `strict` argument) and make sure `unquotedAttributeValues` is not set to `false` in the parser options.","cause":"An attribute value is not enclosed in quotes (e.g., `<tag foo=bar>`) when `strict` mode is `true`, or when `strict` is `false` but `unquotedAttributeValues` is explicitly `false`.","error":"Unquoted attribute value"}],"ecosystem":"npm"}