ECMAScript Regular Expression Parser
`regexpp` parses and validates ECMAScript regular expressions and produces an Abstract Syntax Tree (AST) that follows the ECMAScript specification. It provides three tools for static analysis of regex patterns: a parser (`RegExpParser`), a validator (`RegExpValidator`), and a visitor (`RegExpVisitor`) for AST traversal. The current stable version is 3.2.0, and releases track new ECMAScript features and Unicode versions. The parser accepts a target ECMAScript version and a Unicode flag, which makes the library useful for linters, code transformers, and other tools that need a standards-compliant view of regular expressions. Major versions have appeared roughly yearly, with minor releases as needed for spec changes and bug fixes.
Common errors
- Parsing error: Invalid regular expression pattern.
  cause: The input string contains a syntax error in the regular expression, or uses syntax that is not available in the configured `ecmaVersion`.
  fix: Review the regular expression pattern for syntax errors. When using `RegExpParser` or `parseRegExpLiteral`, ensure the `ecmaVersion` option matches the target ECMAScript standard for the regex.
- TypeError: Cannot read properties of undefined (reading 'alternatives')
  cause: Code is attempting to access AST properties like `elements` or node types like `Disjunction` that were renamed or removed in `regexpp` v2.0.0 or later.
  fix: Update AST traversal or manipulation logic to use the new property names (e.g., `alternatives` instead of `elements`) and the current node types as defined in `regexpp` v2.0.0 and above.
- ReferenceError: require is not defined
  cause: CommonJS `require()` is being called in a file that Node.js treats as an ES Module (a `.mjs` file, or any `.js` file in a package with `"type": "module"` in `package.json`).
  fix: In ES Module files, use `import` syntax (`import { parseRegExpLiteral } from 'regexpp'`) instead of `require()`.
- Error: Node.js v6.x is no longer supported
  cause: Running `regexpp` v3.0.0+ on an unsupported Node.js version (specifically, older than 8.x).
  fix: Upgrade Node.js to version 8 or newer to meet the minimum engine requirements for `regexpp` v3.x.
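The `alternatives` TypeError listed above tends to surface deep inside traversal code, where the cause is hard to spot. The following is a minimal sketch of a defensive accessor, assuming the v2.0.0+ shape described in this document (a node carries an `alternatives` array, and each `Alternative` carries `elements`). The helper `getAlternatives` and the hand-built `pattern` object are hypothetical illustrations and are not part of `regexpp`.

```javascript
// Hypothetical helper, not part of regexpp: returns a node's alternatives
// or fails with a message that points at the likely AST-shape mismatch.
function getAlternatives(node) {
  if (node && Array.isArray(node.alternatives)) {
    return node.alternatives;
  }
  const type = node && node.type ? node.type : typeof node;
  throw new TypeError(
    `Expected a node with "alternatives" but got ${type}; ` +
      'check that the traversal code targets the regexpp v2.0.0+ AST shape.'
  );
}

// Hand-built stand-in for a parsed /a|b/ pattern (assumed v2.0.0+ shape).
const pattern = {
  type: 'Pattern',
  alternatives: [
    { type: 'Alternative', elements: [{ type: 'Character', raw: 'a' }] },
    { type: 'Alternative', elements: [{ type: 'Character', raw: 'b' }] },
  ],
};

// Collect the raw text of each alternative.
const raws = getAlternatives(pattern).map((alt) =>
  alt.elements.map((el) => el.raw).join('')
);
console.log(raws); // one entry per alternative
```

Failing early with a descriptive message makes a version mismatch easier to diagnose than a bare property-access error.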
Warnings
- breaking Node.js 6.x is no longer supported; the minimum required Node.js version is now 8.x.
- breaking The default ECMAScript version for parsing and validation changed to ES2020. This might cause previously valid regular expressions to be parsed differently or throw errors if they rely on older ECMAScript semantics without explicit version options.
- breaking The Abstract Syntax Tree (AST) shape underwent significant changes. The `Disjunction` node type was removed, an `Alternative` node type was added, and the `elements` property on `Pattern`, `Group`, `CapturingGroup`, and `Assertion` nodes was renamed to `alternatives`.
- gotcha Regular expression patterns ending with an unescaped backslash (`\`) are now disallowed to align with ECMAScript spec updates. Such patterns will throw a syntax error during parsing or validation.
- gotcha Unicode version updates in various releases (e.g., 2.0.0, 3.0.0, 3.1.0) mean that behavior of Unicode property escapes or character classes might subtly change if existing code relied on very specific, older Unicode versions. Always parse with the `u` flag for proper Unicode handling.
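The trailing-backslash gotcha can be caught before a pattern reaches the parser. This is a small sketch of such a pre-check, written in plain JavaScript; `endsWithUnescapedBackslash` is a hypothetical helper and does not reflect how `regexpp` itself detects the condition. It relies on the observation that the final backslash is unescaped exactly when the run of trailing backslashes has odd length.

```javascript
// Hypothetical pre-check, not regexpp's own logic: a pattern ends with an
// unescaped backslash when the run of trailing backslashes has odd length.
function endsWithUnescapedBackslash(patternSource) {
  let count = 0;
  for (
    let i = patternSource.length - 1;
    i >= 0 && patternSource[i] === '\\';
    i--
  ) {
    count++;
  }
  return count % 2 === 1;
}

console.log(endsWithUnescapedBackslash(String.raw`abc\d`)); // false
console.log(endsWithUnescapedBackslash(String.raw`abc\\`)); // false
console.log(endsWithUnescapedBackslash('abc\\')); // true
```

A check like this only reports the condition; the parser or validator remains the authority on whether a pattern is valid.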
Install
- npm install regexpp
- yarn add regexpp
- pnpm add regexpp
Imports
- parseRegExpLiteral
  CommonJS: const { parseRegExpLiteral } = require('regexpp')
  ESM: import { parseRegExpLiteral } from 'regexpp'
- RegExpParser
  incorrect (there is no default export): import RegExpParser from 'regexpp'
  correct: import { RegExpParser } from 'regexpp'
- AST
  TypeScript: import type { AST } from 'regexpp'
- visitRegExpAST
  CommonJS: const visitRegExpAST = require('regexpp').visitRegExpAST
  ESM: import { visitRegExpAST } from 'regexpp'
Quickstart
import { parseRegExpLiteral, visitRegExpAST, RegExpValidator } from 'regexpp';

// String.raw keeps the backslashes in \s and \d; a plain string literal
// would silently drop them before the parser sees the source.
const regexSource = String.raw`/^(hello|world)\s+\d{4}$/ui`;

try {
  // Parse the regular expression literal
  const ast = parseRegExpLiteral(regexSource, { ecmaVersion: 2021 });
  console.log('Successfully parsed regex:', regexSource);
  console.log('Pattern body:', ast.pattern.raw);
  console.log('Flags:', ast.flags.raw);

  // Validate the regular expression (throws on a syntax error)
  const validator = new RegExpValidator({ ecmaVersion: 2021 });
  validator.validateLiteral(regexSource);
  console.log('Successfully validated regex:', regexSource);

  // Visit the AST nodes
  console.log('\nTraversing AST:');
  visitRegExpAST(ast, {
    onPatternEnter(node) { console.log(`  Entering Pattern: ${node.raw}`); },
    // (hello|world) is a capturing group, so onGroupEnter (non-capturing groups) would not fire for it
    onCapturingGroupEnter(node) { console.log(`  Entering CapturingGroup: ${node.raw}`); },
    onCharacterEnter(node) { console.log(`  Entering Character: ${node.raw}`); },
    onQuantifierEnter(node) { console.log(`  Entering Quantifier: ${node.raw}`); }
  });
} catch (error) {
  console.error('Error processing regex:', error.message);
}
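When a regex literal is passed to the parser as a JavaScript string, an ordinary string literal consumes backslashes before the parser ever sees them. The sketch below uses only built-in JavaScript to show the difference between a plain literal, `String.raw`, and doubled backslashes; no `regexpp` calls are involved, and the variable names are illustrative.

```javascript
// In a plain string literal, "\s" and "\d" are not recognized escapes, so
// the backslash is dropped and the parser would receive "s" and "d".
const plain = '/\s+\d{4}/';

// String.raw keeps the backslashes exactly as typed.
const raw = String.raw`/\s+\d{4}/`;

// Doubling the backslashes in a plain literal gives the same result.
const doubled = '/\\s+\\d{4}/';

console.log(plain); // /s+d{4}/
console.log(raw); // /\s+\d{4}/
console.log(raw === doubled); // true
```

Either `String.raw` or doubled backslashes works; `String.raw` keeps the source closer to how the regex would look as a literal in code.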