Regular Expression Parser for JavaScript
RegJSParser is a JavaScript library that parses JavaScript regular expressions into an abstract syntax tree (AST), giving tools a programmatic way to analyze and manipulate regular expression patterns. The current stable version is 0.13.1. Recent releases indicate active maintenance, focused mainly on dependency updates, performance improvements, bug fixes, and tracking the latest Unicode specification (e.g., Unicode 17.0.0 in v0.13.0). It can parse modern RegExp features such as Unicode property escapes, named capture groups, and lookbehind assertions, which are enabled through an options object passed at parse time. It ships with TypeScript types, which eases integration into TypeScript projects.
Common errors
- SyntaxError: Invalid regular expression: /.../: Invalid group
  cause: Using named capture groups (e.g., `(?<name>...)`) without enabling the `namedGroups` option in the parser.
  fix: Pass `namedGroups: true` to the `parse` function: `parse(pattern, flags, { namedGroups: true });`
- SyntaxError: Invalid regular expression: /.../: Invalid property name
  cause: Using Unicode property escapes (e.g., `\p{Script=Latin}`) without enabling the `unicodePropertyEscape` option in the parser.
  fix: Pass `unicodePropertyEscape: true` in the options object: `parse(pattern, flags, { unicodePropertyEscape: true });`
- TypeError: Cannot read properties of undefined (reading 'parse')
  cause: Incorrect CommonJS import of the `parse` function, for example treating the result of `require('regjsparser')` as the function itself, so that `parse` ends up being read from an undefined value.
  fix: For CommonJS, use `const { parse } = require('regjsparser');` or `const parse = require('regjsparser').parse;`, so that `parse` is read from the module export.
Warnings
- gotcha Several advanced RegExp features like Unicode property escapes (\p{...}), named capture groups ((?<name>...)), and lookbehind assertions ((?<=...)) are not enabled by default. They must be explicitly turned on via the `options` object passed to the `parse` function.
- breaking The library frequently updates its support for Unicode versions to align with the latest ECMAScript specification. Changes in Unicode data (e.g., property values, character ranges) can subtly alter parsing results for Unicode-aware regular expressions.
- breaking Behavioral changes were introduced in `v0.11.0` regarding quantifiable anchors in Unicode mode and modifiers in lookbehind assertions. This could lead to different parsing outcomes or errors for certain complex regular expression patterns that previously parsed successfully.
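Because these features are opt-in, a tool that accepts arbitrary patterns has to decide which options to pass. The following is a minimal sketch of one way to guess them from the pattern text. `guessFeatures` is a hypothetical helper, not part of regjsparser; only the three option names are taken from the notes above. A plain text scan like this can misfire on escaped parentheses or on text inside character classes, so treat it as a starting point.

```javascript
// Heuristic sketch: guess which opt-in parser options a pattern needs.
// `guessFeatures` is a hypothetical helper, not part of regjsparser.
function guessFeatures(pattern) {
  return {
    // (?<name>...) but not the lookbehind forms (?<=...) and (?<!...)
    namedGroups: /\(\?<(?![=!])/.test(pattern),
    // \p{...} or \P{...}
    unicodePropertyEscape: /\\[pP]\{/.test(pattern),
    // (?<=...) or (?<!...)
    lookbehind: /\(\?<[=!]/.test(pattern),
  };
}

// A named group followed by a negative lookbehind, with no property escape.
const features = guessFeatures('(?<year>\\d{4})(?<!0000)');
console.log(features);
```

The resulting object can be passed as the third argument, as in `parse(pattern, flags, features)`. Enabling all three options unconditionally is a simpler alternative when that looser parsing is acceptable.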
Install
- npm install regjsparser
- yarn add regjsparser
- pnpm add regjsparser
Imports
- parse
  const parse = require('regjsparser').parse; // CommonJS
  import { parse } from 'regjsparser'; // ES modules
- parse (TypeScript type)
  import type { AST } from 'regjsparser';
Quickstart
import { parse } from 'regjsparser';
// Basic parsing of a regular expression
const simplePattern = '^hello(world)?$';
const simpleAst = parse(simplePattern, ''); // second argument is the flags string; '' means no flags
console.log('Simple AST:', JSON.stringify(simpleAst, null, 2));
// Parsing with advanced features enabled via options
// Note: These features are typically opt-in to maintain compatibility
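// The backslash is doubled so the string literal really contains \p{...}
const advancedPattern = '(?<greeting>hi)\\p{Script=Latin}(?<name>.*)(?<!bye)';
// Property escapes belong to Unicode mode in ECMAScript, so the 'u' flag is passed
const advancedAst = parse(advancedPattern, 'u', {
  unicodePropertyEscape: true, // Enables \p{...} and \P{...}
  namedGroups: true, // Enables (?<name>...)
  lookbehind: true // Enables (?<=...) and (?<!...)
});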
console.log('\nAdvanced AST:', JSON.stringify(advancedAst, null, 2));
// Example of parsing a regex with flags
// Pass only the pattern source (no surrounding slashes); flags go in the second argument
const flaggedPattern = 'test';
const flaggedAst = parse(flaggedPattern, 'gi');
console.log('\nFlagged AST:', JSON.stringify(flaggedAst, null, 2));
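Once a tree is available, a small recursive walker is often enough to inspect it. The sketch below assumes that nodes are plain objects with a string `type` and that child nodes live in an array-valued `body` property. The sample tree is hand-written to approximate what parsing `a(b)?` might produce, so the snippet runs without the library installed; its exact fields are an assumption, and `walk` and `countNodeTypes` are illustrative names rather than library functions.

```javascript
// Recursively visit every node in a regjsparser-style AST.
// Assumption: children are stored in an array-valued `body` property.
function walk(node, visit) {
  visit(node);
  if (Array.isArray(node.body)) {
    for (const child of node.body) walk(child, visit);
  }
}

// Count how many nodes of each type appear in the tree.
function countNodeTypes(ast) {
  const counts = {};
  walk(ast, (node) => {
    counts[node.type] = (counts[node.type] || 0) + 1;
  });
  return counts;
}

// Hand-written stand-in for the tree assumed for the pattern a(b)?
// (position and raw-text fields are omitted for brevity).
const sampleAst = {
  type: 'alternative',
  body: [
    { type: 'value', kind: 'symbol', codePoint: 97 },
    {
      type: 'quantifier', min: 0, max: 1, greedy: true,
      body: [
        {
          type: 'group', behavior: 'normal',
          body: [{ type: 'value', kind: 'symbol', codePoint: 98 }],
        },
      ],
    },
  ],
};

console.log(countNodeTypes(sampleAst));
```

With the library installed, the same helper can be applied to a real tree, for example `countNodeTypes(parse('a(b)?', ''))`.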