Optimized Regular Expressions for JavaScript
The `perf-regexes` package (current version 1.0.1, last updated in late 2018) provides a collection of pre-optimized regular expressions for common parsing tasks in JavaScript. The patterns cover HTML comments, JavaScript comments (single-line and multi-line), single- and double-quoted strings, and line endings, with utilities for matching empty lines, non-empty lines, and trailing whitespace, and for normalizing line-ending styles. The library ships CommonJS and UMD builds, so it can be used in Node.js (6.14 or later) and directly in browsers through a global `R` object. Its main differentiator is a set of pre-built, tested patterns for cases that are notoriously hard to handle with hand-written regexes, such as nested structures and escaped characters. TypeScript definitions are included. The package has received no updates since 2018, which suggests it is no longer actively maintained.
Common errors
- TypeError: R.JS_REGEX_P is not a function (or similar 'undefined' error)
  cause: Attempting to use the deprecated `JS_REGEX_P` regex, which has been removed or is no longer accessible.
  fix: Replace `R.JS_REGEX_P` with `R.JS_REGEX` and implement additional validation, or consider using alternative parsing strategies.
- Unexpected empty matches or incorrect parsing when repeatedly calling 'exec()' on a global regex.
  cause: The `lastIndex` property of a global regex (`g` flag) was not reset before scanning a new input, so `exec()` resumed from the end of the previous match.
  fix: Set `yourRegex.lastIndex = 0;` before the first `exec()` call on each new input, or create a new `RegExp` instance for each scan.
- ReferenceError: R is not defined
  cause: The `perf-regexes` library's default export `R` was not correctly imported or required in a module environment, or the UMD bundle was not loaded in the browser.
  fix: In CommonJS, use `const R = require('perf-regexes');`. For ESM, use `import R from 'perf-regexes';`. In a browser, ensure `<script src="https://unpkg.com/perf-regexes/index.min.js"></script>` is loaded before attempting to access `window.R`.
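The `lastIndex` pitfall described above applies to any JavaScript regex with the `g` flag, not only to this package's patterns. Here is a minimal sketch of the failure and both fixes. It uses a stand-in pattern (`WORD`, a hypothetical name that is not a `perf-regexes` export) so that it runs without the package installed:

```javascript
// Stand-in global regex; hypothetical, NOT a perf-regexes export.
const WORD = /\w+/g;

// Buggy: reuses the shared regex without resetting lastIndex.
function firstWordBuggy(text) {
  const m = WORD.exec(text);
  return m ? m[0] : null;
}

// Fix 1: reset lastIndex before scanning a new input.
function firstWordReset(text) {
  WORD.lastIndex = 0;
  const m = WORD.exec(text);
  return m ? m[0] : null;
}

// Fix 2: clone the regex so every call gets its own lastIndex state.
function firstWordClone(text) {
  const re = new RegExp(WORD.source, WORD.flags);
  const m = re.exec(text);
  return m ? m[0] : null;
}

// The second buggy call starts searching at index 5 (left over from the
// first call), finds nothing in 'gamma', and returns null.
const buggy = [firstWordBuggy('alpha beta'), firstWordBuggy('gamma')];
const reset = [firstWordReset('alpha beta'), firstWordReset('gamma')];
const cloned = [firstWordClone('alpha beta'), firstWordClone('gamma')];

console.log(buggy);  // [ 'alpha', null ]
console.log(reset);  // [ 'alpha', 'gamma' ]
console.log(cloned); // [ 'alpha', 'gamma' ]
```

`String.prototype.replace` resets `lastIndex` to 0 itself when given a global regex, and `split` works on an internal copy, so the pitfall mainly affects direct `exec()` and `test()` calls.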
Warnings
- deprecated: `JS_REGEX_P` is deprecated as of v1.0 and will be removed in a future minor version. It is highly risky to match literal regexes with other regexes, especially in ES6+ environments.
- gotcha: When using any regex with the global (`'g'`) flag, you must manually reset `lastIndex` before reusing it with `exec` on a new input, or clone the regex instance, to prevent unexpected behavior and incorrect matches.
- breaking: The minimum supported version of Node.js is now 6.14. Running on older versions may lead to compatibility issues or errors.
- gotcha: The `JS_REGEX` pattern should be used with caution and its results validated. Matching complex JavaScript regexes reliably with simple regex patterns is inherently difficult and prone to edge cases.
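One way to validate a candidate match, as the last warning recommends, is to ask the JavaScript engine whether the text compiles as a regex. The sketch below assumes the candidate is the full literal text (for example `/ab+c/gi`). `isValidRegexLiteral` is a hypothetical helper, not part of `perf-regexes`, and it checks syntax only:

```javascript
// Hypothetical helper (not a perf-regexes API): checks whether `literal`,
// e.g. '/ab+c/gi', compiles as a regular expression in the current engine.
function isValidRegexLiteral(literal) {
  // The last slash closes the source; anything after it is the flags.
  const end = literal.lastIndexOf('/');
  // Reject text that does not start with '/' or has an empty source
  // ('//' is a line comment, not a regex).
  if (literal[0] !== '/' || end < 2) return false;

  const source = literal.slice(1, end);
  const flags = literal.slice(end + 1);
  try {
    new RegExp(source, flags); // throws SyntaxError on a bad source or flags
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidRegexLiteral('/ab+c/gi')); // true
console.log(isValidRegexLiteral('/a(b/'));    // false (unterminated group)
console.log(isValidRegexLiteral('/abc/xyz')); // false (invalid flags)
```

A syntax check like this cannot tell a regex from division: in `a / b / c`, the text `/ b /` compiles without error. Deciding whether a slash starts a regex still needs context from the surrounding code.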
Install
- npm install perf-regexes
- yarn add perf-regexes
- pnpm add perf-regexes
Imports
- R
  avoid: import { R } from 'perf-regexes';
  use: import R from 'perf-regexes';
- HTML_CMNT
  avoid: import { HTML_CMNT } from 'perf-regexes';
  use: import R from 'perf-regexes'; const htmlCommentRegex = R.HTML_CMNT;
- JS_STRING
  avoid: import { JS_STRING } from 'perf-regexes';
  use: import R from 'perf-regexes'; const jsStringRegex = R.JS_STRING;
- JS_REGEX_P
  avoid: import { JS_REGEX_P } from 'perf-regexes';
  use: import R from 'perf-regexes'; const deprecatedRegex = R.JS_REGEX_P;
Quickstart
const R = require('perf-regexes');
// Function to remove trailing whitespace, empty lines, and normalize line-endings
const cleaner = (text) => text.split(R.OPT_WS_EOL).filter(Boolean).join('\n');
console.log('Cleaned text example:');
console.dir(cleaner(' \r\r\n\nAA\t\t\t\r\n\rBB\nCC \rDD '));
// Expected output: 'AA\nBB\nCC\nDD'
// Use the cleaner function to cleanup HTML text by first removing HTML comments
const htmlCleaner = (html) => cleaner(html.replace(R.HTML_CMNT, ''));
const rawHtml = '\r<!--header--><h1>A</h1>\r<div>B<br>\r\nC</div> <!--end-->\n';
console.log('\nCleaned HTML example:');
console.dir(htmlCleaner(rawHtml));
// Expected output: '<h1>A</h1>\n<div>B<br>\nC</div>'
// Demonstrating string conversion: Double-quoted to single-quoted strings
const toSingleQuotes = (text) => text.replace(R.JS_STRING, (str) => {
  // Only rewrite double-quoted strings; escape any single quotes inside them
  return str[0] === '"'
    ? `'${str.slice(1, -1).replace(/'/g, "\\'")}'`
    : str;
});
const stringWithQuotes = `"A's" 'B' "C" "D\\"E" 'F\\\'G'`;
console.log('\nString quote conversion example:');
console.log(toSingleQuotes(stringWithQuotes));
// Expected output: 'A\'s' 'B' 'C' 'D\"E' 'F\'G'