Regenerate: Unicode-Aware Regex Generator

1.4.2 · maintenance · verified Sun Apr 19

Regenerate is a specialized JavaScript library designed to create regular expressions from a given set of Unicode symbols or code points. It intelligently handles the complexities of Unicode in JavaScript, particularly with astral symbols (those outside the Basic Multilingual Plane), by generating ES5-compatible patterns that correctly match these characters, typically using surrogate pairs. The library provides a fluent, chainable API to add, remove, and manage code points and ranges, allowing developers to precisely define character sets for their regexes. Currently at version 1.4.2, the package appears to be in a maintenance or stable state, having seen its last significant update several years ago, indicating a mature and feature-complete solution for Unicode-aware regex generation. It remains a valuable tool for ensuring cross-browser and historical JavaScript engine compatibility when dealing with advanced Unicode characters in regular expressions.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to initialize a `regenerate` set, add and remove individual code points and ranges, and finally generate both the regex string and the `RegExp` object. It also shows adding multiple values directly during initialization, including astral symbols.

const regenerate = require('regenerate');

// Create a set and add/remove code points and ranges
const unicodeSet = regenerate()
  .addRange(0x60, 0x69) // Add U+0060 (`) to U+0069 (i)
  .remove(0x62, 0x64) // Remove U+0062 (b) and U+0064 (d)
  .add(0x1D306); // Add U+1D306 (a rare astral symbol)

// Get the array of code points
console.log('Code points:', unicodeSet.valueOf());
// Expected: [96, 97, 99, 101, 102, 103, 104, 105, 119558]

// Get the ES5-compatible regex string
const regexString = unicodeSet.toString();
console.log('Regex string:', regexString);
// Expected: '[`ace-i]|\uD834\uDF06'

// Get the RegExp object
const regex = unicodeSet.toRegExp();
console.log('RegExp object:', regex);
// Expected: /[`ace-i]|\uD834\uDF06/

// Example with direct arguments to regenerate
const directSet = regenerate(0x1D306, 'A', '©', 0x2603);
console.log('Direct set regex string:', directSet.toString());
// Expected: '[A\xA9\u2603]|\uD834\uDF06'

view raw JSON →