Graphemer: Unicode Character Splitter
Graphemer is a JavaScript and TypeScript library that splits strings into user-perceived characters, which Unicode calls 'extended grapheme clusters'. A single visual character can be made up of several UTF-16 code units (for example emoji sequences or letters with combining marks), and standard string operations often split or count these incorrectly. The current stable version is 1.4.0, which supports Unicode 15 and earlier. Releases are tied to new Unicode versions and typically arrive annually. Graphemer follows the Default Grapheme Cluster Boundary rules of UAX #29, which makes it suitable for internationalization (i18n) and for accurate character counting where `String.prototype.length` or a simple `String.prototype.split('')` fails, especially with complex scripts and emoji sequences.
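As a rough illustration of the problem described above, the sketch below uses only built-in string operations and makes no library calls. The two sample strings are chosen for illustration and are written with escapes so their code points are unambiguous.

```javascript
// Illustrative strings (assumptions for this sketch, not taken from the library docs).
const wave = '\u{1F44B}\u{1F3FD}'; // waving hand + skin tone modifier: one visible emoji
const eAcute = 'e\u0301';          // 'e' + combining acute accent: one visible letter

// .length counts UTF-16 code units, not user-perceived characters.
console.log(wave.length);   // 4
console.log(eAcute.length); // 2

// split('') cuts between code units, so surrogate pairs are broken apart.
console.log(wave.split('').length); // 4

// Spreading iterates by code point: closer, but still not grapheme clusters.
console.log([...wave].length);   // 2
console.log([...eAcute].length); // 2
```

In each case a reader sees one character, yet none of the built-in approaches reports 1.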
Common errors
- TypeError: Graphemer is not a constructor
cause Attempting to instantiate `Graphemer` from a CommonJS `require` call without accessing the `.default` property.
fix Change `const Graphemer = require('graphemer');` to `const Graphemer = require('graphemer').default;`.
- SyntaxError: Named export 'Graphemer' not found. The requested module 'graphemer' does not provide an export named 'Graphemer'
cause Attempting to use a named import for `Graphemer` when it is a default export in ESM.
fix Change `import { Graphemer } from 'graphemer';` to `import Graphemer from 'graphemer';`.
Warnings
- breaking Older versions of `graphemer` may not correctly split strings that contain characters or sequences introduced in newer Unicode versions, because the Grapheme Cluster Boundary data and rules change between Unicode releases. Major and minor releases often update the supported Unicode version.
- gotcha Mixing CommonJS `require` with an ESM default export requires accessing the `.default` property. Forgetting this will result in `Graphemer` being an object containing the class, not the class constructor itself, leading to 'TypeError: Graphemer is not a constructor'.
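The CommonJS gotcha above can be reproduced without the library. This is a minimal sketch: `FakeGraphemer`, `required`, and `unwrapDefault` are hypothetical names that only mimic the export shape described in the warning, not the real package.

```javascript
// Hypothetical stand-in for the object that require('graphemer') returns.
// FakeGraphemer is NOT the real class; it only mimics the export shape.
class FakeGraphemer {}
const required = { __esModule: true, default: FakeGraphemer };

// Calling `new` on the module object itself reproduces the error.
let message = '';
try {
  new required();
} catch (err) {
  message = `${err.name}: ${err.message}`;
}
console.log(message);

// Hypothetical helper: unwrap `.default` when present, otherwise use the value as-is.
function unwrapDefault(mod) {
  return mod && mod.default ? mod.default : mod;
}

const Ctor = unwrapDefault(required);
console.log(new Ctor() instanceof FakeGraphemer); // true
```

A helper like this is only a defensive convenience; accessing `.default` directly, as described in the fix, is the simpler route.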
Install
- npm install graphemer
- yarn add graphemer
- pnpm add graphemer
Imports
- Graphemer
wrong: import { Graphemer } from 'graphemer';
correct: import Graphemer from 'graphemer';
- Graphemer (CommonJS)
wrong: const Graphemer = require('graphemer');
correct: const Graphemer = require('graphemer').default;
Quickstart
import Graphemer from 'graphemer';
const splitter = new Graphemer();
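// The rainbow flag is written with escapes so the zero-width joiner (U+200D) is explicit.
const emojiString = 'Hello \u{1F3F3}\uFE0F\u200D\u{1F308} world! 👋🏽';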
const hindiString = 'अनुच्छेद'; // 5 user-perceived letters
const zalgoString = 'Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘';
console.log('Original emoji string length:', emojiString.length); // 24 UTF-16 code units (the rainbow flag ZWJ sequence alone is 6)
const emojiGraphemes = splitter.splitGraphemes(emojiString);
console.log('Split emoji graphemes:', emojiGraphemes); // ['H', 'e', 'l', 'l', 'o', ' ', '🏳️🌈', ' ', 'w', 'o', 'r', 'l', 'd', '!', ' ', '👋🏽']
console.log('Emoji grapheme count:', emojiGraphemes.length); // 16
const hindiGraphemes = splitter.splitGraphemes(hindiString);
console.log('Hindi grapheme count:', hindiGraphemes.length); // 5
const zalgoGraphemes = splitter.splitGraphemes(zalgoString);
console.log('Zalgo grapheme count:', zalgoGraphemes.length); // 5
// Iterate through graphemes
for (const grapheme of splitter.iterateGraphemes(emojiString)) {
  console.log('Iterated grapheme:', grapheme);
}
// Get count directly
const count = splitter.countGraphemes(emojiString);
console.log('Direct grapheme count:', count);
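
Building on the quickstart, one common use is truncating text without cutting through an emoji or a combined letter. The following is a sketch only: `truncateGraphemes` is a hypothetical helper, and it assumes its first argument is any object exposing `iterateGraphemes(text)` as used above, such as a `Graphemer` instance.

```javascript
// Hypothetical helper (not part of graphemer): keep at most `maxGraphemes`
// user-perceived characters and append `ellipsis` only if text was cut.
// `splitter` is assumed to expose iterateGraphemes(text), as in the quickstart.
function truncateGraphemes(splitter, text, maxGraphemes, ellipsis = '...') {
  let result = '';
  let kept = 0;
  for (const grapheme of splitter.iterateGraphemes(text)) {
    if (kept === maxGraphemes) {
      // At least one more grapheme remains, so the text is being shortened.
      return result + ellipsis;
    }
    result += grapheme;
    kept += 1;
  }
  // The whole text fit within the limit; return it unchanged.
  return result;
}
```

With the quickstart's `splitter`, a call would look like `truncateGraphemes(splitter, emojiString, 7)`. Iterating rather than calling `splitGraphemes` lets the loop stop as soon as the limit is reached.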