Graphemer: Unicode Character Splitter

1.4.0 · active · verified Sun Apr 19

Graphemer is a JavaScript and TypeScript library that splits strings into user-perceived characters, which Unicode calls extended grapheme clusters. A single visual character can consist of several UTF-16 code units or code points (for example emoji sequences, or a base letter followed by combining marks), so standard string operations often split or miscount it. The current stable version is 1.4.0, which supports Unicode 15 and below. Releases track new Unicode versions, typically once a year. The library follows the Default Grapheme Cluster Boundary rules of UAX #29, which makes it suitable for internationalization (i18n) and for accurate character counting where `String.prototype.length` and `String.prototype.split('')` fall short, especially with complex scripts and emoji sequences.
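
To make the problem concrete, the following minimal sketch uses only built-in string operations, with no Graphemer calls. The variable names are illustrative. The family emoji is written with escapes so its code points are explicit: three emoji joined by two zero-width joiners (U+200D), which a reader sees as one character.

```typescript
// One user-perceived character: man + ZWJ + woman + ZWJ + girl.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

// One user-perceived character: "e" followed by a combining acute accent.
const accented = "e\u0301";

// .length and split('') count UTF-16 code units.
console.log(family.length);            // 8
console.log(family.split("").length);  // 8

// Iterating by code point is closer, but still not one per visible character.
console.log(Array.from(family).length);   // 5
console.log(Array.from(accented).length); // 2
```

None of these built-in counts is 1 for either string, which is the gap a grapheme cluster splitter is meant to close.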

Common errors

Warnings

Install

npm install graphemer

Imports

import Graphemer from 'graphemer';

Quickstart

Demonstrates initializing Graphemer and using `splitGraphemes`, `iterateGraphemes`, and `countGraphemes` with complex Unicode strings including emojis, combining marks, and Zalgo text.

import Graphemer from 'graphemer';

const splitter = new Graphemer();

const emojiString = 'Hello 🏳️‍🌈 world! 👋🏽';
const hindiString = 'अनुच्छेद'; // 5 user-perceived letters
const zalgoString = 'Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘';

console.log('Original emoji string length:', emojiString.length); // counts UTF-16 code units, so it is larger than the 16 graphemes
const emojiGraphemes = splitter.splitGraphemes(emojiString);
console.log('Split emoji graphemes:', emojiGraphemes); // ['H', 'e', 'l', 'l', 'o', ' ', '🏳️‍🌈', ' ', 'w', 'o', 'r', 'l', 'd', '!', ' ', '👋🏽']
console.log('Emoji grapheme count:', emojiGraphemes.length); // 16

const hindiGraphemes = splitter.splitGraphemes(hindiString);
console.log('Hindi grapheme count:', hindiGraphemes.length); // 5

const zalgoGraphemes = splitter.splitGraphemes(zalgoString);
console.log('Zalgo grapheme count:', zalgoGraphemes.length); // 5

// Iterate through graphemes
for (const grapheme of splitter.iterateGraphemes(emojiString)) {
  console.log('Iterated grapheme:', grapheme);
}

// Get count directly
const count = splitter.countGraphemes(emojiString);
console.log('Direct grapheme count:', count); // 16
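
A common use of grapheme splitting is truncating text without cutting a cluster in half. The sketch below is one possible approach, not part of Graphemer: `truncateGraphemes` and the `GraphemeSplitter` interface are hypothetical names introduced here. The interface only requires a `splitGraphemes` method, which a Graphemer instance provides. So that the block runs without dependencies, it uses a stand-in splitter that splits by code point only; real code would pass `new Graphemer()` instead.

```typescript
// Hypothetical structural type: anything with splitGraphemes(),
// such as a Graphemer instance, satisfies it.
interface GraphemeSplitter {
  splitGraphemes(text: string): string[];
}

// Hypothetical helper: keep at most `max` graphemes and append a suffix
// only when something was actually cut off.
function truncateGraphemes(
  splitter: GraphemeSplitter,
  text: string,
  max: number,
  suffix: string = "..."
): string {
  const graphemes = splitter.splitGraphemes(text);
  const limit = Math.max(0, max);
  if (graphemes.length <= limit) {
    return text;
  }
  return graphemes.slice(0, limit).join("") + suffix;
}

// Stand-in for illustration only: splits by code point, NOT by grapheme
// cluster. Replace with `new Graphemer()` in real code.
const codePointSplitter: GraphemeSplitter = {
  splitGraphemes: (text: string): string[] => Array.from(text),
};

console.log(truncateGraphemes(codePointSplitter, "Hello world", 5)); // Hello...
console.log(truncateGraphemes(codePointSplitter, "Hi", 5));          // Hi
```

Passing the splitter as a parameter keeps the helper easy to test and lets one Graphemer instance be reused across calls.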
