NLCST Emoji Modifier

6.0.2 · active · verified Sun Apr 19

The `nlcst-emoji-modifier` package is an NLCST utility that classifies both standard Unicode emoji (e.g., 👍) and GitHub-style gemoji shortcodes (e.g., `:cat:`) in natural language text by turning them into `EmoticonNode`s in the syntax tree. The current stable version is 6.0.2. The project is part of the `unified` collective, and its new major versions track Node.js LTS support, dependency updates, and the move to modern JavaScript modules. It is a low-level building block for linguistic analysis and is often used indirectly through higher-level plugins such as `retext-emoji`. What sets it apart is that it works directly on NLCST syntax trees rather than on raw strings.

Common errors

Warnings

Install

Imports

Quickstart

This example demonstrates how to integrate `nlcst-emoji-modifier` into a `ParseEnglish` pipeline to tokenize emoji and gemoji shortcodes as `EmoticonNode`s.

import {emojiModifier} from 'nlcst-emoji-modifier'
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'

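// Create a parser and register the modifier as a sentence-level plugin.
// `unshift` places it at the front of the sentence plugin list.
const english = new ParseEnglish()
english.tokenizeSentencePlugins.unshift(emojiModifier)

// Parse a sentence containing gemoji shortcodes and print the resulting tree.
console.log(inspect(english.parse('It’s raining :cat:s and :dog:s.')))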

// Expected output (simplified):
// RootNode
// └─ ParagraphNode
//    └─ SentenceNode
//       ├─ WordNode: "It’s"
//       ├─ WhiteSpaceNode: " "
//       ├─ WordNode: "raining"
//       ├─ WhiteSpaceNode: " "
//       ├─ EmoticonNode: ":cat:"
//       ├─ WordNode: "s"
//       ├─ WhiteSpaceNode: " "
//       ├─ WordNode: "and"
//       ├─ WhiteSpaceNode: " "
//       ├─ EmoticonNode: ":dog:"
//       ├─ WordNode: "s"
//       └─ PunctuationNode: "."
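
Once the shortcodes have become `EmoticonNode`s, downstream code usually needs to find them in the tree. The following is a minimal sketch of that step, not part of the package's API: the `sentence` object is a hand-written fragment that mirrors the simplified tree above (it is not output captured from the parser), and `collectEmoticons` is a hypothetical helper introduced only for illustration. It assumes each `EmoticonNode` is a literal node carrying its text in `value`.

```javascript
// Hand-written NLCST fragment mirroring part of the quickstart tree.
// Assumption: this is an illustration, not real parser output.
const sentence = {
  type: 'SentenceNode',
  children: [
    {type: 'WordNode', children: [{type: 'TextNode', value: 'raining'}]},
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'EmoticonNode', value: ':cat:'},
    {type: 'WordNode', children: [{type: 'TextNode', value: 's'}]},
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'and'}]},
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'EmoticonNode', value: ':dog:'},
    {type: 'WordNode', children: [{type: 'TextNode', value: 's'}]},
    {type: 'PunctuationNode', value: '.'}
  ]
}

// Hypothetical helper: depth-first walk that gathers EmoticonNode values.
function collectEmoticons(node, found = []) {
  if (node.type === 'EmoticonNode') found.push(node.value)
  // Literal nodes have no `children`, so fall back to an empty list.
  for (const child of node.children ?? []) collectEmoticons(child, found)
  return found
}

const emoticons = collectEmoticons(sentence)
console.log(emoticons) // [ ':cat:', ':dog:' ]
```

In a real pipeline the same walk could be applied to the tree returned by `english.parse(...)`. A generic tree-walking utility from the unist ecosystem would likely be a better fit than a hand-rolled recursion.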
