rjieba

0.2.0 · active · verified Thu Apr 16

rjieba is a Python binding for the `jieba-rs` Rust library that provides Chinese word segmentation. Because segmentation runs in Rust, it aims to be faster than pure-Python implementations. The current version is 0.2.0. Releases are infrequent and are typically driven by significant updates to the underlying `jieba-rs` library or the `pyo3` binding infrastructure.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart demonstrates basic Chinese word segmentation with `rjieba.cut` and part-of-speech tagging with `rjieba.tag`. No explicit dictionary initialization is required because the dictionaries are embedded by default.

import rjieba

text = '我们中出了一个叛徒'

# Segment the sentence into words.
segmented_text = rjieba.cut(text)
print(f"Segmented (cut): {list(segmented_text)}")

# Tag each word with its part of speech.
tagged_text = rjieba.tag(text)
print(f"Tagged: {list(tagged_text)}")
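
Building on the quickstart, the following is a minimal sketch of a common next step: counting word frequencies across several texts. The helper `word_frequencies` and its `cut` parameter are hypothetical names introduced here for illustration and are not part of rjieba. The sketch assumes only that `rjieba.cut` takes a string and returns an iterable of string tokens, as used above.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def word_frequencies(
    texts: Iterable[str],
    cut: Optional[Callable[[str], Iterable[str]]] = None,
) -> Counter:
    """Count segmented tokens across several texts.

    `cut` is any callable that maps a string to an iterable of tokens.
    When it is omitted, rjieba.cut is imported lazily and used.
    (Hypothetical helper, not part of the rjieba API.)
    """
    if cut is None:
        # Lazy import so the helper can be exercised with another segmenter.
        import rjieba
        cut = rjieba.cut

    counts: Counter = Counter()
    for text in texts:
        # Ignore empty or whitespace-only tokens.
        counts.update(tok for tok in cut(text) if tok.strip())
    return counts
```

With rjieba installed, `word_frequencies([text]).most_common(5)` would list the five most frequent tokens in the quickstart sentence. Accepting the segmenter as a parameter keeps the counting logic independent of rjieba, so another segmenter can be swapped in for comparison.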
