MiniSBD: Fast Sentence Boundary Detection

0.9.5 · active · verified Fri Apr 17

MiniSBD is a free, open-source Python library for fast sentence boundary detection (SBD). It is a lightweight way to split text into sentences and handles a range of punctuation and language patterns. The current version is 0.9.5; releases are periodic and are often driven by improvements in tokenization or punctuation handling.

Quickstart

Initialize the SBD class once, then use its `segment` method to split a string into a list of sentences.

from minisbd import SBD

sbd = SBD() # Initialize the SBD object once

text1 = "Hello world. This is a test. Is it working?"
sentences1 = sbd.segment(text1)
print(f"Text 1: {sentences1}")

text2 = "Hello world! This is another test. Is it working now?"
sentences2 = sbd.segment(text2)
print(f"Text 2: {sentences2}")
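
The "initialize once, reuse" pattern above can be sketched for a batch of texts. This is a minimal illustration, not MiniSBD's implementation: `NaiveSegmenter` and `segment_all` are hypothetical names, and the stand-in class only mimics the `segment` call shape from the quickstart with a simple regular expression, so the sketch runs without the library installed.

```python
import re


class NaiveSegmenter:
    """Hypothetical stand-in with the same segment() shape as the quickstart.

    Not MiniSBD's algorithm: it only splits on whitespace that follows
    '.', '!' or '?', so abbreviations such as "Dr." will be split wrongly.
    """

    _boundary = re.compile(r"(?<=[.!?])\s+")

    def segment(self, text):
        # Drop empty pieces that appear for blank or whitespace-only input.
        return [s for s in self._boundary.split(text.strip()) if s]


def segment_all(segmenter, texts):
    """Run one already-constructed segmenter over many texts."""
    return [segmenter.segment(text) for text in texts]


segmenter = NaiveSegmenter()  # built once, reused for every text
results = segment_all(segmenter, [
    "Hello world. This is a test. Is it working?",
    "Hello world! This is another test. Is it working now?",
])
for sentences in results:
    print(sentences)
```

Assuming `SBD().segment` returns a list of sentences as described in the quickstart, passing an `SBD` instance to `segment_all` in place of the stand-in should work the same way, with the object still constructed only once.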
