Wordninja

2.0.0 · maintenance · verified Sun Apr 12

Wordninja is a Python library that probabilistically splits concatenated words using unigram frequencies from English Wikipedia. For example, it segments 'imateapot' into ['im', 'a', 'teapot']. The current version is 2.0.0, released in August 2019; the project is in maintenance mode, with a focus on stability rather than new features.
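To make the idea of frequency-based splitting concrete, here is a minimal sketch of one way such a segmenter could work. It is not the library's own code. It assumes a tiny hypothetical vocabulary (`TOY_WORDS`) ordered from most to least frequent, a Zipf-style cost derived from each word's rank, and a dynamic-programming search for the cheapest segmentation. The names `build_costs`, `toy_split`, and `UNKNOWN_COST` are made up for illustration.

```python
import math

# Hypothetical toy vocabulary, ordered from most to least frequent.
# A real word list would be far larger (e.g. derived from Wikipedia).
TOY_WORDS = ["the", "a", "is", "this", "test", "string",
             "im", "teapot", "tea", "pot"]

# Assumed penalty for a single character that is not in the vocabulary,
# so that every input still has at least one valid segmentation.
UNKNOWN_COST = 100.0


def build_costs(words):
    """Give each word a cost that grows with its frequency rank (Zipf-style)."""
    n = len(words)
    return {w: math.log((rank + 1) * math.log(n))
            for rank, w in enumerate(words)}


def toy_split(text, costs):
    """Return the lowest-cost segmentation of `text` via dynamic programming."""
    max_len = max(len(w) for w in costs)
    # best[i] holds (total cost, length of last word) for the prefix text[:i].
    best = [(0.0, 0)]
    for i in range(1, len(text) + 1):
        candidates = []
        for k in range(1, min(i, max_len) + 1):
            word = text[i - k:i]
            cost = costs.get(word)
            if cost is None:
                if k > 1:
                    continue  # unknown multi-character chunks are not allowed
                cost = UNKNOWN_COST
            candidates.append((best[i - k][0] + cost, k))
        best.append(min(candidates))

    # Walk backwards through the table to recover the chosen words.
    words = []
    i = len(text)
    while i > 0:
        k = best[i][1]
        words.append(text[i - k:i])
        i -= k
    return words[::-1]


COSTS = build_costs(TOY_WORDS)
print(toy_split("imateapot", COSTS))  # ['im', 'a', 'teapot']
```

With this toy vocabulary, 'teapot' wins over 'tea' + 'pot' because one word costs less than the sum of two lower-ranked ones. A different word list or cost function would give different splits.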

Warnings

Install

Imports

Quickstart

Demonstrates how to import the library and use the `split` function to segment a concatenated string into a list of words.

import wordninja

# Split a run-together string into a list of words.
split_words = wordninja.split('thisisateststring')
print(split_words)

# Longer phrases are handled the same way.
split_phrase = wordninja.split('hellofromtheotherside')
print(split_phrase)
