Normality

3.1.0 · active · verified Thu Apr 16

Normality is a Python micro-package (current version 3.1.0) that provides a small set of reusable text normalization functions. It accepts Unicode or UTF-8 encoded text and removes various classes of characters, such as diacritics and punctuation. The library is actively maintained and is useful as a preparation step for further text analysis.

Quickstart

This quickstart demonstrates the core normalization functions of the `normality` library, including `normalize` for general text cleaning, `slugify` for creating URL-friendly slugs, and `collapse_spaces` for standardizing whitespace.

from normality import normalize, slugify, collapse_spaces

text = normalize('Nie wieder "Grüne Süppchen" kochen!')
print(f"Normalized: {text}")
# Expected: nie wieder grüne süppchen kochen

slug = slugify('My first blog post!')
print(f"Slugified: {slug}")
# Expected: my-first-blog-post

spaced_text = 'this \n\n\r\nhas\tlots of \nodd spacing.'
cleaned_text = collapse_spaces(spaced_text)
print(f"Collapsed spaces: {cleaned_text}")
# Expected: this has lots of odd spacing.
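
To make the idea of removing "classes of characters" more concrete, here is a rough standard-library sketch of this kind of cleaning. It is an illustration only and not normality's actual implementation: the names `strip_diacritics` and `sketch_normalize` and the `ascii_only` flag are hypothetical, and the choice of Unicode categories to replace is an assumption.

```python
import re
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sketch_normalize(text: str, ascii_only: bool = False) -> str:
    """Hypothetical helper: lowercase, blank out unwanted categories, collapse spaces."""
    if ascii_only:
        text = strip_diacritics(text)
    chars = []
    for ch in text.lower():
        # The first letter of the Unicode category names the class:
        # P = punctuation, S = symbol, C = control/format, Z = separator.
        major = unicodedata.category(ch)[0]
        chars.append(" " if major in "PSCZ" else ch)
    # Collapse any run of whitespace into a single space.
    return re.sub(r"\s+", " ", "".join(chars)).strip()


sample = 'Nie wieder "Grüne Süppchen" kochen!'
print(sketch_normalize(sample))                   # nie wieder grüne süppchen kochen
print(sketch_normalize(sample, ascii_only=True))  # nie wieder grune suppchen kochen
```

In this sketch, diacritics are stripped only when `ascii_only=True` is passed, so accented letters survive the default path while punctuation and control characters are always replaced by spaces.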
