NLPAug: Natural Language Processing Augmentation Library

1.1.11 · active · verified Mon Apr 13

NLPAug is a Python library for data augmentation in natural language processing. It generates synthetic text to help deep learning models become more robust and less prone to overfitting on small datasets. The library supports augmentation techniques at the character, word, and sentence levels. Currently at version 1.1.11, it maintains an active release cadence with several minor updates throughout the year.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates character-level augmentation using KeyboardAug. It initializes an augmenter that simulates typos by replacing characters with nearby keys on the keyboard. The `augment` method returns a list of augmented texts, even when `n=1` (the default), so the example indexes the first element.

import nlpaug.augmenter.char as nac

text = "The quick brown fox jumps over the lazy dog."

# Initialize a Keyboard Augmenter
# Simulates typos based on keyboard proximity
aug = nac.KeyboardAug(aug_char_p=0.1, aug_word_p=0.1, aug_char_min=1)

# Augment the text
augmented_text = aug.augment(text)

print(f"Original: {text}")
print(f"Augmented: {augmented_text[0]}")
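To make the idea behind keyboard-proximity typos concrete, here is a minimal pure-Python sketch of the technique. It is not NLPAug's implementation: the `NEIGHBORS` map, `keyboard_typos`, and `augment_sketch` are hypothetical names introduced for illustration, and the neighbor map is a small, hand-written subset of a QWERTY layout. The sketch only assumes the behavior described above: characters are swapped for adjacent keys with some probability, and the result comes back as a list.

```python
import random

# Hypothetical, heavily truncated QWERTY neighbor map (illustration only).
# A real augmenter would cover the whole keyboard.
NEIGHBORS = {
    "a": "qwsz",
    "e": "wrsd",
    "i": "uojk",
    "o": "ipkl",
    "u": "yihj",
    "t": "rfgy",
    "n": "bhjm",
    "s": "awedxz",
}


def keyboard_typos(text, char_p, rng):
    """Swap each mapped character for a neighboring key with probability char_p."""
    out = []
    for ch in text:
        neighbors = NEIGHBORS.get(ch.lower())
        if neighbors and rng.random() < char_p:
            repl = rng.choice(neighbors)
            # Preserve the case of the original character.
            out.append(repl.upper() if ch.isupper() else repl)
        else:
            out.append(ch)
    return "".join(out)


def augment_sketch(text, n=1, char_p=0.1, seed=None):
    """Return a list of n variants; always a list, even when n == 1."""
    rng = random.Random(seed)
    return [keyboard_typos(text, char_p, rng) for _ in range(n)]


if __name__ == "__main__":
    text = "The quick brown fox jumps over the lazy dog."
    for variant in augment_sketch(text, n=3, char_p=0.2, seed=0):
        print(variant)
```

Because every replacement is one character for one character, each variant has the same length as the input, which keeps character offsets (for example, span annotations) aligned with the original text. Passing a `seed` makes the sketch reproducible.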
