Delimiter Detector

0.1.1 · maintenance · verified Thu Apr 16

The `detect-delimiter` Python library (version 0.1.1, last released in July 2018) exposes a single function, `detect()`, that guesses the delimiter used in ad-hoc delimited text such as CSV or TSV. It works primarily by counting character frequencies in the input string, which makes it suitable for basic delimiter detection. There has been no release since 2018, so treat the library as stable but not actively developed.
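To make the frequency-counting idea concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: the function name `guess_delimiter` and its `ignore` parameter are hypothetical, and the choice to skip letters, digits, periods, and line breaks is an assumption made for illustration.

```python
from collections import Counter
import string

# Hypothetical helper, NOT part of detect-delimiter: a rough illustration
# of guessing a delimiter by counting character frequencies.
# Assumption: letters, digits, '.', and line breaks are never delimiters.
IGNORED = string.ascii_letters + string.digits + ".\r\n"

def guess_delimiter(text, default=None, ignore=IGNORED):
    """Return the most frequent non-ignored character, or `default`."""
    counts = Counter(ch for ch in text if ch not in ignore)
    if not counts:
        # No candidate characters remain after filtering.
        return default
    return counts.most_common(1)[0][0]

print(repr(guess_delimiter("apple,banana,cherry")))
print(repr(guess_delimiter("name\tage\tcity")))
print(repr(guess_delimiter("helloworld", default="NA")))
```

A pure frequency count is easy to fool, for example when spaces inside fields outnumber the real separator. That is one reason a detector might check a short list of likely delimiters before falling back to counting.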

Quickstart

The `detect()` function is the library's entry point. It takes the text as a string and accepts three optional parameters: `whitelist` (a list of characters to prioritize), `blacklist` (characters to ignore), and `default` (the value returned when no delimiter is found).

from detect_delimiter import detect

# Example 1: Basic comma-separated data
text1 = "apple,banana,cherry"
delimiter1 = detect(text1)
print(f"Delimiter for '{text1}': '{delimiter1}'")

# Example 2: Tab-separated data (!r prints the tab as '\t' instead of raw whitespace)
text2 = "name\tage\tcity"
delimiter2 = detect(text2)
print(f"Delimiter for {text2!r}: {delimiter2!r}")

# Example 3: Semicolon-separated with a custom whitelist
text3 = "one;two;three"
delimiter3 = detect(text3, whitelist=[';', ',', '|'])
print(f"Delimiter for '{text3}': '{delimiter3}'")

# Example 4: No candidate delimiter present, so the default value is returned
# (input such as "hello world" contains a space, which may itself be detected)
text4 = "helloworld"
delimiter4 = detect(text4, default='NA')
print(f"Delimiter for '{text4}': '{delimiter4}'")

# Example 5: The period is blacklisted by default, so it is not reported as a delimiter
text5 = "file.name.txt"
delimiter5 = detect(text5)
print(f"Delimiter for {text5!r}: {delimiter5!r}")  # Expected: None (as '.' is blacklisted by default)
