CleverCSV

0.8.4 · active · verified Wed Apr 15

CleverCSV is a Python package for robustly handling messy CSV files. It provides a drop-in replacement for the standard library's `csv` module with improved dialect detection, so that files with unusual delimiters, quoting, or escaping are parsed correctly. It also includes a command-line interface for tasks such as standardizing files and generating code. The project is actively developed, typically with several minor releases each year.

Warnings

Install

Imports

Quickstart

This quickstart reads a messy CSV file with CleverCSV and detects its dialect automatically in two ways: with the high-level `read_table` function, and with `clevercsv.Sniffer` used as a drop-in replacement for the standard library's `csv.Sniffer`.

import clevercsv
import os

# Create a dummy messy CSV file for demonstration
csv_content = 'col1;col2;col3\nvalue1;"value,2";value3\n4;5;6\n'
file_path = 'messy_data.csv'
with open(file_path, 'w', newline='', encoding='utf-8') as f:
    f.write(csv_content)

try:
    # Use read_table to automatically detect the dialect and load the data
    rows = clevercsv.read_table(file_path)
    print(f"Loaded {len(rows)} rows using the auto-detected dialect:")
    for row in rows:
        print(row)

    # Demonstrate drop-in replacement for standard csv module usage
    with open(file_path, 'r', newline='', encoding='utf-8') as csvfile:
        # Sniff the dialect using CleverCSV's improved sniffer
        dialect = clevercsv.Sniffer().sniff(csvfile.read(1024))
        csvfile.seek(0)
        reader = clevercsv.reader(csvfile, dialect)
        sniffer_rows = list(reader)
    print(f"\nLoaded {len(sniffer_rows)} rows using CleverCSV.Sniffer:")
    for row in sniffer_rows:
        print(row)

finally:
    # Clean up the dummy file
    if os.path.exists(file_path):
        os.remove(file_path)
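
To see why the dialect matters for a file like the one in the quickstart, the following sketch parses the same sample content twice using only the standard library's `csv` module (not CleverCSV itself): once with the default comma dialect, and once with the semicolon dialect supplied by hand. The variable names `wrong_rows` and `right_rows` are illustrative only. The hand-written `delimiter` and `quotechar` arguments stand in for the values that dialect detection is meant to find automatically.

```python
import csv
import io

# Same sample content as in the quickstart: semicolon-delimited,
# with a comma inside a quoted field.
sample = 'col1;col2;col3\nvalue1;"value,2";value3\n4;5;6\n'

# Default dialect (comma delimiter): the semicolons are not treated as
# separators, and the comma inside the quoted field splits the row.
wrong_rows = list(csv.reader(io.StringIO(sample)))

# Correct dialect, supplied by hand for illustration.
right_rows = list(csv.reader(io.StringIO(sample), delimiter=';', quotechar='"'))

print("Default dialect:")
for row in wrong_rows:
    print(row)

print("Semicolon dialect:")
for row in right_rows:
    print(row)
```

With the default dialect, the header row comes back as a single field and the second row is split at the comma inside the quotes. With the semicolon dialect, every row has three fields and the quoted value keeps its comma.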
