dataclass-csv
dataclass-csv is a Python library that maps CSV data to Python dataclasses. It converts values automatically for standard types and supports user-defined types for more complex cases. The library provides both a reader (`DataclassReader`) and a writer (`DataclassWriter`). The current version is 1.4.1, and the project is actively developed with a moderate release cadence.
Common errors
-
dataclass_csv.exceptions.DuplicatedHeaderError: CSV header '...' is duplicated. This can lead to unexpected data mapping.
cause: Your CSV file contains multiple columns with the same header name. dataclass-csv treats this as ambiguous and rejects the file to protect data integrity.
fix: Modify your CSV file so that all header names are unique. If you have columns with logically similar data, give them distinct names (e.g., `value_1`, `value_2`).
-
ValueError: invalid literal for int() with base 10: 'abc' (or similar for float, date, bool)
cause: A value in your CSV column could not be converted to the type declared on the corresponding dataclass field (e.g., the string 'abc' into an `int`). dataclass-csv usually reports this wrapped in `dataclass_csv.exceptions.CsvValueError`, which names the field and the CSV line number.
fix: Inspect the CSV data and the dataclass field type and make them agree. If a column needs special handling, annotate the field with a user-defined type whose constructor accepts a string and performs the conversion or validation itself.
-
AttributeError: type object 'MyDataclass' has no attribute 'missing_field'
cause: Your code accesses a field that does not exist in the dataclass definition. This often accompanies a header mismatch, where a CSV header name does not match any dataclass field name and the column was therefore never mapped.
fix: Verify that your CSV header names exactly match the field names in your dataclass (the match is case-sensitive). If the names differ, map them explicitly on the reader with `reader.map('ActualCsvHeader').to('field_name')` before iterating.
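Two of the errors above, duplicated headers and header/field name mismatches, can be caught before the file ever reaches `DataclassReader`. The following is a minimal sketch using only the standard library. It is not part of dataclass-csv, and the helper name `preflight`, the `Product` dataclass, and the sample data are illustrative assumptions.

```python
import csv
import dataclasses
import io
from collections import Counter


@dataclasses.dataclass
class Product:
    product_id: int
    name: str


def preflight(csv_file, cls):
    """Return (duplicated_headers, fields_without_header) for a CSV file.

    Hypothetical helper: it only inspects the header row and compares it,
    case-sensitively, with the dataclass field names.
    """
    header = next(csv.reader(csv_file), [])
    counts = Counter(header)
    # Header names that appear more than once
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    # Dataclass fields that have no identically named column
    missing = [f.name for f in dataclasses.fields(cls) if f.name not in counts]
    return duplicated, missing


# "Name" differs from the field "name" by case, and "price" appears twice
sample = io.StringIO("product_id,Name,price,price\n101,Laptop,1200.50,999\n")
duplicated, missing = preflight(sample, Product)
print("duplicated headers:", duplicated)
print("fields without a matching header:", missing)
```

Running a check like this first gives a single report of every header problem, instead of stopping at the first exception raised during reading.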
Warnings
- breaking Starting from version 1.4.1, the internal boolean string conversion logic has changed. Previously it used `distutils.util.strtobool`, which raised a `ValueError` for unrecognized boolean strings (e.g., 'unknown'). The new implementation converts unrecognized strings to `False` instead of raising an error, so non-standard boolean representations in your CSV can silently change behavior.
- gotcha Before version 1.3.0, if your CSV file contained duplicated header names, `dataclass-csv` might have silently mapped data incorrectly or overwritten values. From 1.3.0 onwards, it explicitly checks for and raises a `DuplicatedHeaderError` for such cases.
- gotcha Automatic date and datetime type conversion was officially introduced in version 1.4.0. Prior versions did not inherently support converting string representations of dates (e.g., 'YYYY-MM-DD') into `datetime.date` or `datetime.datetime` objects, requiring manual conversion or custom type converters.
- deprecated The project deprecated `pipenv` for dependency management in version 1.4.1, moving towards `poetry`. While this primarily affects project contributors and development setup, users following older contribution guides or examples might find inconsistencies.
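If the silent `False` conversion described in the first warning is a concern, boolean columns can be screened before reading. This is a minimal standard-library sketch, assuming the behavior described above. The helper name `find_unrecognized_bools` is hypothetical, and the accepted vocabulary is the set of strings that `distutils.util.strtobool` recognized, which may differ from what your installed version of dataclass-csv accepts.

```python
import csv
import io

# Strings that distutils.util.strtobool accepted (compared case-insensitively).
# Assumption: adjust these sets to match your data and library version.
TRUE_STRINGS = {"y", "yes", "t", "true", "on", "1"}
FALSE_STRINGS = {"n", "no", "f", "false", "off", "0"}


def find_unrecognized_bools(csv_file, column):
    """Return (line_number, raw_value) pairs for cells outside the vocabulary."""
    reader = csv.DictReader(csv_file)
    problems = []
    for row in reader:
        raw = (row.get(column) or "").strip()
        if raw.lower() not in TRUE_STRINGS | FALSE_STRINGS:
            # reader.line_num is the physical line of the row just read
            problems.append((reader.line_num, raw))
    return problems


sample = io.StringIO("name,in_stock\nLaptop,true\nMouse,unknown\nKeyboard,0\n")
problems = find_unrecognized_bools(sample, "in_stock")
print(problems)
```

Any pair this returns marks a cell that would otherwise be read as `False` without an error.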
Install
-
pip install dataclass-csv
Imports
- DataclassReader
from dataclass_csv import DataclassReader
- DataclassWriter
from dataclass_csv import DataclassWriter
Quickstart
import dataclasses
import io

from dataclass_csv import DataclassReader


@dataclasses.dataclass
class Product:
    product_id: int
    name: str
    price: float
    in_stock: bool


csv_data = """product_id,name,price,in_stock
101,Laptop,1200.50,true
102,Mouse,25.99,False
103,Keyboard,75.00,1
104,Monitor,300.00,0
"""

# Read CSV data into dataclass instances
reader = DataclassReader(io.StringIO(csv_data), Product)
products = []
for product in reader:
    products.append(product)
    print(f"Product: {product.name}, Price: ${product.price}, In Stock: {product.in_stock}")

# Example of accessing a specific product
if products:
    print(f"\nFirst product name: {products[0].name}")