Defused CSV
defusedcsv is a Python library (version 3.0.0) that acts as a drop-in replacement for the standard library's `csv` module and is designed specifically to mitigate CSV injection attacks. It sanitizes output: it prepends an apostrophe to any cell that starts with a potentially malicious character (`=`, `+`, `-`, `@`, `|`, or `%`) and escapes `|` characters within those cells. This prevents spreadsheet software (such as MS Excel or LibreOffice) from interpreting the cell content as a formula. Releases appear to be infrequent; the latest version was published to PyPI on September 2, 2025.
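The sanitization rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's actual code: `defuse_cell` and `TRIGGER_CHARS` are hypothetical names, and the sketch follows only the rule as worded in this section, so the real library may treat some inputs (for example, plain negative numbers) differently.

```python
# Leading characters that this section lists as triggering sanitization.
TRIGGER_CHARS = ("=", "+", "-", "@", "|", "%")


def defuse_cell(value):
    """Hypothetical helper that mimics the rule described above."""
    text = str(value)
    if text and text[0] in TRIGGER_CHARS:
        # Escape every pipe, then prefix an apostrophe so that a
        # spreadsheet treats the cell as plain text, not a formula.
        return "'" + text.replace("|", "\\|")
    # Cells that do not start with a trigger character pass through unchanged.
    return text


print(defuse_cell("=1+1"))
print(defuse_cell("|cmd /C calc!A1"))
print(defuse_cell("Safe note"))
```

In practice the library applies its own rule when rows are written, so a helper like this should not be needed alongside it; the sketch only makes the transformation concrete.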
Warnings
- breaking The primary function of `defusedcsv` is to modify CSV cell content to prevent injection attacks. This means the output CSV files will not be byte-for-byte identical to those produced by the standard `csv` module if malicious-looking data is present. Systems expecting exact, untransformed CSV output (e.g., for cryptographic hashing or strict format validation) may break.
- gotcha The library explicitly states it's tested with Python 3.9 to 3.13. While it might work on other Python 3 versions, explicit support is not guaranteed, which could lead to unexpected behavior or incompatibilities.
- gotcha The sanitization only addresses CSV injection for spreadsheet software. It does not validate or sanitize other forms of potentially malicious data within the CSV (e.g., malformed data, incorrect types, or general parsing errors) that could exploit other vulnerabilities or cause issues in different downstream processing systems.
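To illustrate the first warning, the sketch below hashes two serializations of the same row with the standard `csv` module: one with the raw formula cell and one with the apostrophe prefix already applied by hand as a stand-in for sanitized output. The helper name `render_csv` and the sample row are assumptions made for this example only.

```python
import csv
import hashlib
import io


def render_csv(rows):
    """Serialize rows with the standard csv module and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


raw_rows = [["2", "Bob", "=1+1"]]
# Stand-in for sanitized output: the formula cell carries an apostrophe prefix.
defused_rows = [["2", "Bob", "'=1+1"]]

raw_digest = hashlib.sha256(render_csv(raw_rows).encode("utf-8")).hexdigest()
defused_digest = hashlib.sha256(render_csv(defused_rows).encode("utf-8")).hexdigest()

# The digests differ, so any check that pins a hash of the raw export will fail.
print(raw_digest == defused_digest)
```

Pipelines that verify checksums or exact file contents should therefore compute them on the sanitized output rather than on the source data.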
Install
pip install defusedcsv
Imports
- csv
from defusedcsv import csv
Quickstart
from defusedcsv import csv
import io
# Prepare an in-memory CSV output stream
output = io.StringIO()
writer = csv.writer(output)
# Write header and rows, including potentially malicious payloads
writer.writerow(['ID', 'Name', 'Notes'])
writer.writerow(['1', 'Alice', 'Safe note'])
writer.writerow(['2', 'Bob', '=1+1'])
writer.writerow(['3', 'Charlie', '@SUM(A1:A2)'])
writer.writerow(['4', 'David', '|cmd /C calc!A1'])  # each '|' is escaped as '\|', and the cell is prefixed with an apostrophe
# Get the sanitized CSV data
sanitized_csv_data = output.getvalue()
print("--- Sanitized CSV Output (as seen in file) ---")
print(sanitized_csv_data)
# Example of reading the sanitized CSV back (shows raw content)
input_data = io.StringIO(sanitized_csv_data)
reader = csv.reader(input_data)
print("\n--- Reading Sanitized CSV ---")
headers = next(reader)
print(f"Headers: {headers}")
for row in reader:
    print(f"Row: {row}")
# Expected output for the problematic cells (when viewed programmatically):
# ['2', 'Bob', "'=1+1"]
# ['3', 'Charlie', "'@SUM(A1:A2)"]
# ['4', 'David', "'\\|cmd /C calc!A1"]