xlsx2csv
xlsx2csv is a Python library and command-line tool that converts XLSX files to CSV. It is fast and can handle large XLSX files. The current version is 0.8.6, and the project is actively maintained, with recent releases fixing bugs and improving compatibility with newer Python versions.
Warnings
- breaking Prior to version 0.8.2, `xlsx2csv` may have used or re-raised `zipfile.BadZipfile`. Since version 0.8.2 it uses `zipfile.BadZipFile` (with a capital 'F'), the name introduced in Python 3.2, where `BadZipfile` became a deprecated alias. Code that catches the exception should use the `BadZipFile` spelling.
- breaking Versions of `xlsx2csv` prior to 0.8.3 could fail on Python 3.12 or newer because of invalid escape sequences in regular-expression string literals. Python 3.12 reports these as a `SyntaxWarning`, which becomes a `SyntaxError` when warnings are treated as errors.
- gotcha Not using `Xlsx2csv` within a `with` statement (context manager) may lead to `ResourceWarning` in modern Python versions due to improper resource cleanup.
- gotcha In versions prior to 0.8.3, a bug existed when processing XLSX files with missing workbook relationships, which could lead to conversion failures or unexpected behavior.
- gotcha XLSX files containing hyperlinks could cause crashes or incorrect processing in `xlsx2csv` versions older than 0.8.2.
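The invalid-file warnings above suggest checking an input before handing it to the converter. The following is a minimal sketch using only the standard library. `looks_like_xlsx` is a hypothetical helper name, not part of xlsx2csv, and it only checks that the file is a ZIP archive containing a `[Content_Types].xml` entry, not that it is a fully valid workbook.

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if `path` is a ZIP archive with an OOXML content-types entry.

    Hypothetical helper for illustration; it is not part of xlsx2csv.
    """
    try:
        with zipfile.ZipFile(path) as archive:
            return "[Content_Types].xml" in archive.namelist()
    except (zipfile.BadZipFile, OSError):
        # BadZipFile is the current spelling; BadZipfile is the legacy alias.
        # OSError covers missing or unreadable files.
        return False
```

A caller could skip or report files for which this returns `False` instead of relying on the exception raised during conversion.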
Install
pip install xlsx2csv
Imports
- Xlsx2csv
from xlsx2csv import Xlsx2csv
Quickstart
import os

import openpyxl  # used only to build the sample workbook (pip install openpyxl)
from xlsx2csv import Xlsx2csv

excel_file = 'example.xlsx'
csv_output = 'example.csv'

# Create a small, valid XLSX file for demonstration.
# In a real scenario, this file would already exist.
# (A plain-text placeholder would be rejected as an invalid XLSX file.)
wb = openpyxl.Workbook()
ws = wb.active
ws['A1'] = 'Header1'
ws['B1'] = 'Header2'
ws['A2'] = 'Data1'
ws['B2'] = 'Data2'
wb.save(excel_file)

try:
    # Recommended: use the context manager for proper resource cleanup
    with Xlsx2csv(excel_file, outputencoding="utf-8") as converter:
        converter.convert(csv_output)
    print(f"Successfully converted '{excel_file}' to '{csv_output}'.")

    # Optionally, read and print the CSV content to verify
    if os.path.exists(csv_output):
        with open(csv_output, 'r', encoding='utf-8') as f_csv:
            print("CSV Content:\n" + f_csv.read())
    else:
        print(f"Error: CSV file '{csv_output}' not created.")
except Exception as e:
    print(f"An error occurred during conversion: {e}")
finally:
    # Clean up the demo files
    for path in (excel_file, csv_output):
        if os.path.exists(path):
            os.remove(path)
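Once a CSV file has been produced, it can be consumed with the standard `csv` module. The sketch below assumes comma-delimited output with a header row, shaped like the Quickstart data. `read_rows` and the sample text are illustrative and not part of xlsx2csv.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Sample text shaped like the Quickstart data (an assumption, not captured output).
sample = "Header1,Header2\nData1,Data2\n"
rows = read_rows(sample)
print(rows[0]["Header1"])
```

For a file on disk, the same reader can be given an open file handle (opened with `newline=''` and the encoding used for conversion) instead of an `io.StringIO` buffer.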