pyxlsb: Excel Binary Workbook (.xlsb) Parser
pyxlsb is a Python library for parsing Excel 2007-2010 Binary Workbook (.xlsb) files. It provides a read-only API for sheets, rows, and cells, and returns cell values such as numbers, strings, and dates. Recent releases focus on bug fixes and stability; the current version is 1.0.10.
Warnings
- gotcha Prior to version 1.0.10, users might encounter `DeprecationWarning` due to string-byte comparison issues on Python 3.
- gotcha Versions before 1.0.8 had known resource leaks. For long-running processes or processing many files, this could lead to memory accumulation or file handle exhaustion.
- gotcha Date and time conversions in versions prior to 1.0.9 could have rounding errors for specific values, leading to slight inaccuracies.
- gotcha In versions before 1.0.7, workbooks with mixed sheet types (e.g., chart sheets alongside data sheets) could cause the wrong sheet to be selected.
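The date-rounding gotcha is easier to picture with a concrete conversion. The sketch below is a stdlib-only illustration of turning an Excel serial number into a `datetime`; it is not pyxlsb's own implementation. The helper name `serial_to_datetime` is hypothetical, and the code assumes the 1900 date system with serials on or after 1900-03-01.

```python
from datetime import datetime, timedelta

# Conventional day 0 for the 1900 date system. It is only valid for serials
# from 61 (1900-03-01) onward, because Excel counts a nonexistent 1900-02-29.
EXCEL_EPOCH = datetime(1899, 12, 30)

def serial_to_datetime(serial):
    """Hypothetical helper: convert an Excel serial number to a datetime."""
    if isinstance(serial, bool) or not isinstance(serial, (int, float)):
        return None
    days = int(serial)
    # Round the fractional day to whole seconds so floating-point noise
    # cannot leave the result a fraction of a second off the intended time.
    seconds = round((serial - days) * 86400)
    return EXCEL_EPOCH + timedelta(days=days, seconds=seconds)

print(serial_to_datetime(43831.5))
```

Rounding to whole seconds is a deliberate choice here: a fractional day rarely has an exact binary representation, so truncating instead of rounding is one way small inaccuracies can creep in.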
Install
pip install pyxlsb
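Since several of the warnings above depend on the installed version, it can help to check it after installing. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helpers `version_tuple` and `is_current` are hypothetical names, and the parsing assumes plain dotted version numbers.

```python
import re
from importlib.metadata import PackageNotFoundError, version

def version_tuple(text):
    """Turn '1.0.10' into (1, 0, 10) so versions compare numerically."""
    parts = []
    for part in text.split(".")[:3]:
        digits = re.match(r"\d+", part)
        parts.append(int(digits.group()) if digits else 0)
    return tuple(parts)

def is_current(installed, minimum="1.0.10"):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

try:
    installed = version("pyxlsb")
except PackageNotFoundError:
    print("pyxlsb is not installed")
else:
    status = "ok" if is_current(installed) else "older than 1.0.10, consider upgrading"
    print(f"pyxlsb {installed}: {status}")
```

Comparing tuples of integers rather than raw strings matters here, because as strings `"1.0.9"` sorts after `"1.0.10"`.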
Imports
- open_workbook
from pyxlsb import open_workbook
Quickstart
import os
from pyxlsb import open_workbook
# pyxlsb is read-only, so this example cannot create its own input file.
# To get one, save any workbook from Excel as 'Excel Binary Workbook (*.xlsb)'.
# The code below assumes 'example.xlsb' exists in the current directory;
# adjust file_path if your file lives elsewhere.
file_path = os.path.join(os.getcwd(), 'example.xlsb')
try:
    with open_workbook(file_path) as wb:
        # Access the first sheet by index (1-based)
        # Or by name: with wb.get_sheet('Sheet1') as sheet:
        with wb.get_sheet(1) as sheet:
            print(f"Reading sheet: {sheet.name}")
            # Iterate through rows
            for row_index, row in enumerate(sheet.rows()):
                if row_index >= 5:  # Limit output for quickstart
                    break
                row_values = [cell.v for cell in row if cell.v is not None]
                if row_values:
                    print(f"Row {row_index + 1}: {row_values}")
except FileNotFoundError:
    print(f"Warning: '{file_path}' not found. Please create an example.xlsb file to run this quickstart.")
except Exception as e:
    print(f"An error occurred: {e}")