excelrd
excelrd is a Python library designed for developers to extract data and formatting information from Microsoft Excel spreadsheet files, supporting both the older `.xls` and newer `.xlsx` formats. It is a modified version of the `xlrd` library, specifically updated to maintain compatibility and functionality with modern Python versions (3.7+), as `xlrd` itself no longer supports `.xlsx` files in its recent versions. The library is currently at version 3.0.0 and sees releases primarily for Python compatibility and maintenance.
Common errors
-
ModuleNotFoundError: No module named 'excelrd'
cause The `excelrd` library has not been installed in your Python environment or the environment where the script is being run is not the one where `excelrd` was installed.fixRun `pip install excelrd` to install the library. -
FileNotFoundError: [Errno 2] No such file or directory: 'your_excel_file.xls'
cause The specified Excel file path is incorrect, the file does not exist at the given location, or the program does not have read permissions for the file or directory.fixDouble-check the file path and name, ensuring it's absolute or correctly relative to your script's execution directory. Verify file permissions. -
excelrd.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 0x...
cause This error typically indicates that the file being opened is not a valid Excel file or is corrupted. While `excelrd` is designed to handle both `.xls` and `.xlsx` files, it relies on underlying format recognition, which can fail for malformed or non-standard Excel files.fixVerify that the file you are attempting to open is a legitimate Excel file and not, for example, a CSV file renamed to .xls, or a severely corrupted document. Try opening the file manually in Excel to confirm its integrity. If it's an `.xlsx` file and you are certain it's valid, ensure `excelrd` is updated to the latest version to handle newer `.xlsx` structures.
Warnings
- breaking Version 3.0.0 dropped support for Python 3.5 and 3.6. Ensure your environment uses Python 3.7 or newer for compatibility.
- breaking Version 2.0.0 dropped support for Python 2. This was a significant change moving the library to Python 3 exclusivity.
- gotcha excelrd (like its base xlrd) extracts only the *results* of formulas, not the formulas themselves. If a cell contains a formula, you will get the calculated value, not the formula string (e.g., `=A1+B1`).
- gotcha Password-protected (encrypted) Excel files are not supported and cannot be read by `excelrd`.
- gotcha The library will ignore various embedded objects and features such as charts, macros, pictures, comments, hyperlinks, autofilters, advanced filters, pivot tables, conditional formatting, and data validation. Only the raw data content is extracted.
Install
-
pip install excelrd
Imports
- excelrd
import excelrd
- open_workbook
import excelrd book = excelrd.open_workbook("filename.xls")
Quickstart
import excelrd
import os
# Create a dummy Excel file for demonstration (replace with your actual file)
excel_file_path = "namesdemo.xls"
# In a real scenario, you'd have an existing .xls or .xlsx file.
# For a runnable example, we'll simulate opening one.
# Assuming 'namesdemo.xls' exists with at least 3 sheets.
# In a real app, ensure the file exists.
# Example of opening and reading data from an Excel file
try:
book = excelrd.open_workbook(excel_file_path)
print(f"The number of worksheets is {book.nsheets}")
print(f"Worksheet name(s): {', '.join(book.sheet_names())}")
# Access a specific sheet by index (e.g., the first sheet)
sh = book.sheet_by_index(0)
print(f"{sh.name}: rows={sh.nrows}, cols={sh.ncols}")
# Iterate through rows and print cell values
for row_idx in range(sh.nrows):
for col_idx in range(sh.ncols):
cell = sh.cell(row_idx, col_idx)
if cell.value is not None and str(cell.value).strip(): # Check for non-empty cells
print(f"row={row_idx}, col={col_idx}, value={cell.value}")
except FileNotFoundError:
print(f"Error: Excel file '{excel_file_path}' not found. Please create it or provide a valid path.")
except Exception as e:
print(f"An error occurred: {e}")