Tabulator
Tabulator provides a consistent interface for reading and writing tabular data from various sources (e.g., CSV, Excel, JSON, Google Sheets) in a streaming fashion. It aims to simplify data integration by abstracting away the underlying file formats. The current version is 1.53.5, and it maintains a regular release cadence with minor updates and bug fixes.
Common errors
- ModuleNotFoundError: No module named 'tabulator'
  cause: The tabulator library is not installed in your current Python environment.
  fix: Install the library: `pip install tabulator`
- ModuleNotFoundError: No module named 'openpyxl'
  cause: You are trying to read an XLSX (Excel) file, but the `openpyxl` dependency is not installed.
  fix: Install the required extra: `pip install tabulator[xlsx]`
- UnicodeDecodeError: 'utf-8' codec can't decode byte 0x__ in position __: invalid start byte
  cause: Your CSV file is not encoded in UTF-8, but tabulator is trying to decode it as such by default.
  fix: Specify the correct encoding for your file, e.g., `Stream('data.csv', encoding='latin-1')` or `encoding='cp1252'`.
- FileNotFoundError: [Errno 2] No such file or directory: 'your_file.csv'
  cause: The specified file path is incorrect or the file does not exist at that location.
  fix: Check the path and confirm the file exists. Relative paths are resolved against the current working directory of the process, which is not necessarily the directory containing your script.
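The `UnicodeDecodeError` fix above can be automated when the encoding is not known in advance. The following is a minimal sketch, not part of tabulator itself: `pick_encoding` and `read_rows` are hypothetical helper names, and the candidate list is an assumption you should adapt to your data. Note that `latin-1` decodes any byte sequence, so it only works as a last resort.

```python
from pathlib import Path


def pick_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper: it only proves the bytes are decodable,
    not that the chosen encoding is the one the author intended.
    """
    raw = Path(path).read_bytes()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError(f"None of {candidates} can decode {path}")


def read_rows(path):
    """Open a local CSV with the detected encoding and return (headers, rows)."""
    # Imported lazily so pick_encoding stays usable without tabulator installed.
    from tabulator import Stream

    with Stream(path, headers=1, encoding=pick_encoding(path)) as stream:
        return stream.headers, stream.read()
```

Calling `read_rows('data.csv')` then behaves like the manual fix, with the `encoding` argument chosen by trial decoding instead of by hand.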
Warnings
- breaking: Tabulator v1.0.0 introduced a major API refactor. The `Table` class was removed and replaced entirely by the `Stream` class. Older code using `Table` will no longer work.
- gotcha: Reading certain file formats (e.g., Excel, JSON Lines, remote files) requires installing optional dependencies. Without them, you will encounter `ModuleNotFoundError`.
- gotcha: CSV files often have encoding issues. A `UnicodeDecodeError` usually means the file is not UTF-8; pass the file's actual encoding to `Stream` via the `encoding` argument.
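The optional-dependency gotcha can be caught before a stream is opened. This is a rough sketch using only the standard library; the `REQUIRED_MODULES` mapping and the `missing_dependency` name are assumptions made for illustration (the `.xlsx` entry follows the `openpyxl` error described in this page, the `.xls` entry assumes `xlrd`), not something tabulator exposes.

```python
from importlib.util import find_spec
from pathlib import Path

# Assumed mapping from file suffix to the module needed to parse it.
REQUIRED_MODULES = {
    ".xlsx": "openpyxl",
    ".xls": "xlrd",
}


def missing_dependency(path):
    """Return the name of the module that is needed but not importable, or None."""
    module = REQUIRED_MODULES.get(Path(path).suffix.lower())
    if module is None:
        # No known extra requirement for this suffix.
        return None
    # find_spec returns None when a top-level module cannot be found.
    return None if find_spec(module) is not None else module
```

A caller might check `missing_dependency('report.xlsx')` and print an install hint instead of letting `ModuleNotFoundError` surface from inside the stream.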
Install
- pip install tabulator
- pip install tabulator[xlsx,xls,jsonl,http]
Imports
- Stream
  wrong: from tabulator import Table (the `Table` class was removed in v1.0.0)
  correct: from tabulator import Stream
Quickstart
import os
from tabulator import Stream

# Example: read a CSV file from a URL
csv_url = "https://raw.githubusercontent.com/frictionlessdata/tabulator-py/master/data/table.csv"

# Ensure the URL is accessible or provide a local path
try:
    # headers=1 treats the first row as the header row;
    # without it, stream.headers is None and the header comes back as data.
    with Stream(csv_url, headers=1) as stream:
        print(f"Headers: {stream.headers}")
        print("First 5 rows:")
        for i, row in enumerate(stream):
            if i >= 5:
                break
            print(row)
except Exception as e:
    print(f"Could not read data from {csv_url}: {e}")

# Example: reading a local CSV file (ensure 'example.csv' exists)
# You can create a dummy file for local testing:
# with open('example.csv', 'w') as f:
#     f.write('id,name\n1,Alice\n2,Bob')
#
# local_csv_path = 'example.csv'
# if os.path.exists(local_csv_path):
#     with Stream(local_csv_path, headers=1) as stream:
#         print(f"\nLocal CSV Headers: {stream.headers}")
#         print(f"Local CSV First row: {next(iter(stream))}")
# else:
#     print(f"\nSkipping local CSV example: {local_csv_path} not found.")
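As a variation on the quickstart loop, the first few rows can be collected as dictionaries keyed by header name. This is a minimal sketch: `preview` is a hypothetical helper name, and it assumes the source has a header in its first row (`headers=1`). It relies on `Stream.iter(keyed=True)` and uses `itertools.islice` in place of the manual counter.

```python
from itertools import islice


def preview(source, limit=5, **options):
    """Return (headers, rows), where rows holds at most `limit` keyed rows.

    Extra keyword arguments (for example encoding='cp1252') are passed
    through to Stream unchanged.
    """
    # Imported lazily so this module can be loaded without tabulator installed.
    from tabulator import Stream

    with Stream(source, headers=1, **options) as stream:
        headers = stream.headers
        # islice stops after `limit` rows, so the rest of the source is never read.
        rows = list(islice(stream.iter(keyed=True), limit))
    return headers, rows
```

For example, `headers, rows = preview('example.csv', limit=2)` would return the header list and up to two dictionaries, one per data row.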