Tablib: Pythonic Tabular Datasets
Tablib is an MIT-licensed, format-agnostic tabular dataset library for Python that provides a Pythonic way to import, export, and manipulate tabular data. Supported formats include XLS, JSON, YAML, CSV, pandas DataFrames, and HTML. It is maintained by the Jazzband community; the current release is 3.9.0, and it receives regular updates for Python compatibility and bug fixes.
Warnings
- breaking Tablib 2.0.0 reversed the behavior of `Row.lpush` and `Row.rpush`. In versions 1.x, `lpush` appended and `rpush` prepended, contrary to what the names suggest. Since 2.0.0, `lpush` prepends and `rpush` appends, so code that relied on the old behavior must swap the two calls.
- breaking Python 2 support was dropped with Tablib 1.0.0. Python 3.5 support was dropped with Tablib 3.0.0.
- breaking Starting with Tablib 1.0.0, all format dependencies became optional. Previously, they might have been installed by default. To install all possible format dependencies, you now need to use `pip install "tablib[all]"`.
- gotcha When exporting CSV files on Windows, if you don't specify `newline=''` when opening the file in write mode, Excel might display a blank line between each row.
- gotcha The legacy XLS format (Excel 97-2003) is limited to 65,536 rows per sheet (often rounded to 65,000). For larger datasets, use the XLSX format instead.
- gotcha When importing XLSX or ODS files with `read_only=True` (which is the default for XLSX), Tablib relies on the spreadsheet declaring correct dimensions. Some programs generate files with incorrect dimensions, leading to incomplete reads. For XLSX, passing `read_only=False` to the import call turns off this lazy reading.
Install
- pip install tablib
- pip install "tablib[all]"
- pip install "tablib[xlsx, pandas, yaml]"
Imports
- Dataset
from tablib import Dataset
- Databook
from tablib import Databook
Quickstart
import tablib
# Create a new Dataset
data = tablib.Dataset()
# Add headers
data.headers = ['First Name', 'Last Name', 'Age']
# Add rows
data.append(['Kenneth', 'Reitz', 22])
data.append(['Bessie', 'Monke', 20])
# Add a new column with data
data.append_col([True, False], header='Is Student')
print("CSV Export:")
print(data.export('csv'))
print("\nJSON Export:")
print(data.export('json'))
# Example of saving to a file (uncomment to run; XLSX export needs the xlsx extra: pip install "tablib[xlsx]")
# with open('output.xlsx', 'wb') as f:
#     f.write(data.export('xlsx'))
# print("\nData exported to output.xlsx")