Table Schema
A Python utility library for working with Table Schema. It validates schema descriptors, infers schemas from data, and reads and casts tabular data according to the Table Schema standard. It is actively maintained with frequent releases; the current version is 1.21.0. The project carries a notice that the broader Frictionless Framework extends `tableschema`'s functionality and is the more complete data solution.
Warnings
- breaking Version `1.0` introduced significant breaking changes: API renames (`tableschema.push/pull_resource` became `tableschema.Table`, `tableschema.model` became `tableschema.Schema`, `tableschema.types` became `tableschema.Field`) and changed parameter signatures for `Field.cast/test_value`.
- gotcha The `tableschema` library is part of the broader Frictionless Data ecosystem. The Frictionless Framework (a separate library) extends and improves on `tableschema`'s functionality as a more complete data solution. Existing `tableschema` code continues to work, but users are encouraged to consider the Frictionless Framework for new projects or future enhancements.
- gotcha The `validate` function (e.g., `tableschema.validate`) is designed to validate a Table Schema *descriptor* itself, not to validate *data* against a given schema. Passing data directly to `validate` will not perform data validation.
- gotcha The library uses semantic versioning, so a new major version (e.g., v1.x.x to v2.x.x) can introduce breaking changes. Listing `tableschema` in your `requirements.txt` without a version constraint can lead to unexpected breakage when a new major version is released; a bounded specifier such as `tableschema>=1.0,<2.0` avoids this.
Install
pip install tableschema
Imports
- Table
from tableschema import Table
- Schema
from tableschema import Schema
- Field
from tableschema import Field
- validate
from tableschema import validate
- infer
from tableschema import infer
Quickstart
import os
from tableschema import Table, Schema
# Create dummy data and schema files for demonstration
# The trailing backslash keeps the CSV from starting with a blank line
data_csv_content = """\
id,name,age
1,Alice,30
2,Bob,24
3,Charlie,35
"""
schema_json_content = """
{
  "fields": [
    {"name": "id", "type": "integer"},
    {"name": "name", "type": "string"},
    {"name": "age", "type": "integer"}
  ]
}
"""
with open('data.csv', 'w') as f:
    f.write(data_csv_content)
with open('schema.json', 'w') as f:
    f.write(schema_json_content)
# 1. Create a Table instance with data and schema
table = Table('data.csv', schema='schema.json')
# 2. Print schema descriptor
print("Schema Descriptor:", table.schema.descriptor)
# 3. Read and print data as keyed rows
print("\nData Rows (keyed):")
for keyed_row in table.iter(keyed=True):
    print(keyed_row)
# 4. Infer a schema from in-memory rows (infer() is an instance method of Schema)
headers = ['id', 'name', 'age']
rows = [[1, 'Alice', 30], [2, 'Bob', 24], [3, 'Charlie', 35]]
schema = Schema()
inferred_schema_descriptor = schema.infer(rows, headers=headers)
print("\nInferred Schema Descriptor:", inferred_schema_descriptor)
# Clean up dummy files
os.remove('data.csv')
os.remove('schema.json')