Presto Types Parser
`presto-types-parser` is a small Python library that parses rows returned by the Presto REST API and converts each value according to its declared Presto data type. Complex types (arrays, maps, and rows) and primitive types are converted into their Python equivalents. The current version is 0.0.2; with only one public release, its release cadence appears to be very slow.
Common errors
- IndexError: list index out of range
  cause: The number of values in a row did not match the number of types specified, or a nested array/map structure within a row was malformed.
  fix: Ensure that `len(row)` is equal to `len(types)` for every `row` in your `rows` list. For nested types, verify that the data structure in the `rows` matches the declared type (e.g., a list for `ARRAY<...>`, a dictionary for `MAP<...>`).
- KeyError: 'UNKNOWN_TYPE'
  cause: An unsupported or misspelled Presto type string was provided in the `types` list.
  fix: Double-check the spelling of all type strings in your `types` list. Only standard Presto types like `INTEGER`, `VARCHAR`, `BOOLEAN`, `DOUBLE`, `ARRAY<...>`, `MAP<...>`, `ROW<...>`, `JSON` are supported. Refer to Presto documentation for valid type names.
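Both errors can often be caught before calling the parser. The following is a minimal sketch of such a pre-check in plain Python; it does not call `presto-types-parser` at all. The helper names `find_length_mismatches` and `find_unlisted_types` are hypothetical, and `LISTED_TYPE_NAMES` contains only the type names mentioned in this section, so the set the library actually supports may be larger.

```python
# Type names taken from the list in this section; the library's real set may be larger.
LISTED_TYPE_NAMES = {"INTEGER", "VARCHAR", "BOOLEAN", "DOUBLE", "ARRAY", "MAP", "ROW", "JSON"}


def find_length_mismatches(rows, types):
    """Return (row_index, actual_length) for each row whose length differs from len(types)."""
    expected = len(types)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


def find_unlisted_types(types):
    """Return type strings whose base name (text before '<' or '(') is not in LISTED_TYPE_NAMES."""
    unlisted = []
    for type_name in types:
        base = type_name.replace("(", "<").split("<", 1)[0].strip().upper()
        if base not in LISTED_TYPE_NAMES:
            unlisted.append(type_name)
    return unlisted


rows = [[1, "value_a", True], [2, "value_b"]]   # second row is one value short
types = ["INTEGER", "VARCHR", "BOOLEAN"]        # "VARCHR" is a deliberate typo

print(find_length_mismatches(rows, types))  # [(1, 2)]
print(find_unlisted_types(types))           # ['VARCHR']
```

An empty list from both helpers means the inputs pass these two checks; it does not guarantee that the parser accepts them.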
Warnings
- gotcha The parser expects input rows and types to strictly adhere to the Presto API's format. A mismatch between the number of values in a row and the number of provided types, or a malformed string for a complex type such as `ARRAY<VARCHAR>`, will lead to parsing errors or incorrect output.
- gotcha Error messages for complex or deeply nested type mismatches might be generic. Debugging issues with `ARRAY`, `MAP`, or `ROW` types may require manual inspection of the input data and type definitions.
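When an error message does not say which column is at fault, a quick manual inspection can be scripted. The sketch below is plain Python and independent of the library. It checks only the top level of each value (a list for `ARRAY<...>`, a dictionary for `MAP<...>`) and does not descend into nested elements. The helper name `find_shape_problems` is hypothetical, and treating `None` as valid for every column is an assumption based on the Quickstart data.

```python
def find_shape_problems(row, types):
    """Return (column_index, type_string, python_type_name) for top-level ARRAY/MAP mismatches."""
    problems = []
    for index, (value, type_name) in enumerate(zip(row, types)):
        if value is None:
            continue  # assumption: NULL is acceptable for any column type
        base = type_name.strip().upper()
        if base.startswith("ARRAY") and not isinstance(value, list):
            problems.append((index, type_name, type(value).__name__))
        elif base.startswith("MAP") and not isinstance(value, dict):
            problems.append((index, type_name, type(value).__name__))
    return problems


row = [1, "not-a-list", {"k": 1}]
types = ["INTEGER", "ARRAY<VARCHAR>", "MAP<VARCHAR, INTEGER>"]

print(find_shape_problems(row, types))  # [(1, 'ARRAY<VARCHAR>', 'str')]
```

Running this over each row narrows a generic error down to a column index and the Python type that was actually found there.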
Install
-
pip install presto-types-parser
Imports
- parse_presto_rows
from presto_types_parser import parse_presto_rows
- parse_presto_rows_with_names
from presto_types_parser import parse_presto_rows_with_names
Quickstart
from presto_types_parser import parse_presto_rows

rows = [[1, 'value_a', True], [2, 'value_b', False], [3, None, True]]
types = ['INTEGER', 'VARCHAR', 'BOOLEAN']

parsed_data = parse_presto_rows(rows, types)
print(parsed_data)
# Expected output:
# [[1, 'value_a', True], [2, 'value_b', False], [3, None, True]]