VTL Engine - VTL Script Runner and Validator

1.6.8 verified Mon Apr 27 auth: no python

vtlengine is a Python library for running and validating VTL (Validation and Transformation Language) scripts. Version 1.6.8 supports Python >=3.9 and provides an interpreter with a pandas backend. It is released on PyPI with roughly monthly updates.

pip install vtlengine
error vtlengine.exceptions.VTLSemanticError: Operator '...' not defined
cause The VTL script references an operator or function that is either misspelled or not implemented.
fix Check the VTL syntax and the list of available operators in the official documentation.
error vtlengine.exceptions.VTLSyntaxError: line X:Y mismatched input '...' expecting ...
cause Invalid VTL syntax, often a missing semicolon or unbalanced parentheses.
fix Validate the script using Engine.from_string() and review the line and column reported in the error.
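
The "line X:Y" prefix in the message above can be used to point at the offending spot in the script. The following is a minimal sketch, not part of vtlengine: locate_syntax_error and format_pointer are hypothetical helper names, and the sketch assumes the line number is 1-based and the column is 0-based (the usual ANTLR convention), which may not hold for every version.

```python
import re

# Assumed message shape, taken from the error text above:
#   "line X:Y mismatched input '...' expecting ..."
_LOCATION = re.compile(r"line (\d+):(\d+)")


def locate_syntax_error(message, script):
    """Return (line_no, column, source_line), or None if no location is found."""
    match = _LOCATION.search(message)
    if match is None:
        return None
    line_no, column = int(match.group(1)), int(match.group(2))
    lines = script.splitlines()
    # Assumption: line numbers are 1-based.
    if not 1 <= line_no <= len(lines):
        return None
    return line_no, column, lines[line_no - 1]


def format_pointer(message, script):
    """Build a two-line excerpt with a caret under the reported column."""
    located = locate_syntax_error(message, script)
    if located is None:
        return message  # nothing to point at, return the message unchanged
    line_no, column, source_line = located
    prefix = f"{line_no}: "
    # Assumption: columns are 0-based offsets into the source line.
    return f"{prefix}{source_line}\n{' ' * (len(prefix) + column)}^"
```

Pass str(exc) from the caught exception as message and the original script text as script; if the message carries no location, the helper returns it unchanged.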
error TypeError: __init__() got an unexpected keyword argument 'backend'
cause The Engine constructor no longer accepts a 'backend' parameter; code written against the old API still passes it.
fix Instantiate Engine() without arguments: engine = Engine()
breaking pandas is the default and only supported backend. Omitting the backend parameter selects pandas; explicitly passing any other backend (e.g., 'spark') raises an error.
fix Omit the backend parameter, or pass backend='pandas' only where the installed version still accepts it.
deprecated The legacy time period representation (time_period_output_format='legacy') was added in 1.6.0 but is deprecated and will be removed in a future version.
fix Use ISO 8601 format (default) or the new custom format.
gotcha Since v1.6.4, when a DataStructure is provided, DataFrame columns that do not match it raise an error. Column names and types must align exactly.
fix Define a DataStructure that reflects the DataFrame, or rename and cast the DataFrame columns to match it.
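
A mismatch of this kind can be caught before the data reaches the engine. The following is a minimal sketch, not part of vtlengine: column_mismatches is a hypothetical helper that compares two plain {column: dtype-name} dictionaries. How pandas dtypes map to VTL types is not covered here, so the dtype names are illustrative only.

```python
def column_mismatches(expected, actual):
    """Compare expected {column: dtype} against actual {column: dtype}.

    Returns a dict with the keys 'missing', 'unexpected' and 'wrong_type'.
    """
    # Columns the structure requires but the data lacks.
    missing = sorted(set(expected) - set(actual))
    # Columns present in the data but absent from the structure.
    unexpected = sorted(set(actual) - set(expected))
    # Columns present on both sides whose dtype names differ.
    wrong_type = {
        name: (expected[name], actual[name])
        for name in expected
        if name in actual and expected[name] != actual[name]
    }
    return {"missing": missing, "unexpected": unexpected, "wrong_type": wrong_type}


# Illustrative values only; these dtype names are assumptions.
expected = {"id1": "object", "meas1": "int64", "meas2": "float64"}
actual = {"id1": "object", "meas1": "float64", "extra": "int64"}
report = column_mismatches(expected, actual)
```

For a pandas DataFrame df, the actual mapping can be built with {name: str(dtype) for name, dtype in df.dtypes.items()}.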

Demonstrates basic VTL script execution using the Engine class with pandas DataFrames.

from vtlengine import Engine
import pandas as pd

# Sample data
input_ds = pd.DataFrame({
    'id1': ['A', 'B'],
    'id2': ['X', 'Y'],
    'meas1': [10, 20],
    'meas2': [5.5, 7.2]
})

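# Define a VTL script: a user-defined operator, then a statement
# that applies it and assigns the result to output_ds
vtl_script = """
define operator mul(a dataset, b dataset)
  returns dataset is
    a * b / 2
end operator;

output_ds := mul(input_ds, input_ds);
"""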

# Initialize engine
engine = Engine()

# Run VTL script with datasets
result = engine.run(vtl_script, datasets={'input_ds': input_ds})
print(result['output_ds'])