Adjust Precision for Schema

0.3.4 · active · verified Mon Apr 13

This library (version 0.3.4) is intended for Singer.io data integration targets. It addresses precision differences that can arise between source systems, Python's native numeric types, and target data warehouses or databases, with the goal of keeping decimal and floating-point values consistent and accurate through the ETL process. The release cadence appears to be irregular, based on available PyPI data.
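To make the precision problem concrete, the short sketch below uses only the Python standard library (not this package) to show how binary floats and `decimal.Decimal` disagree on simple values, and how a JSON payload can be parsed without passing through floats. The sample payload is made up for illustration.

```python
import json
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)                # False
print(decimal_sum == Decimal("0.3"))   # True

# Parsing JSON numbers straight into Decimal avoids the float round trip.
# The payload here is an illustrative example, not real data.
parsed = json.loads('{"price": 123.456789}', parse_float=Decimal)
print(type(parsed["price"]).__name__)  # Decimal
```

A target that receives floats from a tap and writes to a fixed-scale `NUMERIC` column can hit this kind of mismatch, which is the situation the library is described as handling.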

Warnings

Install

Imports

Quickstart

This example shows how an `adjust_precision` function (a hypothesized name, inferred from the library's purpose) might be used in a Singer.io data pipeline. It takes a data record and a JSON Schema and adjusts numeric values in the record to the precision and scale implied by the schema, in particular for fields that carry `"_singer_type": "decimal"` and `"multipleOf"`.

import json
from adjust_precision_for_schema import adjust_precision

# Example Singer SCHEMA message (simplified)
# This schema defines a 'price' field with a logical 'decimal' type
# and an implied precision/scale (e.g., up to 2 decimal places).
schema_message = {
    "type": "SCHEMA",
    "stream": "products",
    "schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "name": {"type": "string"},
            "price": {
                "type": ["number", "null"],
                "_singer_type": "decimal",
                "maximum": 1000000000000000000000000000000000000.00,
                "multipleOf": 0.01
            }
        }
    },
    "key_properties": ["id"]
}

# Example Singer RECORD message
record_message = {
    "type": "RECORD",
    "stream": "products",
    "record": {
        "id": 1,
        "name": "Product A",
        "price": 123.456789  # Value with more precision than schema intends
    }
}

# Another record with a value that should be adjusted minimally
record_message_2 = {
    "type": "RECORD",
    "stream": "products",
    "record": {
        "id": 2,
        "name": "Product B",
        "price": 99.99999999999999 # Value that should round up
    }
}

# Hypothetical function call to adjust precision based on the schema
# The exact API (e.g., arguments, return type) is inferred.
adjusted_record_1 = adjust_precision(record_message['record'], schema_message['schema'])
adjusted_record_2 = adjust_precision(record_message_2['record'], schema_message['schema'])

print("Original Record 1 Price:", record_message['record']['price'])
print("Adjusted Record 1 Price:", adjusted_record_1['price'])

print("Original Record 2 Price:", record_message_2['record']['price'])
print("Adjusted Record 2 Price:", adjusted_record_2['price'])
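For readers who want to see the kind of adjustment the example above is aiming at without relying on the inferred API, here is a minimal standard-library sketch. `round_to_multiple_of` and `adjust_record` are hypothetical helpers written for this illustration, not functions from the library, and the rounding mode (`ROUND_HALF_EVEN`) is an assumption.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to_multiple_of(value, multiple_of):
    """Hypothetical helper: round value to the decimal places of multiple_of."""
    # Convert through str() so Decimal sees the short repr (e.g. 0.01)
    # rather than the float's full binary expansion.
    step = Decimal(str(multiple_of))
    return Decimal(str(value)).quantize(step, rounding=ROUND_HALF_EVEN)


def adjust_record(record, schema):
    """Hypothetical helper: adjust numeric fields whose schema has multipleOf."""
    adjusted = dict(record)  # shallow copy; the input record is left unchanged
    for name, spec in schema.get("properties", {}).items():
        value = record.get(name)
        is_number = isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)
        if "multipleOf" in spec and is_number:
            adjusted[name] = round_to_multiple_of(value, spec["multipleOf"])
    return adjusted


schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "price": {"type": ["number", "null"], "multipleOf": 0.01},
    },
}

print(adjust_record({"id": 1, "price": 123.456789}, schema)["price"])          # 123.46
print(adjust_record({"id": 2, "price": 99.99999999999999}, schema)["price"])   # 100.00
```

This sketch only uses the number of decimal places in `multipleOf`, so a value such as `0.05` would be treated as "two decimal places" rather than as a true multiple constraint. Null values and fields without `multipleOf` pass through untouched.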
