{"id":8231,"library":"inference-schema","title":"Inference Schema","description":"The `inference-schema` package provides a uniform schema definition for common machine learning applications and is designed for web-based ML prediction services. Its decorators (`@input_schema`, `@output_schema`) validate and serialize input/output data against user-defined schemas, so a scoring function can be exposed through a web framework with a declared contract. The current version is 1.8; updates are periodic and often tied to dependency version bumps or feature additions for ML deployments.","status":"active","version":"1.8","language":"en","source_language":"en","source_url":"https://github.com/Azure/inference-schema","tags":["machine-learning","schema","validation","mlops","web-services","api"],"install":[{"cmd":"pip install inference-schema","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core library for schema definition and validation.","package":"marshmallow","optional":false},{"reason":"Required for `NumpyParameterType` and often for internal data handling.","package":"numpy","optional":false},{"reason":"Required for `PandasParameterType` and commonly used for structured data input/output.","package":"pandas","optional":false}],"imports":[{"symbol":"input_schema","correct":"from inference_schema.schema_decorators import input_schema"},{"symbol":"output_schema","correct":"from inference_schema.schema_decorators import output_schema"},{"symbol":"PandasParameterType","correct":"from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType"},{"symbol":"NumpyParameterType","correct":"from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType"},{"symbol":"StandardPythonParameterType","correct":"from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType"}],"quickstart":{"code":"from inference_schema.schema_decorators import input_schema, output_schema\nfrom inference_schema.parameter_types.pandas_parameter_type import PandasParameterType\nfrom inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType\nimport pandas as pd\n\n# Define sample input and output data structures\n# These samples are used to infer the schema for 
validation and serialization\nsample_input_df = pd.DataFrame({'feature1': [10.0, 20.0], 'feature2': [30.0, 40.0]})\nsample_output_dict = {'prediction': [40.0, 60.0]}\n\n# input_schema takes the name of the decorated function's parameter, then its parameter type.\n# output_schema takes a parameter type, so the plain dict is wrapped in StandardPythonParameterType.\n@input_schema('input_data', PandasParameterType(sample_input_df))\n@output_schema(StandardPythonParameterType(sample_output_dict))\ndef predict(input_data: pd.DataFrame) -> dict:\n    \"\"\"\n    A dummy prediction function that takes a DataFrame and returns a dictionary.\n    The decorators handle validation of `input_data` and serialization of the return value.\n    \"\"\"\n    # Example prediction logic: sum of features\n    predictions = (input_data['feature1'] + input_data['feature2']).tolist()\n    return {'prediction': predictions}\n\n# --- Example usage ---\n# This is how you'd typically call it, with input that matches the schema\ninput_for_prediction = pd.DataFrame({'feature1': [5.0, 15.0], 'feature2': [25.0, 35.0]})\nresult = predict(input_for_prediction)\nprint(f\"Predicted result: {result}\")\n\n# If used in a web service, the input might come as JSON and be deserialized\n# and validated into a DataFrame before reaching the `predict` function.\n# Example (assuming the default 'records' orientation, keyed by the parameter name):\n# raw_json_input = '{\"input_data\": [{\"feature1\": 5.0, \"feature2\": 25.0}, {\"feature1\": 15.0, \"feature2\": 35.0}]}'\n# (framework would parse, inference-schema would validate/convert)","lang":"python","description":"This quickstart demonstrates how to define input and output schemas for a Python function using `inference-schema` decorators. It uses `PandasParameterType` for structured DataFrame input and `StandardPythonParameterType` wrapping a simple dictionary for output. `@input_schema` is given the name of the function parameter it applies to (`input_data`) followed by the parameter type. The decorators validate the incoming `input_data` against `sample_input_df` and ensure the function's return value conforms to `sample_output_dict`'s structure."},"warnings":[{"fix":"Always ensure your `sample_input` and `sample_output` accurately reflect the exact column names, keys, and data types (e.g., float, int, string) that your function expects and returns.","message":"The `sample_input` and `sample_output` provided to the decorators are critical. 
They define the *structure and data types* of the expected input and output, not just placeholder values. Mismatches between the actual data at runtime and these samples will cause schema validation errors.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Align your project's `marshmallow` version with the range specified by `inference-schema`, or consider using a dedicated virtual environment to isolate dependencies.","message":"Inference-schema pins its core dependency, `marshmallow`, to specific version ranges (e.g., `<3.18.0` for v1.8). If your project uses a different `marshmallow` version, it can lead to dependency conflicts or unexpected validation behavior.","severity":"breaking","affected_versions":"All versions"},{"fix":"Ensure that the input to the decorated function is a `pandas.DataFrame` or that the web framework integration correctly deserializes the raw request body into a DataFrame before passing it to your function.","message":"When using `PandasParameterType`, the decorated function is expected to receive a `pandas.DataFrame` object. If you directly call the function with a different type (e.g., a dictionary or list) without it being processed by the schema, it will likely fail.","severity":"gotcha","affected_versions":"All versions using `PandasParameterType`"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Verify that the data types in your actual input data (e.g., float vs. int, string vs. number) precisely match the types present in your `sample_input` DataFrame/dictionary.","cause":"The input data type for a specific field did not match the type inferred from the `sample_input` schema.","error":"marshmallow.exceptions.ValidationError: {'field_name': ['Invalid type.']}"},{"fix":"Ensure the function's return value strictly conforms to the structure implied by the `sample_output` provided to `@output_schema`. 
If `output_schema` expects a list of numbers, convert your DataFrame column to a list using `.tolist()`.","cause":"The decorated function returned a dictionary, but the `output_schema` implied that a Pandas DataFrame was expected, or vice versa, leading to an incompatible method call during serialization.","error":"AttributeError: 'dict' object has no attribute 'tolist'"},{"fix":"Ensure your `output_schema` (the `sample_output` dictionary/list) defines a structure that is inherently JSON-serializable (e.g., nested dictionaries and lists of primitive types). If your function produces a `DataFrame`, convert it before returning, for example with `df.to_dict(orient='records')`.","cause":"This error typically occurs when a web framework tries to serialize the output of your decorated function (for example a `pandas.DataFrame`) directly to JSON before it has been converted to a JSON-compatible type.","error":"TypeError: Object of type 'DataFrame' is not JSON serializable"}]}