{"id":9925,"library":"matrice-inference","title":"Matrice Inference Utilities","description":"matrice-inference is a Python library providing common server utilities for Matrice.ai services, specifically designed for building and deploying machine learning inference services using FastAPI. It offers foundational classes and models for defining inference endpoints, handling requests, and managing service lifecycle. Currently at version 0.1.166, its pre-1.0 status suggests a rapid release cadence with potential API changes.","status":"active","version":"0.1.166","language":"en","source_language":"en","source_url":"https://github.com/MatriceAI/matrice-inference","tags":["ML","AI","inference","FastAPI","server","utilities","deep learning"],"install":[{"cmd":"pip install matrice-inference","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core web framework for building inference APIs.","package":"fastapi"},{"reason":"ASGI server for running the FastAPI application.","package":"uvicorn"},{"reason":"Data validation and settings management, heavily used for request/response models and configuration.","package":"pydantic"},{"reason":"Numerical computing, common in ML workflows.","package":"numpy"},{"reason":"Backports and enhancements for Python's typing module.","package":"typing_extensions"},{"reason":"OpenCV library without GUI dependencies, often used for image processing in ML.","package":"opencv-python-headless"},{"reason":"Flexible logging library used internally.","package":"loguru"},{"reason":"Support for parsing multipart/form-data, used by FastAPI for file uploads.","package":"python-multipart"},{"reason":"Cross-platform library for retrieving process and system utilization.","package":"psutil"}],"imports":[{"note":"This is the abstract base class for defining your custom inference logic.","symbol":"BaseInferenceService","correct":"from matrice_inference.base_inference import BaseInferenceService"},{"note":"Function to create a FastAPI 
application based on your custom inference service.","symbol":"create_app","correct":"from matrice_inference.api.app import create_app"},{"note":"Pydantic model for service configuration.","symbol":"InferenceConfig","correct":"from matrice_inference.config import InferenceConfig"}],"quickstart":{"code":"import uvicorn\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\nimport asyncio  # Used for the simulated async work in warmup\n\nfrom matrice_inference.api.app import create_app\nfrom matrice_inference.base_inference import BaseInferenceService\nfrom matrice_inference.config import InferenceConfig\n\n# Define your custom request and response models\nclass MyInferenceRequest(BaseModel):\n    text: str\n    upper_case: bool = False\n\nclass MyInferenceResponse(BaseModel):\n    processed_text: str\n    original_length: int\n\n# Implement your inference service\nclass MyService(BaseInferenceService[MyInferenceRequest, MyInferenceResponse]):\n    def __init__(self, config: InferenceConfig):\n        super().__init__(config)\n        self.is_ready = False\n        print(f\"Service '{config.service_name}' initialized.\")\n\n    async def warmup(self):\n        \"\"\"Simulate loading a model.\"\"\"\n        print(\"Warming up MyService...\")\n        await asyncio.sleep(0.01) # Simulate async I/O\n        self.is_ready = True\n        print(\"MyService is ready.\")\n\n    async def predict(self, request: MyInferenceRequest) -> MyInferenceResponse:\n        \"\"\"Perform actual inference.\"\"\"\n        if not self.is_ready:\n            raise RuntimeError(\"Service not warmed up.\")\n\n        processed_text = request.text\n        if request.upper_case:\n            processed_text = request.text.upper()\n\n        return MyInferenceResponse(\n            processed_text=processed_text,\n            original_length=len(request.text)\n        )\n\n# Create a minimal configuration\ninference_config = InferenceConfig(\n  
  service_name=\"MyUpperCaseService\",\n    model_name=\"text_processor\",\n    model_version=\"1.0.0\"\n)\n\n# Instantiate your service\nmy_service = MyService(inference_config)\n\n# Create the FastAPI application\napp: FastAPI = create_app(\n    inference_service=my_service,\n    request_model=MyInferenceRequest,\n    response_model=MyInferenceResponse\n)\n\n# Save as `main.py` and run with: uvicorn main:app --host 0.0.0.0 --port 8000\n# (or simply `python main.py`, which uses the block below), then open\n# http://localhost:8000/docs in your browser to test the API.\nif __name__ == \"__main__\":\n    uvicorn.run(app, host=\"0.0.0.0\", port=8000)","lang":"python","description":"This quickstart demonstrates how to create a basic inference service with `matrice-inference`: define custom request/response models, implement `BaseInferenceService` with `warmup` and `predict` methods, configure the service, and use `create_app` to generate a runnable FastAPI application. Save the code as `main.py` and run it with `uvicorn main:app --host 0.0.0.0 --port 8000` (or simply `python main.py`). You can then interact with the generated API at `http://localhost:8000/docs`."},"warnings":[{"fix":"Refer to the GitHub repository's latest source code for up-to-date API usage. Pin dependencies to exact versions (e.g., `matrice-inference==0.1.166`) and test thoroughly before upgrading.","message":"As a pre-1.0 library (version 0.1.x), `matrice-inference` does not guarantee API stability between minor releases. 
Expect frequent breaking changes to class signatures, function parameters, or module structures without explicit warnings in a changelog.","severity":"breaking","affected_versions":"All versions < 1.0.0"},{"fix":"Ensure your service class properly subclasses `BaseInferenceService` and implements both `async def warmup(self):` and `async def predict(self, request: RequestType) -> ResponseType:` methods.","message":"The `BaseInferenceService` is an abstract base class requiring all abstract methods (`warmup` and `predict`) to be implemented as `async` functions in your concrete service class. Omitting a method raises a `TypeError` at instantiation; implementing one as a plain (non-`async`) function passes instantiation but raises a `TypeError` later, when its return value is awaited.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use a dedicated virtual environment. Ensure all project dependencies are compatible with the required Pydantic v2 range. If conflicts arise, consider using tools like `pipdeptree` or `poetry` to resolve dependency graph issues, or temporarily isolate `matrice-inference` in a microservice.","message":"`matrice-inference` has strict Pydantic v2 dependencies (e.g., `pydantic>=2.7.0,<2.8.0`). If your project uses Pydantic v1 or an incompatible Pydantic v2 range due to other dependencies, you will encounter `PydanticImportError` or other runtime issues.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Carefully review the source code for implicit configurations or dependencies. Be prepared to debug unexpected startup or runtime issues that might stem from environment mismatches. If possible, consult the maintainers for clarification on external use.","message":"This library is primarily an internal utility for Matrice.ai. 
Its documentation for external users is minimal, and some assumptions about the operational environment or pre-existing infrastructure might not be explicitly stated, leading to unexpected behavior in different setups.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Ensure `matrice-inference` is installed in your active environment: `pip install matrice-inference`. Double-check the import statement for typos: `from matrice_inference.base_inference import BaseInferenceService`.","cause":"The `matrice-inference` package is not installed or the import path is incorrect.","error":"ModuleNotFoundError: No module named 'matrice_inference.base_inference'"},{"fix":"Implement both `async def warmup(self):` and `async def predict(self, request: RequestType) -> ResponseType:` in your service class, ensuring they are defined as `async` functions.","cause":"You have subclassed `BaseInferenceService` but have not implemented one or both of the required abstract methods (`warmup` and `predict`).","error":"TypeError: Can't instantiate abstract class MyService with abstract methods predict, warmup"},{"fix":"Upgrade Pydantic to the required v2 range: `pip install \"pydantic>=2.7.0,<2.8.0\" --upgrade`. If other packages prevent this, consider using a separate virtual environment or a dependency management tool to resolve conflicts.","cause":"Your Python environment has a conflicting version of Pydantic. `matrice-inference` specifically requires Pydantic v2 (e.g., `pydantic>=2.7.0,<2.8.0`).","error":"pydantic.errors.PydanticImportError: Pydantic V1 is installed, but Matrice Inference requires Pydantic V2."},{"fix":"Ensure your `warmup` method correctly sets the service's ready state (e.g., `self.is_ready = True`) and that `create_app` is used, allowing FastAPI to manage the startup lifecycle. 
Do not call `predict` manually before the service has fully started.","cause":"Your custom `predict` method in `MyService` was called before the `warmup` method completed, or before the service's `is_ready` flag was set to True. The `create_app` function automatically sets up a startup event to call `warmup`, but manual calls or improper `is_ready` logic can lead to this.","error":"RuntimeError: Service not warmed up."}]}