{"library":"matrice-inference","title":"Matrice Inference Utilities","description":"matrice-inference is a Python library providing common server utilities for Matrice.ai services, designed for building and deploying machine learning inference services with FastAPI. It offers foundational classes and models for defining inference endpoints, handling requests, and managing the service lifecycle. The current version is 0.1.166; as a pre-1.0 package with frequent releases, its API may change between versions.","language":"python","status":"active","last_verified":"Fri Apr 17","install":{"commands":["pip install matrice-inference"],"cli":null},"imports":["from matrice_inference.base_inference import BaseInferenceService","from matrice_inference.api.app import create_app","from matrice_inference.config import InferenceConfig"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import asyncio  # Used for asyncio.sleep in warmup\n\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\n\nfrom matrice_inference.api.app import create_app\nfrom matrice_inference.base_inference import BaseInferenceService\nfrom matrice_inference.config import InferenceConfig\n\n# Define your custom request and response models\nclass MyInferenceRequest(BaseModel):\n    text: str\n    upper_case: bool = False\n\nclass MyInferenceResponse(BaseModel):\n    processed_text: str\n    original_length: int\n\n# Implement your inference service\nclass MyService(BaseInferenceService[MyInferenceRequest, MyInferenceResponse]):\n    def __init__(self, config: InferenceConfig):\n        super().__init__(config)\n        self.is_ready = False\n        print(f\"Service '{config.service_name}' initialized.\")\n\n    async def warmup(self):\n        \"\"\"Simulate loading a model.\"\"\"\n        print(\"Warming up MyService...\")\n        await asyncio.sleep(0.01)  # Simulate async I/O\n        self.is_ready = True\n        print(\"MyService is ready.\")\n\n    async def predict(self, request: MyInferenceRequest) -> MyInferenceResponse:\n        \"\"\"Perform actual inference.\"\"\"\n        if not self.is_ready:\n            raise RuntimeError(\"Service not warmed up.\")\n\n        processed_text = request.text\n        if request.upper_case:\n            processed_text = request.text.upper()\n\n        return MyInferenceResponse(\n            processed_text=processed_text,\n            original_length=len(request.text)\n        )\n\n# Create a minimal configuration\ninference_config = InferenceConfig(\n    service_name=\"MyUpperCaseService\",\n    model_name=\"text_processor\",\n    model_version=\"1.0.0\"\n)\n\n# Instantiate your service\nmy_service = MyService(inference_config)\n\n# Create the FastAPI application\napp: FastAPI = create_app(\n    inference_service=my_service,\n    request_model=MyInferenceRequest,\n    response_model=MyInferenceResponse\n)\n\n# To run this, save as `main.py` and execute in your terminal:\n# uvicorn main:app --host 0.0.0.0 --port 8000\n# Then open http://localhost:8000/docs in your browser to test the API.","lang":"python","description":"This quickstart demonstrates how to create a basic inference service using `matrice-inference`. It defines custom request and response models, implements `BaseInferenceService` with `warmup` and `predict` methods, configures the service, and uses `create_app` to generate a runnable FastAPI application. Save the code as `main.py` and run it with `uvicorn main:app --host 0.0.0.0 --port 8000`. You can then interact with the generated API at `http://localhost:8000/docs`.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":null}