gllm-inference-binary (GenAI LLM Inference Binary)

0.6.32 verified Fri May 01 auth: no python

A library of components for model inference in Gen AI applications. It provides optimized binary inference modules for large language models. The current version is 0.6.32 and requires Python >=3.11,<3.14.
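
The version constraint above can be checked before installing. The following is a minimal sketch that uses only the standard library; the helper name `python_is_supported` is hypothetical and is not part of the package.

```python
import sys

# Supported range taken from the description above: >=3.11,<3.14.
MIN_VERSION = (3, 11)
MAX_VERSION_EXCLUSIVE = (3, 14)


def python_is_supported(version=None):
    """Return True if the (major, minor) version lies in >=3.11,<3.14.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


# Report whether the current interpreter falls inside the supported range.
current = ".".join(str(part) for part in sys.version_info[:2])
status = "supported" if python_is_supported() else "not supported"
print(f"Python {current} is {status}")
```

Comparing `(major, minor)` tuples avoids the string-comparison pitfall where "3.9" sorts after "3.11".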

pip install gllm-inference-binary
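
After installing, it can help to confirm which distribution and which import name are actually present in the environment. This is a rough sketch using only `importlib` from the standard library; the helper names are hypothetical, and the two module names checked are the ones mentioned on this page.

```python
from importlib import metadata
from importlib.util import find_spec


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Distribution name as used in the pip command above.
print("gllm-inference-binary version:", installed_version("gllm-inference-binary"))

# Import names mentioned on this page (new name first, then the old one).
for name in ("gllm_inference_binary", "gllm_inference"):
    print(f"{name} importable:", module_available(name))
```
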
error ModuleNotFoundError: No module named 'gllm_inference'
cause Trying to import the old package name; the package was renamed to gllm-inference-binary.
fix Run `pip uninstall gllm-inference` and `pip install gllm-inference-binary`. Then use `from gllm_inference_binary import ...`.
error AttributeError: module 'gllm_inference_binary' has no attribute 'BinaryModel'
cause BinaryModel is not directly in the top-level module; it's in the `models` submodule.
fix Use `from gllm_inference_binary.models import BinaryModel`.
breaking Requires Python >=3.11,<3.14; Python 3.10 and below are not supported.
fix Upgrade Python to 3.11, 3.12, or 3.13.
deprecated The old import path `from gllm_inference import ...` is deprecated and will be removed in v1.0. Use `gllm_inference_binary` instead.
fix Replace `gllm_inference` with `gllm_inference_binary` in all imports.
gotcha Model loading may download large files on first use; make sure there is enough disk space and a working network connection.
fix Set the environment variable `GLLM_CACHE_DIR` to a directory with sufficient space.
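
The cache-directory fix can also be applied from Python. The sketch below creates the directory, checks free space, and exports `GLLM_CACHE_DIR`. The helper `prepare_cache_dir` and the `min_free_gb` threshold are illustrative assumptions, not part of the package, and the sketch assumes the variable is read when the library loads a model, so it should be set before the library is imported.

```python
import os
import shutil
from pathlib import Path


def prepare_cache_dir(path, min_free_gb=10.0):
    """Create the cache directory, check free space, and export GLLM_CACHE_DIR.

    Hypothetical helper; min_free_gb is an arbitrary illustrative threshold,
    not a documented requirement of the package.
    """
    cache_dir = Path(path).expanduser().resolve()
    cache_dir.mkdir(parents=True, exist_ok=True)

    # shutil.disk_usage reports bytes for the filesystem holding cache_dir.
    free_gb = shutil.disk_usage(cache_dir).free / 1024**3
    if free_gb < min_free_gb:
        raise RuntimeError(
            f"Only {free_gb:.1f} GiB free in {cache_dir}; need {min_free_gb} GiB"
        )

    # Assumption: the library reads this variable when it loads a model.
    os.environ["GLLM_CACHE_DIR"] = str(cache_dir)
    return cache_dir
```

Calling, for example, `prepare_cache_dir("~/gllm-cache")` before the first import keeps the setting local to the current process rather than changing the shell profile.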

Basic inference with an explicitly named model (`gpt2`).

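# Import the pipeline class from the renamed package.
from gllm_inference_binary import InferencePipeline
# Build a pipeline for the "gpt2" model; the first use may download model files.
pipeline = InferencePipeline(model_name="gpt2")
# Run inference on a single prompt and print the result.
result = pipeline.run("Hello, world!")
print(result)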