GLiNER2: Unified Schema-Based Information Extraction
GLiNER2 is an efficient, unified information extraction system that combines Named Entity Recognition (NER), Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M-parameter model. Built on a fine-tuned transformer encoder, it provides CPU-based inference for local processing without requiring complex pipelines or external API dependencies, offering a powerful alternative to larger language models.
Common errors
- ModuleNotFoundError: No module named 'gliner2'
  cause: The `gliner2` Python package has not been installed in the current environment.
  fix: Install the library using pip: `pip install gliner2`.
- OSError: Cannot load model 'fastino/gliner2-base-v1'. Make sure that 'fastino/gliner2-base-v1' is a valid model ID or path and that you have internet connectivity.
  cause: The specified pre-trained model name is incorrect, or a network connectivity issue is preventing the model download from the Hugging Face Hub.
  fix: Verify the model name against the models available on Hugging Face (e.g., `fastino/gliner2-base-v1`, `fastino/gliner2-large-v1`). Ensure a stable internet connection. If behind a firewall, configure proxy settings.
- KeyError: 'entities' or unexpectedly empty/incorrect extraction results for entities.
  cause: The labels provided to `extract_entities` are too generic or ambiguous, or do not adequately describe the desired entities in the context of the text, so the model misses them or returns less precise results.
  fix: Provide descriptive and specific labels for entity types. For improved accuracy, consider using natural language descriptions for each entity type within the schema to guide the model more effectively (e.g., instead of just `['event']`, try `['historical events, wars, or conflicts']`).
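To guard against the `KeyError: 'entities'` case above, a minimal sketch of defensive access follows. It assumes the result is a dictionary shaped like `{'entities': {label: [spans]}}`; that shape is inferred from the error message rather than confirmed on this page, and `get_entities` is a hypothetical helper, not part of `gliner2`.

```python
from typing import Any


def get_entities(result: Any, label: str) -> list:
    """Return the spans found for `label`, or [] if anything is missing.

    Assumes (not confirmed by this page) that `result` looks like
    {'entities': {'person': ['Tim Cook'], ...}}.
    """
    if not isinstance(result, dict):
        return []
    entities = result.get("entities") or {}
    if not isinstance(entities, dict):
        return []
    spans = entities.get(label) or []
    return list(spans)


# Hand-written stand-in for a model result, used only for illustration.
sample = {"entities": {"person": ["Tim Cook"], "company": ["Apple"]}}
print(get_entities(sample, "person"))  # ['Tim Cook']
print(get_entities(sample, "event"))   # []
print(get_entities({}, "person"))      # []
```

An empty list from this helper is a cue to revisit the label wording, as described in the fix above, rather than a reason to crash.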
Warnings
- breaking Migration from the original GLiNER (v1) to GLiNER2 involves a significant architectural shift to a unified, schema-driven approach for multi-task extraction. Code designed for GLiNER v1's task-specific methods will likely not be directly compatible with GLiNER2's `extractor.extract(text, schema)` pattern.
- gotcha When performing structured data extraction (e.g., with `extract_json`), attributes within the extracted JSON (like a 'name' field) can sometimes be `null`. The model excels at direct extraction but may struggle with tasks requiring complex reasoning or inference, so check for missing values before using the output.
- gotcha Relation extraction performance is highly sensitive to the clarity and specificity of label naming and descriptions. Vague or overly similar relation labels can lead to the model populating one relation type but missing another with identical intent (e.g., 'alias' vs. 'same_as').
- gotcha Using the powerful GLiNER XL 1B model requires API access and an API key (PIONEER_API_KEY), which must be provided either as an environment variable or directly to `GLiNER2.from_api()`.
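For the `null`-attribute gotcha above, one option is to check extracted records before using them. The sketch below is plain Python over an already-parsed record; the record layout and the `missing_fields` helper are illustrative assumptions, not the documented output format of `extract_json`.

```python
def missing_fields(record: dict, required: list[str]) -> list[str]:
    """Names of required fields that are absent, None, or empty strings."""
    return [
        name
        for name in required
        if record.get(name) is None or record.get(name) == ""
    ]


# Hypothetical record in which the model could not fill in 'name'.
record = {"name": None, "price": "$999", "storage": "256GB"}

gaps = missing_fields(record, ["name", "price"])
if gaps:
    print(f"Needs review, missing: {gaps}")
```

Records with gaps can then be routed to a fallback (a retry with more specific field descriptions, or manual review) instead of flowing silently into downstream code.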
Install
- pip install gliner2
Imports
- GLiNER2: `from gliner2 import GLiNER2`
Quickstart
import os
from gliner2 import GLiNER2
# For GLiNER XL 1B via API, uncomment and set environment variable:
# os.environ['PIONEER_API_KEY'] = os.getenv('PIONEER_API_KEY', 'your_api_key_here')
# extractor = GLiNER2.from_api()
# Load a local pre-trained model
extractor = GLiNER2.from_pretrained("fastino/gliner2-base-v1")
text = "Apple CEO Tim Cook announced iPhone 15 in Cupertino yesterday."
labels = ["company", "person", "product", "location"]
# Perform entity extraction
result = extractor.extract_entities(text, labels)
print(result)
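
The commented-out API lines in the Quickstart depend on `PIONEER_API_KEY` being set. A small pre-flight check, sketched below, fails early with a readable message. `resolve_api_key` is a hypothetical helper; how the key is then handed to `GLiNER2.from_api()` is left as described in the Warnings section, since the argument name is not given on this page.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env_var: str = "PIONEER_API_KEY") -> str:
    """Return the explicit key if given, else the environment value.

    Raises RuntimeError when neither source provides a non-empty key.
    """
    key = explicit or os.environ.get(env_var, "")
    key = key.strip()
    if not key:
        raise RuntimeError(
            f"No API key found: pass one explicitly or set {env_var}."
        )
    return key
```

Calling `resolve_api_key()` before constructing the API-backed extractor turns a missing key into an immediate, descriptive error rather than a failure at request time.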