Qwen3 Reranker 4B GGUF

alibaba reranking
text

A GGUF-quantized version of the Qwen3 4B reranker model for efficient local inference with llama.cpp.
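As a rough illustration of local use, the sketch below sends a query and candidate passages to a locally running llama.cpp server and sorts the passages by the returned score. It is a minimal sketch under stated assumptions, not a documented recipe for this model: it assumes the server was started with this GGUF file in reranking mode, that it exposes a `/v1/rerank` route, and that requests use `query`, `documents`, and `top_n` while responses carry `results` entries with `index` and `relevance_score`. The helper names and the base URL are hypothetical; check them against your llama.cpp build before relying on them.

```python
import json
import urllib.request


def build_rerank_request(query, documents, top_n=None):
    """Build the JSON body for a rerank call (field names are assumed)."""
    body = {"query": query, "documents": list(documents)}
    if top_n is not None:
        body["top_n"] = top_n
    return body


def rank_documents(documents, response):
    """Pair each returned document with its score and sort best-first."""
    scored = [
        (documents[item["index"]], item["relevance_score"])
        for item in response["results"]
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def rerank(base_url, query, documents, top_n=None):
    """POST to the assumed /v1/rerank route and return ranked (doc, score) pairs."""
    payload = json.dumps(build_rerank_request(query, documents, top_n)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/rerank",  # assumed endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return rank_documents(documents, json.load(reply))
```

With a server listening on a hypothetical `http://localhost:8080`, `rerank("http://localhost:8080", "what is GGUF?", passages)` would return the passages ordered from most to least relevant. Keeping request building and response parsing in separate functions makes it easy to adapt if your server version names the fields differently.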

context window 41K tokens
max output 41K tokens
streaming
released Apr 2025