Qwen3 Reranker 0.6B GGUF
A GGUF-quantized, 0.6-billion-parameter Qwen3 reranker for lightweight local inference via llama.cpp.
Specs
context window 32K tokens
max output 32K tokens
Capabilities
streaming
Dates
released Apr 2025
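Usage sketch
A reranker scores each candidate document against a query so that the candidates can be reordered by relevance. The following is a minimal sketch of that flow, assuming the model is served by a local llama.cpp server with reranking enabled. The URL, port, endpoint path, and the JSON field names (`query`, `documents`, `top_n`, `results`, `index`, `relevance_score`) are assumptions for illustration, and the helper names are hypothetical; check them against your server's documentation.

```python
import json
import urllib.request

# Assumption: a llama.cpp server is running locally with reranking enabled
# and accepts rerank requests at this URL. Adjust host, port, and path.
RERANK_URL = "http://127.0.0.1:8080/v1/rerank"


def build_payload(query, documents, top_n=None):
    """Build the JSON body for a rerank request (field names are assumed)."""
    payload = {"query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


def rank_documents(documents, results):
    """Pair each document with its score, sorted from most to least relevant."""
    scored = [(documents[r["index"]], r["relevance_score"]) for r in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def rerank(query, documents, url=RERANK_URL):
    """Send the request and return (document, score) pairs ordered by score."""
    body = json.dumps(build_payload(query, documents)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # Assumed response shape: {"results": [{"index": ..., "relevance_score": ...}]}
        results = json.load(response)["results"]
    return rank_documents(documents, results)
```

With a server running, `rerank("what is GGUF?", ["GGUF is a model file format.", "Bananas are yellow."])` would return the documents paired with their scores, highest first. The ordering step is kept separate from the network call so it can be checked without a server.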