Overview
jina-reranker-v3 is a 0.6B-parameter multilingual document reranker built around a novel "last but not late" interaction architecture. Unlike ColBERT-style late interaction, which encodes query and documents separately and matches multi-vector representations afterward, this model applies causal self-attention to the query and documents within the same context window, enabling rich cross-document interaction before contextual embeddings are extracted from the last token of each document. Built on Qwen3-0.6B with 28 transformer layers and a lightweight MLP projector (1024→512→256), it processes up to 64 documents simultaneously within a 131K-token context. The model achieves state-of-the-art BEIR performance with 61.94 nDCG@10 while being 10× smaller than generative listwise rerankers.
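The embedding-extraction step described above can be sketched in a few lines of NumPy. This is an illustrative assumption, not the released implementation: random tensors stand in for the hidden states a causal LM would produce, and the ReLU activation and projector weights are placeholders (only the 1024→512→256 shapes come from the description).

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 1024  # Qwen3-0.6B hidden size, per the overview
n_docs = 4

# Stand-ins for the causal LM's final hidden states at the positions
# marked by the doc_emb / query_emb special tokens (one per document,
# one for the query). In the real model these come from the LM itself.
doc_states = rng.standard_normal((n_docs, HIDDEN))
query_state = rng.standard_normal(HIDDEN)

# Lightweight MLP projector 1024 -> 512 -> 256 (random weights here).
W1 = rng.standard_normal((HIDDEN, 512)) / np.sqrt(HIDDEN)
W2 = rng.standard_normal((512, 256)) / np.sqrt(512)

def project(x):
    # ReLU between the two layers is an assumption for this sketch.
    return np.maximum(x @ W1, 0.0) @ W2

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

doc_emb = l2norm(project(doc_states))     # shape (n_docs, 256)
query_emb = l2norm(project(query_state))  # shape (256,)

# Rank documents by cosine similarity of the 256-dim embeddings.
scores = doc_emb @ query_emb
ranking = np.argsort(-scores)  # best document first
```

The key contrast with late interaction is that `doc_states` here are contextualized against the query and all other documents before projection, so the final dot product compares embeddings that have already "seen" the whole candidate set.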
Methods
Employs three-stage progressive training with a multi-objective loss combining InfoNCE, dispersive loss (weight 0.45), dual matching loss (0.85), and similarity loss (0.85). Stage 1 uses LoRA fine-tuning (r=16, α=32) on domain-specific datasets including BGE-M3 and Cornstack, with 16 documents per query. Stage 2 extends the context to 8,192 tokens and mines hard negatives across retrieval systems, with up to 25 negatives at temperature τ=0.05. Stage 3 merges specialized models with weights ranging from 0.25 to 0.65. Special tokens doc_emb and query_emb mark the positions from which embeddings are extracted. Training uses structured prompts with system/user/assistant roles, placing the query at both the beginning and the end so that, under causal masking, documents attend to the query and the query embedding attends to all documents.
Performance
Achieves 61.94 nDCG@10 on BEIR, the highest among all evaluated rerankers and a 4.88% improvement over jina-reranker-v2. Excels in multi-hop retrieval with 78.56 on HotpotQA and in fact verification with 93.95 on FEVER. Multilingual performance reaches 66.50 on MIRACL across 18 languages, with Arabic at 78.69 and Thai at 81.06. Code retrieval achieves 63.28 on CoIR. Outperforms the 1.5B mxbai-rerank-large (61.44) with 2.5× fewer parameters, and shows a 5.43% improvement over the same-scale bge-reranker-v2-m3. Remains relatively stable across document orderings: random (62.54), descending (61.94), ascending (61.52).
Best Practice
Use the structured prompt template with system/user/assistant roles and the special tokens that mark embedding extraction positions. For collections exceeding the 131K-token context, batch documents at up to 64 per forward pass and include the query in every batch so its embedding is computed consistently. Order documents randomly or by descending relevance, which the model handles best. Leverage the cross-document interaction capability for comparative ranking tasks. For multilingual applications, the model provides strong zero-shot transfer across 18 languages. The 256-dimensional output embeddings keep similarity computation efficient. The model is well suited to applications requiring both ranking quality and inference efficiency, particularly multi-hop reasoning and fact verification.
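The batching advice above can be sketched as a thin wrapper. `score_batch` is a hypothetical callable standing in for one forward pass of the reranker; the wrapper itself only handles chunking and merging:

```python
def rerank_large(query, docs, score_batch, batch_size=64):
    """Rerank a document set larger than one forward pass can hold.

    score_batch(query, chunk) -> list of floats, one score per doc.
    The query is re-sent with every chunk, so each chunk's scores are
    computed with the query present in the same context window.
    """
    scores = []
    for i in range(0, len(docs), batch_size):
        scores.extend(score_batch(query, docs[i:i + batch_size]))
    order = sorted(range(len(docs)), key=lambda j: -scores[j])
    return [(docs[j], scores[j]) for j in order]

# Toy scorer for illustration only: word overlap with the query.
def toy_scorer(query, chunk):
    q = set(query.lower().split())
    return [float(len(q & set(d.lower().split()))) for d in chunk]

ranked = rerank_large(
    "jina reranker", ["a jina reranker model", "other text", "a reranker"],
    toy_scorer, batch_size=2,
)
```

One caveat this sketch makes visible: scores from different chunks are compared directly, so cross-document interaction only happens within a chunk; ordering candidates sensibly before chunking (randomly or by descending first-stage relevance, per the guidance above) keeps strong candidates together.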