Overview
Jina Reranker v2 Base Multilingual is a cross-encoder model designed to improve search accuracy across language barriers and data types. It addresses the challenge of precise information retrieval in multilingual environments, a capability especially valuable to global enterprises that need to refine search results across different languages and content types. With support for over 100 languages and distinctive capabilities in function calling and code search, it serves as a unified solution for teams that require accurate search refinement across international content, API documentation, and multilingual codebases. Its compact 278M-parameter design makes it particularly appealing for organizations balancing high performance with resource efficiency.
Methods
The model employs a cross-encoder architecture enhanced with Flash Attention 2, enabling direct comparison between queries and documents for more accurate relevance assessment. It is trained in four stages: first on English data to establish base capabilities, then on cross-lingual data, then on multilingual data, and finally fine-tuned with hard-negative examples. This staged training, combined with the Flash Attention 2 implementation, allows the model to process sequences of up to 524,288 tokens while maintaining exceptional speed. The architecture's efficiency lets it handle complex reranking tasks across multiple languages at six times the throughput of its predecessor, while the direct query-document interaction preserves accurate relevance assessment.
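To make the cross-encoder interface concrete, the sketch below scores query-document pairs jointly rather than comparing precomputed embeddings. It follows the usage pattern published on the model's Hugging Face page; the compute_score helper is exposed by the model's remote code (hence trust_remote_code=True), and exact argument names may differ between releases.

```python
# Minimal cross-encoder reranking sketch; assumes a CUDA-capable GPU
# and the published Hugging Face usage for this model.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "jinaai/jina-reranker-v2-base-multilingual",
    torch_dtype="auto",
    trust_remote_code=True,  # exposes the compute_score helper
)
model.to("cuda")  # Flash Attention 2 runs on CUDA GPUs
model.eval()

query = "Organic skincare products for sensitive skin"
documents = [
    "Organic skincare for sensitive skin with aloe vera and chamomile",
    "New makeup trends focus on bold colors and innovative techniques",
    "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille",  # German
]

# The cross-encoder reads each (query, document) pair together and
# returns one relevance score per pair.
pairs = [[query, doc] for doc in documents]
scores = model.compute_score(pairs, max_length=1024)
for doc, score in sorted(zip(documents, scores), key=lambda x: -x[1]):
    print(f"{score:.4f}  {doc}")
```

Unlike a bi-encoder, this joint scoring step sees the query and document in a single forward pass, which is what makes the relevance assessment more precise, and also why reranking is normally applied to a retrieved shortlist rather than a whole corpus.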
Performance
In real-world evaluations, the model demonstrates strong capabilities across diverse benchmarks. It achieves state-of-the-art performance on the AIR-Bench leaderboard for RAG systems and posts strong results on multilingual tasks, including the MKQA dataset covering 26 languages. It excels particularly at structured-data tasks, achieving high recall in both function calling (ToolBench benchmark) and SQL schema matching (NSText2SQL benchmark). Notably, it delivers these results while processing documents 15 times faster than comparable models such as bge-reranker-v2-m3, making it practical for real-time applications. Users should note, however, that optimal performance requires a CUDA-capable GPU for inference.
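As an illustration of the structured-data use case, a reranker can score tool descriptions against a natural-language request, with the top score selecting the function-call candidate in an agentic RAG loop. The sketch below mirrors the loading pattern shown earlier; the tool names and schemas are invented for illustration.

```python
# Hypothetical agentic-RAG step: choose a tool by reranking function
# schemas against a request (tool names and schemas are made up).
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "jinaai/jina-reranker-v2-base-multilingual",
    torch_dtype="auto",
    trust_remote_code=True,
).to("cuda").eval()

request = "What's the current weather in Berlin?"
tools = [
    '{"name": "get_weather", "parameters": {"city": "string"}}',
    '{"name": "send_email", "parameters": {"to": "string", "body": "string"}}',
    '{"name": "run_sql", "parameters": {"query": "string"}}',
]

# Score each (request, schema) pair; the highest score selects the call target.
scores = model.compute_score([[request, t] for t in tools], max_length=1024)
best_tool, best_score = max(zip(tools, scores), key=lambda x: x[1])
print(f"{best_score:.4f}  {best_tool}")  # expect the get_weather schema to rank first
```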
Best Practice
For optimal deployment, the model requires a CUDA-capable GPU and can be accessed through multiple channels: the Reranker API, major RAG frameworks such as Haystack and LangChain, or private deployment via cloud marketplaces. It excels in scenarios that demand precise understanding across language barriers and data types, making it a good fit for global enterprises working with multilingual content, API documentation, or code repositories. Its context window of 524,288 tokens allows large documents, or entire codebases, to be processed in a single pass. Teams should consider this model when they need to improve search accuracy across languages, require function-calling support for agentic RAG systems, or want better code search across multilingual codebases. It is particularly effective alongside vector search systems, where it refines the final ranking of retrieved documents.
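For teams that prefer the hosted route, a rerank request looks roughly like the sketch below. The endpoint and field names reflect Jina's published Reranker API at the time of writing; JINA_API_KEY is a placeholder, and the current documentation should be checked for exact parameters.

```python
# Sketch of a hosted Reranker API call (verify endpoint and fields
# against current documentation; JINA_API_KEY is a placeholder).
import os
import requests

response = requests.post(
    "https://api.jina.ai/v1/rerank",
    headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
    json={
        "model": "jina-reranker-v2-base-multilingual",
        "query": "How do I rotate an API key?",
        "documents": [
            "API keys can be rotated from the account settings page.",
            "Our pricing tiers are listed on the pricing page.",
            "Webhooks notify your service when a job completes.",
        ],
        "top_n": 2,  # keep only the two most relevant documents
    },
    timeout=30,
)
response.raise_for_status()
for result in response.json()["results"]:
    print(result["index"], result["relevance_score"])
```

In a typical pipeline, a vector index first retrieves a few dozen candidates cheaply and the reranker then reorders that shortlist; this keeps latency manageable while letting the cross-encoder do the fine-grained relevance work.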