
jina-colbert-v2

The best multilingual ColBERT with top performance on embedding and reranking
License: CC-BY-NC-4.0
Release Date: 2024-08-31
Input: Text
Output: Multi-Vector
Matryoshka Dimensions: 64, 96, 128
Model Details
Parameters: 560M
Input Token Length: 8K
Output Dimension: 128
Language Support
šŸŒ Multilingual support
Related Models
jina-colbert-v1-en
Tags
multilingual
late-interaction
long-context
high-performance
production-ready
retriever
token-level
89-languages
cross-lingual
matryoshka
storage-efficient
Available via
Jina API, AWS SageMaker, Microsoft Azure, Google Cloud, Hugging Face
Publications (1)
EMNLP 2024
August 30, 2024
Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

Overview

Jina-ColBERT-v2 is a groundbreaking multilingual information retrieval model that solves the critical challenge of efficient, high-quality search across multiple languages. As the first multilingual ColBERT-like model to generate compact embeddings, it addresses the growing need for scalable, cost-effective multilingual search solutions in global applications. Organizations dealing with multilingual content, from e-commerce platforms to content management systems, can leverage this model to provide accurate search results across 89 languages while significantly reducing storage and computational costs through its innovative dimension reduction capabilities.

Methods

The model builds upon the ColBERT architecture, introducing a sophisticated late interaction mechanism that fundamentally changes how queries and documents are matched. At its core, it uses a modified XLM-RoBERTa backbone with 560M parameters, enhanced by rotary position embeddings and optimized with flash attention. The training process involves two key stages: initial pretraining with diverse weakly-supervised data from various languages, followed by fine-tuning with labeled triplet data and supervised distillation. What makes this approach unique is the implementation of Matryoshka representation learning, which enables the model to produce embeddings in multiple dimensions (128, 96, or 64) from a single training process, allowing for dynamic storage optimization without retraining.
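The late interaction mechanism described above scores a query against a document by comparing token embeddings directly: each query token is matched to its most similar document token, and those maxima are summed (the MaxSim operator from the ColBERT papers). A minimal sketch, using random stand-in vectors rather than real model output:

```python
import numpy as np

def maxsim_score(query_vecs, doc_vecs):
    # Late interaction (MaxSim): for each query token embedding, take the
    # maximum cosine similarity over all document token embeddings, then sum.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                    # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

# Toy example with 128-dim token vectors (the model's default output
# dimension); the vectors themselves are random placeholders.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 128))    # queries are capped at 32 tokens
doc = rng.normal(size=(50, 128))     # documents can run to 8,192 tokens
score = maxsim_score(query, doc)
```

Because each per-token maximum is a cosine similarity bounded by 1, the score of a 4-token query can never exceed 4, and a document identical to the query scores exactly the query's token count.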

Performance

In real-world testing, Jina-ColBERT-v2 demonstrates exceptional capabilities across multiple benchmarks. It achieves a 6.5% improvement over the original ColBERT-v2 on English tasks, with an average score of 0.521 across 14 BEIR benchmarks. More impressively, it outperforms traditional BM25-based retrieval methods across all tested languages on MIRACL benchmarks, showing particular strength in cross-lingual scenarios. The model maintains this high performance even when using reduced embedding dimensions - dropping from 128 to 64 dimensions results in only a 1.5% performance decrease while halving storage requirements. This translates to significant cost savings in production: for example, storing 100 million documents with 64-dimension vectors costs $659.62 per month on AWS, compared to $1,319.24 for 128 dimensions.
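The storage savings follow directly from the Matryoshka property: the leading dimensions of each token vector form a valid smaller embedding, so reducing from 128 to 64 dimensions is plain truncation, and vector storage (and the quoted AWS cost) halves with it. A sketch with placeholder vectors:

```python
import numpy as np

# Matryoshka truncation: keep only the leading 64 of 128 dimensions.
# These are random stand-in vectors, not real model output.
full = np.random.default_rng(1).normal(size=(50, 128)).astype(np.float32)
small = full[:, :64]

# Halving the dimension halves the raw vector storage.
assert small.nbytes * 2 == full.nbytes

# The cost figures in the text scale the same way:
# $1,319.24/month at 128 dims -> $659.62/month at 64 dims.
cost_128, cost_64 = 1319.24, 659.62
```

The quoted 1.5% quality drop is the trade the model card reports for that 2x storage reduction.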

Best Practice

To effectively deploy Jina-ColBERT-v2, teams should consider several practical aspects. The model requires CUDA-capable hardware for optimal performance and supports document lengths up to 8,192 tokens (extendable to 12,288) while limiting queries to 32 tokens. For production deployment, the model is available through the Jina Search Foundation API, AWS marketplace, and Azure, with a non-commercial version accessible via Hugging Face. When implementing, teams should specify whether they're embedding queries or documents, as the model uses asymmetric encoding. The model isn't designed for real-time processing of extremely large document collections without proper indexing, and while it excels at multilingual retrieval, it may show slightly lower performance on specialized domain-specific tasks compared to models fine-tuned for those specific domains.
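The asymmetric-encoding point matters in practice: the same text is embedded differently depending on whether it is a query or a document, so the request must say which it is. A hedged sketch of building such a request body; the field names (`input_type`, `dimensions`) and their accepted values are assumptions based on this card, not verified against the current Jina API docs:

```python
import json

def build_request(texts, input_type, dimensions=128):
    """Build a hypothetical embedding request for jina-colbert-v2."""
    # Asymmetric encoding: the model treats queries and documents
    # differently, so the caller must declare which one it is sending.
    assert input_type in ("query", "document")
    # Matryoshka output: 64, 96, or 128 dimensions per the model card.
    assert dimensions in (64, 96, 128)
    return {
        "model": "jina-colbert-v2",
        "input_type": input_type,
        "dimensions": dimensions,
        "input": texts,
    }

doc_req = build_request(["ColBERT uses late interaction."], "document")
query_req = build_request(["what is late interaction?"], "query", dimensions=64)
payload = json.dumps(doc_req)  # body you would POST to the API endpoint
```

Consult the official API reference for the actual endpoint path and schema before relying on these names.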
Blogs that mention this model
October 03, 2025 • 7 minutes read
Jina Reranker v3: 0.6B Listwise Reranker for SOTA Multilingual Retrieval
New 0.6B-parameter listwise reranker that considers the query and all candidate documents in a single context window.
Jina AI
December 16, 2024 • 2 minutes read
Re·Search: Order 2024 Yearbook of Search Foundation Advances
Discover Re·Search, our premium yearbook showcasing our best research articles and search foundation models in 2024. Featuring spot UV-coated hardcover, 160 full-color pages, and meticulous design throughout. Available worldwide at $35, shipping included.
Jina AI
October 29, 2024 • 11 minutes read
Beyond CLIP: How Jina-CLIP Advances Multimodal Search
Learn how Jina-CLIP enhances OpenAI's CLIP with better retrieval accuracy and more diverse results through unified text-image embeddings.
Bo Wang
Alex C-G
August 30, 2024 • 10 minutes read
Jina ColBERT v2: Multilingual Late Interaction Retriever for Embedding and Reranking
Jina ColBERT v2 supports 89 languages with superior retrieval performance, user-controlled output dimensions, and 8192 token-length.
Jina AI
February 20, 2024 • 16 minutes read
What is ColBERT and Late Interaction and Why They Matter in Search?
Jina AI's ColBERT on Hugging Face has set Twitter abuzz, bringing a fresh perspective to search with its 8192-token capability. This article unpacks the nuances of ColBERT and ColBERTv2, showcasing their innovative designs and why their late interaction feature is a game-changer for search.
Han Xiao
Jina AI by Elastic © 2020-2025.