jina-colbert-v2

The best multilingual ColBERT with top performance on embedding and reranking
Release Post
License: CC-BY-NC-4.0
Release Date: 2024-08-31
Input: Text
Output: Multi-Vector
Model Details
Parameters: 560M
Input Token Length: 8K
Output Dimension: 128
Language Support: šŸŒ Multilingual (89 languages)
Related Models: jina-colbert-v1-en
Tags
multilingual
late-interaction
long-context
high-performance
production-ready
retriever
token-level
89-languages
cross-lingual
matryoshka
storage-efficient
Available via
Jina API, Commercial License, AWS SageMaker, Microsoft Azure, Google Cloud, Hugging Face
Publications (1)
EMNLP 2024
August 30, 2024
Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

Overview

Jina-ColBERT-v2 is a multilingual information retrieval model built for efficient, high-quality search across languages. As the first multilingual ColBERT-like model to generate compact embeddings, it addresses the growing need for scalable, cost-effective multilingual search in global applications. Organizations dealing with multilingual content, from e-commerce platforms to content management systems, can use it to serve accurate search results across 89 languages while significantly reducing storage and compute costs through its dimension-reduction capabilities.

Methods

The model builds on the ColBERT architecture and its late interaction mechanism, which matches queries and documents at the token level rather than compressing each into a single pooled vector. At its core is a modified XLM-RoBERTa backbone with 560M parameters, enhanced with rotary position embeddings and optimized with flash attention. Training proceeds in two stages: pretraining on diverse weakly supervised data from many languages, followed by fine-tuning on labeled triplet data with supervised distillation. What makes the approach distinctive is Matryoshka representation learning, which lets a single training run produce embeddings at multiple dimensions (128, 96, or 64), allowing dynamic storage optimization without retraining.
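To make these two ideas concrete, here is a minimal sketch in NumPy of late-interaction (MaxSim) scoring and Matryoshka-style truncation. The function names and the random stand-in vectors are ours for illustration; in practice the token embeddings come from the model itself.

import numpy as np

def maxsim_score(query_vecs, doc_vecs):
    # Late interaction: for each query token, take its best (max) cosine
    # similarity over all document tokens, then sum over the query tokens.
    # Both inputs are (n_tokens, dim) arrays of L2-normalized embeddings.
    sim = query_vecs @ doc_vecs.T
    return float(sim.max(axis=1).sum())

def truncate_matryoshka(vecs, dim):
    # Matryoshka-trained embeddings stay usable when truncated: keep the
    # first `dim` components (e.g. 128 -> 96 or 64) and re-normalize.
    v = vecs[:, :dim]
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Toy example with random stand-ins for real token embeddings.
rng = np.random.default_rng(0)
query = truncate_matryoshka(rng.normal(size=(32, 128)), 64)
doc = truncate_matryoshka(rng.normal(size=(300, 128)), 64)
print(maxsim_score(query, doc))

Because scoring only ever compares token vectors pairwise, truncated 64-dimension embeddings drop into the same pipeline unchanged; only the index size shrinks.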

Performance

In real-world testing, Jina-ColBERT-v2 demonstrates strong results across multiple benchmarks. It achieves a 6.5% improvement over the original ColBERT-v2 on English tasks, with an average score of 0.521 across 14 BEIR benchmarks. More impressively, it outperforms traditional BM25-based retrieval across all tested languages on the MIRACL benchmarks, showing particular strength in cross-lingual scenarios. The model maintains this performance even at reduced embedding dimensions: dropping from 128 to 64 dimensions costs only about 1.5% in retrieval performance while halving storage requirements. This translates into significant cost savings in production: for example, storing 100 million documents with 64-dimension vectors costs $659.62 per month on AWS, compared to $1,319.24 for 128 dimensions.
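The storage saving follows directly from the fact that a multi-vector index scales linearly with the output dimension. Below is a back-of-envelope sketch; the corpus size, tokens per document, and fp16 storage are hypothetical round numbers, and only the 128/96/64 dimensions come from the model card.

# All inputs are illustrative assumptions except the output dimensions
# (128/96/64), which come from the model card.
N_DOCS = 100_000_000     # corpus size
AVG_TOKENS = 512         # assumed token vectors stored per document
BYTES_PER_VALUE = 2      # fp16 storage, also an assumption

def index_size_gb(dims):
    return N_DOCS * AVG_TOKENS * dims * BYTES_PER_VALUE / 1e9

for dims in (128, 96, 64):
    print(f"{dims:>3} dims: {index_size_gb(dims):,.0f} GB")

# Halving the dimension halves the index, which is why the 64-dimension
# configuration costs roughly half as much to store as the 128-dimension one.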

Best Practice

To deploy Jina-ColBERT-v2 effectively, teams should consider several practical aspects. The model requires CUDA-capable hardware for optimal performance and supports document lengths up to 8,192 tokens (extendable to 12,288) while limiting queries to 32 tokens. For production deployment, the model is available through the Jina Search Foundation API, the AWS marketplace, and Azure, with a non-commercial version accessible via Hugging Face. When implementing, teams must specify whether they are embedding queries or documents, since the model uses asymmetric encoding for the two. The model is not designed for real-time processing of extremely large document collections without proper indexing, and while it excels at multilingual retrieval, it may trail domain-specific fine-tuned models on specialized tasks.
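As an illustration of the asymmetric encoding, here is a hedged sketch of calling the model through the Jina API. The endpoint, payload fields, and response shape are assumptions based on Jina's multi-vector API as documented around the model's release; verify them against the current API documentation before relying on this.

# Hedged sketch: the endpoint and field names below are assumptions;
# check the current Jina API docs before use.
import os
import requests

API_URL = "https://api.jina.ai/v1/multi-vector"  # assumed endpoint

def embed(texts, input_type, dimensions=128):
    # input_type must be "query" or "document": the model encodes the two
    # sides differently (queries are capped at 32 tokens, documents at 8K).
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
        json={
            "model": "jina-colbert-v2",
            "input_type": input_type,  # asymmetric encoding switch
            "dimensions": dimensions,  # 128, 96, or 64 (Matryoshka)
            "input": texts,
        },
    )
    resp.raise_for_status()
    return resp.json()  # per-input lists of token-level vectors

query_vecs = embed(["how does late interaction work?"], "query")
doc_vecs = embed(["ColBERT scores token embeddings with MaxSim ..."], "document")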
Blogs that mention this model
December 16, 2024 • 2 minutes read
Re·Search: Order 2024 Yearbook of Search Foundation Advances
Discover Re·Search, our premium yearbook showcasing our best research articles and search foundation models in 2024. Featuring spot UV-coated hardcover, 160 full-color pages, and meticulous design throughout. Available worldwide at $35, shipping included.
Jina AI
October 29, 2024 • 11 minutes read
Beyond CLIP: How Jina-CLIP Advances Multimodal Search
Learn how Jina-CLIP enhances OpenAI's CLIP with better retrieval accuracy and more diverse results through unified text-image embeddings.
Bo Wang
Alex C-G
August 30, 2024 • 10 minutes read
Jina ColBERT v2: Multilingual Late Interaction Retriever for Embedding and Reranking
Jina ColBERT v2 supports 89 languages with superior retrieval performance, user-controlled output dimensions, and an 8192-token input length.
Jina AI
February 20, 2024 • 16 minutes read
What is ColBERT and Late Interaction and Why They Matter in Search?
Jina AI's ColBERT on Hugging Face has set Twitter abuzz, bringing a fresh perspective to search with its 8192-token capability. This article unpacks the nuances of ColBERT and ColBERTv2, showcasing their innovative designs and why their late interaction feature is a game-changer for search.
Han Xiao