jina-embeddings-v5-text-nano

SOTA multilingual embeddings for edge deployment
License: CC BY-NC 4.0
Release Date: 2026-02-18
Input: Text
Output: Vector
Matryoshka Dimensions: 32, 64, 128, 256, 512, 768
Model Details
Parameters: 239M
Input Token Length: 8K
Output Dimension: 768
Base Model: EuroBERT-210M
Trained Languages: 32 languages
Supported Languages: 108 languages
Quantizations: GGUF
Apple Silicon Support: MLX
Related Models
jina-embeddings-v3
jina-embeddings-v5-text-small
Supported Tasks
Retrieval
Text Matching
Clustering
Classification
Tags
text-embedding
multilingual
long-context
production
matryoshka
last-token-pooling
Available via
Elastic Inference Service, Jina API, Hugging Face
I/O graph: Text → jina-embeddings-v5-text-nano (Task) → Vector
Publications (1)
arXiv, February 17, 2026: jina-embeddings-v5-text: Task-Targeted Embedding Distillation

Overview

jina-embeddings-v5-text-nano is a 239M-parameter multilingual text embedding model built on the EuroBERT-210M backbone, a bidirectional encoder pretrained on 15 major European and global languages. It produces 768-dimensional embeddings via last-token pooling and supports context lengths up to 8K tokens. The model includes four task-specific LoRA adapters (6.7M parameters each) for retrieval, semantic similarity, clustering, and classification, and Matryoshka Representation Learning enables embedding truncation to dimensions as low as 32. Trained with embedding distillation from Qwen3-Embedding-4B followed by task-specific adapter training, the model achieves performance competitive with models more than twice its size, making it well suited to latency-sensitive and edge deployments.
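
Matryoshka truncation amounts to slicing off the leading components of a vector and re-normalizing, so cosine similarity stays well-defined at the smaller dimension. A minimal NumPy sketch (the random vectors below stand in for real model output):

```python
import numpy as np

MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768)

def truncate_matryoshka(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    assert dim in MATRYOSHKA_DIMS, f"dim must be one of {MATRYOSHKA_DIMS}"
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in for model output: two 768-dim embeddings.
full = np.random.randn(2, 768).astype(np.float32)
small = truncate_matryoshka(full, 64)   # 12x smaller than the full 768-dim vectors
print(small.shape)                      # (2, 64)
```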

Methods

Training follows the same two-stage process as jina-embeddings-v5-text-small, applied here to the EuroBERT-210M backbone. First-stage embedding distillation transfers knowledge from Qwen3-Embedding-4B using a cosine-distance loss, with a linear projection layer mapping the student's 768-dimensional embeddings into the teacher's space; training draws on diverse multilingual text pairs from over 300 datasets. In the second stage, four task-specific LoRA adapters (6.7M parameters each) are trained on top of the frozen backbone: retrieval (InfoNCE + distillation + GOR), text matching (CoSENT + distillation), clustering (re-distillation with task-specific teacher instructions), and classification (bidirectional InfoNCE + relational knowledge distillation). The EuroBERT backbone provides strong multilingual coverage across 15 major European and global languages, including English, French, German, Spanish, Chinese, Japanese, Arabic, and Hindi.
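
The first-stage objective can be sketched as follows. This is an illustrative reconstruction rather than the released training code, and the 2560-dimensional teacher width is an assumption about Qwen3-Embedding-4B's output:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STUDENT_DIM = 768    # the nano model's output width
TEACHER_DIM = 2560   # assumed output width of Qwen3-Embedding-4B

# Linear projection mapping student embeddings into the teacher's space.
projection = nn.Linear(STUDENT_DIM, TEACHER_DIM, bias=False)

def distillation_loss(student: torch.Tensor, teacher: torch.Tensor) -> torch.Tensor:
    """Cosine-distance loss: 1 - cos(proj(student), teacher), batch-averaged."""
    cos = F.cosine_similarity(projection(student), teacher, dim=-1)
    return (1.0 - cos).mean()

# Stand-in batch of 4 paired student/teacher embeddings.
loss = distillation_loss(torch.randn(4, STUDENT_DIM), torch.randn(4, TEACHER_DIM))
loss.backward()  # in real training, gradients also flow into the student backbone
```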

Performance

On MMTEB (multilingual), jina-embeddings-v5-text-nano achieves 65.5 average (task-level) and 57.7 average (type-level) at just 239M parameters, outperforming all models under 500M parameters including KaLM-mini-v2.5 (60.1, 494M params), voyage-4-nano (58.9, 480M params), and Gemma-300M (61.1, 308M params). It scores 69.2 on classification, 52.7 on clustering, 81.9 on pair classification, 64.6 on reranking, 63.3 on retrieval, and 78.2 on STS. On English MTEB, it achieves 71.0 average, nearly matching the much larger jina-embeddings-v5-text-small (71.7). On retrieval benchmarks, it scores 63.26 on MTEB-M, 64.08 on RTEB, 56.06 on BEIR, and 63.65 on LongEmbed. Embeddings remain robust under binary quantization, with GOR regularization limiting performance degradation to under 2 points on MTEB retrieval.
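
Binary quantization here refers to the standard sign-binarization recipe: each dimension becomes one bit, and similarity is approximated by Hamming distance over packed codes. A minimal sketch (not necessarily the exact pipeline behind the numbers above):

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Sign-binarize and pack 8 dims per byte: 768 floats -> 96 bytes."""
    return np.packbits((embeddings > 0).astype(np.uint8), axis=-1)

def hamming(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hamming distance between packed codes (lower = more similar)."""
    return np.unpackbits(np.bitwise_xor(a, b), axis=-1).sum(axis=-1)

docs = binarize(np.random.randn(1000, 768))   # 1000 binary document codes
query = binarize(np.random.randn(1, 768))
ranking = np.argsort(hamming(query, docs))    # nearest documents first
```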

Best Practice

Select the appropriate LoRA adapter for your task: 'retrieval' for asymmetric query-document search (prepend 'Query: ' to queries and 'Document: ' to passages), 'text-matching' for symmetric similarity (the 'Document: ' prefix is used for both inputs), 'clustering' for grouping related documents, and 'classification' for categorization. The nano model targets latency-sensitive and resource-constrained deployments while remaining competitive with models more than twice its size. Matryoshka truncation reduces embeddings from 768 to as few as 32 dimensions; keep at least 256 dimensions for best results. Binary quantization is also supported. Use cosine similarity to compare embeddings. The model is available via the Jina AI API, Hugging Face (Sentence Transformers and vLLM), and as quantized GGUF variants for llama.cpp.
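
A minimal usage sketch via Sentence Transformers follows. The Hugging Face model ID and the task keyword mirror the conventions of earlier jina-embeddings releases and are assumptions here; check the model card for the exact interface:

```python
from sentence_transformers import SentenceTransformer

# Assumed Hugging Face ID; truncate_dim applies Matryoshka truncation.
model = SentenceTransformer(
    "jinaai/jina-embeddings-v5-text-nano",
    trust_remote_code=True,  # loads the task-adapter logic, if any
    truncate_dim=256,        # keep at least 256 dims for best results
)

# Retrieval is asymmetric: queries and documents get different prefixes.
queries = ["Query: how do matryoshka embeddings work?"]
documents = [
    "Document: Matryoshka Representation Learning trains nested sub-vectors "
    "so embeddings can be truncated after encoding."
]

# `task` selects the LoRA adapter (assumed kwarg, as in earlier jina releases).
q_emb = model.encode(queries, task="retrieval")
d_emb = model.encode(documents, task="retrieval")

print(model.similarity(q_emb, d_emb))  # cosine similarity by default
```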
Blogs that mention this model
February 19, 2026 • 7 minutes read
jina-embeddings-v5-text: New SOTA Small Multilingual Embeddings
Two sub-1B multilingual embeddings with best-in-class performance, available on Elastic Inference Service, Llama.cpp and MLX.
Han Xiao