jina-embeddings-v5-text-small

SOTA multilingual embeddings with task-specific adapters
License: CC BY-NC 4.0
Release Date: 2026-02-18
Input: Text
Output: Vector
Matryoshka Dimensions: 32, 64, 128, 256, 512, 1024
Model Details
Parameters: 677M
Input Token Length: 32K
Output Dimension: 1024
Base Model: Qwen3-0.6B-Base
Trained Languages: 32 languages
Supported Languages: 93 languages
Quantizations: GGUF
Apple Silicon Support: MLX
Related Models: jina-embeddings-v3, jina-embeddings-v5-text-nano
Supported Tasks: Retrieval, Text Matching, Clustering, Classification
Tags: text-embedding, multilingual, long-context, production, matryoshka, last-token-pooling
Available via: Elastic Inference Service, Jina API, Hugging Face
Publications
jina-embeddings-v5-text: Task-Targeted Embedding Distillation (arXiv, February 17, 2026)

Overview

jina-embeddings-v5-text-small is a 0.6B parameter multilingual text embedding model built on the Qwen3-0.6B-Base backbone. It produces 1024-dimensional embeddings via last-token pooling and supports context lengths up to 32K tokens through rotary positional embeddings (RoPE) with adjusted base frequencies. The model includes four task-specific LoRA adapters for retrieval, semantic similarity, clustering, and classification, trained independently on frozen backbone weights. Matryoshka Representation Learning enables embedding truncation to dimensions as low as 32. The model is trained using a two-stage process: first, embedding distillation from Qwen3-Embedding-4B to transfer knowledge from the larger teacher model, followed by task-specific adapter training with specialized loss functions for each task category. It supports asymmetric retrieval with 'Query:' and 'Document:' prefixes.
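The last-token pooling mentioned above can be sketched in a few lines. This is an illustrative PyTorch snippet assuming right-padded batches and a standard Hugging Face-style hidden-state tensor, not the model's actual implementation:

```python
import torch
import torch.nn.functional as F

def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Embed each sequence as the hidden state of its final non-padding token.

    hidden_states:  (batch, seq_len, dim) from the backbone's last layer
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    Assumes right-padding, so the last real token sits at mask.sum() - 1.
    """
    last_idx = attention_mask.sum(dim=1) - 1            # (batch,)
    batch_idx = torch.arange(hidden_states.size(0))
    emb = hidden_states[batch_idx, last_idx]            # (batch, dim)
    # L2-normalize so cosine similarity reduces to a dot product
    return F.normalize(emb, p=2, dim=-1)
```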

Methods

Training proceeds in two stages.

In the first stage, embedding distillation transfers knowledge from Qwen3-Embedding-4B (a 4B-parameter teacher model) to the Qwen3-0.6B-Base student using a cosine distance loss between projected student embeddings and teacher embeddings; a linear projection layer maps the student's 1024-dimensional space into the teacher's higher-dimensional space. General-purpose distillation runs for 50,000 steps over more than 300 datasets in 30+ languages, followed by long-context training on synthetic and natural long documents (1,000-4,096 tokens) with adjusted RoPE parameters.

In the second stage, four LoRA adapters are trained on frozen backbone weights:
- Retrieval: InfoNCE contrastive loss with hard negatives, a continued distillation loss, and a Global Orthogonal Regularizer (GOR) for quantization robustness; final adapter weights are averaged across checkpoints.
- Text matching: CoSENT ranking loss for graded similarity, with distillation on unscored pairs.
- Clustering: re-distillation with a clustering-specific teacher instruction.
- Classification: bidirectional InfoNCE loss with relational knowledge distillation regularization.
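A minimal sketch of the stage-one objective, based only on the description above: a linear layer projects the 1024-dimensional student embedding into the teacher's space, and the loss is the cosine distance to the teacher embedding. The teacher dimension and the bias-free projection are assumptions, not the released training code.

```python
import torch
import torch.nn.functional as F

STUDENT_DIM = 1024   # jina-embeddings-v5-text-small output dimension
TEACHER_DIM = 2560   # assumed width of the Qwen3-Embedding-4B teacher space

# Trainable linear projection from the student space into the teacher space
projection = torch.nn.Linear(STUDENT_DIM, TEACHER_DIM, bias=False)

def distillation_loss(student_emb: torch.Tensor,
                      teacher_emb: torch.Tensor) -> torch.Tensor:
    """Cosine-distance loss between projected student and teacher embeddings."""
    projected = projection(student_emb)                        # (batch, TEACHER_DIM)
    cos = F.cosine_similarity(projected, teacher_emb, dim=-1)  # (batch,)
    return (1.0 - cos).mean()
```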

Performance

On MMTEB (multilingual), jina-embeddings-v5-text-small achieves a 67.0 task-level average and a 58.9 type-level average, the highest among all models under 1B parameters. It scores 71.3 on classification, 53.4 on clustering, 82.9 on pair classification, 65.7 on reranking, 64.9 on retrieval, and 78.9 on STS. On English MTEB, it achieves a 71.7 average, outperforming Qwen3-0.6B with instructions (70.5) and jina-embeddings-v3 (65.7). On retrieval-specific benchmarks, it scores 64.88 on MTEB-M retrieval, 66.84 on RTEB, 56.67 on BEIR, and 66.39 on LongEmbed. The model surpasses its teacher Qwen3-Embedding-4B on pair classification (42.0 vs 26.8 on MMTEB) while maintaining competitive scores across all other categories despite being roughly 6x smaller.

Best Practice

Select the appropriate LoRA adapter for your task:
- retrieval: asymmetric query-document search; prepend 'Query:' to queries and 'Document:' to passages.
- text-matching: symmetric similarity tasks such as duplicate detection and paraphrase identification; uses the 'Document:' prefix for both inputs.
- clustering: grouping related documents.
- classification: categorization and sentiment analysis.

For retrieval, always use the correct prefix, since the model is trained with asymmetric encoding. Matryoshka truncation reduces embeddings from 1024 to as few as 32 dimensions; performance remains strong above 256 dimensions but degrades noticeably below that threshold, consistent with Johnson-Lindenstrauss limits. Binary quantization is supported with minimal performance loss thanks to GOR regularization. The 32K context window handles long documents natively, and the model was additionally trained on long-context data for robust long-document retrieval. Use cosine similarity for embedding comparison. The model is available via the Jina AI API, Hugging Face (with Sentence Transformers and vLLM integration), and quantized GGUF variants for llama.cpp.
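The sketch below illustrates this workflow end to end: prefixed asymmetric encoding, Matryoshka truncation with re-normalization, cosine scoring, and sign-based binary quantization. The `task` argument is an assumption carried over from earlier jina-embeddings releases; check the Hugging Face model card for the exact v5 interface.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed interface: earlier jina-embeddings releases select the LoRA adapter
# through a `task` argument; the exact v5 argument name may differ.
model = SentenceTransformer("jinaai/jina-embeddings-v5-text-small",
                            trust_remote_code=True)

# Asymmetric retrieval: 'Query:' prefix for queries, 'Document:' for passages
query_vecs = model.encode(["Query: how does matryoshka truncation work?"],
                          task="retrieval")
doc_vecs = model.encode(["Document: Matryoshka Representation Learning trains "
                         "embeddings whose leading dimensions stay useful."],
                        task="retrieval")

def truncate(vecs: np.ndarray, dim: int) -> np.ndarray:
    """Matryoshka truncation: keep the leading dims, then re-normalize."""
    v = vecs[:, :dim]
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Cosine similarity at 256 dims (normalized vectors: cosine == dot product)
q256, d256 = truncate(query_vecs, 256), truncate(doc_vecs, 256)
print("cosine:", (q256 @ d256.T).item())

# Binary quantization: keep only the sign of each dimension and compare
# candidates by Hamming distance as a cheap first-pass filter
q_bits, d_bits = q256 > 0, d256 > 0
print("hamming:", int(np.count_nonzero(q_bits != d_bits)))
```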
Blogs that mention this model
jina-embeddings-v5-text: New SOTA Small Multilingual Embeddings (Han Xiao, February 19, 2026)
Two sub-1B multilingual embeddings with best-in-class performance, available on Elastic Inference Service, llama.cpp, and MLX.