API

读取URL或搜索为大模型提供更好的依据。

世界一流的多模态多语言向量模型。

世界一流的重排器，最大限度地提高搜索相关性。

Elastic Inference Service

在 Elasticsearch 中原生运行 Jina 模型。

MCP 命令行 llms.txt 代理人模式文档

主题

新闻

加速搜索 AI，集腋成裘。

甄选

jina-embeddings-v5-omni：支持文本、图像、音频和视频的向量模型

一个模型，四种模态：文本、图像、音频、视频。业界领先的 1.6B 和 0.9B 全能型向量模型。

五月 12, 2026 • 7 分钟的读取量

jina-embeddings-v5-text：全新的 SOTA 小型多语言向量模型

两款性能领先的 1B 以下多语言向量模型，现已在 Elastic Inference Service、Llama.cpp 和 MLX 上可用。

二月 19, 2026 • 7 分钟的读取量

Abstract digital artwork in black and white, featuring scattered dots forming letters in a halftone effect. The central lette

Jina-VLM：小型多语言视觉语言模型

全新 2B 视觉语言模型在多语言 VQA 上实现 SOTA，在纯文本任务上没有灾难性遗忘。

十二月 04, 2025 • 7 分钟的读取量

Artistic representation of "Vln" in vibrant, rainbow-like colors on a minimalistic white background, with a focus on color di

学术论文

五月 11, 2026

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

二月 17, 2026

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

二月 11, 2026

Embedding Inversion via Conditional Masked Diffusion Language Models

一月 22, 2026

Embedding Compression via Spherical Coordinates

十二月 29, 2025

Vision Encoders in Vision-Language Models: A Survey

十二月 04, 2025

Jina-VLM: Small Multilingual Vision Language Model

十月 01, 2025

jina-reranker-v3: Last but Not Late Interaction for Document Reranking

八月 31, 2025

Efficient Code Embeddings from Code Generation Models

六月 24, 2025

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

三月 04, 2025

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

十二月 17, 2024

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark

十二月 12, 2024

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

九月 18, 2024

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

九月 07, 2024

Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models

八月 30, 2024

Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

六月 21, 2024

Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models

五月 30, 2024

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

二月 26, 2024

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

十月 30, 2023

Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents

七月 20, 2023

Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

共计 20 篇论文。

甄选

学术论文

全部

新闻稿

技术博客

活动

观点

五月 12, 2026 • 7 分钟的读取量

jina-embeddings-v5-omni：支持文本、图像、音频和视频的向量模型

一个模型，四种模态：文本、图像、音频、视频。业界领先的 1.6B 和 0.9B 全能型向量模型。

五月 11, 2026

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

We introduce frozen-encoder model composition, a novel approach to multimodal embedding models. We build on the VLM-style architecture, in which non-text encoders are adapted to produce input for a language model, which in turn generates embeddings for all varieties of input. The backbone text embedding models and the added non-text media encoders remain frozen. We only trained the connecting components, representing 0.35% of the total weights. The resulting jina-embeddings-v5-omni suite encodes text, image, audio, and video into a single semantic embedding space, producing competitive results with models 5-7x its size.

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

三月 11, 2026 • 7 分钟的读取量

从多模态大模型中引导音频向量模型

将任何多模态大模型转化为小型音频向量模型，仅需 1/25 的数据量即可超越 CLAP。

Abstract illustration of a sound wave or heartbeat, formed by blue, orange, and gray dots on a white background.

三月 06, 2026 • 6 分钟的读取量

通过原始数值识别向量模型

一个通过读取原始数字来为向量模型提取指纹的微型 Transformer。无需特征工程。

Fingerprint illustration made from numbers, showcasing digital and high-tech design on a light background.

二月 19, 2026 • 7 分钟的读取量

jina-embeddings-v5-text：全新的 SOTA 小型多语言向量模型

两款性能领先的 1B 以下多语言向量模型，现已在 Elastic Inference Service、Llama.cpp 和 MLX 上可用。

Abstract digital artwork in black and white, featuring scattered dots forming letters in a halftone effect. The central lette

二月 17, 2026

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically trained with single- or multi-stage processes using contrastive loss functions. We introduce a novel training regimen that combines model distillation techniques with task-specific contrastive loss to produce compact, high-performance embedding models. Our findings suggest that this approach is more effective for training small models than purely contrastive or distillation-based training paradigms alone. Benchmark scores for the resulting models, jina-embeddings-v5-text-small and jina-embeddings-v5-text-nano, exceed or match the state-of-the-art for models of similar size. jina-embeddings-v5-text models additionally support long texts (up to 32k tokens) in many languages, and generate embeddings that remain robust under truncation and binary quantization. Model weights are publicly available, hopefully inspiring further advances in embedding model development.

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

二月 11, 2026

Embedding Inversion via Conditional Masked Diffusion Language Models

We frame embedding inversion as conditional masked diffusion, recovering all tokens in parallel through iterative denoising rather than sequential autoregressive generation. A masked diffusion language model is conditioned on the target embedding via adaptive layer normalization, requiring only 8 forward passes through a 78M parameter model with no access to the target encoder. On 32-token sequences across three embedding models, the method achieves 81.3% token accuracy and 0.87 cosine similarity.

Embedding Inversion via Conditional Masked Diffusion Language Models

一月 22, 2026

Embedding Compression via Spherical Coordinates

We present a compression method for unit-norm embeddings that achieves 1.5x compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around pi/2, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is below 1e-7, under float32 machine epsilon. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent improvement. The method requires no training.

Embedding Compression via Spherical Coordinates

十二月 29, 2025

Vision Encoders in Vision-Language Models: A Survey

Vision encoders have remained comparatively small while language models scaled from billions to hundreds of billions of parameters. This survey analyzes vision encoders across 70+ vision-language models from 2023–2025 and finds that training methodology matters more than encoder size: improvements in loss functions, data curation, and feature objectives yield larger gains than scaling by an order of magnitude. Native resolution handling improves document understanding, and multi-encoder fusion captures complementary features no single encoder provides. We organize encoders into contrastive, self-supervised, and LLM-aligned families, providing a taxonomy and practical selection guidance for encoder design and deployment.

Vision Encoders in Vision-Language Models: A Survey

十二月 04, 2025 • 7 分钟的读取量

Jina-VLM：小型多语言视觉语言模型

全新 2B 视觉语言模型在多语言 VQA 上实现 SOTA，在纯文本任务上没有灾难性遗忘。

Artistic representation of "Vln" in vibrant, rainbow-like colors on a minimalistic white background, with a focus on color di

按标题搜索

按产品过滤

按作者过滤

按模型筛选

搜索底座

Elastic Inference Service

获取 Jina API 密钥

公司

下载 Jina 标志

下载 Elastic 徽标

条款

条款及条件

Elastic © 2020-2026.