News

Advancing search AI, bit by bit.

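Featured
jina-embeddings-v5-omni: An Embedding Model for Text, Images, Audio, and Video
One model, four modalities: text, image, audio, video. Best-in-class omni embedding models at 1.6B and 0.9B parameters.
Han Xiao
May 12, 2026 • 7 min read
jina-embeddings-v5-text: A New SOTA Small Multilingual Embedding Model
Two top-performing multilingual embedding models under 1B parameters, now available on Elastic Inference Service, llama.cpp, and MLX.
Han Xiao
February 19, 2026 • 7 min read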
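Jina-VLM: Small Multilingual Vision Language Model
A new 2B vision language model that achieves SOTA on multilingual VQA, with no catastrophic forgetting on text-only tasks.
Jina AI
December 4, 2025 • 7 min read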
Academic Publications
arXiv
May 11, 2026
jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition
SIGIR 2026
February 17, 2026
jina-embeddings-v5-text: Task-Targeted Embedding Distillation
ICLR 2026
January 22, 2026
Embedding Compression via Spherical Coordinates
arXiv
December 29, 2025
Vision Encoders in Vision-Language Models: A Survey
ICLR 2026
December 4, 2025
Jina-VLM: Small Multilingual Vision Language Model
AAAI 2026
October 1, 2025
jina-reranker-v3: Last but Not Late Interaction for Document Reranking
NeurIPS 2025
August 31, 2025
Efficient Code Embeddings from Code Generation Models
EMNLP 2025
June 24, 2025
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval
ICLR 2025
March 4, 2025
ReaderLM-v2: Small Language Model for HTML to Markdown and JSON
ACL 2025
December 17, 2024
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
ICLR 2025
December 12, 2024
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
ECIR 2025
September 18, 2024
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
SIGIR 2025
September 7, 2024
Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models
EMNLP 2024
August 30, 2024
Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever
WWW 2025
June 21, 2024
Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models
ICML 2024
May 30, 2024
Jina CLIP: Your CLIP Model Is Also Your Text Retriever
arXiv
February 26, 2024
Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings
arXiv
October 30, 2023
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
EMNLP 2023
July 20, 2023
Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models
19 papers in total.
All

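May 12, 2026 • 7 min read
jina-embeddings-v5-omni: An Embedding Model for Text, Images, Audio, and Video
One model, four modalities: text, image, audio, video. Best-in-class omni embedding models at 1.6B and 0.9B parameters.
Han Xiao
May 11, 2026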
jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition
We introduce frozen-encoder model composition, a novel approach to multimodal embedding models. We build on the VLM-style architecture, in which non-text encoders are adapted to produce input for a language model, which in turn generates embeddings for all varieties of input. The backbone text embedding models and the added non-text media encoders remain frozen. We only trained the connecting components, representing 0.35% of the total weights. The resulting jina-embeddings-v5-omni suite encodes text, image, audio, and video into a single semantic embedding space, producing competitive results with models 5-7x its size.
arXiv
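March 11, 2026 • 7 min read
Bootstrapping Audio Embedding Models from Multimodal LLMs
Turn any multimodal LLM into a small audio embedding model that outperforms CLAP with 1/25 of the data.
Han Xiao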
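March 6, 2026 • 6 min read
Identifying Embedding Models from Raw Values
A tiny Transformer that fingerprints embedding models by reading their raw values. No feature engineering required.
Han Xiao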
February 19, 2026 • 7 min read
jina-embeddings-v5-text: A New SOTA Small Multilingual Embedding Model
Two top-performing multilingual embedding models under 1B parameters, now available on Elastic Inference Service, llama.cpp, and MLX.
Han Xiao
February 17, 2026
jina-embeddings-v5-text: Task-Targeted Embedding Distillation
Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically trained with single- or multi-stage processes using contrastive loss functions. We introduce a novel training regimen that combines model distillation techniques with task-specific contrastive loss to produce compact, high-performance embedding models. Our findings suggest that this approach is more effective for training small models than purely contrastive or distillation-based training paradigms alone. Benchmark scores for the resulting models, jina-embeddings-v5-text-small and jina-embeddings-v5-text-nano, exceed or match the state-of-the-art for models of similar size. jina-embeddings-v5-text models additionally support long texts (up to 32k tokens) in many languages, and generate embeddings that remain robust under truncation and binary quantization. Model weights are publicly available, hopefully inspiring further advances in embedding model development.
SIGIR 2026
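The abstract above mentions embeddings that stay robust under truncation and binary quantization. As a rough illustration of what those two operations usually mean, here is a minimal sketch in plain Python. It assumes the common approach (keep the first k dimensions and re-normalize; keep one sign bit per dimension); the abstract does not spell out the exact procedure, and the helper names and toy vectors are hypothetical, not model outputs.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def truncate(v, k):
    """Keep only the first k dimensions, then re-normalize (assumed scheme)."""
    return l2_normalize(v[:k])

def binarize(v):
    """Binary quantization: one bit per dimension, 1 where the value is positive."""
    return [1 if x > 0 else 0 for x in v]

def cosine(a, b):
    """Cosine similarity; both inputs are expected to be unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def hamming_similarity(a_bits, b_bits):
    """Share of positions where two bit vectors agree."""
    same = sum(1 for x, y in zip(a_bits, b_bits) if x == y)
    return same / len(a_bits)

# Toy 4-dimensional vectors, made up for this example.
query = l2_normalize([0.9, 0.1, -0.3, 0.2])
doc_close = l2_normalize([0.8, 0.2, -0.4, 0.1])
doc_far = l2_normalize([-0.7, 0.3, 0.5, -0.2])

# Ranking on the first 2 dimensions only.
close_trunc = cosine(truncate(query, 2), truncate(doc_close, 2))
far_trunc = cosine(truncate(query, 2), truncate(doc_far, 2))

# Ranking on sign bits only.
close_bits = hamming_similarity(binarize(query), binarize(doc_close))
far_bits = hamming_similarity(binarize(query), binarize(doc_far))
```

In this toy case the closer document still ranks first after either operation, which is the property "robust" refers to; real evaluations measure it on retrieval benchmarks rather than on hand-picked vectors.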
January 22, 2026
Embedding Compression via Spherical Coordinates
We present a compression method for unit-norm embeddings that achieves 1.5x compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around pi/2, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is below 1e-7, under float32 machine epsilon. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent improvement. The method requires no training.
ICLR 2026
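The key observation in the abstract above, that spherical angles of high-dimensional unit vectors cluster around pi/2 so their float32 exponents collapse to a single value, can be checked with a short experiment. The sketch below is not the paper's codec: it omits entropy coding entirely, uses a random Gaussian vector as a stand-in for a real embedding, and all function names are hypothetical.

```python
import math
import random
import struct

def random_unit_vector(dim, rng):
    """Random point on the unit sphere (a stand-in for a real embedding)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def to_spherical_angles(v):
    """Return the n-1 hyperspherical angles of an n-dimensional unit vector."""
    n = len(v)
    # suffix[i] = v[i]^2 + ... + v[n-1]^2
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + v[i] * v[i]
    # The first n-2 angles lie in [0, pi].
    angles = [math.atan2(math.sqrt(suffix[i + 1]), v[i]) for i in range(n - 2)]
    # The last angle lies in (-pi, pi].
    angles.append(math.atan2(v[n - 1], v[n - 2]))
    return angles

def from_spherical_angles(angles):
    """Inverse transform: rebuild the unit vector from its angles."""
    v = []
    sin_prod = 1.0
    for a in angles:
        v.append(sin_prod * math.cos(a))
        sin_prod *= math.sin(a)
    v.append(sin_prod)
    return v

def float32_exponent(x):
    """Biased 8-bit exponent of x when stored as IEEE 754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF

rng = random.Random(0)
vec = random_unit_vector(256, rng)
angles = to_spherical_angles(vec)

# Values in [1, 2), which includes pi/2, have biased exponent 127.
share_exp_127 = sum(1 for a in angles if float32_exponent(a) == 127) / len(angles)
max_roundtrip_error = max(
    abs(a - b) for a, b in zip(vec, from_spherical_angles(angles))
)
```

Because nearly all angles fall in [1, 2), their exponent byte carries almost no information, which is what makes it cheap to entropy-code. The round trip here runs in float64, so it says nothing about the float32 error bound reported in the abstract.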
December 29, 2025
Vision Encoders in Vision-Language Models: A Survey
Vision encoders have remained comparatively small while language models scaled from billions to hundreds of billions of parameters. This survey analyzes vision encoders across 70+ vision-language models from 2023–2025 and finds that training methodology matters more than encoder size: improvements in loss functions, data curation, and feature objectives yield larger gains than scaling by an order of magnitude. Native resolution handling improves document understanding, and multi-encoder fusion captures complementary features no single encoder provides. We organize encoders into contrastive, self-supervised, and LLM-aligned families, providing a taxonomy and practical selection guidance for encoder design and deployment.
arXiv
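December 4, 2025 • 7 min read
Jina-VLM: Small Multilingual Vision Language Model
A new 2B vision language model that achieves SOTA on multilingual VQA, with no catastrophic forgetting on text-only tasks.
Jina AI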
December 4, 2025
Jina-VLM: Small Multilingual Vision Language Model
We present jina-vlm, a 2.4B parameter vision-language model that achieves state-of-the-art multilingual visual question answering among open 2B-scale VLMs. The model couples a SigLIP2 vision encoder with a Qwen3 language backbone through an attention-pooling connector that enables token-efficient processing of arbitrary-resolution images. Across standard VQA benchmarks and multilingual evaluations, jina-vlm achieves leading results while preserving competitive text-only performance. Model weights and code are publicly released.
ICLR 2026
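The abstract above describes an attention-pooling connector that makes image processing token-efficient. The general idea of attention pooling, merging a group of patch tokens into one token with softmax attention weights, can be sketched as follows. This is a generic illustration, not the jina-vlm connector: it has no learned projections, and the choice of the group mean as the query, the group size of 4, and all function names are assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(tokens):
    """Pool a group of token vectors into a single vector.

    The query is the mean of the group (an assumption for this sketch);
    each token is weighted by its scaled dot product with that query.
    """
    d = len(tokens[0])
    query = [sum(t[i] for t in tokens) / len(tokens) for i in range(d)]
    scores = [
        sum(q * x for q, x in zip(query, t)) / math.sqrt(d) for t in tokens
    ]
    weights = softmax(scores)
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(d)]

def pool_sequence(tokens, group_size=4):
    """Reduce a token sequence by pooling consecutive groups of tokens."""
    return [
        attention_pool(tokens[i:i + group_size])
        for i in range(0, len(tokens), group_size)
    ]

# Eight toy 2-dimensional "patch tokens" become two pooled tokens.
patch_tokens = [[float(i), float(i % 3)] for i in range(8)]
pooled = pool_sequence(patch_tokens, group_size=4)
```

With a group size of 4, the language model sees a quarter as many visual tokens, which is the sense in which pooling is token-efficient.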
Elastic © 2020-2026.