Newsroom

Accelerate search AI, one word at a time.

Featured

April 08, 2025 • 21 minutes read

jina-reranker-m0: Multilingual Multimodal Document Reranker

Introducing jina-reranker-m0, our new multilingual multimodal reranker for retrieving visual documents, with SOTA performance on multilingual long-document and code search tasks.

February 25, 2025 • 19 minutes read

A Practical Guide to Implementing DeepSearch/DeepResearch

QPS out, depth in. DeepSearch is the new norm. Find answers through read-search-reason loops. Learn what it is and how to build it.

January 15, 2025 • 17 minutes read

ReaderLM v2: Frontier Small Language Model for HTML to Markdown and JSON

ReaderLM-v2 is a 1.5B small language model for HTML-to-Markdown conversion and HTML-to-JSON extraction with exceptional quality.

Latest

May 07, 2025 • 9 minutes read

Model Soup’s Recipe for Embeddings

April 16, 2025 • 10 minutes read

On the Size Bias of Text Embeddings and Its Impact in Search

April 01, 2025 • 17 minutes read

Using DeepSeek R1 Reasoning Model in DeepSearch

Academic Publications

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

December 17, 2024

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark

December 12, 2024

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

September 18, 2024

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

September 07, 2024

Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models

August 30, 2024

Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

February 26, 2024

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

October 30, 2023

Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents

Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

11 publications in total.

All

May 07, 2025 • 9 minutes read

Model Soup’s Recipe for Embeddings

Boost robustness and performance with model soups: averaging weights. No extra cost, better results.

April 16, 2025 • 10 minutes read

On the Size Bias of Text Embeddings and Its Impact in Search

Size bias refers to how the length of text inputs affects similarity, regardless of semantic relevance. It explains why search systems sometimes return long, barely-relevant documents instead of shorter, more precise matches to your query.

April 08, 2025 • 21 minutes read

jina-reranker-m0: Multilingual Multimodal Document Reranker

Introducing jina-reranker-m0, our new multilingual multimodal reranker for retrieving visual documents, with SOTA performance on multilingual long-document and code search tasks.

April 01, 2025 • 17 minutes read

Using DeepSeek R1 Reasoning Model in DeepSearch

A standard LLM or a reasoning model: which is better for DeepSearch? In this post, we explore using DeepSeek-R1 in the DeepSearch implementation to choose the next action.

March 31, 2025 • 7 minutes read

DeepSearch on Private Visual Documents: An Enterprise Case Study

Our DeepSearch works with private PDFs and visual documents right out of the box. Discover how DeepSearch can unlock valuable insights from your enterprise data.

March 12, 2025 • 11 minutes read

Snippet Selection and URL Ranking in DeepSearch/DeepResearch

Nailing these two details takes your DeepSearch from mid to GOAT: selecting the best snippets from lengthy webpages and ranking URLs before crawling.

March 07, 2025 • 14 minutes read

Long-Context Embedding Models are Blind Beyond 4K Tokens

We investigate embedding models on new "needle-in-haystack" tasks and find that beyond 4K tokens they are just rolling dice: even with exact lexical matches or query expansion, they can't tell signal from noise in long contexts.

February 27, 2025 • 5 minutes read

LLM-as-SERP: Search Engine Result Pages from Large Language Models

This idea is either extremely smart or extremely stupid, with no in-between. Read to the end to find out why it could be useful.

February 25, 2025 • 19 minutes read

A Practical Guide to Implementing DeepSearch/DeepResearch

QPS out, depth in. DeepSearch is the new norm. Find answers through read-search-reason loops. Learn what it is and how to build it.

February 18, 2025 • 9 minutes read

Query Expansion with LLMs: Searching Better by Saying More

Search has changed a lot since embedding models were introduced. Is there still a role for lexical techniques like query expansion in AI? We think so.

Offices

Sunnyvale, CA

710 Lakeway Dr, Ste 200, Sunnyvale, CA 94085, USA

Berlin, Germany (HQ)

Prinzessinnenstraße 19-20, 10969 Berlin, Germany

Beijing, China

Level 5, Building 6, No.48 Haidian West St. Beijing, China

Shenzhen, China

402 Floor 4, Fu'an Technology Building, Shenzhen, China

Jina AI © 2020-2025.