Embeddings

Top-performing multimodal multilingual long-context embeddings for search, RAG, agents applications.

Embedding API

Try our embedding models to improve your search and RAG systems. Start with a free trial!

Rate Limit

Raise issue

FAQ

Status

Select embeddings

L2 normalization

Scale embeddings to unit length (L2 norm = 1). Required for cosine similarity via dot product.

Output data type

embedding_type

encoding_format

output_dtype

embedding_types

Choose output format: float (default), binary (compact storage), or base64 (efficient transmission).

Default (as float)

Example input

Change them and see how the response changes!

Organic skincare for sensitive skin with aloe vera and chamomile: Imagine the soothing embrace of nature with our organic skincare range, crafted specifically for sensitive skin. Infused with the calming properties of aloe vera and chamomile, each product provides gentle nourishment and protection. Say goodbye to irritation and hello to a glowing, healthy complexion.

Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille: Erleben Sie die wohltuende Wirkung unserer Bio-Hautpflege, speziell für empfindliche Haut entwickelt. Mit den beruhigenden Eigenschaften von Aloe Vera und Kamille pflegen und schützen unsere Produkte Ihre Haut auf natürliche Weise. Verabschieden Sie sich von Hautirritationen und genießen Sie einen strahlenden Teint.

Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla: Descubre el poder de la naturaleza con nuestra línea de cuidado de la piel orgánico, diseñada especialmente para pieles sensibles. Enriquecidos con aloe vera y manzanilla, estos productos ofrecen una hidratación y protección suave. Despídete de las irritaciones y saluda a una piel radiante y saludable.

针对敏感肌专门设计的天然有机护肤产品：体验由芦荟和洋甘菊提取物带来的自然呵护。我们的护肤产品特别为敏感肌设计，温和滋润，保护您的肌肤不受刺激。让您的肌肤告别不适，迎来健康光彩。

新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています: 今シーズンのメイクアップトレンドは、大胆な色彩と革新的な技術に注目しています。ネオンアイライナーからホログラフィックハイライターまで、クリエイティビティを解き放ち、毎回ユニークなルックを演出しましょう。

Request

Bash

Language

curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer " \
  -d @- <<EOFEOF
  {
    "normalized": true,
    "embedding_type": "float",
    "input": [
        "Organic skincare for sensitive skin with aloe vera and chamomile: Imagine the soothing embrace of nature with our organic skincare range, crafted specifically for sensitive skin. Infused with the calming properties of aloe vera and chamomile, each product provides gentle nourishment and protection. Say goodbye to irritation and hello to a glowing, healthy complexion.",
        "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille: Erleben Sie die wohltuende Wirkung unserer Bio-Hautpflege, speziell für empfindliche Haut entwickelt. Mit den beruhigenden Eigenschaften von Aloe Vera und Kamille pflegen und schützen unsere Produkte Ihre Haut auf natürliche Weise. Verabschieden Sie sich von Hautirritationen und genießen Sie einen strahlenden Teint.",
        "Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla: Descubre el poder de la naturaleza con nuestra línea de cuidado de la piel orgánico, diseñada especialmente para pieles sensibles. Enriquecidos con aloe vera y manzanilla, estos productos ofrecen una hidratación y protección suave. Despídete de las irritaciones y saluda a una piel radiante y saludable.",
        "针对敏感肌专门设计的天然有机护肤产品：体验由芦荟和洋甘菊提取物带来的自然呵护。我们的护肤产品特别为敏感肌设计，温和滋润，保护您的肌肤不受刺激。让您的肌肤告别不适，迎来健康光彩。",
        "新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています: 今シーズンのメイクアップトレンドは、大胆な色彩と革新的な技術に注目しています。ネオンアイライナーからホログラフィックハイライターまで、クリエイティビティを解き放ち、毎回ユニークなルックを演出しましょう。"
    ]
  }
EOFEOF

API key

Available tokens

This is your unique key. Store it securely!

v5-omni: One Embedding for All

Text, image, audio, video — one shared embedding space, two sizes. v5-omni-small (1.6B) is the best-performing open-weight omni model under 2B parameters. v5-omni-nano (0.9B) delivers competitive retrieval at under 1B. Both are byte-for-byte compatible with v5-text — no reindexing needed.

v5-text: New SOTA Small Multilingual Embeddings

jina-embeddings-v5-text delivers fifth-generation embedding quality in two efficient sizes — a 677M small and 239M nano model — with task-specific LoRA adapters, Matryoshka dimensions, 32K context, and GGUF/MLX quantization for edge deployment, setting new benchmarks across MMTEB, MTEB English, and retrieval tasks.

Two Ways to Purchase

Subscribe to our API or purchase through cloud providers.

With 3 cloud service providers

Using AWS or Azure? You can deploy our models directly on your company's cloud platform and handle billing through the CSP account.

With Jina Search Foundation API

The easiest way to access all of our products. Top-up tokens as you go.

Enter the API key you wish to recharge

Top up this API key with more tokens

Depending on your location, you may be charged in USD, EUR, or other currencies. Taxes may apply.

Please input the right API key to top up

Understand the rate limit

Rate limits are the maximum number of requests that can be made to an API within a minute per IP address/API key (RPM). Find out more about the rate limits for each product and tier below.

Rate Limit

Rate limits are tracked in three ways: RPM (requests per minute), and TPM (tokens per minute). Limits are enforced per IP/API key and will be triggered when either the RPM or TPM threshold is reached first. When you provide an API key in the request header, we track rate limits by key rather than IP address.

Columns

Product	API Endpoint	Description	w/o API Key	w/ Free API Key	w/ Paid API Key	w/ Premium API Key	Average Latency	Token Usage Counting	Allowed Request
Reader API	`https://r.jina.ai`	Convert URL to LLM-friendly text	20 RPM	500 RPM	500 RPM	5000 RPM	7.9s	Count the number of tokens in the output response.	GET/POST
Reader API	`https://s.jina.ai`	Search the web and convert results to LLM-friendly text		100 RPM	100 RPM	1000 RPM	2.5s	Every request costs a fixed number of tokens, starting from 10000 tokens	GET/POST
Embedding API	`https://api.jina.ai/v1/embeddings`	Convert text/images to fixed-length vectors		100 RPM & 100,000 TPM	500 RPM & 2,000,000 TPM	5,000 RPM & 50,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Reranker API	`https://api.jina.ai/v1/rerank`	Rank documents by query		100 RPM & 100,000 TPM	500 RPM & 2,000,000 TPM	5,000 RPM & 50,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Classifier API	`https://api.jina.ai/v1/train`	Train a classifier using labeled examples		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens × num_iters	POST
Classifier API (Few-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using a trained few-shot classifier		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens	POST
Classifier API (Zero-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using zero-shot classification		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens + label_tokens	POST
Segmenter API	`https://api.jina.ai/v1/segment`	Tokenize and segment long text	20 RPM	200 RPM	200 RPM	1,000 RPM	0.3s	Token is not counted as usage.	GET/POST
DeepSearch	`https://deepsearch.jina.ai/v1/chat/completions`	Reason, search and iterate to find the best answer		50 RPM	50 RPM	500 RPM	56.7s	Count the total number of tokens in the whole process.	POST

Auto top-up on low token balance

Recommended for uninterrupted service in production. When your token balance drops below the set threshold, we will automatically recharge your saved payment method for the last purchased package, until the threshold is met.

We introduced a new pricing model on May 6th, 2025. If you enabled auto-recharge before this date, you'll continue to pay the old price (the one when you purchased). The new pricing only applies if you modify your auto-recharge settings or purchase a new API key.

< 1M Tokens

Top up when

On-premises deployment

Deploy Jina Embeddings models in AWS Sagemaker and Microsoft Azure, and soon in Google Cloud Services, or contact our sales team to get customized Kubernetes deployments for your Virtual Private Cloud and on-premises servers.

AWS SageMaker

Embeddings

Reranker

Microsoft Azure

Embeddings

Reranker

Google Cloud

Embeddings

API Integrations

Our Embedding API is natively integrated with various renowned databases, vector stores, RAG, and LLMOps frameworks. To begin, just copy and paste your API key into any of the listed integrations for a quick and seamless start.

Vector Store

LLMOps

RAG

Observability

MongoDB

DataStax

Qdrant

Pinecone

Chroma

Weaviate

Milvus

Epsilla

MyScale

LlamaIndex

Haystack

Langchain

Dify

SuperDuperDB

DashVector

Portkey

Baseten

TiDB

LanceDB

Carbon

Our Publications

Understand how our frontier search models were trained from scratch, check out our latest publications. Meet our team at EMNLP, SIGIR, ICLR, NeurIPS, and ICML!

arXiv

May 11, 2026

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

SIGIR 2026

February 17, 2026

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

ICLR 2026

January 22, 2026

Embedding Compression via Spherical Coordinates

arXiv

December 29, 2025

Vision Encoders in Vision-Language Models: A Survey

ICLR 2026

December 04, 2025

Jina-VLM: Small Multilingual Vision Language Model

AAAI 2026

October 01, 2025

jina-reranker-v3: Last but Not Late Interaction for Document Reranking

NeurIPS 2025

August 31, 2025

Efficient Code Embeddings from Code Generation Models

EMNLP 2025

June 24, 2025

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

ICLR 2025

March 04, 2025

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

ACL 2025

December 17, 2024

AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark

ICLR 2025

December 12, 2024

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

ECIR 2025

September 18, 2024

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

SIGIR 2025

September 07, 2024

Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models

EMNLP 2024

August 30, 2024

Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever

WWW 2025

June 21, 2024

Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models

ICML 2024

May 30, 2024

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

arXiv

February 26, 2024

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

arXiv

October 30, 2023

Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents

EMNLP 2023

July 20, 2023

Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

19 publications in total.

Learning about Embeddings

Where to start with embeddings? We've got you covered. Learn about embeddings from the ground up with our comprehensive guide.

Comparison of Reranker, Vector Search, and BM25

The table below provides a comprehensive comparison of the Reranker, Vector/Embeddings Search, and BM25, highlighting their strengths and weaknesses across various categories.

	Reranker	Vector Search	BM25
Best For	Enhanced search precision and relevance	Initial, rapid filtering	General text retrieval across wide-ranging queries
Granularity	Detailed: Sub-document and query segment	Broad: Entire documents	Intermediate: Various text segments
Query Time Complexity	High	Medium	Low
Indexing Time Complexity	Not required	High	Low, utilizes pre-built index
Training Time Complexity	High	High	Not required
Search Quality	Superior for nuanced queries	Balanced between efficiency and accuracy	Consistent and reliable for a broad set of queries
Strengths	Highly accurate with deep contextual understanding	Quick and efficient, with moderate accuracy	Highly scalable, with established efficacy
	Try reranker API for free	Try embedding API for free

The Evolution of Embeddings Poster

Discover the ideal poster for your space, featuring captivating infographics or breathtaking visuals tracing the evolution of text embedding models since 1950.

Learn how we made it

Buy a hard copy

FAQ

How were the Jina embedding models trained?

What are your multimodal embedding models?

Which languages do your models support?

What is the maximum length for a single sentence input?

What is the maximum number of sentences I can include in a single request?

How do I send images to multimodal embedding models?

How do Jina Embeddings models compare to OpenAI's and Cohere's latest embeddings?

How seamless is the transition from OpenAI's text-embedding-3-large to your solution?

How tokens are calculated when using jina-clip and jina-embeddings models?

Do you provide models for embedding images or audio?

Can Jina Embedding models be fine-tuned with private or company data?

Can your endpoints be hosted privately on AWS, Azure, or GCP?

What is the 'task' parameter and when should I use it?

What is late-interaction retrieval and which models support it?

What is late chunking and when should I use it?

Why does the API support a different context length than the model's maximum capacity?

Why is jina-embeddings-v4 free, and why is it slow?

What are the rate limits for the Embeddings API?

What are the context length limits for each embedding model?

What are the file size limits for images and PDFs?

How to get my API key?

What's the rate limit?

Rate Limit

Columns

Product	API Endpoint	Description	w/o API Key	w/ Free API Key	w/ Paid API Key	w/ Premium API Key	Average Latency	Token Usage Counting	Allowed Request
Reader API	`https://r.jina.ai`	Convert URL to LLM-friendly text	20 RPM	500 RPM	500 RPM	5000 RPM	7.9s	Count the number of tokens in the output response.	GET/POST
Reader API	`https://s.jina.ai`	Search the web and convert results to LLM-friendly text		100 RPM	100 RPM	1000 RPM	2.5s	Every request costs a fixed number of tokens, starting from 10000 tokens	GET/POST
Embedding API	`https://api.jina.ai/v1/embeddings`	Convert text/images to fixed-length vectors		100 RPM & 100,000 TPM	500 RPM & 2,000,000 TPM	5,000 RPM & 50,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Reranker API	`https://api.jina.ai/v1/rerank`	Rank documents by query		100 RPM & 100,000 TPM	500 RPM & 2,000,000 TPM	5,000 RPM & 50,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Classifier API	`https://api.jina.ai/v1/train`	Train a classifier using labeled examples		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens × num_iters	POST
Classifier API (Few-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using a trained few-shot classifier		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens	POST
Classifier API (Zero-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using zero-shot classification		25 RPM & 25,000 TPM	125 RPM & 500,000 TPM	1,250 RPM & 12,000,000 TPM	depends on the input size	Tokens counted as: input_tokens + label_tokens	POST
Segmenter API	`https://api.jina.ai/v1/segment`	Tokenize and segment long text	20 RPM	200 RPM	200 RPM	1,000 RPM	0.3s	Token is not counted as usage.	GET/POST
DeepSearch	`https://deepsearch.jina.ai/v1/chat/completions`	Reason, search and iterate to find the best answer		50 RPM	50 RPM	500 RPM	56.7s	Count the total number of tokens in the whole process.	POST

Do I need a commercial license?

CC BY-NC License Self-Check

Are you using our official API or official images on Azure, AWS, or GCP?

Yes

Can I use the same API key for reader, embedding, reranking, classifying and fine-tuning APIs?

Can I monitor the token usage of my API key?

What should I do if I forget my API key?

Do API keys expire?

Can I transfer tokens between API keys?

Can I revoke my API key?

Why is the first request for some models slow?

Is my API data used to train your models?

What are the rate limits for Jina APIs?

Are there batch size limits for the APIs?

Is billing based on the number of sentences or requests?

Is there a free trial available for new users?

Are tokens charged for failed requests?

What payment methods are accepted?

Is invoicing available for token purchases?