Embeddings
v4 release!

Top-performing multimodal, multilingual, long-context embeddings for search, RAG, and agent applications.

Embedding API

Try our world-class embedding models to improve your search and RAG systems. Start with a free trial!

L2 normalization

Scales the embedding so its Euclidean (L2) norm becomes 1, preserving direction. Useful when downstream tasks involve dot-product similarity, classification, or visualization.
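As a rough illustration of what this option does, here is a minimal Python sketch of L2 normalization. The helper names `l2_normalize` and `dot` are made up for this example and are not part of the API. Once two vectors have unit norm, their dot product equals their cosine similarity.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# The direction is preserved, only the length changes.
unit = l2_normalize([3.0, 4.0])
unit_norm = math.sqrt(dot(unit, unit))
```

Here `unit` points in the same direction as `[3.0, 4.0]` and `unit_norm` is 1 up to floating-point rounding.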

Output data type

Instead of the default float output, you can request binary embeddings for faster vector retrieval, or base64-encoded embeddings for faster transmission.
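The exact encodings the API returns are not spelled out here, so the following Python sketch only illustrates the general ideas under stated assumptions: binarization keeps one sign bit per dimension (so vectors can be compared by Hamming distance), and base64 wraps little-endian float32 bytes. All function names are hypothetical.

```python
import base64
import struct


def binarize(vector):
    """Pack the sign of each component into bytes, one bit per dimension (assumed scheme)."""
    bits = [1 if x > 0 else 0 for x in vector]
    while len(bits) % 8:  # pad with zero bits up to a whole byte
        bits.append(0)
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def to_base64(vector):
    """Encode floats as little-endian float32 bytes, then base64 (assumed layout)."""
    raw = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(raw).decode("ascii")


def from_base64(text):
    """Inverse of to_base64 under the same assumed layout."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```

Binary vectors are 32 times smaller than float32 vectors, which is where the retrieval speed-up comes from; base64 only changes how the same numbers travel over the wire.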

Request

Bash

curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer " \
  -d @- <<EOFEOF
  {
    "normalized": true,
    "embedding_type": "float",
    "input": [
        "Organic skincare for sensitive skin with aloe vera and chamomile: Imagine the soothing embrace of nature with our organic skincare range, crafted specifically for sensitive skin. Infused with the calming properties of aloe vera and chamomile, each product provides gentle nourishment and protection. Say goodbye to irritation and hello to a glowing, healthy complexion.",
        "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille: Erleben Sie die wohltuende Wirkung unserer Bio-Hautpflege, speziell für empfindliche Haut entwickelt. Mit den beruhigenden Eigenschaften von Aloe Vera und Kamille pflegen und schützen unsere Produkte Ihre Haut auf natürliche Weise. Verabschieden Sie sich von Hautirritationen und genießen Sie einen strahlenden Teint.",
        "Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla: Descubre el poder de la naturaleza con nuestra línea de cuidado de la piel orgánico, diseñada especialmente para pieles sensibles. Enriquecidos con aloe vera y manzanilla, estos productos ofrecen una hidratación y protección suave. Despídete de las irritaciones y saluda a una piel radiante y saludable.",
        "针对敏感肌专门设计的天然有机护肤产品：体验由芦荟和洋甘菊提取物带来的自然呵护。我们的护肤产品特别为敏感肌设计，温和滋润，保护您的肌肤不受刺激。让您的肌肤告别不适，迎来健康光彩。",
        "新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています: 今シーズンのメイクアップトレンドは、大胆な色彩と革新的な技術に注目しています。ネオンアイライナーからホログラフィックハイライターまで、クリエイティビティを解き放ち、毎回ユニークなルックを演出しましょう。"
    ]
  }
EOFEOF
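For readers who prefer Python, the same request can be sketched with the standard library alone. This mirrors only the fields shown in the curl call above; `build_embedding_request` and the `YOUR_API_KEY` placeholder are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.jina.ai/v1/embeddings"


def build_embedding_request(api_key, texts, normalized=True, embedding_type="float"):
    """Build (but do not send) a POST request carrying the same JSON body as the curl example."""
    payload = {
        "normalized": normalized,
        "embedding_type": embedding_type,
        "input": list(texts),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder key for illustration; substitute your own key before sending.
request = build_embedding_request(
    "YOUR_API_KEY",
    ["Organic skincare for sensitive skin with aloe vera and chamomile"],
)
# To send it: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to inspect the headers and body before spending any tokens.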

v4: Universal Embeddings for Multimodal Multilingual Retrieval

jina-embeddings-v4 is our most significant leap yet: a 3.8B-parameter model that embeds text and images through a unified pathway and supports both dense and late-interaction retrieval, while outperforming proprietary models from Google, OpenAI, and Voyage AI, especially on visually rich document retrieval.

Three Ways to Purchase

Subscribe to our API, purchase through cloud providers, or obtain a commercial license for your organization.

With 3 cloud service providers

Using AWS or Azure? You can deploy our models directly on your company's cloud platform and handle billing through your cloud service provider (CSP) account.

With Jina Search Foundation API

The easiest way to access all of our products. Top up tokens as you go.

Depending on your location, you may be charged in USD, EUR, or other currencies. Taxes may apply.

Understand the rate limit

Rate limits are the maximum number of requests that can be made to an API within a minute per IP address/API key (RPM). Find out more about the rate limits for each product and tier below.

Rate Limit

Rate limits are tracked in two ways: RPM (requests per minute) and TPM (tokens per minute). Limits are enforced per IP address or API key, and are triggered as soon as either the RPM or the TPM threshold is reached. When you provide an API key in the request header, we track rate limits by key rather than by IP address.
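If you want to throttle on the client side before hitting these limits, one option is a limiter that checks both thresholds. The sketch below assumes a sliding 60-second window, which may not match how the server actually counts; `DualRateLimiter` is a hypothetical helper, not part of any SDK.

```python
from collections import deque


class DualRateLimiter:
    """Sliding-window limiter that enforces a request cap and a token cap together."""

    def __init__(self, rpm, tpm, window=60.0):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self.events = deque()  # (timestamp, tokens) for each accepted request
        self.tokens_in_window = 0

    def allow(self, tokens, now):
        """Return True and record the request if neither limit would be exceeded."""
        # Forget requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, old_tokens = self.events.popleft()
            self.tokens_in_window -= old_tokens
        if len(self.events) + 1 > self.rpm:
            return False  # RPM threshold reached first
        if self.tokens_in_window + tokens > self.tpm:
            return False  # TPM threshold reached first
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True


# Example values taken from the "w/ API Key" tier of the Embedding API.
limiter = DualRateLimiter(rpm=500, tpm=1_000_000)
```

In real use you would pass `time.monotonic()` as `now`; taking the timestamp as an argument keeps the sketch deterministic and easy to test.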

Product	API Endpoint	Description	w/o API Key	w/ API Key	w/ Premium API Key	Average Latency	Token Usage Counting	Allowed Request
Reader API	`https://r.jina.ai`	Convert URL to LLM-friendly text	20 RPM	500 RPM	5,000 RPM	7.9s	Count the number of tokens in the output response.	GET/POST
Reader API	`https://s.jina.ai`	Search the web and convert results to LLM-friendly text		100 RPM	1,000 RPM	2.5s	Every request costs a fixed number of tokens, starting from 10,000 tokens	GET/POST
DeepSearch	`https://deepsearch.jina.ai/v1/chat/completions`	Reason, search and iterate to find the best answer		50 RPM	500 RPM	56.7s	Count the total number of tokens in the whole process.	POST
Embedding API	`https://api.jina.ai/v1/embeddings`	Convert text/images to fixed-length vectors		500 RPM & 1,000,000 TPM	2,000 RPM & 5,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Reranker API	`https://api.jina.ai/v1/rerank`	Rank documents by query		500 RPM & 1,000,000 TPM	2,000 RPM & 5,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Classifier API	`https://api.jina.ai/v1/train`	Train a classifier using labeled examples		20 RPM & 200,000 TPM	60 RPM & 1,000,000 TPM	depends on the input size	Tokens counted as: input_tokens × num_iters	POST
Classifier API (Few-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using a trained few-shot classifier		20 RPM & 200,000 TPM	60 RPM & 1,000,000 TPM	depends on the input size	Tokens counted as: input_tokens	POST
Classifier API (Zero-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using zero-shot classification		200 RPM & 500,000 TPM	1,000 RPM & 3,000,000 TPM	depends on the input size	Tokens counted as: input_tokens + label_tokens	POST
Segmenter API	`https://api.jina.ai/v1/segment`	Tokenize and segment long text	20 RPM	200 RPM	1,000 RPM	0.3s	Tokens are not counted as usage.	GET/POST
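The token-counting formulas for the Classifier API in the table can be worked through with a few lines of Python. The function names and sample numbers below are hypothetical and only meant to show the arithmetic.

```python
def classifier_train_tokens(input_tokens, num_iters):
    """Training usage: input_tokens x num_iters."""
    return input_tokens * num_iters


def few_shot_classify_tokens(input_tokens):
    """Few-shot classification usage: input_tokens only."""
    return input_tokens


def zero_shot_classify_tokens(input_tokens, label_tokens):
    """Zero-shot classification usage: input_tokens + label_tokens."""
    return input_tokens + label_tokens


def minutes_at_tpm_limit(total_tokens, tpm):
    """Lower bound on the minutes needed to push total_tokens through a TPM cap."""
    return -(-total_tokens // tpm)  # ceiling division


# Hypothetical workload: 2,000 input tokens trained for 10 iterations.
train_cost = classifier_train_tokens(2_000, 10)
# Hypothetical workload: 1,500 input tokens plus 40 label tokens.
zero_shot_cost = zero_shot_classify_tokens(1_500, 40)
# 2.5M tokens against the 1,000,000 TPM limit of the "w/ API Key" tier.
minutes_needed = minutes_at_tpm_limit(2_500_000, 1_000_000)
```

With these inputs, training costs 20,000 tokens, the zero-shot call costs 1,540 tokens, and the 2.5M-token batch needs at least 3 minutes under the TPM cap.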

Auto top-up on low token balance

Recommended for uninterrupted service in production. When your token balance drops below the set threshold, we automatically charge your saved payment method for the last purchased package, repeating until the threshold is met.
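The recharge rule can be read as a simple loop. This Python sketch is one interpretation of the description, assuming "met" means the balance is at or above the threshold; `auto_top_up` and the sample numbers are hypothetical, and the real billing logic may differ.

```python
def auto_top_up(balance, threshold, package_tokens):
    """Repeat the last purchased package until the balance reaches the threshold.

    Returns the new balance and the number of packages bought.
    """
    if package_tokens <= 0:
        raise ValueError("package_tokens must be positive")
    purchases = 0
    while balance < threshold:
        balance += package_tokens
        purchases += 1
    return balance, purchases


# Hypothetical account: 200k tokens left, 1M threshold, last package was 500k tokens.
new_balance, packages_bought = auto_top_up(200_000, 1_000_000, 500_000)
```

In this example two packages are bought and the balance ends at 1,200,000 tokens; a balance already above the threshold triggers no purchase.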

We introduced a new pricing model on May 6th, 2025. If you enabled auto-recharge before this date, you'll continue to pay the old price (the price in effect when you purchased). The new pricing only applies if you modify your auto-recharge settings or purchase a new API key.

With a commercial license for on-prem use

Require 100% control and privacy? Purchase a commercial license to use our models on-premises.

On-premises deployment

Deploy Jina Embeddings models on AWS SageMaker and Microsoft Azure, and soon on Google Cloud Services, or contact our sales team for customized Kubernetes deployments in your Virtual Private Cloud or on on-premises servers.

AWS SageMaker: Embeddings, Reranker

Microsoft Azure: Embeddings, Reranker

Google Cloud: Embeddings, Reranker

API Integrations

Our Embedding API is natively integrated with various renowned databases, vector stores, RAG, and LLMOps frameworks. To begin, just copy and paste your API key into any of the listed integrations for a quick and seamless start.

MongoDB, DataStax, Qdrant, Pinecone, Chroma, Weaviate, Milvus, Epsilla, MyScale, LlamaIndex, Haystack, Langchain, Dify, SuperDuperDB, DashVector, Portkey, Baseten, TiDB, LanceDB, Carbon

Our Publications

To understand how our frontier search models were trained from scratch, check out our latest publications. Meet our team at EMNLP, SIGIR, ICLR, NeurIPS, and ICML!

Venue	Date	Title
arXiv	June 24, 2025	jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval
ICLR 2025	March 04, 2025	ReaderLM-v2: Small Language Model for HTML to Markdown and JSON
ACL 2025	December 17, 2024	AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
ICLR 2025	December 12, 2024	jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
ECIR 2025	September 18, 2024	jina-embeddings-v3: Multilingual Embeddings With Task LoRA
SIGIR 2025	September 07, 2024	Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models
EMNLP 2024	August 30, 2024	Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever
WWW 2025	June 21, 2024	Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models
ICML 2024	May 30, 2024	Jina CLIP: Your CLIP Model Is Also Your Text Retriever
arXiv	February 26, 2024	Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings
arXiv	October 30, 2023	Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
EMNLP 2023	July 20, 2023	Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

12 publications in total.

Learning about Embeddings

Where to start with embeddings? We've got you covered. Learn about embeddings from the ground up with our comprehensive guide.

Comparison of Reranker, Vector Search, and BM25

The table below provides a comprehensive comparison of the Reranker, Vector/Embeddings Search, and BM25, highlighting their strengths and weaknesses across various categories.

	Reranker	Vector Search	BM25
Best For	Enhanced search precision and relevance	Initial, rapid filtering	General text retrieval across wide-ranging queries
Granularity	Detailed: Sub-document and query segment	Broad: Entire documents	Intermediate: Various text segments
Query Time Complexity	High	Medium	Low
Indexing Time Complexity	Not required	High	Low, utilizes pre-built index
Training Time Complexity	High	High	Not required
Search Quality	Superior for nuanced queries	Balanced between efficiency and accuracy	Consistent and reliable for a broad set of queries
Strengths	Highly accurate with deep contextual understanding	Quick and efficient, with moderate accuracy	Highly scalable, with established efficacy

The Evolution of Embeddings Poster

Discover the ideal poster for your space, featuring captivating infographics or breathtaking visuals tracing the evolution of text embedding models since 1950.

FAQ

How were the jina-embeddings-v3 models trained?

What are the jina-clip models, and can I use them for text and image search?

Which languages do your models support?

What is the maximum length for a single sentence input?

What is the maximum number of sentences I can include in a single request?

How do I send images to the jina-clip models?

How do Jina Embeddings models compare to OpenAI's and Cohere's latest embeddings?

How seamless is the transition from OpenAI's text-embedding-3-large to your solution?

How are tokens calculated when using jina-clip and jina-embeddings models?

Do you provide models for embedding images or audio?

Can Jina Embedding models be fine-tuned with private or company data?

Can your endpoints be hosted privately on AWS, Azure, or GCP?

How do I get my API key?

What's the rate limit?

Do I need a commercial license?

Can I use the same API key for reader, embedding, reranking, classifying and fine-tuning APIs?

Can I monitor the token usage of my API key?

What should I do if I forget my API key?

Do API keys expire?

Can I transfer tokens between API keys?

Can I revoke my API key?

Why is the first request for some models slow?

Is user input data used for training your models?

Is billing based on the number of sentences or requests?

Is there a free trial available for new users?

Are tokens charged for failed requests?

What payment methods are accepted?

Is invoicing available for token purchases?
