Contact sales
Grow your business with Jina AI.
Three Ways to Purchase
Subscribe to our API, purchase through cloud providers, or obtain a commercial license for your organization.
With 3 cloud service providers
Using AWS or Azure? You can deploy our models directly on your company's cloud platform and handle billing through the CSP account.
Rate Limit
Rate limits are tracked in two ways: RPM (requests per minute) and TPM (tokens per minute). Limits are enforced per IP address or per API key, and a request is limited as soon as either threshold, RPM or TPM, is reached. When an API key is provided in the request, rate limits are tracked per key rather than per IP address.
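As a rough illustration of how the two thresholds interact, the Python sketch below models a client-side limiter that admits a request only while both the request count and the token count for the last 60 seconds stay within their limits. The `DualRateLimiter` class, its method names, and the sliding-window behavior are assumptions made for illustration; the page does not describe how the service itself measures the one-minute window. The 500 RPM and 1,000,000 TPM figures in the usage lines are the Embedding API limits for a standard API key.

```python
import time
from collections import deque


class DualRateLimiter:
    """Hypothetical client-side limiter tracking RPM and TPM together."""

    def __init__(self, rpm, tpm, window_seconds=60.0, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window_seconds
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) for each admitted request
        self.tokens_in_window = 0

    def _evict_expired(self, now):
        # Forget requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return (allowed, reason); reason names the threshold that blocked."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.events) + 1 > self.rpm:
            return False, "RPM"
        if self.tokens_in_window + tokens > self.tpm:
            return False, "TPM"
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True, None


# Assumed limits: 500 RPM and 1,000,000 TPM.
limiter = DualRateLimiter(rpm=500, tpm=1_000_000)
print(limiter.try_acquire(400_000))  # (True, None)
print(limiter.try_acquire(400_000))  # (True, None)
print(limiter.try_acquire(400_000))  # (False, 'TPM'): token budget hit first
```

In this sketch only three requests have been attempted, so the RPM budget is nowhere near exhausted, yet the third request is refused because the token budget is reached first. A client could use such a check to delay requests locally instead of waiting for the service to reject them.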
Product | API Endpoint | Description | w/o API Key | w/ API Key | w/ Premium API Key | Average Latency | Token Usage Counting | Allowed Request
---|---|---|---|---|---|---|---|---
Embedding API | https://api.jina.ai/v1/embeddings | Convert text/images to fixed-length vectors | Blocked | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | Depends on the input size | Count the number of tokens in the input request. | POST
Reranker API | https://api.jina.ai/v1/rerank | Rank documents by relevance to a query | Blocked | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | Depends on the input size | Count the number of tokens in the input request. | POST
Reader API | https://r.jina.ai | Convert URL to LLM-friendly text | 20 RPM | 200 RPM | 1,000 RPM | 4.6s | Count the number of tokens in the output response. | GET/POST
DeepSearch | https://deepsearch.jina.ai/v1/chat/completions | Reason, search and iterate to find the best answer | 2 RPM | 10 RPM | 30 RPM | 56.7s | Count the total number of tokens in the whole process. | POST
Reader API | https://s.jina.ai | Search the web and convert results to LLM-friendly text | Blocked | 40 RPM | 100 RPM | 8.7s | Count the number of tokens in the output response. | GET/POST
Reader API | https://g.jina.ai | Ground a statement with web knowledge | Blocked | 10 RPM | 30 RPM | 22.7s | Count the total number of tokens in the whole process. | GET/POST
Classifier API (Zero-shot) | https://api.jina.ai/v1/classify | Classify inputs using zero-shot classification | Blocked | 200 RPM & 500,000 TPM | 1,000 RPM & 3,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens + label_tokens | POST
Classifier API (Few-shot) | https://api.jina.ai/v1/classify | Classify inputs using a trained few-shot classifier | Blocked | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens | POST
Classifier API | https://api.jina.ai/v1/train | Train a classifier using labeled examples | Blocked | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens × num_iters | POST
Segmenter API | https://api.jina.ai/v1/segment | Tokenize and segment long text | 20 RPM | 200 RPM | 1,000 RPM | 0.3s | Tokens are not counted as usage. | GET/POST
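The three token-counting rules listed for the Classifier API can be written out as a small helper for estimating usage before sending a request. This is a sketch only: the function name `classifier_token_usage` and its operation labels are hypothetical, and how `input_tokens` and `label_tokens` are themselves measured is not specified here, so the caller is assumed to supply them.

```python
def classifier_token_usage(operation, input_tokens, label_tokens=0, num_iters=1):
    """Estimate counted tokens using the formulas from the rate limit table."""
    if operation == "zero-shot":
        # Zero-shot classification: input_tokens + label_tokens
        return input_tokens + label_tokens
    if operation == "few-shot":
        # Few-shot classification: input_tokens only
        return input_tokens
    if operation == "train":
        # Training: input_tokens multiplied by the number of iterations
        return input_tokens * num_iters
    raise ValueError(f"unknown operation: {operation!r}")


print(classifier_token_usage("zero-shot", 1_000, label_tokens=50))  # 1050
print(classifier_token_usage("few-shot", 1_000))                    # 1000
print(classifier_token_usage("train", 1_000, num_iters=10))         # 10000
```

Because training multiplies the input size by `num_iters`, a training request can consume a TPM budget far faster than a classification request carrying the same input.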
CC BY-NC License Self-Check
Are you using our official API or official images on Azure or AWS?
Yes
Are you using a paid API key or free trial key?
Are you using our official model images on AWS and Azure?
No
DeepSearch-related common questions
What is DeepSearch?
How is DeepSearch different from OpenAI and Gemini's deep research capabilities?
What API key do I need to use DeepSearch?
What happens when DeepSearch reaches its token budget? Does it return an incomplete answer?
Does DeepSearch guarantee accurate answers?
How long does a typical DeepSearch query take?
Can DeepSearch work with any OpenAI-compatible client like Chatwise, CherryStudio or ChatBox?
What are the rate limits for the API?
What is the content inside the <think> tag?
Does DeepSearch use Jina Reader for web search and reading?
Why does DeepSearch use so many tokens for my queries?
Is there a way to control or limit the number of steps?
How reliable are the references in the answers?
Can DeepSearch handle questions about future events?
Reader-related common questions
What are the costs associated with using the Reader API?
How does the Reader API function?
Is the Reader API open source?
What is the typical latency for the Reader API?
Why should I use the Reader API instead of scraping the page myself?
Does the Reader API support multiple languages?
What should I do if a website blocks the Reader API?
Can the Reader API extract content from PDF files?
Can the Reader API process media content from web pages?
Is it possible to use the Reader API on local HTML files?
Does the Reader API cache content?
Can I use the Reader API to access content behind a login?
Can I use the Reader API to access PDFs on arXiv?
How does image captioning work in Reader?
What is the scalability of the Reader? Can I use it in production?
What is the rate limit of the Reader API?
What is Reader-LM? How can I use it?
Reranker-related common questions
How much does the Reranker API cost?
What is the difference between the two rerankers?
Are Jina Rerankers open source?
Do the rerankers support multiple languages?
What is the maximum length for queries and documents?
What is the maximum number of documents I can rerank per query?
What is the batch size and how many query-document tuples can I send in one request?
What latency can I expect when reranking 100 documents?
Can your endpoints be hosted privately on AWS, Azure, or GCP?
Do you offer a fine-tuned reranker on domain-specific data?
Embeddings-related common questions
How were the jina-embeddings-v3 models trained?
What are the jina-clip models, and can I use them for text and image search?
Which languages do your models support?
What is the maximum length for a single sentence input?
What is the maximum number of sentences I can include in a single request?
How do I send images to the jina-clip models?
How do Jina Embeddings models compare to OpenAI's and Cohere's latest embeddings?
How seamless is the transition from OpenAI's text-embedding-3-large to your solution?
How are tokens calculated when using jina-clip models?
Do you provide models for embedding images or audio?
Can Jina Embedding models be fine-tuned with private or company data?
Can your endpoints be hosted privately on AWS, Azure, or GCP?
Classifier-related common questions
What's different about labels in zero-shot vs few-shot?
What's num_iters for and how should I use it?
How does public classifier sharing work?
How much data do I need for few-shot to work well?
Can it handle multiple languages and both text/images?
What are the hard limits I should know about?
How do I handle data changes over time?
What happens to my training data after I send it?
Zero-shot vs few-shot - when to use which?
Can I use different models for different languages/tasks?
Segmenter-related common questions
How much does the Segmenter API cost?
If I don't provide an API key, what is the rate limit?
If I provide an API key, what is the rate limit?
Will you charge the tokens from my API key?
Does the Segmenter API support multiple languages?
What is the difference between GET and POST requests?
What is the maximum length I can tokenize per request?
How does the chunking feature work? Is it semantic chunking?
How do you handle special tokens such as 'endoftext' in the Segmenter API?
Does chunking support languages other than English?
Auto Fine-Tuning-related common questions
How much does the Fine-tuning API cost?
What do I need to input? Do I need to provide training data?
How long does it take to fine-tune a model?
Where are the fine-tuned models stored?
If I provide a reference URL, how does the system use it?
Can I fine-tune a model for a specific language?
Can I fine-tune non-Jina embeddings, e.g., bge-M3?
How do you ensure the quality of the fine-tuned models?
How do you generate synthetic data?
Can I keep my fine-tuned models and synthetic data private?
How can I use the fine-tuned model?
I never received the email with the evaluation results. What should I do?
API-related common questions
Can I use the same API key for reader, embedding, reranking, classifying and fine-tuning APIs?
Can I monitor the token usage of my API key?
What should I do if I forget my API key?
Do API keys expire?
Can I transfer tokens between API keys?
Can I revoke my API key?
Why is the first request for some models slow?
Is user input data used for training your models?
Billing-related common questions
Is billing based on the number of sentences or requests?
Is there a free trial available for new users?
Are tokens charged for failed requests?
What payment methods are accepted?
Is invoicing available for token purchases?