A list of messages between the user and the assistant comprising the conversation so far.
Streaming
If true, the response is a stream of server-sent events emitted while the run executes, terminating with a data: [DONE] message when the run reaches a terminal state.
Reasoning Effort
Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Defaults to medium.
Request
Bash
curl https://deepsearch.jina.ai/v1/chat/completions \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "Hi!"
    },
    {
      "role": "assistant",
      "content": "Hi, how can I help you?"
    },
    {
      "role": "user",
      "content": "what's the latest blog post from jina ai?"
    }
  ],
  "stream": true,
  "reasoning_effort": "medium"
}
EOF
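With "stream": true, the response arrives as server-sent events. The sketch below (plain Python over a simulated input; a real client would iterate over the HTTP response body) shows the general shape: read each data: line, decode its payload as JSON, and stop at the terminal data: [DONE] message.

```python
import json

def read_sse_events(lines):
    """Parse server-sent-event lines from a streaming chat completion.

    Yields decoded JSON chunks and stops at the terminal `data: [DONE]` message.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # terminal state: stop consuming the stream
        yield json.loads(payload)

# Simulated stream, shaped like OpenAI-style chat completion chunks.
sample = [
    'data: {"choices": [{"delta": {"content": "<think>searching"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "</think>Answer"}}]}',
    'data: [DONE]',
]
chunks = list(read_sse_events(sample))
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
```

The chunk shapes above are illustrative; consult the actual streamed payloads for the exact fields.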
Vibe check with a simple chat UI. DeepSearch is best for complex questions that require iterative reasoning, world knowledge, or up-to-date information.
For the best experience, we recommend using professional chat clients. DeepSearch is fully compatible with OpenAI's Chat API schema, making it easy to use with any OpenAI-compatible client.
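Because DeepSearch follows the OpenAI Chat API schema, pointing any OpenAI-compatible SDK at https://deepsearch.jina.ai/v1 with model name jina-deepsearch-v1 is usually all that is needed. As a minimal illustration with only the standard library, the sketch below builds the same request the curl example sends (the API key value is a placeholder):

```python
import json

# Placeholder key for illustration; substitute your real Jina API key.
API_KEY = "jina_xxx"

def deepsearch_request(messages, stream=True, reasoning_effort="medium"):
    """Build an OpenAI-style Chat Completions request for DeepSearch.

    Returns the URL, headers, and JSON body; the same payload works with
    any OpenAI-compatible client pointed at https://deepsearch.jina.ai/v1.
    """
    url = "https://deepsearch.jina.ai/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = {
        "model": "jina-deepsearch-v1",
        "messages": messages,
        "stream": stream,
        "reasoning_effort": reasoning_effort,
    }
    return url, headers, json.dumps(body)

url, headers, body = deepsearch_request(
    [{"role": "user", "content": "what's the latest blog post from jina ai?"}]
)
```

With the official openai Python SDK, the equivalent is passing base_url="https://deepsearch.jina.ai/v1" and your Jina API key to the client constructor.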
DeepSearch combines web searching, reading, and reasoning for comprehensive investigation. Think of it as an agent you give a research task to: it searches extensively and works through multiple iterations before providing an answer. This process involves continuous research, reasoning, and approaching the problem from various angles. This is fundamentally different from standard LLMs that generate answers directly from pretrained data, and from traditional RAG systems that rely on one-time, surface-level searches.
Standard LLMs
Cost: about 1,000 tokens
Latency: about 1s
Best for: quick answers to general knowledge questions
Limitation: cannot access real-time or post-training information
Answers are generated purely from pretrained knowledge with a fixed cutoff date.

RAG and Grounded LLMs
Cost: about 10,000 tokens
Latency: about 3s
Best for: questions requiring current or domain-specific information
Limitation: struggles with complex questions requiring multi-hop reasoning
Answers are generated by summarizing single-pass search results; can access current information beyond the training cutoff.

DeepSearch
Cost: about 500,000 tokens
Latency: about 50s
Best for: complex questions requiring thorough research and reasoning
Trade-off: takes longer than simple LLM or RAG approaches
An autonomous agent that iteratively searches, reads, and reasons; it dynamically decides next steps based on current findings, self-evaluates answer quality before returning results, and can perform deep dives into topics through multiple search and reasoning cycles.
API pricing is based on the token usage. One API key gives you access to all search foundation products.
With Jina Search Foundation API
The easiest way to access all of our products. Top up tokens as you go.
Depending on your location, you may be charged in USD, EUR, or other currencies. Taxes may apply.
Understand the rate limit
Rate limits are the maximum number of requests that can be made to an API within a minute per IP address/API key (RPM). Find out more about the rate limits for each product and tier below.
Rate Limit
Rate limits are tracked in two ways: RPM (requests per minute) and TPM (tokens per minute), enforced per IP address or per API key. A request is rejected as soon as either threshold (RPM or TPM) is hit. Note that when an API key is provided in the request, rate limits are tracked per key, not per IP address.
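As an illustration of this dual-threshold behavior (our sketch of the described semantics, not Jina's actual implementation), a limiter can track both counters over a sliding one-minute window and reject a request as soon as either budget would be exceeded:

```python
import time
from collections import deque

class DualRateLimiter:
    """Sketch of dual RPM/TPM limiting: a request is rejected as soon as
    EITHER the requests-per-minute or the tokens-per-minute budget, tracked
    over a sliding one-minute window, would be exceeded."""

    def __init__(self, rpm, tpm):
        self.rpm, self.tpm = rpm, tpm
        self.events = deque()  # (timestamp, tokens) per accepted request

    def allow(self, tokens, now=None):
        now = time.monotonic() if now is None else now
        # Drop accepted requests that fell out of the one-minute window.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()
        if len(self.events) + 1 > self.rpm:
            return False  # RPM threshold hit first
        if sum(t for _, t in self.events) + tokens > self.tpm:
            return False  # TPM threshold hit first
        self.events.append((now, tokens))
        return True

# Four 400-token requests against a 3 RPM / 1,000 TPM budget:
limiter = DualRateLimiter(rpm=3, tpm=1000)
results = [limiter.allow(400, now=0.0) for _ in range(4)]
```

Here the third request is refused by the TPM budget even though the RPM budget still has headroom, which is exactly the "whichever threshold is hit first" behavior.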
Embedding API (POST https://api.jina.ai/v1/embeddings)
Convert text/images to fixed-length vectors
Rate limits: not available w/o API key; 500 RPM & 1,000,000 TPM w/ API key; 2,000 RPM & 5,000,000 TPM w/ premium API key
Average latency: depends on the input size
Token usage: count the number of tokens in the input request

Reranker API (POST https://api.jina.ai/v1/rerank)
Rank documents by relevance to a query
Rate limits: not available w/o API key; 500 RPM & 1,000,000 TPM w/ API key; 2,000 RPM & 5,000,000 TPM w/ premium API key
Average latency: depends on the input size
Token usage: count the number of tokens in the input request

Reader API (GET/POST https://r.jina.ai)
Convert URL to LLM-friendly text
Rate limits: 20 RPM w/o API key; 200 RPM w/ API key; 1,000 RPM w/ premium API key
Average latency: 4.6s
Token usage: count the number of tokens in the output response

DeepSearch (POST https://deepsearch.jina.ai/v1/chat/completions)
Reason, search and iterate to find the best answer
Rate limits: not available w/o API key; 10 RPM w/ API key; 30 RPM w/ premium API key
Average latency: 56.7s
Token usage: count the total number of tokens in the whole process

Reader API (GET/POST https://s.jina.ai)
Search the web and convert results to LLM-friendly text
Rate limits: not available w/o API key; 40 RPM w/ API key; 100 RPM w/ premium API key
Average latency: 8.7s
Token usage: count the number of tokens in the output response

Reader API (GET/POST https://g.jina.ai)
Ground a statement with web knowledge
Rate limits: not available w/o API key; 10 RPM w/ API key; 30 RPM w/ premium API key
Average latency: 22.7s
Token usage: count the total number of tokens in the whole process

Classifier API (Zero-shot) (POST https://api.jina.ai/v1/classify)
Classify inputs using zero-shot classification
Rate limits: not available w/o API key; 200 RPM & 500,000 TPM w/ API key; 1,000 RPM & 3,000,000 TPM w/ premium API key
Average latency: depends on the input size
Token usage: input_tokens + label_tokens

Classifier API (Few-shot) (POST https://api.jina.ai/v1/classify)
Classify inputs using a trained few-shot classifier
Rate limits: not available w/o API key; 20 RPM & 200,000 TPM w/ API key; 60 RPM & 1,000,000 TPM w/ premium API key
Average latency: depends on the input size
Token usage: input_tokens

Classifier API (Training) (POST https://api.jina.ai/v1/train)
Train a classifier using labeled examples
Rate limits: not available w/o API key; 20 RPM & 200,000 TPM w/ API key; 60 RPM & 1,000,000 TPM w/ premium API key
Average latency: depends on the input size
Token usage: input_tokens × num_iters

Segmenter API (GET/POST https://api.jina.ai/v1/segment)
Tokenize and segment long text
Rate limits: 20 RPM w/o API key; 200 RPM w/ API key; 1,000 RPM w/ premium API key
Average latency: 0.3s
Token usage: tokens are not counted as usage
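The per-product token-counting rules above can be summarized in a small sketch (the function and mode names are illustrative, not official billing code):

```python
def billed_tokens(mode, input_tokens, label_tokens=0, num_iters=1):
    """Illustrative sketch of the token-usage counting rules listed above."""
    if mode == "classify_zero_shot":
        return input_tokens + label_tokens   # input_tokens + label_tokens
    if mode == "classify_few_shot":
        return input_tokens                  # input_tokens only
    if mode == "train":
        return input_tokens * num_iters      # input_tokens × num_iters
    if mode == "segment":
        return 0                             # Segmenter tokens are not billed
    raise ValueError(f"unknown mode: {mode}")

# e.g. training a classifier on 10,000 input tokens for 5 iterations
cost = billed_tokens("train", 10_000, num_iters=5)
```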
Auto-Recharge for Low Token Balance
Recommended for uninterrupted service in production. When your token balance drops below the set threshold, we automatically charge your saved payment method for the last purchased package, repeating until the balance meets the threshold again.
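The recharge rule can be sketched as follows (assumed semantics based on the description above, not actual billing code):

```python
def maybe_auto_recharge(balance, threshold, last_package_tokens):
    """Sketch of the auto-recharge rule: whenever the balance falls below
    the threshold, recharge the last purchased package, repeating until
    the balance meets the threshold again."""
    recharges = 0
    while balance < threshold:
        balance += last_package_tokens
        recharges += 1
    return balance, recharges

# e.g. 100k tokens left, 1M threshold, last package was 500k tokens
balance, recharges = maybe_auto_recharge(100_000, 1_000_000, 500_000)
```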
DeepSearch is an LLM API that performs iterative search, reading, and reasoning until it finds an accurate answer to a query or reaches its token budget limit.
How is DeepSearch different from OpenAI and Gemini's deep research capabilities?
Unlike OpenAI and Gemini, DeepSearch specifically focuses on delivering accurate answers through iteration rather than generating long-form articles. It's optimized for quick, precise answers from deep web search rather than creating comprehensive reports.
What API key do I need to use DeepSearch?
You need a Jina API key. We offer 1M free tokens for new API keys.
What happens when DeepSearch reaches its token budget? Does it return an incomplete answer?
It generates a final answer based on all accumulated knowledge, rather than just giving up or returning an incomplete response.
Does DeepSearch guarantee accurate answers?
No. While it uses an iterative search process to improve accuracy, the evaluation shows it achieves a 75% pass rate on test questions, significantly better than the 0% baseline (gemini-2.0-flash) but not perfect.
How long does a typical DeepSearch query take?
It varies significantly: queries can take anywhere from 1 to 42 steps, with an average of 4 steps (about 20 seconds) based on evaluation data. Simple queries might resolve quickly, while complex research questions can involve many iterations and take up to 120 seconds.
Can DeepSearch work with any OpenAI-compatible client like Chatwise, CherryStudio or ChatBox?
Yes, the official DeepSearch API at deepsearch.jina.ai/v1/chat/completions is fully compatible with the OpenAI API schema, using 'jina-deepsearch-v1' as the model name. Therefore it is super easy to switch from OpenAI to DeepSearch and use with local clients or any OpenAI-compatible client. We highly recommend Chatwise for a seamless experience.
What are the rate limits for the API?
Rate limits vary by API key tier, ranging from 10 RPM to 30 RPM. This is important to consider for applications with high query volumes.
What is the content inside the <think> tag?
DeepSearch wraps its thinking steps in <think>...</think> tags and provides the final answer afterward, following the OpenAI streaming format but with these special markers for the chain of thought.
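A client can separate the chain of thought from the final answer with a small parser. This sketch assumes the full response text has already been assembled from the stream:

```python
import re

def split_think(answer_text):
    """Separate chain-of-thought wrapped in <think>...</think> from the
    final answer that follows it (a sketch of the format described above)."""
    m = re.match(r"(?s)\s*<think>(.*?)</think>\s*(.*)", answer_text)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return "", answer_text.strip()  # no thinking block present

thinking, final = split_think("<think>step 1: search jina.ai</think>Here is the answer.")
```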
Does DeepSearch use Jina Reader for web search and reading?
Yes. Jina Reader is used for web search and reading, providing the system with the ability to efficiently access and process web content.
Why does DeepSearch use so many tokens for my queries?
The token usage of DeepSearch on complex queries is admittedly high, averaging 70,000 tokens compared to about 500 for basic LLM responses. This reflects the depth of research but also has cost implications.
Is there a way to control or limit the number of steps?
The system is primarily controlled by token budget rather than step count. Once the token budget is exceeded, it enters Beast Mode for final answer generation. Check reasoning_effort for more details.
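The budget-then-Beast-Mode behavior can be sketched as a simple loop (step costs and names here are hypothetical, for illustration only):

```python
def research_loop(step_costs, token_budget):
    """Sketch of budget-controlled iteration: keep taking research steps
    until the token budget would be exceeded, then switch to a final
    answer-generation pass ("Beast Mode")."""
    used = 0
    for cost in step_costs:
        if used + cost > token_budget:
            return used, "beast_mode"  # budget exhausted: answer now
        used += cost
    return used, "normal_answer"  # finished within budget

# Hypothetical per-step token costs against a 100k budget:
used, mode = research_loop([30_000, 40_000, 50_000], token_budget=100_000)
```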
How reliable are the references in the answers?
References are considered so important that if an answer is deemed definitive but lacks references, the system continues searching rather than accepting the answer.
Can DeepSearch handle questions about future events?
Yes, but with extensive research steps. The example of 'who will be president in 2028' shows it can handle speculative questions through multiple research iterations, though accuracy isn't guaranteed for such predictions.
Can I use the same API key for reader, embedding, reranking, classifying and fine-tuning APIs?
Yes, the same API key is valid for all search foundation products from Jina AI. This includes the reader, embedding, reranking, classifying and fine-tuning APIs, with tokens shared across all services.
Can I monitor the token usage of my API key?
Yes, token usage can be monitored in the 'API Key & Billing' tab by entering your API key, allowing you to view the recent usage history and remaining tokens. If you have logged in to the API dashboard, these details can also be viewed in the 'Manage API Key' tab.
What should I do if I forget my API key?
If you have misplaced a topped-up key and wish to retrieve it, please contact support AT jina.ai with your registered email for assistance. It's recommended to log in to keep your API key securely stored and easily accessible.
Do API keys expire?
No, our API keys do not have an expiration date. However, if you suspect your key has been compromised and wish to retire it, please contact our support team for assistance. You can also revoke your key in the API Key Management dashboard.
Can I transfer tokens from one API key to another?
Yes, you can transfer tokens from a premium key to another key. After logging into your account on the API Key Management dashboard, open the settings of the key you want to transfer from and move all remaining paid tokens.
Can I revoke my API key?
Yes, you can revoke your API key if you believe it has been compromised. Revoking a key will immediately disable it for all users who have stored it, and all remaining balance and associated properties will be permanently unusable. If the key is a premium key, you have the option to transfer the remaining paid balance to another key before revocation. Note that this action cannot be undone. To revoke a key, go to the key settings in the API Key Management dashboard.
Why is the first request for some models slow?
This is because our serverless architecture offloads certain models during periods of low usage. The initial request activates or 'warms up' the model, which may take a few seconds. After this initial activation, subsequent requests process much more quickly.
Is user input data used for training your models?
We adhere to a strict privacy policy and do not use user input data for training our models. We are also SOC 2 Type I and Type II compliant, ensuring high standards of security and privacy.
Billing-related common questions
Is billing based on the number of sentences or requests?
Our pricing model is based on the total number of tokens processed, allowing users the flexibility to allocate these tokens across any number of sentences, offering a cost-effective solution for diverse text analysis requirements.
Is there a free trial available for new users?
We offer new users a free trial that includes one million tokens for use with any of our models, provided via an auto-generated API key. Once the free tokens are used up, you can purchase additional tokens for your API key via the 'Buy tokens' tab.
Are tokens charged for failed requests?
No, tokens are not deducted for failed requests.
What payment methods are accepted?
Payments are processed through Stripe, supporting a variety of payment methods including credit cards, Google Pay, and PayPal for your convenience.
Is invoicing available for token purchases?
Yes, an invoice will be issued to the email address associated with your Stripe account upon the purchase of tokens.
Offices
Sunnyvale, CA
710 Lakeway Dr, Ste 200, Sunnyvale, CA 94085, USA
Berlin, Germany (HQ)
Prinzessinnenstraße 19-20, 10969 Berlin, Germany
Beijing, China
Level 5, Building 6, No.48 Haidian West St. Beijing, China
Shenzhen, China
402 Floor 4, Fu'an Technology Building, Shenzhen, China