Reader
Get LLM-friendly input from a URL or a web search by simply adding r.jina.ai in front.
Feeding web information into LLMs is an important step in grounding, yet it can be challenging. The simplest method is to scrape the webpage and feed in the raw HTML. However, scraping can be complex and is often blocked, and raw HTML is cluttered with extraneous elements such as markup and scripts. The Reader API addresses these issues by extracting the core content from a URL and converting it into clean, LLM-friendly text, giving your agent and RAG systems high-quality input.
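As a minimal sketch of the prefixing idea described above, the snippet below builds a Reader URL and fetches the converted text. The helper names `toReaderUrl` and `readPage` are hypothetical and chosen only for illustration; the sketch also assumes a runtime with a global `fetch` (for example Node.js 18 or later, or a browser).

```javascript
// Base URL of the Reader endpoint named in the text above.
const READER_BASE = 'https://r.jina.ai/';

// Hypothetical helper: prefix a target URL with the Reader endpoint.
function toReaderUrl(targetUrl) {
  // Parse first so a malformed URL fails here rather than at request time.
  const parsed = new URL(targetUrl);
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new TypeError(`Unsupported protocol: ${parsed.protocol}`);
  }
  return READER_BASE + targetUrl;
}

// Hypothetical helper: fetch the LLM-friendly text for a page.
async function readPage(targetUrl) {
  const response = await fetch(toReaderUrl(targetUrl));
  if (!response.ok) {
    throw new Error(`Reader request failed with status ${response.status}`);
  }
  return response.text();
}
```

Calling `readPage('https://example.com')` returns a promise for the page text, which can then be placed directly into an LLM prompt.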
LLMs have a knowledge cut-off, meaning they can't access the latest world knowledge. This leads to problems such as misinformation, outdated responses, hallucinations, and other factuality issues, which makes grounding essential for GenAI applications. Reader allows you to ground your LLM with the latest information from the web. Simply prepend https://s.jina.ai/ to your query, and Reader will search the web and return the top five results with their URLs and contents, each in clean, LLM-friendly text. This way, you can keep your LLM up to date, improve its factuality, and reduce hallucinations.
Images on the webpage are automatically captioned by a vision language model in Reader and formatted as image alt tags in the output. This gives your downstream LLM just enough hints to incorporate those images into its reasoning and summarizing processes. You can ask questions about the images, select specific ones, or even forward their URLs to a more powerful VLM for deeper analysis.
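To illustrate how a downstream step might pick out captioned images, here is a small sketch that collects captions and URLs from Reader output. It assumes the images appear in Markdown image syntax (`![caption](url)`); the page only says "image alt tags", so the exact output format is an assumption, and `extractImages` is a hypothetical helper name.

```javascript
// Assumption: images appear in the output as Markdown, e.g. ![caption](url).
function extractImages(markdown) {
  // Group 1 captures the alt text, group 2 the image URL.
  const pattern = /!\[([^\]]*)\]\(([^)\s]+)\)/g;
  const images = [];
  for (const match of markdown.matchAll(pattern)) {
    images.push({ caption: match[1], url: match[2] });
  }
  return images;
}
```

The resulting list can be filtered by caption before forwarding selected URLs to a more capable VLM.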
Reader natively supports PDF reading. It's compatible with most PDFs, including those with many images, and it's lightning fast! Combined with an LLM, you can easily build a ChatPDF or document analysis AI in no time.
The best part? It's free!
Reader API is available for free and offers flexible rate limits and pricing. Built on a scalable infrastructure, it offers high accessibility, concurrency, and reliability. We strive to be your preferred grounding solution for your LLMs.
Rate Limit
Product | API Endpoint | Description | Allowed Request | Without API Key (RPM) | With API Key (RPM) | With Premium API Key (RPM) | Average Latency (s) | Token Usage Counting |
---|---|---|---|---|---|---|---|---|
Reader API | https://r.jina.ai | Convert URL to LLM-friendly text | GET/POST | 20 | 200 | 1000 | 1.6 | Count the number of tokens in the output response. |
Reader API | https://s.jina.ai | Search the web and convert results to LLM-friendly text | GET/POST | 5 | 40 | 100 | 7.7 | Count the number of tokens in the output response. |
Segmenter API | https://segment.jina.ai | Tokenize and segment long text | GET/POST | 20 | 200 | 1000 | 0.3 | Tokens are not counted as usage. |
Embedding API | https://api.jina.ai/v1/embeddings | Convert text/images to fixed-length vectors | POST | Blocked | 60 | 300 | Depends on the input size | Count the number of tokens in the input request. |
Reranker API | https://api.jina.ai/v1/rerank | Rank documents by relevance to a query | POST | Blocked | 60 | 300 | Depends on the input size | Count the number of tokens in the input request. |
Don't panic! Every new API key contains one million free tokens!
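One way to stay within the requests-per-minute limits in the table is to space requests evenly on the client side. The sketch below copies the Reader rows from the table and derives a minimum delay between requests. How the service actually enforces its limit window is not described here, so even spacing is a conservative assumption, and `RPM_LIMITS`, `minIntervalMs`, and `runThrottled` are hypothetical names.

```javascript
// Requests-per-minute limits copied from the Reader rows of the table above.
const RPM_LIMITS = {
  'r.jina.ai': { noKey: 20, withKey: 200, premium: 1000 },
  's.jina.ai': { noKey: 5, withKey: 40, premium: 100 },
};

// Smallest delay between evenly spaced requests that stays within an RPM limit.
function minIntervalMs(rpm) {
  if (!Number.isFinite(rpm) || rpm <= 0) {
    throw new RangeError('rpm must be a positive number');
  }
  return Math.ceil(60000 / rpm);
}

// Hypothetical helper: run async tasks one at a time, spaced by the interval.
async function runThrottled(tasks, rpm) {
  const interval = minIntervalMs(rpm);
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    // Wait before every task except the first.
    if (i > 0) await new Promise((resolve) => setTimeout(resolve, interval));
    results.push(await tasks[i]());
  }
  return results;
}
```

For example, `minIntervalMs(RPM_LIMITS['r.jina.ai'].noKey)` gives 3000, meaning one keyless request to r.jina.ai every three seconds.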
Reader API
Basic Usage
Add https://r.jina.ai/ to the front of any URL in your code or tool where LLM access is expected. This returns the main content of the page in clean, LLM-friendly text.
Add https://s.jina.ai/ to the front of your query. This calls the search engine and returns the top five results with their URLs and contents, each in clean, LLM-friendly text.
Advanced Usage
Timeout
Target Selector
Wait For Selector
Gather All Links At the End
Gather All Images At the End
Use POST Method
JSON Response
Forward Cookie
Image Caption
Use a Proxy Server
Bypass the Cache
Stream Mode
Browser Locale
Local PDF/HTML file
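Several of the options listed above are typically set through HTTP request headers. The sketch below shows one way to assemble them. The header names (`X-Timeout`, `X-Target-Selector`, `X-Wait-For-Selector`, `X-No-Cache`, `Accept: application/json`, and the `Authorization: Bearer` scheme) are not given on this page; they are assumptions based on Reader's public documentation and should be verified before use. `buildReaderHeaders` and its option names are hypothetical.

```javascript
// Assumption: these header names come from Reader's public documentation,
// not from this page. Confirm each one before relying on it.
function buildReaderHeaders(options = {}) {
  const headers = {};
  // API key, sent as a bearer token.
  if (options.apiKey) headers['Authorization'] = `Bearer ${options.apiKey}`;
  // "JSON Response": ask for structured JSON instead of plain text.
  if (options.json) headers['Accept'] = 'application/json';
  // "Timeout": maximum time to wait for the page, in seconds.
  if (options.timeoutSeconds != null) headers['X-Timeout'] = String(options.timeoutSeconds);
  // "Target Selector": CSS selector for the part of the page to extract.
  if (options.targetSelector) headers['X-Target-Selector'] = options.targetSelector;
  // "Wait For Selector": CSS selector that must appear before extraction.
  if (options.waitForSelector) headers['X-Wait-For-Selector'] = options.waitForSelector;
  // "Bypass the Cache": request a fresh fetch of the page.
  if (options.noCache) headers['X-No-Cache'] = 'true';
  return headers;
}
```

The returned object can be passed as the `headers` field of a `fetch` call against https://r.jina.ai/.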
Request (Bash)
curl 'https://r.jina.ai/https://example.com'
Request (JavaScript)
fetch('https://r.jina.ai/https://example.com', {
  method: 'GET',
})
  .then((response) => response.text())
  .then((text) => console.log(text));
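The search endpoint described under Basic Usage can be called in much the same way. The sketch below is one possible approach: it percent-encodes the query before appending it to https://s.jina.ai/, which is an assumption made here so that spaces and punctuation survive as part of the URL. `toSearchUrl` and `searchWeb` are hypothetical helper names, and a global `fetch` is assumed.

```javascript
// Base URL of the search endpoint named in the text above.
const SEARCH_BASE = 'https://s.jina.ai/';

// Hypothetical helper: turn a free-text query into a search request URL.
function toSearchUrl(query) {
  const trimmed = query.trim();
  if (trimmed === '') {
    throw new TypeError('Query must not be empty');
  }
  // Assumption: percent-encode the query so it is a valid URL path.
  return SEARCH_BASE + encodeURIComponent(trimmed);
}

// Hypothetical helper: fetch the search results as LLM-friendly text.
async function searchWeb(query) {
  const response = await fetch(toSearchUrl(query), { method: 'GET' });
  if (!response.ok) {
    throw new Error(`Search request failed with status ${response.status}`);
  }
  return response.text();
}
```

Search requests have a lower rate limit and higher average latency than single-page reads, so it may be worth caching results for repeated queries.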
API Pricing
Our API pricing is based on the number of tokens sent in requests. For the Reader API, it is instead based on the number of tokens in the responses. This pricing model applies to all products in Jina AI's search foundation: the Embedding, Reranking, Reader, and Auto Fine-Tuning APIs. The same API key gives you access to all API services.
Reader-related common questions
What are the costs associated with using the Reader API?
How does the Reader API function?
Is the Reader API open source?
What is the typical latency for the Reader API?
Why should I use the Reader API instead of scraping the page myself?
Does the Reader API support multiple languages?
What should I do if a website blocks the Reader API?
Can the Reader API extract content from PDF files?
Can the Reader API process media content from web pages?
Is it possible to use the Reader API on local HTML files?
Does Reader API cache the content?
Can I use the Reader API to access content behind a login?
Can I use the Reader API to access PDFs on arXiv?
How does image caption work in Reader?
What is the scalability of the Reader? Can I use it in production?
What is the rate limit of the Reader API?
API-related common questions
Can I use the same API key for embedding, reranking, reader, fine-tuning APIs?
Can I monitor the token usage of my API key?
What should I do if I forget my API key?
Do API keys expire?
Why is the first request for some models slow?
Is user input data used for training your models?
Billing-related common questions
Is billing based on the number of sentences or requests?
Is there a free trial available for new users?
Are tokens charged for failed requests?
What payment methods are accepted?
Is invoicing available for token purchases?