Note: this model is deprecated and has been superseded by newer models.

jina-reranker-v1-turbo-en

The best combination of fast inference speed and accurate relevance scores
Release Post
License: Apache-2.0
Release Date: 2024-04-18
Input: Text (Query), Text (Document)
Output: Rankings
Model Details
Parameters: 37.8M
Input Token Length: 8K
Language Support
🇺🇸 English
Related Models
jina-reranker-v1-base-en
jina-reranker-v1-tiny-en
Tags
high-speed
memory-efficient
english
production-ready
reranker
rag-optimized
high-performance
cost-effective
Available via
Jina API, AWS SageMaker, Microsoft Azure, Hugging Face

Overview

Jina Reranker v1 Turbo English addresses a critical challenge in production search systems: the trade-off between result quality and computational efficiency. While traditional rerankers offer improved search accuracy, their computational demands often make them impractical for real-time applications. This model breaks that barrier by delivering 95% of the base model's accuracy while processing documents three times faster and using 75% less memory. For organizations struggling with search latency or computational costs, this model offers a compelling solution that maintains high-quality search refinement while significantly reducing infrastructure requirements and operational costs.

Methods

The model achieves its efficiency through an innovative six-layer architecture that compresses the sophisticated reranking capabilities of its larger counterpart into just 37.8 million parameters—a dramatic reduction from the base model's 137 million. This streamlined design employs knowledge distillation, where the larger base model acts as a teacher, training the turbo variant to match its behavior while using fewer resources. The architecture maintains the core BERT-based cross-attention mechanism for token-level interactions between queries and documents, but optimizes it for speed through reduced layer count and efficient parameter allocation. The model supports sequences up to 8,192 tokens, enabling comprehensive document analysis while maintaining fast inference speeds through sophisticated optimization techniques.
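As a rough sketch of the cross-encoder setup described above, the snippet below scores each (query, document) pair jointly so the BERT-style cross-attention layers can model token-level interactions up to the 8,192-token limit. It assumes the checkpoint is published on Hugging Face as jinaai/jina-reranker-v1-turbo-en with a single-logit sequence-classification head; consult the model card for the exact repository name and loading flags.

```python
# Minimal sketch of cross-encoder reranking with the Hugging Face checkpoint.
# Assumptions: the model is available as "jinaai/jina-reranker-v1-turbo-en" and
# exposes a single-logit sequence-classification head for (query, document) pairs.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "jinaai/jina-reranker-v1-turbo-en"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, trust_remote_code=True)
model.eval()

query = "how to reduce search latency in production"
documents = [
    "Rerankers improve precision but add latency to the search pipeline.",
    "Our cafeteria menu changes every Tuesday.",
]

# Tokenize each (query, document) pair jointly so cross-attention can compare
# query and document tokens directly.
inputs = tokenizer(
    [query] * len(documents),
    documents,
    padding=True,
    truncation=True,
    max_length=8192,
    return_tensors="pt",
)
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)

# Higher score means more relevant; sort documents by score.
ranked = sorted(zip(documents, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.4f}  {doc}")
```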

Performance

In comprehensive benchmarks, the turbo variant demonstrates remarkable efficiency without significant accuracy trade-offs. On the BEIR benchmark, it achieves an nDCG@10 score of 49.60, retaining 95% of the base model's performance (52.45) while outperforming many larger competitors such as bge-reranker-base (47.89, 278M parameters). In RAG applications, it maintains an impressive 83.51% hit rate and 0.6498 MRR, showing particular strength in practical retrieval tasks. The speed improvements are even more striking: it processes documents three times faster than the base model, with throughput scaling nearly linearly with the reduced parameter count. However, users should note slightly lower performance on extremely nuanced ranking tasks, where the full parameter count of larger models provides marginal advantages.
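For readers unfamiliar with the metrics quoted above, the sketch below shows one common formulation of nDCG@k and reciprocal rank (averaged over all queries, the latter gives MRR). It is illustrative only and is not the benchmark harness behind the reported numbers.

```python
# Illustrative metric definitions (one common formulation, per query).
import math

def ndcg_at_k(relevances, k=10):
    """nDCG@k for a ranked list of graded relevance labels."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def reciprocal_rank(ranked_is_relevant):
    """1 / rank of the first relevant result (0 if none); averaging gives MRR."""
    for i, is_rel in enumerate(ranked_is_relevant, start=1):
        if is_rel:
            return 1.0 / i
    return 0.0

# Example: a reranked list with graded relevance labels from a test collection.
print(ndcg_at_k([3, 2, 0, 1, 0], k=10))    # ~0.985
print(reciprocal_rank([False, True, False]))  # 0.5
```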

Best Practice

The model requires CUDA-capable hardware for optimal performance and can be deployed through AWS SageMaker or accessed via API endpoints. For production deployments, organizations should implement a two-stage pipeline where vector search provides initial candidates for reranking. While the model supports 8,192 tokens, users should consider the latency impact of longer sequences—processing time increases with document length. The sweet spot for most applications is reranking 100-200 candidates per query, which balances quality and speed. The model is specifically optimized for English content and may not perform optimally on multilingual documents. Memory requirements are significantly lower than the base model, typically requiring only 150MB of GPU memory compared to 550MB, making it suitable for deployment on smaller instances and enabling significant cost savings in cloud environments.
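As a concrete sketch of the two-stage pipeline recommended above: first-stage vector search narrows the corpus to roughly 100-200 candidates, which are then sent to the reranker over the hosted API. The endpoint URL, payload fields, and response shape below are assumed from Jina's public rerank API and should be verified against the current API documentation.

```python
# Second stage of a retrieve-then-rerank pipeline: vector search (not shown)
# supplies ~100-200 candidates, which are reranked via the hosted API.
# Endpoint and payload shape are assumptions; check the API docs, and set
# JINA_API_KEY in your environment before running.
import os
import requests

def rerank(query: str, candidates: list[str], top_n: int = 10) -> list[dict]:
    resp = requests.post(
        "https://api.jina.ai/v1/rerank",  # assumed endpoint
        headers={
            "Authorization": f"Bearer {os.environ['JINA_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "jina-reranker-v1-turbo-en",
            "query": query,
            "documents": candidates,
            "top_n": top_n,
        },
        timeout=30,
    )
    resp.raise_for_status()
    # Each result is expected to carry the candidate index and a relevance score.
    return resp.json()["results"]

candidates = ["candidate doc from vector search #1", "candidate doc from vector search #2"]
for r in rerank("reduce search latency", candidates, top_n=2):
    print(r["index"], r.get("relevance_score"))
```
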
Blogs that mention this model
April 18, 2024 • 7 minutes read
Smaller, Faster, Cheaper: Introducing Jina Rerankers Turbo and Tiny
Jina AI announces new reranker models: Jina Rerankers Turbo (jina-reranker-v1-turbo-en) and Tiny (jina-reranker-v1-tiny-en), now available on AWS SageMaker and Hugging Face, offering faster, memory-efficient, high-performance reranking.
Yuting Zhang
Scott Martens
May 07, 2024 • 12 minutes read
When AI Makes AI: Synthetic Data, Model Distillation, And Model Collapse
AI creating AI! Is it the end of the world? Or just another tool to make models do value-adding work? Let’s find out!
Scott Martens
April 29, 2024 • 7 minutes read
Jina Embeddings and Reranker on Azure: Scalable Business-Ready AI Solutions
Jina Embeddings and Rerankers are now available on Azure Marketplace. Enterprises that prioritize privacy and security can now easily integrate Jina AI's state-of-the-art models right in their existing Azure ecosystem.
Susana Guzmán