
jina-code-embeddings-1.5b

Efficient code embeddings from code generation models
License: CC-BY-NC-4.0
Release Date: 2025-09-01
Input: Text (Code)
Output: Vector
Matryoshka Dimensions: 128, 256, 512, 1024, 1536
Late Chunking: No
Model Details
Parameters: 1.5B
Input Token Length: 32K
Output Dimension: 1536
Base Model: Qwen2.5-Coder-1.5B
Trained Languages: 1 language
Supported Languages: 29 languages
Quantizations: GGUF
Related Models
jina-code-embeddings-0.5b
jina-embeddings-v2-base-code
Supported Tasks
NL→Code
Tech QA
Code→Code
Code→NL
Completion
Tags
code-embeddings
programming-languages
semantic-code-search
code-similarity
long-context
text-embeddings
multilingual-code
docstring-search
Available via
Jina API, AWS SageMaker, Microsoft Azure, Google Cloud, Hugging Face
I/O graph: Code + Task → jina-code-embeddings → Vector
Publications (1)
NeurIPS 2025
August 31, 2025
Efficient Code Embeddings from Code Generation Models

Overview

jina-code-embeddings-1.5b is a 1.54 billion parameter embedding model for code retrieval. It is built on the Qwen2.5-Coder-1.5B backbone with last-token pooling and, rather than relying only on limited aligned training data, it leverages large unaligned code and documentation corpora. The model uses task-specific instructions across five categories (NL2Code, TechQA, Code2Code, Code2NL, and Code2Completion), each with distinct prefixes for queries and documents. It supports Matryoshka representation learning, so embeddings can be truncated to smaller dimensions. Despite being larger than the 0.5B variant, it remains practical to deploy while achieving benchmark performance competitive with substantially larger alternatives.
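To make the two mechanisms above concrete, here is a minimal NumPy sketch of last-token pooling and Matryoshka truncation. It is an illustration under stated assumptions, not the model's actual implementation: the random array stands in for the backbone's hidden states, right padding is assumed, and the function names `last_token_pool` and `truncate_matryoshka` are hypothetical.

```python
import numpy as np


def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Return the hidden state of the last non-padding token of each sequence.

    hidden_states: (batch, seq_len, dim)
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    Assumes right padding (real tokens first), which is an assumption of this sketch.
    """
    last_index = attention_mask.sum(axis=1) - 1        # index of the last real token
    batch_index = np.arange(hidden_states.shape[0])
    return hidden_states[batch_index, last_index]


def truncate_matryoshka(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize each row to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms


# Random stand-in for the backbone output: 2 sequences, 5 positions, 1536 dims.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 5, 1536))
mask = np.array([[1, 1, 1, 0, 0],    # 3 real tokens, 2 padding
                 [1, 1, 1, 1, 1]])   # 5 real tokens

pooled = last_token_pool(hidden, mask)
small = truncate_matryoshka(pooled, 256)
print(pooled.shape, small.shape)  # (2, 1536) (2, 256)
```

Re-normalizing after truncation keeps dot products equal to cosine similarities at the smaller dimension.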

Methods

The model is trained contrastively with an InfoNCE loss, using temperature τ=0.05, batch size 256 (adjusted for memory efficiency), and sequence length 512. Training for 1500 steps on four A100 GPUs took 12 hours. The training data includes MTEB splits, CoSQA+, CodeSearchNet, CommitPackFT, and GPT-4o synthetic data for underrepresented scenarios such as framework translations. Task-specific prefixes tell the model which kind of retrieval is intended; for example, Code2Code queries use 'Find an equivalent code snippet given the following code snippet:'. An ablation confirmed that last-token pooling is the superior pooling method. Contrastive learning multiplies the training signal because every query-document combination in a batch is used as either a positive or a negative pair.
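The loss described above can be sketched in a few lines of NumPy. This is a minimal, one-directional (query to document) InfoNCE with in-batch negatives and the stated temperature of 0.05; whether the actual training used a symmetric variant or additional hard negatives is not stated here, and the function name `info_nce_loss` is hypothetical. With a batch of 256 pairs, each step scores 256 positives against 256 × 255 = 65,280 in-batch negatives.

```python
import numpy as np


def info_nce_loss(queries: np.ndarray, documents: np.ndarray, temperature: float = 0.05) -> float:
    """InfoNCE with in-batch negatives.

    Row i of `documents` is the positive for row i of `queries`;
    every other row in the batch acts as a negative.
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = documents / np.linalg.norm(documents, axis=1, keepdims=True)
    logits = (q @ d.T) / temperature                       # (batch, batch) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)    # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))             # cross-entropy on the diagonal


# Toy check: perfectly aligned pairs give a loss near zero,
# while misaligned pairs are penalized heavily.
aligned = np.eye(4)
misaligned = np.roll(aligned, 1, axis=0)
print(info_nce_loss(aligned, aligned) < info_nce_loss(aligned, misaligned))  # True
```

The low temperature sharpens the softmax, so a small cosine gap between the positive and a negative already produces a large change in the loss.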

Performance

The model achieves a 79.04% overall average and a 78.94% MTEB Code average, setting a new bar for its parameter class. Its highest scores include 98.41% on HumanEval, 90.13% on MBPP, 98.02% on WikiSQL, and 99.44% on CodeChefXLang. For code-to-code retrieval it scores 92.54% on CodeTransOceanContest. For NL2Code it reaches 86.45% on COIR-CodeSearchNet and 96.34% on Doc2Code, and for technical Q&A it achieves 92.37% on StackOverflowQA. It surpasses larger alternatives and improves consistently over the 0.5B variant, particularly on complex tasks such as SWE-Bench (86.33% vs 83.00%).

Best Practice

Choose the instruction prefix that matches the retrieval task and apply it consistently across the pipeline. The model's larger capacity suits complex scenarios involving multiple programming paradigms and extensive codebases. Profile your use case to find the Matryoshka dimension that best balances quality against resources. Use batch size 256 in production to align with training. The model is well suited to cross-repository and cross-language search, given its 99.44% score on CodeChefXLang. It can serve as the primary retrieval component in RAG systems; consider deriving confidence scores from embedding similarities. It fits enterprise deployments that need both performance and efficiency with sub-second latency. Cache frequently requested embeddings and use hierarchical indexing for speed.
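A few of these recommendations (consistent prefixes, cached embeddings, similarity scores as a confidence signal) can be sketched together. Everything model-specific below is a stand-in: `fake_embed` is a hypothetical placeholder that returns a deterministic pseudo-random unit vector instead of calling the model, the document-side prefix is a placeholder because only the Code2Code query prefix is quoted on this page, and joining prefix and text with a newline is an assumption.

```python
import hashlib
from functools import lru_cache

import numpy as np

# Code2Code query prefix, as quoted in the Methods section.
CODE2CODE_QUERY_PREFIX = "Find an equivalent code snippet given the following code snippet:"
# Placeholder only: the real document-side prefix is not given on this page.
CODE2CODE_DOCUMENT_PREFIX = "<code2code document prefix>"


def fake_embed(text: str, dim: int = 256) -> np.ndarray:
    """Hypothetical stand-in for a real model call: deterministic unit vector per text."""
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
    vec = np.random.default_rng(seed).normal(size=dim)
    return vec / np.linalg.norm(vec)


@lru_cache(maxsize=10_000)
def cached_embed(prefixed_text: str) -> tuple:
    """Cache by the full prefixed string, so query and document roles never collide."""
    return tuple(fake_embed(prefixed_text))


def embed(prefix: str, text: str) -> np.ndarray:
    # Joining with a newline is an assumption of this sketch.
    return np.array(cached_embed(f"{prefix}\n{text}"))


def search(query_code: str, corpus: list, top_k: int = 3) -> list:
    """Return (cosine score, document) pairs, best first."""
    q = embed(CODE2CODE_QUERY_PREFIX, query_code)
    docs = np.stack([embed(CODE2CODE_DOCUMENT_PREFIX, doc) for doc in corpus])
    scores = docs @ q                      # unit vectors, so dot product = cosine similarity
    order = np.argsort(-scores)[:top_k]
    return [(float(scores[i]), corpus[i]) for i in order]


corpus = [
    "def add(a, b): return a + b",
    "int add(int a, int b) { return a + b; }",
    "fn add(a: i32, b: i32) -> i32 { a + b }",
]
results = search("const add = (a, b) => a + b;", corpus, top_k=2)
```

With the placeholder embedder the ranking is meaningless; the point is the shape of the pipeline. The returned cosine scores are what a confidence threshold would be applied to, and a second call with the same inputs is served from the cache.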
Blogs that mention this model
September 30, 2025 • 8 minutes read
Embeddings Are AI’s Red-Headed Stepchild
Embedding models aren't the most glamorous aspect of the AI industry, but image generators and chatbots couldn't exist without them.
Scott Martens
September 04, 2025 • 6 minutes read
Jina Code Embeddings: SOTA Code Retrieval at 0.5B and 1.5B
Code generation LLMs → code embeddings: 0.5B/1.5B models achieve SOTA performance across 25 code retrieval benchmarks.
Jina AI