jina-code-embeddings-0.5b

Efficient code embeddings from code generation models
License: CC-BY-NC-4.0
Release Date: 2025-09-01
Input: Text (Code)
Output: Vector
Matryoshka Dimensions: 64, 128, 256, 512, 896
Late Chunking: No
Model Details
Parameters: 494M
Input Token Length: 32K
Output Dimension: 896
Base Model: Qwen2.5-Coder-0.5B
Trained Languages: 1 language
Supported Languages: 29 languages
Quantizations: GGUF
Related Models: jina-code-embeddings-1.5b, jina-embeddings-v2-base-code
Supported Tasks: NL→Code, Tech QA, Code→Code, Code→NL, Completion
Tags
code-embeddings
programming-languages
semantic-code-search
code-similarity
long-context
text-embeddings
multilingual-code
docstring-search
Available via
Jina API, AWS SageMaker, Microsoft Azure, Google Cloud, Hugging Face
I/O graph: Code + Task → jina-code-embeddings → Vector
Publications (1)
NeurIPS 2025
August 31, 2025
Efficient Code Embeddings from Code Generation Models

Overview

jina-code-embeddings-0.5b is a 494-million-parameter code embedding model designed for retrieving code from natural language queries, answering technical questions, and identifying similar code across programming languages. Built on the Qwen2.5-Coder-0.5B backbone, it generates embeddings via last-token pooling. Traditional code embedding models rely on scarce aligned data such as comments and docstrings; this model instead leverages the abundant unaligned code and documentation used in LLM training, achieving state-of-the-art performance despite its compact size. It supports five task categories, each with its own instruction prefixes: NL2Code, TechQA, Code2Code, Code2NL, and Code2Completion. The model is trained with Matryoshka representation learning, so its embeddings can be truncated to trade precision for lower storage and compute cost.
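
To make last-token pooling concrete, here is a minimal sketch in plain Python. It assumes the backbone returns one hidden-state vector per token position and that an attention mask marks real tokens with 1 and padding with 0; the function name `last_token_pool` and the toy values are illustrative and not taken from the model's codebase.

```python
from typing import List, Sequence


def last_token_pool(hidden_states: Sequence[Sequence[float]],
                    attention_mask: Sequence[int]) -> List[float]:
    """Return the hidden state of the last non-padding token.

    hidden_states: one vector per token position.
    attention_mask: 1 for real tokens, 0 for padding (assumed convention).
    Raises ValueError if the mask contains no real token.
    """
    # Taking the largest index with mask == 1 handles both left and right padding.
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return list(hidden_states[last_index])


# Toy sequence: 4 positions, hidden size 3, last position is padding.
hidden = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [0.0, 0.0, 0.0]]
mask = [1, 1, 1, 0]
pooled = last_token_pool(hidden, mask)
print(pooled)  # [0.7, 0.8, 0.9]
```

In a decoder-only backbone the last token has attended to the whole sequence, which is the usual motivation for pooling there rather than averaging all positions.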

Methods

The model is trained contrastively using the InfoNCE loss with temperature τ=0.05, batch size 512, and sequence length 512. Training data includes MTEB code tasks, CoSQA+, adapted public datasets, and GPT-4o-generated synthetic data for rare scenarios. Task-specific instruction prefixes condition the model differently for queries and documents; for example, NL2Code uses 'Find the most relevant code snippet given the following query:' for queries. Training on four A100 GPUs for 1500 steps took 8.3 hours. In ablation studies, last-token pooling outperformed mean pooling and latent attention pooling. Within each batch, the contrastive objective treats each matched query-document pair as a positive example and every other query-document combination as a negative.
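
The in-batch contrastive objective can be sketched as follows. This is a simplified, one-directional (query to document) InfoNCE computation in plain Python, assuming cosine similarity as the scoring function; whether the actual training loss is symmetric or includes other terms is not stated here, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(queries: Sequence[Sequence[float]],
                  documents: Sequence[Sequence[float]],
                  temperature: float = 0.05) -> float:
    """Mean InfoNCE loss where documents[i] is the positive for queries[i].

    Every other document in the batch acts as a negative for query i.
    """
    q = [normalize(v) for v in queries]
    d = [normalize(v) for v in documents]
    total = 0.0
    for i, qi in enumerate(q):
        # Temperature-scaled similarity of query i to every document in the batch.
        logits = [sum(a * b for a, b in zip(qi, dj)) / temperature for dj in d]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the matching document.
        total += log_denominator - logits[i]
    return total / len(q)


# Toy batch of two pairs in a 2-dimensional embedding space.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)  # True
```

A low temperature such as 0.05 sharpens the softmax, so even small similarity gaps between the positive and the negatives produce large differences in loss.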

Performance

The model achieves a 78.41% overall average and a 78.72% MTEB Code average across benchmarks. Notable scores include 96.77% on HumanEval, 89.01% on MBPP, 98.31% on WikiSQL, and 99.70% on CodeChefXLang. It outperforms the similar-sized Qwen3-Embedding-0.6B as well as larger models such as jina-embeddings-v4 (74.11%) and gemini-embedding-001 (77.38%). It excels at code-to-code retrieval, with 90.37% on CodeTransOceanContest. NL2Code performance is strong, with 85.73% on COIR-CodeSearchNet and 95.98% on Doc2Code. Technical Q&A capability is demonstrated by 91.04% on StackOverflowQA.

Best Practice

Always use the appropriate task-specific instruction prefixes for queries and documents. Use Matryoshka embeddings to balance quality against resources: start with the full 896 dimensions and truncate as needed. The recommended batch size is 512 with a sequence length of 512 tokens, matching the training configuration. Use cosine similarity to compare embeddings. The model is well suited to multilingual code search, given its 99.70% score on CodeChefXLang. Consider two-stage retrieval, in which this model produces the initial candidates and a reranker reorders them. Its compact size makes it a good fit for edge deployment and real-time applications. Cache frequently accessed embeddings and use hierarchical indexing for large codebases.
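
As a rough illustration of Matryoshka truncation combined with cosine-similarity ranking, the sketch below keeps the leading dimensions of each embedding, re-normalizes, and returns the top candidates for a query. The helper names and the 4-dimensional toy vectors are made up for the example; with the real model the truncation size would be one of 64, 128, 256, 512, or 896, and the returned candidates could then be passed to a reranker as the second stage.

```python
import math
from typing import List, Sequence, Tuple


def truncate_and_normalize(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` values and rescale to unit length.

    Assumes the truncated vector is not all zeros.
    """
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def top_k(query: Sequence[float],
          corpus: Sequence[Sequence[float]],
          k: int,
          dim: int) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) for the k best corpus entries."""
    q = truncate_and_normalize(query, dim)
    scored = []
    for idx, doc in enumerate(corpus):
        d = truncate_and_normalize(doc, dim)
        # Both vectors are unit length, so the dot product is the cosine similarity.
        scored.append((idx, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: 4-dimensional "full" embeddings truncated to 2 dimensions.
query_vec = [1.0, 0.0, 0.5, 0.5]
corpus_vecs = [
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.9, 0.1],
]
candidates = top_k(query_vec, corpus_vecs, k=2, dim=2)
print([idx for idx, _ in candidates])  # [0, 2]
```

Re-normalizing after truncation matters because the leading slice of a unit vector is generally no longer unit length, and an index that stores truncated vectors should store the re-normalized form.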
Blogs that mention this model
October 03, 2025 • 7 minutes read
Jina Reranker v3: 0.6B Listwise Reranker for SOTA Multilingual Retrieval
New 0.6B-parameter listwise reranker that considers the query and all candidate documents in a single context window.
Jina AI
September 30, 2025 • 8 minutes read
Embeddings Are AI’s Red-Headed Stepchild
Embedding models aren't the most glamorous aspect of the AI industry, but image generators and chatbots couldn't exist without them.
Scott Martens
September 04, 2025 • 6 minutes read
Jina Code Embeddings: SOTA Code Retrieval at 0.5B and 1.5B
Code generation LLMs → code embeddings: 0.5B/1.5B models achieve SOTA performance across 25 code retrieval benchmarks.
Jina AI