Welcome to Jina!
Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production. You can focus on your logic and algorithms, without worrying about the infrastructure complexity.
Jina provides a smooth Pythonic experience for serving ML models, from local deployment through to advanced orchestration frameworks like Docker Compose, Kubernetes, or Jina AI Cloud. Jina makes advanced solution engineering and cloud-native technologies accessible to every developer.
Build and serve models for any data type and any mainstream deep learning framework.
Design high-performance services, with easy scaling, duplex client-server streaming, batching, dynamic batching, async/non-blocking data processing and any protocol.
Docker container integration via Executor Hub, OpenTelemetry/Prometheus observability.
Streamlined CPU/GPU hosting via Jina AI Cloud.
Deploy to your own cloud or system with our Kubernetes and Docker Compose integration.
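To make the "dynamic batching" item in the list above more concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not Jina's implementation: the `DynamicBatcher` class, its parameters, and the `embed` function are hypothetical names made up for this illustration. Individual requests are queued and handed to the model together, either when the batch is full or when a timeout expires.

```python
import asyncio


class DynamicBatcher:
    """Hypothetical helper: groups single requests into batches."""

    def __init__(self, handler, preferred_batch_size=4, timeout=0.05):
        self.handler = handler  # takes a list of inputs, returns a list of outputs
        self.preferred_batch_size = preferred_batch_size
        self.timeout = timeout  # seconds to wait for a batch to fill up
        self.pending = []  # (input, future) pairs waiting to be processed
        self.flush_task = None  # timer task for the batch being collected

    async def submit(self, item):
        future = asyncio.get_running_loop().create_future()
        self.pending.append((item, future))
        if len(self.pending) >= self.preferred_batch_size:
            self._flush()  # batch is full: process it right away
        elif self.flush_task is None:
            # First item of a new batch: start the timeout clock
            self.flush_task = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self):
        await asyncio.sleep(self.timeout)
        self.flush_task = None
        self._flush()

    def _flush(self):
        if self.flush_task is not None:
            self.flush_task.cancel()  # the timer is no longer needed
            self.flush_task = None
        batch, self.pending = self.pending, []
        if not batch:
            return
        outputs = self.handler([item for item, _ in batch])
        for (_, future), output in zip(batch, outputs):
            future.set_result(output)


batch_sizes = []  # records how many inputs each handler call received


def embed(texts):
    # Stand-in for a model call that is cheaper per item when batched
    batch_sizes.append(len(texts))
    return [len(text) for text in texts]


async def main():
    batcher = DynamicBatcher(embed, preferred_batch_size=4, timeout=0.05)
    texts = ["a", "bb", "ccc", "dddd", "eeeee", "ffffff"]
    return await asyncio.gather(*(batcher.submit(text) for text in texts))


results = asyncio.run(main())
print(results)      # one result per request, in request order
print(batch_sizes)  # the six requests were served by two handler calls
```

Each caller still sends and receives a single item; the batching happens behind the scenes, which is what lets a service trade a small amount of latency for better throughput.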
Wait, how is Jina different from FastAPI?
Jina's value proposition may seem quite similar to that of FastAPI. However, there are several fundamental differences:
Data structure and communication protocols
FastAPI communication relies on Pydantic, while Jina relies on DocArray, which allows Jina to expose its services over multiple protocols. Support for gRPC is especially useful for data-intensive applications such as embedding services, where embeddings and tensors can be serialized more efficiently.
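As a rough illustration of why a binary encoding can be more compact than a text encoding for embeddings, the sketch below compares the two using only the standard library. It does not reproduce the wire format of gRPC, Protobuf, or DocArray; the 768-dimensional vector and its values are arbitrary assumptions chosen for the example.

```python
import json
import struct

# A hypothetical 768-dimensional embedding; the values are arbitrary
embedding = [i / 768 for i in range(768)]

# Text encoding, as a JSON-over-HTTP API would typically send it
as_json = json.dumps(embedding).encode("utf-8")

# Binary encoding: each value packed as a 4-byte little-endian float32
as_binary = struct.pack(f"<{len(embedding)}f", *embedding)

print(f"JSON:   {len(as_json)} bytes")
print(f"binary: {len(as_binary)} bytes")

# The binary payload decodes back to the same number of values
decoded = struct.unpack(f"<{len(embedding)}f", as_binary)
```

Note that packing as float32 also drops precision compared to the decimal text, which is usually acceptable for embeddings but is a design choice to make consciously.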
Advanced orchestration and scaling capabilities
Jina allows you to easily containerize and orchestrate your services and models, providing concurrency and scalability.
Jina lets you deploy applications formed from multiple microservices that can be containerized and scaled independently.
Journey to the cloud
Jina provides a smooth transition from local development (using DocArray), to local serving (using Deployment and Flow), to production-ready services that use Kubernetes to orchestrate the lifetime of containers.
By using Jina AI Cloud you have access to scalable and serverless deployments of your applications in one command.
Install
Make sure that you have Python 3.7+ installed on Linux/macOS/Windows.
pip install -U jina
conda install jina -c conda-forge
Getting Started
Jina supports developers in building AI services and pipelines:
Let’s build a fast, reliable and scalable gRPC-based AI service. In Jina we call this an Executor. Our simple Executor will wrap the StableLM LLM from Stability AI. We’ll then use a Deployment to serve it.
Note: A Deployment serves just one Executor. To combine multiple Executors into a pipeline and serve that, use a Flow.
Let’s implement the service’s logic:
executor.py:
from jina import Executor, requests
from docarray import DocList, BaseDoc
from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Load the model once, when the Executor starts
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            # For a list of prompts, the pipeline returns one list of
            # candidates per prompt; keep the first candidate's text
            generations.append(
                Generation(prompt=prompt, text=output[0]['generated_text'])
            )
        return generations
Then we deploy it with either the Python API or YAML:
Python API (deployment.py):
from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()
YAML (deployment.yml):
jtype: Deployment
with:
  uses: StableLM
  py_modules:
    - executor.py
  timeout_ready: -1
  port: 12345
And run the YAML Deployment with the CLI:
jina deployment --uses deployment.yml
Use Jina Client to make requests to the service:
from jina import Client
from docarray import DocList, BaseDoc


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[Generation])

print(response[0].text)
a steampunk version of the Mona Lisa, incorporating mechanical gears, brass elements, and Victorian era clothing details
Sometimes you want to chain microservices together into a pipeline. That’s where a Flow comes in.
A Flow is a DAG pipeline composed of a set of steps. It orchestrates a set of Executors and a Gateway to offer an end-to-end service.
Note: If you just want to serve a single Executor, you can use a Deployment.
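Conceptually, the simplest pipeline is a linear chain in which the output of one step becomes the input of the next. The sketch below shows only that idea in plain Python. It is not how Jina implements a Flow, which runs Executors as separate services behind a Gateway and can also branch; `Doc`, `uppercase`, `exclaim`, and `run_pipeline` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    text: str


# Each step takes a list of documents and returns a new list of documents
Step = Callable[[List[Doc]], List[Doc]]


def uppercase(docs: List[Doc]) -> List[Doc]:
    return [Doc(text=doc.text.upper()) for doc in docs]


def exclaim(docs: List[Doc]) -> List[Doc]:
    return [Doc(text=doc.text + "!") for doc in docs]


def run_pipeline(steps: List[Step], docs: List[Doc]) -> List[Doc]:
    # Feed the output of each step into the next one, in order
    for step in steps:
        docs = step(docs)
    return docs


result = run_pipeline([uppercase, exclaim], [Doc(text="hello")])
print(result[0].text)  # HELLO!
```

The caller only sees the input of the first step and the output of the last one, which is also how a client interacts with a chained service.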
For instance, let’s combine our StableLM language model with a Stable Diffusion image generation model. Chaining these services together into a Flow will give us a service that will generate images based on a prompt generated by the LLM.
text_to_image.py:
import numpy as np
from jina import Executor, requests
from docarray import BaseDoc, DocList
from docarray.documents import ImageDoc


class Generation(BaseDoc):
    prompt: str
    text: str


class TextToImage(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        from diffusers import StableDiffusionPipeline
        import torch

        self.pipe = StableDiffusionPipeline.from_pretrained(
            "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
        ).to("cuda")

    @requests
    def generate_image(self, docs: DocList[Generation], **kwargs) -> DocList[ImageDoc]:
        # The pipeline returns images in PIL format (https://pillow.readthedocs.io/en/stable/)
        images = self.pipe(docs.text).images
        # Wrap each image in its own ImageDoc, converting it to a NumPy array
        return DocList[ImageDoc](
            [ImageDoc(tensor=np.array(image)) for image in images]
        )
Build the Flow with either Python or YAML:
Python API (flow.py):
from jina import Flow
from executor import StableLM
from text_to_image import TextToImage

flow = (
    Flow(port=12345)
    .add(uses=StableLM, timeout_ready=-1)
    .add(uses=TextToImage, timeout_ready=-1)
)

with flow:
    flow.block()
YAML (flow.yml):
jtype: Flow
with:
  port: 12345
executors:
  - uses: StableLM
    timeout_ready: -1
    py_modules:
      - executor.py
  - uses: TextToImage
    timeout_ready: -1
    py_modules:
      - text_to_image.py
Then run the YAML Flow with the CLI:
jina flow --uses flow.yml
Then, use Jina Client to make requests to the Flow:
from jina import Client
from docarray import DocList, BaseDoc
from docarray.documents import ImageDoc


class Prompt(BaseDoc):
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[ImageDoc])

response[0].display()
Next steps
Executor Hub allows you to containerize, share, explore and make Executors ready for the cloud.
Support
Join our Discord community and chat with other community members about ideas.
Subscribe to the latest video tutorials on our YouTube channel.
Join Us
Jina is backed by Jina AI and licensed under Apache-2.0.