Jina is an MLOps framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer.
Release Note (3.14.0)
This release contains 11 new features, 6 refactors, 12 bug fixes and 10 documentation improvements.
Features
Reshaping Executors as standalone services with the Deployment layer (#5563, #5590, #5628, #5672 and #5673)
In this release we aim to unlock more use cases, mainly building highly performant and scalable services. With its built-in layers of abstraction, Jina lets users build scalable, containerized, cloud-native components which we call Executors. Executors have always been services, but they were mostly used in Flows to form pipelines.
Now you can deploy an Executor on its own, without needing a Flow. Whether it's for model inference, prediction, embedding, generation or search, an Executor can wrap your business logic, and you get a gRPC microservice with Jina's cloud-native features (shards, replicas, dynamic batching, etc.).
To do this we offer the Deployment layer to deploy an Executor. Just like a Flow groups and orchestrates many Executors, a Deployment orchestrates just one Executor.
A Deployment can be used with both the Python API and YAML. For instance, after you define an Executor, use the Deployment class to serve it:
from jina import Deployment
with Deployment(uses=MyExecutor, port=12345, replicas=2) as dep:
    dep.block()  # serve forever
──────────── Deployment is ready to serve! ────────────
╭────────────── Endpoint ────────────────╮
│  Protocol                        GRPC  │
│  Local                  0.0.0.0:12345  │
│  Private          192.168.3.147:12345  │
│  Public          87.191.159.105:12345  │
╰────────────────────────────────────────╯
Or implement a Deployment in YAML and run it from the CLI:
jtype: Deployment
with:
  port: 12345
  replicas: 2
  uses: MyExecutor
  py_modules:
    - my_executor.py
jina deployment --uses deployment.yml
The Deployment class offers the same interface as a Flow, so it can be used as a client too:
from jina import Deployment, DocumentArray
with Deployment(uses=MyExecutor, port=12345, replicas=2) as dep:
    docs = dep.post(on='/foo', inputs=DocumentArray.empty(1))
    print(docs.texts)
Furthermore, you can use the Deployment to create Kubernetes and Docker Compose YAML configurations of a single Executor deployment. So, to export to Kubernetes with the Python API:
from jina import Deployment
dep = Deployment(uses='jinaai+docker://jina-ai/DummyHubExecutor', port_expose=8080, replicas=3)
dep.to_kubernetes_yaml('/tmp/config_out_folder', k8s_namespace='my-namespace')
And exporting to Kubernetes with the CLI is just as straightforward:
jina export kubernetes deployment.yml output_path
As is exporting to Docker Compose with the Python API:
from jina import Deployment
dep = Deployment(uses='jinaai+docker://jina-ai/DummyHubExecutor', port_expose=8080, replicas=3)
dep.to_docker_compose_yaml(
    output_path='/tmp/docker-compose.yml',
)
And of course, you can also export to Docker Compose with the CLI:
jina export docker-compose deployment.yml output_path
Read more about serving standalone Executors in our documentation.
(Beta) Support DocArray v2 (#5603)
As the DocArray refactoring is shaping up nicely, we've decided to integrate initial support. Although this support is still experimental, we believe DocArray v2 offers nice abstractions to clearly define the data of your services, especially with the single Executor deployment that we introduce in this release.
With this new experimental feature, you can define your input and output schemas with DocArray v2 and use type hints to define schemas of each endpoint:
from jina import Executor, requests
from docarray import BaseDocument, DocumentArray
from docarray.typing import AnyTensor, ImageUrl

class InputDoc(BaseDocument):
    img: ImageUrl

class OutputDoc(BaseDocument):
    embedding: AnyTensor

class MyExec(Executor):
    @requests(on='/bar')
    def bar(
        self, docs: DocumentArray[InputDoc], **kwargs
    ) -> DocumentArray[OutputDoc]:
        # `embed` stands for your own embedding function; it is not defined here
        return_docs = DocumentArray[OutputDoc](
            [OutputDoc(embedding=embed(doc.img)) for doc in docs]
        )
        return return_docs
Read more about the integration in the DocArray v2 section of our docs.
Communicate with individual Executors in Custom Gateways (#5558)
Custom Gateways can now make separate calls to specific Executors without respecting the Flow's topology.
With this feature, we target a different set of use cases, where the task does not necessarily have to be defined by a DAG pipeline. Rather, you define processing order using explicit calls to Executors and implement any use case where there's a central service (Gateway) communicating with remote services (Executors).
For instance, you can implement a Gateway like so:
from jina.serve.runtimes.gateway.http.fastapi import FastAPIBaseGateway
from jina import Document, DocumentArray, Flow, Executor, requests
from fastapi import FastAPI

class MyGateway(FastAPIBaseGateway):
    @property
    def app(self):
        app = FastAPI()

        @app.get("/endpoint")
        async def get(text: str):
            # call each Executor explicitly by name
            doc1 = await self.executor['executor1'].post(on='/', inputs=DocumentArray([Document(text=text)]))
            doc2 = await self.executor['executor2'].post(on='/', inputs=DocumentArray([Document(text=text)]))
            return {'result': doc1.texts + doc2.texts}

        return app

# Add the Gateway and Executors to a Flow
# (FirstExec and SecondExec are Executor classes you define elsewhere)
flow = Flow() \
    .config_gateway(uses=MyGateway, protocol='http', port=12345) \
    .add(uses=FirstExec, name='executor1') \
    .add(uses=SecondExec, name='executor2')
Read more about calling individual Executors.
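The pattern described above, a central service making explicit calls to named remote services, can be sketched with the standard library alone. This is an illustration under stated assumptions, not Jina code: the two coroutines and the `EXECUTORS` registry are hypothetical stand-ins for remote Executors.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


# Hypothetical stand-ins for two remote Executors.
async def executor1(texts: List[str]) -> List[str]:
    return [t.upper() for t in texts]


async def executor2(texts: List[str]) -> List[str]:
    return [t[::-1] for t in texts]


EXECUTORS: Dict[str, Callable[[List[str]], Awaitable[List[str]]]] = {
    'executor1': executor1,
    'executor2': executor2,
}


async def handle(text: str) -> Dict[str, List[str]]:
    # The central service decides which services to call and in what order;
    # here both are called concurrently and no pipeline order is implied.
    first, second = await asyncio.gather(
        EXECUTORS['executor1']([text]),
        EXECUTORS['executor2']([text]),
    )
    return {'result': first + second}


response = asyncio.run(handle('abc'))
print(response)
```

Because the calls are explicit, the handler could just as well call the second service only when the first returns a particular result, which a fixed DAG cannot express directly.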
Add secrets to Jina on Kubernetes (#5557)
To support building secure apps, we've added support for secrets on Kubernetes in Jina. Mainly, you can create environment variables whose sources are Kubernetes Secrets.
Add the secret using the env_from_secret parameter either in Python API or YAML:
from jina import Flow
f = (
    Flow().add(
        uses='jinaai+docker://jina-ai/DummyHubExecutor',
        env_from_secret={
            'SECRET_USERNAME': {'name': 'mysecret', 'key': 'username'},
            'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'},
        },
    )
)
f.to_kubernetes_yaml('./k8s_flow', k8s_namespace='custom-namespace')
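For orientation, a Kubernetes container reads a Secret value through an `env` entry that uses `valueFrom.secretKeyRef`. The sketch below shows how a mapping shaped like `env_from_secret` lines up with such entries. The helper `to_k8s_env` is hypothetical and is not how Jina generates its YAML; it only illustrates the correspondence.

```python
from typing import Dict, List


def to_k8s_env(env_from_secret: Dict[str, Dict[str, str]]) -> List[dict]:
    """Hypothetical helper: build Kubernetes `env` entries from a mapping of
    environment variable name -> {'name': <secret name>, 'key': <secret key>}."""
    return [
        {
            'name': var_name,
            'valueFrom': {
                'secretKeyRef': {'name': ref['name'], 'key': ref['key']}
            },
        }
        for var_name, ref in env_from_secret.items()
    ]


env_entries = to_k8s_env(
    {
        'SECRET_USERNAME': {'name': 'mysecret', 'key': 'username'},
        'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'},
    }
)
print(env_entries)
```

The Secret itself (`mysecret` in this example) still has to exist in the target namespace before the workload starts.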
Add GatewayStreamer.stream() to yield response and Executor errors (#5650)
If you're implementing a custom Gateway, you can use the GatewayStreamer.stream() method to catch errors raised in Executors. Catching such errors wasn't possible with the GatewayStreamer.stream_docs() method.
async for docs, error in self.streamer.stream(
    docs=my_da,
    exec_endpoint='/',
):
    if error:
        ...  # handle or raise the error
    else:
        ...  # process results
Read more about the feature in the documentation.
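The idea of yielding `(docs, error)` pairs instead of raising can be tried out without a running Gateway. The following is a minimal stand-alone sketch: `fake_stream` is a hypothetical stand-in for a streamer and does not use Jina's classes.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def fake_stream(
    batches: List[List[str]],
) -> AsyncIterator[Tuple[List[str], Optional[Exception]]]:
    """Hypothetical streamer: yields (docs, error) pairs instead of raising."""
    for batch in batches:
        if any(text == '' for text in batch):
            # Report the failure alongside the batch and keep streaming.
            yield batch, ValueError('empty text in batch')
        else:
            yield [text.upper() for text in batch], None


async def consume(batches: List[List[str]]) -> Tuple[List[str], List[str]]:
    results, errors = [], []
    async for docs, error in fake_stream(batches):
        if error:
            errors.append(str(error))
        else:
            results.extend(docs)
    return results, errors


results, errors = asyncio.run(consume([['a', 'b'], ['', 'c'], ['d']]))
print(results, errors)
```

Because the error travels with the batch, one failing request does not end the stream, and the caller decides whether to raise, log or skip.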
Add argument suppress_root_logging to remove or preserve root logging handlers (#5635)
In this release, we've added the argument suppress_root_logging to (you guessed it) suppress root logger messages. By default, root logs are suppressed.
Kudos to our community member @Jake-00 for the contribution!
Add gRPC streaming endpoint to Worker and Head runtimes (#5614)
To empower Executors we've added a gRPC streaming endpoint to both the Worker and Head runtimes. This means that an Executor or Head gRPC server exposes the same interface as a Jina gRPC Gateway. Therefore, you can use Jina's gRPC Client with each of those entities.
Add prefetch argument to client post method (#5607)
A prefetch argument has been added to the Client.post() method. Previously, this argument was only available to the Gateway and it controlled how many requests a Gateway could send to Executors at a time.
However, it was not possible to control how many requests a Gateway (or Executor in case of a single Executor Deployment) could receive at a time.
Therefore, we've added the argument to the Client.post() method to give you better control over your requests.
Read more in the documentation.
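Limiting how many requests are in flight at once is a general client-side pattern. The sketch below illustrates it with an `asyncio.Semaphore`; it is an assumption-laden illustration of the concept, not Jina's implementation, and `send_all` is a hypothetical name.

```python
import asyncio


async def send_all(num_requests: int, prefetch: int) -> int:
    """Send `num_requests` fake requests with at most `prefetch` in flight.
    Returns the highest number of requests observed in flight at once."""
    limiter = asyncio.Semaphore(prefetch)
    in_flight = 0
    peak = 0

    async def send_one() -> None:
        nonlocal in_flight, peak
        async with limiter:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the server's work
            in_flight -= 1

    await asyncio.gather(*(send_one() for _ in range(num_requests)))
    return peak


peak_in_flight = asyncio.run(send_all(num_requests=20, prefetch=4))
print(peak_in_flight)
```

A smaller limit protects a slow server from being flooded, at the cost of lower throughput when the server could have handled more.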
Run warmup on runtimes and Executor (#5579)
On startup, all Jina entities that hold gRPC connections and stubs to other entities (Head and Gateway) now start warming up before the services become ready. This ensures lower latencies on first requests submitted to the Flow.
Make gRPC Client thread safe (#5533)
Previously, as gRPC asyncio clients offer limited support for multi-threading, using the Jina gRPC Client in multiple threads would print errors.
Therefore, in this release, we make the gRPC Client thread-safe in the sense that a thread can re-use it multiple times without another thread using it simultaneously. This means you can use the gRPC Client with multi-threading, while being sure only asyncio tasks belonging to the same thread have access to the gRPC stub at the same time.
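The guarantee described here, that only one thread uses the client at any moment, can be illustrated with a plain lock. This is a simplified sketch and not Jina's implementation: `LockedClient` is hypothetical, and the sleep stands in for a network call.

```python
import threading
import time


class LockedClient:
    """Hypothetical wrapper: only one thread at a time may use the client."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.peak_active = 0  # highest number of simultaneous users seen

    def post(self, payload: str) -> str:
        with self._lock:
            self._active += 1
            self.peak_active = max(self.peak_active, self._active)
            time.sleep(0.005)  # stands in for the network call
            self._active -= 1
            return payload.upper()


client = LockedClient()
replies = []
threads = [
    threading.Thread(target=lambda t=text: replies.append(client.post(t)))
    for text in ('a', 'b', 'c', 'd')
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(sorted(replies), client.peak_active)
```

Calls from different threads are serialized rather than run in parallel, so this trades throughput for safety.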
Add user survey (#5667)
When running a Flow, a message now shows up in the terminal with a survey link. Feel free to fill in our survey to help us improve Jina. Your feedback is much appreciated!
Refactoring
Use single Gateway streamer for multiprotocol Gateway (#5598)
When we released multiprotocol Gateways, the implementation relied on a separate set of gRPC connections and stubs for each protocol. As this turned out to be unnecessary, this release re-uses the same connections and stubs across protocols.
Remove manual deletion of channel resources (#5633)
This release refactors how we handle channel resources. Mainly, deletion of channel resources is no longer handled manually and is left to the garbage collector.
No need to run summary in thread (#5632)
Getting and printing Flow summary information is no longer executed in a separate thread, and is now handled in the main thread.
Refactor GRPCConnectionPool implementation into a package (#5623)
All gRPC connection pool logic has been refactored into a separate package, rather than having a GRPCConnectionPool class.
Remove reversing request order (#5580)
Reversing request order has been removed from runtime logic.
Simplify get_docs_from_request helper function (#5567)
The get_docs_from_request helper function in the request handler module has been simplified and no longer accepts unneeded parameters.
Bug Fixes
Relax protobuf version (#5591)
To better support different environments, we've relaxed Jina's protobuf version so it no longer conflicts with Google Colab's pre-installed version (which may result in breaking some installed dependencies).
Fix loading Gateway arguments from YAML (#5664 and #5678)
Prior to this release, loading Gateway configurations from YAML had a few bugs. Mainly, some parameters were not passed correctly to the Gateway runtime when configs were loaded. Also, other default runtime Gateway arguments would always override arguments from YAML configs.
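The intended precedence, where values from YAML win over runtime defaults, can be sketched as a simple dictionary merge. The defaults and the helper `resolve_gateway_args` below are hypothetical and serve only to illustrate the ordering; they are not Jina's actual defaults or loading code.

```python
from typing import Any, Dict

# Hypothetical defaults, for illustration only.
DEFAULT_GATEWAY_ARGS: Dict[str, Any] = {
    'protocol': 'grpc',
    'port': 8080,
    'cors': False,
}


def resolve_gateway_args(yaml_args: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the defaults, then let values loaded from YAML take precedence."""
    return {**DEFAULT_GATEWAY_ARGS, **yaml_args}


resolved = resolve_gateway_args({'protocol': 'http', 'port': 12345})
print(resolved)
```

Reversing the merge order reproduces the kind of bug described above: the defaults would silently overwrite whatever the YAML file specified.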
Fix usage with cuda_visible_devices (#5654)
Prior to this release, using replicas on multiple GPUs would fail if the CUDA_VISIBLE_DEVICES environment variable was passed with the env parameter, rather than actually being set in the environment variables.
Properly use logging configuration in Executor, Gateway and Client (#5638)
This release unifies the logging configuration in Executors, Gateways and Clients and exposes configuration parameters properly. You can now pass log configuration to your Clients and Flows and expect consistent logging behavior:
from jina import Client
client = Client(log_config='./logging.json.yml')
# or with a Flow object
from jina import Flow
f = Flow(log_config='./logging.json.yml')
with f:
    # the implicit client automatically uses the log_config from the Flow for consistency
    f.post('/')
Pass extra arguments to the Gateway runtime in case of containerized Gateways (#5631)
Prior to this release, if a containerized Custom Gateway was started, Jina wouldn't pass some arguments to the container entrypoint. This could break some behaviour, for instance, the runtime not knowing which port to use for serving the Gateway. The issue is fixed in this release.
Clean up OpenTelemetry resources in the Flow context manager exit procedure (#5619)
This release adds proper cleanup of OpenTelemetry resources. The cleanup logic has been added to the Client class and is called automatically in the Flow's context manager.
If you're using the Client with OpenTelemetry enabled, call client.teardown_instrumentation() so that the client's spans are recorded correctly.
Improve error messages for gRPC NOT_FOUND errors (#5617)
When an external Executor/Flow is behind an API Gateway (which is the case for JCloud) but is down, DNS resolution succeeds but the "resource" (the Executor/Flow) cannot be found, resulting in a gRPC error with the NOT_FOUND code.
This error case wasn't properly handled before: the output did not include information about which part of the Flow failed.
In this release, the affected deployment and its address are displayed:
ERROR gateway/rep-0/GatewayRuntime@123711 Error while [01/23/23 12:35:15]
getting responses from deployments: no Route matched
with those values
trailing_metadata=Metadata((('date', 'Mon, 23 Jan
2023 11:35:15 GMT'), ('content-length', '0'),
('x-kong-response-latency', '0'), ('server',
'kong/3.0.2')))
trailing_metadata=Metadata((('date', 'Mon, 23 Jan
2023 11:35:15 GMT'), ('content-length', '0'),
('x-kong-response-latency', '0'), ('server',
'kong/3.0.2')))
|Gateway: Connection error with deployment
`executor0` at address(es) {'blah.wolf.jina.ai'}.
Connection with {'blah.wolf.jina.ai'} succeeded, but
`executor0` was not found. Possibly `executor0` is
behind an API gateway but not reachable.
trailing_metadata=Metadata((('date', 'Mon, 23 Jan
2023 11:35:15 GMT'), ('content-length', '0'),
('x-kong-response-latency', '0'), ('server',
'kong/3.0.2')))
Use mixin_hub_pull_options_parser from Hubble (#5586)
Some Hub parameters were implemented in both Jina and jina-hubble-sdk. This meant that if jina-hubble-sdk updated some parameters, there would be a mismatch and potentially bugs. This release removes these parameters from Jina to rely entirely on jina-hubble-sdk for Hubble-specific parameters.
Disable timeout for Liveness Probe in Kubernetes and keep only Kubernetes timeout (#5594)
When an Executor is deployed to Kubernetes, a Kubernetes Liveness Probe is configured. The liveness probe uses the jina ping command under the hood to check the Executor health. However, this health-check is subject to the Kubernetes Liveness Probe timeout as well as the jina ping command timeout. This release removes (actually relaxes) the jina ping command timeout to keep only one configurable timeout (it respects timeout_ready) so that you can deploy Executors that are slow to load.
Enable timeout for pinging Executor and Gateway (#5600)
The jina ping CLI can submit ping requests to Gateways and Executors. However, this command previously accepted a timeout parameter that was not respected. This release fixes this behavior: specifying the timeout parameter now makes the command fail if the ping requests are not successful by the time the timeout is exceeded.
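The behavior described, retrying a ping and failing once a deadline passes, follows a common pattern that can be sketched with the standard library. This is not the `jina ping` implementation: `ping_until` and `flaky_ping` are hypothetical names, and the fake target becomes healthy on its third attempt.

```python
import time
from typing import Callable


def ping_until(ping: Callable[[], bool], timeout: float, interval: float = 0.01) -> bool:
    """Retry `ping` until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if ping():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


attempts = {'count': 0}


def flaky_ping() -> bool:
    """Fake target that becomes healthy on the third attempt."""
    attempts['count'] += 1
    return attempts['count'] >= 3


became_ready = ping_until(flaky_ping, timeout=1.0)
never_ready = ping_until(lambda: False, timeout=0.05)
print(became_ready, never_ready)
```

Using a monotonic clock for the deadline keeps the check unaffected by system clock adjustments.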
Edit terminal profile file even if it does not exist (#5597)
When Jina is installed with pip, the installation script attempts to configure the user's terminal profile (.bashrc, .zshrc, .fish files) to add configuration needed for the jina command. However, this would be ignored if a user's terminal profile didn't exist.
With this release, the installation script now identifies the required terminal profile file depending on the user's environment and writes a new one if it does not exist already.
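A simplified sketch of the "pick the profile for the current shell and create it if missing" logic follows. It is not Jina's installation script: the shell-to-file mapping covers only bash and zsh, the helper `ensure_profile_line` and the line it writes are hypothetical, and a temporary directory stands in for the home directory.

```python
import tempfile
from pathlib import Path

# Assumed mapping from shell name to profile file; real setups may differ.
PROFILE_BY_SHELL = {'bash': '.bashrc', 'zsh': '.zshrc'}


def ensure_profile_line(home: Path, shell_path: str, line: str) -> Path:
    """Append `line` to the shell's profile, creating the file if it is missing.
    Falls back to .bashrc for shells that are not in the mapping."""
    shell_name = Path(shell_path).name
    profile = home / PROFILE_BY_SHELL.get(shell_name, '.bashrc')
    profile.touch(exist_ok=True)  # create an empty profile if none exists
    existing = profile.read_text()
    if line not in existing.splitlines():
        with profile.open('a') as f:
            if existing and not existing.endswith('\n'):
                f.write('\n')
            f.write(line + '\n')
    return profile


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    profile = ensure_profile_line(home, '/bin/zsh', '# hypothetical jina setup line')
    # A second call does not duplicate the line.
    ensure_profile_line(home, '/bin/zsh', '# hypothetical jina setup line')
    profile_name = profile.name
    profile_text = profile.read_text()

print(profile_name)
```

Checking for the line before appending keeps repeated installs from growing the profile file.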
Multi-protocol gateway supports monitoring (#5570)
Prior to this release, using multiple protocols in the Gateway along with monitoring would raise an error. In this release, there are no issues when using multiple protocols along with monitoring in your Gateway.
Documentation improvements
- Document tracing support in jcloud (#5688)
- Add survey banner (#5649)
- Refactor example code for experimenting with OpenTelemetry (#5656)
- Document arm64 architecture support in jina push command (#5644)
- Add jcloud Executor availability parameters (#5624)
- Use one GPU in Jcloud deployment example (#5620)
- Add caution about exceptions inside Flow context (#5615)
- Document Flow update, restart, pause and resume on jcloud (#5577)
- Document ephemeral storage type in JCloud (#5583)
- Document Executor data retention with retain parameter in JCloud (#5572)
Contributors
We would like to thank all contributors to this release:
- Yanlong Wang (@nomagick)
- tarrantro (@tarrantro)
- samsja (@samsja)
- Alaeddine Abdessalem (@alaeddine-13)
- Girish Chandrashekar (@girishc13)
- Subba Reddy Veeramreddy (@subbuv26)
- Nan Wang (@nan-wang)
- Alex Cureton-Griffiths (@alexcg1)
- Jake-00 (@Jake-00)
- Anne Yang (@AnneYang720)
- Joan Fontanals (@JoanFM)
- Nikolas Pitsillos (@npitsillos)
- Johannes Messner (@JohannesMessner)
- Jackmin801 (@Jackmin801)