Jina is an MLOps framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer.
Release Note (3.13.0)
This release contains 14 new features, 9 bug fixes and 7 documentation improvements.
This release introduces major features like Custom Gateways, Dynamic Batching for Executors, development support with auto-reloading, support for the new namespaced Executor scheme jinaai, improvements for our gRPC transport layer, and more.
🆕 Features
Custom Gateways (#5153, #5189, #5342, #5457, #5465, #5472 and #5477)
Jina Gateways are now customizable in the sense that you can implement them in much the same way as an Executor. With this feature, Jina gives power to the user to implement any server, protocol or interface at the Gateway level. There's no more need to build an extra service that uses the Flow.
For instance, you can define a Jina Gateway that communicates with the rest of Flow Executors like so:
from jina import Document, DocumentArray
from jina.serve.runtimes.gateway.http.fastapi import FastAPIBaseGateway


class MyGateway(FastAPIBaseGateway):
    @property
    def app(self):
        from fastapi import FastAPI

        app = FastAPI(title='Custom FastAPI Gateway')

        @app.get(path='/service')
        async def my_service(input: str):
            # convert input request to Documents
            docs = DocumentArray([Document(text=input)])
            # send Documents to Executors using GatewayStreamer
            result = None
            async for response_docs in self.streamer.stream_docs(
                docs=docs,
                exec_endpoint='/',
            ):
                # convert response docs to server response and return it
                result = response_docs[0].text
            return {'result': result}

        return app
Then you can use it in your Flow in the following way:
from jina import Flow

flow = Flow().config_gateway(
    uses=MyGateway, port=12345, protocol='http'
)
A Custom Gateway can be used as a Python class, YAML configuration or Docker image.
Adding support for Custom Gateways required exposing the Gateway API and supporting multiple ports and protocols (mentioned in a prior release). You can customize the Gateway by subclassing FastAPIBaseGateway for simple implementations, or the base Gateway class for more complex use cases.
Working on this feature also involved exposing and improving the GatewayStreamer API as a way to communicate with Executors within the Gateway.
Find more information in the Custom Gateway page.
Dynamic batching (#5410)
This release adds Dynamic batching capabilities to Executors.
Dynamic batching allows requests to be accumulated and batched together before being sent to an Executor. The batch is created dynamically depending on the configuration for each endpoint.
This feature is especially relevant for inference tasks, where batching requests together makes more efficient use of GPU resources.
You can configure Dynamic batching using either a decorator or the uses_dynamic_batching parameter. The following example shows how to enable Dynamic batching on an Executor that performs model inference:
import torch
from jina import requests, dynamic_batching, Executor, DocumentArray, Flow


class MyExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # initialize model
        self.model = torch.nn.Linear(in_features=128, out_features=128)

    @requests(on='/bar')
    @dynamic_batching(preferred_batch_size=10, timeout=200)
    def embed(self, docs: DocumentArray, **kwargs):
        docs.embeddings = self.model(torch.Tensor(docs.tensors))


flow = Flow().add(uses=MyExecutor)
With Dynamic Batching enabled, the Executor above will efficiently use GPU resources to perform inference by batching Documents together.
Read more about the feature in the Dynamic Batching documentation page.
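To make the accumulate-then-flush behaviour concrete, here is a minimal, framework-free sketch of the idea using only asyncio. It is not Jina's implementation: TinyBatcher, submit and upper_all are hypothetical names, and the sketch assumes a batch is flushed either when it reaches the preferred size or when a timeout (interpreted here as milliseconds) expires, whichever comes first.

```python
import asyncio


class TinyBatcher:
    """Hypothetical helper: groups items until a size or time limit is hit."""

    def __init__(self, handler, preferred_batch_size, timeout_ms):
        self.handler = handler
        self.preferred_batch_size = preferred_batch_size
        self.timeout = timeout_ms / 1000  # seconds
        self.pending = []  # list of (item, future) pairs
        self.timer = None

    async def submit(self, item):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self.pending.append((item, future))
        if len(self.pending) >= self.preferred_batch_size:
            self._flush()  # batch is full: process it right away
        elif self.timer is None:
            # first item of a new batch: start the timeout clock
            self.timer = loop.call_later(self.timeout, self._flush)
        return await future

    def _flush(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        batch, self.pending = self.pending, []
        if not batch:
            return
        # one handler call for the whole batch, then fan results back out
        results = self.handler([item for item, _ in batch])
        for (_, future), result in zip(batch, results):
            future.set_result(result)


batch_sizes = []


def upper_all(texts):
    """Stand-in for a batched model call; records each batch size."""
    batch_sizes.append(len(texts))
    return [text.upper() for text in texts]


async def main():
    batcher = TinyBatcher(upper_all, preferred_batch_size=3, timeout_ms=50)
    return await asyncio.gather(*(batcher.submit(t) for t in 'abcde'))


results = asyncio.run(main())
```

In this run, the first three requests fill a batch and are processed immediately, while the remaining two are processed together once the timeout expires.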
Install requirements of local Executors (#5508)
Prior to this release, the install_requirements parameter of Executors only installed Executor requirements for Hub Executors. Now, local Executors with a requirements.txt file will also have their requirements installed before starting Flows.
Support jinaai Executor scheme to enable namespaced Hub Executors (#5462, #5468 and #5515)
As Jina AI Cloud introduced namespaces to Executor resources, this release adds support for the new jinaai Executor scheme.
This means that namespaced Executors can now be used with the jinaai scheme in the following way:
from jina import Flow

flow = Flow().add(uses='jinaai://jina-ai/DummyHubExecutor')
This scheme is also supported in Kubernetes and other APIs:
from jina import Flow
flow = Flow().add(uses='jinaai+docker://jina-ai/DummyHubExecutor')
flow.to_kubernetes_yaml('output_path', k8s_namespace='my-namespace')
Support for the new scheme means the minimum supported version of jina-hubble-sdk has been increased to 0.26.10.
Add auto-reloading to Flow and Executor on file changes (#5461, #5488 and #5514)
A new argument reload has been added to the Flow and Executor APIs, which automatically reloads running Flows and Executors when changes are made to Executor source code or YAML configurations of Flows and Executors.
Although this feature is only meant for development, it aims to help developers iterate fast and automatically update Flows with changes they make live during development.
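The core idea behind such a reload loop can be sketched with the standard library alone: record file modification times, compare them later, and restart whatever depends on the files that changed. This is only an illustration, not how Jina implements reload; snapshot and changed_files are hypothetical helpers, and the file names are placeholders.

```python
import os
import tempfile


def snapshot(paths):
    """Map each watched path to its last-modified time in nanoseconds."""
    return {path: os.stat(path).st_mtime_ns for path in paths}


def changed_files(before, after):
    """Return the watched paths whose modification time differs."""
    return sorted(path for path in after if before.get(path) != after[path])


with tempfile.TemporaryDirectory() as workdir:
    executor_py = os.path.join(workdir, 'executor.py')
    config_yml = os.path.join(workdir, 'config.yml')
    for path in (executor_py, config_yml):
        with open(path, 'w') as f:
            f.write('# placeholder\n')

    before = snapshot([executor_py, config_yml])
    # simulate an edit by moving the file's modification time forward
    stat = os.stat(executor_py)
    os.utime(executor_py, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    after = snapshot([executor_py, config_yml])

    # only the touched file is reported, so only its owner needs a restart
    changed = [os.path.basename(p) for p in changed_files(before, after)]
```

A development server would run this comparison periodically (or rely on OS file-system notifications) and restart the affected Executor or Flow when the list is non-empty.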
Find out more about this feature in the Executor and Flow sections of the documentation.
Expand Executor serve parameters (#5494)
The Executor.serve method now accepts more parameters, similar to those of the Flow API. With new parameters to control the serving and deployment configuration of the Executor, this method makes the Executor convenient for single-service tasks.
This means you can not only build advanced microservices-based pipelines and applications, but also build individual services with all Jina features: shards/replicas, dynamic batching, auto-reload, etc.
Read more about the method in the Python API documentation.
Add gRPC trailing metadata when logging gRPC error (#5512)
When logging gRPC errors, context trailing metadata is now shown. This helps identify the underlying network issue, since a single gRPC status code can mask several distinct network errors.
For instance, the new log message looks like the following:
DEBUG gateway@ 1 GRPC call to deployment executor0 failed
with error <AioRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
...
trailing_metadata=Metadata((('content-length', '0'),
('l5d-proxy-error', 'HTTP Balancer service in
fail-fast'), ('l5d-proxy-connection', 'close'),
('date', 'Tue, 13 Dec 2022 10:20:15 GMT'))), for
retry attempt 2/3. Trying next replica, if available.
The trailing metadata returned by load balancers will help to identify the root cause more accurately.
Implement unary_unary stub for Gateway Runtime (#5507)
This release adds the gRPC unary_unary stub for Gateway Runtime as a new communication stub with Executors. Since the gRPC performance best practices for Python page suggests that unary RPCs might be faster than streaming RPCs in Python, we added this communication method.
However, this is not enabled by default. The streaming RPC method will still be used unless you set the stream option to False in the Client.post() method. The feature is only effective when the gRPC protocol is used.
Read more about the feature in the documentation.
Add Startup Probe and replace Readiness Probe with Liveness Probe (#5407)
Before this release, when exporting Jina Flows to Kubernetes YAML configurations, Kubernetes Readiness Probes used to be added for the Gateway pod and each Executor pod. In this release we have added a Startup Probe and replaced Readiness Probe with Liveness Probe.
Both probes use the jina ping command to check that pods are healthy.
New Jina perf Docker images (#5498)
We added a slightly larger Docker image with suffix perf which includes a set of tools useful for performance tuning and debugging.
The new image is available in Jina AI's Docker hub.
New Jina Docker image for Python 3.10, and use Python 3.8 for default Jina image (#5490)
Besides adding Docker images aimed at performance optimization, we added an image with a newer Python version: 3.10. This is available in Jina AI's Docker hub, for example jinaai/jina:master-py310-standard.
We also made Python 3.8 our minimum supported Python version by default, and it will be used for default Docker images.
Minimize output of jina ping command (#5476)
jina ping commands are now less verbose and print less irrelevant output. However, important information like latency for each round, average latency, number of successful requests and ping result will still show up.
Add Kubernetes preStop hook to the containers (#5445)
A preStop hook has been added to Executors and the Gateway to allow a grace period. This allows more time to complete in-flight requests and finish the server's graceful shutdown.
Generate random ports for multiple protocols (#5455)
If you use multiple protocols for a Gateway, you no longer need to specify a port for each one. Whether it's Python or YAML, you just need to specify the protocols you want to support and Jina will generate random ports for you.
Python API:
from jina import Flow

flow = Flow().config_gateway(protocol=['grpc', 'http', 'websocket'])
with flow:
    flow.block()
YAML:
jtype: Flow
gateway:
  protocol:
    - 'grpc'
    - 'http'
    - 'websocket'
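One common way to obtain a free port per protocol is to bind a socket to port 0 and let the operating system choose. The sketch below assumes that approach purely for illustration; it is not necessarily how Jina picks its ports, and pick_free_ports is a hypothetical helper.

```python
import socket


def pick_free_ports(protocols):
    """Ask the OS for one currently free TCP port per protocol."""
    ports = {}
    sockets = []
    try:
        for protocol in protocols:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind(('127.0.0.1', 0))  # port 0 means "any free port"
            # keep the socket open so the next bind cannot get the same port
            sockets.append(sock)
            ports[protocol] = sock.getsockname()[1]
    finally:
        for sock in sockets:
            sock.close()
    return ports


ports = pick_free_ports(['grpc', 'http', 'websocket'])
```

The result maps each protocol name to a distinct port number. Note that a port found this way can in principle be taken by another process before the server binds it.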
🐞 Bug Fixes
List-like args passed as string (#5464)
We fixed the format expected for port, host and port_monitoring to feel more Pythonic. If you use replicas, you no longer have to provide comma-separated values in a single string. Instead, you can simply pass a list of values.
For instance, suppose we have two external replicas of an Executor that we want to join in our Flow (the first is hosted on localhost:12345 and the second on 91.198.174.192:12346). We can add them like this:
from jina import Flow

replica_hosts, replica_ports = ['localhost', '91.198.174.192'], [
    '12345',
    '12346',
]  # instead of 'localhost,91.198.174.192', '12345,12346'
Flow().add(host=replica_hosts, port=replica_ports, external=True)
Or:
Flow().add(host=['localhost:12345', '91.198.174.192:12346'], external=True)
Note that this is not a breaking change, and the old syntax (comma-separated values: Flow().add(host='localhost:12345, 91.198.174.192:12346', external=True)) is still supported for backwards compatibility.
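The equivalence of the two syntaxes can be illustrated with a small normalization step. The helpers below (as_list and split_host_port) are hypothetical and are not Jina's actual argument parser; they only show how a comma-separated string and a list can be reduced to the same form.

```python
def as_list(value):
    """Accept 'a, b' or ['a', 'b'] and return a list of stripped strings."""
    if isinstance(value, str):
        return [part.strip() for part in value.split(',')]
    return [str(part).strip() for part in value]


def split_host_port(entry):
    """Split 'host:port' into (host, port); port is None when absent."""
    host, sep, port = entry.partition(':')
    return host, (int(port) if sep else None)


# old comma-separated syntax and new list syntax normalize to the same value
old_style = as_list('localhost:12345, 91.198.174.192:12346')
new_style = as_list(['localhost:12345', '91.198.174.192:12346'])
endpoints = [split_host_port(entry) for entry in new_style]
```

This simple split assumes hostnames or IPv4 addresses; IPv6 literals would need extra handling.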
Restore port to overload type hint and JSON schema (#5501)
When we made the port and protocol arguments of the Gateway support multiple values, a bug was introduced where port did not appear in either Jina's JSON schema or the Flow API method overloads.
Although the arguments are functional in both the Python API and YAML, this suppressed auto-completion and developer support for these parameters. This release restores the port parameter in both the Flow method overloads and JSON schema.
Do not force insecure to True in OpenTelemetry integration (#5483)
In Jina's instrumentation, communication with OpenTelemetry exporters used to be forced into insecure mode. Luckily, our community member @big-thousand picked this up and submitted a fix. Communication is no longer forced into insecure mode.
Kudos to @big-thousand for his contribution!
Fix problem when using floating Executor in HTTP (#5493)
We found a bug when using Floating Executors in HTTP, where the floating Executor is connected to the Gateway (in the Flow topology). In this case, the Executor would not receive input Documents properly. This release fixes the mentioned bug.
Add egg info post install command for egg info setup mode (#5491)
This release adds support for the egg info setup mode in Python. This means post-installation commands are now properly executed in environments that rely on Python's new setup mode.
Previously, the missing support caused several issues, especially in environments that depend on these post-installation commands. For instance, some environment variables needed for Jina to work on macOS and for CLI auto-completion were not set.
Do not apply limits when gpus='all' in Kubernetes (#5485)
If the Executor parameter gpus is set to "all", no limits will be applied on the pod in Kubernetes.
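As a rough illustration of this rule, the sketch below builds a Kubernetes-style resources mapping from a gpus value. The function gpu_resources is hypothetical, and the use of the nvidia.com/gpu resource name is an assumption made for the example; the YAML that Jina actually generates may differ.

```python
def gpu_resources(gpus):
    """Build a Kubernetes-style `resources` dict for a container."""
    if gpus is None or gpus == 'all':
        # no GPU requested, or 'all' requested: leave the pod without limits
        return {}
    # a concrete count becomes an explicit GPU limit (assumed resource name)
    return {'limits': {'nvidia.com/gpu': int(gpus)}}


unlimited = gpu_resources('all')
limited = gpu_resources('2')
```

Here unlimited is an empty mapping, while limited carries an explicit limit of two GPUs.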
Fix Windows signal handling (#5484)
This release improves signal handling on Windows, specifically when cancelling a Flow with an OS signal.
Cap opentelemetry-instrumentation-aiohttp-client (#5452)
This release caps the version for opentelemetry-instrumentation-aiohttp-client which is incompatible with opentelemetry-semantic-conventions.
Raise exceptions from path importer (#5447)
Previously, errors raised by a Python module imported to load an Executor were hidden. The module was instead treated as if it were not a Python module, which produced other misleading errors. In this release, the actual import errors are raised and no longer hidden.
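The desired behaviour can be illustrated with the standard importlib machinery: when a file is loaded by path, any exception raised while executing it should reach the caller unchanged. This is a minimal sketch, not Jina's path importer; load_module_from_path and the file name broken_executor.py are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(path, name='user_module'):
    """Import the file at `path`; any error raised by the file propagates."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # errors inside the file are raised here
    return module


with tempfile.TemporaryDirectory() as workdir:
    broken = os.path.join(workdir, 'broken_executor.py')
    with open(broken, 'w') as f:
        f.write('import a_package_that_is_not_installed\n')

    try:
        load_module_from_path(broken)
        error_name = None
    except ModuleNotFoundError as exc:
        # the real cause (a missing dependency) is visible to the caller
        error_name = exc.name
```

Because the exception is not swallowed, the user sees the missing dependency directly instead of a misleading message about the file not being a Python module.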
📗 Documentation Improvements
- Add gRPC requirements for Apple Silicon (M1 Chip) to fix failing installation of Jina (#5511)
- Add redirects from '/fundamentals' to '/concepts' (#5504)
- Update JCloud documentation to the jcloud v0.1.0 (#5385)
- Restructure documentation under /concepts
- Change Executor URI scheme to namespaced scheme jinaai (#5450)
- Custom Gateway documentation (#5465)
- Provide more accurate description for port and protocol parameters of the Gateway (#5456)
🤘 Contributors
We would like to thank all contributors to this release:
- Delgermurun (@delgermurun)
- Jie Fu (@jemmyshin)
- Alex Cureton-Griffiths (@alexcg1)
- big-thousand (@big-thousand)
- IyadhKhalfallah (@IyadhKhalfallah)
- Deepankar Mahapatro (@deepankarm)
- samsja (@samsja)
- AlaeddineAbdessalem (@alaeddine-13)
- Joan Fontanals (@JoanFM)
- Anne Yang (@AnneYang720)
- Han Xiao (@hanxiao)
- Girish Chandrashekar (@girishc13)
- Jackmin801 (@Jackmin801)