DocArray

The data structure for multimodal data

Process, embed, recommend, store and transfer data, laying a solid foundation for any multimodal AI project.

A one-stop solution for working with any data type

The most important part of working with AI models is the data you feed into them. Representing and processing that data becomes challenging when working with multiple modalities and across microservices.

DocArray is the data structure for multimodal data. It lets you efficiently process, embed, recommend, store and transfer data, laying a solid foundation for any multimodal AI project.

Rich data types

An expressive data structure for representing complex text, images, video, audio, 3D meshes, and more.

Pythonic experience

Designed to be as easy as a Python list. If you can use Python, you can use DocArray.
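
To make the "as easy as a Python list" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not DocArray's API: the names `Doc`, `DocList`, `texts`, and every field are hypothetical stand-ins chosen for illustration. It shows a container of nested, mixed-modality records that still behaves like an ordinary list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a multimodal document (not DocArray's class)."""
    text: Optional[str] = None
    uri: Optional[str] = None                 # e.g. a path to an image or audio file
    embedding: Optional[List[float]] = None
    chunks: List["Doc"] = field(default_factory=list)  # nested sub-documents


class DocList(list):
    """Hypothetical list subclass: a plain Python list plus one helper method."""

    def texts(self) -> List[str]:
        # Collect the text of every top-level document that has one.
        return [d.text for d in self if d.text is not None]


# Build, append, index and iterate exactly as with a list.
docs = DocList([Doc(text="hello"), Doc(uri="cat.png")])
docs.append(Doc(text="world", chunks=[Doc(text="nested")]))

for d in docs:
    print(d.text, d.uri, len(d.chunks))
```

Because `DocList` subclasses `list`, indexing, `len()`, `append()` and iteration work unchanged; only the `texts()` helper is new.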

For modern apps

GraphQL support lets your server handle flexible requests and responses; built-in data validation and JSON Schema help you build reliable web services.
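
As a rough illustration of why validation and JSON serialization matter for a web service, the sketch below validates a record on construction and round-trips it through JSON. It uses only the standard library and does not show DocArray's own validation, JSON Schema or GraphQL features; the `Item` class, its fields, and the `to_json` / `from_json` helpers are assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical record; the field names are illustrative only."""
    text: str
    tags: List[str]
    score: Optional[float] = None

    def __post_init__(self):
        # Reject malformed input before it can reach a response.
        if not isinstance(self.text, str):
            raise TypeError("text must be a string")
        if not all(isinstance(t, str) for t in self.tags):
            raise TypeError("tags must be a list of strings")
        if self.score is not None and not 0.0 <= self.score <= 1.0:
            raise ValueError("score must lie in [0, 1]")


def to_json(item: Item) -> str:
    # Serialize to a JSON string with stable key order.
    return json.dumps(asdict(item), sort_keys=True)


def from_json(payload: str) -> Item:
    # Deserialize; __post_init__ validates the incoming data again.
    return Item(**json.loads(payload))


item = Item(text="a photo of a cat", tags=["image", "animal"], score=0.9)
restored = from_json(to_json(item))
```

Validating in `__post_init__` means the same checks run whether the object is built in code or parsed from an incoming request body.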

How does DocArray stack up?

Support levels: Full Support, Limited Support, No Support

Compared: DocArray, numpy.ndarray, JSON, pandas.DataFrame, Protobuf

Criteria:
Tensor / matrix data
Text data
Media data
Nested data
Mixed data of the above four
Easy to (de)serialize
Data validation (of the output)
Pythonic experience
IO support for filetype
Deep learning framework support
Multi-core / GPU support
Rich functions for data types

Ready to get started?

Let DocArray be your data structure for unstructured data.
