Large Language Models (LLMs) have revolutionized the field of natural language processing (NLP) and opened up a myriad of possibilities for various applications. With the recent rise in popularity of the LangChain framework, developers are now able to create powerful end-to-end user-facing applications by chaining different steps in a workflow.
In this blog post, we'll show how Jina AI's Inference, a robust SaaS serving highly curated models for the most common machine learning tasks, seamlessly integrates with LangChain, allowing developers to build cutting-edge applications with unparalleled ease.
LangChain and Inference: A match made in heaven
Because LangChain is easy to use across a wide range of use cases, we are proud to announce that Inference is now fully compatible with it. Users who have already created Inference APIs can now integrate them into their LangChain workflows and enjoy numerous benefits, including:
- Simplified workflow: By integrating Inference with LangChain, developers can easily access and utilize the power of CLIP embeddings without having to train or deploy neural networks. This reduces time spent on complex setup and management.
- Scalability: Inference allows developers to generate embeddings for their input documents on demand, ensuring that their applications can scale with ease.
- Flexibility: The compatibility between Inference and LangChain enables developers to mix and match different components in their applications. This allows for the creation of customized solutions tailored to specific needs.
- Cost-effectiveness: With Inference’s highly competitive pay-per-document pricing model, developers pay only for the embeddings they generate, making it a cost-effective option for projects of any size.
Invoking Inference from LangChain: Extracting passages from a PDF document
Follow the steps below to create a simple application that processes text from a PDF document, generates embeddings using Inference, and uses them to extract the most relevant passages from the document given a user’s question:
Step 1: Sign up on Jina AI Cloud
Visit Jina AI Cloud to sign up.
Step 2: Generate a new token
If you’re a first-time user of Inference, you will need a personalized user token, which you can create under the Tokens section. You will need this token in step 7.
Step 3: Create an Inference API
You can now create a personalized Inference API from the models and variants offered on our platform. Before creating an API, you can try out the tasks supported by a model, such as CLIP.
Click on a model type, e.g. CLIP, then click on Details to see technical details of the provided models and their variants.
Click on Create on the top right of the details page to start the creation process.
Enter the name of your API, e.g. “my-first-inference-api”.
Now, based on your requirements for query throughput and encoding performance, select a model variant such as ViT-B-32::openai. You will use the name of this model in step 7.
Finally, click on Create.
Step 4: Install packages
To get started, you’ll need to install the required packages for LangChain and Jina AI’s Inference. The following command will install these packages for you:
pip install "langchain>=0.0.124" jina==3.14.1 chromadb unstructured
Chroma is a vector database that can run in memory, which makes it especially useful for this tutorial: you don’t need a vector database running on your system or remotely.
Step 5: Import libraries
from langchain.embeddings import JinaEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import UnstructuredPDFLoader
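Step 6: Read PDF content
# Load the PDF; the loader returns a list of documents
loader = UnstructuredPDFLoader('path/to/file/knowledge.pdf')
data = loader.load()
# Split the text of the first document into chunks of roughly 200 characters
text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=0)
chunks = text_splitter.split_text(data[0].page_content)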
The CharacterTextSplitter utility from LangChain splits a large text, such as the content extracted from our PDF, into smaller chunks, which we then embed using JinaEmbeddings.
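To make the idea of chunking concrete, here is a minimal sketch in plain Python. It is a simplified illustration, not LangChain's implementation: the helper name split_into_chunks and the choice of a single space as separator are assumptions made for this example. It greedily packs separator-delimited pieces into chunks of at most chunk_size characters.

```python
def split_into_chunks(text, chunk_size=200, separator=" "):
    """Greedily pack separator-delimited pieces into chunks.

    Hypothetical helper for illustration only; it does not reproduce
    LangChain's CharacterTextSplitter exactly.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces caused by repeated separators
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            # Either the chunk is empty (a single oversized piece is kept
            # whole) or the piece still fits into the current chunk.
            current = candidate
        else:
            # The piece does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "The Alaska Range is a relatively narrow mountain range in Alaska."
chunks = split_into_chunks(sample, chunk_size=30)
for chunk in chunks:
    print(len(chunk), chunk)
```

A smaller chunk_size gives more, shorter passages and therefore more precise matches, while a larger one keeps more context in each passage.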
Step 7: Embed chunks
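Embed the document chunks using a JinaEmbeddings object:
# Embedding client for the Inference API created in step 3
embeddings = JinaEmbeddings(
    jina_auth_token='<your-auth-token-from-step-2>', model_name='ViT-B-32::openai'
)
# Embed the chunks and index them in Chroma, tagging each with its position
docsearch = Chroma.from_texts(
    chunks, embeddings, metadatas=[{'source': f'{i}'} for i in range(len(chunks))]
)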
The jina_auth_token to be used here is the same one that you generated in step 2. The model_name should also match the model that you used for creating your Inference API in step 3.
Step 8: Find similar passages
Find the most similar passages in the document for a given query:
query = 'the highest mountain in the Alaskan Range'
answers = docsearch.similarity_search(query)
print([answer.page_content for answer in answers])
The output could be as follows:
[
'Clark at its southwest end to the White River in Canadas Yukon Territory in the southeast. The highest mountain in North America, Denali, is in the',
'The Alaska Range is a relatively narrow, 600-mile-long (950km) mountain range in the southcentral region of the U.S. state of Alaska, from Lake',
'Alaska Range. It is part of the American Cordillera. The Alaska range is one of the higher ranges in the world after the Himalayas and the Andes.'
]
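Conceptually, the search in step 8 compares the embedding of the query with the embedding of every chunk and returns the closest ones. The sketch below illustrates that idea with cosine similarity on tiny hand-made vectors. It is only an illustration: the vectors and passage labels are invented, the helper names are hypothetical, and the metric that Chroma actually applies may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, texts, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text)
              for vec, text in zip(doc_vecs, texts)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions
texts = ["passage about Denali", "passage about Lake Clark", "passage about the Cordillera"]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.2, 0.9]]
query_vec = [1.0, 0.2, 0.0]

best = top_k(query_vec, doc_vecs, texts, k=2)
print(best)
```

In the real application, the vectors come from Inference through JinaEmbeddings, and Chroma performs the ranking for you.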
Conclusion
The integration of Jina AI’s Inference with the LangChain framework is a game-changer for developers working with text embeddings. By combining the power of Inference’s embedding generation capabilities with LangChain’s flexible and modular workflow, developers can create innovative, scalable, and cost-effective applications with unprecedented ease.
Don't miss out on the opportunity to leverage this powerful combination. Start exploring the benefits of Jina AI’s Inference and LangChain integration today by logging into Jina AI Cloud.
What’s next?
If you’re looking to put your existing LangChain workflows into production, also check out LangChain Serve. Thanks to Jina, this repository lets you serve local chains and agents as RESTful, gRPC, or WebSocket APIs. Deploy your chains and agents with ease and enjoy independent scaling, serverless and autoscaling APIs, as well as a Streamlit playground on Jina AI Cloud.