Zep Cloud
Zep is a long-term memory service for AI Assistant apps. With Zep, you can provide AI assistants with the ability to recall past conversations, no matter how distant, while also reducing hallucinations, latency, and cost.
Note: The ZepCloudVectorStore works with Documents and is intended to be used as a Retriever. Its functionality is separate from Zep's ZepCloudMemory class, which is designed for persisting, enriching, and searching your users' chat history.
Why Zep's VectorStore? 🤖🚀
Zep automatically embeds documents added to the Zep Vector Store using low-latency models local to the Zep server. The Zep TS/JS client can be used in non-Node edge environments. These two together with Zep's chat memory functionality make Zep ideal for building conversational LLM apps where latency and performance are important.
Supported Search Types
Zep supports both similarity search and Maximal Marginal Relevance (MMR) search. MMR search is particularly useful for Retrieval Augmented Generation applications as it re-ranks results to ensure diversity in the returned documents.
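To make the re-ranking idea concrete, here is a minimal, self-contained sketch of greedy MMR selection over plain number vectors. It illustrates the general technique only; how Zep or LangChain implement MMR internally is not shown on this page. The names `cosineSimilarity`, `mmrSelect`, and the `lambda` trade-off parameter are assumptions made for this illustration.

```typescript
// Illustrative sketch of Maximal Marginal Relevance (MMR), not Zep's implementation.

// Cosine similarity between two equal-length vectors (0 if either has zero norm).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR: returns the indices of the chosen candidates, in selection order.
// lambda = 1 ranks purely by relevance to the query; lower values penalize
// candidates that are similar to ones already selected.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const remaining: number[] = candidates.map((_, i) => i);

  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;

    remaining.forEach((idx, pos) => {
      const relevance = cosineSimilarity(query, candidates[idx]);
      // Redundancy: highest similarity to anything already selected (0 if none).
      const redundancy =
        selected.length === 0
          ? 0
          : selected
              .map((s) => cosineSimilarity(candidates[idx], candidates[s]))
              .reduce((max, v) => Math.max(max, v), -Infinity);
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });

    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected;
}

// Candidates 0 and 1 are near-duplicates; candidate 2 is less relevant but different.
const exampleCandidates = [
  [0.9, 0.1],
  [0.9, 0.12],
  [0.5, -0.5],
];
console.log(mmrSelect([1, 0], exampleCandidates, 2, 0.5)); // [ 0, 2 ]
console.log(mmrSelect([1, 0], exampleCandidates, 2, 1)); // [ 0, 1 ]
```

With `lambda = 1` the two near-duplicates are returned, while `lambda = 0.5` swaps the second one for the more distinct candidate, which is the diversity effect described above.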
Installation
Sign up for Zep Cloud and create a project.
Follow the Zep Cloud Typescript SDK Installation Guide to install and get started with Zep.
Usage
You'll need your Zep Cloud Project API Key to use the Zep VectorStore. See the Zep Cloud docs for more information.
Zep embeds all documents automatically by default and does not expect to receive embeddings from the user. Because LangChain requires an Embeddings instance to be passed in, we pass in FakeEmbeddings.
Example: Creating a ZepCloudVectorStore from Documents & Querying
npm:
npm install @getzep/zep-cloud @langchain/openai @langchain/community @langchain/core
Yarn:
yarn add @getzep/zep-cloud @langchain/openai @langchain/community @langchain/core
pnpm:
pnpm add @getzep/zep-cloud @langchain/openai @langchain/community @langchain/core
import { ZepCloudVectorStore } from "@langchain/community/vectorstores/zep_cloud";
import { FakeEmbeddings } from "@langchain/core/utils/testing";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { randomUUID } from "crypto";

const loader = new TextLoader("src/document_loaders/example_data/example.txt");
const docs = await loader.load();
const collectionName = `collection${randomUUID().split("-")[0]}`;

const zepConfig = {
  // Your Zep Cloud Project API key https://help.getzep.com/projects
  apiKey: "<Zep Api Key>",
  collectionName,
};

// We're using fake embeddings here, because Zep Cloud handles embedding for you
const embeddings = new FakeEmbeddings();

const vectorStore = await ZepCloudVectorStore.fromDocuments(
  docs,
  embeddings,
  zepConfig
);

// Wait for the documents to be embedded
// eslint-disable-next-line no-constant-condition
while (true) {
  const c = await vectorStore.client.document.getCollection(collectionName);
  console.log(
    `Embedding status: ${c.documentEmbeddedCount}/${c.documentCount} documents embedded`
  );
  // eslint-disable-next-line no-promise-executor-return
  await new Promise((resolve) => setTimeout(resolve, 1000));
  if (c.documentEmbeddedCount === c.documentCount) {
    break;
  }
}

const results = await vectorStore.similaritySearchWithScore("bar", 3);

console.log("Similarity Results:");
console.log(JSON.stringify(results));

const results2 = await vectorStore.maxMarginalRelevanceSearch("bar", {
  k: 3,
});

console.log("MMR Results:");
console.log(JSON.stringify(results2));
API Reference:
- ZepCloudVectorStore from @langchain/community/vectorstores/zep_cloud
- FakeEmbeddings from @langchain/core/utils/testing
- TextLoader from langchain/document_loaders/fs/text
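The polling loop in the example above keeps running if embedding never completes. A bounded variant may be preferable outside of demos. The following is a minimal sketch of a generic polling helper; `sleep`, `pollUntil`, `intervalMs`, and `maxAttempts` are hypothetical names introduced here and are not part of the Zep or LangChain APIs.

```typescript
// Hypothetical helper: resolve after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Hypothetical helper: call `check` until it returns true or the attempts run out.
// Resolves to true if the condition was met, false if polling gave up.
async function pollUntil(
  check: () => Promise<boolean>,
  intervalMs = 1000,
  maxAttempts = 60
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    if (await check()) {
      return true;
    }
    await sleep(intervalMs);
  }
  return false;
}
```

In the example above, `check` could wrap the `vectorStore.client.document.getCollection(collectionName)` call and return whether `documentEmbeddedCount` equals `documentCount`; a `false` result from `pollUntil` would then signal that embedding did not finish within the allotted attempts.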
Example: Using ZepCloudVectorStore with Expression Language
import { ZepClient } from "@getzep/zep-cloud";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ConsoleCallbackHandler } from "@langchain/core/tracers/console";
import { ChatOpenAI } from "@langchain/openai";
import { Document } from "@langchain/core/documents";
import {
  RunnableLambda,
  RunnableMap,
  RunnablePassthrough,
} from "@langchain/core/runnables";
import { ZepCloudVectorStore } from "@langchain/community/vectorstores/zep_cloud";
import { StringOutputParser } from "@langchain/core/output_parsers";

async function combineDocuments(docs: Document[], documentSeparator = "\n\n") {
  const docStrings: string[] = await Promise.all(
    docs.map((doc) => doc.pageContent)
  );
  return docStrings.join(documentSeparator);
}

// Your Zep Collection Name
const collectionName = "<Zep Collection Name>";

const zepClient = new ZepClient({
  // Your Zep Cloud Project API key https://help.getzep.com/projects
  apiKey: "<Zep Api Key>",
});

const vectorStore = await ZepCloudVectorStore.init({
  client: zepClient,
  collectionName,
});

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    `Answer the question based only on the following context: {context}`,
  ],
  ["human", "{question}"],
]);

const model = new ChatOpenAI({
  temperature: 0.8,
  modelName: "gpt-3.5-turbo-1106",
});

const retriever = vectorStore.asRetriever();

const setupAndRetrieval = RunnableMap.from({
  context: new RunnableLambda({
    func: (input: string) => retriever.invoke(input).then(combineDocuments),
  }),
  question: new RunnablePassthrough(),
});

const outputParser = new StringOutputParser();

const chain = setupAndRetrieval
  .pipe(prompt)
  .pipe(model)
  .pipe(outputParser)
  .withConfig({
    callbacks: [new ConsoleCallbackHandler()],
  });

const result = await chain.invoke("Project Gutenberg?");

console.log("result", result);
API Reference:
- ChatPromptTemplate from @langchain/core/prompts
- ConsoleCallbackHandler from @langchain/core/tracers/console
- ChatOpenAI from @langchain/openai
- Document from @langchain/core/documents
- RunnableLambda from @langchain/core/runnables
- RunnableMap from @langchain/core/runnables
- RunnablePassthrough from @langchain/core/runnables
- ZepCloudVectorStore from @langchain/community/vectorstores/zep_cloud
- StringOutputParser from @langchain/core/output_parsers
Related
- Vector store conceptual guide
- Vector store how-to guides