
MongoDB Atlas

Compatibility

Only available on Node.js.

You can still create API routes that use MongoDB with Next.js by setting the runtime variable to nodejs like so:

export const runtime = "nodejs";

You can read more about Edge runtimes in the Next.js documentation.

This guide provides a quick overview for getting started with MongoDB Atlas vector stores. For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.

Overview​

Integration details​

Class | Package | PY support | Package latest
MongoDBAtlasVectorSearch | @langchain/mongodb | ✅ | NPM - Version

Setup​

To use MongoDB Atlas vector stores, you’ll need to configure a MongoDB Atlas cluster and install the @langchain/mongodb integration package.

Initial Cluster Configuration​

To create a MongoDB Atlas cluster, navigate to the MongoDB Atlas website and create an account if you don’t already have one.

Create and name a cluster when prompted, then find it under Database. Select Browse Collections and create either a blank collection or one from the provided sample data.

Note: The cluster created must be MongoDB 7.0 or higher.

Creating an Index​

After configuring your cluster, you’ll need to create an index on the collection field you want to search over.

Switch to the Atlas Search tab and click Create Search Index. From there, make sure you select Atlas Vector Search - JSON Editor, then select the appropriate database and collection and paste the following into the textbox:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    }
  ]
}

Note that the numDimensions property should match the dimensionality of the embeddings you are using. For example, Cohere embeddings have 1024 dimensions, and by default OpenAI embeddings have 1536.

Note: By default, the vector store expects an index name of default, an indexed collection field name of embedding, and a raw text field name of text. You should initialize the vector store with field names matching your index name and collection schema, as shown in the Instantiation section.

Finally, proceed to build the index.
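As a rough sketch of keeping the index definition in step with your embedding model, the helper below produces the same JSON shape as the index definition above from a dimension count. The function name buildVectorIndexDefinition and its parameters are hypothetical and not part of @langchain/mongodb or the MongoDB driver; only the emitted field names (numDimensions, path, similarity, type) come from the definition shown earlier.

```typescript
// Hypothetical helper: not part of @langchain/mongodb or the MongoDB driver.
type Similarity = "euclidean" | "cosine" | "dotProduct";

interface VectorField {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: Similarity;
}

interface VectorIndexDefinition {
  fields: VectorField[];
}

function buildVectorIndexDefinition(
  numDimensions: number,
  path: string = "embedding",
  similarity: Similarity = "euclidean"
): VectorIndexDefinition {
  // Reject values that could never match a real embedding size.
  if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
    throw new Error(`numDimensions must be a positive integer, got ${numDimensions}`);
  }
  return { fields: [{ type: "vector", path, numDimensions, similarity }] };
}

// 1536 matches the default OpenAI embedding size mentioned above.
console.log(JSON.stringify(buildVectorIndexDefinition(1536), null, 2));
```

You could paste the printed JSON into the Atlas JSON editor, changing only the dimension count when you switch embedding models.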

Embeddings​

This guide will also use OpenAI embeddings, which require you to install the @langchain/openai integration package. You can also use other supported embeddings models if you wish.

Installation​

Install the following packages:

yarn add @langchain/mongodb mongodb @langchain/openai @langchain/core

Credentials​

Once you’ve done the above, set the MONGODB_ATLAS_URI environment variable to the connection string shown under the Connect button in the MongoDB Atlas dashboard. You’ll also need your database name and collection name:

process.env.MONGODB_ATLAS_URI = "your-atlas-url";
process.env.MONGODB_ATLAS_COLLECTION_NAME = "your-atlas-collection-name";
process.env.MONGODB_ATLAS_DB_NAME = "your-atlas-db-name";

If you are using OpenAI embeddings for this guide, you’ll need to set your OpenAI key as well:

process.env.OPENAI_API_KEY = "YOUR_API_KEY";
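Because a missing variable only surfaces later as a connection or lookup error, it can help to validate the settings up front. The following is a minimal sketch, assuming the three MongoDB variable names used in this guide; readAtlasConfig is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper (not part of any library): collects the Atlas settings
// this guide relies on and fails fast if one is missing or empty.
const REQUIRED_VARS = [
  "MONGODB_ATLAS_URI",
  "MONGODB_ATLAS_DB_NAME",
  "MONGODB_ATLAS_COLLECTION_NAME",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AtlasConfig = Record<RequiredVar, string>;

function readAtlasConfig(env: Record<string, string | undefined>): AtlasConfig {
  const missing: string[] = [];
  const config: Partial<AtlasConfig> = {};
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value) {
      config[name] = value;
    } else {
      missing.push(name);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return config as AtlasConfig;
}

// Example with a literal object; in an application you would pass process.env.
const atlasConfig = readAtlasConfig({
  MONGODB_ATLAS_URI: "your-atlas-url",
  MONGODB_ATLAS_DB_NAME: "your-atlas-db-name",
  MONGODB_ATLAS_COLLECTION_NAME: "your-atlas-collection-name",
});
console.log(atlasConfig.MONGODB_ATLAS_DB_NAME);
```

Taking the environment as a parameter keeps the helper easy to test and avoids reading process.env in more than one place.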

If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:

// process.env.LANGCHAIN_TRACING_V2="true"
// process.env.LANGCHAIN_API_KEY="your-api-key"

Instantiation​

Once you’ve set up your cluster as shown above, you can initialize your vector store as follows:

import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection: collection,
  indexName: "vector_index", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

Manage vector store​

Add items to vector store​

You can now add documents to your vector store:

import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });
[ '1', '2', '3', '4' ]

Note: After adding documents, there is a slight delay before they become queryable.

Adding a document with the same id as an existing document will update the existing one.
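Because of the indexing delay noted above, a search issued immediately after an insert may come back empty. One way to cope, sketched here under the assumption that an empty result simply means "not indexed yet", is to retry with a pause between attempts. searchWithRetry and its parameters are hypothetical names, not part of @langchain/mongodb.

```typescript
// Hypothetical helper: retries an async search until it returns at least one
// result or the attempt budget is used up.
async function searchWithRetry<T>(
  search: () => Promise<T[]>,
  maxAttempts: number = 5,
  delayMs: number = 1000
): Promise<T[]> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const results = await search();
    if (results.length > 0) {
      return results;
    }
    // Pause before the next attempt, but not after the final one.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return []; // Still nothing after maxAttempts.
}

// Possible usage with the store from this guide (not executed here):
//   const docs = await searchWithRetry(() =>
//     vectorStore.similaritySearch("biology", 2)
//   );
```

A query that legitimately matches nothing will also use up every attempt, so keep maxAttempts and delayMs small outside of tests and setup scripts.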

Delete items from vector store​

await vectorStore.delete({ ids: ["4"] });

Query vector store​

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly​

Performing a simple similarity search can be done as follows:

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2
);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]

Filtering​

MongoDB Atlas supports pre-filtering of results on other fields. To use it, you must declare the metadata fields you plan to filter on by updating the index you created initially. Here’s an example:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}

Above, the first item in fields is the vector index, and the second item is the metadata property you want to filter on. The name of the property is the value of the path key, so the above index allows you to filter on a metadata field named source.

Then, in your code you can use MQL Query Operators for filtering.

The below example illustrates this:

const filter = {
  preFilter: {
    source: {
      $eq: "https://example.com",
    },
  },
};

const filteredResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);

for (const doc of filteredResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
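If you build filters from user input, a small helper can keep the preFilter shape consistent. The sketch below is hypothetical (buildSourceFilter is not a library function) and assumes the source field has been declared as a filter field in the index, as in the definition above. It uses $eq for one value and $in for several; $in is an MQL operator that this guide does not itself demonstrate, so check the Atlas documentation before relying on it.

```typescript
// Hypothetical helper: builds a preFilter object for one or more `source` values.
type SourcePreFilter = {
  preFilter: { source: { $eq: string } | { $in: string[] } };
};

function buildSourceFilter(sources: string[]): SourcePreFilter {
  const [first, ...rest] = sources;
  if (first === undefined) {
    throw new Error("Provide at least one source to filter on");
  }
  // A single value maps to $eq; several values map to $in.
  return rest.length === 0
    ? { preFilter: { source: { $eq: first } } }
    : { preFilter: { source: { $in: sources } } };
}

console.log(JSON.stringify(buildSourceFilter(["https://example.com"])));
```

The returned object has the same shape as the filter constant above, so it could be passed as the third argument to similaritySearch.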

Returning scores​

If you want to execute a similarity search and receive the corresponding scores you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}
* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
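Scores are often used to discard weak matches. The following is a minimal sketch, assuming that a higher score means a closer match (which the SIM label in the output above suggests) and that results arrive as [document, score] pairs. filterByMinScore is a hypothetical helper, and a sensible threshold depends on your similarity function and data.

```typescript
// Hypothetical helper: keeps only [item, score] pairs at or above a threshold.
type Scored<T> = [T, number];

function filterByMinScore<T>(results: Scored<T>[], minScore: number): Scored<T>[] {
  return results.filter(([, score]) => score >= minScore);
}

// Sample pairs mirroring the scores printed above.
const sampleResults: Scored<string>[] = [
  ["The powerhouse of the cell is the mitochondria", 0.374],
  ["Mitochondria are made out of lipids", 0.37],
];

console.log(filterByMinScore(sampleResults, 0.372).length); // prints 1
```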

Query by turning into retriever​

You can also transform the vector store into a retriever for easier usage in your chains.

const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});
await retriever.invoke("biology");
[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { _id: '1', source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { _id: '3', source: 'https://example.com' },
    id: undefined
  }
]
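In a chain, retrieved documents are usually flattened into a single context string before being placed in a prompt. The sketch below shows one possible format. RetrievedDoc is a local stand-in for the document shape printed above, and formatDocsAsContext is a hypothetical helper rather than a LangChain function.

```typescript
// Local stand-in for the shape of the documents shown above (an assumption,
// not the library's Document class).
interface RetrievedDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: numbers each document and appends its source.
function formatDocsAsContext(docs: RetrievedDoc[]): string {
  return docs
    .map((doc, i) => {
      const source = String(doc.metadata["source"] ?? "unknown");
      return `[${i + 1}] ${doc.pageContent} (source: ${source})`;
    })
    .join("\n");
}

console.log(
  formatDocsAsContext([
    {
      pageContent: "The powerhouse of the cell is the mitochondria",
      metadata: { _id: "1", source: "https://example.com" },
    },
    {
      pageContent: "Mitochondria are made out of lipids",
      metadata: { _id: "3", source: "https://example.com" },
    },
  ])
);
```

Numbering the entries makes it easier to ask a model to cite which document supported an answer.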

Usage for retrieval-augmented generation​

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:

Closing connections​

Make sure you close the client instance when you are finished to avoid excessive resource consumption:

await client.close();
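To make sure the client is closed even when a query throws, the close call can go in a finally block. The wrapper below is a sketch: withClient and the Closable interface are hypothetical names, and the only assumption about the client is that it exposes an async close() method, as used above.

```typescript
// Anything with an async close() method (an assumption about the client's shape).
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: runs `work`, then closes the client whether or not it threw.
async function withClient<C extends Closable, R>(
  client: C,
  work: (client: C) => Promise<R>
): Promise<R> {
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}

// Possible usage with the client from this guide (not executed here):
//   await withClient(client, async () => {
//     return vectorStore.similaritySearch("biology", 2);
//   });
```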

API reference​

For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.

