Mongo Vector Date Filtering
Research conducted with OpenAI's ChatGPT DeepResearch
Discussion
The documentation for MongoDB Atlas Vector Search is not clear on how to filter by date. This is a simple example of how to do this. I attempted to find the answer with OpenAI DeepResearch, but the answer did not work.
I do not blame DeepResearch for this. MongoDB Atlas Vector Search is a relatively new feature, its documentation is not yet complete, and there are few examples of it in the wild. That’s why I’m sharing this.
I took a different approach: I had Cursor simplify the problem and try multiple solutions with cURL commands. The test endpoint can be seen below:
import { json } from '@sveltejs/kit';
import { prisma } from '$lib/utils/prisma.server';
import { getEmbedding } from '$lib/common/embeddings';
/**
 * POST endpoint to test vector search with date filtering
 * Request body:
 * {
 *   "query": "search query",
 *   "startDate": "2023-01-01", // optional
 *   "endDate": "2023-12-31", // optional
 *   "clientId": "client123", // optional
 *   "filterFormat": 1 // optional (1-4)
 * }
 */
export async function POST({ request }) {
  try {
    // Parse request body
    const body = await request.json();
    const {
      query = 'customer service',
      startDate,
      endDate,
      clientId,
      filterFormat = 1
    } = body;
    return await performVectorSearch(query, startDate, endDate, clientId, filterFormat);
  } catch (error) {
    console.error('Error in test-vector-search POST:', error);
    return json({ error: error.message }, { status: 500 });
  }
}
/**
 * GET endpoint to test vector search with date filtering
 * Query parameters:
 * - query: The search query to embed and search for
 * - startDate: Optional start date in ISO format (YYYY-MM-DD)
 * - endDate: Optional end date in ISO format (YYYY-MM-DD)
 * - clientId: Optional client ID to filter by
 * - filterFormat: Optional parameter to specify the date filter format (1, 2, 3, or 4)
 */
export async function GET({ url }) {
  try {
    // Get query parameters
    const query = url.searchParams.get('query') || 'customer service';
    const startDate = url.searchParams.get('startDate');
    const endDate = url.searchParams.get('endDate');
    const clientId = url.searchParams.get('clientId');
    const filterFormat = parseInt(url.searchParams.get('filterFormat') || '1', 10);
    return await performVectorSearch(query, startDate, endDate, clientId, filterFormat);
  } catch (error) {
    console.error('Error in test-vector-search GET:', error);
    return json({ error: error.message }, { status: 500 });
  }
}
/**
 * Helper function to perform vector search with the given parameters
 */
async function performVectorSearch(query, startDate, endDate, clientId, filterFormat) {
  // Generate embedding for the query
  const queryEmbedding = await getEmbedding(query);
  // Build base filter
  const baseFilter: any = {
    type: 'VOTC'
  };
  if (clientId) {
    baseFilter.client_id = clientId;
  }
  // Create date filter in different formats based on the format parameter
  let vectorSearchFilter = { ...baseFilter };
  if (startDate && endDate) {
    const startDateObj = new Date(startDate);
    const endDateObj = new Date(endDate);
    // Try different date filter formats
    switch (filterFormat) {
      case 1: // Simple object with $gte and $lte
        vectorSearchFilter.started_at = {
          $gte: startDateObj,
          $lte: endDateObj
        };
        break;
      case 2: // Using $and with separate conditions
        vectorSearchFilter.$and = [
          { started_at: { $gte: startDateObj } },
          { started_at: { $lte: endDateObj } }
        ];
        break;
      case 3: // Using $and with combined condition
        vectorSearchFilter.$and = [
          {
            started_at: {
              $gte: startDateObj,
              $lte: endDateObj
            }
          }
        ];
        break;
      case 4: // Using $and with $date operator
        vectorSearchFilter.$and = [
          {
            started_at: {
              $gte: { $date: startDateObj },
              $lte: { $date: endDateObj }
            }
          }
        ];
        break;
    }
  }
  console.log('Vector search filter:', JSON.stringify(vectorSearchFilter, null, 2));
  // Build the pipeline
  const pipeline = [
    {
      $vectorSearch: {
        queryVector: queryEmbedding,
        path: 'embedding',
        numCandidates: 100,
        limit: 10,
        index: 'default',
        filter: vectorSearchFilter
      }
    },
    {
      $project: {
        _id: 1,
        conversation_id: 1,
        client_id: 1,
        summary: 1,
        started_at: 1,
        score: { $meta: 'vectorSearchScore' }
      }
    }
  ];
  // Execute the pipeline
  const results = await prisma.summaries.aggregateRaw({
    pipeline
  });
  return json({
    query,
    filter: vectorSearchFilter,
    results,
    count: Array.isArray(results) ? results.length : 0
  });
}
With this pattern, Cursor sends a cURL command for each filter format and inspects the response. The fourth format, which wraps each date bound in $date, worked.
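The comparison across formats can be sketched roughly as follows. This is an illustration, not the commands Cursor actually ran: the route path `/api/test-vector-search` and the base URL are assumptions (the source only shows the handler), and `buildTestUrl` is a hypothetical helper.

```typescript
// Hypothetical helper: builds the GET URL for one filter format.
// The base URL and route path are assumptions; adjust them to your app.
function buildTestUrl(
  baseUrl: string,
  filterFormat: number,
  startDate: string,
  endDate: string,
  query = 'customer service'
): string {
  const url = new URL('/api/test-vector-search', baseUrl);
  url.searchParams.set('query', query);
  url.searchParams.set('startDate', startDate);
  url.searchParams.set('endDate', endDate);
  url.searchParams.set('filterFormat', String(filterFormat));
  return url.toString();
}

// One URL per candidate format (1-4), ready to pass to cURL or fetch.
const testUrls: string[] = [1, 2, 3, 4].map((format) =>
  buildTestUrl('http://localhost:5173', format, '2023-01-01', '2023-12-31')
);

for (const testUrl of testUrls) {
  console.log(`curl "${testUrl}"`);
}
```

Because only `filterFormat` changes between requests, any difference in the responses can be attributed to the shape of the date filter.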
Final Solution
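You will see that we redeclared the vectorSearchFilter object and then added the date filter to it if one was present. The bounds are wrapped in $date because the $vectorSearch stage does not accept string values for the $gte and $lte operators on a date field; it requires real dates (the ISODate type). A likely explanation is that Prisma's aggregateRaw takes the pipeline as JSON, so a plain JavaScript Date arrives as a string unless it is written in Extended JSON form as { $date: ... }.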
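A small sketch of why formats 1 to 3 may have failed, assuming the pipeline is serialized to JSON before it reaches MongoDB. That is an assumption about the cause, not something the tests above prove. The helper name `toExtendedJsonDate` is hypothetical.

```typescript
// A Date inside a plain object becomes an ISO string once JSON-serialized,
// so the $gte operand is no longer typed as a date.
const naiveFilter = { started_at: { $gte: new Date('2023-01-01T00:00:00Z') } };
const serialized = JSON.parse(JSON.stringify(naiveFilter));
console.log(typeof serialized.started_at.$gte); // "string"

// Hypothetical helper: wrap the value in Extended JSON ({ $date: ... })
// so the date type survives serialization.
function toExtendedJsonDate(value: Date | string): { $date: string } {
  const date = value instanceof Date ? value : new Date(value);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid date: ${String(value)}`);
  }
  return { $date: date.toISOString() };
}

const typedFilter = {
  started_at: {
    $gte: toExtendedJsonDate('2023-01-01'),
    $lte: toExtendedJsonDate('2023-12-31')
  }
};
console.log(JSON.stringify(typedFilter, null, 2));
```

The helper also rejects unparseable input, which would otherwise serialize silently as `null`.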
In this endpoint, we’re generating a date filter from natural language via Structured Outputs from OpenAI.
The final solution applied to our $vectorSearch stage is below:
// Build the filter for vector search
let vectorSearchFilter: any = {
  client_id,
  type: 'VOTC'
};
// Add date filter to the vector search filter if present
if (dateFilter) {
  vectorSearchFilter = {
    ...vectorSearchFilter,
    started_at: {
      $gte: { $date: dateFilter.$gte },
      $lte: { $date: dateFilter.$lte }
    }
  };
}
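The snippet above consumes `dateFilter` without showing its shape. As a sketch, assuming the Structured Outputs response parses to an object with ISO 8601 strings under `$gte` and `$lte` (the source does not show the schema), a hypothetical guard such as `parseDateFilter` could reject malformed ranges before they reach `$vectorSearch`:

```typescript
// Assumed shape of the model output; the real schema is not shown in the source.
interface DateFilter {
  $gte: string;
  $lte: string;
}

// Hypothetical guard: returns a normalized DateFilter, or null when the
// input is missing, unparseable, or describes an inverted range.
function parseDateFilter(input: unknown): DateFilter | null {
  if (typeof input !== 'object' || input === null) return null;
  const candidate = input as Record<string, unknown>;
  const { $gte, $lte } = candidate;
  if (typeof $gte !== 'string' || typeof $lte !== 'string') return null;
  const start = new Date($gte);
  const end = new Date($lte);
  if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) return null;
  if (start.getTime() > end.getTime()) return null;
  return { $gte: start.toISOString(), $lte: end.toISOString() };
}

const parsedDateFilter = parseDateFilter({ $gte: '2023-01-01', $lte: '2023-12-31' });
console.log(parsedDateFilter);
```

Returning `null` for bad input keeps the calling code's `if (dateFilter)` check meaningful: an invalid range simply results in a search without a date constraint.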
MongoDB Vector Search Aggregation with Date Filter
To include a date range filter in a MongoDB Atlas Vector Search pipeline, use the $vectorSearch stage with a filter on the date field. Ensure the date field is indexed as a filter in your vector index definition. Use the ISODate format (or a Date object in your driver) for the $gte and $lte bounds. For example, to retrieve documents between January 1, 2023 and December 31, 2023:
⚠️ This does not work.
db.collection.aggregate([
  {
    $vectorSearch: {
      index: "yourVectorIndex", // name of your Atlas Vector Search index
      path: "embedding", // vector field path
      queryVector: [/* your query vector */],
      k: 10, // number of nearest neighbors to return
      filter: {
        $and: [
          { dateField: { $gte: ISODate("2023-01-01T00:00:00Z") } },
          { dateField: { $lte: ISODate("2023-12-31T23:59:59Z") } }
        ]
      }
    }
  }
]);
This will pre-filter the vector search to only consider documents whose dateField lies in the specified range.
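For a filter to be accepted at all, the filtered fields have to appear in the vector index definition. Below is a sketch of what that definition might look like for the fields used in this post, written as a TypeScript constant. The values of `numDimensions` and `similarity` are assumptions that depend on your embedding model, and the definition itself is normally created in Atlas rather than in application code.

```typescript
// Sketch of an Atlas Vector Search index definition.
// numDimensions and similarity are assumed values, not taken from the source.
const vectorIndexDefinition = {
  fields: [
    { type: 'vector', path: 'embedding', numDimensions: 1536, similarity: 'cosine' },
    // Every field referenced in the $vectorSearch filter needs a filter entry.
    { type: 'filter', path: 'started_at' },
    { type: 'filter', path: 'client_id' },
    { type: 'filter', path: 'type' }
  ]
};

// Quick sanity check: list the paths that can be used for pre-filtering.
const filterPaths: string[] = vectorIndexDefinition.fields
  .filter((field) => field.type === 'filter')
  .map((field) => field.path);
console.log(filterPaths);
```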
Relevant Sources
Source & Link | TL;DR Summary |
---|---|
MongoDB Atlas Docs – Vector Search Filters (Run Vector Search Queries) (mongodb - Filtering in Atlas Vector Search) | Atlas Vector Search allows pre-filtering on metadata fields of type boolean, date, number, objectId, string, or UUID. To use a field in the filter option, you must index that field as a “filter” type in the vector index definition. This narrows the search scope (fewer documents to compare), improving query latency and accuracy. |
MongoDB Atlas Docs – Date Range Filter Example (Compound Query with Date Range) (How to Run Atlas Search Queries with a Date Range Filter) | Demonstrates how to filter documents by a date range in an aggregation query. Uses a range filter on a date field with gt/lt and ISODate values to find documents released in 2015. (In Atlas Search, the range operator accepts ISODate strings or objects to define the date bounds.) |
Stack Overflow – Filtering by Date in Atlas Search (Can the Atlas search filter in MongoDB cloud be used to query between dates?) | Q&A explaining that to query between dates in Atlas Search, you should use the range operator on the date field with gte and lte values. The answer provides an example with a date field "saleDate" and shows using "$gte": "2014-03-31T16:02:06.624+00:00", "$lte": "2017-12-08T21:40:34.527+00:00" (ISO 8601 strings) to filter the results. |
MongoDB Community Forum – Vector Search Date Filter Issue (How can I filter vectors by date?) | A user encountered an error when attempting to filter by date in a vector search. They tried filter={"$and": [{"my_date_field": {"$lte": datetime.now()}}]} and got “Operand type is not supported for $vectorSearch: date”. This highlights that the filter syntax must use a proper date type (e.g. ISODate or BSON Date) recognized by MongoDB, not a raw Python datetime. |
GitHub Issue (LangChain) – Date Filtering Limitation (MongoDB Atlas Vector Search Does Not Support Date Pre Filters) | Discussion of a known limitation in early Atlas Vector Search: the $vectorSearch stage did not support direct filtering on BSON dates at the time, causing errors. One workaround mentioned was converting dates to a numeric format (e.g. Unix epoch timestamps) since numeric filters are supported. It’s also noted that $vectorSearch must be the first stage in the pipeline (you cannot put a $match before it for pre-filtering). |
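
The numeric workaround mentioned in the last row can be sketched as follows. The field name `started_at_ms` is hypothetical: the sketch assumes each document stores a copy of its date as Unix epoch milliseconds and that this field is indexed as a filter.

```typescript
// Converts a Date or date string to Unix epoch milliseconds.
function toEpochMillis(value: Date | string): number {
  const millis = (value instanceof Date ? value : new Date(value)).getTime();
  if (Number.isNaN(millis)) {
    throw new Error(`Invalid date: ${String(value)}`);
  }
  return millis;
}

type NumericRange = { $gte: number; $lte: number };

// Builds a numeric range filter. `started_at_ms` is a hypothetical field name.
function buildEpochRangeFilter(
  start: Date | string,
  end: Date | string,
  field = 'started_at_ms'
): Record<string, NumericRange> {
  return { [field]: { $gte: toEpochMillis(start), $lte: toEpochMillis(end) } };
}

const epochFilter = buildEpochRangeFilter('2023-01-01', '2023-12-31');
console.log(JSON.stringify(epochFilter));
```

Since plain numbers survive JSON serialization unchanged, this approach sidesteps the date-typing question entirely, at the cost of maintaining a second field.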