Metadata
vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference the metadata.
Types
Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.
- Boolean
- String
- Number
The technical limit of a metadata field associated with a vector is 1GB. In practice you should keep metadata fields as small as possible to maximize performance.
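As a rough illustration of the type and size guidance above, the sketch below checks a metadata dictionary before it is stored. The helper names `non_scalar_keys` and `approx_json_size` are hypothetical and not part of vecs, and the size of the JSON text is only an approximation of the size of the stored binary JSON.

```python
import json


def non_scalar_keys(metadata: dict) -> list:
    """Return the keys whose values are not a boolean, string, or number."""
    return [
        key
        for key, value in metadata.items()
        if not isinstance(value, (bool, str, int, float))
    ]


def approx_json_size(metadata: dict) -> int:
    """Size in bytes of the metadata serialized as UTF-8 JSON text.

    This is only a proxy: the binary JSON stored by the database
    will generally have a different size.
    """
    return len(json.dumps(metadata).encode("utf-8"))


metadata = {"year": 2020, "title": "Report", "is_new": True, "nested": {"a": 1}}
print(non_scalar_keys(metadata))   # keys that are not plain scalars
print(approx_json_size(metadata))  # approximate serialized size in bytes
```

A check like this could run before inserting records, so that oversized or deeply nested metadata is caught early.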
Metadata Query Language
The metadata query language is loosely based on MongoDB's selectors. vecs currently supports a subset of those operators.
Comparison Operators
Comparison operators compare a provided value with a value stored in a metadata field of the vector store.
Operator | Description |
---|---|
$eq | Matches values that are equal to a specified value |
$ne | Matches values that are not equal to a specified value |
$gt | Matches values that are greater than a specified value |
$gte | Matches values that are greater than or equal to a specified value |
$lt | Matches values that are less than a specified value |
$lte | Matches values that are less than or equal to a specified value |
$in | Matches values that are contained in a list of specified scalar values |
$contains | Matches values where a scalar is contained within an array metadata field |
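To make the semantics in the table concrete, here is a minimal pure-Python sketch that evaluates a single comparison clause against one metadata dictionary. It is an illustration only and not how vecs evaluates filters; the name `matches_clause` is hypothetical, and treating a missing field as a non-match is an assumption.

```python
import operator

# Hypothetical mapping from operator name to a Python predicate.
# Each predicate receives (stored_value, provided_value).
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    # stored scalar must appear in the provided list
    "$in": lambda stored, wanted: stored in wanted,
    # provided scalar must appear in the stored array
    "$contains": lambda stored, wanted: isinstance(stored, list) and wanted in stored,
}


def matches_clause(metadata: dict, clause: dict) -> bool:
    """Evaluate one {"field": {"$op": value}} clause against a metadata dict."""
    field, condition = next(iter(clause.items()))
    op, wanted = next(iter(condition.items()))
    if field not in metadata:
        # Assumption: a missing field never matches.
        return False
    return COMPARATORS[op](metadata[field], wanted)


record = {"year": 2020, "gross": 5000.0, "tags": ["important", "draft"]}
print(matches_clause(record, {"year": {"$eq": 2020}}))
print(matches_clause(record, {"tags": {"$contains": "important"}}))
```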
Logical Operators
Logical operators compose other operators, and can be nested.
Operator | Description |
---|---|
$and | Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses. |
$or | Joins query clauses with a logical OR; returns all documents that match the conditions of either clause. |
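Because logical operators can be nested, filters are sometimes easier to assemble programmatically. The sketch below builds a nested filter from small helper functions; `field`, `and_`, and `or_` are hypothetical names introduced for illustration and are not part of vecs.

```python
def field(name: str, op: str, value) -> dict:
    """Build a single comparison clause, e.g. {"year": {"$eq": 2020}}."""
    return {name: {op: value}}


def and_(*clauses: dict) -> dict:
    """Combine clauses so that all of them must match."""
    return {"$and": list(clauses)}


def or_(*clauses: dict) -> dict:
    """Combine clauses so that at least one of them must match."""
    return {"$or": list(clauses)}


# year >= 2020 AND (priority == "enterprise" OR gross > 5000.0)
nested_filter = and_(
    field("year", "$gte", 2020),
    or_(
        field("priority", "$eq", "enterprise"),
        field("gross", "$gt", 5000.0),
    ),
)
print(nested_filter)
```

The result is a plain dictionary in the same shape as the filters shown in this document.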
Performance
For best performance, use scalar key-value pairs for metadata and prefer $eq, $and, and $or filters where possible. Those variants are most consistently able to make use of indexes.
Examples
year equals 2020
    {"year": {"$eq": 2020}}
year equals 2020 or gross greater than or equal to 5000.0
    {
      "$or": [
        {"year": {"$eq": 2020}},
        {"gross": {"$gte": 5000.0}}
      ]
    }
last_name is less than "Brown" and is_priority_customer is true
    {
      "$and": [
        {"last_name": {"$lt": "Brown"}},
        {"is_priority_customer": {"$eq": true}}
      ]
    }
priority contained by ["enterprise", "pro"]
    {
      "priority": {"$in": ["enterprise", "pro"]}
    }
tags, an array, contains the string "important"
    {
      "tags": {"$contains": "important"}
    }
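Filters like the ones above are passed to a collection query as a Python dictionary. The following is a minimal sketch, assuming the collection object exposes a `query` method that accepts `data`, `limit`, `filters`, and `include_metadata` keyword arguments; the wrapper name `filtered_query` is hypothetical.

```python
from typing import Any, Optional


def filtered_query(
    collection: Any,
    query_vector: list,
    filters: Optional[dict] = None,
    limit: int = 5,
):
    """Run a similarity query restricted by a metadata filter.

    Assumes `collection.query` accepts the keyword arguments used below.
    """
    return collection.query(
        data=query_vector,
        limit=limit,
        filters=filters,
        include_metadata=True,
    )


# Example usage (requires a running database, so it is left commented out;
# the connection string and collection name are placeholders):
#
# import vecs
# vx = vecs.create_client("postgresql://<user>:<password>@<host>:<port>/<db_name>")
# docs = vx.get_or_create_collection(name="docs", dimension=3)
# results = filtered_query(docs, [0.4, 0.5, 0.6], {"tags": {"$contains": "important"}})
```

Keeping the filter as a separate argument makes it easy to reuse the same query vector with different metadata conditions.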