Semantic search
Learn how to search by meaning rather than exact keywords.
Semantic search interprets the meaning behind user queries rather than exact keywords. It uses machine learning to capture the intent and context behind the query, handling language nuances like synonyms, phrasing variations, and word relationships.
When to use semantic search
Semantic search is useful in applications where the depth of understanding and context is important for delivering relevant results. A good example is in customer support or knowledge base search engines. Users often phrase their problems or questions in various ways, and a traditional keyword-based search might not always retrieve the most helpful documents. With semantic search, the system can understand the meaning behind the queries and match them with relevant solutions or articles, even if the exact wording differs.
For instance, a user searching for "increase text size on display" might miss articles titled "How to adjust font size in settings" in a keyword-based search system. However, a semantic search engine would understand the intent behind the query and correctly match it to relevant articles, regardless of the specific terminology used.
It's also possible to combine semantic search with keyword search to get the best of both worlds. See Hybrid search for more details.
How semantic search works
Semantic search uses an intermediate representation called an “embedding vector” to link database records with search queries. A vector, in the context of semantic search, is a list of numerical values. These values represent various features of the text and allow for semantic comparison between different pieces of text.
The best way to think of embeddings is by plotting them on a graph, where each embedding is a single point whose coordinates are the numerical values within its vector. Importantly, embeddings are plotted such that similar concepts are positioned close together while dissimilar concepts are far apart. For more details, see What are embeddings?
Embeddings are generated using a language model, and embeddings are compared to each other using a similarity metric. The language model is trained to understand the semantics of language, including syntax, context, and the relationships between words. It generates embeddings for both the content in the database and the search queries. Then the similarity metric, often a function like cosine similarity or dot product, is used to compare the query embeddings with the document embeddings (in other words, to measure how close they are to each other on the graph). The documents with embeddings most similar to the query's are deemed the most relevant and are returned as search results.
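To make the comparison step concrete, here is a minimal sketch in JavaScript. The three-dimensional vectors are made-up toy values, not real model output (real embeddings typically have hundreds of dimensions), and the function names cosineSimilarity and rankBySimilarity are illustrative rather than part of any library.

```javascript
// Dot product of two equal-length numeric arrays.
function dot(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions')
  }
  return a.reduce((sum, value, i) => sum + value * b[i], 0)
}

// Cosine similarity: 1 means same direction, -1 means opposite direction.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Rank documents from most to least similar to the query embedding.
function rankBySimilarity(queryEmbedding, documents) {
  return documents
    .map((doc) => ({
      ...doc,
      similarity: cosineSimilarity(queryEmbedding, doc.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
}

// Toy data: hand-picked vectors standing in for real embeddings.
const toyDocuments = [
  { content: 'How to adjust font size in settings', embedding: [0.9, 0.1, 0.0] },
  { content: 'Resetting your password', embedding: [0.0, 0.2, 0.9] },
]
const toyQuery = [0.8, 0.2, 0.1] // stands in for "increase text size on display"

const ranked = rankBySimilarity(toyQuery, toyDocuments)
console.log(ranked.map((doc) => doc.content))
```

With these toy values the font size article ranks first, because its vector points in nearly the same direction as the query vector.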
Embedding models
There are many embedding models available today. Supabase Edge Functions has built-in support for the gte-small model. Others can be accessed through third-party APIs like OpenAI, where you send your text in the request and receive an embedding vector in the response. Still others can run locally on your own compute, such as through Transformers.js for JavaScript implementations. For more information on local implementation, see Generate embeddings.
It's crucial to remember that when using embedding models with semantic search, you must use the same model for all embedding comparisons. Comparing embeddings created by different models will yield meaningless results.
Semantic search in Postgres
To implement semantic search in Postgres we use pgvector, an extension that allows for efficient storage and retrieval of high-dimensional vectors. These vectors are numerical representations of text (or other types of data) generated by embedding models.
- Enable the pgvector extension by running:

  ```sql
  create extension vector
  with
    schema extensions;
  ```

- Create a table to store the embeddings:

  ```sql
  create table documents (
    id bigint primary key generated always as identity,
    content text,
    embedding vector(512)
  );
  ```

  Or if you have an existing table, you can add a vector column like so:

  ```sql
  alter table documents
  add column embedding vector(512);
  ```

  In this example, we create a column named embedding which uses the newly enabled vector data type. The size of the vector (as indicated in parentheses) represents the number of dimensions in the embedding. Here we use 512, but adjust this to match the number of dimensions produced by your embedding model.
For more details on vector columns, including how to generate embeddings and store them, see Vector columns.
Similarity metric
pgvector supports 3 operators for computing distance between embeddings:
Operator | Description |
---|---|
<-> | Euclidean distance |
<#> | negative inner product |
<=> | cosine distance |
These operators are used directly in your SQL query to retrieve records that are most similar to the user's search query. Choosing the right operator depends on your needs. Inner product (also known as dot product) tends to be the fastest if your vectors are normalized.
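As a rough illustration of what each operator computes, the JavaScript sketch below reimplements the three distances on plain arrays. The function names are illustrative and not part of pgvector; in practice the database computes these values for you.

```javascript
// Plain-JavaScript equivalents of the three pgvector distance operators.
// For all three, a smaller value means the vectors are more similar.

function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0)
}

// <-> Euclidean (L2) distance
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0))
}

// <#> negative inner product
function negativeInnerProduct(a, b) {
  return -dot(a, b)
}

// <=> cosine distance, defined as 1 - cosine similarity
function cosineDistance(a, b) {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Two unit-length (normalized) toy vectors.
const a = [1, 0]
const b = [0.6, 0.8]

console.log(euclideanDistance(a, b))
console.log(negativeInnerProduct(a, b))
console.log(cosineDistance(a, b))
```

For unit-length vectors, cosine distance equals 1 plus the negative inner product, so both operators produce the same ranking. The inner product simply skips the work of dividing by the vector lengths.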
The easiest way to perform semantic search in Postgres is by creating a function:
```sql
-- Match documents using cosine distance (<=>)
create or replace function match_documents (
  query_embedding vector(512),
  match_threshold float,
  match_count int
)
returns setof documents
language sql
as $$
  select *
  from documents
  where documents.embedding <=> query_embedding < 1 - match_threshold
  order by documents.embedding <=> query_embedding asc
  limit least(match_count, 200);
$$;
```
Here we create a function match_documents that accepts three parameters:

- query_embedding: a one-time embedding generated for the user's search query. Here we set the size to 512, but adjust this to match the number of dimensions produced by your embedding model.
- match_threshold: the minimum similarity between embeddings. This is a value between 1 and -1, where 1 is most similar and -1 is most dissimilar.
- match_count: the maximum number of results to return. Note the query may return fewer than this number if match_threshold resulted in a small shortlist. Limited to 200 records to avoid unintentionally overloading your database.
In this example, we return a setof documents and refer to documents throughout the query. Adjust this to use the relevant tables in your application.
You'll notice we are using the cosine distance (<=>) operator in our query. Cosine distance is a safe default when you don't know whether or not your embeddings are normalized. If you know for a fact that they are normalized (for example, your embedding is returned from OpenAI), you can use negative inner product (<#>) for better performance:
```sql
-- Match documents using negative inner product (<#>)
create or replace function match_documents (
  query_embedding vector(512),
  match_threshold float,
  match_count int
)
returns setof documents
language sql
as $$
  select *
  from documents
  where documents.embedding <#> query_embedding < -match_threshold
  order by documents.embedding <#> query_embedding asc
  limit least(match_count, 200);
$$;
```
Note that since <#> returns the negative inner product, we negate match_threshold accordingly in the where clause. For more information on the different operators, see the pgvector docs.
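If you are unsure whether your embeddings are normalized, you can check their length before switching operators. The JavaScript sketch below uses hypothetical helpers (l2Norm, isNormalized, normalize) and an arbitrary tolerance of 1e-6. It also shows that, for unit-length vectors, the cosine distance filter and the negated inner product filter agree.

```javascript
// L2 norm (length) of a vector.
function l2Norm(v) {
  return Math.sqrt(v.reduce((sum, value) => sum + value * value, 0))
}

// True when the vector has unit length, within a small tolerance.
function isNormalized(v, tolerance = 1e-6) {
  return Math.abs(l2Norm(v) - 1) <= tolerance
}

// Scale a non-zero vector to unit length.
function normalize(v) {
  const norm = l2Norm(v)
  if (norm === 0) throw new Error('Cannot normalize a zero vector')
  return v.map((value) => value / norm)
}

const dot = (a, b) => a.reduce((sum, value, i) => sum + value * b[i], 0)

const query = normalize([3, 4])
const doc = normalize([4, 3])
const matchThreshold = 0.78

// Cosine distance filter: embedding <=> query < 1 - match_threshold
const cosineDistance = 1 - dot(query, doc) / (l2Norm(query) * l2Norm(doc))
const passesCosine = cosineDistance < 1 - matchThreshold

// Negative inner product filter: embedding <#> query < -match_threshold
const passesInnerProduct = -dot(query, doc) < -matchThreshold

console.log(isNormalized([3, 4]), isNormalized(query))
console.log(passesCosine, passesInnerProduct)
```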
Calling from your application
Finally, you can execute this function from your application. If you are using a Supabase client library such as supabase-js, you can invoke it using the rpc() method:
```javascript
const { data: documents } = await supabase.rpc('match_documents', {
  query_embedding: embedding, // pass the query embedding
  match_threshold: 0.78, // choose an appropriate threshold for your data
  match_count: 10, // choose the number of matches
})
```
You can also call this function directly from SQL:
```sql
select *
from match_documents(
  '[...]'::vector(512), -- pass the query embedding
  0.78, -- choose an appropriate threshold for your data
  10 -- choose the number of matches
);
```
In this scenario, you'll likely use a Postgres client library to establish a direct connection from your application to the database. It's best practice to parameterize your arguments before executing the query.
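As a sketch of what parameterization can look like, the JavaScript below builds a query object with $1-style placeholders instead of concatenating values into the SQL string. The { text, values } shape is the query config format accepted by node-postgres (pg), which is an assumed choice of client here, and toVectorLiteral and buildMatchQuery are hypothetical helper names. The three-element embedding is a toy value; a real one would have as many dimensions as your model produces.

```javascript
// Serialize a numeric array into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding) {
  if (!Array.isArray(embedding) || !embedding.every(Number.isFinite)) {
    throw new TypeError('Embedding must be an array of finite numbers')
  }
  return `[${embedding.join(',')}]`
}

// Build a parameterized call to match_documents.
// Values travel separately from the SQL text, so they are never
// interpolated into the query string.
function buildMatchQuery(embedding, matchThreshold, matchCount) {
  return {
    text: 'select * from match_documents($1::vector, $2, $3)',
    values: [toVectorLiteral(embedding), matchThreshold, matchCount],
  }
}

const matchQuery = buildMatchQuery([0.1, 0.2, 0.3], 0.78, 10)
console.log(matchQuery.text)
console.log(matchQuery.values)

// With node-postgres (assumed client), this could be executed as:
//   const { rows } = await client.query(matchQuery)
```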
Next steps
As your database scales, you will need an index on your vector columns to maintain fast query speeds. See Vector indexes for an in-depth guide on the different types of indexes and how they work.