
Key vector concepts for pgvector

Learn how pgvector stores embeddings in ordinary tables, how nearest-neighbor queries and distance operators work, and why approximate indexes matter on Tiger Cloud.

This page explains the ideas behind pgvector in PostgreSQL: the vector column type, typical table shapes, k-nearest-neighbor queries, distance metrics, and approximate nearest neighbor (ANN) indexes. For how these pieces fit with pgvectorscale and Tiger Cloud, see Understand pgvector and pgvectorscale.

Embeddings live in normal tables: you add a column whose type is provided by the pgvector extension (for example vector(1536) when your model outputs 1,536 dimensions). A common pattern is to store each embedding next to the text it represents and optional JSONB metadata:

```sql
CREATE TABLE IF NOT EXISTS document_embedding (
    id          BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    document_id BIGINT REFERENCES document (id),
    metadata    JSONB,
    contents    TEXT,
    embedding   vector(1536)
);
```

The row has a primary key, an optional foreign key to a parent document, metadata, the chunk text (if you keep it in the database), and the embedding vector.
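A minimal insert sketch against the table above. The values are illustrative placeholders, not real model output; an actual embedding must match the declared dimension (1,536 here), and the vector is written as a bracketed literal:

```sql
INSERT INTO document_embedding (document_id, metadata, contents, embedding)
VALUES (
    42,                          -- hypothetical parent document id
    '{"source": "faq"}',         -- optional JSONB metadata
    'Chunk text goes here',
    '[0.1, 0.2, 0.3]'            -- placeholder; real vectors need 1,536 dimensions
);
```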

Why a separate table instead of a column on document


Embedding models and LLMs enforce context limits. Long documents are usually chunked; each chunk gets its own embedding. That is naturally a one-to-many relationship (one document, many embedding rows), modeled with document_id (or similar) on the embedding table.

If you only store vectors and not source text, you can drop the foreign key or keep a loose link inside JSONB. Both patterns are common.

The usual question is: which stored vectors are closest to this query vector? That is k-nearest neighbors (k-NN). In SQL you order by a distance expression and cap rows with LIMIT.

Here $1 is a parameter holding the query embedding. The <=> operator is the cosine distance operator when you use the cosine operator class on the column (see Distance metrics):

SELECT *
FROM document_embedding
ORDER BY embedding <=> $1
LIMIT 10;

This returns the ten rows whose embeddings are most similar to the query under that metric. Because this is PostgreSQL, you can add WHERE filters (including on JSONB), JOINs, and everything else you already use.
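As a sketch of that idea, the k-NN query can be combined with a JSONB filter and a join back to the parent table. This assumes a `document` table with a `title` column and a `source` key in `metadata`, neither of which is defined in the schema above:

```sql
SELECT d.title,
       e.contents,
       e.embedding <=> $1 AS distance
FROM document_embedding e
JOIN document d ON d.id = e.document_id
WHERE e.metadata->>'source' = 'faq'   -- hypothetical metadata filter
ORDER BY e.embedding <=> $1
LIMIT 10;
```

Note that restrictive filters can interact with ANN indexes (the index may return fewer than `LIMIT` matching rows); pgvectorscale's filtered-search support addresses this case.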

The example above uses cosine distance via <=>. pgvector also supports L2 (<->) and inner product (<#>) for the matching operator classes. Your index must be built for the same distance you use in ORDER BY, and you should follow your embedding provider’s guidance (many text models are tuned for cosine or inner product on unit-normalized vectors).

In practice, cosine distance on normalized vectors is a solid default for many text-embedding pipelines.

Cosine-related scores measure how aligned two directions are in high-dimensional space. For unit vectors (length 1), ranking by cosine distance and ranking by inner product coincide; pgvector’s operator-class docs spell out the exact definitions for <=> and friends. For background on cosine similarity, see a standard reference such as Cosine similarity (Wikipedia).

The inner product (dot product) scales with vector length as well as angle. With unit vectors, ordering by inner product often matches ordering by cosine similarity.

L2 is straight-line distance in the embedding space. It is sensitive to magnitude and is widely used in clustering and geometry-heavy workflows.

For unit-length embeddings, cosine distance, negative inner product, and Euclidean distance can yield the same ranking of neighbors (up to ties). That is why many teams standardize on normalization + cosine unless a model vendor documents a different contract.
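The equivalence is easy to check numerically. The sketch below uses toy vectors in place of real embeddings and plain-Python versions of the three metrics behind pgvector's `<=>`, `<->`, and `<#>` operators; for unit-normalized inputs, all three produce the same neighbor ordering:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Plain-Python counterparts of pgvector's <=> , <-> , and <#> metrics.
def cosine_distance(a, b):
    return 1 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inner_product(a, b):
    return -dot(a, b)

# Toy stand-ins for embeddings, all unit-normalized.
query = normalize([0.3, 0.4, 0.5])
corpus = [normalize(v) for v in ([1, 0, 0], [0.2, 0.9, 0.1],
                                 [0.3, 0.4, 0.5], [-1, 1, 0])]

# Rank the corpus by each metric; the orderings should be identical.
rankings = {
    fn.__name__: sorted(range(len(corpus)), key=lambda i: fn(query, corpus[i]))
    for fn in (cosine_distance, l2_distance, negative_inner_product)
}
print(rankings)
```

The reason is simple: for unit vectors, L2 distance squared equals `2 - 2·dot`, cosine distance equals `1 - dot`, and negative inner product equals `-dot`, so all three are monotone in the same dot product.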

Classic multidimensional indexes (think R-tree-style ideas) work well at low dimensionality. Embeddings are often hundreds or thousands of dimensions, where exact structures stop scaling (curse of dimensionality). Exact k-NN over huge tables without an index is too expensive for interactive search.

ANN indexes trade a small amount of recall for much faster queries. pgvector ships HNSW and IVFFlat; pgvectorscale adds StreamingDiskANN for large or disk-heavy workloads. Deep dive: pgvectorscale reference.
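Both pgvector index types are created with standard `CREATE INDEX ... USING` syntax; the operator class selects the distance metric (cosine shown here, matching the `<=>` queries above). The `lists` value for IVFFlat is an illustrative starting point, not a tuned recommendation:

```sql
-- HNSW index for cosine-distance queries
CREATE INDEX ON document_embedding
    USING hnsw (embedding vector_cosine_ops);

-- IVFFlat alternative; lists should be tuned to your row count
CREATE INDEX ON document_embedding
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
```

Remember the constraint from the previous section: the operator class here must match the operator used in `ORDER BY`, or the planner cannot use the index.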

On Tiger Cloud you typically use pgvector for the vector type and for HNSW / IVFFlat. StreamingDiskANN comes from pgvectorscale (it builds on pgvector). The table is a coarse comparison (your data size, recall, and update pattern matter more than any one row):

| Index | Build time | Typical query speed | Rebuild after many updates? |
| --- | --- | --- | --- |
| StreamingDiskANN (pgvectorscale) | Fast | Often strongest at scale | Usually no (check release notes for your version) |
| HNSW (pgvector) | Fast | Fast | Usually no |
| IVFFlat (pgvector) | Fastest | Often slowest of the three | Often yes (lists can go stale) |

For methodology and benchmark context, see the discussion in pgvectorscale reference and Understand pgvector and pgvectorscale.

If you are on Tiger Cloud and expect large embedding tables or strict latency goals, evaluate StreamingDiskANN from pgvectorscale alongside HNSW. For smaller datasets or simpler ops, HNSW alone is often enough.

| Goal | Where to read |
| --- | --- |
| Tiger Cloud stack (pgvector + pgvectorscale + pgai) | Understand pgvector and pgvectorscale |
| StreamingDiskANN parameters, filtered search | pgvectorscale reference |
| Enable extensions and walk through a flow | Create a chatbot using pgvector (Azure) |
| Lexical (BM25) search and hybrid ideas | Understand pg_textsearch and BM25 search |