Retrieval-Augmented Generation (RAG)

pgvector

Definition

pgvector extends PostgreSQL with a vector data type and similarity search operators, allowing applications already using Postgres to add semantic search without adopting a new database system. The extension supports L2 distance (Euclidean), inner product, and cosine distance metrics. Vectors are stored in regular Postgres tables as a vector(n) column alongside structured data. Similarity queries use operators: SELECT * FROM documents ORDER BY embedding <=> query_vector LIMIT 5, where <=> is the cosine distance operator. For indexing, pgvector supports IVFFlat (inverted file with flat compression) and HNSW indexes to speed up approximate nearest neighbor queries on large datasets.

Why It Matters

For teams already operating Postgres infrastructure, pgvector dramatically simplifies the RAG stack by eliminating a separate vector database service. Instead of managing Pinecone credentials, namespaces, and billing, teams use their existing Postgres deployment, backup procedures, and access controls. Joins between vector results and relational data are trivially expressed in SQL—finding support tickets similar to a query and joining with customer metadata in one query is impossible without pgvector's cohabitation with relational data. For 99helpers, pgvector powers knowledge base search within the same Postgres instance that stores chatbot conversation history, enabling rich queries like 'find articles related to this conversation topic.'

How It Works

To use pgvector: CREATE EXTENSION vector; CREATE TABLE articles (id serial, content text, embedding vector(1536)); INSERT INTO articles (content, embedding) VALUES ('Setup guide...', '[0.1, 0.2, ...]'); CREATE INDEX ON articles USING hnsw (embedding vector_cosine_ops); SELECT content FROM articles ORDER BY embedding <=> query_vector LIMIT 5. For multi-tenancy, add an org_id column and filter: SELECT content FROM articles WHERE org_id = 123 ORDER BY embedding <=> query_vector LIMIT 5. The HNSW index with m=16, ef_construction=64 provides a good quality/speed balance for most deployments. Supabase, Neon, and AWS RDS all support pgvector natively.

pgvector — Vectors Inside PostgreSQL

Table: products

ColumnTypeNote

idserialPrimary key

nametextProduct name

pricenumericStandard column

categorytextStandard column

embeddingvector(1536)pgvector column

Standard Postgres table — one extra column for vectors

Similarity Query

SELECT * FROM products
ORDER BY embedding <-> query_vector
LIMIT 5;

<-> = L2 distance <=> = cosine distance <#> = inner product

Index Options

IVFFlatFaster build, good recall

HNSWBetter recall, slower build

No Separate Vector DB

+Existing Postgres infra

+One backup strategy

+SQL joins with metadata

+Native transactions

Real-World Example

A 99helpers SaaS deployment uses Supabase (Postgres + pgvector) for all data storage. Knowledge base articles and their embeddings are stored in a single articles table with columns: id, org_id, content, embedding, category, updated_at. A single SQL query handles retrieval with filtering: SELECT content, 1 - (embedding <=> query_vec) AS similarity FROM articles WHERE org_id = $1 AND category = $2 ORDER BY similarity DESC LIMIT 5. This query benefits from a compound index on (org_id, category) for filtering and the HNSW index for vector search. Eliminating a separate vector database reduces monthly infrastructure costs by $200 and simplifies the deployment architecture.

Common Mistakes

✕Using exact nearest neighbor search (no index) for tables with more than 10,000 rows—query time grows linearly without an IVFFlat or HNSW index.
✕Setting lists (IVFFlat) or m/ef_construction (HNSW) parameters without tuning for your vector count—default parameters may produce poor recall or slow queries.
✕Storing embeddings as text arrays instead of the vector type—only the vector type supports similarity operators and indexing.

Ready to build your AI chatbot?

Put these concepts into practice with 99helpers — no code required.

Start free trial →

pgvector

Definition

Why It Matters

How It Works

Real-World Example

Common Mistakes

Related Terms

Vector Database

Pinecone

Chroma

Weaviate

Approximate Nearest Neighbor

Ready to build your AI chatbot?