Vector Upsert
Definition
Upsert is the write operation used to add or update vectors in a vector database. The term combines 'update' and 'insert': if a vector with the given ID already exists, it is overwritten with the new vector and metadata; if it does not exist, it is created. Upsert is the standard operation for RAG knowledge base maintenance because documents change over time—when a help article is edited, its chunks must be re-embedded and the new vectors must replace the old ones. Without upsert semantics, engineers would need to track which chunks already exist and issue separate insert or update commands, significantly complicating incremental update logic.
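The replace-or-insert semantics can be sketched with a plain dictionary standing in for the vector index; the `upsert` helper and record shape here are illustrative, not a real client API:

```python
# Minimal sketch of upsert semantics. A dict stands in for the index;
# the record layout (values + metadata) mirrors common vector DB clients.
index = {}

def upsert(record_id, vector, metadata):
    # Replace-or-insert: one call covers both the "exists" and
    # "does not exist" cases, so callers need no existence check.
    index[record_id] = {"values": vector, "metadata": metadata}

# First call inserts; second call with the same ID overwrites.
upsert("article-123-chunk-0", [0.12, -0.34], {"source": "https://example.com"})
upsert("article-123-chunk-0", [0.56, 0.78], {"source": "https://example.com"})
```

Because the second call reuses the ID, the index still holds exactly one record, now carrying the newer embedding.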
Why It Matters
Knowledge bases are not static—articles get updated, products change, policies evolve. Without reliable upsert semantics, updating content requires deleting old vectors and re-inserting new ones, a two-step operation that risks leaving the knowledge base in an inconsistent state if the process fails between steps. Upsert's atomic replace-or-insert guarantee ensures that the knowledge base always contains either the old version or the new version of a chunk, never neither. For 99helpers customers who update their documentation daily, the indexing pipeline relies on upsert to efficiently refresh only changed content without downtime or full re-indexing.
How It Works
A typical upsert workflow: generate a stable, deterministic ID for each document chunk (e.g., hash of source URL + chunk index); embed the chunk; call the vector database's upsert API with the ID, vector, and metadata. For Pinecone: index.upsert(vectors=[{'id': 'article-123-chunk-0', 'values': embedding, 'metadata': {'source': url, 'text': chunk_text}}]). If the article is updated, re-chunk, re-embed, and upsert with the same IDs—old vectors are automatically replaced. For deleted articles, issue explicit delete calls by ID. Batch upserts (100-1000 vectors per API call) are more efficient than single-vector upserts for bulk operations.
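The workflow above can be sketched as follows. `chunk_id` and `build_upsert_batch` are illustrative helper names, and the stub `embed` lambda stands in for a real embedding call; the record layout follows the Pinecone example in the text:

```python
import hashlib

def chunk_id(source_url: str, chunk_index: int) -> str:
    # Deterministic ID: the same URL + position always yields the same ID,
    # so re-ingesting an edited article overwrites its old vectors.
    digest = hashlib.sha256(source_url.encode("utf-8")).hexdigest()[:16]
    return f"{digest}-chunk-{chunk_index}"

def build_upsert_batch(source_url, chunks, embed):
    # Build one record per chunk, ready for a batch upsert call.
    return [
        {
            "id": chunk_id(source_url, i),
            "values": embed(text),
            "metadata": {"source": source_url, "text": text},
        }
        for i, text in enumerate(chunks)
    ]

batch = build_upsert_batch(
    "https://example.com/pricing",
    ["Plans start at $10/mo.", "Enterprise pricing is custom."],
    embed=lambda text: [0.0] * 8,  # stub embedding, for the sketch only
)
# index.upsert(vectors=batch)  # the actual vector DB call, per the text
```

Because the IDs are derived from the source URL and chunk position rather than generated randomly, re-running this pipeline after an edit produces the same IDs and cleanly replaces the old vectors.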
Upsert Flow: Insert or Update Vector Atomically
An incoming vector (e.g., ID "doc-123", embedding [0.12, -0.34, ...], metadata such as title and updated_at) is checked against the database. If the ID does not exist, the vector is inserted and the new document becomes available for search. If the ID already exists, the stored vector is updated so the fresh embedding reflects the latest content.
Typical use case: when a help article is edited, re-run embedding and call upsert with the same doc ID. The vector DB overwrites the old embedding atomically, with no duplicate entries and no downtime window.
Real-World Example
A 99helpers knowledge base contains 50,000 chunks from 8,000 help articles. When the pricing page is updated, the indexing pipeline detects the change via content hash comparison. It re-fetches the page, re-chunks into 7 segments, re-embeds all 7, and upserts them to Pinecone using deterministic IDs (e.g., 'pricing-page-chunk-0' through 'pricing-page-chunk-6'). The upsert replaces the 7 old vectors atomically. Within 2 minutes of the page update, the chatbot answers pricing questions based on the new content, without touching the other 49,993 chunks in the index.
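The change-detection step in this example might look like the following minimal sketch. `last_indexed` is a hypothetical store of content hashes recorded at the last successful ingestion; the helper names are illustrative:

```python
import hashlib

def content_hash(text: str) -> str:
    # Stable fingerprint of a page's content.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical state: hash recorded when each URL was last indexed.
last_indexed = {"https://example.com/pricing": content_hash("old pricing copy")}

def needs_reindex(url: str, current_text: str) -> bool:
    # Re-chunk, re-embed, and upsert only when the content actually changed.
    return last_indexed.get(url) != content_hash(current_text)
```

Pages whose hash is unchanged are skipped entirely, which is how the pipeline refreshes 7 chunks without touching the other 49,993.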
Common Mistakes
- ✕ Using random IDs for each ingestion run instead of deterministic IDs: without stable IDs, upsert cannot replace old vectors, causing index bloat from duplicate content.
- ✕ Not deleting chunks when a document is shortened: if an article shrinks from 10 chunks to 7, upsert will update chunks 0-6 but leave stale chunks 7-9 in the index.
- ✕ Ignoring batch size limits: vector databases have per-request size limits; upserting too many vectors in one call causes API errors.
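The second and third mistakes can be avoided together. Below is a sketch under stated assumptions: `FakeIndex` is a stand-in for a Pinecone-style client, and the `upsert(vectors=...)`/`delete(ids=...)` method shapes are assumptions, not a specific library's API:

```python
class FakeIndex:
    # Stand-in for a Pinecone-style index, for this sketch only.
    def __init__(self):
        self.store = {}

    def upsert(self, vectors):
        for v in vectors:
            self.store[v["id"]] = v

    def delete(self, ids):
        for i in ids:
            self.store.pop(i, None)

def sync_document(index, doc_prefix, new_vectors, old_chunk_count, batch_size=100):
    # Upsert in capped batches to respect per-request size limits,
    # then delete leftover IDs when the document shrank.
    for start in range(0, len(new_vectors), batch_size):
        index.upsert(vectors=new_vectors[start:start + batch_size])
    stale_ids = [f"{doc_prefix}-chunk-{i}"
                 for i in range(len(new_vectors), old_chunk_count)]
    if stale_ids:
        index.delete(ids=stale_ids)

idx = FakeIndex()
# The article previously had 10 chunks...
idx.upsert(vectors=[{"id": f"article-9-chunk-{i}", "values": [0.0]} for i in range(10)])
# ...and is re-ingested as 7 chunks after shortening.
new = [{"id": f"article-9-chunk-{i}", "values": [0.1]} for i in range(7)]
sync_document(idx, "article-9", new, old_chunk_count=10)
```

Tracking the previous chunk count per document (here passed as `old_chunk_count`) is what makes the stale-chunk cleanup possible.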
Related Terms
Vector Database
A vector database is a purpose-built data store optimized for storing, indexing, and querying high-dimensional numerical vectors (embeddings), enabling fast similarity search across large collections of embedded documents.
Indexing Pipeline
An indexing pipeline is the offline data processing workflow that transforms raw documents into searchable vector embeddings, running during knowledge base setup and when content is updated.
Vector Database Namespace
A namespace in vector databases is a logical partition that isolates groups of vectors within the same index, enabling multi-tenant RAG applications where different users or organizations have separate, private knowledge bases.
Embedding Model
An embedding model is a machine learning model that converts text (or other data) into dense numerical vectors that capture semantic meaning, enabling similarity search and serving as the foundation of RAG retrieval systems.
Retrieval Pipeline
A retrieval pipeline is the online query-time workflow that transforms a user question into a ranked set of relevant document chunks, serving as the information retrieval stage of a RAG system.