Retrieval-Augmented Generation (RAG)

Vector Database Namespace

Definition

A vector database namespace is a logical partition inside a vector index. Pinecone calls it a namespace, Weaviate calls it a tenant, and in Chroma a separate collection plays the same role. Namespaces allow a single index to serve multiple isolated user groups without the cost of creating a separate index for each. All vectors in a namespace are logically isolated: a query against one namespace cannot return vectors from another. This is the foundation of multi-tenant RAG: when 99helpers deploys a chatbot that each customer can configure with their own knowledge base, namespaces ensure that Company A's private documentation is never retrieved when answering Company B's queries. A namespace is not a metadata field; it is an access boundary selected per operation, not an attribute that a query filters on.

Why It Matters

Multi-tenancy is a core requirement for SaaS AI products. Without namespaces, teams must either (1) create a separate vector index per tenant (expensive and operationally complex for hundreds of customers) or (2) store all tenants in one index and rely on metadata filters for isolation (which risks data leakage if a filter is misconfigured or omitted). Namespaces provide a middle ground: logical isolation enforced at the database level, without the cost of a per-tenant index, because all tenants share the underlying infrastructure. For 99helpers, namespaces are how each customer's knowledge base remains private while sharing the same Pinecone or Weaviate cluster.

How It Works

Namespace implementation varies by vector database. In Pinecone, the namespace is passed as a parameter to the upsert and query calls on the index object: index.upsert(vectors=..., namespace='org-123'). A query searches only the namespace it names: index.query(vector=..., namespace='org-123'). In Weaviate, multi-tenancy is enabled per collection (class), and a tenant is specified on each operation. In Chroma, separate collections serve as namespace equivalents. When building multi-tenant RAG, the application layer must pass the namespace of the authenticated user in every vector operation. A middleware or data access layer typically handles this so that individual call sites cannot get it wrong.
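The access-layer idea can be sketched as a thin wrapper that binds a namespace once, at authentication time, so that no individual call site can omit it. This is a minimal sketch under stated assumptions, not any vendor's API: TenantScopedIndex and FakeIndex are hypothetical names, and the wrapped object is only assumed to expose upsert and query methods that accept a namespace keyword.

```python
from typing import Any, Protocol


class NamespacedIndex(Protocol):
    """Anything exposing upsert/query with a `namespace` keyword (assumed shape)."""

    def upsert(self, *, vectors: list, namespace: str) -> Any: ...
    def query(self, *, vector: list, top_k: int, namespace: str) -> Any: ...


class TenantScopedIndex:
    """Hypothetical access layer: binds one tenant's namespace to every call."""

    def __init__(self, index: NamespacedIndex, namespace: str) -> None:
        if not namespace:
            raise ValueError("a non-empty namespace is required")
        self._index = index
        self._namespace = namespace

    def upsert(self, vectors: list) -> Any:
        # The caller never supplies the namespace; it is injected here.
        return self._index.upsert(vectors=vectors, namespace=self._namespace)

    def query(self, vector: list, top_k: int = 5) -> Any:
        return self._index.query(
            vector=vector, top_k=top_k, namespace=self._namespace
        )


class FakeIndex:
    """Stand-in for a real client; records the namespace of each call."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def upsert(self, *, vectors: list, namespace: str) -> int:
        self.calls.append(("upsert", namespace))
        return len(vectors)

    def query(self, *, vector: list, top_k: int, namespace: str) -> list:
        self.calls.append(("query", namespace))
        return []


fake = FakeIndex()
scoped = TenantScopedIndex(fake, namespace="org-123")
scoped.upsert([("doc-1", [0.1, 0.2])])
scoped.query([0.1, 0.2], top_k=3)
print(fake.calls)  # [('upsert', 'org-123'), ('query', 'org-123')]
```

In a real service, the wrapper would be constructed per request from the authenticated user's organization, and the raw index object would not be handed to request handlers at all.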

Figure: Namespace, multi-tenant isolation in a vector database. The index is partitioned into three namespaces: customer-a (500 docs), shared (1,200 docs), and customer-b (320 docs). A query from Customer A searches the customer-a and shared namespaces only; customer-b is blocked, so Customer B data is invisible.

Access Matrix

Tenant      customer-a  shared  customer-b
Customer A  Read        Read    Denied
Customer B  Denied      Read    Read
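The access matrix can be expressed as a small lookup that decides which namespaces a tenant's query may touch. The sketch below assumes the database searches one namespace per call, so the application runs one search per readable namespace and merges the hits by score. The names ACCESS, readable_namespaces, search, and FAKE_RESULTS are hypothetical, and the scores are made-up toy values.

```python
import heapq

# Access matrix from the table above: tenant -> namespaces it may read.
ACCESS = {
    "customer-a": ("customer-a", "shared"),
    "customer-b": ("customer-b", "shared"),
}


def readable_namespaces(tenant: str) -> tuple:
    """Return the namespaces a tenant may search; unknown tenants get none."""
    return ACCESS.get(tenant, ())


def search(tenant, query_one_namespace, top_k=3):
    """Query each readable namespace and merge hits by descending score.

    `query_one_namespace(namespace)` is a hypothetical callable returning
    (score, doc_id) pairs for a single namespace.
    """
    hits = []
    for namespace in readable_namespaces(tenant):
        hits.extend(query_one_namespace(namespace))
    return heapq.nlargest(top_k, hits)


# Toy per-namespace results standing in for real vector searches.
FAKE_RESULTS = {
    "customer-a": [(0.91, "a-faq"), (0.40, "a-pricing")],
    "shared": [(0.75, "shared-glossary")],
    "customer-b": [(0.99, "b-secret")],
}

results = search("customer-a", lambda ns: FAKE_RESULTS[ns])
print([doc_id for _, doc_id in results])
# ['a-faq', 'shared-glossary', 'a-pricing']
```

Although customer-b holds the highest-scoring document in this toy data, it never appears in Customer A's results because its namespace is never queried.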

Real-World Example

99helpers hosts AI chatbots for 500 business customers on a shared Pinecone index. Each customer's knowledge base is stored in its own namespace named by their organization ID (e.g., 'org_acme', 'org_globex'). When a user from Acme asks a question, the retrieval pipeline queries namespace 'org_acme', returning only Acme's documents. When Acme adds a new help article, it is upserted to namespace 'org_acme'. Globex's queries never see Acme's content. This architecture supports 500 tenants with a single Pinecone index, reducing operational complexity from managing 500 indexes to managing one.
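The Acme and Globex scenario can be imitated with a toy in-memory store, to show why a namespaced query cannot return another tenant's documents. This is an illustrative sketch, not how 99helpers or Pinecone implement it: InMemoryNamespacedStore, cosine, and namespace_for are hypothetical names, the document IDs are invented, and the two-dimensional vectors are toy stand-ins for real embeddings.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryNamespacedStore:
    """Toy stand-in for a vector index partitioned by namespace."""

    def __init__(self):
        # namespace -> {doc_id: vector}
        self._partitions = defaultdict(dict)

    def upsert(self, namespace, doc_id, vector):
        self._partitions[namespace][doc_id] = vector

    def query(self, namespace, vector, top_k=3):
        # Only the named partition is scanned; other namespaces are never read.
        docs = self._partitions.get(namespace, {})
        scored = [(cosine(vector, v), doc_id) for doc_id, v in docs.items()]
        return [doc_id for _, doc_id in sorted(scored, reverse=True)[:top_k]]


def namespace_for(org_id):
    """Naming convention from the example: 'acme' -> 'org_acme'."""
    return f"org_{org_id}"


store = InMemoryNamespacedStore()
store.upsert(namespace_for("acme"), "acme-reset-password", [0.9, 0.1])
store.upsert(namespace_for("acme"), "acme-billing", [0.1, 0.9])
store.upsert(namespace_for("globex"), "globex-onboarding", [0.9, 0.1])

acme_hits = store.query(namespace_for("acme"), [1.0, 0.0])
globex_hits = store.query(namespace_for("globex"), [1.0, 0.0])
print(acme_hits)    # ['acme-reset-password', 'acme-billing']
print(globex_hits)  # ['globex-onboarding']
```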

Common Mistakes

  • Relying on metadata filters alone for tenant isolation: namespaces provide database-enforced isolation, whereas metadata filters are application-enforced, so a single missing or incorrect filter exposes every tenant's vectors.
  • Using a single namespace for all tenants in development and a per-tenant namespace model in production: switching namespace models requires re-indexing all content.
  • Forgetting to include the namespace in every vector operation: a query with no namespace parameter runs against the default namespace, which leaks data if other tenants' vectors were ever written there.
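The first and third mistakes can be illustrated with a toy comparison. In the sketch below, keyword matching stands in for vector similarity to keep the example short, and every name (RECORDS, search_with_filter, partition, search_namespace) is hypothetical. The point is only the failure mode: an optional filter fails open when forgotten, while a mandatory namespace fails loudly.

```python
RECORDS = [
    {"id": "a-1", "tenant": "org_acme", "text": "Acme refund policy"},
    {"id": "g-1", "tenant": "org_globex", "text": "Globex refund policy"},
]


def search_with_filter(records, keyword, tenant=None):
    """Metadata-filter style: isolation depends on the caller passing `tenant`."""
    return [
        r["id"]
        for r in records
        if keyword in r["text"] and (tenant is None or r["tenant"] == tenant)
    ]


def partition(records):
    """Group records by tenant so each namespace holds only its own data."""
    partitions = {}
    for r in records:
        partitions.setdefault(r["tenant"], []).append(r)
    return partitions


def search_namespace(partitions, keyword, namespace):
    """Namespace style: the namespace is mandatory; bad values raise an error."""
    if not namespace:
        raise ValueError("namespace is required")
    if namespace not in partitions:
        raise KeyError(f"unknown namespace: {namespace!r}")
    return [r["id"] for r in partitions[namespace] if keyword in r["text"]]


# Filter forgotten: both tenants' documents come back.
leaked = search_with_filter(RECORDS, "refund")
# Namespace supplied: only Acme's partition is searched.
scoped = search_namespace(partition(RECORDS), "refund", "org_acme")
print(leaked)  # ['a-1', 'g-1']
print(scoped)  # ['a-1']
```

Rejecting an empty namespace in the application layer, as search_namespace does here, is one way to avoid silently falling back to a default namespace.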
