Model Versioning
Definition
Model versioning extends software version control concepts to ML artifacts. A model version captures the trained weights, the hyperparameter configuration, the dataset version used for training, the preprocessing pipeline, evaluation metrics, and metadata such as training date and compute environment. Registry tools such as MLflow Model Registry, DVC, and Weights & Biases Artifacts organize model versions through lifecycle stages: staging, production, and archived. Semantic versioning (major.minor.patch) communicates the scope of changes between versions.
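To make the definition concrete, here is a minimal sketch of what a single model version record might capture. The class and field names are illustrative, not any particular registry's schema:

```python
from dataclasses import dataclass, field

# Hypothetical record of everything a model version should capture:
# weights, data lineage, code commit, config, metrics, and lifecycle stage.
@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: str          # semantic version, e.g. "3.2.1"
    weights_uri: str      # pointer to the trained weights artifact
    dataset_version: str  # version of the training data
    code_commit: str      # git SHA that produced this model
    hyperparameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    stage: str = "candidate"  # candidate -> staging -> production -> archived

mv = ModelVersion(
    name="support-bot",
    version="2.3.8",
    weights_uri="s3://models/support-bot/2.3.8/weights.bin",
    dataset_version="ds-2024-01",
    code_commit="a1b2c3d",
    hyperparameters={"lr": 3e-4, "epochs": 5},
    metrics={"accuracy": 0.91},
)
print(mv.version, mv.stage)  # → 2.3.8 candidate
```

Because the record is frozen, a registered version is immutable; promoting or retiring a model changes only its lifecycle stage, never its artifacts.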
Why It Matters
Without model versioning, teams cannot reproduce past results, safely roll back failing deployments, or track which model version is serving production traffic. When a new model version degrades performance, versioning enables instant rollback to the last known good version. Regulatory compliance in finance and healthcare often requires proving which model version made a specific decision — impossible without versioning records. Model versioning also enables A/B testing by routing traffic across multiple simultaneous versions.
How It Works
A model version is created at the end of each training run and stored in a model registry. The registry records lineage: which dataset version, code commit, and hyperparameters produced each model. Lifecycle management transitions models through stages — from candidate to staging (for testing) to production (for serving) to archived (retired). Deployment pipelines reference specific model versions by ID rather than mutable 'latest' pointers, ensuring deterministic deployments.
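The lifecycle above can be sketched as a tiny in-memory registry. The class and method names are hypothetical (real systems like MLflow persist this state and expose their own APIs), but the stage transitions and pinned-ID lookups follow the flow described:

```python
# Minimal in-memory model registry sketch (illustrative names only).
STAGES = ("candidate", "staging", "production", "archived")

class ModelRegistry:
    def __init__(self):
        # (name, version_id) -> {"stage": ..., "lineage": ...}
        self._versions = {}

    def register(self, name, version_id, lineage):
        """Record a new version with its lineage (dataset, commit, params)."""
        self._versions[(name, version_id)] = {"stage": "candidate", "lineage": lineage}

    def transition(self, name, version_id, stage):
        """Move a version through the lifecycle; archive any current production."""
        assert stage in STAGES
        if stage == "production":
            for (n, _), rec in self._versions.items():
                if n == name and rec["stage"] == "production":
                    rec["stage"] = "archived"
        self._versions[(name, version_id)]["stage"] = stage

    def get(self, name, version_id):
        """Deployments resolve a pinned version ID, never a mutable 'latest'."""
        return self._versions[(name, version_id)]

reg = ModelRegistry()
reg.register("support-bot", "2.3.8", {"dataset": "ds-01", "commit": "a1b2c3d"})
reg.register("support-bot", "2.4.1", {"dataset": "ds-02", "commit": "e4f5a6b"})
reg.transition("support-bot", "2.3.8", "production")
reg.transition("support-bot", "2.4.1", "production")  # new release
reg.transition("support-bot", "2.3.8", "production")  # rollback
print(reg.get("support-bot", "2.4.1")["stage"])  # → archived
```

Note the last three lines: promoting a new version automatically archives the old one, and rolling back is just promoting the previous version again — the registry always knows exactly which version is serving.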
Model Versioning: Semantic Versioning

Example version v3.2.1:
- 3 (Major): breaking changes
- 2 (Minor): new capabilities
- 1 (Patch): bug fixes

Each version includes: weights, config, tokenizer, training metadata
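A small helper makes the bump rules explicit. The function name and change-type labels are illustrative assumptions, not part of any standard library:

```python
# Hypothetical helper applying semantic-versioning rules to model releases:
# breaking changes bump major, new capabilities bump minor, fixes bump patch.
def bump(version: str, change: str) -> str:
    major, minor, patch = (int(p) for p in version.split("."))
    if change == "breaking":    # e.g. new output schema or tokenizer
        return f"{major + 1}.0.0"
    if change == "capability":  # e.g. model now supports a new language
        return f"{major}.{minor + 1}.0"
    if change == "fix":         # e.g. retrained to correct a data bug
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")

print(bump("3.2.1", "fix"))         # → 3.2.2
print(bump("3.2.1", "capability"))  # → 3.3.0
print(bump("3.2.1", "breaking"))    # → 4.0.0
```

As in software semver, a major bump resets minor and patch to zero, and a minor bump resets patch — the version string alone tells consumers how much re-validation a new model needs.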
Real-World Example
A startup's customer support chatbot starts degrading after a model update. Because they use MLflow Model Registry, they can identify that version 2.4.1 was deployed three days ago and version 2.3.8 had better accuracy metrics. They promote v2.3.8 back to production in under two minutes, restoring service quality while they investigate what went wrong with the v2.4.1 training run.
Common Mistakes
- ✕ Versioning only the model weights without capturing the associated preprocessing pipeline and data version
- ✕ Using mutable 'latest' pointers in production deployments instead of pinned version IDs
- ✕ Deleting archived model versions to save storage — removing the ability to reproduce past decisions
Related Terms
Model Registry
A model registry is a centralized repository that stores versioned model artifacts with their metadata—training parameters, evaluation metrics, data lineage, and deployment status—serving as the single source of truth for production models.
Experiment Tracking
Experiment tracking records the parameters, metrics, code versions, and artifacts of every ML training run, enabling reproducibility, systematic comparison of approaches, and traceability from production models back to their training conditions.
MLOps
MLOps (Machine Learning Operations) applies DevOps principles to ML systems—combining engineering practices for model development, deployment, monitoring, and retraining into a disciplined operational lifecycle.
Model Deployment
Model deployment is the process of moving a trained ML model from development into a production environment where it can serve real users—encompassing packaging, testing, infrastructure provisioning, and release management.
Continuous Training
Continuous training automatically retrains ML models on fresh data when triggered by drift detection, schedule, or performance degradation—keeping models current with evolving real-world patterns without manual intervention.