Introduction

Fig 1. Image credit: Roman Synkevych
In a typical software development process, changes are made rapidly and continuously due to the experimental nature of technology development. As a result, versioning has become an essential component to keep track of modifications made to the source code and identify the team member responsible for them.
This is particularly relevant in Machine Learning (ML) systems, where teams need to track changes in data, code, and the model being developed to achieve optimal results. Specifically, three types of versioning are crucial in an ML system:
- Data Versioning: This mostly involves tracking and managing changes to the data utilized in creating the model.
- Code Versioning: This enables the tracking of modifications made to the source code that powers an ML system, ensuring transparency and facilitating collaboration.
- Model Versioning: This ensures that modifications to the machine learning model are tracked and managed. It can also involve some aspects of data versioning when needed.
In this article, we will focus on model versioning and provide a comprehensive guide to help you understand what it is and the various technicalities involved.
Model versioning
It is crucial to understand version control to appreciate model versioning.
Version control
Version control is the process of tracking and managing modifications to software code or ML systems. It maintains a detailed record of changes to a system, enabling data science teams to revert to previous (favorable) versions and collaborate effectively.
Model versioning, on the other hand, is a specific type of version control focused on tracking changes made to the ML model in a machine learning system. By versioning the model, teams can maintain a complete history of changes made to the model, enabling them to reproduce results, debug issues, and collaborate effectively.
In addition, model versioning can track datasets, metrics, hyperparameters, algorithms, and artifacts to ensure transparency and accuracy in the ML development process.
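As a rough illustration of what such a record might hold, the sketch below bundles the items listed above into one JSON-serializable object. The `ModelVersion` class, its field names, and every value in the example (including the accuracy figure) are hypothetical and chosen only for illustration; they do not come from any particular tool.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ModelVersion:
    """Hypothetical record of everything tied to one model version."""
    name: str
    version: str
    algorithm: str
    dataset_id: str  # identifier of the training data snapshot
    hyperparameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    artifacts: list = field(default_factory=list)  # paths to weights, plots, etc.

    def to_json(self) -> str:
        # sort_keys keeps the output stable, so diffs between versions stay readable
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Made-up example values
record = ModelVersion(
    name="image-classifier",
    version="1.0.0",
    algorithm="CNN",
    dataset_id="train-2023-01",
    hyperparameters={"lr": 0.001, "batch_size": 32, "epochs": 10},
    metrics={"accuracy": 0.91},
    artifacts=["model.pth"],
)
print(record.to_json())
```

Because the record is plain text, it can be stored next to the model file and compared between versions with ordinary diff tools.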
Importance of version control in ML development
Because ML model development is iterative, the model, data, and code are modified continuously. Machine learning model versioning tracks and manages these changes by creating simple, incremental, and retrievable records of each modification. Here are some key benefits:
- Enables accurate reproduction of previous results.
- Facilitates debugging and more effective collaboration.
- Allows for continuous improvement and optimization of datasets, code, and models to improve the performance of the ML system.
- Improves the reproducibility of the entire project.
Overview of version control systems
A Version Control System (VCS) is a software tool that enables developers to track and manage changes to source code, data, or models. By programmatically versioning files and projects, these tools relieve data scientists of the burden of manual versioning and enable team collaboration. They also reduce the risk of a single point of failure compared to manual versioning, where all changes can be lost if your disk gets damaged.
There are three primary types of version control systems:
- Local Version Control Systems (LVCS)
- Centralized Version Control Systems (CVCS)
- Distributed Version Control Systems (DVCS)
Local Version Control System (LVCS):
An LVCS keeps a database of changes to files and directories in the project, allowing you to revert to previous versions of the project in case of any issues. The system stores a complete copy of the project in the local database, which allows you to work offline without a network connection. The drawback is that the local machine becomes a single point of failure: with no remote backup, the whole history is lost if its disk fails.

Fig 2. A diagram illustrating a local version control system. Image credit: Git
Centralized Version Control System (CVCS):
A CVCS stores the code in a central repository, and collaborators work on a local copy of the code. Changes are made to the local copy and then committed to the central repository. Examples of CVCS include Subversion (SVN) and Perforce.

Fig 3. A diagram illustrating a centralized version control system. Image credit: Git
Distributed Version Control System (DVCS):
In a DVCS, each collaborator has a local copy of the entire repository, including its full history, enabling them to work offline and commit changes to the local repository. Teams usually still designate one shared repository as the central one, and changes are merged into it as needed. Examples of DVCS include Git, Mercurial, and Bazaar.
Git is a version control system, while GitHub is a cloud-based hosting service that helps you manage Git repositories.

Fig 4. A diagram illustrating a distributed version control system. Image credit: Git
Feature | Git | Mercurial | Bazaar |
---|---|---|---|
Type | Distributed | Distributed | Distributed |
Command | git | hg | bzr |
Popularity | Most popular | Second most popular | Less popular |
Language | C | Python | Python |
Performance | Very fast | Fast | Slow |
Windows support | Good | Good | Good |
Large projects | Better suited for very large projects | Better suited for medium-sized projects | Better suited for small projects |
Hosting | GitHub, GitLab, Bitbucket, and others | Bitbucket, SourceForge, and others | Launchpad, SourceForge, and others |
Table 1. A table showing the differences and similarities between Git, Mercurial, and Bazaar
Applying Git to model versioning
Git is the most popular version control system among developers and data scientists alike: it has a reliable workflow, is widely supported by third-party platforms such as GitHub and GitLab, and benefits from the broad adoption of distributed version control by the development community.
Here is an example of using Git for model versioning on your local machine. You can also connect to your GitHub account to interact with your Git repository.
Note: the code in this article is written in Python.

    ## Initialize a new Git repository in your project directory (shell)
    git init

    ## Create a new PyTorch model and save it to a file (Python)
    import torch
    import torch.nn as nn

    class MyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
            # 32 * 8 * 8 assumes 3x32x32 inputs (two 2x2 poolings: 32 -> 16 -> 8)
            self.fc1 = nn.Linear(32 * 8 * 8, 64)
            self.fc2 = nn.Linear(64, 10)

        def forward(self, x):
            x = self.conv1(x)
            x = nn.functional.relu(x)
            x = self.pool(x)
            x = self.conv2(x)
            x = nn.functional.relu(x)
            x = self.pool(x)
            x = torch.flatten(x, 1)
            x = self.fc1(x)
            x = nn.functional.relu(x)
            x = self.fc2(x)
            return x

    model = MyModel()
    torch.save(model.state_dict(), 'model.pth')

    ## Add the model file to the Git repository and commit the changes (shell)
    git add model.pth
    git commit -m "Initial version of PyTorch model"

    ## Train the model on some data and save the new version (Python)
    # NOTE: train_dataset is not defined in this example; supply your own
    # torch.utils.data.Dataset that yields (image, label) pairs
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(10):
        for i, (images, labels) in enumerate(train_loader):
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

    # Save the new version of the model
    torch.save(model.state_dict(), 'model_v2.pth')

    ## Add the new model version to the Git repository and commit the changes (shell)
    git add model_v2.pth
    git commit -m "Trained model on some data"

This is a general example of the idea behind model versioning with Git; the details may differ in your case.
Model versioning in MLOps: Using Deepchecks
Deepchecks is a powerful open source MLOps tool that can be very useful for model versioning. It helps data scientists to effectively track and manage changes made to their machine learning models over time. With Deepchecks, you can easily compare different versions of your models and identify changes that may have caused a drop in performance.
The Deepchecks SaaS platform (Deepchecks Pro) also acts as versioning software: it handles MLOps data versioning and machine learning model versioning, and helps manage versions of individual model evaluation metrics.
With its detailed visualizations of evaluation metrics, it helps you evaluate the effectiveness of each version of your models. Deepchecks can be integrated with popular versioning systems like Git to ensure that your models are properly versioned and documented, making model management, tracking, and collaboration easier.
Feature | Deepchecks Pro | Git |
---|---|---|
Versioning | Its version control is designed specifically for machine learning models | General-purpose version control system |
Ease of Use | Easy to set up and use with a user-friendly interface | Might be intimidating for first-time users, especially since it typically requires a command line interface |
Data Tracking | Tracks and logs data used to train models, making it easy to reproduce experiments and results | Limited data tracking capabilities |
Model Tracking | Tracks all versions of a model, including the model architecture, weights, and evaluation metrics | Tracks files generically; model versions, weights, and metrics must be tracked manually |
Collaboration | Allows for seamless collaboration among team members by providing a centralized platform for sharing and reviewing model versions | Collaboration can be difficult without proper branching and merging |
Scalability | Can handle large-scale machine learning projects with many contributors and models | Limited scalability for larger projects |
Interpretability | Provides explainability of model versions through tracking of hyperparameters and training data, facilitating model interpretation | Limited interpretability of model versions |
Table 2. A table showing the differences between Deepchecks Pro and Git
Here is an example of how model versioning can be done with Deepchecks:

    ## Installing the deepchecks packages (notebook cell)
    import sys
    !{sys.executable} -m pip install -U deepchecks
    !{sys.executable} -m pip install -U deepchecks-client

Preparing the reference data

    from deepchecks.tabular.datasets.regression.airbnb import load_data, \
        load_pre_calculated_prediction, load_pre_calculated_feature_importance

    ref_dataset, _ = load_data(data_format='Dataset')
    ref_predictions, _ = load_pre_calculated_prediction()
    feature_importance = load_pre_calculated_feature_importance()  # Optional
    feature_importance

Out:

    neighbourhood_group               0.1
    neighbourhood                     0.2
    room_type                         0.1
    minimum_nights                    0.1
    number_of_reviews                 0.1
    reviews_per_month                 0.1
    calculated_host_listings_count    0.1
    availability_365                  0.1
    has_availability                  0.1
    dtype: float64

Creating the Data Schema

    from deepchecks_client import DeepchecksClient, create_schema, read_schema

    schema_file_path = 'schema_file.yaml'
    create_schema(dataset=ref_dataset, schema_output_file=schema_file_path)
    read_schema(schema_file_path)
    # Note: for conveniently changing the auto-inferred schema it's recommended to edit
    # the textual file with an app of your choice.
    # After editing, you can use the `read_schema` function to verify the validity of
    # the syntax in your updated schema.

Out:

    Schema was successfully generated and saved to schema_file.yaml.
    {'additional_data': {},
     'features': {'availability_365': 'integer',
                  'calculated_host_listings_count': 'integer',
                  'has_availability': 'categorical',
                  'minimum_nights': 'integer',
                  'neighbourhood': 'categorical',
                  'neighbourhood_group': 'categorical',
                  'number_of_reviews': 'integer',
                  'reviews_per_month': 'numeric',
                  'room_type': 'categorical'}}

Creating a Model Version

    import os

    # Point the host to the Deepchecks app
    host = os.environ.get('DEEPCHECKS_API_HOST')  # Replace this with https://app.deepchecks.com
    # Note: put the API token in your environment variables. Or alternatively (less recommended):
    # os.environ['DEEPCHECKS_API_TOKEN'] = 'uncomment-this-line-and-insert-your-api-token-here'

    dc_client = DeepchecksClient(host=host, token=os.getenv('DEEPCHECKS_API_TOKEN'))

    model_name = 'Airbnb'
    model_version = dc_client.create_tabular_model_version(
        model_name=model_name,
        version_name='ver_1',
        schema=schema_file_path,
        feature_importance=feature_importance,
        reference_dataset=ref_dataset,
        reference_predictions=ref_predictions,
        task_type='regression',
    )

Out:

    Model Airbnb was successfully created!. Default checks, monitors and alerts added.
    Model version ver_1 was successfully created.
    Reference data uploaded.

Uploading Production Data and Predictions

    timestamp, label_col = 'datestamp', 'price'
    _, prod_data = load_data(data_format='DataFrame')
    _, prod_predictions = load_pre_calculated_prediction()

    model_version.log_batch(sample_ids=prod_data.index,
                            data=prod_data.drop([timestamp, label_col], axis=1),
                            timestamps=prod_data[timestamp],
                            predictions=prod_predictions)

Out:

    /home/runner/work/mon/mon/.venv/lib/python3.9/site-packages/deepchecks_client/tabular/client.py:661: UserWarning: Index of provided "data" dataframe completely matches "sample_ids" array, are you sure that "samples_ids" array is correct and contains unique sample identifiers?
    10000 new samples sent.
    10000 new samples sent.
    10000 new samples sent.
    10000 new samples sent.
    2225 new samples sent.
    Upload finished successfully but might take time to ingest into the system, see http://127.0.0.1:8000/configuration/models for status.

Updating the Labels

    model_client = dc_client.get_or_create_model(model_name)
    model_client.log_batch_labels(sample_ids=prod_data.index, labels=prod_data[label_col])

Out:

    10000 labels sent.
    10000 labels sent.
    10000 labels sent.
    10000 labels sent.
    2225 labels sent.

Delete Model

    # CAUTION: This will delete the model, all model versions, and all associated datasets.
    dc_client.delete_model(model_name)

To get access to an account, you can request one for your organization and check out the quickstart guide to test the platform and its suitability for your organization’s use case.
What needs to be versioned?
Apart from the code and data in an ML development lifecycle, the model and the environment should also be versioned, mainly for reproducibility and model optimization.
Models:
- Model architecture or algorithm, hyperparameters (batch size, learning rate, epochs, etc.), and weights can be versioned.
- Model evaluation metrics and results for each version, including test accuracy and other relevant performance indicators, should be documented. This enhances the model’s explainability and performance throughout experimentation.
Training and Deployment environments:
- The configurations used for training and deploying models should be versioned. These configurations include dependencies like libraries and packages to ensure training and deployment environment consistency.
- The deployment scripts used to deploy the model can be versioned to enable the reproducibility of the deployment process.
- The dependencies required for the deployment environment, such as the operating system, runtime libraries, and software packages, can be versioned to ensure the consistency of the deployment environment.
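One lightweight way to record an environment is to snapshot the interpreter, operating system, and package versions into a file that is versioned alongside the model. The sketch below uses only the Python standard library; the function name `snapshot_environment` and the package list passed to it are assumptions made for this example.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Collect interpreter, OS, and package versions into a plain dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "packages": versions,
    }


# Hypothetical package list; replace it with the dependencies of your project
env = snapshot_environment(["torch", "deepchecks", "pip"])
print(json.dumps(env, indent=2))
```

Saving this output for both the training and the deployment environment makes it easy to spot a mismatch between the two.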
Types of Model Versions
To ensure consistency and clarity in model versioning, version numbers are split into parts that indicate the scope of the changes made to a model. These parts are typically:
- Major version
- Minor version
- Patch version
Major Version: A major version indicates a significant change that could impact the performance or functionality of your model. Typically, a major version update involves a significant change in your model’s architecture, algorithms, or training data. You can also use it to introduce new features or capabilities. Major versions are typically denoted by incrementing the first number in the version number.
Minor Version: A minor version indicates a smaller change that typically does not significantly affect the model’s performance or functionality. For example, a minor version update could involve a small optimization or a new feature that does not fundamentally alter the model’s behavior. Minor versions are typically denoted by incrementing the second number in the version number.
Patch Version: A patch version indicates a small change or bug fix that is made to a specific version of the model. Patch versions are typically denoted by incrementing the third number in the version number.

Fig 5. Components of versioning numbering. Image credit: Geeksforgeeks
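The increment rules described above can be sketched in a few lines. The `bump` helper is hypothetical and assumes a plain three-part `major.minor.patch` string; it also resets the lower parts to zero when a higher part is incremented, which is the usual semantic versioning convention.

```python
def bump(version: str, part: str) -> str:
    """Return `version` with the given part incremented and lower parts reset."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("1.4.2", "major"))  # 2.0.0
print(bump("1.4.2", "minor"))  # 1.5.0
print(bump("1.4.2", "patch"))  # 1.4.3
```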
Versioning schemes
Choosing a versioning scheme is an essential step toward efficient collaboration in ML development. Failure to do so could result in confusion and time wastage, especially for larger projects. Here are some versioning schemes to consider when developing machine learning models:
- Semantic Versioning:
Semantic versioning is a widely used versioning scheme that uses a three-part version number consisting of major, minor, and patch versions. The version number is typically written in the format “major.minor.patch”. This scheme is often used for software libraries, frameworks, and APIs.
- Calendar Versioning:
Calendar versioning is a versioning scheme that uses the date of release as the version number. For example, a model released on January 1st, 2022 would have a version number of 2022.01.01. This scheme is often used for data science projects where the focus is on tracking changes over time.

Fig 6. Components of calendar versioning. Image credit: Datalust
- Sequential Versioning:
Sequential versioning is a versioning scheme that uses a simple sequential numbering system to track versions. Each new version is assigned the next available number in the sequence (e.g., 1, 2, 3, etc.). This scheme is often used for small projects or individual models.
- Git Commit Hash:
This scheme involves versioning based on the Git commit hash, which is a unique identifier for every commit made to a code repository. It allows for precise tracking of changes made to the model and its associated code. For example, successive commits might be identified as ‘4dfc13a’, ‘8c2fb85’, and so on. It is commonly used for collaborative development on machine learning projects, where multiple developers may be making changes to the codebase at the same time.

Fig 7. Git commit hash. Image credit: Git
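As a minimal sketch, the calendar, sequential, and Git commit hash schemes can each be produced by a small helper. All three function names are hypothetical. The last one shells out to `git rev-parse --short HEAD` and falls back to the string 'unknown' when Git is not installed or the directory is not a repository with at least one commit.

```python
import subprocess
from datetime import date


def calendar_version(day: date) -> str:
    """Format a release date as YYYY.MM.DD."""
    return day.strftime("%Y.%m.%d")


def next_sequential(existing: list) -> int:
    """Return the next number after the highest existing version."""
    return max(existing, default=0) + 1


def git_commit_version(repo_dir: str = ".") -> str:
    """Return the short hash of HEAD, or 'unknown' if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    return result.stdout.strip()


print(calendar_version(date(2022, 1, 1)))  # 2022.01.01
print(next_sequential([1, 2, 3]))          # 4
print(git_commit_version())                # depends on the current repository
```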
Challenges and Considerations
While model versioning is an essential part of machine learning development, it comes with its own set of challenges. As models become more complex and teams grow, it becomes increasingly important to manage the versioning process effectively. Here are a few considerations that should be made depending on the size of your project:
- Data management and versioning
- Scalability and resource requirements
- Integrating with existing workflows and systems
Data management and versioning:
One major challenge in model versioning is data management and versioning. Keeping track of changes in data used to train models is crucial as it can have a significant impact on model performance.
However, managing large datasets and tracking changes to them can be challenging, especially when dealing with distributed datasets across multiple machines. It is essential to establish proper protocols and workflows for data versioning to ensure data consistency, reliability, and easy tracking of changes.
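One simple protocol, sketched below under stated assumptions, is to record a manifest that maps every file in a dataset directory to a hash of its contents, and to compare manifests between snapshots. The helper names `build_manifest` and `diff_manifests` are hypothetical, and reading each file fully into memory is only reasonable for small files; large datasets would need chunked hashing or a dedicated data versioning tool.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(data_dir: str) -> dict:
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(data_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old: dict, new: dict) -> dict:
    """List files added, removed, or changed between two manifests."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }


# Demonstration on a throwaway directory with made-up CSV contents
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train.csv").write_text("a,b\n1,2\n")
    before = build_manifest(tmp)
    (Path(tmp) / "train.csv").write_text("a,b\n1,3\n")
    (Path(tmp) / "val.csv").write_text("a,b\n5,6\n")
    after = build_manifest(tmp)

print(diff_manifests(before, after))
# {'added': ['val.csv'], 'removed': [], 'changed': ['train.csv']}
```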
Scalability and resource requirements:
Another challenge in model versioning is scalability and resource requirements. As models grow in complexity, their versioning requirements can become increasingly complex, and storing multiple versions of models can quickly become resource-intensive. It is crucial to consider the scalability and resource requirements of model versioning systems when implementing them to ensure they can handle increasing model complexity and volume.
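One possible way to keep storage bounded, which is an assumption of this sketch and not something a specific tool prescribes, is a retention policy that keeps the most recent versions plus the best-scoring one. The function `versions_to_keep`, the record layout, and the accuracy values below are all made up for illustration.

```python
def versions_to_keep(versions, keep_latest=3, metric="accuracy"):
    """Pick the newest `keep_latest` versions plus the best-scoring one.

    `versions` is a list of dicts ordered oldest to newest, each with a
    "name" and a "metrics" dict. Higher metric values are treated as better.
    """
    if not versions:
        return []
    start = max(0, len(versions) - keep_latest)
    keep = {v["name"] for v in versions[start:]}
    best = max(versions, key=lambda v: v["metrics"][metric])
    keep.add(best["name"])
    # Preserve the original ordering in the result
    return [v["name"] for v in versions if v["name"] in keep]


history = [
    {"name": "v1", "metrics": {"accuracy": 0.81}},
    {"name": "v2", "metrics": {"accuracy": 0.93}},
    {"name": "v3", "metrics": {"accuracy": 0.88}},
    {"name": "v4", "metrics": {"accuracy": 0.90}},
    {"name": "v5", "metrics": {"accuracy": 0.91}},
]
print(versions_to_keep(history, keep_latest=2))  # ['v2', 'v4', 'v5']
```

Versions outside the returned list are candidates for archiving or deletion, depending on how much history the team needs to reproduce.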
Integrating with existing workflows and systems:
Integrating model versioning with existing workflows and systems can also pose a challenge. Different teams may have different tools and workflows, and integrating model versioning into these can require significant effort. It is essential to choose a model versioning system that can integrate seamlessly with existing tools and workflows to minimize disruption and ensure smooth collaboration across teams.
Best Practices for Model Versioning
To ensure efficient collaboration with your team throughout the timeline of a project, it is important to consider version management from the outset.
When doing this, remember that model versioning should also involve monitoring and evaluating the model’s performance over time to ensure it continues to meet business and performance requirements. One best practice is to establish automated testing and monitoring processes that evaluate the model’s performance at regular intervals. This includes tracking metrics like accuracy, precision, and recall and comparing them against previous versions. For instance, tools like Deepchecks can help track and evaluate model performance metrics over time.

Fig 8. Diagram showing the dashboard for Deepchecks Pro. Image credit: Deepchecks Pro
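The comparison against previous versions can also be automated with a small check, sketched below. The function `find_regressions`, the tolerance of 0.01, and the metric values are hypothetical; the sketch assumes every metric is "higher is better".

```python
def find_regressions(previous, current, tolerance=0.01):
    """Return metrics where `current` is worse than `previous` by more than `tolerance`.

    Both arguments map metric names to scores where higher is better.
    Metrics missing from either version are skipped.
    """
    regressions = {}
    for name in previous.keys() & current.keys():
        drop = previous[name] - current[name]
        if drop > tolerance:
            regressions[name] = round(drop, 4)
    return regressions


# Made-up scores for two versions of the same model
v1 = {"accuracy": 0.92, "precision": 0.90, "recall": 0.88}
v2 = {"accuracy": 0.93, "precision": 0.85, "recall": 0.875}
print(find_regressions(v1, v2))  # {'precision': 0.05}
```

A check like this could run in a scheduled job or a CI pipeline and block promotion of a new version when the returned dictionary is not empty.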
Conclusion
In conclusion, the significance of model versioning cannot be overstated, especially in the context of machine learning development. It allows teams to access records of all changes made to the models, facilitating ease of review and reverting back to previous versions if need be. This enhances continuous experimentation, which in turn can improve the model and the overall ML system.
As we look to the future, there is hope that the process of model versioning will become automated, similar to how Google Colab automatically saves code as you work. Gradually, the integration of versioning with MLOps platforms and tools is also expected to significantly increase reproducibility and interpretability. Additionally, we must emphasize the importance of maintaining detailed records of each version, including relevant performance metrics, as it is an essential part of understanding the strengths and weaknesses of our models. It is only through continuous experimentation and careful monitoring of our models that we can ensure optimal performance in ML projects.
If you would like to test the Deepchecks model monitoring platform, apply for an invite and join the Deepchecks community.