Drift monitoring

AI adoption is accelerating across many domains, but the difficulty of operating machine learning in production has held back AI systems’ performance. MLOps, and in particular the maintenance of machine learning models, faces issues comparable to those that plagued software before DevOps and monitoring practices matured.

  • Model drift detection is just one aspect of MLOps monitoring


Drift is the shift of an entity’s position relative to a reference point. A shift in the distribution of data, which underpins model drift, is termed data drift. For production ML models, this is the difference between real-time production data and a baseline data set (most likely the training set) that is representative of the task the model is designed to perform. As the real world changes, production data can diverge, or drift, from the baseline data over time. Drift in predicted values is a useful proxy for concept drift or data integrity problems, and it can inform the cadence of model retraining.
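A common way to quantify the gap between a baseline data set and production data is the Population Stability Index (PSI). The sketch below is a minimal, pure-Python illustration; the bin count, the epsilon floor, and the conventional thresholds (below 0.1 stable, above 0.25 significant drift) are rules of thumb, not part of the original text, and the income figures are made up.

```python
import math
import random

def psi(baseline, production, bins=10):
    """Population Stability Index between two numeric samples.

    PSI = sum over bins of (p_prod - p_base) * ln(p_prod / p_base).
    Rule of thumb (illustrative): < 0.1 stable, 0.1-0.25 moderate,
    > 0.25 significant drift.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0
    eps = 1e-6  # floor on bin frequencies, avoids log(0)

    def frequencies(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            i = max(i, 0)  # clamp production values below the baseline min
            counts[i] += 1
        return [max(c / len(sample), eps) for c in counts]

    p_base = frequencies(baseline)
    p_prod = frequencies(production)
    return sum((q - p) * math.log(q / p) for p, q in zip(p_base, p_prod))

random.seed(0)
baseline = [random.gauss(50_000, 10_000) for _ in range(5_000)]  # e.g. incomes
shifted = [random.gauss(60_000, 10_000) for _ in range(5_000)]   # drifted traffic

print(psi(baseline, baseline[:2_500]))  # same distribution: near zero
print(psi(baseline, shifted))           # mean shifted by one sigma: large
```

In practice the same comparison is run per feature and per prediction column, on a schedule, against the chosen baseline window.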

Different types of ML drift

Depending on the data distribution being compared, there are four forms of drift:

  • Prediction drift – A shift in the model’s predictions. For example, if your product is introduced in a more affluent region, you may see a higher proportion of creditworthy applicants. The model remains valid, but your business may be unprepared for this situation
  • Label drift – A shift in the distribution of the model’s output variable, i.e. the ground truth labels
  • Feature drift – A shift in the model’s input data distribution. For example, all applicants’ incomes increase by 2%, but the underlying economic fundamentals stay the same.
  • Concept drift – A shift in the real relationship between the model’s inputs and outputs. An example of concept drift is when macroeconomic conditions make lending riskier and the bar for qualifying for a loan rises: an income level that was previously regarded as creditworthy is no longer creditworthy.

Concept drift is fundamentally about the gap between the true decision boundary and the one the model learned. The model must re-learn the data in order to preserve the error rate and accuracy it achieved under the prior regime. If ground truth labels are available and sufficiently real-time, performance drift is the strongest sign of concept drift. In the absence of real-time ground truth, drift in prediction and feature distributions is frequently suggestive of significant changes in the world. Unlike performance drift, however, these distributions can drift even with respect to a correctly modeled decision boundary, in which case the model’s performance remains constant.
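When ground truth labels do arrive in time, the performance-drift signal described above can be tracked with a rolling window. The sketch below is a minimal illustration; the class name, window size, and tolerance are assumptions for the example, not an API from the text.

```python
from collections import deque

class PerformanceDriftMonitor:
    """Track rolling accuracy against a baseline and flag degradation.

    Window size and tolerance are illustrative. When labels arrive with
    too much delay, prediction/feature drift must stand in for this check.
    """

    def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # True where prediction == label

    def record(self, prediction, label):
        self.outcomes.append(prediction == label)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drifted(self):
        acc = self.rolling_accuracy()
        return acc is not None and (self.baseline - acc) > self.tolerance

monitor = PerformanceDriftMonitor(baseline_accuracy=0.90)
for _ in range(400):
    monitor.record(1, 1)  # model keeps getting it right
for _ in range(100):
    monitor.record(1, 0)  # regime change: errors pile up

print(monitor.rolling_accuracy())  # 400 correct out of the last 500 -> 0.8
print(monitor.drifted())           # 0.90 - 0.80 exceeds the tolerance
```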

Reasons for drift to occur

There are several reasons for AI model drift in production:

  • When externalities cause a genuine shift in the data distribution. This may require building a new model with an updated, representative training set. Examples include a shift in the ground truth or input data distribution, such as changing customer preferences owing to a pandemic or launching a product in a new market, and a shift in the concept itself, such as a competitor releasing a new service.
  • When there are data integrity problems. These require more human examination. Correct data may be entered at the source but corrupted by faulty data engineering; for example, the debt-to-income and age variables are swapped in the model input. Alternatively, incorrect data may be entered at the source; for example, a website form allows a field to be left blank owing to a front-end bug.
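Integrity issues like the swapped columns and blank form fields above can often be caught with simple range and presence checks before they masquerade as drift. The following is a minimal sketch; the field names and plausible ranges in `schema` are invented for the example.

```python
def validate_record(record, schema):
    """Flag data integrity issues in a single input record.

    `schema` maps field name -> (min, max) plausible range.
    Field names and ranges here are illustrative, not a real schema.
    """
    issues = []
    for field, (lo, hi) in schema.items():
        value = record.get(field)
        if value is None:
            issues.append(f"{field}: missing")  # e.g. blank form field
        elif not (lo <= value <= hi):
            issues.append(f"{field}: {value} outside [{lo}, {hi}]")
    return issues

schema = {"age": (18, 100), "debt_to_income": (0.0, 2.0)}

# Swapped columns: age landed in debt_to_income and vice versa.
print(validate_record({"age": 0.35, "debt_to_income": 42}, schema))
# A clean record produces no issues.
print(validate_record({"age": 42, "debt_to_income": 0.35}, schema))
```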

Without the right tools, monitoring prediction and feature drift can be time-consuming. A data scientist or machine learning engineer responsible for maintaining production models must regularly compare a specified window of live traffic against a baseline using one of the approaches above.
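The window-versus-baseline comparison can also be done with a two-sample Kolmogorov–Smirnov test. The pure-Python sketch below computes only the KS statistic (the largest gap between the two empirical CDFs) and assumes continuous data with no ties; in practice `scipy.stats.ks_2samp` would give both the statistic and a p-value.

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the two empirical CDFs. Ties are not specially handled, which
    is fine for continuous-valued features."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            i += 1
        else:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

random.seed(1)
baseline = [random.gauss(0, 1) for _ in range(2_000)]      # e.g. training data
window_ok = [random.gauss(0, 1) for _ in range(1_000)]     # live window, no drift
window_drift = [random.gauss(1, 1) for _ in range(1_000)]  # mean shifted by one sigma

print(ks_statistic(baseline, window_ok))     # small: same distribution
print(ks_statistic(baseline, window_drift))  # large: drifted window
```

Run on a schedule (say, per feature per day), a statistic that crosses a chosen threshold flags the window for the drill-down steps described below.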

After detecting drift in the model’s output, the next step is to determine which features contributed to it. Often, a large drift in an input feature does not produce a significant change in the model’s output because that feature has low relevance to the model. Determining the cause of the drift therefore requires assessing the underlying drift in features relative to their importance to the model.
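One simple way to combine the two signals is to weight each feature’s drift score by its model importance and discard low-importance features. The sketch below is illustrative: the scores, importances (e.g. mean absolute SHAP values), and the cutoff threshold are invented for the example.

```python
def ranked_drift(drift_scores, importances, min_importance=0.05):
    """Rank drifted features by drift score weighted by model importance.

    Features below `min_importance` are dropped, weeding out 'false drift'
    that cannot meaningfully move the model's output. All numbers here
    are illustrative.
    """
    weighted = {
        feature: drift_scores[feature] * importances.get(feature, 0.0)
        for feature in drift_scores
        if importances.get(feature, 0.0) >= min_importance
    }
    return sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)

drift_scores = {"income": 0.30, "zip_code": 0.90, "age": 0.05}   # e.g. PSI per feature
importances = {"income": 0.50, "zip_code": 0.02, "age": 0.20}    # e.g. from SHAP

ranking = ranked_drift(drift_scores, importances)
print(ranking)
# zip_code drifted heavily but barely matters to the model, so it is filtered out,
# while income tops the list.
```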

Below are some practical steps you can take to identify drift:

  • Examine the performance of the impacted traffic slice for further insight into the drift.
  • Compare the data distributions to see the actual change and get a sense of whether model retraining is required.
  • Quickly detect prediction drift by comparing real-time model outputs against a training or baseline set.
  • Drill down into the specified time frame to see drift in the underlying features. Use explainability to assess the drifted features’ importance and focus only on those with significant influence, weeding out false drift.