What Is Model Monitoring?
Machine learning model monitoring is how we track and understand our models’ performance in production, from both a data science and an operational standpoint. When a model that was trained and tested in an experimental setup is put to work in production, its performance will be affected by the move from an experimental to a real-world environment. Keep in mind that a model is built to serve a business purpose and to add value to a company. As a result, it must meet the following essential business requirements:
- We want models that are always stable and available.
- We want models to remain relevant in production.
Why Is Model Monitoring Important?
Model monitoring needs to be configured to raise timely alerts for the many issues that can affect a machine learning model. As part of an MLOps practice, deployed models must be tracked so that we know the model’s health and can take action when performance metrics deteriorate. Some of the issues that can occur are:
Data Skews
A skew occurs when the data used to train the model in the experimental phase does not reflect the data received by the live application or system.
This can happen for a variety of reasons:
- The training data was designed incorrectly.
- A feature isn’t available in production: When this happens, we must either remove the feature, re-create it by combining other features that are already in production, or replace it with a similar feature that already exists.
- A mismatch between research and live data: In the research environment, the data we used to train our models came from one source, while the live data came from another.
- Data dependencies: Models can ingest variables stored or generated by other systems. Those systems can change how they generate data, and unfortunately such changes are not always communicated clearly.
- Constant changes in the environment: If historical data is used to train the models, we should account for the fact that people and their behavior will be different in the present.
- Arms races: foreign governments, fraudsters, and other bad actors can actively search for flaws in the model and change their attacks according to these flaws. This is sometimes referred to as an “arms race.”
- Shifting customer tastes: Fashion, politics, ethics, and other factors influence customer behavior. This is a risk that must be constantly monitored, particularly in recommender ML systems.
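Whatever the cause, a skew shows up as a difference between the training distribution of a feature and its live distribution. One possible way to quantify that difference is the Population Stability Index (PSI). PSI is not prescribed by this article; it is used here only as an illustration, and the function name, the bin count, and the smoothing constant below are all assumptions. A minimal sketch for a single numeric feature:

```python
import math
from typing import Sequence


def population_stability_index(
    train: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """Illustrative PSI between training and live samples of one numeric feature.

    Bin edges are derived from the training sample; live values that fall
    outside the training range are counted in the first or last bin.
    """
    lo, hi = min(train), max(train)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        eps = 1e-6  # assumed smoothing constant, avoids log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    p = proportions(train)
    q = proportions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: the live feature has drifted upwards by 50 units.
train_sample = [float(v) for v in range(100)]
live_sample = [v + 50.0 for v in train_sample]

psi_same = population_stability_index(train_sample, train_sample)
psi_shifted = population_stability_index(train_sample, live_sample)
```

A value near zero means the two samples fill the bins in similar proportions. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, but any alerting threshold should be tuned to the feature and the business context.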
Negative Feedback Loops
A more complicated problem arises when models are automatically retrained on data collected in production. If that data is skewed or corrupted in some way, the models trained on it will perform poorly.
These are all issues that monitoring machine learning models can help detect before they cause lasting damage.
How Do You Measure the Performance of a Model?
Naturally, the accuracy of our models in production is what matters most to us, and this is where model monitoring comes into play.
However, in many situations it is impossible to measure a model’s accuracy right away. Consider a machine learning model designed to detect fraud: the only way to verify its predictions on new live cases is to wait for a criminal investigation or some other check to be carried out.
Many other domains where we don’t get direct feedback face similar challenges (e.g. disease risk prediction, future property values, credit risk prediction, long-term stock market prediction). Given these constraints, it makes sense to monitor proxy values in production as an indirect check on model accuracy.
Some of the model monitoring best practices are:
Model Performance Monitoring
Given a set of expected values for an input feature, we can verify that incoming values fall within the allowed set or range, and that the frequency of each value agrees with what we have seen in the past.
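As a rough illustration of such an input check, the sketch below validates one categorical feature against expected frequencies and one numeric feature against an allowed range. The function names, the `channel` and age examples, and the 10% tolerance are hypothetical choices, not part of any particular monitoring tool.

```python
from collections import Counter


def check_categorical_input(values, expected_freqs, tolerance=0.10):
    """Return a list of human-readable issues found in a batch of string inputs.

    expected_freqs maps each allowed value to its historical share (0..1).
    tolerance is an assumed maximum absolute deviation before flagging.
    """
    issues = []
    counts = Counter(values)
    total = len(values)

    # Values that were never seen in the reference data.
    unexpected = sorted(set(counts) - set(expected_freqs))
    if unexpected:
        issues.append(f"unexpected values: {unexpected}")

    # Allowed values whose share has moved too far from the reference.
    for value, expected in expected_freqs.items():
        observed = counts.get(value, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            issues.append(
                f"{value!r}: observed {observed:.2f}, expected {expected:.2f}"
            )
    return issues


def check_numeric_range(values, low, high):
    """Return the values that fall outside the allowed [low, high] range."""
    return [v for v in values if not (low <= v <= high)]


# Hypothetical reference frequencies for a 'channel' feature.
expected_channel = {"web": 0.6, "mobile": 0.4}
healthy_batch = ["web"] * 6 + ["mobile"] * 4
odd_batch = ["web"] * 9 + ["fax"]

healthy_issues = check_categorical_input(healthy_batch, expected_channel)
odd_issues = check_categorical_input(odd_batch, expected_channel)
out_of_range_ages = check_numeric_range([18, 40, 130], low=0, high=120)
```

In practice the reference frequencies would come from the training data or from a trusted window of recent production traffic, and the issues list would feed an alerting system rather than being inspected by hand.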
Depending on how the model is configured, an input feature may or may not be allowed to be null, and this is something we can monitor. If the null rate of a feature that is normally populated begins to change, it may signal a data skew or a shift in customer behavior, either of which warrants further investigation.
We can also compare the distribution of our model’s predictions against a reference distribution using statistical tests, either automatically or manually.
Model versions are a simple thing to monitor that is often overlooked: make sure you can easily track which version of your model has been deployed, because configuration errors do occur.
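Both checks can be sketched with the standard library alone. The example below computes a feature’s null rate and the two-sample Kolmogorov-Smirnov statistic between reference and live prediction scores. The KS test is only one possible choice of statistical test, and the sample data is made up for illustration. The sketch returns the raw statistic without a p-value; a statistics library such as SciPy (`scipy.stats.ks_2samp`) provides the full test.

```python
import bisect


def null_rate(values):
    """Share of entries in a batch that are None."""
    return sum(v is None for v in values) / len(values)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the two empirical CDFs,
    evaluated at every observed point. 0.0 means identical empirical
    distributions, 1.0 means the samples do not overlap at all.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical batch of one input feature with missing entries.
feature_batch = [1.0, None, 3.0, None]
batch_null_rate = null_rate(feature_batch)

# Hypothetical prediction scores from a reference window and a live window.
reference_scores = [0.1, 0.2, 0.3]
live_scores_same = [0.1, 0.2, 0.3]
live_scores_shifted = [0.8, 0.9]

ks_same = ks_statistic(reference_scores, live_scores_same)
ks_shifted = ks_statistic(reference_scores, live_scores_shifted)
```

An automated job could run these on each batch or time window and raise an alert when the null rate or the KS statistic crosses a threshold chosen for that feature or model.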
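One lightweight way to do this, assuming each prediction is logged together with the version of the model that produced it, is to compare the logged versions against the version you intended to deploy. The `model_version` field name and the log record layout here are assumptions for illustration only.

```python
from collections import Counter


def find_version_mismatches(prediction_logs, expected_version):
    """Count log records whose model version differs from the expected one.

    prediction_logs is an iterable of dicts; records without a
    'model_version' key are reported under the '<missing>' label.
    """
    versions = Counter(
        record.get("model_version", "<missing>") for record in prediction_logs
    )
    return {v: n for v, n in versions.items() if v != expected_version}


# Hypothetical prediction log records.
logs = [
    {"model_version": "1.2.0", "prediction": 0.91},
    {"model_version": "1.1.9", "prediction": 0.40},
    {"prediction": 0.55},
]

mismatches = find_version_mismatches(logs, expected_version="1.2.0")
```

An empty result means every logged prediction came from the expected version. Anything else points to a stale deployment, a partial rollout, or a logging gap worth investigating.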