For a commercial application, you simply need to assign a business value to each of the four kinds of results: true negatives, true positives, false negatives, and false positives. By multiplying the number of results in each bucket by its associated business value and summing, you can compare models on the value they actually create and be sure you're using the best one.
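As a minimal sketch of this idea, the function and the per-outcome dollar values below are illustrative assumptions, not figures from the text:

```python
# Hypothetical per-outcome business values (currency units): true positives
# earn revenue, false negatives and false positives carry costs.
values = {"tn": 0.0, "tp": 100.0, "fn": -300.0, "fp": -25.0}

def business_value(tn, tp, fn, fp, values):
    """Weight each bucket's count by its assigned business value and sum."""
    return (tn * values["tn"] + tp * values["tp"]
            + fn * values["fn"] + fp * values["fp"])

# Compare two models evaluated on the same test set
model_a = business_value(tn=900, tp=80, fn=20, fp=50, values=values)
model_b = business_value(tn=870, tp=95, fn=5, fp=80, values=values)
print(model_a, model_b)  # → 750.0 6000.0
```

Here the second model wins despite more false positives, because the assumed cost of a false negative dominates.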
The confidence values offered by the model further complicate the matter. Almost all machine learning models can be configured to report a level of confidence alongside their output. A simple way to fold this into an accuracy measurement is to weight each result by its confidence, effectively rewarding the model for being highly confident when its assessments are right.
More advanced techniques are also feasible. If all low-confidence predictions are going to be checked manually, a more accurate approximation of the business value created by the model is to assign a manual-labor cost to those predictions and exclude their outcomes from the accuracy measurement.
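The two ideas above can be combined in a short sketch. The threshold, costs, and arrays here are all illustrative assumptions:

```python
import numpy as np

# Reward confident correct predictions; route low-confidence predictions
# to manual review at a fixed labor cost and exclude them from scoring.
confidences = np.array([0.95, 0.90, 0.55, 0.40, 0.85, 0.30])
correct     = np.array([1,    1,    0,    1,    1,    0])  # 1 = prediction was right

REVIEW_THRESHOLD = 0.6   # below this, a human checks the prediction
REVIEW_COST = 2.0        # assumed cost per manual check
VALUE_CORRECT = 10.0     # assumed value of an automated correct prediction

auto = confidences >= REVIEW_THRESHOLD
# Confidence-weighted value of the automated correct predictions
auto_value = (confidences[auto] * correct[auto] * VALUE_CORRECT).sum()
# Manual-review cost for the rest; their outcomes are excluded
review_cost = (~auto).sum() * REVIEW_COST

net_value = auto_value - review_cost
print(net_value)  # → 21.0
```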
Each individual prediction a model makes can be true or false, indicating whether the model is correct or incorrect. The data point's actual value is also significant. You might be wondering why we need a model that predicts values when we already know what they are: we're measuring the model's performance on the training data, for which we already know the answers.
The actual value of a data point can be either the value we're looking for in the dataset (a positive) or something else entirely (a negative). As a result, the four possible outcomes of a model's individual predictions are as follows: a true positive (TP), where the model predicts positive and the actual value is positive; a false positive (FP), where the model predicts positive but the actual value is negative; a true negative (TN), where the model predicts negative and the actual value is negative; and a false negative (FN), where the model predicts negative but the actual value is positive.
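These four outcomes can be tallied with scikit-learn's `confusion_matrix`; the toy labels below are made up for illustration:

```python
from sklearn.metrics import confusion_matrix

# Toy labels (1 = positive class, 0 = negative class)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary 0/1 labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # → 3 1 1 3
```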
When you want to correctly forecast the cases in the positive class, you need to know the true positive rate (TPR). If you have a test for a serious form of cancer, for example, you want it to detect all of the cases where someone actually has cancer, so the true positive rate is what you really care about.
To get the true positive rate in machine learning, divide TP (true positives) by the sum of TP and FN (false negatives):

TPR = TP / (TP + FN)
The best possible TPR is one, while the worst is zero. In circumstances where recall is critical, there is one more thing we can do to correctly predict more of the true cases: adjust our decision threshold.
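scikit-learn exposes this metric directly as `recall_score`; the labels below are invented for illustration:

```python
from sklearn.metrics import recall_score

# Four actual positives: the model catches two (TP = 2) and misses two (FN = 2)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

# recall_score computes TP / (TP + FN) = 2 / (2 + 2)
print(recall_score(y_true, y_pred))  # → 0.5
```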
A scikit-learn classification model's decision threshold is set to 0.5 by default. This means that if the model believes an observation has a 50% or greater chance of belonging to the positive class, it predicts that the observation is a member of the positive class.
If TPR is important to us, we can lower the decision threshold to catch more of the true positive cases. For example, we might have the model predict positive for every observation with a predicted probability of 20% or greater.
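A sketch of lowering the threshold, using `predict_proba` and thresholding by hand; the synthetic dataset and the 0.2 cutoff are assumptions for the example:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data for illustration
X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class

default_preds = (proba >= 0.5).astype(int)  # what .predict() does by default
lowered_preds = (proba >= 0.2).astype(int)  # catches more true positives

# Lowering the threshold can only add positive predictions, never remove them
print(lowered_preds.sum() >= default_preds.sum())  # → True
```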
After all, you could predict that every observation is positive and achieve a perfect 100 percent TPR. But that isn't usually a good idea.
When the cost of false positives is large, you need to keep an eye on them. You'll want a statistic that tracks how well your model separates true positives from false positives, which means paying close attention to precision.
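Precision is TP / (TP + FP): of everything the model flagged positive, how much really was? scikit-learn's `precision_score` computes it; the labels below are made up for illustration:

```python
from sklearn.metrics import precision_score

# The model flags four observations as positive: two are right (TP = 2),
# two are wrong (FP = 2), so precision = 2 / (2 + 2)
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 1, 0, 0]

print(precision_score(y_true, y_pred))  # → 0.5
```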
It's important to remember that the true positive rate is also known as sensitivity or recall, and that specificity is another name for the true negative rate.
Within any single model, these two criteria normally trade off against one another. Dominance is still possible across models, however: one model may deliver a higher true positive rate without delivering a higher false-positive rate.
The true negative rate is 1 minus the false positive rate, while the false negative rate is 1 minus the true positive rate.
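These complementary-rate identities follow directly from the definitions and can be checked numerically; the counts below are arbitrary:

```python
# TNR (specificity) = 1 - FPR, and FNR = 1 - TPR (recall)
tp, fp, tn, fn = 40, 10, 80, 20  # illustrative confusion-matrix counts

tpr = tp / (tp + fn)  # true positive rate (recall)
fpr = fp / (fp + tn)  # false positive rate
tnr = tn / (tn + fp)  # true negative rate (specificity)
fnr = fn / (fn + tp)  # false negative rate

print(abs(tnr - (1 - fpr)) < 1e-12 and abs(fnr - (1 - tpr)) < 1e-12)  # → True
```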