How are Machine Learning algorithms evaluated?

Kayley Marshall

Metrics are an essential part of every Machine Learning process: they are the main indicator of a model's effectiveness and of whether the project fulfils its purpose. Evaluating Machine Learning algorithms is therefore a crucial step in any AI-supported workflow. Some of the most common Machine Learning evaluation metrics are:

  • Classification Accuracy – the ratio of correct predictions to the total number of samples. Be cautious: it is only a reliable indicator when each class contains roughly the same number of samples.
  • Confusion Matrix – a metric that describes classification model performance through 4 values of equal importance:
    • TP (True Positive) – positive in the real world, predicted positive.
    • FP (False Positive) – negative in the real world, predicted positive.
    • TN (True Negative) – negative in the real world, predicted negative.
    • FN (False Negative) – positive in the real world, predicted negative.

Knowing these values makes it easy to understand and apply the evaluation metrics derived from the Confusion Matrix, given by the following equations:

  1. Sensitivity = TP / (TP + FN)
  2. Accuracy = (TP + TN) / (TP + FP + TN + FN)
  3. Specificity = TN / (TN + FP)
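The four confusion-matrix values and the three equations above can be sketched in a few lines of Python. The labels below are made-up illustrative data, not from the answer:

```python
# Minimal sketch: count TP/FP/TN/FN for a binary classifier,
# then compute the three metrics defined above. Illustrative labels only.
def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 1, 0, 0, 1, 0, 1, 0]   # ground-truth labels (made up)
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # classifier predictions (made up)
tp, fp, tn, fn = confusion_counts(y_true, y_pred)

sensitivity = tp / (tp + fn)                  # equation 1
accuracy    = (tp + tn) / (tp + fp + tn + fn) # equation 2
specificity = tn / (tn + fp)                  # equation 3
print(tp, fp, tn, fn)                         # 3 1 3 1
print(sensitivity, accuracy, specificity)     # 0.75 0.75 0.75
```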
  • AUC (Area Under Curve) is the most common evaluation metric for binary classification models.
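One way to read AUC is as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise interpretation, with made-up scores (a real project would typically use a library routine such as scikit-learn's `roc_auc_score`):

```python
# Minimal sketch: AUC as the fraction of (positive, negative) pairs
# where the positive sample is scored higher; ties count as half.
def auc(y_true, scores):
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    pairs = [(p, n) for p in pos for n in neg]
    wins = sum(1 for p, n in pairs if p > n)
    ties = sum(1 for p, n in pairs if p == n)
    return (wins + 0.5 * ties) / len(pairs)

y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]  # model's predicted probabilities (made up)
print(auc(y_true, scores))               # one positive is outranked by one negative
```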

F1 Score measures a test's accuracy by striking a balance between Recall and Precision. Recall is the number of correct positive results divided by the number of all relevant samples (everything that should have been identified as positive). Precision is the number of correct positive results divided by the number of samples the classifier predicted as positive. A higher F1 score means better model performance.
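The relationship between Precision, Recall, and F1 can be sketched directly from the confusion-matrix counts; the F1 score is their harmonic mean. The counts below are illustrative, not from the answer:

```python
# Minimal sketch: precision, recall, and their harmonic mean (F1).
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)  # correct positives / all predicted positives
    recall    = tp / (tp + fn)  # correct positives / all actual positives
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 3 true positives, 1 false positive, 1 false negative (illustrative)
p, r, f1 = precision_recall_f1(tp=3, fp=1, fn=1)
print(p, r, f1)  # 0.75 0.75 0.75
```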
