Anomalous data can arise from human error, instrument faults, natural deviation within a population, fraud, unforeseen behavioral changes, or systemic flaws. Depending on how the data were gathered and labeled, one of three anomaly detection approaches may apply:
- Unsupervised Clustering. This strategy suits data with no prior knowledge, where points have not been pre-labeled as normal or anomalous. It presupposes that the data follow a fixed distribution that can be described by statistical models; points whose values fall outside the permitted range of that distribution are marked as outliers.
- Supervised Classification. This method requires pre-labeled data classified as normal or abnormal, or even as specific known categories of abnormal behavior. It supports modeling both normality and abnormality. Many practitioners consider this a conventional classification problem rather than an anomaly detection problem, since any supervised ML method may be applied to it.
- Semi-supervised Detection. This focuses solely on modeling normality, requiring either data pre-labeled as normal or the assumption that the training set consists exclusively of normal data. In the semi-supervised procedure, a supervised model learns the normal pattern, while an unsupervised method infers the normality boundary. For time-series data, forecasting algorithms are frequently employed in the supervised step. The semi-supervised technique is advantageous when normal data is widely available but anomalous data is very difficult to acquire, as in defect detection domains.
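The unsupervised, distribution-based idea above can be sketched with a minimal z-score rule: assuming the data roughly follow a single Gaussian distribution, points whose standardized distance from the mean exceeds a chosen threshold are flagged as outliers. The function name and threshold are illustrative, not from any particular library.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag points whose z-score (distance from the mean in units of
    standard deviation) exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical: nothing can be an outlier
    return [x for x in values if abs(x - mean) / std > threshold]

# Unlabeled sensor readings; 25.0 lies far outside the bulk of the data.
data = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0]
print(zscore_outliers(data, threshold=2.0))  # -> [25.0]
```

Real clustering-based detectors (e.g. DBSCAN-style density methods) generalize this idea to multivariate data, but the principle is the same: normality is defined by the statistical bulk of the data, not by labels.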
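The supervised approach can be illustrated with a toy one-nearest-neighbor classifier: given readings already labeled normal or abnormal, a new point simply takes the label of its closest labeled example. The data and function names are hypothetical; in practice any standard classifier (decision trees, SVMs, neural networks) fills this role.

```python
def predict_1nn(train, query):
    """Classify a 1-D query point by the label of its nearest labeled
    neighbor in the training set."""
    nearest = min(train, key=lambda point: abs(point[0] - query))
    return nearest[1]

# Pre-labeled readings: (value, label)
train = [(10.0, "normal"), (10.2, "normal"), (9.9, "normal"),
         (25.0, "abnormal"), (26.5, "abnormal")]

print(predict_1nn(train, 10.1))  # -> normal
print(predict_1nn(train, 24.0))  # -> abnormal
```

Because both classes appear in the training data, this is an ordinary classification setup, which is exactly why many treat it as a conventional classification problem rather than anomaly detection proper.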
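A minimal sketch of the semi-supervised idea, under the simplifying assumption that normality for a single variable can be captured by a mean-plus-or-minus-k-standard-deviations band: the model is fit on normal-only training data, and anything outside the learned band is treated as anomalous at test time. The function names and the choice of k are illustrative assumptions.

```python
import statistics

def fit_normal_band(normal_values, k=3.0):
    """Learn a normality boundary (mean +/- k * std) from data assumed
    to contain only normal examples."""
    mean = statistics.mean(normal_values)
    std = statistics.pstdev(normal_values)
    return (mean - k * std, mean + k * std)

def is_anomalous(x, band):
    """A test point is anomalous if it falls outside the learned band."""
    low, high = band
    return x < low or x > high

# Training data: normal readings only (no anomalous examples needed).
normal_train = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1]
band = fit_normal_band(normal_train, k=3.0)

print(is_anomalous(10.05, band))  # -> False
print(is_anomalous(12.0, band))   # -> True
```

This mirrors why the approach suits defect detection: the band is learned entirely from abundant normal data, and rare defects are caught simply because they violate the learned notion of normal.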