In ML classification problems, the final categorization often depends on a large number of criteria. These criteria are variables referred to as **features**. The more features there are, the harder it becomes to visualize the training set and then work on it. Many of these features are correlated and hence redundant. This is where dimensionality reduction techniques come in. Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It is divided into two parts: feature selection and feature extraction.

## Importance of Dimensionality Reduction

A simple email classification problem, in which we must determine whether an email is spam or not, offers a straightforward illustration of dimensionality reduction. It can involve many features, such as whether the email has a generic subject, the email's content, whether the email uses a template, and so on. Some of these features, however, may overlap. Similarly, a classification problem that relies on both humidity and rainfall can often be collapsed into a single underlying feature, since the two are highly correlated. The number of features in such cases can therefore be reduced. Note that a 3-D classification problem can be hard to visualize, whereas a 2-D one can be mapped to a simple two-dimensional plane and a 1-D problem to a simple line.

Dimensionality reduction is made up of two parts:

- Feature selection: here we try to find a subset of the original set of variables, or features, so that we can model the problem with a smaller set. It is usually done in three ways: filter, wrapper, and embedded methods.
- Feature extraction: this transforms the data from a high-dimensional space to a space of fewer dimensions.
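To make the distinction concrete, here is a minimal sketch of filter-style feature selection using scikit-learn's `SelectKBest`; the Iris dataset and the choice of `k=2` are illustrative assumptions, not part of the article:

```python
# Filter-method feature selection: score each feature independently
# (here with an ANOVA F-test) and keep the k best ones.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)          # 150 samples, 4 features
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)   # keep the 2 highest-scoring features

print(X.shape, X_reduced.shape)            # (150, 4) (150, 2)
```

Unlike feature extraction, the two retained columns are original features, not new combinations of them.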

## Techniques of Dimensionality Reduction

The following are some of the approaches used to reduce dimensionality:

**PCA** – Principal Component Analysis – It operates on the premise that when data is mapped from a higher-dimensional space to a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximized.

It entails the following steps:

- Construct the covariance matrix of the data.
- Compute the eigenvectors of this matrix.
- Use the eigenvectors with the largest eigenvalues to recover a large fraction of the variance in the original data.

As a result, we are left with a smaller number of eigenvectors, and some data may be lost in the process. However, the retained eigenvectors capture the most significant variances.
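The steps above can be sketched directly in NumPy; the synthetic three-feature dataset (with one deliberately redundant feature) is an assumption for illustration:

```python
import numpy as np

# Minimal PCA sketch: center the data, build the covariance matrix,
# keep the eigenvectors with the largest eigenvalues, and project.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)   # make feature 2 redundant

X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)           # 3x3 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: cov is symmetric

order = np.argsort(eigenvalues)[::-1]            # sort by descending eigenvalue
components = eigenvectors[:, order[:2]]          # top 2 eigenvectors
X_projected = X_centered @ components            # 200x2 lower-dimensional data
```

The first projected column carries the most variance, the second the next most; dropping the third eigenvector loses only the small residual noise.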

**LDA** – Linear Discriminant Analysis

**GDA** – Generalized Discriminant Analysis

Dimensionality reduction might be linear or nonlinear, depending on the approach employed.
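As a sketch of one linear, supervised technique from the list above, here is LDA with scikit-learn; the Iris dataset is again an illustrative assumption:

```python
# LDA reduces dimensionality using class labels: it projects the data
# onto at most (n_classes - 1) directions that best separate the classes.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)                 # 3 classes, 4 features
lda = LinearDiscriminantAnalysis(n_components=2)  # max n_classes - 1 = 2
X_lda = lda.fit_transform(X, y)

print(X_lda.shape)                                # (150, 2)
```

Unlike PCA, LDA uses the labels `y`, so it maximizes class separability rather than raw variance.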

## Benefits and Drawbacks of Dimensionality Reduction

Benefits:

- It aids data compression, resulting in reduced storage space.
- It cuts down on computation time.
- It helps remove redundant features.

Drawbacks:

- Some data may be lost in the process.
- PCA tends to capture only linear relationships between variables, which is not always desirable.
- PCA fails when the mean and covariance are not enough to characterize a dataset.
- We may not know in advance how many principal components to keep; in practice, some rules of thumb are applied.
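One common rule of thumb (an assumption here, not prescribed above) is to keep enough principal components to explain roughly 95% of the variance, which scikit-learn makes easy to check:

```python
# Choose the number of PCA components by cumulative explained variance.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)    # 64 features per sample
pca = PCA().fit(X)                     # fit all 64 components
cumulative = np.cumsum(pca.explained_variance_ratio_)

# First index where cumulative variance reaches the 0.95 threshold.
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)
print(n_keep)                          # components covering 95% of variance
```

Plotting `cumulative` against the component index (a "scree" curve) is a common way to eyeball the same trade-off.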