# Difference between Normalization and Standardization: 2 ways of feature scaling your data, How to do it in Excel

#### Difference between Normalization and Standardization

Normalization and standardization are integral parts of data processing. While processing data, we often encounter variables that are measured on different original scales. Using these raw scales can give more weight to variables that have a large range in their data. To deal with this problem, we rescale the independent variables (feature scaling) so that all variables are on the same scale.

In this article we will discuss two feature scaling methods: normalization and standardization. The two terms are sometimes used interchangeably, but they usually refer to different things.

#### Standardization

It is also known as Z-score normalization. Features are rescaled so that their mean and standard deviation are 0 and 1, respectively. The formula to rescale is given below, where $x$ is the original value, $\mu$ is the mean of the feature and $\sigma$ is its standard deviation.

$$z = \frac{x - \mu}{\sigma}$$

It is useful for machine learning algorithms that learn weights on their inputs. It is also required for algorithms that use distance measurements.
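As a rough illustration of Z-score rescaling (subtract the mean, then divide by the standard deviation), here is a minimal Python sketch. The function name `standardize` and the sample values are made up for this example, and it assumes the population standard deviation; using the sample standard deviation instead would give slightly different numbers.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return the z-score of every value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in values]

scores = [10, 20, 30, 40, 50]  # made-up illustrative values
z = standardize(scores)
print(z)
print(mean(z), pstdev(z))  # mean close to 0, standard deviation close to 1
```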

#### Normalization

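Normalization is also known as min-max normalization. In this method, values are rescaled to lie between 0 and 1. For every feature, the minimum value of that feature is transformed into 0, and the maximum value is transformed into 1. The equation is given below, where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature.

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$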
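The rescaling described above (minimum mapped to 0, maximum mapped to 1) can be sketched in Python as follows. The function name `min_max_normalize` and the sample values are hypothetical and only serve to illustrate the idea.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(x - lo) / (hi - lo) for x in values]

ages = [18, 25, 40, 62]  # made-up illustrative values
scaled = min_max_normalize(ages)
print(scaled)  # smallest value becomes 0.0, largest becomes 1.0
```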

#### Difference between Normalization and Standardization

We use normalization when we know that the distribution of our data does not follow a Gaussian distribution. It is helpful for algorithms that do not assume any particular distribution of the data, such as K-Nearest Neighbors and neural networks.

On the other hand, standardization is useful where the data follow a Gaussian distribution. Also, unlike normalization, standardization has no bounded range, so it is generally less affected by outliers, if there are any.
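To make the bounded versus unbounded range concrete, the short Python sketch below applies both methods to the same list. The data are made up (the value 100 plays the role of an outlier), the helper names are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def min_max(values):
    """Min-max normalization: results always lie in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

def z_score(values):
    """Standardization: results have no fixed lower or upper bound."""
    mu, sigma = mean(values), pstdev(values)  # assumption: population std
    return [(x - mu) / sigma for x in values]

data = [1, 2, 3, 4, 100]  # made-up values; 100 is the outlier

mm = min_max(data)
z = z_score(data)

print("min-max:", [round(v, 3) for v in mm])
print("z-score:", [round(v, 3) for v in z])
```

With min-max scaling the outlier is pinned to 1 and every other value shares the remaining interval, while the z-scores are free to fall below 0 or above 1.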

#### Example of Normalization and Standardization

In this example, we share a case study in which we have two groups and their KPIs. We want to rank them to check who is performing better. But there is a catch: the two groups have different kinds of jobs and different performance thresholds, so it would be unfair to rank them by simply combining the performance of both groups. To make the comparison fair, we will first normalize or standardize each group, then combine the data, and finally rank everyone on the basis of the new scaled number.

We will add two more columns, Standardization and Normalization, and do the calculation according to the formulas mentioned above. After calculating standardization and normalization, we will calculate the rank by both methods.
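The same workflow (scale each group separately, combine, then rank) can be sketched in Python. Everything in this sketch is hypothetical: the names and KPI values are invented and are not the figures from the case study, the helper functions are illustrative, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical KPI values; these are NOT the figures from the case study.
groups = {
    "A": {"Asha": 82, "Ben": 90, "Chen": 75},
    "B": {"Dev": 410, "Eva": 380, "Finn": 455},
}

def z_scores(values):
    mu, sigma = mean(values), pstdev(values)  # assumption: population std
    return [(x - mu) / sigma for x in values]

def min_max(values):
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

def scale_within_groups(groups, scaler):
    """Apply `scaler` to each group on its own, then pool the results."""
    pooled = {}
    for members in groups.values():
        names = list(members)
        scaled = scaler([members[name] for name in names])
        pooled.update(zip(names, scaled))
    return pooled

def rank_descending(scores):
    """Rank 1 is the highest scaled score; ties keep their input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

rank_by_z = rank_descending(scale_within_groups(groups, z_scores))
rank_by_minmax = rank_descending(scale_within_groups(groups, min_max))

for name in rank_by_z:
    print(name, rank_by_z[name], rank_by_minmax[name])
```

Note that min-max scaling gives every group's best performer exactly 1 and every group's worst performer exactly 0, so ties are common when groups are combined. This sketch breaks ties by input order, whereas a spreadsheet rank function may assign tied values the same rank. On other data the two rankings need not agree.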

We can infer the following observations from the above table:

• The ranks from standardization and normalization are the same.
• The new ranks differ from the ranks calculated earlier, as they are based on the rescaled values.

That is all for now on this topic.

You can read more on this topic in the articles below.

https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/

https://www.geeksforgeeks.org/normalization-vs-standardization/