Difference between Normalization and Standardization: 2 ways of feature scaling your data, How to do it in Excel


Normalization and standardization are integral parts of data preprocessing. While processing data, we often encounter variables that are measured on different scales. Using them on their original scales can give more weight to variables that have a large range. To deal with this problem, we rescale the independent variables (features) so that all of them are on a comparable scale.

In this article we discuss two feature scaling methods: normalization and standardization. The two terms are sometimes used interchangeably, but they usually refer to different things.

Standardization

Standardization is also known as Z-score normalization. Features are rescaled so that their mean is 0 and their standard deviation is 1. The formula is:

z = (x - μ) / σ

where x is the original value, μ is the mean of the feature, and σ is its standard deviation.

It is useful in machine learning algorithms that learn weights for their inputs. It is also required for algorithms that use distance measurements.
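The z-score rescaling described above can be sketched in a few lines of Python. This is only an illustration: the function name `standardize`, the sample values, and the use of the population standard deviation (rather than the sample one) are assumptions, not something the article specifies.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(x - mu) / sigma for x in values]


# Made-up sample with mean 5 and population standard deviation 2.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scores = standardize(sample)
print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

In Excel, a comparable calculation could use the built-in STANDARDIZE function together with AVERAGE and STDEV.P. Choosing STDEV.P over STDEV.S is an assumption here, since the article does not say which one was used.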

Normalization

Normalization is also known as min-max (or max-min) normalization. In this method, values are rescaled to the range 0 to 1. For every feature, the minimum value of that feature is transformed into 0 and the maximum value is transformed into 1. The formula is:

x' = (x - min(x)) / (max(x) - min(x))
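The min-max rescaling described above can be sketched in Python as follows. The function name `min_max_normalize` and the sample values are made up for illustration.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the range is zero")
    return [(x - lo) / (hi - lo) for x in values]


# Made-up sample: the smallest value maps to 0, the largest to 1.
scores = [10.0, 20.0, 25.0, 50.0]
scaled = min_max_normalize(scores)
print(scaled)  # [0.0, 0.25, 0.375, 1.0]
```

In Excel, the same expression can be built from MIN and MAX taken over the feature's column.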

Difference between Normalization and Standardization

We use normalization when we know that our data does not follow a Gaussian distribution. It is helpful for algorithms that do not assume any particular distribution of the data, such as K-nearest neighbors and neural networks.

Standardization, on the other hand, is useful when the data follows a Gaussian distribution. Also, unlike normalization, standardization has no bounding range, so outliers, if any, are not forced into a fixed interval and tend to affect it less.
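The point about the bounding range can be seen on a small made-up sample with one outlier. In this sketch (helper names and data are assumptions for illustration), min-max normalization always returns values between 0 and 1, so the outlier becomes 1 and the remaining values are squeezed close to 0, while the z-scores are not confined to a fixed interval.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


# Made-up data: five ordinary values and one outlier.
data = [10.0, 20.0, 30.0, 40.0, 50.0, 1000.0]

mm = min_max(data)
zs = z_score(data)

# Min-max output is bounded: the outlier is exactly 1, the rest are below 0.05.
print([round(v, 3) for v in mm])
# Z-scores have no fixed bounds: here the outlier lands above 2.
print([round(v, 3) for v in zs])
```

Both methods are still influenced by the outlier, because the minimum, maximum, mean, and standard deviation all depend on it. The sketch only shows the difference between a bounded and an unbounded output range.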

Example of Normalization and Standardization

In this example, we share a case study in which we have two groups and their KPIs. We want to rank the employees to see who is performing better. The catch is that the two groups have different kinds of jobs and different performance thresholds, so it would be unfair to rank them by simply combining the raw performance of both groups. To make the comparison fair, we first normalize or standardize each group separately, then combine the data, and finally rank everyone on the basis of the new scaled value.

Group Name	Employee_Name	KPI_1	RANK_before
Group-A	Agent_1	155.9%	14
Group-A	Agent_2	158.3%	12
Group-A	Agent_3	150.7%	16
Group-A	Agent_4	137.9%	25
Group-A	Agent_5	114.0%	34
Group-A	Agent_6	192.5%	1
Group-A	Agent_7	191.2%	2
Group-A	Agent_8	181.6%	3
Group-A	Agent_9	177.7%	4
Group-A	Agent_10	175.5%	5
Group-A	Agent_11	171.8%	6
Group-A	Agent_12	167.5%	7
Group-A	Agent_13	165.5%	8

Sample dataset

You can download the complete dataset by clicking Here.

We will add two more columns, Standardization and Normalization, and calculate them using the formulas given above. After calculating standardization and normalization, we will calculate the rank by both methods.

Group Name	Employee_Name	KPI_1	RANK_before	Standardization	Normalization	RANK_after_Standardization	RANK_after_Normalization
Group-A	Agent_1	155.9%	14	0.3	0.69	17	17
Group-A	Agent_2	158.3%	12	0.4	0.71	13	13
Group-A	Agent_3	150.7%	16	0.1	0.64	19	19
Group-A	Agent_4	137.9%	25	-0.4	0.53	33	33
Group-A	Agent_5	114.0%	34	-1.4	0.33	51	51
Group-A	Agent_6	192.5%	1	1.9	1.00	1	1
Group-A	Agent_7	191.2%	2	1.8	0.99	2	2
Group-A	Agent_8	181.6%	3	1.4	0.91	4	4
Group-A	Agent_9	177.7%	4	1.3	0.87	5	5
Group-A	Agent_10	175.5%	5	1.2	0.85	6	6
Group-A	Agent_11	171.8%	6	1.0	0.82	7	7
Group-A	Agent_12	167.5%	7	0.8	0.79	8	8
Group-A	Agent_13	165.5%	8	0.8	0.77	9	9
Group-A	Agent_14	164.0%	9	0.7	0.76	10	10

We can infer the following observations from the table above:

In this dataset, the ranks from standardization and normalization are the same.
The new ranks differ from the ranks calculated earlier because they are based on the rescaled values rather than the raw KPI.
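The workflow used in this example (scale each group separately, combine the groups, then rank) can be sketched in Python as below. This is a minimal illustration, not the spreadsheet behind the article: the three Group-A values are taken from the sample table, while the Group-B agents and their values are hypothetical, as are the helper names. The population standard deviation is again an assumption.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score: (x - mean) / population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


def min_max(values):
    """Min-max normalization: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


def rank_desc(scores):
    """Rank 1 is the highest score; tied scores share the same rank."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]


groups = {
    # KPI_1 values (in percent) from the article's sample table.
    "Group-A": {"Agent_1": 155.9, "Agent_5": 114.0, "Agent_6": 192.5},
    # Hypothetical second group, made up for this sketch.
    "Group-B": {"Agent_B1": 80.0, "Agent_B2": 95.0, "Agent_B3": 65.0},
}

names, raw, z_all, mm_all = [], [], [], []
for group, members in groups.items():
    values = list(members.values())
    names += [f"{group}/{name}" for name in members]
    raw += values
    z_all += standardize(values)  # scaled within the group only
    mm_all += min_max(values)     # scaled within the group only

rank_before = rank_desc(raw)     # ranking on the raw KPI
rank_after_z = rank_desc(z_all)  # ranking on z-scores
rank_after_mm = rank_desc(mm_all)  # ranking on min-max values

for row in zip(names, raw, rank_before, rank_after_z, rank_after_mm):
    print(row)
```

On the raw KPI every Group-A agent outranks every Group-B agent simply because Group-A works on a higher scale. After scaling within each group, the best performer of Group-B can compete with the best performer of Group-A. Whether the two scaled rankings coincide depends on the data; min-max scaling, for instance, maps the top value of every group to exactly 1.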

That is all for now for this topic.

You can read more on this topic in the articles below.

https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/

https://www.geeksforgeeks.org/normalization-vs-standardization/
