Difference Between Normalization and Standardization | by Chetan Ambi | Sep, 2022



Understand the differences between normalization and standardization, the different scaling methods available, and, most importantly, when you should consider using each.

Table of Contents

· Introduction
· What is feature scaling and why is it important?
· Normalization
∘ MinMaxScaler
∘ MaxAbsScaler
∘ RobustScaler
· Standardization
∘ StandardScaler
· Summary
· References

Introduction

Feature scaling is one of the important steps in the machine learning pipeline. The two common techniques used for feature scaling are normalization and standardization. But what is the difference between them? When should you use normalization, and when standardization? These are very common questions among people who have just started their data science journey. Let's answer them in this article.

What is feature scaling and why is it important?

Feature scaling is a method of transforming data into a common range, such as [0, 1], [-1, 1], or [-2, 2]. If feature scaling is not applied, machine learning models give higher weight to the features with large values, resulting in a biased model.

For example, consider two features, total_bill and tip. As you can see below, the two features are in very different ranges. The model only looks at the numbers; it doesn't know which is total_bill and which is tip. When you build a model without feature scaling, it will be biased towards the feature with larger values. To mitigate this issue, we need to apply feature scaling to the data.

An example from the tips dataset in the Seaborn library

Note: Feature scaling is not necessary for all machine learning algorithms. Distance-based algorithms such as KNN, K-means, and support vector machines, as well as algorithms optimized with gradient descent (e.g., linear regression, logistic regression, and neural networks), work best with feature scaling. However, tree-based models such as random forest, XGBoost, and LightGBM are not affected by scaling.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import MaxAbsScaler
from sklearn.preprocessing import RobustScaler
from sklearn.preprocessing import StandardScaler

sns.set(font_scale=1.5)

# Load the tips dataset and keep the two numeric features
df = sns.load_dataset('tips')
df = df[['total_bill', 'tip']]
df.head()

# Plot each feature's distribution and their relationship
fig, axes = plt.subplots(1, 3, figsize=(18, 5))
sns.histplot(data=df, x='total_bill', ax=axes[0])
sns.histplot(data=df, x='tip', ax=axes[1])
sns.scatterplot(data=df, x='total_bill', y='tip', ax=axes[2]);

Normalization

Normalization is a feature scaling technique that brings the features in the data to a common range, say [0, 1], [-1, 0], or [-1, 1]. In this section, we'll go through three popular normalization methods.

MinMaxScaler

This method scales each feature individually so that it lies in the range [0, 1]. From each feature value, the minimum is subtracted, and the result is divided by the difference between the maximum and the minimum:

x_scaled = (x − x_min) / (x_max − x_min)

It uses the minimum and maximum values for scaling, and both are sensitive to outliers. As a result, MinMaxScaler is also sensitive to outliers. Note that MinMaxScaler doesn't change the shape of the distribution of the data.
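
As a quick sanity check, here is the formula applied by hand; the toy values below are illustrative, not from the tips dataset:

import numpy as np

x = np.array([2.0, 5.0, 10.0])                  # illustrative toy values
x_scaled = (x - x.min()) / (x.max() - x.min())  # (x - min) / (max - min)
print(x_scaled)                                 # [0.    0.375 1.   ]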

Scikit-learn implementation

You can use MinMaxScaler from Sklearn as shown below.

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
minmaxscaler_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

When to use MinMaxScaler?

  • MinMaxScaler is preferred when the distribution of the features is unknown (i.e., when you cannot assume the features are normally distributed).
  • MinMaxScaler can also be considered if the underlying machine learning algorithms you are using don't make any assumptions about the distribution of the data (e.g., kNN, neural nets).
  • Consider using MinMaxScaler only if the features have very few or no outliers.

Note: By default, MinMaxScaler scales the data to the range [0, 1]. However, you can modify this range as needed by setting the feature_range parameter, as shown below.
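
For example, a minimal sketch that scales the tips features to [-1, 1] instead (df is the DataFrame loaded earlier):

from sklearn.preprocessing import MinMaxScaler

# Scale each feature to [-1, 1] instead of the default [0, 1]
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)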

MaxAbsScaler

MaxAbsScaler is another normalization technique. This method scales each feature individually so that it lies in the range [0, 1], [-1, 0], or [-1, 1], depending on the sign of the values:

  • only positive values: [0, 1]
  • only negative values: [-1, 0]
  • both positive & negative values: [-1, 1]

In this method, each feature value is divided by the maximum absolute value of that feature:

x_scaled = x / max(|x|)

Since this method uses the maximum, it is also sensitive to outliers, just like MinMaxScaler.
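
Again, the formula applied by hand on illustrative toy values with mixed signs:

import numpy as np

x = np.array([-4.0, 2.0, 8.0])   # illustrative toy values
x_scaled = x / np.abs(x).max()   # divide by the maximum absolute value
print(x_scaled)                  # [-0.5   0.25  1.  ]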

Scikit-learn implementation

You can use MaxAbsScaler from Sklearn, as shown below.

from sklearn.preprocessing import MaxAbsScaler

scaler = MaxAbsScaler()
maxabsscaler_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

When to use MaxAbsScaler?

  • If the data is sparse (i.e., most of the values are zeros), you should consider using MaxAbsScaler. In fact, MaxAbsScaler is designed specifically for sparse data (see the sketch below).

For curious minds: why should you use MaxAbsScaler if the dataset is sparse? Read this article by Christian Versloot to understand.
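
The short version: MaxAbsScaler only divides and never centers the data, so zero entries stay zero and sparsity is preserved. A minimal sketch on a SciPy sparse matrix (the toy matrix is illustrative):

from scipy.sparse import csr_matrix
from sklearn.preprocessing import MaxAbsScaler

# Each column is divided by its maximum absolute value; the zeros
# are untouched, so the matrix stays sparse.
X = csr_matrix([[0.0, 4.0], [0.0, -2.0], [3.0, 0.0]])
X_scaled = MaxAbsScaler().fit_transform(X)
print(X_scaled.toarray())   # [[0., 1.], [0., -0.5], [1., 0.]]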

RobustScaler

Both MinMaxScaler and MaxAbsScaler are sensitive to outliers. The alternative is RobustScaler. Instead of using the minimum and maximum values, RobustScaler uses the median and the IQR (interquartile range), and is hence robust to outliers.

The formula for RobustScaler is shown below: the median is subtracted from each data point, and the result is scaled by the IQR, i.e., the difference between the 75th percentile (Q3) and the 25th percentile (Q1):

x_scaled = (x − median) / (Q3 − Q1)

The calculated median and IQR are stored so that they can be reused when transforming the test set. The scaling happens independently for each feature.
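
The formula applied by hand on illustrative toy values containing one outlier:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # illustrative, one outlier
median = np.median(x)                       # 3.0
q1, q3 = np.percentile(x, [25, 75])         # 2.0 and 4.0
x_scaled = (x - median) / (q3 - q1)
print(x_scaled)                             # [-1.  -0.5  0.   0.5 48.5]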

Scikit-learn implementation

You can use RobustScaler from Sklearn as shown below.

from sklearn.preprocessing import RobustScaler

scaler = RobustScaler()
robustscaler_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

When to use RobustScaler?

  • RobustScaler is preferred when the data contains outliers, as it is far less sensitive to them than MinMaxScaler and MaxAbsScaler (see the comparison sketch below).
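
To see the difference, a minimal comparison on the same illustrative toy data: MinMaxScaler squashes the inliers into a tiny range near 0, while RobustScaler preserves their spacing.

import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # one extreme outlier
print(MinMaxScaler().fit_transform(X).ravel())
# ≈ [0, 0.0101, 0.0202, 0.0303, 1]   -> inliers squashed near 0
print(RobustScaler().fit_transform(X).ravel())
# [-1.  -0.5  0.   0.5  48.5]        -> inlier spacing preserved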

Standardization

Standardization is the most commonly used feature scaling technique in machine learning. This is because some algorithms assume a normal or near-normal distribution of the data, and if the features are far from normally distributed, the model may behave badly. In scikit-learn, StandardScaler and standardization refer to the same thing.

StandardScaler

This method removes the mean and scales the data to unit variance (or unit standard deviation):

z = (x − μ) / σ

where μ is the mean and σ is the standard deviation of the feature. The calculated mean and standard deviation are stored so that they can be reused when transforming the test set. The scaling happens independently for each feature in the data.
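
The formula applied by hand on illustrative toy values (NumPy's default population standard deviation matches the one StandardScaler uses):

import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])   # illustrative toy values
z = (x - x.mean()) / x.std()         # (x - mean) / std
print(z)                             # ≈ [-1.342 -0.447  0.447  1.342]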

Scikit-learn implementation

You can use StandardScaler from Sklearn, as shown below.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
standardscaler_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
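
Because the statistics are stored on the fitted scaler, the typical workflow fits on the training split only and reuses those statistics on the test split. A minimal sketch (the split parameters are illustrative):

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test = train_test_split(df, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean_ and scale_
X_test_scaled = scaler.transform(X_test)        # reuses the stored statistics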

StandardScaler uses the mean (and standard deviation), both of which are sensitive to outliers. Hence, outliers influence StandardScaler as well.

When to use StandardScaler?

  • If the features are normally distributed, StandardScaler should be your first choice.
  • Consider using StandardScaler if the underlying machine learning algorithms you are using assume a normal distribution of the data (e.g., linear regression, logistic regression, etc.).
  • If there are outliers in the data, you can remove them first and then use MinMaxScaler, MaxAbsScaler, or StandardScaler (or use RobustScaler, which tolerates outliers).

Summary

Normalization and standardization are the two popular feature scaling techniques. The table below summarizes both methods.

Scaler         | Technique       | Output range                 | Sensitive to outliers?
MinMaxScaler   | Normalization   | [0, 1] (configurable)        | Yes (uses min & max)
MaxAbsScaler   | Normalization   | [0, 1], [-1, 0], or [-1, 1]  | Yes (uses max absolute value)
RobustScaler   | Normalization   | Not bounded                  | No (uses median & IQR)
StandardScaler | Standardization | Not bounded (mean 0, std 1)  | Yes (uses mean & std)

However, note that feature scaling is not mandatory for all algorithms. Tree-based algorithms such as decision trees, random forests, and gradient-boosted trees don't need feature scaling.
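
For the algorithms that do need scaling, one convenient pattern is to combine the scaler and the model in a single scikit-learn pipeline, so the scaler is always fit on training data only. A minimal sketch, predicting tip from total_bill with KNN as an illustrative scale-sensitive model:

from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

# The pipeline standardizes total_bill before it reaches the
# scale-sensitive KNN regressor.
model = make_pipeline(StandardScaler(), KNeighborsRegressor())
model.fit(df[['total_bill']], df['tip'])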

References

[1] https://scikit-learn.org/stable/modules/preprocessing.html
[2] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html
[3] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html
[4] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html


