Understanding the Importance of Feature Normalization and Standardization: When Feature Scaling Is Unnecessary
- Probabilistic models that do not rely on distance computations need no feature scaling, e.g. Naive Bayes;
- Tree-based models that do not rely on distance computations need no feature scaling either, e.g. decision trees and random forests. When growing a tree, node splitting only asks where to cut the current feature so that the classes separate best; in other words, only the relative ordering of values within a feature matters, while relative magnitudes across features are irrelevant (see the sketch after this list).
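To make this invariance concrete, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset; the variable names are mine) that fits each model on raw and on standardized features and checks that the predictions coincide:

```python
# Minimal check: Naive Bayes and decision trees should be unaffected by
# per-feature standardization (an affine, order-preserving transform).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)  # zero mean, unit variance per feature

# Gaussian Naive Bayes: standardizing rescales every class-conditional
# density by the same per-feature constant, so the posteriors are unchanged.
nb_raw = GaussianNB().fit(X, y).predict(X)
nb_std = GaussianNB().fit(X_std, y).predict(X_std)
print("GaussianNB unchanged:", np.array_equal(nb_raw, nb_std))

# Decision tree: split thresholds move with the affine transform, but the
# induced partition of the samples (and hence the predictions) stays the same.
tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y).predict(X)
tree_std = DecisionTreeClassifier(random_state=0).fit(X_std, y).predict(X_std)
print("DecisionTree unchanged:", np.array_equal(tree_raw, tree_std))
```

Both checks should print True on this dataset. The invariance breaks as soon as distances or gradient magnitudes enter the picture, which is why kNN, SVMs, and models trained by gradient descent still need scaling.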
Summary
This post turned out to be very hard to write. At first I thought the topic was simple and straightforward, but the deeper I explored, the more questions kept popping up, overturning many things I had taken for granted. As a result, I kept adding material while writing: in many places I wanted the explanation to be as intuitive as possible without copying in too many formulas, yet my own understanding was not deep enough, which is why the narrative ended up this long. I hope future posts can be more focused and concise.

Sigh...