Understanding the Importance of Feature Normalization and Standardization: When Feature Scaling Is Unnecessary
- Probabilistic models that do not rely on distance computations need no feature scaling, e.g. Naive Bayes;
- Tree-based models that do not rely on distance computations need no feature scaling either, e.g. decision trees and random forests. When growing a tree, node splitting only asks where to cut the current feature so that the classes separate best; in other words, only the relative ordering of values within a feature matters, while relative magnitudes across features are irrelevant (see the sketch after this list).
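To make this invariance concrete, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset; the variable names are mine) that fits each model on raw and on standardized features and checks that the predictions coincide:

```python
# Minimal check: Naive Bayes and decision trees should be unaffected by
# per-feature standardization (an affine, order-preserving transform).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)  # zero mean, unit variance per feature

# Gaussian Naive Bayes: standardizing rescales every class-conditional
# density by the same per-feature constant, so the posteriors are unchanged.
nb_raw = GaussianNB().fit(X, y).predict(X)
nb_std = GaussianNB().fit(X_std, y).predict(X_std)
print("GaussianNB unchanged:", np.array_equal(nb_raw, nb_std))

# Decision tree: split thresholds move with the affine transform, but the
# induced partition of the samples (and hence the predictions) stays the same.
tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y).predict(X)
tree_std = DecisionTreeClassifier(random_state=0).fit(X_std, y).predict(X_std)
print("DecisionTree unchanged:", np.array_equal(tree_raw, tree_std))
```

Both checks should print True on this dataset. The invariance breaks as soon as distances or gradient magnitudes enter the picture, which is why kNN, SVMs, and models trained by gradient descent still need scaling.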
Summary
This post turned out to be very hard to write. At first I thought the topic was simple and straightforward, but the deeper I explored, the more questions kept popping up, overturning many things I had taken for granted. As a result, I kept adding material while writing: in many places I wanted the explanation to be as intuitive as possible without copying in too many formulas, yet my own understanding was not deep enough, which is why the narrative ended up this long. I hope future posts can be more focused and concise.

Sigh...