A Comparative Study of Machine Learning Methods on a Handwritten Digit Recognition Dataset
Abstract
Research significance: Both statistical machine learning and deep learning are widely applied.
Mainstream methodology: comparative experiments on the same dataset.
Gap in prior work: Within the scope of our search, we found no existing work that directly compares statistical learning methods with deep learning methods.
Our approach: We compare the performance of mainstream statistical machine learning and deep learning methods on the handwritten digit recognition dataset (MNIST).
Our results: Experiments show that the deep learning method performs better on the MNIST dataset, reaching 98.08% accuracy on the test set, versus 97.92% for the statistical machine learning method (SVM).
Keywords: handwritten digit recognition, MNIST, DNN, SVM, statistical machine learning, deep learning
Experiments
Experimental Setup
Epochs: 10
Training samples: 60,000
Test samples: 10,000
Image shape: (28, 28, 1)
Results
Predictive Performance
Method | Train Acc | Test Acc | Parameters |
---|---|---|---|
DNN | 0.9950 | 0.9808 | 1,238,730 |
CNN+MaxPooling | 0.9906 | 0.9742 | 1,332,810 |
Kernel approximation + LinearSVC | 0.9378 | 0.9371 | N/A |
SVC | 0.9899 | 0.9792 | N/A |
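The parameter counts in the table can be cross-checked by hand from the layer shapes. For the DNN below (two 784-unit Dense layers and a 10-way softmax over a 784-dimensional input), a quick sanity check:

# Each Dense layer has (inputs * units + units) parameters.
dnn_params = 784 * 784 + 784   # first Dense(784) on the 784-dim input
dnn_params += 784 * 784 + 784  # second Dense(784)
dnn_params += 784 * 10 + 10    # Dense(10) softmax output
print(dnn_params)  # 1238730, matching the table and model.summary()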
Execution Efficiency
Hardware: 80-thread CPU, 128 GB RAM, SSD.
Method | Training and Inference |
---|---|
DNN | 0m 38.849s |
CNN+MaxPooling | 11m 19.786s |
Kernel approximation + LinearSVC | 0m 20.889s |
SVC | 10m 54.445s |
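The "0m 38.849s" format suggests these are wall-clock timings of each whole script (e.g. via the Unix time command). For a per-phase breakdown inside Python, a minimal sketch with time.perf_counter(), reusing the model and data objects from the DNN code below:

import time

start = time.perf_counter()
model.fit(x=x_train, y=y_train, batch_size=128, epochs=10)
print("training took %.3f s" % (time.perf_counter() - start))

start = time.perf_counter()
model.evaluate(x=x_test, y=y_test, batch_size=128)
print("inference took %.3f s" % (time.perf_counter() - start))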
Conclusions
1. Given sufficient data, the deep learning methods achieve higher accuracy than the statistical learning methods.
2. Under the current experimental setup, the CNN+MaxPooling model overfits (0.9906 train vs. 0.9742 test accuracy); a quick way to watch this during training is sketched below.
3. Under the current experimental setup, the DNN consistently outperforms CNN+MaxPooling.
4. The SVM with a built-in kernel (SVC) predicts better than the combination of an approximate kernel map and a linear SVM (kernel approximation + LinearSVC).
5. The SVM with a built-in kernel is far slower to train and run than the kernel approximation + LinearSVC combination, slower than the DNN, and slightly faster than the CNN.
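Conclusion 2 can be watched directly in the training logs by passing the held-out set as validation data, so Keras prints training and validation accuracy side by side after every epoch. A minimal sketch, reusing the model and data from the CNN+MaxPooling code below:

model.fit(
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=10,
    validation_data=(x_test, y_test)  # watch the gap between acc and val_acc widen
)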
Code
DNN
# coding=utf-8
from tensorflow import keras
from tensorflow.keras import Model, layers
from tensorflow.keras.utils import to_categorical
import numpy as np
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
# categorical labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
print(y_train.shape)
# Define and build the model
input_img = layers.Input(shape=(28 * 28,))
x = layers.Dense(28*28, activation='relu')(input_img)
x = layers.Dense(28*28, activation='sigmoid')(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input_img, x)
model.summary()
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['acc']
)
model.fit(
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=10
)
loss, metric = model.evaluate(x=x_test, y=y_test, batch_size=128)
print("cross entropy is %.4f, accuracy is %.4f" % (loss, metric))
CNN + MaxPooling
# coding=utf-8
from tensorflow import keras
from tensorflow.keras import Model, layers
from tensorflow.keras.utils import to_categorical
import numpy as np
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# normalize the data and add the channel dimension expected by Conv2D
x_train = x_train[..., np.newaxis] / 255.0
x_test = x_test[..., np.newaxis] / 255.0
# categorical labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
print(y_train.shape)
# Define and build the model
input_img = layers.Input(shape=(28, 28, 1))
# 784 filters with a linear activation, matching the parameter count reported above
x = layers.Conv2D(28 * 28, (3, 3))(input_img)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input_img, x)
model.summary()
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['acc']
)
model.fit(
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=10
)
loss, metric = model.evaluate(x=x_test, y=y_test, batch_size=128)
print("cross entropy is %.4f, accuracy is %.4f" % (loss, metric))
Kernel approximation + LinearSVM
# coding=utf-8
from tensorflow import keras
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
print(y_train.shape)
# Define and build the kernel mapping
x = np.concatenate((x_train, x_test))
print(x.shape)
# SVC is too slow in practice, so we split it into an
# approximate kernel map (sklearn.kernel_approximation.Nystroem)
# followed by a linear SVM (sklearn.svm.LinearSVC)
feature_map_nystroem = Nystroem(n_components=28*28)
feature_map_nystroem.fit(x)
x = feature_map_nystroem.transform(x)
x_train = x[:60000]
x_test = x[60000:]
print(x_train.shape)
print(x_test.shape)
cls = LinearSVC()
cls.fit(x_train, y_train)
y_pred = cls.predict(x_train)
ret = accuracy_score(y_train, y_pred)
print(ret)
y_pred = cls.predict(x_test)
ret = accuracy_score(y_test, y_pred)
print(ret)
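A note on the design: the feature map above is fitted on the concatenated train and test features (a transductive choice; no labels are used). A more conventional formulation fits the map on the training data only, which sklearn expresses naturally as a Pipeline. A minimal sketch, whose results may differ slightly from the transductive variant above:

from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Fit the approximate kernel map (RBF by default) on the training data only,
# then fit the linear SVM on the mapped features.
clf = make_pipeline(Nystroem(n_components=28 * 28), LinearSVC())
clf.fit(x_train, y_train)
print(clf.score(x_test, y_test))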
SVC
# coding=utf-8
from tensorflow import keras
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
print(y_train.shape)
cls = SVC()
cls.fit(x_train, y_train)
y_pred = cls.predict(x_train)
ret = accuracy_score(y_train, y_pred)
print(ret)
y_pred = cls.predict(x_test)
ret = accuracy_score(y_test, y_pred)
print(ret)
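SVC's long runtime in the efficiency table is expected: training a kernel SVM scales roughly quadratically to cubically in the number of samples, and prediction cost grows with the number of support vectors. After fitting, the latter can be inspected directly:

# Support vectors retained per class, and in total;
# kernel SVM prediction cost grows with this total.
print(cls.n_support_)
print(cls.n_support_.sum())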