
6. Forward and Backward Propagation: Understanding How Dropout Works



6. Dropout

Forward Pass


During training, inverted dropout keeps each activation with probability $p$ and scales the survivors by $1/p$, so the expected output matches the unscaled activation used at test time:
$$Y_{i,j}=\frac{1}{p}X_{i,j}M_{i,j}\tag{6.1}$$
where the binary mask $M$ is drawn elementwise:
$$M_{i,j}=\begin{cases}1 & \mathrm{rand}()<p\\0 & \mathrm{rand}()\ge p\end{cases}\tag{6.2}$$
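A minimal NumPy sketch of Eqs. (6.1)–(6.2), in the spirit of the cs231n Assignment 2 dropout layer (the function name, the `mode` flag, and the cache layout are illustrative assumptions, not the assignment's exact API):

```python
import numpy as np

def dropout_forward(X, p, mode='train'):
    """Inverted dropout forward pass (illustrative sketch).

    X: input activations; p: keep probability; mode: 'train' or 'test'.
    """
    if mode == 'train':
        M = np.random.rand(*X.shape) < p   # mask M_{i,j}, Eq. (6.2)
        Y = X * M / p                      # scale survivors by 1/p, Eq. (6.1)
    else:
        M = None                           # test time: the layer is the identity
        Y = X
    cache = (p, M, mode)                   # saved for the backward pass
    return Y, cache
```

Because of the $1/p$ scaling at training time, the test-time branch needs no rescaling at all, which is the whole point of the inverted formulation.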

Backward Pass

From Eq. (6.1), and since each $Y_{i,j}$ depends only on the corresponding $X_{i,j}$, the chain rule gives
$$\begin{aligned}\frac{\partial{L}}{\partial{X_{i,j}}}&=\frac{\partial{L}}{\partial{Y_{i,j}}}\frac{\partial{Y_{i,j}}}{\partial{X_{i,j}}}\\&=\frac{\partial{L}}{\partial{Y_{i,j}}}\frac{1}{p}M_{i,j}\end{aligned}\tag{6.3}$$
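The matching backward pass is then one line: the upstream gradient is masked and rescaled by the same $M$ and $1/p$ saved in the forward cache, exactly as in Eq. (6.3) (again a sketch with the assumed names from above):

```python
def dropout_backward(dY, cache):
    """Inverted dropout backward pass, Eq. (6.3):
    dL/dX_{i,j} = dL/dY_{i,j} * (1/p) * M_{i,j}.
    """
    p, M, mode = cache
    if mode == 'train':
        return dY * M / p   # gradient flows only through the kept units
    return dY               # identity at test time
```

Feeding all-ones as `dY` returns `M / p`, which is exactly $\partial Y/\partial X$, so the pair is easy to verify with a numerical gradient check.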
