Is applying dropout equivalent to zeroing the output of random neurons in each mini-batch iteration, while leaving the rest of the forward and backward steps of back-propagation unchanged? I'm implementing a network from scratch in NumPy.
- Yes, although to be super-duper-extra precise, Bernoulli dropout is the same as zeroing out random neurons (some people use other kinds of randomness and call them things like Gaussian dropout; see e.g. keras.io/api/layers/regularization_layers/gaussian_dropout) – Nathan Wycoff, Nov 9, 2022
- @Qbik please see edits to my reply below. – hH1sG0n3, Nov 10, 2022
1 Answer
Indeed. To be precise, the dropout operation randomly zeroes elements of the input tensor, each with probability $p$, and during training the remaining (non-dropped) elements are scaled by a factor of $\frac{1}{1-p}$ so that the expected magnitude of the output matches that of the input.
For example, see how elements of the input tensor (top tensor in the output below) are zeroed in the output tensor (bottom tensor), and how the surviving elements are doubled since $\frac{1}{1-0.5} = 2$, using PyTorch.
```python
import torch
import torch.nn as nn

m = nn.Dropout(p=0.5)
input = torch.randn(3, 4)
output = m(input)
print(input, '\n', output)

>>> tensor([[-0.9698, -0.9397,  1.0711, -1.4557],
>>>         [-0.0249, -0.9614, -0.7848, -0.8345],
>>>         [ 0.9420,  0.6565,  0.4437, -0.2312]])
>>> tensor([[-0.0000, -0.0000,  2.1423, -0.0000],
>>>         [-0.0000, -0.0000, -1.5695, -1.6690],
>>>         [ 0.0000,  0.0000,  0.0000, -0.0000]])
```

EDIT: please note the post has been updated to reflect Todd Sewell's addition in the comments.
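Since the question is about a from-scratch NumPy implementation, here is a minimal sketch of "inverted" dropout, the scheme described above and used by PyTorch. The forward pass samples a Bernoulli mask and scales survivors by $\frac{1}{1-p}$; the backward pass multiplies the incoming gradient by the same mask, leaving the rest of back-propagation unchanged. The function names and the seeded generator are illustrative choices, not from the original post.

```python
import numpy as np

def dropout_forward(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each element with probability p and scale
    the survivors by 1/(1-p) so the expected output matches the input."""
    if not training or p == 0.0:
        # At inference time dropout is the identity; no mask is needed.
        return x, None
    # mask is 0 for dropped elements and 1/(1-p) for kept elements.
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask, mask

def dropout_backward(dout, mask):
    """Gradient flows only through the kept units, scaled by the same
    factor applied in the forward pass."""
    if mask is None:
        return dout
    return dout * mask

x = np.ones((3, 4))
out, mask = dropout_forward(x, p=0.5)
# Dropped positions of `out` are exactly 0; kept ones are 1/(1-0.5) = 2.
grad = dropout_backward(np.ones_like(x), mask)
```

The mask must be cached during the forward pass and reused in the backward pass of the same mini-batch; a fresh mask is drawn for the next batch.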
- Note that the non-dropped elements are scaled by $\frac{1}{1-p}$ to compensate for the shift in average magnitude, so it's not just zeroing out some elements. – KarelPeeters, Nov 9, 2022
- That is very true; I omitted that detail from the original PyTorch docs for simplicity, but that made the post only half correct. Amended now to reflect your point. – hH1sG0n3, Nov 9, 2022