
Let's say I have a mini-batch. I take one example from it and do the following:

  1. I do forward propagation.
  2. Using the output of the forward propagation, I calculate the gradients of the parameters.

Then I take the next example from the mini-batch and repeat steps 1 and 2, adding the resulting parameter gradients to the gradients I got earlier. After doing this for every element of the mini-batch, I divide the accumulated sum by the batch size to get the average gradient and use it to update the weights. Am I correct?
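The procedure described here can be sketched in a few lines of NumPy. This is a minimal sketch assuming a linear model with squared-error loss (the model and loss are my assumptions, not from the question); the point is only the per-example accumulate-then-average structure:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))          # mini-batch of 8 examples, 3 features
y = rng.normal(size=8)               # targets
w = np.zeros(3)                      # model parameters
lr = 0.1                             # learning rate

grad_sum = np.zeros_like(w)
for x_i, y_i in zip(X, y):
    pred = x_i @ w                   # step 1: forward propagation for one example
    grad = 2 * (pred - y_i) * x_i    # step 2: gradient of squared error w.r.t. w
    grad_sum += grad                 # accumulate with the gradients from earlier examples

w -= lr * grad_sum / len(X)          # update using the averaged gradient
```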

  • No, you don't compute example by example. All examples of the mini-batch are computed together, in parallel. Commented May 23, 2023 at 20:35

1 Answer


Your description looks conceptually correct to me. See Hugh's answer to "How does minibatch gradient descent update the weights for each example in a batch?" on Cross Validated for a detailed explanation.

However, as per @noe's comment, in practice mini-batches are not implemented by processing the examples one at a time. To speed up processing, most deep learning frameworks express the computation as matrix or tensor operations and process the entire mini-batch in one pass.
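A quick way to see that the two views agree is to compute the averaged gradient both ways, example by example and in one vectorized pass. This sketch assumes a linear model with squared-error loss (my choice for illustration, not from the post):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(16, 4))   # mini-batch of 16 examples, 4 features
y = rng.normal(size=16)
w = rng.normal(size=4)

# Per-example loop: accumulate each example's gradient, then average
loop_grad = sum(2 * (x @ w - t) * x for x, t in zip(X, y)) / len(X)

# One vectorized pass over the whole mini-batch, as frameworks do it
batch_grad = 2 * X.T @ (X @ w - y) / len(X)

assert np.allclose(loop_grad, batch_grad)
```

The equivalence holds because the sum of per-example gradients 2(x_i·w - y_i)x_i is exactly the matrix product 2 X^T (Xw - y); the vectorized form just lets the hardware do all examples at once.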

