I have the following CNN:
[![network layout][1]][1]
1. I start with an input image of size 5x5.
2. Then I apply convolution using a 2x2 kernel with stride = 1, which produces a feature map of size 4x4.
3. Then I apply 2x2 max-pooling with stride = 2, which reduces the feature map to size 2x2.
4. Then I apply a logistic sigmoid.
5. Then one fully connected layer with 2 neurons.
6. And an output layer.
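As a sanity check on the layer sizes listed above, here is a minimal sketch of the size arithmetic. It assumes no padding ("valid" windows); the helper name `out_size` is made up for illustration.

```python
def out_size(n, k, stride):
    """Spatial size after sliding a k x k window over an n x n input (no padding)."""
    return (n - k) // stride + 1

conv_out = out_size(5, 2, 1)         # 5x5 input, 2x2 kernel, stride 1
pool_out = out_size(conv_out, 2, 2)  # 2x2 max-pooling, stride 2

print(conv_out, pool_out)  # 4 2
```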
For simplicity, let's assume I have already completed the forward pass and computed the deltas **δH1 = 0.25** and **δH2 = -0.15**.
So after the complete forward pass and the partially completed backward pass, my network looks like this:
[![network after forward pass][2]][2]
Then I compute the deltas for the non-linear layer (logistic sigmoid):
$$
\begin{align}
&\delta_{11}=(0.25 \cdot 0.61 + (-0.15) \cdot 0.02) \cdot 0.58 \cdot (1 - 0.58) = 0.0364182\\
&\delta_{12}=(0.25 \cdot 0.82 + (-0.15) \cdot (-0.50)) \cdot 0.57 \cdot (1 - 0.57) = 0.068628\\
&\delta_{21}=(0.25 \cdot 0.96 + (-0.15) \cdot 0.23) \cdot 0.65 \cdot (1 - 0.65) = 0.04675125\\
&\delta_{22}=(0.25 \cdot (-1.00) + (-0.15) \cdot 0.17) \cdot 0.55 \cdot (1 - 0.55) = -0.06818625\\
\end{align}
$$
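The four deltas above can be reproduced with a short script. This is only a sketch of my own arithmetic: it assumes that δH1 and δH2 belong to the two neurons of the fully connected layer, and that each sigmoid output feeds them through the two weights used in the equations. The names `units` and `sigmoid_delta` are made up for illustration.

```python
# Deltas of the two fully connected neurons (given above).
delta_h = [0.25, -0.15]

# For each sigmoid output: (weight to H1, weight to H2, sigmoid activation),
# taken from the equations above.
units = {
    "11": (0.61, 0.02, 0.58),
    "12": (0.82, -0.50, 0.57),
    "21": (0.96, 0.23, 0.65),
    "22": (-1.00, 0.17, 0.55),
}

def sigmoid_delta(weights, delta_next, a):
    """Weighted sum of downstream deltas times the sigmoid derivative a * (1 - a)."""
    upstream = sum(w * d for w, d in zip(weights, delta_next))
    return upstream * a * (1.0 - a)

deltas = {
    name: sigmoid_delta((w1, w2), delta_h, a)
    for name, (w1, w2, a) in units.items()
}

for name, value in deltas.items():
    print(f"delta_{name} = {value:.8f}")
```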
Then I propagate the deltas to the 4x4 layer, setting every value that was filtered out by max-pooling to 0, and the gradient map looks like this:
[![gradient map after max-pooling][3]][3]
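In code, the routing step I am describing would look roughly like the sketch below. The 4x4 `feature_map` values are hypothetical placeholders (the real ones are in the image), so the positions of the non-zero entries are illustrative only. The function name `maxpool_backward` is also made up.

```python
def maxpool_backward(feature_map, pooled_grad, size=2, stride=2):
    """Route each pooled gradient to the position of the max in its window.

    All other positions keep a gradient of 0.
    """
    n = len(feature_map)
    grad = [[0.0] * n for _ in range(n)]
    for pi, row in enumerate(pooled_grad):
        for pj, g in enumerate(row):
            # Locate the maximum inside the pooling window.
            best = None
            for di in range(size):
                for dj in range(size):
                    i, j = pi * stride + di, pj * stride + dj
                    if best is None or feature_map[i][j] > feature_map[best[0]][best[1]]:
                        best = (i, j)
            grad[best[0]][best[1]] += g
    return grad

# Hypothetical 4x4 convolution output (NOT the values from my network).
feature_map = [
    [0.1, 0.3, 0.2, 0.0],
    [0.0, 0.2, 0.3, 0.1],
    [0.6, 0.1, 0.0, 0.2],
    [0.2, 0.4, 0.1, 0.0],
]

# The 2x2 deltas computed for the sigmoid layer.
pooled_grad = [
    [0.0364182, 0.068628],
    [0.04675125, -0.06818625],
]

grad = maxpool_backward(feature_map, pooled_grad)
for row in grad:
    print(row)
```

With these placeholder values, only four of the sixteen entries are non-zero, one per pooling window.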
How do I update the kernel weights from there? And if my network had another convolutional layer before the 5x5 layer, what values should I use to update its kernel weights? And overall, is my calculation correct?
[1]: https://i.sstatic.net/MehcI.png
[2]: https://i.sstatic.net/wE1He.png
[3]: https://i.sstatic.net/aaEQ9.png