Let $g: \mathbb{R}^{N_\ell \times N_{\ell-1}} \to \mathbb{R}^{N_\ell}$, $\;g(W) = Wa$,
be a function that takes a matrix as its argument and multiplies it by a fixed vector $a \in \mathbb{R}^{N_{\ell-1}}$.
Let $h: \mathbb{R}^{N_\ell} \to \mathbb{R}$ be a differentiable function.
I want to differentiate the composition $h \circ g$ with respect to the matrix $W$, so I differentiate with respect to each of its components. I want to use the total derivative of $h$, and my intuition says that
$\dfrac{\partial}{\partial W_{j,i}}(h \circ g) = Dh \, \dfrac{\partial g}{\partial W_{j,i}}$, where $Dh$ is the total derivative of $h$ (evaluated at $g(W)$) and $\dfrac{\partial g}{\partial W_{j,i}}$ is the partial derivative of $g$ with respect to the entry of $W$ in row $j$, column $i$.
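As a numerical sanity check (not a proof), here is a minimal Python sketch that compares the right-hand side above with a central finite difference of $h \circ g$ in a single entry of $W$. The choice $h(y) = \sum_j \sin(y_j)$, the test values of $W$ and $a$, and all function names are assumptions made purely for illustration. It uses the fact that $\partial g / \partial W_{j,i} = a_i e_j$, where $e_j$ is the $j$-th standard basis vector of $\mathbb{R}^{N_\ell}$.

```python
import math

def g(W, a):
    """g(W) = W a, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, a)) for row in W]

def h(y):
    """Hypothetical differentiable h: R^{N_l} -> R, chosen only for illustration."""
    return sum(math.sin(y_j) for y_j in y)

def grad_h(y):
    """Dh(y) as a row vector: the j-th entry is cos(y_j) for this choice of h."""
    return [math.cos(y_j) for y_j in y]

def chain_rule_partial(W, a, j, i):
    """Dh(g(W)) applied to dg/dW_{j,i} = a_i * e_j."""
    dh = grad_h(g(W, a))
    dg = [a[i] if r == j else 0.0 for r in range(len(W))]
    return sum(p * q for p, q in zip(dh, dg))

def finite_difference_partial(W, a, j, i, eps=1e-6):
    """Central difference of h(g(W)) in the single entry W[j][i]."""
    W_plus = [row[:] for row in W]
    W_minus = [row[:] for row in W]
    W_plus[j][i] += eps
    W_minus[j][i] -= eps
    return (h(g(W_plus, a)) - h(g(W_minus, a))) / (2 * eps)

# Arbitrary test data with N_l = 2 and N_{l-1} = 3.
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
a = [1.0, -2.0, 0.5]

# Largest disagreement between the two computations over all entries of W.
max_gap = max(
    abs(chain_rule_partial(W, a, j, i) - finite_difference_partial(W, a, j, i))
    for j in range(len(W))
    for i in range(len(a))
)
print("max gap:", max_gap)
```

For this particular $h$, the two values agree up to finite-difference error, which is consistent with the formula but of course does not explain why it holds.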
My questions are: is my intuition correct? If so, why? (I'm familiar with the chain rule for total derivatives, but I've never seen it mixed with partial derivatives.)