I am training a deep learning model, the loss function of which is of the form
$$ \mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2 $$
where $\mathcal{L}_1$ and $\mathcal{L}_2$ are of very different orders of magnitude. Without loss of generality, assume $\mathcal{L}_1$ is much larger than $\mathcal{L}_2$.
During the first several epochs of training, the model will mostly be minimizing $\mathcal{L}_1$. After a certain number of epochs, however, $\mathcal{L}_1$ converges.
My question is, what will happen now? Specifically, I have three questions:
Does the convergence of $\mathcal{L}_1$ imply the convergence of $\mathcal{L}$, meaning that training is effectively over and the loss function behaves as if it were essentially $\mathcal{L} = \mathcal{L}_1$?
Since $\mathcal{L}_1$ has now converged, does that imply $\frac{\partial \mathcal{L}_1}{\partial \theta} \approx 0$, where $\theta$ denotes the model parameters?
If the above point is true, then since the model parameters are updated based on $\frac{\partial \mathcal{L}}{\partial \theta}$, does that imply that the model will now start minimizing $\mathcal{L}_2$ (since $\frac{\partial \mathcal{L}}{\partial \theta} \approx \frac{\partial \mathcal{L}_2}{\partial \theta}$)?
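To make the intuition behind question 3 concrete, here is a minimal sketch of the scenario I have in mind, using toy quadratic losses with hypothetical scale factors and plain NumPy gradient descent rather than my actual model:

```python
import numpy as np

# Toy setup (hypothetical scales): theta is a 2-vector, and the two loss
# terms act on different coordinates with very different magnitudes:
#   L1(theta) = 1000 * (theta[0] - 1)^2   (the "large" loss)
#   L2(theta) = 0.01 * (theta[1] - 3)^2   (the "small" loss)
def grad_L1(theta):
    return np.array([2000.0 * (theta[0] - 1.0), 0.0])

def grad_L2(theta):
    return np.array([0.0, 0.02 * (theta[1] - 3.0)])

theta = np.zeros(2)
lr = 1e-4  # small enough that the update on the L1 coordinate is stable

for step in range(20_000):
    g1, g2 = grad_L1(theta), grad_L2(theta)
    theta -= lr * (g1 + g2)

# Early on, ||grad_L1|| (about 2000) dwarfs ||grad_L2|| (about 0.06), so the
# updates are dominated by L1. Once L1 has converged, its gradient is near
# zero and the total gradient is essentially the gradient of L2 alone.
print(np.linalg.norm(grad_L1(theta)))  # tiny: L1 has converged
print(np.linalg.norm(grad_L2(theta)))  # still non-negligible: L2 drives updates
```

In this toy case the answer to question 3 is yes: after $\mathcal{L}_1$ converges, the update direction is dominated by $\frac{\partial \mathcal{L}_2}{\partial \theta}$, just very slowly relative to the earlier phase.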