I am training a CNN to regress four targets derived from a given image. Each image contains a point of interest whose position is defined by $\phi$ and $\theta$ (corresponding to x and y on a normal Cartesian axis). The targets for my model are $\sin(\phi)$, $\cos(\phi)$, $\sin(\theta)$, and $\cos(\theta)$. I use mean squared error (L2 loss) as the loss function for both $\phi$ and $\theta$.
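To make the setup concrete, here is a minimal NumPy sketch of the target encoding and the unweighted loss. It assumes the angles are in radians; the function names `encode_targets`, `mse_loss`, and `decode_angles` are hypothetical and only for illustration, not taken from my actual training code.

```python
import numpy as np

def encode_targets(phi, theta):
    """Stack sin/cos of both angles into an (N, 4) target array.

    Column order: sin(phi), cos(phi), sin(theta), cos(theta).
    """
    phi = np.asarray(phi, dtype=np.float64)
    theta = np.asarray(theta, dtype=np.float64)
    return np.stack(
        [np.sin(phi), np.cos(phi), np.sin(theta), np.cos(theta)], axis=-1
    )

def mse_loss(pred, target):
    """Unweighted mean squared error over all samples and all 4 targets."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return np.mean((pred - target) ** 2)

def decode_angles(encoded):
    """Recover (phi, theta) in radians from the sin/cos pairs via arctan2."""
    encoded = np.asarray(encoded, dtype=np.float64)
    phi = np.arctan2(encoded[..., 0], encoded[..., 1])
    theta = np.arctan2(encoded[..., 2], encoded[..., 3])
    return phi, theta
```

Every image contributes equally to `mse_loss` here, which is exactly the behaviour I would like to change.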
The issue I have is that not every image is equally important. Some images are far more likely to be encountered than others. These probability scores range from $1$ down to $10^{-100}$.
My question is: how would one incorporate these probabilities as weights in the loss function so as to make the model better?
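For reference, the most direct form of weighting I can think of is a per-sample weighted MSE, sketched below in NumPy. This is only one possible approach under my own assumptions, not a known-good answer: the name `weighted_mse` is hypothetical, the weights are taken to be the raw probability scores, and they are normalised by their sum. The sketch keeps everything in 64-bit floats because a value like `1e-100` is below the range of 32-bit floats and would become zero there.

```python
import numpy as np

def weighted_mse(pred, target, weights):
    """Per-sample weighted MSE, normalised by the sum of the weights.

    pred, target: arrays of shape (N, 4)
    weights:      array of shape (N,), assumed non-negative with a positive sum
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)

    # Squared error averaged over the 4 targets: one value per image.
    per_sample = np.mean((pred - target) ** 2, axis=-1)

    # Weighted average over images; dividing by sum(w) keeps the loss
    # on the same scale as the unweighted MSE.
    return np.sum(w * per_sample) / np.sum(w)
```

With equal weights this reduces to the ordinary MSE. With weights spanning 100 orders of magnitude, the low-probability images contribute essentially nothing to the loss, and I am unsure whether that is the desired behaviour.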