I am studying how the distances between embeddings evolve during training of a language model.
One way to describe this "evolution" is that the k-nearest neighbours of a particular embedding may change after some training, and eventually "converge."
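The k-nearest-neighbour view can be made concrete by tracking how much each embedding's neighbour set overlaps between two checkpoints. A minimal sketch, assuming NumPy and plain Euclidean distances (function names are my own, not from any library):

```python
import numpy as np

def knn_sets(emb, k):
    """Indices of the k nearest neighbours of each row of emb (Euclidean)."""
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighbour set
    return np.argsort(d, axis=1)[:, :k]

def mean_knn_overlap(emb_a, emb_b, k=10):
    """Mean Jaccard overlap of each point's k-NN set across two checkpoints."""
    na, nb = knn_sets(emb_a, k), knn_sets(emb_b, k)
    overlaps = [len(set(a) & set(b)) / len(set(a) | set(b))
                for a, b in zip(na, nb)]
    return float(np.mean(overlaps))
```

An overlap near 1 between consecutive checkpoints would indicate that the local neighbourhood structure has stabilized.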
Problem
I want to compute some metric of similarity between distance matrices computed at different training steps. For example, I have a distance matrix at the beginning of training and another after some iterations, and I want to quantify how similar they are.
What I’ve tried/considered
- Converting the upper-triangular part of each distance matrix into a vector and computing correlations (Pearson or Spearman) between training steps.
- Looking at averages and standard deviations of the distances over time, though this feels too coarse.
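The first idea can be sketched as follows; this is a minimal example assuming NumPy/SciPy, with hypothetical function names:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def upper_tri(D):
    """Flatten the strict upper triangle of a square distance matrix."""
    i, j = np.triu_indices_from(D, k=1)
    return D[i, j]

def matrix_correlation(D1, D2):
    """Pearson and Spearman correlation between two distance matrices."""
    v1, v2 = upper_tri(D1), upper_tri(D2)
    return pearsonr(v1, v2)[0], spearmanr(v1, v2)[0]
```

One caveat with this approach: the entries of a distance matrix are not independent (each point contributes to many pairs), so the usual significance tests for these correlations do not apply directly.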
Question
Are there established metrics or approaches for comparing distance (or similarity) matrices in this way?
I am looking for methods that can capture whether the overall structure of the embedding space is stabilizing or converging during training.