I have a set of data associated with ~60 individuals. For each individual, I have sequence data for a number of different genes. I have performed clustering analysis (using affinity propagation) for each gene, based on the number of pairwise differences between sequences.
This means that for each gene, I have a number of clusters to which each individual is assigned. However, the cluster membership may be completely different for different genes.
My question is:
How do I assess how well conserved the clustering is between genes? That is, is there some metric or statistic that will give me a measure of whether the cluster grouping is conserved between different genes?
To put it slightly differently, suppose Alice and Bob belong to the same cluster for Genes 1, 4 and 5, but to different clusters for Genes 2 and 3. How can I determine whether this level of agreement is what would be expected if the gene sequences were independent of each other? And if it is not, is there a metric that gives the "strength" of such a relationship (being in the same cluster across multiple genes)?
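To make the "expected under independence" part concrete, here is a rough sketch of the kind of check I have in mind. It is not an established method. It assumes the cluster labels are stored as one row per gene and one column per individual, and that shuffling individuals independently within each gene is a reasonable stand-in for "genes are independent". The function names, the toy data and the choice of statistic (the variance, across pairs of individuals, of the number of genes in which the pair shares a cluster) are all my own placeholders.

```python
import numpy as np

def pair_counts(labels_by_gene):
    """For every pair of individuals, count the genes in which they share a cluster."""
    labels = np.asarray(labels_by_gene)               # shape (n_genes, n_individuals)
    same = labels[:, :, None] == labels[:, None, :]   # shape (n_genes, n, n)
    counts = same.sum(axis=0)                         # shape (n, n)
    i, j = np.triu_indices(labels.shape[1], k=1)      # each unordered pair once
    return counts[i, j]

def permutation_test(labels_by_gene, n_perm=999, seed=0):
    """Compare the observed spread of pair counts with a shuffled-label null.

    Assumption: permuting individuals independently within each gene keeps
    the cluster sizes but breaks any association between genes.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels_by_gene)
    observed = pair_counts(labels).var()
    null = np.empty(n_perm)
    for k in range(n_perm):
        shuffled = np.array([rng.permutation(row) for row in labels])
        null[k] = pair_counts(shuffled).var()
    # One-sided p-value: how often the null is at least as spread out as the data.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value

# Toy data (made up): 12 individuals, 5 genes, identical grouping in every gene.
toy = np.tile(np.repeat([0, 1, 2], 4), (5, 1))
stat, p = permutation_test(toy)
print(stat, p)
```

If the grouping is conserved, some pairs co-cluster in nearly every gene and others in almost none, so the variance should be larger than in the shuffled data. Whether this is a sensible statistic is part of what I am asking.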
I'm imagining that I will need to assess the correlation between a set of matrices describing the clustering for each gene, but I am unsure if there is a standard approach for this type of problem.
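As a minimal sketch of what I mean by "matrices describing the clustering": for each gene, build a binary co-membership matrix (entry 1 if two individuals share a cluster, 0 otherwise) and correlate the upper triangles between genes. The labels below are invented, the function names are my own, and I do not know whether a plain Pearson correlation of these vectors is statistically appropriate, since the entries are not independent of one another.

```python
import numpy as np

def comembership(labels):
    """Binary matrix: 1 where two individuals are in the same cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(int)

def upper_triangle(mat):
    """Flatten the strict upper triangle so each pair appears once."""
    i, j = np.triu_indices(mat.shape[0], k=1)
    return mat[i, j]

def comembership_correlation(label_sets):
    """Gene-by-gene Pearson correlation of co-membership vectors.

    Note: a gene with a single cluster (or all singletons) gives a
    constant vector, for which the correlation is undefined (NaN).
    """
    vectors = np.array([upper_triangle(comembership(l)) for l in label_sets])
    return np.corrcoef(vectors)

# Invented labels for 6 individuals and 3 genes.
gene1 = [0, 0, 0, 1, 1, 1]
gene2 = [1, 1, 1, 0, 0, 0]   # same grouping as gene1, different label names
gene3 = [0, 1, 0, 1, 0, 1]   # a different grouping
corr = comembership_correlation([gene1, gene2, gene3])
print(np.round(corr, 3))
```

Working with co-membership rather than the raw labels sidesteps the fact that cluster numbers are arbitrary and need not match between genes.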
Note: I am not necessarily looking for a complete solution, but rather some pointers in the right direction. I have struggled to turn up anything useful in the usual Google searches.