I am working on the following "equality identification" problem and have become quite confused about how to reasonably define false positives and false negatives in my case.
Problem:
Suppose I have a large set of graphs $G$, where for every graph $g \in G$ there is exactly one other graph $g' \in G$ that is identical to $g$. I am training a model $M$ which, given two graphs $g, t \in G$, decides how similar these two graphs are. The output of $M$, $o = M(g, t)$, is a similarity score for the two graphs. $o$ is a floating-point number in the range $[0, 1]$.
To evaluate the performance of $M$, I am measuring the top-$k$ accuracy of its predictions in the following very standard way:

    for g \in G:
        for t \in G:  // suppose here t != g, so the comparison is meaningful
            accuracy[(g, t)] = M(g, t)

    correct_prediction = 0
    total_prediction = 0
    for g \in G:
        total_prediction += 1
        sort_accuracy_array_on_graph(g)  // sort all comparisons of g with the other graphs \in G
        if g' in top-k most similar graphs to g:  // g' is the ground truth, the "identical" graph of g
            correct_prediction += 1

    correct_prediction / total_prediction  // the average top-k accuracy

OK, so this seems to be a very standard approach to computing the top-$k$ accuracy of $M$. However, I now want to go one step further and measure the false positives and false negatives of $M$. I understand that FP and FN are both very standard metrics in data mining, but I have become confused about how to define them in my problem. For instance, I can define a false negative in the following way:
False Negative of $M$: conceptually, an FN means that $g'$, the identical graph of $g$, does not appear among the top-$k$ most similar graphs of $g$. Then perhaps FN is interchangeable with top-$k$ accuracy in my case?
False Positive of $M$: but if we accept that FN is interchangeable with top-$k$ accuracy in my case, then how should I define the FP of $M$? An FP means that $M$ aggressively treats a graph $t$, where $t \neq g'$, as the "similar" graph of $g$, so that $t$ appears in its top-$k$ most similar set. Doesn't that indicate that FP is ALSO interchangeable with top-$k$ accuracy in my case? That seems suspicious, because in my understanding FP should not be equal to FN. There must be something wrong.
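
To make the two candidate definitions concrete, here is a minimal Python sketch of how I would count them. It assumes that every graph placed in the top-$k$ list counts as a positive prediction, which is only one possible reading of FP. The names `top_k_counts`, `twin`, and `toy_M`, as well as the toy scores, are made up for illustration and are not my actual model or data.

```python
from typing import Callable, Dict, List


def top_k_counts(
    graphs: List[str],
    twin: Dict[str, str],
    M: Callable[[str, str], float],
    k: int,
) -> Dict[str, float]:
    """Count TP/FN/FP, treating every top-k entry as a positive prediction."""
    tp = fn = fp = 0
    for g in graphs:
        # Rank every other graph by its similarity to g, highest first.
        candidates = [t for t in graphs if t != g]
        ranked = sorted(candidates, key=lambda t: M(g, t), reverse=True)
        top_k = ranked[:k]

        hit = twin[g] in top_k       # is the ground-truth g' in the top-k list?
        tp += int(hit)
        fn += int(not hit)           # g' was missed
        fp += len(top_k) - int(hit)  # every top-k entry that is not g'
    return {"top_k_accuracy": tp / len(graphs), "tp": tp, "fn": fn, "fp": fp}


# Hypothetical toy data: four graphs forming two identical pairs.
graphs = ["a1", "a2", "b1", "b2"]
twin = {"a1": "a2", "a2": "a1", "b1": "b2", "b2": "b1"}

# Hypothetical symmetric similarity scores standing in for the trained model M.
scores = {
    frozenset({"a1", "a2"}): 0.9,
    frozenset({"b1", "b2"}): 0.4,
    frozenset({"a1", "b1"}): 0.5,
    frozenset({"a1", "b2"}): 0.2,
    frozenset({"a2", "b1"}): 0.3,
    frozenset({"a2", "b2"}): 0.1,
}


def toy_M(g: str, t: str) -> float:
    return scores[frozenset({g, t})]


print(top_k_counts(graphs, twin, toy_M, k=1))
print(top_k_counts(graphs, twin, toy_M, k=2))
```

Under this reading, with $n$ query graphs, FN is $n - \text{TP}$ and FP is $n \cdot k - \text{TP}$, so both counts are fully determined by the top-$k$ accuracy, and for $k = 1$ they coincide. That is exactly what makes me suspicious of these definitions.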
Have I explained my confusion clearly? Any suggestions would be appreciated. Thank you!