I am trying to predict wins/losses of tennis matches by predicting a win probability for each match, and I am currently deciding which evaluation measures to use.
Besides overall evaluation measures like the Brier score, I look at model calibration and discriminative ability separately. I am unsure which metrics are appropriate for assessing model discrimination specifically.
I've read that the AUROC is often used for assessing model discrimination. However, it feels inappropriate in my application, because it doesn't make sense to consider thresholds other than 0.5. Measures like precision/recall/F1-score also seem inappropriate, because my classes are balanced (each match is either a win or a loss, so both occur 50% of the time, of course) and false positives are of similar importance as false negatives.
Therefore, I think simply using prediction accuracy (the fraction of correctly predicted wins/losses) is a good metric for assessing model discrimination. Is my thought process correct? Am I missing something? Are there any drawbacks to using accuracy in this application?
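For concreteness, here is how I compute the two metrics I'm comparing. This is just a toy sketch with made-up arrays (`y_true`, `p_pred` are hypothetical example data, not my actual model output), using plain NumPy:

```python
import numpy as np

# Made-up example data: 1 = win, 0 = loss, and predicted win probabilities.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
p_pred = np.array([0.9, 0.2, 0.6, 0.4, 0.3, 0.55, 0.8, 0.1])

# Brier score: mean squared error of the predicted probabilities
# against the 0/1 outcomes (lower is better).
brier = np.mean((p_pred - y_true) ** 2)

# Accuracy: fraction of correct win/loss calls at the 0.5 threshold.
accuracy = np.mean((p_pred > 0.5).astype(int) == y_true)

print(f"Brier score: {brier:.4f}")
print(f"Accuracy:    {accuracy:.4f}")
```

Note that accuracy throws away the probability information (a 0.51 and a 0.99 prediction count the same), which is part of what I'm uncertain about.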