$\begingroup$

I’m playing a chess game and using 365chess.com to help with my moves (this is totally legal, by the way, as long as engine analysis is turned off). It says move A leads to a win for white* 36% of the time in 405 games, and move B leads to a win for white 44.1% of the time in 34 games. Obviously, that is a much smaller sample size.

* Technically, this move led to white winning 36% of the time from this position in games between players rated over 1600 Elo, or something like that.

How confident can I be (if at all) that move B is better than move A?

$\endgroup$
  • $\begingroup$ In other words, you want to test the null hypothesis that both moves have the same win percentage against the alternative hypothesis that move B has a higher win percentage? $\endgroup$ Commented Jan 19, 2022 at 14:36
  • $\begingroup$ See the thread linked above; you need either a $z$-test or Fisher's exact test. $\endgroup$ Commented Jan 19, 2022 at 15:05
  • $\begingroup$ I don't think this question should be closed. It's a good example of why one should not just look at proportions... $\endgroup$ Commented Jan 19, 2022 at 15:07

1 Answer

$\begingroup$

We can do this pretty simply with a t-test.

Here's the code and output for a quick implementation in R:

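    # This creates the data: a vector of 0's and 1's, with 1 corresponding to a win and 0 corresponding to a loss.
    # Note: rep() truncates fractional counts, so MoveA ends up with 145 + 259 = 404 entries
    # and MoveB with 14 + 19 = 33, which is why the sample means in the output are 0.359 and 0.424.
    MoveA <- c(rep(1, .36 * 405), rep(0, (1 - .36) * 405))
    MoveB <- c(rep(1, .441 * 34), rep(0, (1 - .441) * 34))

    # And then we just plug it into the t-test function.
    t.test(x = MoveA, y = MoveB)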

And the output:

        Welch Two Sample t-test

    data:  MoveA and MoveB
    t = -0.72129, df = 36.95, p-value = 0.4753
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.2488654  0.1182023
    sample estimates:
    mean of x mean of y
    0.3589109 0.4242424

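So what is this saying? The 95% confidence interval for the difference in win rates (MoveA minus MoveB) runs from about -25 to +12 percentage points, so given the data MoveA could plausibly be anywhere from 25 points worse to 12 points better than MoveB. The p-value of 0.48 tells the same story. So... the sample size for MoveB isn't really big enough to draw any good conclusions.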
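As the comments suggest, the same comparison can be run directly on the win counts with a two-proportion $z$-test or Fisher's exact test, without building vectors of 0's and 1's. The sketch below assumes the counts implied by the reported percentages (about 146 wins in 405 games for MoveA and 15 wins in 34 games for MoveB) and treats every non-win as a single category; the variable names are illustrative only.

```r
# Win counts reconstructed from the reported percentages (an assumption:
# 36% of 405 is about 146 wins, 44.1% of 34 is about 15 wins).
wins  <- c(MoveB = 15, MoveA = 146)
games <- c(MoveB = 34, MoveA = 405)

# Two-proportion z-test (prop.test reports the squared z statistic as X-squared).
# alternative = "greater" tests whether the first group (MoveB) has the higher win rate.
z_test <- prop.test(x = wins, n = games, alternative = "greater", correct = FALSE)

# Fisher's exact test on the 2x2 table of wins and non-wins.
# With MoveB in the first column, an odds ratio above 1 favours MoveB.
tab <- rbind(win = wins, no_win = games - wins)
exact_test <- fisher.test(tab, alternative = "greater")

print(z_test$p.value)
print(exact_test$p.value)
```

Both tests are one-sided here because the question asks specifically whether MoveB is better; dropping the `alternative` argument gives the two-sided versions.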

It looks like you'd need between 200 and 250 games with MoveB for a difference of this size to show up as statistically significant, assuming the observed win rates hold. So, just play another 170 to 215 or so games with MoveB and then you'll know!
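One way to arrive at a figure in that range is to ask how many MoveB games it would take for the observed gap to cross the 5% significance threshold. This is only a rough sketch and not necessarily how the estimate above was obtained: it assumes the win rates stay at exactly 36% and 44.1%, that MoveA stays at 405 games, and that an unpooled two-sided $z$-test is used.

```r
# Assumptions: win rates stay at the reported 36% (MoveA) and 44.1% (MoveB),
# MoveA keeps its 405 games, and the test is an unpooled two-sided z-test at the 5% level.
p_a <- 0.36
p_b <- 0.441
n_a <- 405
z_crit <- qnorm(0.975)

# Significance requires
#   |p_b - p_a| / sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) >= z_crit.
# Solving that inequality for n_b:
target_var <- ((p_b - p_a) / z_crit)^2
n_b_needed <- ceiling(p_b * (1 - p_b) / (target_var - p_a * (1 - p_a) / n_a))

print(n_b_needed)
```

Note that this is the sample size at which the observed difference would just reach significance, not a power calculation in the formal sense; requiring a high probability of detecting a true difference of this size would generally call for more games.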

$\endgroup$
  • $\begingroup$ You don't need to generate the zeroes-and-ones data to compute the test. Moreover, the $t$-test is not appropriate here; a $z$-test should be used instead, because in the binomial distribution the variance depends on the mean, so you don't need to estimate it separately as the $t$-test does. $\endgroup$ Commented Jan 19, 2022 at 15:07
