We can do this pretty simply with a t-test.
Here's the code and output for a quick implementation in R:
# This creates the data: a vector of 0's and 1's, with 1 corresponding
# to a win and 0 corresponding to a loss.
MoveA <- c(rep(1, .36*405), rep(0, (1-.36)*405))
MoveB <- c(rep(1, .441*34), rep(0, (1-.441)*34))

# And then we just plug it into the t-test function
t.test(x = MoveA, y = MoveB)
And the output:

Welch Two Sample t-test

data:  MoveA and MoveB
t = -0.72129, df = 36.95, p-value = 0.4753
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2488654  0.1182023
sample estimates:
mean of x mean of y 
0.3589109 0.4242424 
So what is this saying? Given the data, the 95% confidence interval says MoveA's win rate could be anywhere from about 25 percentage points worse to 12 points better than MoveB's. In other words, the sample size for MoveB just isn't big enough to draw any solid conclusion.
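Since each game is just a win or a loss, a two-sample proportion test is arguably a more natural fit than a t-test here, though with these numbers the two give very similar answers. A sketch, using the same win counts the vectors above actually contain (rep() truncates .36*405 and .441*34 to whole numbers, so MoveA ends up with 145 wins in 404 games and MoveB with 14 wins in 33):

```r
# Same comparison as the t-test, but treating the data as
# binomial counts: 145/404 wins vs. 14/33 wins
prop.test(x = c(145, 14), n = c(404, 33))
```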
It looks like you'd need between 200 and 250 games with MoveB to have the statistical power to actually see a significant difference. So, play about 200 more games with MoveB and then you'll know!
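If you want to sanity-check that ballpark yourself: assuming the win rates stayed at 36% and 44.1%, you can solve for the smallest MoveB sample size where the 95% confidence interval for the difference would just exclude zero. This is only a rough back-of-the-envelope estimate (it corresponds to about 50% power, and the 1.96 is the usual two-sided z cutoff), not a formal power analysis:

```r
p1 <- 0.36   # assumed MoveA win rate
p2 <- 0.441  # assumed MoveB win rate
n1 <- 405    # games already played with MoveA
delta <- p2 - p1

# Smallest n2 with 1.96 * SE(p2 - p1) < delta,
# where SE^2 = p1*(1-p1)/n1 + p2*(1-p2)/n2
n2 <- p2*(1-p2) / ((delta/1.96)^2 - p1*(1-p1)/n1)
n2   # comes out around 216 games
```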