I would like to know how to calculate the probability that two discrete samples come from the same distribution and, if they do, how to estimate that distribution.
Let's say we have 3 buckets and 2 years:
| Bucket\year | 2019 | 2020 |
|---|---|---|
| A | 76% | 73% |
| B | 20% | 22% |
| C | 4% | 5% |
I would like to claim that both years come from the same distribution (which I'd assume is close to [74.5%, 21%, 4.5%]) and that the variation between years is just due to random chance (but with what probability?). I think that to make this claim I have to calculate the probability that both samples come from the same distribution (I have heard about 'power' for continuous random variables, but I don't know if there is an equivalent for the discrete case). Any hints on how to proceed?
Thanks a lot!
As a side note: the two periods have different numbers of data points, i.e. 2019 has 350 and 2020 has 400. Is that a problem?
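
To make the setup concrete, here is a minimal sketch of one common way to frame this kind of comparison: a chi-square test of homogeneity on the bucket counts. The choice of test is an assumption, not something established in the question, and the counts are reconstructed by multiplying each percentage by its year's sample size (which assumes the percentages are exact shares). The variable names are illustrative only.

```python
import math

# Shares from the table and sample sizes from the side note.
shares = {
    "2019": {"A": 0.76, "B": 0.20, "C": 0.04},
    "2020": {"A": 0.73, "B": 0.22, "C": 0.05},
}
sizes = {"2019": 350, "2020": 400}
buckets = ["A", "B", "C"]

# Assumption: the percentages are exact, so count = share * n (rounded to an integer).
counts = {
    year: {b: round(shares[year][b] * sizes[year]) for b in buckets}
    for year in shares
}

row_totals = {year: sum(counts[year].values()) for year in counts}
total = sum(row_totals.values())

# Pooled estimate of the common distribution: weighted by sample size,
# so it is not simply the average of the two yearly percentages.
pooled = {b: sum(counts[year][b] for year in counts) / total for b in buckets}

# Pearson chi-square statistic: observed counts vs. counts expected
# if both years shared the pooled distribution.
chi2 = 0.0
for year in counts:
    for b in buckets:
        expected = row_totals[year] * pooled[b]
        chi2 += (counts[year][b] - expected) ** 2 / expected

# Degrees of freedom: (years - 1) * (buckets - 1) = 2.
# For exactly 2 degrees of freedom the chi-square survival function
# is exp(-x / 2); this shortcut does not hold for other table sizes.
dof = (len(counts) - 1) * (len(buckets) - 1)
p_value = math.exp(-chi2 / 2)

print("counts:", counts)
print("pooled distribution:", pooled)
print("chi2 =", chi2, "dof =", dof, "p =", p_value)
```

Two caveats on reading the result. The p-value is the probability of seeing a discrepancy at least this large if both years really shared one distribution; it is not the probability that they share one. A large p-value is therefore consistent with a common distribution but does not prove it. The unequal sample sizes are handled by the expected counts, which scale with each year's total.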