Goodness of fit test for survey data?

Question

Is there a way to test whether the data from our sample are similar to the population data?

Let's say that we conducted a poll about political preferences with a 2% margin of error and a 95% confidence level. Can we reliably check whether we had a proper sample?

I know about chi-square tests. Say a party got 36% of 10,000,000 votes (3,600,000), while the poll had predicted 35.5% (3,550,000). The chi-square statistic comes out at about 704, which seems far too big.

I've heard that chi square shouldn't be used for large samples, so are there any other tests I can use?

"I've heard that chi square shouldn't be used for large samples, so are there any other tests I can use?" See "Are large data sets inappropriate for hypothesis testing?"
– J-J-J, Commented Aug 17 at 8:16
Also, about "Can we check reliably whether we had a proper sample?": in your example, you're observing that the point estimate from the sample is off by 0.5 percentage points from the actual value. You should elaborate on why you consider this a problem in the first place, and why you think you should run a test at all, because that isn't really clear.
– J-J-J, Commented Aug 17 at 13:12
Moreover, it seems there might be an error in the way you computed the chi-square statistic (704). I've been unable to replicate it with the little information you gave. How did you calculate it, exactly? // That being said, the question is quite old now, and I suspect we won't get the clarifications necessary to answer it, so I'd suggest simply closing it.
– J-J-J, Commented Aug 18 at 9:48

Bernhard · Accepted Answer · 2017-11-02 19:59:50Z

Traditional statistical tests such as chi-square tests and binomial tests are meant to investigate point hypotheses. If you test for x = 35.5%, you test for x = 35.50000000000...%, and with a large sample (such as your $10^7$) things get quite precise: the slightest difference will yield a $p$ very close to zero.
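To see the effect of sample size, here is a minimal sketch (not part of the original answer) that runs a two-category Pearson chi-square test of 36% observed against a hypothesised 35.5% at several sample sizes. The function name `chi_square_gof_two_cells` and the smaller sample sizes are illustrative assumptions; the p-value uses the fact that, for one degree of freedom, the chi-square upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

def chi_square_gof_two_cells(p_observed, p_expected, n):
    """Pearson chi-square statistic and p-value for a two-category table (1 df)."""
    # Counts for "voted for the party" and "voted for anyone else".
    observed = [p_observed * n, (1 - p_observed) * n]
    expected = [p_expected * n, (1 - p_expected) * n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Same 0.5-point gap, very different verdicts depending on n.
for n in (1_000, 100_000, 10_000_000):
    stat, p = chi_square_gof_two_cells(0.36, 0.355, n)
    print(f"n={n:>10,}  chi2={stat:10.3f}  p={p:.3g}")
```

With $n = 10^7$ the first cell alone contributes $50000^2 / 3550000 \approx 704$, which may be where the figure in the question comes from (that is a guess); adding the second cell brings the full statistic to roughly 1092.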

First, you should not take the point estimate of 35.5% from your poll, but rather the 95% or 99% confidence interval from your poll, and compare that to the 36%. This is still a comparison against the "same" population, not a "similar" population. You will have to define what "similar" should actually mean.
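As a rough illustration of that comparison, the sketch below builds the poll's 95% interval from the figures in the question (35.5% estimate, 2% margin of error) and checks whether the actual 36% falls inside it. It assumes the stated margin of error is the half-width of a symmetric interval around the estimate; the helper name `interval_from_margin` is made up for this example.

```python
def interval_from_margin(estimate, margin):
    """Symmetric confidence interval given a point estimate and its margin of error."""
    return estimate - margin, estimate + margin

poll_estimate = 0.355    # share predicted by the poll
margin_of_error = 0.02   # stated margin at the 95% confidence level
actual_share = 0.36      # share observed in the election

low, high = interval_from_margin(poll_estimate, margin_of_error)
inside = low <= actual_share <= high

print(f"95% interval: [{low:.3f}, {high:.3f}]")
print(f"actual share {actual_share:.3f} inside interval: {inside}")
```

Under that assumption the interval runs from 33.5% to 37.5%, so the election result is consistent with the poll at the stated confidence level.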
