Timeline for How to choose between t-test or non-parametric test e.g. Wilcoxon in small samples
Current License: CC BY-SA 4.0
32 events
| when | what | action | by | license | comment |
|---|---|---|---|---|---|
| Apr 26, 2024 at 8:41 | history | edited | User1865345 | CC BY-SA 4.0 | added 21 characters in body |
| Jun 15, 2020 at 5:07 | history | edited | Glen_b | CC BY-SA 4.0 | added 172 characters in body |
| Oct 10, 2019 at 22:07 | history | edited | Glen_b | CC BY-SA 4.0 | added 1 character in body |
| Apr 25, 2019 at 4:29 | history | edited | Glen_b | CC BY-SA 4.0 | added 116 characters in body |
| Oct 19, 2018 at 1:10 | comment | added | Glen_b | ctd... and at n=1000 (is that in each sample, or both together?) the attainable type I error rate should be indistinguishable from the significance level. As such, I expect something is astray in your setup, or perhaps the situation you're simulating under your null is less clear than it might be. You would also need to clearly identify which parts of the question you're addressing. | |
| Oct 19, 2018 at 1:09 | comment | added | Glen_b | I don't think your results can all be correct. For example in the case where the variances are the same but the means may differ, you say the Mann-Whitney has a type I error rate of 9%. However, the Mann-Whitney should have an observed type I error rate (i.e. under the case where the means also don't differ) of exactly the next lowest available size below the selected significance level (the Mann-Whitney is distribution-free after all, so the fact that you're sampling gamma variates is of no consequence for the type I error), ... ctd | |
| Oct 18, 2018 at 21:45 | comment | added | Xavier Bourret Sicotte | @Glen_b Hi Glen, fantastic answer (as always). I have posted a simulation below; let me know what you think, thanks! | |
| Sep 10, 2018 at 22:19 | history | edited | Glen_b | CC BY-SA 4.0 | added 79 characters in body |
| Apr 13, 2017 at 12:44 | history | edited | CommunityBot | replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/ | |
| Feb 7, 2017 at 0:05 | history | edited | Glen_b | CC BY-SA 3.0 | clarifications, additional explanation, minor fixes |
| Mar 24, 2016 at 23:40 | history | edited | Glen_b | CC BY-SA 3.0 | added 5 characters in body |
| Mar 10, 2016 at 12:58 | comment | added | Rodolphe | I think it should be emphasised that it is normality of the residuals of the model that should be checked, not that of the sample itself. Assuming you absolutely want to check it, that can be done only after the model has been fitted. Checking normality of the sample as is is nonsense. Obviously most measured variables, such as height of individuals for example, will always be positive. But residuals, that is, variation of individuals' height around the mean, should be normally distributed and that can be checked. | |
| Feb 17, 2015 at 21:31 | comment | added | Glen_b | @Silverfish It's definitely helpful to make such edits, and where it doesn't alter the sense of the answer, you should do it without further thought. It's best to make the edit reason as clear as possible (or add a comment if space is insufficient), since the original poster may not immediately realize why (something like "amended quotes to match edited question" should do). [In my case you should make any edit to any of my answers for any reason you see fit. They can always be rolled back or re-edited if necessary.] | |
| Feb 17, 2015 at 17:21 | comment | added | Silverfish | Many thanks for the addition, I thought that added a lot to the quality of this answer. Now this question has settled down a bit, and generated a good set of responses, I'd like to give the original question a good copy-edit and remove anything which might be misleading (for the benefit of readers who don't read past the question!). Is it okay when I do so for me to make appropriate edits to your response so quotes match with the reorganized question? | |
| Jan 28, 2015 at 5:26 | history | edited | Glen_b | CC BY-SA 3.0 | added 128 characters in body |
| Jan 28, 2015 at 2:16 | history | edited | Glen_b | CC BY-SA 3.0 | added 590 characters in body |
| Jan 28, 2015 at 1:56 | history | edited | Glen_b | CC BY-SA 3.0 | added 1419 characters in body |
| Jan 28, 2015 at 1:50 | history | edited | Glen_b | CC BY-SA 3.0 | added 1419 characters in body |
| Jan 28, 2015 at 1:33 | comment | added | Glen_b | Silverfish - I'm not sure if I sufficiently addressed your question asking for detail on robustifying. I'll add a little more now. | |
| Jan 28, 2015 at 1:23 | history | edited | Glen_b | CC BY-SA 3.0 | added 128 characters in body |
| Jan 28, 2015 at 1:17 | history | edited | Glen_b | CC BY-SA 3.0 | added 128 characters in body |
| Jan 28, 2015 at 1:09 | history | edited | Glen_b | CC BY-SA 3.0 | added 70 characters in body |
| Nov 11, 2014 at 20:26 | vote | accept | Silverfish | ||
| Nov 11, 2014 at 2:18 | history | edited | Glen_b | CC BY-SA 3.0 | added 726 characters in body |
| Nov 11, 2014 at 0:19 | history | bounty awarded | Silverfish | ||
| Nov 10, 2014 at 23:51 | vote | accept | Silverfish | ||
| Nov 11, 2014 at 0:19 | |||||
| Nov 10, 2014 at 18:59 | comment | added | Silverfish | Any chance that this already excellent answer could incorporate a little more detail on what options there might be to "robustify" the t-test? | |
| Nov 10, 2014 at 17:13 | comment | added | Glen_b | Yes, that should be understood as "population is either known to be heavy-tailed, or may reasonably be expected to be heavy-tailed". That certainly includes things like theory (or sometimes even general reasoning about the situation that doesn't quite reach the status of theory), expert knowledge, and previous studies. It's not suggesting testing for heavy-tailedness. In situations where it's simply unknown, it may be worth investigating how bad things might be under various distributions which might be plausible for the specific situation you have. | |
| Nov 10, 2014 at 17:08 | history | edited | Glen_b | CC BY-SA 3.0 | added 6 characters in body |
| Nov 10, 2014 at 16:51 | comment | added | Silverfish | A couple of things I'd like clarification on. There are several points where you mention e.g. "If the distribution is heavy-tailed, ..." (or skewed etc) - presumably this should be read as "if it's reasonable to assume that the distribution will be heavy-tailed" (from theory/previous studies/whatever) rather than "if the sample is heavy-tailed", otherwise we are back at multi-step testing again which is the thing we are trying to avoid? (It seems to me that a central issue in this topic is how to justify beliefs or assumptions about distributions, without reading too much into the sample.) | |
| Nov 10, 2014 at 16:49 | history | edited | Glen_b | CC BY-SA 3.0 | added 6 characters in body |
| Nov 10, 2014 at 11:59 | history | answered | Glen_b | CC BY-SA 3.0 |
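
Glen_b's October 2018 comments in the timeline point out that the Mann-Whitney test is distribution-free under the null, so its type I error rate is the largest attainable size not exceeding the chosen significance level, whatever continuous distribution the data come from. A minimal sketch of that point, assuming two independent samples with no ties and a symmetric two-sided rejection region; the function names `u_null_counts` and `attainable_two_sided_size` are hypothetical, chosen for illustration and not taken from the thread:

```python
from collections import Counter
from itertools import combinations
from math import comb


def u_null_counts(m, n):
    """Count the rank arrangements giving each value of U under the null.

    Assumes no ties: every way of assigning m of the m + n ranks to the
    first sample is equally likely, regardless of the parent distribution.
    """
    counts = Counter()
    for x_ranks in combinations(range(1, m + n + 1), m):
        # U = (rank sum of first sample) - m(m + 1)/2
        u = sum(x_ranks) - m * (m + 1) // 2
        counts[u] += 1
    return counts


def attainable_two_sided_size(m, n, alpha):
    """Largest two-sided rejection probability that does not exceed alpha.

    Rejection region (an assumption of this sketch): U <= c or U >= m*n - c.
    """
    counts = u_null_counts(m, n)
    total = comb(m + n, m)
    best = 0.0
    tail = 0
    c = 0
    while 2 * c < m * n:  # keep the two tails from overlapping
        tail += counts[c]
        size = 2 * tail / total  # null distribution of U is symmetric
        if size > alpha:
            break
        best = size
        c += 1
    return best


if __name__ == "__main__":
    print(attainable_two_sided_size(5, 5, 0.05))
```

With 5 observations per group and a nominal 5% level, the attainable size under these assumptions is 8/252, about 3.2%. Full enumeration is only practical for small samples, since the number of arrangements grows quickly with the sample sizes.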