Timeline for How to choose between t-test or non-parametric test e.g. Wilcoxon in small samples

Current License: CC BY-SA 4.0

32 events

when	what		by	license	comment
Apr 26, 2024 at 8:41	history	edited	User1865345	CC BY-SA 4.0	added 21 characters in body
Jun 15, 2020 at 5:07	history	edited	Glen_b	CC BY-SA 4.0	added 172 characters in body
Oct 10, 2019 at 22:07	history	edited	Glen_b	CC BY-SA 4.0	added 1 character in body
Apr 25, 2019 at 4:29	history	edited	Glen_b	CC BY-SA 4.0	added 116 characters in body
Oct 19, 2018 at 1:10	comment	added	Glen_b		ctd... and at n=1000 (is that in each sample, or both together?) the attainable type I error rate should be indistinguishable from the significance level. As such, I expect something is astray in your setup, or perhaps the situation you're simulating under your null is less clear than it might be. You would also need to clearly identify which parts of the question you're addressing.
Oct 19, 2018 at 1:09	comment	added	Glen_b		I don't think your results can all be correct. For example in the case where the variances are the same but the means may differ, you say the Mann-Whitney has a type I error rate of 9%. However, the Mann-Whitney should have an observed type I error rate (i.e. under the case where the means also don't differ) of exactly the next lowest available size below the selected significance level (the Mann-Whitney is distribution-free after all, so the fact that you're sampling gamma variates is of no consequence for the type I error), ... ctd
Oct 18, 2018 at 21:45	comment	added	Xavier Bourret Sicotte		@Glen_b Hi Glen, fantastic answer (as always). I have posted a simulation below; let me know what you think, thanks!
Sep 10, 2018 at 22:19	history	edited	Glen_b	CC BY-SA 4.0	added 79 characters in body
Apr 13, 2017 at 12:44	history	edited	CommunityBot		replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/
Feb 7, 2017 at 0:05	history	edited	Glen_b	CC BY-SA 3.0	clarifications, additional explanation, minor fixes
Mar 24, 2016 at 23:40	history	edited	Glen_b	CC BY-SA 3.0	added 5 characters in body
Mar 10, 2016 at 12:58	comment	added	Rodolphe		I think it should be emphasised that it is the normality of the residuals of the model that should be checked, not that of the sample itself. Assuming you absolutely want to check it, that can be done only after the model has been fitted. Checking normality of the sample as it stands is nonsense. Obviously most measured variables, such as the height of individuals, will always be positive. But residuals, that is, the variation of individuals' height around the mean, should be normally distributed, and that can be checked.
Feb 17, 2015 at 21:31	comment	added	Glen_b		@Silverfish It's definitely helpful to make such edits, and where it doesn't alter the sense of the answer, you should do it without further thought. It's best to make the edit reason as clear as possible (or add a comment if space is insufficient), since the original poster may not immediately realize why (something like "amended quotes to match edited question" should do). [In my case you should make any edit to any of my answers for any reason you see fit. They can always be rolled back or re-edited if necessary.]
Feb 17, 2015 at 17:21	comment	added	Silverfish		Many thanks for the addition, I thought that added a lot to the quality of this answer. Now this question has settled down a bit, and generated a good set of responses, I'd like to give the original question a good copy-edit and remove anything which might be misleading (for the benefit of readers who don't read past the question!). When I do so, is it okay for me to make appropriate edits to your response so that quotes match the reorganized question?
Jan 28, 2015 at 5:26	history	edited	Glen_b	CC BY-SA 3.0	added 128 characters in body
Jan 28, 2015 at 2:16	history	edited	Glen_b	CC BY-SA 3.0	added 590 characters in body
Jan 28, 2015 at 1:56	history	edited	Glen_b	CC BY-SA 3.0	added 1419 characters in body
Jan 28, 2015 at 1:50	history	edited	Glen_b	CC BY-SA 3.0	added 1419 characters in body
Jan 28, 2015 at 1:33	comment	added	Glen_b		Silverfish - I'm not sure if I sufficiently addressed your question asking for detail on robustifying. I'll add a little more now.
Jan 28, 2015 at 1:23	history	edited	Glen_b	CC BY-SA 3.0	added 128 characters in body
Jan 28, 2015 at 1:17	history	edited	Glen_b	CC BY-SA 3.0	added 128 characters in body
Jan 28, 2015 at 1:09	history	edited	Glen_b	CC BY-SA 3.0	added 70 characters in body
Nov 11, 2014 at 20:26	vote	accept	Silverfish
Nov 11, 2014 at 2:18	history	edited	Glen_b	CC BY-SA 3.0	added 726 characters in body
Nov 11, 2014 at 0:19	history	bounty awarded	Silverfish
Nov 10, 2014 at 23:51	vote	accept	Silverfish
Nov 11, 2014 at 0:19
Nov 10, 2014 at 18:59	comment	added	Silverfish		Any chance that this already excellent answer could incorporate a little more detail on what options there might be to "robustify" the t-test?
Nov 10, 2014 at 17:13	comment	added	Glen_b		Yes, that should be understood as "population is either known to be heavy-tailed, or may reasonably be expected to be heavy-tailed". That certainly includes things like theory (or sometimes even general reasoning about the situation that doesn't quite reach the status of theory), expert knowledge, and previous studies. It's not suggesting testing for heavy-tailedness. In situations where it's simply unknown, it may be worth investigating how bad things might be under various distributions which might be plausible for the specific situation you have.
Nov 10, 2014 at 17:08	history	edited	Glen_b	CC BY-SA 3.0	added 6 characters in body
Nov 10, 2014 at 16:51	comment	added	Silverfish		A couple of things I'd like clarification on. There are several points where you mention e.g. "If the distribution is heavy-tailed, ..." (or skewed etc.). Presumably this should be read as "if it's reasonable to assume that the distribution will be heavy-tailed" (from theory/previous studies/whatever) rather than "if the sample is heavy-tailed"; otherwise we are back at multi-step testing again, which is the thing we are trying to avoid? (It seems to me that a central issue in this topic is how to justify beliefs or assumptions about distributions, without reading too much into the sample.)
Nov 10, 2014 at 16:49	history	edited	Glen_b	CC BY-SA 3.0	added 6 characters in body
Nov 10, 2014 at 11:59	history	answered	Glen_b	CC BY-SA 3.0
