10 events
when what by license comment
Jan 25, 2022 at 17:04 comment added whuber Your question about being "more realistic" is overly vague and general. There potentially are problems with any imputation method--but the simplest ones tend to create the most problems. See stats.stackexchange.com/questions/561387 for an analysis of just a few of the biases that mean imputation can produce. Ultimately, whether a statistical procedure is appropriate depends partly on how you will use it.
Jan 25, 2022 at 16:57 comment added mdewey We do in fact have a whole tag for this: multiple-imputation. I would endorse @seanv507's suggestion that you look at Stef van Buuren's work, though. He is something of an authority in this area.
Jan 25, 2022 at 16:56 history edited Sycorax
edited tags
Jan 25, 2022 at 16:49 answer added Vladimir Belik timeline score: 1
Jan 25, 2022 at 16:49 comment added bdeonovic Yes, that example would not be missing at random, but it can of course be much more complicated (some nonlinear relationship between all of your variables, and maybe some unmeasured ones, and the missingness). There are some methods to try to figure that out, but ultimately you can't. This is why high-quality data collection and experiment planning are important.
Jan 25, 2022 at 16:45 comment added sangstar @bdeonovic How can I tell whether the values are missing at random or not? Would it count as 'not missing at random' if, for instance, all the ages of people from a certain ethnic group are NaN?
Jan 25, 2022 at 16:43 comment added bdeonovic Also be wary of the issues that ANY kind of imputation can cause for inference if the values are not missing at random.
Jan 25, 2022 at 16:39 comment added seanv507 What you are saying is perfectly correct. Look up multiple imputation, e.g. see stefvanbuuren.name/fimd/sec-simplesolutions.html
S Jan 25, 2022 at 16:35 review First questions
Jan 25, 2022 at 17:09
S Jan 25, 2022 at 16:35 history asked sangstar CC BY-SA 4.0
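
The comments above warn that mean imputation can bias results and that any imputation is risky when values are not missing at random. The following is a minimal sketch of both points on simulated data, not anything taken from the question itself. The "age" column, the 30% missingness rate, the cutoff of 50, and the helper `mean_impute` are all hypothetical choices made for illustration.

```python
import random
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


random.seed(0)

# Hypothetical "age" column: 1000 draws from a normal distribution.
ages = [random.gauss(40, 12) for _ in range(1000)]

# Missing completely at random: each value is dropped with probability 0.3,
# independent of the age itself.
mcar = [None if random.random() < 0.3 else a for a in ages]

# Not missing at random: every age above 50 is unobserved, so the
# missingness depends on the missing value.
mnar = [None if a > 50 else a for a in ages]

true_mean = statistics.fmean(ages)
true_sd = statistics.stdev(ages)

# Mean imputation piles the filled values onto a single point,
# which shrinks the spread even in the benign MCAR case.
mcar_sd = statistics.stdev(mean_impute(mcar))

# Under this MNAR mechanism the observed mean is too low, and mean
# imputation carries that bias into the completed column.
mnar_mean = statistics.fmean(mean_impute(mnar))

print("true mean / sd:      ", true_mean, true_sd)
print("sd after MCAR impute:", mcar_sd)
print("mean after MNAR impute:", mnar_mean)
```

In this setup the imputed standard deviation comes out smaller than the true one, and the MNAR-imputed mean sits below the true mean. Multiple imputation, as suggested in the comments, is aimed at the first problem (understated variability); it does not by itself repair the second.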