deleted 1 character in body
Source Link
Kuku
  • 1.7k
  • 11
  • 25

$$ \operatorname{P}(V_{\text{observed}}, V_{\text{missing}}) = \operatorname{P}(V^{*} \mid V_{\text{observed}}, R = 0) \operatorname{P}(V_{\text{observed}}) $$ where $V_{\text{observed}}$ and $V_{\text{missing}}$ are the variables with no missing values and the variables with some missing values, respectively; $R$ is the set of missingness indicators (the missingness mechanism) for the variables with some missing values; and $V^{*}$ is the proxy variable that holds the available values of the variables with some missing values.
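As a rough numerical check of the recovery formula above, here is a minimal simulation sketch. The data-generating process is an assumption made purely for illustration: one binary fully observed variable `x`, one binary partially observed variable `y`, and a missingness indicator `r` (1 = missing) that depends only on `x`. None of these names or numbers come from Mohan (2022).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (illustrative assumption):
# x is always observed, y is sometimes missing, and the missingness
# of y depends only on x, so y is independent of r given x.
x = rng.binomial(1, 0.4, size=n)
y = rng.binomial(1, np.where(x == 1, 0.7, 0.2))
r = rng.binomial(1, np.where(x == 1, 0.6, 0.1))  # r = 1 means y is missing


def joint_table(a, b):
    """Empirical 2x2 joint distribution of two binary integer arrays."""
    table = np.zeros((2, 2))
    np.add.at(table, (a, b), 1.0)
    return table / table.sum()


# Benchmark that uses the full data (unavailable in a real analysis).
true_joint = joint_table(x, y)

# Naive estimate: joint distribution among complete cases only.
cc_joint = joint_table(x[r == 0], y[r == 0])

# Recovery: P(x, y) = P(y* | x, r = 0) * P(x).
p_x = np.bincount(x, minlength=2) / n
recovered = np.zeros((2, 2))
for xv in (0, 1):
    rows = (x == xv) & (r == 0)
    p_y_given_x = np.bincount(y[rows], minlength=2) / rows.sum()
    recovered[xv] = p_y_given_x * p_x[xv]

max_err_recovered = np.abs(recovered - true_joint).max()
max_err_cc = np.abs(cc_joint - true_joint).max()
print("max error, recovered:    ", max_err_recovered)
print("max error, complete case:", max_err_cc)
```

In this setup the recovered table should track the full-data table closely, while the complete-case joint distribution is distorted because the cases with `x = 1` are under-represented among the observed rows.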


added images
  • The MCAR/MAR/MNAR taxonomy gives an incomplete picture and can be misleading (e.g. MNAR problems can be solved with complete case analysis).
  • You should not blindly use all available information.
  • You should not use only association measures (such as correlations) to justify inclusion of a variable in an imputation model.
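The parenthetical in the first bullet (an MNAR problem solved by complete case analysis) can be illustrated with a small simulation sketch. The model below is a hypothetical one chosen for illustration: the covariate `x` is missing with a probability that depends on `x` itself (so `x` is MNAR), yet the missingness is independent of the outcome `y` given `x`, so the regression of `y` on `x` among complete cases should still be consistent.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical model (illustrative assumption): y = 1 + 2 * x + noise.
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# MNAR: the probability that x is missing depends on x itself.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
observed = rng.random(n) > p_missing

# Complete-case regression of y on x.
# np.polyfit returns coefficients from highest degree to lowest.
slope_cc, intercept_cc = np.polyfit(x[observed], y[observed], deg=1)

# By contrast, the complete-case mean of x is biased (true mean is 0).
mean_x_cc = x[observed].mean()

print("complete-case slope:    ", slope_cc)
print("complete-case intercept:", intercept_cc)
print("complete-case mean of x:", mean_x_cc)
```

Whether complete case analysis works therefore depends on the target quantity and on the structure of the missingness, not on the MCAR/MAR/MNAR label alone: here the regression coefficients are recoverable but the marginal mean of `x` is not.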

The first statement is a causal one, and it can be made explicit with an m-graph. The third statement is partly true: using all variables is sufficient but not necessary, in the sense that you can get correct answers without using all variables (depending on the DAG). It is true, however, that for any MAR$^3$ problem the joint distribution of all variables can be recovered by (equation 34.4 in Mohan, 2022, p. 659):

m-graph for first example

However, while we say MAR under $U$ holds from the perspective of the data generating process, from the perspective of the researcher $U$ is a latent variable so that MNAR holds, as Dimitris rightly notes.

m-graph for second example
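The point about a latent $U$ can be sketched numerically. In the hypothetical setup below (an assumption for illustration, not the example from the original figures), a binary `u` drives both the outcome `y` and its missingness indicator `r` (1 = missing). If `u` were measured, stratifying on it would recover the mean of `y`; without `u`, the researcher only has the complete cases, whose mean is biased.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical process (illustrative assumption): u affects both y and
# the missingness of y, and nothing else affects the missingness.
u = rng.binomial(1, 0.5, size=n)
y = rng.normal(loc=np.where(u == 1, 3.0, 0.0), scale=1.0)
r = rng.binomial(1, np.where(u == 1, 0.8, 0.2))  # r = 1 means y is missing

# Benchmark that uses the full data (unavailable in a real analysis).
true_mean = y.mean()

# Researcher's view when u is latent: only complete cases are usable.
cc_mean = y[r == 0].mean()

# If u were measured: E[y] = sum over u of E[y | u, r = 0] * P(u).
p_u = np.bincount(u, minlength=2) / n
adjusted_mean = sum(
    y[(u == uv) & (r == 0)].mean() * p_u[uv] for uv in (0, 1)
)

print("full-data mean:       ", true_mean)
print("complete-case mean:   ", cc_mean)
print("mean adjusted for u:  ", adjusted_mean)
```

The adjustment step is only available when `u` is in the data set, which is why the same data-generating process looks like MAR given $U$ on paper but behaves like MNAR for a researcher who cannot observe $U$.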


major re-organization of the question, deleted first example, directly refer to other provided answer and comments

There have been great advances in the last decade in the missing data literature, but the adoption of these findings has been slow, especially for fields outside the Pearl camp of causal inference such as statistics, social sciences, economics and epidemiology$^2$.


major re-organization of the question, deleted first example, directly refer to other provided answer and comments
deleted 77 characters in body
Add additional example to respond to comment
deleted 3 characters in body
Added new example, added conclusion
added 3 characters in body