$\begingroup$

I have created some simulated data. To keep things simple, let's say I have done the following:

y = (X_1 + X_2 + X_3)/3 + e

Here each X_i is drawn from a normal distribution, and I have made the X_i correlated by specifying a correlation matrix and then applying a Cholesky decomposition. I create a large number of samples and then calculate the corresponding values for y. e is an error term.
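For readers who want to reproduce the setup, here is a minimal sketch of this kind of simulation. The correlation values, sample size, and noise scale are assumptions chosen for illustration; the question does not state them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlation matrix (not given in the question).
corr = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.2],
    [0.3, 0.2, 1.0],
])

n = 100_000                              # assumed sample size
chol = np.linalg.cholesky(corr)          # lower-triangular L with L @ L.T == corr
Z = rng.standard_normal((n, 3))          # independent standard normals
X = Z @ chol.T                           # rows of X now have correlation close to corr
e = rng.normal(scale=0.1, size=n)        # assumed noise scale for the error term
y = X.mean(axis=1) + e                   # y = (X_1 + X_2 + X_3)/3 + e

# Sample correlation of the simulated predictors, for a sanity check.
sample_corr = np.corrcoef(X, rowvar=False)
```

Comparing `sample_corr` with `corr` is a quick way to confirm the Cholesky step did what was intended.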

Now my goal is to do inference, but I don't want to answer questions like "What happens to y if I change X_1, holding the other variables constant?" Instead, I want to answer questions like "If I change X_1 and take the correlation matrix into account, what is the effect on y?"
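One possible reading of the second question (an assumption, since a correlation matrix alone does not say which variable drives which) is the marginal slope of y on X_1, i.e. how y moves on average when X_1 moves and X_2, X_3 move along with it as their correlations imply. For unit-variance normal predictors that slope is (1 + rho_12 + rho_13)/3, whereas the "holding the others constant" slope is 1/3. The sketch below uses hypothetical correlation values and plain least squares to show the difference:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlation matrix and noise scale, chosen for illustration.
corr = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.2],
    [0.3, 0.2, 1.0],
])
n = 200_000
X = rng.standard_normal((n, 3)) @ np.linalg.cholesky(corr).T
y = X.mean(axis=1) + rng.normal(scale=0.1, size=n)

# Partial effects: regress y on all three predictors (others held constant).
A = np.column_stack([np.ones(n), X])
partial = np.linalg.lstsq(A, y, rcond=None)[0][1:]

# Marginal effect: regress y on X_1 alone, so the slope absorbs the
# average co-movement of X_2 and X_3 with X_1.
B = np.column_stack([np.ones(n), X[:, 0]])
marginal_slope = np.linalg.lstsq(B, y, rcond=None)[0][1]

# Theoretical marginal slope for unit-variance predictors.
expected_marginal = corr[0].sum() / 3
```

Here `partial` should come out near 1/3 for every predictor, while `marginal_slope` should be near `expected_marginal`. This is an associational quantity; whether it answers an interventional question depends on causal assumptions the simulation does not encode.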

I have never done any SEM, so I am probably way off, but here is what I tried in semopy (I think the notation should be similar to that of lavaan):

    model_desc = """
    # Regression model
    Y ~ X_1 + X_2 + X_3
    # Correlations
    X_1 ~~ X_2
    X_1 ~~ X_3
    X_2 ~~ X_3
    """

I then pass in the dataframe containing the simulated values for y, X_1, X_2, X_3.

While this model runs, I get some strange outputs. For the correlated variables I see the correlation coefficients from my correlation matrix; e.g., X_1 ~~ X_2 has an estimate that is very close to the correlation I specified. However, the estimates for my target variable (e.g., y ~ X_1) are all zero.

My first question is: is SEM even the right approach for what I want to achieve? Secondly, where is my implementation going wrong?

$\endgroup$

1 Answer

$\begingroup$

Your problem may stem from confusing notation. You've transformed your original $X_1,X_2,X_3$ and want to refer to both the transformed and untransformed variables. Let $f_1$ be a function of $X_1,X_2,X_3$ that gives the transformed value of $X_1$, and likewise $f_2$ and $f_3$ for $X_2$ and $X_3$.

Your actual regression equation is: $\hat{Y} = \beta_0 + \beta_1 f_1(X_1,X_2,X_3) + \beta_2 f_2(X_1,X_2,X_3) + \beta_3 f_3(X_1,X_2,X_3)$

Generally speaking, I think the impact of changing $X_1$, then transforming, then applying the regression can be expressed as a partial derivative:

$\frac{\partial\hat{Y}}{\partial X_1} = \beta_1 \left(\frac{\partial}{\partial X_1}\right)f_1(X_1,X_2,X_3) + \beta_2 \left(\frac{\partial}{\partial X_1}\right) f_2(X_1,X_2,X_3) + \beta_3 \left(\frac{\partial}{\partial X_1}\right)f_3(X_1,X_2,X_3)$

If your transform is linear, this may be easy to compute for untransformed observations of $X_1, X_2, X_3$.
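As a worked case, suppose the transform is the linear map given by a lower-triangular Cholesky factor $L$ of the correlation matrix. This is an assumption based on the question's description; the answer leaves $f_i$ general. Then $f_i(X_1,X_2,X_3) = L_{i1}X_1 + L_{i2}X_2 + L_{i3}X_3$, so $\frac{\partial f_i}{\partial X_1} = L_{i1}$ and the expression above reduces to:

$\frac{\partial\hat{Y}}{\partial X_1} = \beta_1 L_{11} + \beta_2 L_{21} + \beta_3 L_{31}$

For a correlation matrix (unit variances), the first column of the Cholesky factor equals the first column of the correlation matrix, so $L_{11} = 1$, $L_{21} = \rho_{12}$, $L_{31} = \rho_{13}$ and the derivative becomes $\beta_1 + \beta_2\rho_{12} + \beta_3\rho_{13}$. With $\beta_1=\beta_2=\beta_3=1/3$ as in the question, that is $(1 + \rho_{12} + \rho_{13})/3$. Note that the corresponding result for $X_2$ or $X_3$ depends on the order of the variables, because the Cholesky factor is triangular.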

$\endgroup$
  • $\begingroup$ Writing derivatives with respect to random variables is confusing notation; that's usually undefined. $\endgroup$ Commented Nov 19, 2024 at 15:24
