23 events
when what by license comment
Apr 13, 2017 at 12:44 history edited CommunityBot
replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/
Dec 9, 2016 at 15:09 history edited Giuseppe Biondi-Zoccai CC BY-SA 3.0
typo correction and integration
Apr 17, 2016 at 7:33 vote accept Giuseppe Biondi-Zoccai
S Apr 17, 2016 at 7:09 history bounty ended CommunityBot
S Apr 17, 2016 at 7:09 history notice removed CommunityBot
Apr 9, 2016 at 7:36 comment added Marquis de Carabas @GiuseppeBiondi-Zoccai I admit my ignorance of limited independent variables...is that just like limited dependent variable but for independent variables? :) That is, the independent variable is categorical, count, etc? Can you explain why the limited independent variable might be a problem? The original Heckman model was actually used for a continuous outcome, but nowadays, there are several flavors, including probit and even Poisson.
Apr 9, 2016 at 7:17 answer added Umberto benedetto timeline score: 1
Apr 9, 2016 at 7:05 comment added Giuseppe Biondi-Zoccai @marquisdecarabas: I tried to look into Heckman type corrections, but I found they focus on limited dependent variables, and not limited independent variables (but possibly I am mistaken...): stats.stackexchange.com/questions/172508/…
Apr 9, 2016 at 6:57 history tweeted twitter.com/StackStats/status/718694216407367680
Apr 9, 2016 at 6:45 comment added Marquis de Carabas @Björn, based on the OP's comment here, the covariates are not missing at random, hence my original suggestion of using some kind of Heckman correction. Including the people who did not take the exercise stress test is clinically interesting and important here, because they already differ from those who took it. Of course, the OP can drop the 1,000 or so patients with a missing exercise test, but the results would then have limited generalizability.
Apr 9, 2016 at 6:24 comment added Giuseppe Biondi-Zoccai (+1) Thanks @Björn. My problem is that eventually I might want to generate a clinical risk prediction score for those completing the exercise test (thus including all covariates and using them to predict risk), but also for those not doing the exercise test (so including only some variables). Thus, my problem is two-faceted: using a single model encompassing all patients and all variables (despite several variables being missing in some patients, not at random) to adjust for confounders; then creating separate models for risk prediction in the two main strata.
Apr 9, 2016 at 5:33 comment added Björn I saw a lot of hits on Google Scholar for +"model selection" +"missing covariate", as well as +"model building" +"missing covariate". I suspect that one possibility, if it is plausible that the covariates are simply missing at random, would be to impute them using multiple imputation, do whatever model building you do, and combine the results across imputations. I believe there are also models that implicitly impute them. However, if covariates will also be missing in practice when people try to use the prediction score, that would be an even harder problem.
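The "combine the results across imputations" step mentioned in the comment above is commonly done with Rubin's rules. Here is a minimal sketch, assuming each of the m imputed data sets has already produced one coefficient estimate and its squared standard error. The function name `pool_rubin` and the input numbers are hypothetical, chosen only for illustration, and the sketch does not address whether the missing-at-random assumption holds.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: per-imputation point estimates of the same coefficient
    variances: per-imputation squared standard errors of that coefficient
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from three imputed data sets
est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(est, var)
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled standard error reflects the uncertainty introduced by the imputation itself.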
S Apr 9, 2016 at 5:24 history bounty started Giuseppe Biondi-Zoccai
S Apr 9, 2016 at 5:24 history notice added Giuseppe Biondi-Zoccai Draw attention
Apr 1, 2016 at 10:53 history edited Scortchi CC BY-SA 3.0
fixed typos
Apr 1, 2016 at 9:45 history edited Giuseppe Biondi-Zoccai CC BY-SA 3.0
substantial integration as recommended by commentators
Apr 1, 2016 at 9:42 comment added Giuseppe Biondi-Zoccai The question is not peregrine. Basically, I want to create a clinical prediction score for patients undergoing myocardial perfusion imaging. The imaging test follows an exercise stress test in fit patients, and a pharmacologic stress test in those who are not fit. The latter test is worse than the former, and does not provide several important prognostic features (e.g., maximum heart rate or workload), so I must include exercise test variables in the multivariable model. But if I do so, I lose more than 1000 patients who only underwent a pharmacologic stress test. I have also added this to the question.
Apr 1, 2016 at 9:35 comment added Repmat You can make some arbitrary assumptions and do data imputation. But for the sample data posted I don't see the need; you do not lose an entire variable. But yeah, sure, you will lose data...
Apr 1, 2016 at 9:28 comment added Giuseppe Biondi-Zoccai I am not sure I follow you. If I use all the covariates in the model, I lose several cases (those with NA). If I only use cov1, cov2, and cov3, I don't use the information in cov4 and cov5...
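The trade-off described in the comment above can be illustrated with a small sketch. The covariate names cov1 to cov5 come from the comment; the rows are made-up toy values rather than the OP's data, `None` stands in for NA, and the helper `complete_cases` is hypothetical.

```python
# Toy rows (made-up values); None marks a missing covariate, like NA.
rows = [
    {"cov1": 61, "cov2": 1, "cov3": 27.5, "cov4": 150, "cov5": 9.0},
    {"cov1": 74, "cov2": 0, "cov3": 31.0, "cov4": None, "cov5": None},
    {"cov1": 58, "cov2": 1, "cov3": 24.2, "cov4": 162, "cov5": 11.5},
    {"cov1": 80, "cov2": 0, "cov3": 29.8, "cov4": None, "cov5": None},
]


def complete_cases(rows, covariates):
    """Return the rows with no missing value among the given covariates."""
    return [r for r in rows if all(r[c] is not None for c in covariates)]


# Model with all covariates: rows missing cov4/cov5 are dropped.
full = complete_cases(rows, ["cov1", "cov2", "cov3", "cov4", "cov5"])
# Model with cov1-cov3 only: every row is kept, but cov4/cov5 are ignored.
reduced = complete_cases(rows, ["cov1", "cov2", "cov3"])
print(len(full), len(reduced))
```

With these toy rows the full model keeps only the cases where cov4 and cov5 were measured, while the reduced model keeps all cases at the cost of discarding those two covariates.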
Apr 1, 2016 at 9:19 comment added Repmat The $x_i$ can have any features, except that they cannot be constant or a linear combination of each other. If there is not much variation in $x_i$, then the standard error will be larger than otherwise. In itself this is not a problem.
Apr 1, 2016 at 8:27 history edited Giuseppe Biondi-Zoccai CC BY-SA 3.0
minor integration
Mar 31, 2016 at 21:33 history asked Giuseppe Biondi-Zoccai CC BY-SA 3.0