I would like to run an exponential survival model using the psm function in rms package, but I am still trying to understand the estimates, and how to obtain them. Using the colon data, say I would like to assess the risk of getting colon cancer for different treatment groups (Obs, Lev and Lev+5FU), while adjusting for age and sex.
Fit
# datadist is needed to fit the models: dd <- datadist(colon) options(datadist="dd") fit <- psm(Surv(time,status)~rx+age+sex, data=colon, dist="exponential", x=TRUE, y=TRUE) fit summary(fit) # Effects Response : Surv(time, status) # # Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95 # age 53 69 16 0.027584 0.045310 -0.061281 0.11645 # Survival Time Ratio 53 69 16 1.028000 NA 0.940560 1.12350 # sex 0 1 1 0.038055 0.066053 -0.091492 0.16760 # Survival Time Ratio 0 1 1 1.038800 NA 0.912570 1.18250 # rx - Lev:Obs 1 2 NA 0.034201 0.076873 -0.116570 0.18497 # Survival Time Ratio 1 2 NA 1.034800 NA 0.889970 1.20320 # rx - Lev+5FU:Obs 1 3 NA 0.490290 0.083921 0.325700 0.65488 # Survival Time Ratio 1 3 NA 1.632800 NA 1.385000 1.92490 In the summary(fit), I understand that the exp(beta) (here exp(Effect)) is the survival time ratio, and not the hazard ratio. The survival time ratio for the "Lev+5FU" group compared to the reference group is 1.632800 (i.e. exp(0.490290)); hence, my interpretation is that individuals in the "Lev+5FU" group have an expected survival time approximately 1.632800 times longer than those in “Obs” group. Alternatively, being in the “Lev+5FU” increases the time to colon cancer by 63%, as compared to “Obs” group. To obtain the hazard ratio this would be taking exp(-0.490290) = 0.612. So patients in the "Lev+5FU" has lower risk of getting colon cancer as compared to “Obs” patients. Question(1): is it possible to change the code to output hazard ratios instead?
Model diagnostics
Question(2): do you look at the raw data, e.g. Kaplan-Meier (KM) Curves or look at the residuals to assess whether exponential survival model is appropriate?
Plot
survplot(fit, rx) Running the code gives a plot of estimated survival curves (below) for different treatment groups. Question(3): How can I revise the code so that I can get the cumulative incidence curve instead? 
Other estimands (Median and Incidence rate)
I tried to get the median survival time and followed the solution here but the 2nd line of code gives an error. Questions(4): How to solve this?
QFUN <- Quantile(fit) pred.med.surv <- QFUN(predict(fit, type = "lp")) # Error in exp(lp) : non-numeric argument to mathematical function It works for Weibull distribution. Question(5): But how do I understand the results, why are there predicted median survival for each observation? How do I then get the predicted median survival for each group instead?
I understand that the estimated hazard is the # colon cancer / # total individuals i.e. incidence rate. Also, hazard is 1/mean. Question(6): How do I obtain the incidence rate per 1000 person-years.
Thanks for reading this question. I appreciate any replies. Thanks!
