$\begingroup$

I am comparing the survival of two unequally sized groups:

  • Group A, n = 10,000

  • Group B, n = 50

The analysis is controlled for three variables: p1, p2 and p3. As p3 violates the assumption of proportional hazards, I tried two options to overcome the problem.

  1. rcs(p3) - restricted cubic splines did not solve the problem
  2. strata(p3_binned) - I binned p3 into four bins at its quartiles. This solves the problem; however, can I use it when group B's sample size is small and I have 4 predictors in the model?

The model looks as follows:

S ~ group + p1 + p2 + strata(p3_binned)

Edit 2: p3 has four bins. The number of group B patients and events in each bin is as follows:

  • 20 patients, 4 had an event

  • 8 patients, 5 had an event

  • 13 patients, 8 had an event

  • 9 patients, 6 had an event
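
For reference, these counts can be tallied with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the numbers are just those listed above. The bin sizes sum to 50, matching the size of group B.

```python
# Counts per p3 bin, copied from the list above: (patients, events)
bins = [(20, 4), (8, 5), (13, 8), (9, 6)]

total_patients = sum(n for n, _ in bins)
total_events = sum(d for _, d in bins)

# Per-bin event proportion, then the overall tally
for i, (n, d) in enumerate(bins, start=1):
    print(f"bin {i}: {d}/{n} events ({d / n:.0%})")
print(f"total: {total_events}/{total_patients} events")
```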

$\endgroup$
  • $\begingroup$ How many events among individuals in Group B within each of the bins of p3? $\endgroup$ Commented Apr 6, 2021 at 13:23
  • $\begingroup$ Sorry, I misread your comment at first. The correct patient numbers, including events, are given at the end of the original post. $\endgroup$ Commented Apr 9, 2021 at 9:27

1 Answer

$\begingroup$

You have at least some events in each stratum for your small group of interest, so there's no reason not to proceed this way. Stratification does come at a cost, though. Quoting from Therneau and Grambsch, page 45:

The major advantage of stratification is that it gives the most general adjustment for a confounding variable. Disadvantages are that no direct estimate of the importance of the strata effect is produced (no p-value), and that the precision of estimated coefficients and the power of hypothesis tests may be diminished if there are a large number of strata.

If you could get away with fewer strata while still satisfying PH, you might therefore be better off. If the precision of the coefficient estimates you have obtained is acceptable, then stratification per se shouldn't be a problem.
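
To make the trade-off concrete, here is a sketch of what a stratified Cox model fits, in standard notation that is not taken from the question (the symbols $S$, $D_s$, $R_s$ and $h_{0s}$ are my own labels, and tied event times are ignored). Each stratum $s$ gets its own unspecified baseline hazard, while the coefficients $\beta$ are shared:

$$
h_s(t \mid x) = h_{0s}(t)\,\exp(x^\top \beta), \qquad s = 1, \dots, S
$$

$$
\ell(\beta) = \sum_{s=1}^{S} \ell_s(\beta), \qquad
\ell_s(\beta) = \sum_{i \in D_s} \left[ x_i^\top \beta - \log \sum_{j \in R_s(t_i)} \exp(x_j^\top \beta) \right]
$$

Here $D_s$ is the set of individuals with an event in stratum $s$ and $R_s(t_i)$ is the set of individuals in stratum $s$ still at risk at time $t_i$. Because the risk sets are restricted to one stratum at a time, every additional stratum makes them smaller, which is where the potential loss of precision comes from.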

I do suspect, however, that there will be questions raised about how you are distinguishing a Group of only 50 individuals from a much larger Group of 10,000, and whether those Groups might differ in important outcome-associated predictors beyond those you included in your model. That's outside the scope of this question, however.

$\endgroup$
  • $\begingroup$ Will only strata with events in both treatment groups contribute to the effect estimation? That could explain the diminished power. $\endgroup$ Commented Feb 14, 2023 at 18:09
  • $\begingroup$ @hehe that would be a problem if there were a group-by-strata interaction term. Then the lack of an event in a group/stratum combination can lead to lack of convergence or enormous standard errors of coefficients. Without that interaction, the score equation is based on the sum of log-partial-likelihoods over all strata. See page 45 of Therneau and Grambsch. So, without an interaction between group and strata, you will still get a coefficient estimate for group provided that each group has an event. The standard error of the estimate might be higher, however, with stratification. $\endgroup$ Commented Feb 14, 2023 at 19:02
  • $\begingroup$ Thanks a lot! That makes sense. What is the intuitive reason that many strata will result in reduced precision, given that almost all the events will still be counted in the stratified Cox model? $\endgroup$ Commented Feb 15, 2023 at 2:34
  • $\begingroup$ @hehe the events are all counted in the stratified model, but when you solve the score equation the covariates of a case having the event are only compared against those still at risk in that stratum, not against all at-risk individuals. That might widen the log-partial-likelihood profiles for the coefficients. Therneau and Grambsch note that precision and power "may be diminished if there are a large number of strata." So your sense that it isn't usually a big problem is correct, with the typical small number of strata. $\endgroup$ Commented Feb 15, 2023 at 15:09
