Source Link
user78229

One of the best and most thorough explanations of "quasi-complete separation" issues in maximum likelihood is Paul Allison's paper. He's writing about SAS software, but the issues he addresses are generalizable to any software:

  • Complete separation occurs whenever a linear function of x can generate perfect predictions of y.

  • Quasi-complete separation occurs when (a) there exists some coefficient vector b such that bxᵢ ≥ 0 whenever yᵢ = 1, and bxᵢ ≤ 0 whenever yᵢ = 0, and (b) this equality holds for at least one case in each category of the dependent variable. In the simplest case, for any dichotomous independent variable in a logistic regression, if there is a zero in the 2 × 2 table formed by that variable and the dependent variable, the ML estimate for the regression coefficient does not exist.
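
The zero-cell condition in the last bullet is easy to check directly before fitting. A minimal sketch (the `has_zero_cell` helper is illustrative, not from Allison's paper):

```python
# Detect the simplest form of quasi-complete separation:
# an empty cell in the 2x2 table of a dichotomous predictor vs. the outcome.
from collections import Counter

def has_zero_cell(x, y):
    """Return True if the 2x2 table of x (0/1) vs y (0/1) has an empty cell."""
    counts = Counter(zip(x, y))  # missing cells count as 0
    return any(counts[(xi, yi)] == 0 for xi in (0, 1) for yi in (0, 1))

# Here x = 1 never occurs together with y = 0, so the (1, 0) cell is empty
# and the ML estimate for x's coefficient does not exist.
x = [0, 0, 0, 1, 1, 0]
y = [0, 1, 0, 1, 1, 1]
print(has_zero_cell(x, y))  # True
```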

Allison discusses many of the solutions already mentioned, including deletion of problem variables, collapsing categories, doing nothing, exact logistic regression, Bayesian estimation, and penalized maximum likelihood estimation.
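
Of those remedies, penalized maximum likelihood is easy to illustrate numerically. The sketch below uses a plain L2 (ridge) penalty as a simpler stand-in for the Firth/Jeffreys-prior penalty Allison actually describes; `fit_logistic` is an illustrative helper, not anyone's library API. With a separated predictor, the unpenalized slope keeps drifting toward infinity as iterations increase, while the penalized slope settles at a finite value:

```python
# Sketch: L2-penalized maximum likelihood for one-predictor logistic regression
# (ridge penalty as a simple stand-in for Firth's Jeffreys-prior penalty).
import math

def fit_logistic(x, y, lam=0.0, lr=0.1, iters=5000):
    """Gradient ascent on the (optionally L2-penalized) log-likelihood.
    Returns (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p
            g1 += (yi - p) * xi
        b0 += lr * g0
        b1 += lr * (g1 - lam * b1)  # the penalty term shrinks the slope
    return b0, b1

# x = 1 always co-occurs with y = 1: quasi-complete separation.
x = [0, 0, 0, 0, 1, 1, 1]
y = [0, 0, 1, 1, 1, 1, 1]
_, slope_ml = fit_logistic(x, y, lam=0.0)   # drifts upward without bound
_, slope_pen = fit_logistic(x, y, lam=1.0)  # converges to a finite value
print(slope_ml, slope_pen)
```

Firth's method replaces the ridge term with a penalty based on the Fisher information, which is what SAS's `FIRTH` option and R's logistf implement, but the qualitative effect shown here is the same: the penalty keeps the estimate finite under separation.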

http://www2.sas.com/proceedings/forum2008/360-2008.pdf
