
I have the log-likelihood function: $$l(\overrightarrow\beta)=\sum_{i=1}^n \left[y_i \log\bigl(p(\overrightarrow x_i;\overrightarrow\beta)\bigr)+(1-y_i)\log\bigl(1-p(\overrightarrow x_i;\overrightarrow\beta)\bigr)\right] $$

where $p(\overrightarrow x_i;\overrightarrow\beta)=\frac{e^{\overrightarrow\beta^T\overrightarrow x_i}}{1+e^{\overrightarrow\beta^T\overrightarrow x_i}}$, $\overrightarrow\beta=(0,\beta_1)^T$ is the parameter vector, and $\overrightarrow x$ is the matrix of inputs, whose first column is all 1's.

The two classes are $y_i=0$ or $1$, and since there is a single binary regressor, $\overrightarrow x$ is an $n\times 2$ matrix whose second-column entries are $0$ or $1$, so each row $\overrightarrow x_i$ is either $(1,0)$ or $(1,1)$.

Additionally, $n_{1,0}$ denotes the number of observations with $x_i=1$ and $y_i=0$, and $n_{1,1}$ denotes the number of observations with $x_i=1$ and $y_i=1$.

The maximum likelihood estimator of $\beta_1$ is claimed to be $\log\frac{n_{1,1}}{n_{1,0}}$, but I can't see why that's the case. I know how to find the first derivative of the log-likelihood function: $$\frac{\partial l(\overrightarrow \beta)}{\partial\overrightarrow \beta}=\sum_{i=1}^n \overrightarrow x_i \bigl(y_i-p(\overrightarrow x_i;\overrightarrow\beta)\bigr)\tag{$*$}$$

I know that for maximization we would set this equal to zero, and I can see that ($*$) breaks into two component equations since $\overrightarrow x_i= (1,1)$ or $(1,0)$. For the first component we arrive at $\sum_{i=1}^n y_i = \sum_{i=1}^n p(\overrightarrow x_i;\overrightarrow\beta)$, but I'm not sure what the next step might be to arrive at the given result.


1 Answer


Since there's only one regressor $x$ and there's no intercept in the model, you can treat each $x_i$ as a scalar instead of a vector, and regard $\beta=\beta_1$ as a scalar instead of a vector. So I'll drop the vector notation from now on. Set the expression $ \sum_i[x_i(y_i-p(x_i;\beta))] $ to zero. This yields $$ \sum x_iy_i = \sum x_i p(x_i;\beta)\tag1 $$ The LHS of (1) simplifies to $n_{1,1}$ since the terms where $x_i=0$ or $y_i=0$ don't contribute.

Similarly the RHS of (1) simplifies to $$\sum_{x_i=1} p(x_i;\beta)= \#\{x_i=1\}\cdot p(1;\beta) = (n_{1,0} + n_{1,1}) e^{\beta_1}/(1+e^{\beta_1}) .$$

With these simplifications, (1) becomes $n_{1,1}=(n_{1,0}+n_{1,1})\,\frac{e^{\beta_1}}{1+e^{\beta_1}}$, which rearranges to $n_{1,1}=n_{1,0}e^{\beta_1}$, so $e^{\beta_1}=n_{1,1}/n_{1,0}$ and $\hat\beta_1=\log\frac{n_{1,1}}{n_{1,0}}$.
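If it helps, here is a minimal numerical sketch (not part of the original derivation): it simulates data from the no-intercept model, maximizes the log-likelihood over $\beta_1$ with scipy, and compares the result with the closed form $\log(n_{1,1}/n_{1,0})$. The sample size, seed, and true coefficient are arbitrary choices for illustration.

```python
# Sanity check: numerical MLE of beta_1 vs. the closed form log(n11 / n10).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=1000)                    # binary regressor
p_true = np.exp(0.7 * x) / (1 + np.exp(0.7 * x))     # true beta_1 = 0.7 (arbitrary), intercept fixed at 0
y = rng.binomial(1, p_true)                          # binary response

n11 = np.sum((x == 1) & (y == 1))
n10 = np.sum((x == 1) & (y == 0))

def neg_loglik(beta1):
    # negative of the log-likelihood l(beta) with the intercept fixed at 0
    p = np.exp(beta1 * x) / (1 + np.exp(beta1 * x))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

beta_numeric = minimize_scalar(neg_loglik).x
beta_closed = np.log(n11 / n10)
print(beta_numeric, beta_closed)   # the two agree up to optimizer tolerance
```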

Added: If your model had two parameters, say $\vec\beta=(\beta_1,\beta_2)$, for the intercept and the binary regressor $x_i$ (sorry, the meaning of $\beta_1$ has changed), then your equation ($*$) would split into two equations for the two unknowns $\beta_1$ and $\beta_2$. The $k$th equation would involve the column $k$ of the $x$ matrix: $$\sum x_{i,k}y_i=\sum x_{i,k} p(\vec x_i, \vec\beta).\tag2$$

The equation for column $2$ would be simplified as before using $n_{1,1}$ and $n_{1,0}$: $$ n_{1,1} = (n_{1,0}+n_{1,1})p((1,1),\vec\beta)=(n_{1,0}+n_{1,1})\frac{e^{\beta_1+\beta_2}}{1+e^{\beta_1+\beta_2}}\tag3 $$ As for the intercept column, substitute $x_{i,1}=1$ for all $i$ to get: $$ \sum y_i =\sum p(\vec x_i, \vec\beta).\tag4 $$ The LHS would involve only cases where $y_i=1$, and the RHS would break into one sum where $x_i=0$ and one sum where $x_i=1$: $$ n_{0,1}+n_{1,1}=(n_{0,0}+n_{0,1})\frac{e^{\beta_1}}{1+e^{\beta_1}} +(n_{1,0}+n_{1,1})\frac{e^{\beta_1+\beta_2}}{1+e^{\beta_1+\beta_2}}\tag5 $$
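Just to spell out the algebra (this is not in the original answer, but follows from (3) and (5) above): equation (3) rearranges as before to $e^{\beta_1+\beta_2}=n_{1,1}/n_{1,0}$, and subtracting (3) from (5) leaves $n_{0,1}=(n_{0,0}+n_{0,1})\frac{e^{\beta_1}}{1+e^{\beta_1}}$, i.e. $e^{\beta_1}=n_{0,1}/n_{0,0}$. Hence $$\hat\beta_1=\log\frac{n_{0,1}}{n_{0,0}},\qquad \hat\beta_2=\log\frac{n_{1,1}}{n_{1,0}}-\log\frac{n_{0,1}}{n_{0,0}}=\log\frac{n_{1,1}\,n_{0,0}}{n_{1,0}\,n_{0,1}},$$ so the coefficient of the binary regressor is the familiar log odds ratio.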

  • Thank you very much, that makes things clearer. If I were looking at a similar problem where the model had an intercept, what could I do instead of treating $\beta$ and $x$ as scalars at (1)? I'm assuming I'd have four cases to consider: $n_{1,1}$, $n_{1,0}$, $n_{0,0}$, and $n_{0,1}$. — Commented Oct 15, 2018 at 22:08
  • @sk13 The MLE for the vector $\beta$ would involve the four possible $n$ values. See my edit. — Commented Oct 15, 2018 at 23:26
