$\newcommand{\I}{\mathbb{I}}\newcommand{\E}{\mathbb{E}}$Given our observed variables $x = \{x_i\}$, hidden variables $z = \{z_i\}$ and distribution parameters $\theta = \{(\mu_j, \sigma_j)\}$, let's define the complete-data log-likelihood as
$$
\begin{align*}
\ell_c(\theta) &= \sum_{i=1}^n \log \Pr[x_i, z_i|\theta] = \sum_{i=1}^n \sum_{j=1}^k \I[z_i = j] \log \Pr[x_i, z_i = j| \theta]\\
&= \sum_{i=1}^n \sum_{j=1}^k \I[z_i = j] \log (\Pr[x_i|\theta_j] \Pr[z_i=j|\theta])
\end{align*}
$$
where $\I[\cdot]$ is the indicator function, $\Pr[x_i|\theta_j]$ is the probability of observing $x_i$ under the $j$th Gaussian, and $\Pr[z_i=j|\theta]$ is the mixing weight of the $j$th component (i.e., the probability that a point is generated by the $j$th Gaussian). This is the model we would have if we somehow knew which Gaussian generated each data point. Since we don't, we instead compute the expected value of this complete-data log-likelihood, taking the expectation over the hidden variables given the data and the previous parameter estimate $\theta^{t-1}$. To do that, define the function $Q(\theta^t|\theta^{t-1})$ as
$$
\begin{align*}
Q(\theta^t|\theta^{t-1}) &= \E\left[\sum_{i=1}^n \sum_{j=1}^k \I[z_i = j] \log \Pr[x_i, z_i = j| \theta]\right]\\
&= \sum_{i=1}^n \sum_{j=1}^k \E[\I[z_i = j] \mid x_i, \theta^{t-1}] \log \Pr[x_i, z_i = j| \theta]\\
&= \sum_{i=1}^n \sum_{j=1}^k \Pr[z_i = j \mid x_i, \theta^{t-1}] \log \Pr[x_i, z_i = j| \theta]\\
&= \sum_{i=1}^n \sum_{j=1}^k \underbrace{\Pr[z_i = j \mid x_i, \theta^{t-1}]}_{r_{ij}} \log (\Pr[x_i|\theta_j]\underbrace{\Pr[z_i = j| \theta]}_{\pi_j})\\
&= \sum_{i=1}^n \sum_{j=1}^k r_{ij} \log \Pr[x_i|\theta_j] + \sum_{i=1}^n \sum_{j=1}^k r_{ij} \log \pi_j.
\end{align*}
$$
Computing $r_{ij}$ for the E-step is a direct application of Bayes' theorem:
$$ r_{ij} = \frac{\pi_j \Pr[x_i|\theta_j^{t-1}]}{\sum_{j'=1}^k \pi_{j'} \Pr[x_i|\theta_{j'}^{t-1}]}.$$

For the M-step we want the mixing weights $\pi$ and parameters $\theta$ that maximize $Q$. Luckily for us, the final formulation above shows that we can optimize each independently! Taking the derivative of $Q$ with respect to $\pi$ under the constraint $\sum_j \pi_j = 1$ (via a Lagrange multiplier) gives
$$ \pi_j = \frac{\sum_{i=1}^n r_{ij}}{n}. $$
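Concretely, the E-step boils down to a few lines of NumPy. Here is a minimal sketch (the function and argument names are my own choices, and `var` holds the component variances $\sigma_j$):

```python
import numpy as np

def e_step(x, pi, mu, var):
    """E-step: responsibilities r[i, j] = Pr[z_i = j | x_i, theta^{t-1}]."""
    x = np.asarray(x, dtype=float)
    # Gaussian densities N(x_i | mu_j, var_j), one column per component
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    joint = pi * dens  # numerator: pi_j * Pr[x_i | theta_j]
    # Normalizing each row is exactly the Bayes' theorem denominator above
    return joint / joint.sum(axis=1, keepdims=True)
```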
To optimize $Q$ with respect to $(\mu_j, \sigma_j)$, let's look at the part that depends only on them (dropping additive constants), where $\sigma_j$ denotes the variance of the $j$th Gaussian:
$$ \sum_{i=1}^n r_{ij} \log \Pr[x_i|\theta_j] = -\frac{1}{2}\sum_{i=1}^n r_{ij} \left(\log \sigma_j + \frac{(x_i - \mu_j)^2}{\sigma_j}\right). $$
Setting the derivative of this expression with respect to each parameter to zero gives
$$ \begin{align*} \mu_j &= \frac{\sum_{i=1}^n r_{ij} x_i}{\sum_{i=1}^n r_{ij}}, &\sigma_j &= \frac{\sum_{i=1}^n r_{ij} x_i^2}{\sum_{i=1}^n r_{ij}} - \mu_j^2. \end{align*} $$
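The corresponding M-step, again as a sketch under the same naming assumptions, implements the $\pi_j$, $\mu_j$ and $\sigma_j$ updates derived above:

```python
import numpy as np

def m_step(x, r):
    """M-step: re-estimate mixing weights, means, and variances from r[i, j]."""
    x = np.asarray(x, dtype=float)
    n_j = r.sum(axis=0)                        # effective number of points per component
    pi = n_j / len(x)                          # pi_j = sum_i r_ij / n
    mu = (r * x[:, None]).sum(axis=0) / n_j    # weighted means
    var = (r * x[:, None] ** 2).sum(axis=0) / n_j - mu ** 2  # weighted variances
    return pi, mu, var
```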
In your case, $x = (1, 3, 4, 5, 7, 8, 9, 13, 14, 15, 16, 18, 23)$, $z = (z_1,\dotsc, z_{13})$, $k=2$, and $\theta = ((\mu_1, \sigma_1),(\mu_2, \sigma_2))$. EM begins by randomly initializing $\pi = (\pi_1, \pi_2)$ and $\theta$, and then alternates between the E-step and the M-step. Once the parameters haven't changed much (or, better, once the observed-data log-likelihood hasn't changed much), you can stop and output the current values of $\theta^t$ and $\pi$.
One benefit of using $|\ell(\theta^t) - \ell(\theta^{t-1})|$ rather than $\|\theta^t - \theta^{t-1}\|$ as a stopping condition is that it doubles as a bug check. EM guarantees that the observed-data log-likelihood increases monotonically, so you can simply check at each iteration that the likelihood is greater than or equal to the previous iteration's. If it isn't, something is wrong.
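In code, that check might look something like the sketch below (again, `pi`, `mu`, `var` are just my names for the current parameter arrays):

```python
import numpy as np

def log_likelihood(x, pi, mu, var):
    """Observed-data log-likelihood: sum_i log sum_j pi_j * N(x_i | mu_j, var_j)."""
    x = np.asarray(x, dtype=float)
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.log((pi * dens).sum(axis=1)).sum()

# Inside the EM loop one might then do something like:
#   ll = log_likelihood(x, pi, mu, var)
#   assert ll >= prev_ll - 1e-9, "log-likelihood decreased: there is a bug"
#   if abs(ll - prev_ll) < 1e-6:
#       break  # converged
#   prev_ll = ll
```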
Finally, one important thing to note is that the MLE for a mixture of Gaussians tends to overfit; in fact the likelihood is unbounded, since a single component can collapse onto one data point and shrink its variance toward zero. One way to overcome this is to incorporate priors on $\pi$ and $\theta$ and compute the MAP estimate instead. This only changes the M-step, but the exact form of the update depends on which prior you choose.
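As one illustration (the choice of prior here is mine, not something implied above): with a symmetric Dirichlet prior $\mathrm{Dir}(\alpha, \dotsc, \alpha)$ on the mixing weights, the only change is that the M-step update for $\pi$ becomes
$$ \pi_j = \frac{\sum_{i=1}^n r_{ij} + \alpha - 1}{n + k(\alpha - 1)}, $$
which recovers the MLE update when $\alpha = 1$.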
EDIT: Since it is clear you are more interested in a working example than in really understanding any of this, here is Python code that will run EM for a basic GMM.
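(The `em` module below is a minimal sketch of what such code might look like, using only NumPy; the `max_iter`, `tol` and `seed` arguments are my own conventions, and since EM starts from a random initialization, the numbers you get can differ from the session shown.)

```python
import numpy as np

def em(x, k, max_iter=1000, tol=1e-6, seed=None):
    """Fit a k-component 1-D Gaussian mixture to x with EM.

    Returns (mixing weights, means, standard deviations) as arrays.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Random initialization: uniform weights, means drawn from the data,
    # variances set to the overall sample variance.
    pi = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)
    var = np.full(k, x.var())

    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: responsibilities r[i, j] = Pr[z_i = j | x_i, theta]
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        joint = pi * dens
        r = joint / joint.sum(axis=1, keepdims=True)

        # Observed-data log-likelihood; EM guarantees it never decreases.
        ll = np.log(joint.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll

        # M-step: re-estimate weights, means, and variances.
        n_j = r.sum(axis=0)
        pi = n_j / n
        mu = (r * x[:, None]).sum(axis=0) / n_j
        # Weighted variance around the new means (equivalent to the E[x^2] - mu^2 form above)
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_j

    return pi, mu, np.sqrt(var)
```

Running this in ipython gives: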
```
In [1]: import em

In [2]: x = (1, 3, 4, 5, 7, 8, 9, 13, 14, 15, 16, 18, 23)

In [3]: em.em(x, 2)
Out[3]:
(array([ 0.49645528,  0.50354472]),
 array([ 15.89753149,   5.10207907]),
 array([ 3.8611819 ,  2.65861623]))
```
where the final output consists of the mixing weights, the mean of each Gaussian, and the standard deviation of each Gaussian.