Since the article you linked focuses on linear regression, I will answer assuming that $f(x_{i},\beta) = x_{i}^{T}\beta$, for simplicity.
The simplest model, under homoskedasticity, is $y_{i} = x_{i}^{T}\beta + \varepsilon_{i}$, $i = 1, 2, \dots, n$, with $E[\varepsilon_{i}] = 0$ and $E[\varepsilon_{i}^{2}] = \sigma^{2}$ (if you want, you can make these expectations conditional on $X$, but for simplicity I will treat the regressors $X$ as fixed). In matrix notation this is $Y = X\beta + \varepsilon$ with $E[\varepsilon] = 0$ and $E[\varepsilon\varepsilon^{T}] = \Sigma$, where $\Sigma = \sigma^{2}I$ and $I$ is the identity matrix of size $n$. "Constant variance" refers to the fact that the true errors $\varepsilon_{i}$ all have the same variance $\sigma^{2}$ under this model. If this is the case, then OLS, which minimises $$ SSE(\beta) = \sum_{i=1}^{n}(y_{i} - x_{i}^{T}\beta)^{2} = (Y-X\beta)^{T}(Y-X\beta), $$ is the best linear unbiased estimator (BLUE), in the sense that $\mathrm{var}(\hat{\beta})$ is smallest among the class of linear unbiased estimators (this is the Gauss–Markov theorem).
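As a minimal numerical sketch of the homoskedastic case (simulated data, numpy only; the sample size, coefficients, and variable names are illustrative assumptions, not taken from the linked article): OLS minimising the $SSE$ above has the closed-form solution $\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$.

```python
import numpy as np

# Illustrative simulated data: n observations, intercept plus 2 regressors.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])

# Homoskedastic errors: every epsilon_i has the same variance sigma^2 = 1.
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# OLS minimises SSE(beta) = (Y - X beta)^T (Y - X beta);
# solve the normal equations X^T X beta = X^T Y instead of inverting X^T X.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to beta_true
```

Solving the normal equations (rather than computing an explicit inverse) is the standard, numerically preferable way to get the same estimator.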
Suppose now that you have heteroskedasticity. The model is the same as above, except that now $E[\varepsilon\varepsilon^{T}] = \mathrm{diag}(\sigma_{1}^{2}, \sigma^{2}_{2}, \dots, \sigma_{n}^{2})$, so each error $\varepsilon_{i}$ has a different variance, but the errors are still uncorrelated. In this case OLS is no longer optimal (it remains unbiased, but is no longer minimum-variance), and the weighted least squares estimator, which minimises $\sum_{i=1}^{n}(y_{i} - x_{i}^{T}\beta)^{2}/\sigma_{i}^{2}$, is now BLUE.
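A sketch of the comparison, assuming for illustration that the true $\sigma_{i}^{2}$ are known (they rarely are in practice; see the feasible version below in the text) and that the error standard deviation grows with the regressor:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(1, 5, size=n)])
beta_true = np.array([1.0, 2.0])

# Heteroskedastic errors: sigma_i proportional to the regressor value.
sigma = 0.5 * X[:, 1]
y = X @ beta_true + rng.normal(scale=sigma)

# WLS minimises sum_i (y_i - x_i^T beta)^2 / sigma_i^2, i.e. it weights
# each observation by the inverse of its (here assumed known) variance:
# beta_wls = (X^T W X)^{-1} X^T W Y with W = diag(1/sigma_i^2).
w = 1.0 / sigma**2
XtW = X.T * w                      # broadcasting: multiplies column i by w_i
beta_wls = np.linalg.solve(XtW @ X, XtW @ y)

# OLS for comparison: still unbiased, but with larger variance.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
```

Both estimators are unbiased here; the Gauss–Markov result says the WLS estimator has the smaller sampling variance under this error structure.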
In-sample you will of course never find that things match up perfectly with your model assumptions; even under homoskedasticity the squared residuals $e_{i}^{2} = (y_{i} - x_{i}^{T}\hat{\beta})^{2}$ will not all be exactly the same. But you can run formal tests for heteroskedasticity (see e.g. here). Similar tests exist for autocorrelation (i.e. a non-diagonal covariance matrix $\Sigma$), in which case you should use generalized least squares.
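One such test can be sketched by hand: the studentized (Koenker) variant of the Breusch–Pagan test regresses the squared OLS residuals on the regressors and uses $n R^{2}$ from that auxiliary regression, which is asymptotically $\chi^{2}$ under the null of homoskedasticity (degrees of freedom = number of auxiliary regressors excluding the intercept). The function name and simulated data below are my own illustration:

```python
import numpy as np

def breusch_pagan_lm(X, y):
    """Koenker's studentized Breusch-Pagan LM statistic: n * R^2 from
    the auxiliary regression of squared OLS residuals on X."""
    n = len(y)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    e2 = (y - X @ beta) ** 2
    gamma = np.linalg.solve(X.T @ X, X.T @ e2)   # auxiliary regression
    fitted = X @ gamma
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2

rng = np.random.default_rng(2)
n = 1000
X = np.column_stack([np.ones(n), rng.uniform(1, 5, size=n)])
beta_true = np.array([1.0, 2.0])
y_hom = X @ beta_true + rng.normal(size=n)                    # constant variance
y_het = X @ beta_true + rng.normal(scale=X[:, 1], size=n)     # variance grows with x

lm_hom = breusch_pagan_lm(X, y_hom)
lm_het = breusch_pagan_lm(X, y_het)
# Under the null, the statistic here is approximately chi^2 with 1 df,
# so lm_het should be much larger than lm_hom.
```

In practice you would use a packaged implementation (e.g. `statsmodels.stats.diagnostic.het_breuschpagan`) rather than rolling your own.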
Since in practice you never know the true $\sigma_{i}^{2}$ under heteroskedasticity, there is a procedure called feasible weighted/generalized least squares (see here), which estimates the weights from the data and then applies weighted least squares with those estimated weights.
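A sketch of one common feasible WLS recipe (this particular variance specification, $\log \sigma_{i}^{2}$ linear in $x_{i}$, is an assumption chosen for illustration; other skedastic models are equally valid):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
X = np.column_stack([np.ones(n), rng.uniform(1, 4, size=n)])
beta_true = np.array([2.0, -1.0])
sigma = np.exp(0.5 * X[:, 1])          # true (unknown to the analyst) sigma_i
y = X @ beta_true + rng.normal(scale=sigma)

# Step 1: OLS to obtain residuals e_i.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_ols

# Step 2: model the variance, here by regressing log(e_i^2) on x_i
# (an assumed specification that guarantees positive fitted variances).
gamma = np.linalg.solve(X.T @ X, X.T @ np.log(e**2))
sigma2_hat = np.exp(X @ gamma)

# Step 3: weighted least squares with the estimated inverse variances.
w = 1.0 / sigma2_hat
XtW = X.T * w
beta_fgls = np.linalg.solve(XtW @ X, XtW @ y)
```

Because the weights are estimated, feasible WLS loses the exact BLUE property of WLS with known weights, but it is typically much more efficient than OLS when the variance model is roughly right.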