
Summary

I am looking for a convex and robust formulation to fit an ellipse to a set of points.
Specifically, one that can handle an extreme condition number of the Scatter Matrix.

Full Question

The Algebraic Representation of an Ellipse for a point $\left( x, y \right)$ on the ellipse is given by:

$$ a {x}^{2} + b x y + c {y}^{2} + d x + e y + f = 0 \Leftrightarrow \boldsymbol{x}^{\top} \boldsymbol{A} \boldsymbol{x} + \boldsymbol{b}^{\top} \boldsymbol{x} + f = 0, \quad \boldsymbol{A} = \begin{bmatrix} a & \frac{b}{2} \\ \frac{b}{2} & c \end{bmatrix}, \; \boldsymbol{b} = {\left[ d, e \right]}^{\top}, \; \boldsymbol{x} = {\left[ x, y \right]}^{\top} $$

For a set of points $\left\{ {\left[ {x}_{i}, {y}_{i} \right]}^{T} \right\}_{i = 1}^{n}$ one could build the Scatter Matrix:

$$ \boldsymbol{D} = \begin{bmatrix} \text{---} & \boldsymbol{d}_{1} & \text{---} \\ \text{---} & \boldsymbol{d}_{2} & \text{---} \\ \text{---} & \vdots & \text{---} \\ \text{---} & \boldsymbol{d}_{n} & \text{---} \end{bmatrix}, \; \boldsymbol{d}_{i} = {\left[ {x}_{i}^{2}, {x}_{i} {y}_{i}, {y}_{i}^{2}, {x}_{i}, {y}_{i}, 1 \right]} $$

Most algorithms are variations of:

$$ \arg \min_{\boldsymbol{p}} \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2} $$

The solution is defined only up to scaling, hence an additional constraint is added.
Usually $\boldsymbol{p}^{\top} \boldsymbol{C} \boldsymbol{p} = 1 \Leftrightarrow 4 a c - {b}^{2} = 1$, which also forces the Matrix $\boldsymbol{A}$ to represent an Ellipse (Positive Definite).

So the problem becomes:

$$ \begin{align*} \arg \min_{\boldsymbol{p}} \quad & \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2} \\ \text{subject to} \quad & \begin{aligned} \boldsymbol{p}^{\top} \boldsymbol{C} \boldsymbol{p} & = 1 \\ \end{aligned} \end{align*}, \quad \boldsymbol{C} = \begin{bmatrix} 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$
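For reference, the classic approach solves this via the generalized eigenvalue problem $\boldsymbol{D}^{\top} \boldsymbol{D} \boldsymbol{p} = \lambda \boldsymbol{C} \boldsymbol{p}$ (Fitzgibbon et al.). A minimal numpy / scipy sketch (the function name is mine):

```python
import numpy as np
from scipy.linalg import eig

def fit_ellipse_direct(x, y):
    # Classic direct fit: solve (D^T D) p = lambda * C p and keep the
    # eigenvector belonging to the single positive generalized eigenvalue.
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    S = D.T @ D
    C = np.zeros((6, 6))
    C[0, 2] = C[2, 0] = 2.0   # encodes the constraint 4ac - b^2 = 1
    C[1, 1] = -1.0
    w, V = eig(S, C)          # C is singular, so some eigenvalues come out inf
    w = np.real(w)
    idx = np.where(np.isfinite(w) & (w > 0))[0][0]
    return np.real(V[:, idx])
```

In exact arithmetic there is a single positive generalized eigenvalue; numerically, this step inherits the conditioning of $\boldsymbol{D}^{\top} \boldsymbol{D}$, whose condition number is the square of that of $\boldsymbol{D}$.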

This formulation is not Convex (the equality constraint is not linear).

Yet, this model will fail even for the simple case of 100 points with no noise at all.
The reason is the condition number of $\boldsymbol{D}$.
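A tiny numpy illustration of the issue (the numbers are arbitrary): an ellipse far from the origin makes the quadratic columns of $\boldsymbol{D}$ dwarf the linear and constant ones.

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 100)
x = 5.0e3 + 3.0e2 * np.cos(t)   # a large center offset inflates the x^2, xy, y^2 columns
y = 2.0e3 + 1.0e2 * np.sin(t)
D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
print(np.linalg.cond(D))        # enormous even though the data is noise free
```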

Are there formulations which are more robust?
Specifically, is there a Convex formulation of the problem?

Maybe a different polynomial basis? I tried Chebyshev Polynomials, yet it reduced the condition number by only a small fraction.

Comments:

  • Have you considered RANSAC? Commented Jul 21 at 18:22
  • @user619894, RANSAC solves the issue of outliers in the measurements. The formulation I'm after deals with the case where the Scatter Matrix has a high condition number. Not the same. Commented Jul 21 at 19:24
  • If you select a small number of points (a la RANSAC), the matrix won't be near singular (and if it is, just reject and try again). Then look for the best model among all selections. Commented Jul 22 at 6:39
  • @user619894, Then I lose the accuracy of having more data. Commented Jul 22 at 13:10

2 Answers


The formulation I came up with is as follows:

$$ \begin{align*} \arg \min_{\boldsymbol{p}} \quad & \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2} \\ \text{subject to} \quad & \begin{aligned} \boldsymbol{A} & \in \mathbb{S}_{+}^{2} \\ \operatorname{tr} \left( \boldsymbol{A} \right) & = 1 \\ \end{aligned} \end{align*} $$

Where $\boldsymbol{A} = \begin{bmatrix} {p}_{1} & \frac{{p}_{2}}{2} \\ \frac{{p}_{2}}{2} & {p}_{3} \end{bmatrix}$.

  • The constraint $\boldsymbol{A} \in \mathbb{S}_{+}^{2}$ means the matrix is SPSD (Symmetric Positive Semi Definite), which forces the solution to be an ellipse or a parabola (see Matrix Representation of Conic Sections).
  • The constraint $\operatorname{tr} \left( \boldsymbol{A} \right) = 1$ solves the scaling issue and guarantees an ellipse, as it forces the sum of the eigenvalues to be 1.

Both constraints are convex and serve the same purpose as the non-convex constraint in the classic problem.

The whole problem is Convex and can be easily solved by any DCP Solver (CVX in MATLAB, CVXPY in Python or Convex.jl in Julia).
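A minimal CVXPY sketch of this formulation (the function name is mine; `D` is the scatter matrix built as in the question):

```python
import cvxpy as cp
import numpy as np

def fit_ellipse_convex(D):
    # min 0.5 * ||D p||_2^2  subject to  A(p) PSD and tr(A(p)) = 1
    p = cp.Variable(6)
    A = cp.bmat([[p[0],     p[1] / 2],
                 [p[1] / 2, p[2]]])
    constraints = [A >> 0, p[0] + p[2] == 1]   # tr(A) = p1 + p3
    cp.Problem(cp.Minimize(0.5 * cp.sum_squares(D @ p)), constraints).solve()
    return p.value
```

CVXPY dispatches the PSD constraint to an SDP capable solver, so the conic constraint needs no manual handling.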

The result looks good even for badly conditioned cases (No noise, condition number of ~1e17).

[Figure: the fitted ellipse on a badly conditioned test case]

I will add an answer which shows how to solve this numerically, and even a more robust formulation.


Robust Ellipse Fit

The Algebraic Representation of an Ellipse for a point $\left( x, y \right)$ on the ellipse is given by:

$$ a {x}^{2} + b x y + c {y}^{2} + d x + e y + f = 0 \Leftrightarrow \boldsymbol{x}^{\top} \boldsymbol{A} \boldsymbol{x} + \boldsymbol{b}^{\top} \boldsymbol{x} + f = 0, \quad \boldsymbol{A} = \begin{bmatrix} a & \frac{b}{2} \\ \frac{b}{2} & c \end{bmatrix}, \; \boldsymbol{b} = {\left[ d, e \right]}^{\top}, \; \boldsymbol{x} = {\left[ x, y \right]}^{\top} $$

For a set of points $\left\{ {\left[ {x}_{i}, {y}_{i} \right]}^{T} \right\}_{i = 1}^{n}$ one could build the Scatter Matrix:

$$ \boldsymbol{D} = \begin{bmatrix} \text{---} & \boldsymbol{d}_{1} & \text{---} \\ \text{---} & \boldsymbol{d}_{2} & \text{---} \\ \text{---} & \vdots & \text{---} \\ \text{---} & \boldsymbol{d}_{n} & \text{---} \end{bmatrix}, \; \boldsymbol{d}_{i} = {\left[ {x}_{i}^{2}, {x}_{i} {y}_{i}, {y}_{i}^{2}, {x}_{i}, {y}_{i}, 1 \right]} $$

A Convex formulation for the Ellipse Fitting model:

$$ \begin{align*} \arg \min_{\boldsymbol{p}} \quad & f \left( \boldsymbol{D} \boldsymbol{p} \right) \\ \text{subject to} \quad & \begin{aligned} \boldsymbol{A} & \in \mathbb{S}_{+}^{2} \\ \operatorname{tr} \left( \boldsymbol{A} \right) & = 1 \\ \end{aligned} \end{align*} $$

Where

  • $\boldsymbol{p} = {\left[ a, b, c, d, e, f \right]}^{\top}$ - The vector of parameters.
  • $\boldsymbol{A} = \begin{bmatrix} {p}_{1} & \frac{{p}_{2}}{2} \\ \frac{{p}_{2}}{2} & {p}_{3} \end{bmatrix}$.
  • The function $f : \mathbb{R}^{n} \to \mathbb{R}$ is a loss function which penalizes large residuals. Conceptually it behaves like $\left\| \boldsymbol{D} \boldsymbol{p} \right\|$ (a robust choice is sketched after this list).
  • The constraint $\boldsymbol{A} \in \mathbb{S}_{+}^{2}$ means the matrix is SPSD (Symmetric Positive Semi Definite) which forces the solution to be an ellipse or parabola (See Matrix Representation of Conic Sections).
  • The constraint $\operatorname{tr} \left( \boldsymbol{A} \right) = 1$ solves the scaling issue and guarantees an ellipse, as it forces the sum of the eigenvalues to be 1.
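For instance, with CVXPY only the objective changes between choices of $f$. A hedged sketch (the Huber threshold `M` is an arbitrary value; `p`, `constraints` and `D` are as in the sketch in the previous answer):

```python
import cvxpy as cp

# Only the loss f changes; the conic constraints stay the same.
loss_ls    = 0.5 * cp.sum_squares(D @ p)      # classic least squares
loss_l1    = cp.norm1(D @ p)                  # robust to outliers
loss_huber = cp.sum(cp.huber(D @ p, M=1.0))   # Huber; M = 1.0 is arbitrary
problem = cp.Problem(cp.Minimize(loss_huber), constraints)
problem.solve()
```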

Define:

  • $\mathcal{A} \left( \boldsymbol{p} \right) = \boldsymbol{A}$ - A linear operator which extracts the matrix $\boldsymbol{A}$ from a vector.
  • $\mathcal{C} = \left\{ \boldsymbol{A} \mid \boldsymbol{A} \in \mathbb{S}_{+}^{2}, \operatorname{tr} \left( \boldsymbol{A} \right) = 1 \right\}$ - The set of matrices which obey the constraints.
  • $\mathcal{A}^{-1} \left( \boldsymbol{A}, \boldsymbol{p} \right)$ - A linear operator which updates the first 3 elements of the vector $\boldsymbol{p}$ such that ${p}_{1} = {A}_{1, 1}, {p}_{2} = 2 {A}_{1, 2}, {p}_{3} = {A}_{2, 2}$.
  • $\boldsymbol{K} = \boldsymbol{D}^{\top} \boldsymbol{D}$ (Can be pre calculated).

Then the problem can be rewritten:

$$ \begin{align*} \arg \min_{\boldsymbol{p}} \quad & f \left( \boldsymbol{D} \boldsymbol{p} \right) \\ \text{subject to} \quad & \begin{aligned} \mathcal{A} \left( \boldsymbol{p} \right) & \in \mathcal{C} \\ \end{aligned} \end{align*} $$

The concept is to look for methods which can handle a large condition number of $\boldsymbol{D}$.
This is done mostly by avoiding steps which require its inverse directly.

Least Squares Based Optimization

The Least Squares variant sets $f$ to be the Least Squares loss function:

$$ \begin{align*} \arg \min_{\boldsymbol{p}} \quad & \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2} \\ \text{subject to} \quad & \begin{aligned} \mathcal{A} \left( \boldsymbol{p} \right) & \in \mathcal{C} \\ \end{aligned} \end{align*} $$

Let $\operatorname{P}_{\mathcal{C}} : \mathbb{S}^{2} \to \mathbb{S}^{2}$ be the projection operator of a symmetric $2 \times 2$ matrix onto $\mathcal{C}$.

Then the problem above can be solved by various methods.

Projection onto the Set $\mathcal{C}$

Given a symmetric matrix $\boldsymbol{B}$ (built from the first 3 elements of the vector in our case):

$$ \begin{align*} \arg \min_{\boldsymbol{A} \in \mathcal{C}} \frac{1}{2} {\left\| \boldsymbol{A} - \boldsymbol{B} \right\|}_{F}^{2} & = \arg \min_{\boldsymbol{A} \in \mathcal{C}} \frac{1}{2} {\left\| \boldsymbol{U}^{T} \boldsymbol{A} \boldsymbol{U} - \operatorname{Diag} \left( \boldsymbol{b} \right) \right\|}_{F}^{2} && \text{Eigen decomposition of $\boldsymbol{B}$} \\ & = \arg \min_{\boldsymbol{A} \in \mathcal{C}} \frac{1}{2} {\left\| \boldsymbol{C} - \operatorname{Diag} \left( \boldsymbol{b} \right) \right\|}_{F}^{2} && \text{Define $\boldsymbol{C} = \boldsymbol{U}^{T} \boldsymbol{A} \boldsymbol{U}$} \\ & = \arg \min_{\boldsymbol{A} \in \mathcal{C}} \sum_{i} {\left( {C}_{i, i} - \boldsymbol{b}_{i} \right)}^{2} + 2 \sum_{i < j} {C}_{i, j}^{2} \end{align*} $$

In order to minimize the sum, the off diagonal elements of $\boldsymbol{C}$ must be zero. In order to do so, the matrix $\boldsymbol{U}$ must also diagonalize the matrix $\boldsymbol{A}$.
Since $\operatorname{tr} \left( \boldsymbol{U}^{T} \boldsymbol{A} \boldsymbol{U} \right) = \operatorname{tr} \left( \boldsymbol{A} \boldsymbol{U} \boldsymbol{U}^{T} \right) = \operatorname{tr} \left( \boldsymbol{A} \right)$ the projection is given by:

$$ \operatorname{P}_{\mathcal{C}} \left( \boldsymbol{B} \right) = \boldsymbol{U} \operatorname{Diag} \left( \hat{\boldsymbol{b}} \right) \boldsymbol{U}^{T} $$

Where

  • $\boldsymbol{B} = \boldsymbol{U} \operatorname{Diag} \left( \boldsymbol{b} \right) \boldsymbol{U}^{T}$ - The Eigen Decomposition of $\boldsymbol{B}$.
  • $\hat{\boldsymbol{b}}$ - The projection of $\boldsymbol{b}$ onto the Unit Simplex.

In order to apply the solution to the parameter vector, the result of the projection should be inserted back into the first 3 elements of the vector (this is the operator $\mathcal{A}^{-1}$ defined above).
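A numpy sketch of $\operatorname{P}_{\mathcal{C}}$ (the simplex projection follows the standard sort based algorithm; function names are mine):

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the unit simplex {x : x >= 0, sum(x) = 1}
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def project_C(B):
    # P_C(B): eigen-decompose, project the eigenvalues onto the simplex, rebuild
    w, U = np.linalg.eigh(B)
    return U @ np.diag(project_simplex(w)) @ U.T
```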

Solution by (Accelerated) Projected Gradient Descent

With $f = \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2}$ the Projected Gradient Descent iterations are:

  • Set $\boldsymbol{p}_{0}$.
  • For $k = 1, 2, \ldots, K$:
    1. $\boldsymbol{g}_{k} = {\nabla}_{\boldsymbol{p}} f \left( \boldsymbol{p}_{k - 1} \right)$.
    2. $\boldsymbol{p}_{k} = \boldsymbol{p}_{k - 1} - \mu \boldsymbol{g}_{k}$.
    3. $\boldsymbol{p}_{k} = \mathcal{A}^{-1} \left( \operatorname{P}_{\mathcal{C}} \left( \mathcal{A} \left( \boldsymbol{p}_{k} \right) \right), \boldsymbol{p}_{k} \right)$.
  • Return $\boldsymbol{p}_{K}$.

Remarks:

  • ${\nabla}_{\boldsymbol{p}} f \left( \boldsymbol{p} \right) = \boldsymbol{K} \boldsymbol{p}$.
  • For convergence the step size should satisfy $\mu \leq \frac{1}{ {\lambda}_{\max} \left( \boldsymbol{K} \right) }$.
  • One may use an accelerated version of the algorithm.
  • One may check for convergence conditions on each iteration (see the sketch below).
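A numpy sketch of the iteration (reusing `project_C()` from above; the iteration count and starting point are arbitrary choices):

```python
import numpy as np

def fit_ellipse_pgd(D, n_iter=20000):
    # Projected Gradient Descent on 0.5 * ||D p||_2^2 over the set C
    K = D.T @ D                                    # pre-calculated
    mu = 1.0 / np.linalg.norm(K, 2)                # step size = 1 / lambda_max(K)
    p = np.array([0.5, 0.0, 0.5, 0.0, 0.0, 0.0])   # feasible start: A = I / 2
    for _ in range(n_iter):
        p = p - mu * (K @ p)                       # gradient step
        A = np.array([[p[0], p[1] / 2.0],
                      [p[1] / 2.0, p[2]]])
        Ah = project_C(A)                          # projection step
        p[0], p[1], p[2] = Ah[0, 0], 2.0 * Ah[0, 1], Ah[1, 1]
    return p
```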

Solution by Separable Prox Method (Douglas Rachford Splitting)

The Douglas Rachford Splitting method (see Proximal Splitting Methods in Signal Processing, the Douglas Rachford Splitting chapter) solves the optimization model:

$$ \arg \min_{\boldsymbol{p}} f \left( \boldsymbol{p} \right) + g \left( \boldsymbol{p} \right) $$

The solver iterations:

  • Set $\boldsymbol{q}_{0}, \gamma > 0, 0 < \lambda < 1$.
  • For $k = 1, 2, \ldots, K$:
    1. $\boldsymbol{p}_{k} = \operatorname{prox}_{\gamma f} \left( \boldsymbol{q}_{k - 1} \right)$.
    2. $\boldsymbol{q}_{k} = \boldsymbol{q}_{k - 1} + \lambda \left( \operatorname{prox}_{\gamma g} \left( 2 \boldsymbol{p}_{k} - \boldsymbol{q}_{k - 1} \right) - \boldsymbol{p}_{k} \right)$.
  • Return $\operatorname{prox}_{\gamma g} \left( \boldsymbol{q}_{K} \right)$.

Remarks:

  • One may use ${\lambda}_{k}$, adapting its value per iteration.
  • One may check for convergence conditions on each iteration.
  • The last operation, $\operatorname{prox}_{\gamma g} \left( \boldsymbol{q}_{K} \right)$, ensures the output obeys the constraints.

Setting:

  • $f \left( \boldsymbol{p} \right) = \frac{1}{2} {\left\| \boldsymbol{D} \boldsymbol{p} \right\|}_{2}^{2}$ - The objective function.
  • $g \left( \boldsymbol{p} \right) = {I}_{\mathcal{C}} \left( \mathcal{A} \left( \boldsymbol{p} \right) \right)$ - The indicator function of the constraint set.

The Proximal Operator of each function is given by:

$$ \operatorname{prox}_{\gamma f} \left( \boldsymbol{y} \right) = {\left( \gamma \boldsymbol{K} + \boldsymbol{I} \right)}^{-1} \boldsymbol{y}, \; \operatorname{prox}_{\gamma g} \left( \boldsymbol{y} \right) = \mathcal{A}^{-1} \left( \operatorname{P}_{\mathcal{C}} \left( \mathcal{A} \left( \boldsymbol{y} \right) \right), \boldsymbol{y} \right) $$

Remarks:

  • The matrix $\gamma \boldsymbol{K} + \boldsymbol{I}$ is an SPD matrix, hence its Cholesky Decomposition can be pre calculated to accelerate the solution of the linear system. Adding the Identity Matrix reduces the condition number and stabilizes the solve. Choosing $\gamma < 1$ further regularizes the condition number. A sketch of the full solver follows.
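A numpy / scipy sketch of the whole solver (reusing `project_C()` from above; `gamma`, `lam` and the iteration count are arbitrary choices within the stated ranges):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def prox_g(y):
    # prox of the indicator: project the A block onto C, keep d, e, f unchanged
    y = y.copy()
    A = np.array([[y[0], y[1] / 2.0],
                  [y[1] / 2.0, y[2]]])
    Ah = project_C(A)
    y[0], y[1], y[2] = Ah[0, 0], 2.0 * Ah[0, 1], Ah[1, 1]
    return y

def fit_ellipse_dr(D, gamma=0.5, lam=0.9, n_iter=5000):
    # Douglas Rachford splitting on f + g with a pre-factorized SPD system
    K = D.T @ D
    chol = cho_factor(gamma * K + np.eye(6))       # Cholesky, computed once
    q = np.array([0.5, 0.0, 0.5, 0.0, 0.0, 0.0])
    for _ in range(n_iter):
        p = cho_solve(chol, q)                     # prox_{gamma f}(q)
        q = q + lam * (prox_g(2.0 * p - q) - p)
    return prox_g(q)                               # enforce the constraints
```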
