VividD

I will use the following facts without proof (they either follow directly from the definitions or are straightforward to show). The position vector of the centroid of a set of points is the arithmetic mean of the position vectors of the points in the set. Also, the centroid of the union of two disjoint sets of points lies on the straight line segment connecting the centroids of the two sets, and it divides that segment into two parts whose lengths are inversely proportional to the numbers of points in the sets. Below, $n_A$ and $n_B$ denote the numbers of points in $A$ and $B$, and $\vec{m}_A$, $\vec{m}_B$, $\vec{m}_{A\cup B}$ denote the centroids of $A$, $B$ and $A\cup B$.
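For concreteness, the second fact can be written as a formula. This is only a restatement under the assumption that $A$ and $B$ are disjoint, with $n_A$, $n_B$ the numbers of points and $\vec{m}_A$, $\vec{m}_B$ the centroids:

$$\vec{m}_{A\cup B} = \frac{n_A \vec{m}_{A} + n_B \vec{m}_{B}}{n_A+n_B} = \vec{m}_{A} - \frac{n_B}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)$$

So the distance from $\vec{m}_A$ to $\vec{m}_{A\cup B}$ is $\frac{n_B}{n_A+n_B}\Vert\vec{m}_{A} - \vec{m}_{B}\Vert$, and the distance from $\vec{m}_B$ to $\vec{m}_{A\cup B}$ is $\frac{n_A}{n_A+n_B}\Vert\vec{m}_{A} - \vec{m}_{B}\Vert$.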

Consider the following chain of identities:

$\sum_{i\in A} \|\vec{x}_i-\vec{m}_{A\cup B}\|^2 =$

$$= \sum_{i\in A} \Big\Vert \vec{x}_i-\Big(\vec{m}_{A} - \frac{n_B}{n_A+n_B} (\vec{m}_{A} - \vec{m}_{B})\Big)\Big\Vert^2$$

$$= \sum_{i\in A} \Big\Vert\big(\vec{x}_i-\vec{m}_{A}\big) + \frac{n_B}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)\Big\Vert^2 $$

$$= \sum_{i\in A} \Big\Vert\big(\vec{x}_i-\vec{m}_{A}\big)\Big\Vert^2 + 2 \cdot \sum_{i\in A}\big(\vec{x}_i-\vec{m}_{A}\big) \cdot \frac{n_B}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big) + n_A \cdot \Big\Vert\frac{n_B}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)\Big\Vert^2$$

(the middle term vanishes because $\sum_{i\in A}\big(\vec{x}_i-\vec{m}_{A}\big) = \vec{0}$ by the definition of the centroid)

$$= \sum_{i\in A} \Big\Vert\big(\vec{x}_i-\vec{m}_{A}\big)\Big\Vert^2 + n_A \cdot \Big\Vert\frac{n_B}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)\Big\Vert^2$$

$$= \sum_{i\in A} \Vert\vec{x}_i-\vec{m}_{A}\Vert^2 + \frac{n_A n_B^2}{(n_A+n_B)^2} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$

Similarly:

$$\sum_{i\in B} \|\vec{x}_i-\vec{m}_{A\cup B}\|^2 = \sum_{i\in B} \Vert\vec{x}_i-\vec{m}_{B}\Vert^2 + \frac{n_A^2 n_B}{(n_A+n_B)^2} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$
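The derivation for $B$ mirrors the one for $A$. Spelled out as a sketch, it starts by writing the combined centroid relative to $\vec{m}_B$ instead of $\vec{m}_A$:

$$\vec{m}_{A\cup B} = \vec{m}_{B} + \frac{n_A}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)$$

Each term then becomes $\big(\vec{x}_i-\vec{m}_{B}\big) - \frac{n_A}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)$, the cross term cancels because $\sum_{i\in B}\big(\vec{x}_i-\vec{m}_{B}\big) = \vec{0}$, and the remaining constant term is

$$n_B \cdot \Big\Vert\frac{n_A}{n_A+n_B} \big(\vec{m}_{A} - \vec{m}_{B}\big)\Big\Vert^2 = \frac{n_A^2 n_B}{(n_A+n_B)^2} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$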

By adding:

$$\sum_{i\in {A\cup B}} \|\vec{x}_i-\vec{m}_{A\cup B}\|^2 = \sum_{i\in A} \Vert\vec{x}_i-\vec{m}_{A}\Vert^2 +\sum_{i\in B} \Vert\vec{x}_i-\vec{m}_{B}\Vert^2 + \frac{n_A^2 n_B + n_A n_B^2}{(n_A+n_B)^2} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$

Or:

$$\sum_{i\in {A\cup B}} \|\vec{x}_i-\vec{m}_{A\cup B}\|^2 - \sum_{i\in A} \Vert\vec{x}_i-\vec{m}_{A}\Vert^2 - \sum_{i\in B} \Vert\vec{x}_i-\vec{m}_{B}\Vert^2 = \frac{n_A n_B (n_A + n_B)}{(n_A+n_B)^2} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$

Or, writing $\Delta \big(A, B\big)$ for the left-hand side:

$$\Delta \big(A, B\big) = \frac{n_A n_B}{n_A+n_B} \Vert\vec{m}_{A} - \vec{m}_{B}\Vert^2$$
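The final identity is easy to check numerically. The following is a minimal sketch in plain Python, not part of the derivation itself; the helper names (`centroid`, `sq_dist`, `sse`, `merge_cost_direct`, `merge_cost_formula`) and the random test data are assumptions made purely for illustration.

```python
import random


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def sse(points):
    """Sum of squared distances from each point to the set's centroid."""
    m = centroid(points)
    return sum(sq_dist(p, m) for p in points)


def merge_cost_direct(A, B):
    """Left-hand side: SSE of the union minus the SSEs of A and B."""
    return sse(A + B) - sse(A) - sse(B)


def merge_cost_formula(A, B):
    """Right-hand side: n_A * n_B / (n_A + n_B) * ||m_A - m_B||^2."""
    n_a, n_b = len(A), len(B)
    return n_a * n_b / (n_a + n_b) * sq_dist(centroid(A), centroid(B))


# Two arbitrary disjoint point sets in 3 dimensions (illustrative data only).
rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
B = [[rng.gauss(2, 1) for _ in range(3)] for _ in range(8)]

# The two values should agree up to floating-point rounding.
print(merge_cost_direct(A, B), merge_cost_formula(A, B))
```

A hand-checkable case: for $A = \{(0,0)\}$ and $B = \{(2,0)\}$ the union has centroid $(1,0)$ and sum of squares $1 + 1 = 2$, while the formula gives $\frac{1 \cdot 1}{2} \cdot 4 = 2$.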


