5

Given 2 ordered sets of n points each, A and B, how do I find the best circular permutation which minimizes the average pairwise-distance (using the distance of your choice) between points.

In other words, how do I algorithmically find k such that it minimizes sum(||A[i] - B[(i + k) % n||) with 0 <= k < n? (I have omitted the division by n because minimizing the total distance should yield the same result as the mean I believe).

One extra requirement is that the algorithm should be usable in N-Dimensional spaces so I can't just sort the arrays.

I could obviously compute every pairwise distance but that would yield O(n^2) (n x n pairwise distance computation + n accumulations) complexity which is sub-optimal ([edit] I mean here that I sure hope one can do better than brute force).

Application: One application is in graphics where I want to map each point of a shape to a point of another shape without creating crossing edges. See drawing below where we map each point of the red shape to a point on the blue shape. shape_points_mapping

12
  • How do you know that it is sub-optimal? Do you know a faster algorithm? Commented Dec 10, 2020 at 16:24
  • @Stef I don't so perhaps this is the best exact solution. I just hope this is not. I would imagine one could find some approximate solution in better time than brute force but I am not so sure how to proceed. I could try and find the smallest pair-wise distance and then start some sort of gradient descent from there, but will this yield the best solution? Commented Dec 10, 2020 at 16:31
  • What is your domain? Euclidean space? Vector space? What is your metric? Commented Dec 10, 2020 at 16:49
  • I specifically work in 3-D euclidian space but I would imagine the algorithm wouldn't change much with the norm you use, would it? A norm is a norm. Commented Dec 10, 2020 at 17:02
  • Why the requirement that the permutation must be "circular"? What would happen if, for example, on your 2D picture above the points would be enumerated in a different order, or if the shape was 3D? Commented Dec 10, 2020 at 17:44

1 Answer 1

2

I have two ideas.

If you're willing to optimize the sum of squares of distances, then there's an O(n log n) time algorithm based on fast convolution. The modified objective allows us to find the contribution of each coordinate separately for each possible rotation. Then we sum element-wise and choose the best.

To solve the reduced problem in 1D: we want

sum_i (A[i] - B[(i+k) mod n])**2 

for each k. Do some algebra:

sum_i (A[i] - B[(i+k) mod n])**2 = sum_i (A[i]**2 - 2*A[i]*B[(i+k) mod n] + B[(i+k) mod n]**2) = sum_i A[i]**2 + sum_i B[i]**2 - 2*sum_i (A[i]*B[(i+k) mod n]). 

The first two terms are the same for all k, so just compute them. The vector of third terms for all k can be computed quickly in bulk as a constant times A convolved with the reverse of B.


My second idea is a recursive heuristic. If n is small, just brute force it. Otherwise, make a smaller instance by computing the midpoint of each pair of consecutive points in each list. Recursively align these. Then multiply the heuristic rotation by two and check it against the rotations one up and one down from it. In constant dimensions, this yields a recurrence like T(n) = T(n/2) + O(n), which is O(n).

Sign up to request clarification or add additional context in comments.

6 Comments

I don't think the first method is really that much faster actually. Developing the square method is a good idea but in the end, even if you disregard the first term and try and minimize -2*sum (A[i] * B[(i +k) %n] you will still have to do this for all k and i so that is still n^2 operations. The second idea on the other hand seems like a very interesting lead to me. I'll try and implement it and keep you updated. Thank you very much.
@Niels computing it for one particular k would cost O(n) time, but the convolution would compute all n rotations in O(n log n) time.
Why is sum_i A[i]**2 equal to sum_i B[i]**2?
@STF It isn't, good catch. The important part was that we can extricate k from those terms.
@Niels For idea 1, if you just evaluate that formula directly it's indeed going to be quadratic. My point is that it's amenable to a Fast Fourier Transform, which achieves a significant speedup by sharing work between different values of k. For idea 2, I meant take every other midpoint.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.