0

Hopefully a quick question. I have an IEnumerable of type Position where Position is defined as follows:

public class Position { public double Latitude { get; set; } public double Longitude { get; set; } } 

What I need to do is quickly sort through the IEnumerable to find elements that fall within a certain distance of eachother. The elements in the IEnumerable do not get populated in any specific order, but at any one time I need to be able to compute which of the members of the IEnumerable fall within x kilometers of eachother.

Now, I already have a Haversine implementation and for argument's sake, we can say it's called GetDistance and has the following signature:

double GetDistance(Position one, Position two); 

I've got a few ideas, but none of them feel particularly efficient to my mind. To give a little more info, it's unlikely the IEnumerable will ever hold more than 10,000 items at any one time.

What I'd like to arrive at is something, perhaps an extension method, that lets me invoke it to return an IEnumerable which contains just the subset from the original collection which meets the criteria, for example:

OriginalEnumerable.GetMembersCloserThan(kilometers: 100); 

Any help, much appreciated.

EDIT: For clarity, consider the IEnumerable I want to search describes a circle with radius r. It's members are coordinates within the circle. I'm trying to determine which members (points) are within a given proximity of eachother.

11
  • so would OriginalEnumerable.GetMembersCloserThan(kilometers: 100); get the Position objects that are within 100 kilometers of at least 1 other Position object? Commented Oct 30, 2014 at 15:02
  • If you have a GetDistance method that compares two positions, why not simply create the desired extension method yourself? What is the problem? Commented Oct 30, 2014 at 15:03
  • @SamIam yes, the idea being that the return value of the GetMembersCloserThan()... method would return members who are all within x kilometers of eachother. Commented Oct 30, 2014 at 15:04
  • @BenRobinson tried plenty, but the best I came up with was nested for-each loops, hardly efficient. Commented Oct 30, 2014 at 15:04
  • 2
    @Richard within x kilometers of eachother is ambigous. Take the case where you have 4 elements each split up into pairs. each pair is only 1 kilometer from the other element in their pair, but the pairs are roughly 200 kilometers away from each-other. Do you want to select all of the elements in this case? or none of them? Commented Oct 30, 2014 at 15:06

2 Answers 2

2

Something like this? Assuming GetDistance is available.

public static IEnumerable<Position> GetMembersCloserThan(this IEnumerable<Position> positions, double maxDistance) { return positions.Where(a => positions.Any(b => a != b && GetDistance(a, b) < maxDistance)); } 

Edit I see now you are also interested in performance. The above is not particularly fast, though it is not horribly slow either since the math is pretty simple for comparing distances. Let me know if it satisfies your requirements.

Edit 2 This one is much faster--it won't test against past failures and will automatically add a match to the success list

public static IEnumerable<Position> GetMembersCloserThan(this IEnumerable<Position> positions, double maxDistance) { List<Position> closePositions = new List<Position>(); List<Position> testablePositions = positions.ToList(); foreach (Position position in positions) { // Skip this one, it has already been matched. if (closePositions.Contains(position)) continue; bool isClose = false; foreach (Position testAgainstPosition in testablePositions) { if (position == testAgainstPosition) continue; if (GetDistance(position, testAgainstPosition) < maxDistance) { // Both the position and the tested position pass. closePositions.Add(position); closePositions.Add(testAgainstPosition); isClose = true; break; } } // Don't test against this position in the future, it was already checked. if (!isClose) { testablePositions.Remove(position); } } return closePositions; } 
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks @smdrager, let me take a quick look to see how this works out.
Perfect, smdrager! EDIT 2 was the kind of thing I was hoping to get to. Bang on, appreciate your help!
1

If you need more performance: Put the items in a Lists sorted by latitude.

To calculate your desired set of locations, iterate one of them. But for your distance calculation, you only need to consider the items, that are at most 100km different in latitude. That means, you can go back item by item, until the difference is greater than 100km. You need to wrap around the end of the list, however. Mark all items (or yyield return) that are closer than 100km and move on.

Although I cannot quantify the expense, the sorting should amortize for large sets. Also, it may perform bad if most point lie at similar latitude. If that is an issue, use a 2D-Dictionary with rounded coordinates as keys.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.