I store each lat/lon point as a list, each cluster as a list of points, and all clusters in one list, such that:

    point1 = [100,50]
    point2 = [100,49]
    point3 = [120,50]
    point4 = [120,49]
    cluster1 = [point1,point2]
    cluster2 = [point2,point4]
    clusterList = [cluster1,cluster2]

I want to check whether a point (a list of lat/lon) is in any of the clusters in the cluster list. If it's not, I want to perform an operation.

    checked = []
    for cluster in clusterList:
        if point5 not in cluster:
            checked.append(1)
        else:
            checked.append(0)
    if all(checked):
        clusterList.append([point5])

This solution works but it doesn't seem very elegant. I would like to know if there is a way to perform this operation either by avoiding the for loop entirely or by not having to create the object "checked" and then checking for the all() condition.

Edit

I would like to perform an operation once and only if the new point is not in any of the clusters. For clarity, I am checking to see if a point is in a cluster. If it's not, I create a new cluster for that point.

2 Answers

You can express it like this:

    >>> to_check = [120, 49]
    >>> "Not in" if not any(to_check in cluster for cluster in clusterList) else "In"
    "In"

Or this:

    >>> to_check = [120, 49]
    >>> "Not in" if all(to_check not in cluster for cluster in clusterList) else "In"
    "In"
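Applied to the question's goal (create a new cluster only when the point is in none of the existing ones), the same any() test could be wrapped roughly as follows. This is a minimal sketch: the helper name `add_if_new` and the point `[130, 60]` are made up for illustration and are not part of the question.

```python
def add_if_new(point, cluster_list):
    """Append [point] as a new cluster unless some cluster already contains it.

    Hypothetical helper; returns True when a new cluster was created.
    """
    if not any(point in cluster for cluster in cluster_list):
        cluster_list.append([point])
        return True
    return False


# Same data as in the question, written out literally.
cluster_list = [[[100, 50], [100, 49]], [[100, 49], [120, 49]]]

print(add_if_new([120, 49], cluster_list))  # False: already in the second cluster
print(add_if_new([130, 60], cluster_list))  # True: a new cluster is appended
print(len(cluster_list))                    # 3
```

The list comparison works here because `in` compares the inner lists by value, so `[120, 49]` matches any stored point with the same coordinates.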

Note: I'm using any() + not instead of all(). It is faster than using all() because it fails fast: if the former finds the point in the first cluster, it won't check the others, whereas the latter will check every cluster even if the point was found in the first one.

Update: the previous statement is not true. Performance is the same as with all(), because all() is lazy as well: it stops at the first falsy element, just as any() stops at the first truthy one.
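One way to see that both built-ins stop early is to record which clusters actually get examined. The sketch below assumes a small made-up cluster list and a hypothetical logging generator `scan`; neither comes from the question.

```python
def scan(clusters, point, log):
    """Yield membership results, logging the index of each cluster examined."""
    for i, cluster in enumerate(clusters):
        log.append(i)
        yield point in cluster


# Illustrative data: the point sits in the very first of three clusters.
clusters = [[[100, 50], [100, 49]], [[100, 49], [120, 49]], [[0, 0]]]
to_check = [100, 50]

any_log, all_log = [], []
found_with_any = any(scan(clusters, to_check, any_log))
found_with_all = not all(not hit for hit in scan(clusters, to_check, all_log))

print(found_with_any, any_log)  # True [0]
print(found_with_all, all_log)  # True [0]
```

In both cases only cluster 0 is visited, so the choice between the two forms is about readability rather than speed.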

Using the sum-flattening hack

    >>> [100, 50] in sum(clusterList, [])
    True
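Note that sum(clusterList, []) builds the flattened list by repeated concatenation before the membership test starts. A lazy variant of the same flattening idea, swapped in here as an alternative rather than taken from this answer, is itertools.chain.from_iterable, which yields points one at a time and lets `in` stop at the first match. The data and the point `[130, 60]` below are illustrative.

```python
from itertools import chain

# Same data as in the question, written out literally.
cluster_list = [[[100, 50], [100, 49]], [[100, 49], [120, 49]]]

# chain.from_iterable flattens one level lazily; no intermediate list is built.
in_known = [100, 50] in chain.from_iterable(cluster_list)
in_unknown = [130, 60] in chain.from_iterable(cluster_list)

print(in_known)    # True
print(in_unknown)  # False
```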

4 Comments

Are you sure that creating a new list is a good idea, especially with a lot of clusters and points?
I don't see why not. It's only one new list, no new items and the clusters have to be iterated anyway. timeit says it's even faster than your solution.
Just tested with a list of 1000 clusters, each containing 100 points. My approach took 2.64ms and yours took 397ms. You can try it yourself with a random list (clusterList = [[[random.randint(0, 100), random.randint(0, 100)] for _ in range(100)] for _ in range(1000)]).
Ok. Tried it with a generator and it is much better, but upvoting yours anyway.
