There are many possible ways to define that tolerance. Since, we are talking about XY coordinates, most probably we are talking about euclidean distances to set that tolerance value. So, we can use Cython-powered kd-tree for quick nearest-neighbor lookup, which is very efficient both memory-wise and with performance. The implementation would look something like this -
from scipy.spatial import cKDTree # Assuming a default tolerance value of 1 here def intersect_close(a, b, tol=1): # Get closest distances for each pt in b dist = cKDTree(a).query(b, k=1)[0] # k=1 selects closest one neighbor # Check the distances against the given tolerance value and # thus filter out rows off b for the final output return b[dist <= tol] Sample step-by-step run -
# Input 2D arrays In [68]: a Out[68]: array([[1.22, 5.64], [2.31, 7.63], [4.94, 4.15]]) In [69]: b Out[69]: array([[ 1.23, 5.63], [ 6.31, 10.63], [ 2.32, 7.65]]) # Get closest distances for each pt in b In [70]: dist = cKDTree(a).query(b, k=1)[0] In [71]: dist Out[71]: array([0.01414214, 5. , 0.02236068]) # Mask of distances within the given tolerance In [72]: tol = 1 In [73]: dist <= tol Out[73]: array([ True, False, True]) # Finally filter out valid ones off b In [74]: b[dist <= tol] Out[74]: array([[1.23, 5.63], [2.32, 7.65]]) Timings on 200,000 pts -
In [20]: N = 200000 ...: np.random.seed(0) ...: a = np.random.rand(N,2) ...: b = np.random.rand(N,2) In [21]: %timeit intersect_close(a, b) 1 loop, best of 3: 1.37 s per loop