Outline method:
Outline method
I don't think we have a sample dataset sufficient to test at the moment, but these approaches should get you there. As a beginner in R you may need to step back to master some of the basics like dataframe subsetting, using lapply to loop over lists and so on.
Full Solution with Sample Data and Code
First lets make a sample data set with n rows, with random locations, years, and territories:
set.seed(123) library(sf) n=100 df = st_as_sf(data.frame(x=runif(n), y=runif(n), year = sample(2000:2004, n, TRUE), territory = sample(LETTERS[1:12], n, TRUE) ), coords=1:2)
Let's get the full distance matrix. This is the inefficientest part of this, since it has to compute all NxN distances. Proper nearest neighbour codes can improve by making spatial indexes, but if you're not doing this with a Big Data set or doing it a zillion times with the same points then this should be okay:
dm = st_distance(df)
Next turn that NxN matrix into a data frame with index columns for the rows of df and the corresponding distance:
ijd = data.frame(expand.grid(i=1:n, j=1:n)) ijd$distance = c(dm)
Now put the year and territory into that data frame by getting the ith and jth rows from df:
ijd$year.i = df$year[ijd$i] ijd$year.j = df$year[ijd$j] ijd$territory.i = df$territory[ijd$i] ijd$territory.j = df$territory[ijd$j]
Filter out differing years and keep matching territories. The "differing years" criterion ensures a point can't be a nearest neighbour of itself.
ijd = ijd[ijd$year.i != ijd$year.j,] ijd = ijd[ijd$territory.i == ijd$territory.j,]
(Note, I've not tested this if there are no points matching the above criteria for a given point)
Split into data frames for each i point.
dg = split(ijd, ijd$i)
Now we want to sort those by distance and get the top 3, unless there's less than 3 in which case return the lot. Let's write a function that does that given a data frame so we can test this before trying:
n3 = function(d){ d = d[order(d$distance),] d[1:min(c(nrow(d),3)),] }
And let's apply it:
dn = lapply(dg,n3)
This returns a list of data frames like this:
> dn[[1]] i j distance year.i year.j territory.i territory.j 3901 1 40 0.1700273 2003 2001 I I 4101 1 42 0.2289469 2003 2000 I I 7801 1 79 0.2821005 2003 2001 I I > dn[[2]] i j distance year.i year.j territory.i territory.j 1102 2 12 0.3364577 2000 2002 A A 9902 2 100 0.3663863 2000 2004 A A 1602 2 17 0.5838280 2000 2004 A A
The first one showing that point 1's nearest neighbours matching the criteria are 40, 42, and 79, point 2's nearest neighbours are 12, 100 and 17. The list has 100 elements (unless a point has no neighbours under the matching criteria, because I think they will just drop out, untested...)
If you want to drop all the other columns and make a simple data frame then:
nnij = do.call(rbind, dn)[,c("i","j")]
giving:
> head(nnij,10) i j 1.3901 1 40 1.4101 1 42 1.7801 1 79 2.1102 2 12 2.9902 2 100 2.1602 2 17 3.5103 3 52 3.9603 3 97 3.5703 3 58 4.5504 4 56 >