I have two large gpkg datasets. I would like to clip one from the other and use parallel processing to speed up the process. However, I receive an error message. Code and error message shown below:

#load required libraries
library(sf)
library(future.apply)
#Set up parallel processing
options(future.numCores = 20)
plan(multicore)
#Load the source geopackage
gpkg_source <- st_read("E:/Roads/UK_roadlink_NoPaths.gpkg")
#Load the clip geopackage
gpkg_clip <- st_read("E:/Areas/400m_GB_Dissolved.gpkg")
#Clip the data
gpkg_clipped <- future_sapply(gpkg_source, function(x) {
  st_intersection(x, gpkg_clip)
})
#Combine the clipped data into a single sf object
gpkg_clipped <- do.call(rbind, gpkg_clipped)
#Save the clipped geopackage to a new file
st_write(gpkg_clipped, "E:/Roads/NoPaths_400mCatchment_Clip.gpkg")
#Stop parallel processing
plan(sequential)

Error message

Error in UseMethod("st_intersection") :
  no applicable method for 'st_intersection' applied to an object of class "character"

I have tried to see whether the gpkg is in fact returned as character using

class("E:/Roads/UK_roadlink_NoPaths.gpkg") 

Which returns the following:

[1] "character" 

Both gpkg files load fine in QGIS and ArcGIS Pro, and I am able to run analyses on them.

Where am I going wrong in my code?

1 Answer

future_sapply iterates over the columns of a data frame, not its rows. Here is an example using the sample data set created by example(st_read), with gpkg_source=nc. If I print the structure of the function argument inside the loop, I can see that each call receives a column:

> s = future_sapply(gpkg_source, function(x){print(str(x))})
 num [1:100] 0.114 0.061 0.143 0.07 0.153 0.097 0.062 0.091 0.118 0.124 ...
NULL
 num [1:100] 1.44 1.23 1.63 2.97 2.21 ...
NULL

Then if I try and do st_intersection in the function I get something like your error:

> s = future_sapply(gpkg_source, function(x){st_intersection(x)})
Error in UseMethod("st_intersection") :
  no applicable method for 'st_intersection' applied to an object of class "c('double', 'numeric')"

In this case the message mentions "double" and "numeric" because the first column is numeric. Maybe your first column is a character variable, which would explain your error message.

It seems you want to loop over rows instead. Run the loop over 1 to the number of rows using an lapply version and extract the row inside the function. Here's a trivial "intersect yourself" operation:

> s = future_lapply(1:nrow(gpkg_source), function(i){ st_intersection(gpkg_source[i,], gpkg_source[i,])}) 

The output is a list that you can then rbind up into a spatial data frame:

> out = do.call(rbind, s)
> out
Simple feature collection with 100 features and 28 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27
First 10 features:
   AREA PERIMETER CNTY_ CNTY_ID      NAME  FIPS FIPSNO CRESS_ID BIR74 SID74
1 0.114     1.442  1825    1825      Ashe 37009  37009        5  1091     1
2 0.061     1.231  1827    1827 Alleghany 37005  37005        3   487     0
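Applied to an actual clip, the same row-wise pattern might look like the sketch below. This is only an illustration under several assumptions of mine: the nc sample data stands in for both of your layers (every tenth county plays the role of gpkg_clip), the data is transformed to EPSG:32119 so the intersection runs on projected coordinates, rows are grouped into chunks (n_chunks and chunks are made-up helper names) rather than sent one at a time, and plan(multisession) is used because plan(multicore) relies on forking, which is not supported on Windows.

```r
library(sf)
library(future.apply)

# multisession starts background R sessions and works on all platforms.
# The worker count is an arbitrary choice for this sketch.
plan(multisession, workers = 2)

# Stand-in data (assumption): nc plays the role of gpkg_source and the
# union of every tenth county plays the role of gpkg_clip.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
gpkg_source <- st_transform(nc, 32119)
gpkg_clip <- st_union(gpkg_source[seq(1, nrow(gpkg_source), by = 10), ])

# Group the row indices into a few chunks so that each parallel call
# processes many rows instead of a single one.
n_chunks <- 4
row_ids <- seq_len(nrow(gpkg_source))
chunks <- unname(split(row_ids, cut(row_ids, n_chunks, labels = FALSE)))

# Clip each chunk of rows against the clip geometry.
pieces <- future_lapply(chunks, function(idx) {
  st_intersection(gpkg_source[idx, ], gpkg_clip)
}, future.packages = "sf")

# Combine the per-chunk results into a single sf object.
gpkg_clipped <- do.call(rbind, pieces)

# Return to sequential processing.
plan(sequential)
```

With your own data you would replace the stand-in layers with the two st_read() calls from the question and check that both layers share the same CRS before intersecting. Chunking is a design choice rather than a requirement: looping over single rows also works, but each parallel call carries some overhead, so fewer, larger chunks are often a reasonable starting point.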

Your investigation into class("E:/Roads/UK_roadlink_NoPaths.gpkg") is a red herring. That returns "character" because you passed a character string in quotes. You could try class(st_read("E:/Roads/UK_roadlink_NoPaths.gpkg")) which will tell you it's a spatial data frame.
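To see the difference between the class of a path and the class of the object read from it, here is a small check. It uses the nc shapefile shipped with sf as a stand-in for your gpkg file, so the path variable is illustrative only.

```r
library(sf)

# A file path is just a string, whatever file it points to.
path <- system.file("shape/nc.shp", package = "sf")
class(path)
# [1] "character"

# Reading the file returns an sf object, which is also a data frame.
layer <- st_read(path, quiet = TRUE)
class(layer)
# [1] "sf"         "data.frame"
```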
