I have an sf data frame, table_sf, with this structure:

    Classes ‘sf’, ‘tbl_df’, ‘tbl’ and 'data.frame': 12251 obs. of 5 variables:
     $ ID      : int 1 2 3 4 5 6 7 8 9 10 ...
     $ NOMBRE  : chr "AL011900" "AL011900" "AL011900" "AL011900" ...
     $ FECHA   : POSIXct, format: "1900-08-27 00:00:00" "1900-08-27 06:00:00" "1900-08-27 12:00:00" "1900-08-27 18:00:00" ...
     $ INT     : num 18 18 18 18 18 ...
     $ geometry:sfc_POINT of length 12251; first list element: 'XY' num -42.1 15
     - attr(*, "sf_column")= chr "geometry"
     - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA
      ..- attr(*, "names")= chr "ID" "NOMBRE" "FECHA" "INT"

I would like to create lines between sequential points for each NOMBRE, and each line must carry the INT value of the first point used to create it.

That is, the line from point 1 to point 2 gets the INT value of point 1, the line from point 2 to point 3 gets the INT value of point 2, and so on for each NOMBRE.

My approach has been to loop through each pair of points of each NOMBRE and create the line with the st_cast function of the sf package, but I'm still far from making it work.

Here is the code I have so far (packages tidyverse & sf):

    for (i in table_sf %>% group_by(NOMBRE) %>% summarise()) {
      table_huracanes <- table_sf %>%
        filter(NOMBRE == i) %>%
        mutate(numb = row_number())

      for (h in table_huracanes$numb) {
        linea <- table_huracanes %>%
          filter(numb <= 2) %>%
          group_by(NOMBRE) %>%
          summarise() %>%
          st_cast("LINESTRING")
      }
    }

I think I'm still missing a couple of things:

  • Find a way to select only two points in each iteration of the second for loop.
  • Create an empty sf data frame (linestring or multilinestring?) to which each linea can be appended.

To clarify, the table before creating the sf dataframe looks like this:

    NOMBRE    LAT   LONG   FECHA                 INT
    AL011900  15    -42.1  1900-08-27T00:00:00Z  18.0054
    AL011900  15.2  -43.4  1900-08-27T06:00:00Z  18.0054
    AL011900  15.3  -44.7  1900-08-27T12:00:00Z  18.0054
    AL011900  15.4  -45.6  1900-08-27T18:00:00Z  18.0054
    AL021900  19    -59.3  1900-09-13T12:00:00Z  33.4386
    AL021900  19.5  -60    1900-09-13T18:00:00Z  36.0108
    AL021900  20    -60.6  1900-09-14T00:00:00Z  38.583
    AL041905  36.3  -48.6  1905-10-10T18:00:00Z  46.2996
    AL041905  37.9  -47.9  1905-10-11T00:00:00Z  43.7274
    AL041905  39.6  -47.1  1905-10-11T06:00:00Z  41.1552
    AL041905  41    -46    1905-10-11T12:00:00Z  41.1552

Note that for each NOMBRE there are different INT values. That's the reason why I need to create lines between each pair of points.

2 Answers

5

Here is a tidyverse method, starting from your table as it was before conversion to sf. The approach is to build a long-form table where each row is a start or end point, with a lineid column so that you can group_by it, summarise to union the right points together, and then st_cast to LINESTRING.

    library(tidyverse)
    library(sf)
    #> Linking to GEOS 3.6.1, GDAL 2.2.3, proj.4 4.9.3

    table <- structure(
      list(
        NOMBRE = c("AL011900", "AL011900", "AL011900", "AL011900", "AL021900",
                   "AL021900", "AL021900", "AL041905", "AL041905", "AL041905",
                   "AL041905"),
        LAT = c(15, 15.2, 15.3, 15.4, 19, 19.5, 20, 36.3, 37.9, 39.6, 41),
        LONG = c(-42.1, -43.4, -44.7, -45.6, -59.3, -60, -60.6, -48.6, -47.9,
                 -47.1, -46),
        INT = c(18.0054, 18.0054, 18.0054, 18.0054, 33.4386, 36.0108, 38.583,
                46.2996, 43.7274, 41.1552, 41.1552)
      ),
      row.names = c(NA, -11L),
      class = c("tbl_df", "tbl", "data.frame"),
      spec = structure(
        list(
          cols = list(
            NOMBRE = structure(list(), class = c("collector_character", "collector")),
            LAT = structure(list(), class = c("collector_double", "collector")),
            LONG = structure(list(), class = c("collector_double", "collector")),
            FECHA = structure(list(format = ""), class = c("collector_datetime", "collector")),
            INT = structure(list(), class = c("collector_double", "collector"))
          ),
          default = structure(list(), class = c("collector_guess", "collector"))
        ),
        class = "col_spec"
      )
    )

    table_sf <- table %>%
      group_by(NOMBRE) %>%
      mutate(
        lineid = row_number(), # create a lineid
        LONG_end = lead(LONG), # create the end point coords for each start point
        LAT_end = lead(LAT)
      ) %>%
      unite(start, LONG, LAT) %>% # collect coords into one column for reshaping
      unite(end, LONG_end, LAT_end) %>%
      filter(end != "NA_NA") %>% # remove nas (last points in a NOMBRE group don't start lines)
      gather(start_end, coords, start, end) %>% # reshape to long
      separate(coords, c("LONG", "LAT"), sep = "_") %>% # convert our text coordinates back to individual numeric columns
      mutate_at(vars(LONG, LAT), as.numeric) %>%
      st_as_sf(coords = c("LONG", "LAT")) %>% # create points
      group_by(NOMBRE, INT, lineid) %>%
      summarise() %>% # union points into lines using our created lineid
      st_cast("LINESTRING")

    plot(table_sf[, 1:2])

You can see in the plot that each line between two points has its own INT as requested.
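If you would rather start from the sf points object than from the raw table, the same start/end pairing can be sketched without the text round-trip through unite and separate. This is only a sketch under a few assumptions: the rows are already in time order within each NOMBRE, the CRS is EPSG:4326 (the question does not state one), and the names pts_sf, seg_tbl and segments are made up for illustration.

```r
library(sf)
library(dplyr)

# Toy stand-in for the asker's points (values copied from the question);
# the CRS 4326 is an assumption
pts_sf <- data.frame(
  NOMBRE = c(rep("AL011900", 4), rep("AL021900", 3)),
  LAT    = c(15, 15.2, 15.3, 15.4, 19, 19.5, 20),
  LONG   = c(-42.1, -43.4, -44.7, -45.6, -59.3, -60, -60.6),
  INT    = c(18.0054, 18.0054, 18.0054, 18.0054, 33.4386, 36.0108, 38.583)
) %>%
  st_as_sf(coords = c("LONG", "LAT"), crs = 4326)

# Pull the coordinates out of the geometry column (matrix with columns X, Y)
xy <- st_coordinates(pts_sf)

# One row per start point, paired with the next point of the same NOMBRE
seg_tbl <- pts_sf %>%
  st_drop_geometry() %>%
  mutate(x = xy[, "X"], y = xy[, "Y"]) %>%
  group_by(NOMBRE) %>%
  mutate(x_end = lead(x), y_end = lead(y)) %>%
  ungroup() %>%
  filter(!is.na(x_end))   # the last point of a NOMBRE starts no line

# Build one two-vertex LINESTRING per row
seg_geom <- lapply(seq_len(nrow(seg_tbl)), function(i) {
  st_linestring(rbind(
    c(seg_tbl$x[i], seg_tbl$y[i]),
    c(seg_tbl$x_end[i], seg_tbl$y_end[i])
  ))
})

# Each segment keeps the NOMBRE and INT of its start point
segments <- st_sf(
  select(seg_tbl, NOMBRE, INT),
  geometry = st_sfc(seg_geom, crs = st_crs(pts_sf))
)
```

Because every segment is built explicitly from its start and end coordinates, the vertex order is fixed and nothing depends on how a union orders the points.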

0

You can do this using dplyr with sf.

    library(sf)
    library(dplyr)

    ### first, roughly recreate your dataset using nc data
    nc <- st_read(system.file("shape/nc.shp", package = "sf")) %>%
      st_centroid()
    nc$NOMBRE <- seq(1, nrow(nc) / 2, 2) %>% as.character
    nc$INT <- nc$NAME
    # plain assignment here: the %<>% operator would also need library(magrittr)
    nc <- nc %>%
      arrange(NOMBRE) %>%
      mutate(ID = 1:nrow(.))
    nc <- select(nc, NOMBRE, INT, ID)

    ### create linestring for each NOMBRE, arranging by ID and grabbing the first INT
    nc.line <- nc %>%
      group_by(NOMBRE) %>%
      arrange(ID) %>%
      summarise(INT = first(INT), do_union = FALSE) %>%
      st_cast("LINESTRING")
  • The issue with your solution is that there are different INT values on each NOMBRE. With your solution I only keep one INT value for each NOMBRE and I need one for each line between two points. Commented Aug 23, 2018 at 4:37
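One way this answer's summarise(do_union = FALSE) idea might be adapted to give one line per pair of points is to group by a segment id as well as NOMBRE, with every interior point appearing twice: once as the end of one segment and once as the start of the next. The sketch below uses a toy copy of the question's data rather than the nc data. It assumes ID gives the time order and that the CRS is EPSG:4326, and the names numbered, starts, ends, seg and seg_lines are made up for illustration.

```r
library(sf)
library(dplyr)

# Toy stand-in for the asker's points (values copied from the question);
# the CRS 4326 is an assumption
pts_sf <- data.frame(
  ID     = 1:7,
  NOMBRE = c(rep("AL011900", 4), rep("AL021900", 3)),
  LAT    = c(15, 15.2, 15.3, 15.4, 19, 19.5, 20),
  LONG   = c(-42.1, -43.4, -44.7, -45.6, -59.3, -60, -60.6),
  INT    = c(18.0054, 18.0054, 18.0054, 18.0054, 33.4386, 36.0108, 38.583)
) %>%
  st_as_sf(coords = c("LONG", "LAT"), crs = 4326)

# Number the points within each NOMBRE
numbered <- pts_sf %>%
  arrange(NOMBRE, ID) %>%
  group_by(NOMBRE) %>%
  mutate(pos = row_number(), n_pts = n()) %>%
  ungroup()

# Every point except the last starts a segment;
# every point except the first ends the previous one
starts <- numbered %>% filter(pos < n_pts) %>% mutate(seg = pos)
ends   <- numbered %>% filter(pos > 1) %>% mutate(seg = pos - 1L)

# Stack both copies, put each start before its end, then combine the two
# points of each segment without a union so that their order is kept
seg_lines <- rbind(starts, ends) %>%
  arrange(NOMBRE, seg, pos) %>%
  group_by(NOMBRE, seg) %>%
  summarise(INT = first(INT), do_union = FALSE) %>%
  st_cast("LINESTRING")
```

Since the start point sorts first within each segment, first(INT) picks up the INT of the point the line begins at, which is what the question asks for.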
