0

I have a river dataset, which is dissolved, so between two lakes there will always be one single river line.

Now I want to find all rivers that starts in a lake, and ends in a lake. So if my rivers are stored as line features, and the lakes as polygon features, I need to find all lines whose both start point and end point touches a polygon in the lake data set.

How do I do that in either ArcGIS or QGIS (preferrably without any programming).

2
  • Do you consider a Spatial SQL Query as programming ? Commented Feb 3, 2020 at 12:39
  • Find as in "select & extract", or find as in "add an attribute"? Commented Feb 3, 2020 at 12:52

2 Answers 2

1

In ArcGIS use the Feature Vertices To Points tool to create two different datasets, an end vertices dataset and a start vertices dataset. Select all those start vertices that intersect lakes. Select all end vertices that intersect lakes. Select all line that intersect the selected start vertices. From those selected lines select all lines that intersect the selected end vertices. Now you should have a line selection that includes any lines connecting lakes, and all lines inside of lakes. Remove from the selection any lines contained within the lake polygons.

You mention your data is dissolved. Are the line segments between the lakes the same record as the line segments before or after a lake? If so, some geometry editing will be required. Maybe a screen shot of your rivers symbolized by ObjectID would be useful.

3
  • That will not solve the problem, because it the lines and the polygons do not overlap other places than at the end points of the lines. The select by location in ArcGIS selects any line where at least one of the end points touches the polygons. I require that both endpoints must touch a polygon. Commented Feb 27, 2020 at 15:25
  • Sorry, I understood your question to mean you wanted to identify those river sections that started and ended inside of a lake. I have edited my answer above. Commented Feb 27, 2020 at 16:02
  • I agree with GBG, @AgnarRenolen should edit your question and insert an image of your data. Commented Feb 27, 2020 at 16:57
0

If you're working with smaller datasets where computational efficiency isn't a primary concern, you can simply iterate over all line features and check whether their start and end vertices fall within a polygon layer. Below is a basic PyQGIS script that does exactly this. In the example, I check each polygon in the polygon layer; however, you can adjust this to your specific polygon geometries.

from qgis.core import QgsSpatialIndex, QgsGeometry polygon_layer = QgsProject.instance().mapLayersByName('polygon_layer_name')[0] line_layer = QgsProject.instance().mapLayersByName('line_layer_name')[0] polygon_index = QgsSpatialIndex(polygon_layer.getFeatures()) output_layer = QgsVectorLayer('LineString?crs=' + line_layer.crs().toWkt(), 'matching_lines', 'memory') output_provider = output_layer.dataProvider() output_provider.addAttributes(line_layer.fields()) output_layer.updateFields() matching_lines = [] for line_feat in line_layer.getFeatures(): line_geom = line_feat.geometry() start_point = line_geom.asPolyline()[0] end_point = line_geom.asPolyline()[-1] start_point_geom = QgsGeometry.fromPointXY(start_point) end_point_geom = QgsGeometry.fromPointXY(end_point) # use spatial index to filter polygons start_candidates = polygon_index.intersects(start_point_geom.boundingBox()) end_candidates = polygon_index.intersects(end_point_geom.boundingBox()) start_in_polygon = False end_in_polygon = False for poly_id in start_candidates: poly_feat = polygon_layer.getFeature(poly_id) poly_geom = poly_feat.geometry() if poly_geom.contains(start_point_geom): start_in_polygon = True break for poly_id in end_candidates: poly_feat = polygon_layer.getFeature(poly_id) poly_geom = poly_feat.geometry() if poly_geom.contains(end_point_geom): end_in_polygon = True break if start_in_polygon and end_in_polygon: matching_lines.append(line_feat) output_provider.addFeatures(matching_lines) output_layer.updateExtents() QgsProject.instance().addMapLayer(output_layer) # todo: other processes, for example, export to file... 

For larger datasets, where performance becomes an issue, you might consider using geopandas with multiprocessing to accelerate the process.

import geopandas as gpd from shapely.geometry import Point from multiprocessing import Pool from time import time start = time() polygon_gdf = gpd.read_file(r'polygon_layer_path', layer='polygon_layer_name') line_gdf = gpd.read_file(r'line_layer_path', layer='r'line_layer_name') def check_line_in_polygon(line_geom, polygons): start_point = Point(line_geom.coords[0]) end_point = Point(line_geom.coords[-1]) start_in_polygon = polygons.contains(start_point).any() end_in_polygon = polygons.contains(end_point).any() return start_in_polygon and end_in_polygon def parallel_process(lines, polygons): with Pool() as pool: results = pool.starmap(check_line_in_polygon, [(line, polygons) for line in lines]) return results if __name__ == '__main__': matching_lines_mask = parallel_process(line_gdf['geometry'], polygon_gdf['geometry']) matching_lines_gdf = line_gdf[matching_lines_mask] matching_lines_gdf.to_file(r'macthed_layer_path', layer='r'macthed_layer_name', driver='GPKG') 

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.