Slow join on geometry fields

Question

My situation is somewhat similar to this earlier question: In PostgreSQL I have one table containing point geometries (P), and another containing polygons (G). I would like to join on the geometry field, e.g.

SELECT P.ID, G.SHAPE FROM POINTS P JOIN NL_1KM G ON ST_CONTAINS(G.SHAPE, P.SHAPE)

The difference is that my polygons are a rectangular 1×1 km grid, so I don't think ST_Subdivide will help.

The grid table has ~126,000 records, and there are currently ~280 points. Both tables have a spatial index, and all geometries are WGS84.

I would think that only 280 lookups on the grid would be required, but this takes roughly one minute. The same query with a 10 km grid (1350 records) takes only 0.6 seconds.

This is the query plan for the 1km grid:

QUERY PLAN
Nested Loop  (cost=0.00..9816799.62 rows=12459941 width=600) (actual time=2964.853..66014.294 rows=284 loops=1)
  Output: w.id, g.shape
  Join Filter: st_contains(g.shape, (w.shape)::st_geometry)
  Rows Removed by Join Filter: 37505400
  ->  Seq Scan on myschema.nl_1km g  (cost=0.00..4571.58 rows=125858 width=596) (actual time=0.025..41.477 rows=125858 loops=1)
        Output: g.objectid, g.cellcode, g.eoforigin, g.noforigin, g.shape
  ->  Materialize  (cost=0.00..24.45 rows=297 width=484) (actual time=0.000..0.012 rows=298 loops=125858)
        Output: w.id, w.shape
        ->  Seq Scan on myschema.points w  (cost=0.00..22.97 rows=297 width=484) (actual time=0.007..0.106 rows=298 loops=1)
              Output: w.id, w.shape
Planning Time: 0.293 ms
Execution Time: 66014.393 ms

Note that the st_geometry data type in the execution plan is an ArcGIS data type.

What am I missing?

Update, a few minutes after I posted...

One thing I was missing is that using ST_Intersects instead of ST_Contains makes things so fast, it almost scares me (0.06 sec). The new query plan shows that the spatial index on the grid is now used. So my question should have been: why does ST_Contains not use the spatial index? Does anyone have an answer to that?
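To compare the two predicates side by side, both joins can be run under EXPLAIN. This is only a sketch reusing the table and column names from the query above; the plans you get depend on the installed st_geometry functions and on your statistics.

```sql
-- Slow variant: in the plan shown earlier this appears as a Join Filter
-- on top of two sequential scans.
EXPLAIN (ANALYZE, VERBOSE)
SELECT p.id, g.shape
FROM points AS p
JOIN nl_1km AS g ON st_contains(g.shape, p.shape);

-- Fast variant: look for an Index Scan (or Bitmap Index Scan) on nl_1km
-- in place of the Seq Scan.
EXPLAIN (ANALYZE, VERBOSE)
SELECT p.id, g.shape
FROM points AS p
JOIN nl_1km AS g ON st_intersects(g.shape, p.shape);
```

Note that the two predicates are not strictly interchangeable. Under the usual OGC definitions, a point lying exactly on a shared cell edge intersects both neighbouring cells but is contained by neither, so the row counts can differ for such points.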

Laurenz Albe · Accepted Answer · 2023-09-27 07:43:32Z

I can only answer from the PostgreSQL/PostGIS point of view, as I know nothing about ArcGIS.

The clue is the line st_contains(g.shape, (w.shape)::st_geometry) from the plan.

st_geometry is an ArcGIS data type, so it looks like:

- points.shape is of type st_geometry, but nl_1km.shape is of a different data type (let's assume it is a PostGIS geometry)
- the function st_contains is not a PostGIS function, but an ArcGIS function.
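Both guesses can be checked against the system catalogs rather than assumed. The following is a minimal sketch; the schema name myschema is taken from the execution plan in the question, and the column name shape from the query.

```sql
-- Declared type of each shape column; udt_schema shows which schema
-- (and therefore which product) the type belongs to.
SELECT table_name, column_name, udt_schema, udt_name
FROM information_schema.columns
WHERE table_schema = 'myschema'
  AND table_name IN ('points', 'nl_1km')
  AND column_name = 'shape';

-- Every function named st_contains, with its schema and argument types.
SELECT n.nspname AS schema_name,
       p.proname AS function_name,
       pg_get_function_identity_arguments(p.oid) AS arguments
FROM pg_proc AS p
JOIN pg_namespace AS n ON n.oid = p.pronamespace
WHERE p.proname = 'st_contains'
ORDER BY n.nspname, arguments;
```

If the second query returns more than one row, the argument types decide which function the planner picks for a given call.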

I don't know how well ArcGIS supports GiST indexes, but if there is a type cast from st_geometry to geometry, you could make use of PostGIS:

SELECT p.id, g.shape FROM points AS p JOIN nl_1km AS g ON st_contains(g.shape::geometry, p.shape);

That requires an index on the PostGIS geometry:

CREATE INDEX ON nl_1km USING gist ((shape::geometry));
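Whether this approach is available at all depends on the cast existing. A hedged sketch of how one might check that, and then confirm that the expression index is picked up, assuming the type names st_geometry and geometry as discussed above:

```sql
-- Is a cast from st_geometry to geometry registered?
-- Zero rows means no such cast is listed in pg_cast.
SELECT s.typname AS source_type,
       t.typname AS target_type,
       c.castcontext   -- 'i' = implicit, 'a' = assignment, 'e' = explicit only
FROM pg_cast AS c
JOIN pg_type AS s ON s.oid = c.castsource
JOIN pg_type AS t ON t.oid = c.casttarget
WHERE s.typname = 'st_geometry'
  AND t.typname = 'geometry';

-- After creating the expression index, refresh statistics and
-- check whether the plan now uses an index scan on nl_1km.
ANALYZE nl_1km;

EXPLAIN
SELECT p.id, g.shape
FROM points AS p
JOIN nl_1km AS g ON st_contains(g.shape::geometry, p.shape);
```

The join condition has to use the same expression as the index, here shape::geometry, for the planner to consider the index.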

edited Sep 27, 2023 at 7:43

answered Sep 26, 2023 at 13:35

Laurenz Albe


Thanks, but as I mentioned, both tables involved already have a spatial index; the one on the grid is literally the same as your suggestion :-) I believe st_geometry might be a function installed by ArcGIS.

– Berend, commented Sep 26, 2023 at 13:40

st_geometry is a data type, not a function. That type cast prevents you from using the index. Without knowing the column definition and what that st_geometry is, I cannot say more.

– Laurenz Albe, commented Sep 26, 2023 at 13:53

My bad: because of the st_ prefix, I mistakenly thought st_geometry was the geometry type used by PostGIS. It is, however, a custom type provided by ArcGIS, which includes many functions with the same names as the PostGIS ones, e.g. st_contains, st_intersects, etc. I should probably move this to gis.stackexchange.com

– Berend, commented Sep 26, 2023 at 14:29

Aha. That would mean that the st_contains function in the execution plan operates on that ArcGIS data type. Then it cannot be the PostGIS function st_contains. If it is an ArcGIS function, that would explain why it cannot make use of a GiST index. Pray tell: what is the data type of points.shape?

– Laurenz Albe, commented Sep 27, 2023 at 6:42

Points are the same data type, i.e. st_geometry. The index is created by ArcGIS as well, and is defined as CREATE INDEX IF NOT EXISTS a134_ix1 ON myschema.points USING gist(shape).

– Berend, commented Sep 27, 2023 at 7:04
