PostgreSQL query from large raster table problem

Question

I am trying to build a fast query to PostgreSQL. My raster (DEM) is 1.6 GB, about 2.5 cm precision...With a smaller DEM it was ok, about 20 seconds to query a 300m radius elevations. Now with this larger raster it is not even responding, or at least not within a day... My queries are structured this way:

WITH pairs(x,y) AS ( VALUES (-74.61067886106919,45.60987140366732), (-74.61068132465692,45.60987191852093) ) SELECT ST_Value(rast, ST_SetSRID(ST_MakePoint(x, y), 4326)) AS height FROM alfred2 rs CROSS JOIN pairs WHERE ST_Intersects(rs.rast, ST_SetSRID(ST_MakePoint(x, y), 4326));

This one with only a few points already takes like over 30 seconds... and I need a lot more points to calculate line-of-sight in my JavaScript program.

I was thinking of using ST_Clip to clip the raster prior to query but can't figure out how to do that for a circle of 250 meters around a lat,lng.

NOTE: I reloaded the DEM with 10x10 and The result of EXPLAIN ANALYSE:

Gather (cost=1000.00..6964763.37 rows=1890080 width=8) (actual time=7500.324..7505.978 rows=2 loops=1) Workers Planned: 2 Workers Launched: 2 -> Nested Loop (cost=0.00..6774755.37 rows=787533 width=8) (actual time=6970.136..7435.628 rows=1 loops=3) Join Filter: st_intersects(rs.rast, st_setsrid(st_makepoint(("*VALUES*".column1)::double precision, ("*VALUES*".column2)::double precision), 4326), NULL::integer) Rows Removed by Join Filter: 1890075 -> Parallel Seq Scan on alfred5_10x10 rs (cost=0.00..200821.00 rows=1181300 width=472) (actual time=0.038..227.874 rows=945038 loops=3) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=64) (actual time=0.000..0.001 rows=2 loops=2835113) Planning Time: 0.156 ms JIT: Functions: 12 Options: Inlining true, Optimization true, Expressions true, Deforming true Timing: Generation 2.133 ms, Inlining 123.100 ms, Optimization 150.016 ms, Emission 83.455 ms, Total 358.705 ms Execution Time: 7506.925 ms

st_buffer((st_setsrid(st_makepoint(x,y),4326))::geography,250) — jbalk
– jbalk, Commented Sep 7, 2021 at 19:09
also make sure you have a spatial index on the raster column — jbalk
– jbalk, Commented Sep 7, 2021 at 19:13
How did you load the raster? i.e. was it tiled with the -t option? — Pierre Racine
– Pierre Racine, Commented Sep 7, 2021 at 23:09
You should attach the result of the explain of the request. This is just a supposition, but I think the query planner is not able to properly optimise this requeste due to the Where coupled with a Cross join. It probably has to perform a Nested Loop on the table and then filtering, hence the long query time. If this is really what is happening I think that replacing the "CROSS JOIN <table> WHERE <filtrer>" part with "JOIN <table> ON<filtrer>" should work. — Atm
– Atm, Commented Sep 8, 2021 at 7:49

Atm · Accepted Answer · 2021-09-13 08:43:51Z

Like said in comments, this query is particularly long because the planner does not consider it necessary to use a geographical index when executing the WHERE clause. Initially I thought it was either because the table had no index, or the statistics were not up to date, or because of the use of a CROSS JOIN. So I tested these queries on a large raster table (a 1m DEM of a full french departement [NUTS 3], ~5 500 km²) with a geometry index and updated statistics.

The adaptation of your request have a similar comportment,the query is very long and the query plan as very similar :

with pairs(x,y) as ( values (980279.44, 6431129.89), (982573.24, 6429308.57) ) SELECT st_value(rast, ST_SetSRID(ST_Point(x, y), 2154)) FROM rgealti_fxx_0892_6372_mnt_lamb93_ign69 cross join pairs WHERE st_intersects(rast, ST_SetSRID(ST_Point(x, y), 2154));

The planner doesn't use a index, despite the fact that only two lines out of 600,000 need to be selected.

Gather (cost=1000.00..1441615.80 rows=399801 width=8) Workers Planned: 2 -> Nested Loop (cost=0.00..1400635.70 rows=166584 width=8) Join Filter: st_intersects(rgealti_fxx_0892_6372_mnt_lamb93_ign69.rast, st_setsrid(st_point(("*VALUES*".column1)::double precision, ("*VALUES*".column2)::double precision), 2154), NULL::integer) -> Parallel Seq Scan on rgealti_fxx_0892_6372_mnt_lamb93_ign69 (cost=0.00..10075.76 rows=249876 width=20) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=64)

But with small changes it's works :

WITH points(geom) as ( values (ST_SetSRID(ST_Point(980279.44, 6431129.89), 2154)), (ST_SetSRID(ST_Point(982573.24, 6429308.57), 2154)), (ST_SetSRID(ST_Point(926078.71, 6371869.75), 2154)) ) SELECT rid, geom as select_point, st_value(rast, geom) as height, st_envelope(rast) as raster_bbox FROM rgealti_fxx_0892_6372_mnt_lamb93_ign69 Join points on st_intersects(rast, geom);

Unfortunately, I am unable to explain why this second version uses an index. Maybe these can't work when a point is created on the fly, but it seems to me that it is possible in other situations, this is a question to be explored.

Nested Loop (cost=0.29..1330.14 rows=600 width=76) (actual time=0.855..1.335 rows=3 loops=1) -> Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=32) (actual time=0.003..0.005 rows=3 loops=1) -> Index Scan using rgealti_fxx_0892_6372_mnt_lamb93_ign69_st_convexhull_idx on rgealti_fxx_0892_6372_mnt_lamb93_ign69 (cost=0.29..392.67 rows=20 width=24) (actual time=0.177..0.178 rows=1 loops=3) Index Cond: ((rast)::geometry && "*VALUES*".column1) Filter: _st_intersects("*VALUES*".column1, rast, NULL::integer) Planning Time: 0.560 ms Execution Time: 1.381 ms

A B S O L U T E L Y amazing, went from 7506.925 ms to Execution Time: 0.324 ms on my dev server! — Felix Blackburn
– Felix Blackburn, Commented Sep 13, 2021 at 15:35

Stack Exchange Network

PostgreSQL query from large raster table problem

1 Answer 1

Hot Network Questions

PostgreSQL query from large raster table problem

1 Answer 1

Related

Hot Network Questions