I have a table, etablissements, with millions of rows and a geom (ST_Point) column that is spatially indexed (using GiST). Each feature has a category, stored in activite_principale.
From a given, fixed point, I want to find the closest company in each of 5 or 6 different activite_principale categories (one closest company per category).
Here's what I have so far:
(WITH closest_candidates AS (
    SELECT ent.id, ent.name, ent.geom
    FROM geo_data.etablissements ent
    WHERE ent.activite_principale = '1071C'
    ORDER BY ent.geom <-> 'SRID=4326;POINT (5.4153978921979125 43.271437384501965)'::geometry
    LIMIT 10
)
SELECT id
FROM closest_candidates
ORDER BY ST_Distance(
    geom,
    'SRID=4326;POINT (5.4153978921979125 43.271437384501965)'::geometry
)
LIMIT 1)
UNION ALL
(WITH closest_candidates AS (
    SELECT ent.id, ent.name, ent.geom
    FROM geo_data.etablissements ent
    WHERE ent.activite_principale = '4711D'
    ORDER BY ent.geom <-> 'SRID=4326;POINT (5.4153978921979125 43.271437384501965)'::geometry
    LIMIT 10
)
SELECT id
FROM closest_candidates
ORDER BY ST_Distance(
    geom,
    'SRID=4326;POINT (5.4153978921979125 43.271437384501965)'::geometry
)
LIMIT 1)
-- UNION ALL
-- [...] And so on...

I then clustered the etablissements table around the geom spatial index and ran VACUUM ANALYZE geo_data.etablissements;
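For reference, the clustering step described above would look roughly like this. The index name etablissements_geom_idx is an assumption, since the post does not give the actual name of the GiST index:

```sql
-- Physically reorder the table along the spatial index.
-- NOTE: etablissements_geom_idx is a hypothetical index name.
CLUSTER geo_data.etablissements USING etablissements_geom_idx;

-- Reclaim space and refresh planner statistics after the rewrite.
VACUUM ANALYZE geo_data.etablissements;
```

CLUSTER is a one-time reordering: rows inserted or updated afterwards are not kept in index order, so it has to be repeated if the table changes a lot.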
Here's the result of EXPLAIN ANALYZE after clustering.
Both planning and execution are much faster now, but the query is still slow (350-450 ms).
I looked into a compound index on geom and another text column, but that does not seem to be possible today?
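For what it's worth, the btree_gist extension that ships with PostgreSQL adds GiST operator classes for scalar types, which should allow a single GiST index over both columns. A minimal sketch, assuming activite_principale is a text or varchar column; the index name is made up for the example, and whether the planner actually prefers this index would need to be checked with EXPLAIN ANALYZE:

```sql
-- btree_gist provides GiST operator classes for scalar types such as text.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical index name; category first, geometry second.
CREATE INDEX etablissements_activite_geom_idx
    ON geo_data.etablissements
    USING gist (activite_principale, geom);

-- Refresh statistics so the planner can consider the new index.
ANALYZE geo_data.etablissements;
```

The idea is that the equality filter on activite_principale and the <-> ordering on geom can then be served by the same index scan, instead of walking the spatial index and discarding rows from other categories.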
I use PostgreSQL 10 and PostGIS 2.4.
I don't know how to improve the query based on the query planner's output. Can I get better performance?
<-> returns true distance from 2.5 on); still, if it's point-to-point, the bbox comparison equals the true distance (I'm sure that's noted in the article somewhere). I'm going to post a query whose speed I'm curious to see on your system...
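As an illustration of the point-to-point remark (a sketch only, not necessarily the query announced above): if <-> already gives the true distance between points, the candidate CTE and the ST_Distance re-sort could be dropped, and the per-category lookups could be written once with a LATERAL join. The two category codes are the ones from the question; the aliases cat and nearest are made up for the example:

```sql
-- One nearest-neighbour lookup per category, ordered by <-> alone.
SELECT cat.code, nearest.id
FROM (VALUES ('1071C'), ('4711D')) AS cat(code)  -- add the remaining categories here
CROSS JOIN LATERAL (
    SELECT ent.id
    FROM geo_data.etablissements AS ent
    WHERE ent.activite_principale = cat.code
    ORDER BY ent.geom <-> 'SRID=4326;POINT (5.4153978921979125 43.271437384501965)'::geometry
    LIMIT 1
) AS nearest;
```

This returns at most one row per category; a category with no matching rows simply produces no output row.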