I'm working on PostgreSQL (latest version), in a simple SCADA system. The problem is that every query involving the largest table in my schema [figure 1] takes about 2 hours.
What I want is the primary keys of the construction companies (constructora) that own buildings (edificio) whose departments (departamento) recorded values (valor in table medicion) above the allowed maximum for "gas" or "electricidad" in 2010. So...
variable contains the max value for a record (valmax)
medicion contains the value of the metric (valor)
Tables' Size (in rows).
constructora -> 10
edificio -> 100
departamento -> 50,000
variable -> 8
medicion_departamento -> 400,000
medicion -> 8,000,000
Query
I've written a query with INNER JOINs that goes from the smallest tables (constructora and variable) to the biggest one (medicion).
    SELECT DISTINCT C.id_constructora
    FROM constructora C
    INNER JOIN variable V ON (V.nombre = 'electricidad' OR V.nombre = 'gas')
    INNER JOIN edificio E ON (E.id_constructora = C.id_constructora)
    INNER JOIN departamento D ON (E.id_edificio = D.id_edificio)
    INNER JOIN medicion M ON (M.id_variable = V.id_variable)
    WHERE ( (M.valor > V.valmax) AND EXTRACT(YEAR FROM M.fecha) = 2010 );

EXPLAIN

    HashAggregate (cost=2343438.58..2343438.68 rows=10 width=4)
      Group Key: c.id_constructora
      -> Hash Join (cost=164536.25..1947605.25 rows=158333333 width=4)
           Hash Cond: (e.id_constructora = c.id_constructora)
           -> Hash Join (cost=4.25..1510.75 rows=50000 width=4)
                Hash Cond: (d.id_edificio = e.id_edificio)
                -> Seq Scan on departamento d (cost=0.00..819.00 rows=50000 width=4)
                -> Hash (cost=3.00..3.00 rows=100 width=8)
                     -> Seq Scan on edificio e (cost=0.00..3.00 rows=100 width=8)
           -> Hash (cost=164136.12..164136.12 rows=31670 width=4)
                -> Nested Loop (cost=0.00..164136.12 rows=31670 width=4)
                     -> Nested Loop (cost=0.00..163739.12 rows=3167 width=0)
                          Join Filter: ((m.valor > v.valmax) AND (v.id_variable = m.id_variable))
                          -> Seq Scan on medicion m (cost=0.00..162408.00 rows=38000 width=8)
                               Filter: (date_part('year'::text, fecha) = '2010'::double precision)
                          -> Materialize (cost=0.00..1.13 rows=2 width=24)
                               -> Seq Scan on variable v (cost=0.00..1.12 rows=2 width=24)
                                    Filter: (((nombre)::text = 'electricidad'::text) OR ((nombre)::text = 'gas'::text))
                     -> Materialize (cost=0.00..1.15 rows=10 width=4)
                          -> Seq Scan on constructora c (cost=0.00..1.10 rows=10 width=4)

My question is what can I do to reduce -significantly- the execution time of my query?

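The plan in the question filters medicion with date_part('year', fecha) = 2010 during a sequential scan. Wrapping the column in a function generally prevents a plain index on fecha from being used for that filter. A minimal sketch of one possible alternative, assuming fecha is a date or timestamp column (the post does not state its type) and using a hypothetical index name idx_medicion_fecha:

```sql
-- Assumption: medicion.fecha is of type date or timestamp.
-- idx_medicion_fecha is a hypothetical name chosen for this sketch.
CREATE INDEX idx_medicion_fecha ON medicion (fecha);

-- Same rows as EXTRACT(YEAR FROM M.fecha) = 2010, written as a half-open
-- range on the bare column so the planner can consider the index.
SELECT count(*)
FROM medicion M
WHERE M.fecha >= DATE '2010-01-01'
  AND M.fecha <  DATE '2011-01-01';
```

Whether the planner actually picks the index depends on how much of the table falls in 2010; if most rows do, a sequential scan may still be the cheaper choice.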
variable and medicion linked with other tables in the query? It seems that you have missed medicion_departamento in it...
medicion with departamento using id_departamento column. Isn't it?
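To illustrate the point raised in the comments: in the posted query, medicion is joined only to variable, so every matching measurement is combined with every departamento row, which is consistent with the estimate of roughly 158 million rows in the plan. A sketch of a fully linked version follows. The column names medicion.id_medicion, medicion_departamento.id_medicion and medicion_departamento.id_departamento are assumptions, since the post does not show the schema of those tables.

```sql
-- Assumptions (not confirmed by the post):
--   medicion has a primary key id_medicion;
--   medicion_departamento(id_medicion, id_departamento) links each
--   measurement to the department it belongs to.
SELECT DISTINCT C.id_constructora
FROM variable V
INNER JOIN medicion M               ON M.id_variable = V.id_variable
INNER JOIN medicion_departamento MD ON MD.id_medicion = M.id_medicion
INNER JOIN departamento D           ON D.id_departamento = MD.id_departamento
INNER JOIN edificio E               ON E.id_edificio = D.id_edificio
INNER JOIN constructora C           ON C.id_constructora = E.id_constructora
WHERE V.nombre IN ('electricidad', 'gas')
  AND M.valor > V.valmax
  AND EXTRACT(YEAR FROM M.fecha) = 2010;
```

With every table connected through a join condition, the result contains only construction companies that actually have an over-limit measurement, instead of all of them whenever any such measurement exists.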