Put succinctly, if a query tells me A overlaps B then I don't need it to also tell me that B also overlaps A as they overlap each other.
So I am trying to use a self join in sql to select just 'DISTINCT' overlaps.
To illustrate, here is a simple SQL fiddle that I wrote to show inclusive overlap selection (http://sqlfiddle.com/#!9/7af84f/1)
In detail...
Assume I have a table of name (char), d1 (int), d2 (int) , the schema of which is below. Here d1 and d2 represent the start and end of some interval that might overlap with another interval in the same table,.
CREATE TABLE test ( letter char , d1 int , d2 int ) ; Given this table I fill it with some values
INSERT INTO test (letter,d1,d2) VALUES ('A', 2, 10), -- overlaps C and D ('B', 12, 20), -- overlaps E ('C', 5, 10), -- overlaps A and D ('D', 1, 8), -- overlaps A and C ('E', 13, 15), -- overlaps B ('F', 25, 30); -- doesn't overlap anything and run the following query that uses a self join to correctly find the rows where d1 and d2 in one row has an inclusive overlap with d1 and d2 in other rows.
-- selects all records that overlap in the range d1 - d2 inclusive -- (excluding the implicit overlap between a record and itself) -- The results are sorted by letter followed by d1 SELECT basetable.letter as test_letter, basetable.d1, basetable.d2, overlaptable.letter as overlap_letter, overlaptable.d1 as overlap_d1, overlaptable.d2 as overlap_d2 FROM test as basetable, test as overlaptable WHERE -- there is an inclusive overlap basetable.d1 <= overlaptable.d2 and basetable.d2 >= overlaptable.d1 AND -- the row being checked is not itsself basetable.letter <> overlaptable.letter AND basetable.d1 <> overlaptable.d1 AND basetable.d2 <> overlaptable.d2 ORDER BY basetable.letter, basetable.d1 That correctly gives me the following, showing all 6 versions of overlaps eg left hand column indicates that A overlaps C and another row shows that C overlaps A (note the sqlfiddle doesn't seem to understand field aliases so my column headers are different)
test_letter d1 d2 overlap_letter overlap_d1 overlap_d2 A 2 10 D 1 8 B 12 20 E 13 15 C 5 10 D 1 8 D 1 8 A 2 10 D 1 8 C 5 10 E 13 15 B 12 20 My question is this:
How can I alter the sql to just get four rows of 'DISTINCT' or 'one way' overlaps?
ie this result...
test_letter d1 d2 overlap_letter overlap_d1 overlap_d2 A 2 10 D 1 8 A 2 10 C 5 10 B 12 20 E 13 15 C 5 10 D 1 8 eg:
a result that just shows records for A, B and C in the left hand column according to the following reasoning
- A(2,10) overlaps with D(1,8) and C(5,10) and {SHOW THESE TWO ROWS}
- B(12,20) overlaps with E(13,15) {SHOW THIS ROW}
- C(5,10) overlaps with D(1,8) {SHOW THIS ROW but don't show the A(1,10) overlap as row 2 already shows that A and C overlap}
- D(1,8) {DON'T SHOW anything new as we already know about A(1,10) and C(5,10)}
- E(13,15) {DON'T SHOW anything new as we already know about B(12,20) }
- F(25,30) {DON'T SHOW anything as there are no overlaps}
basetable.letter <> overlaptable.lettertobasetable.letter < overlaptable.letter. It will also speed up your query by up to 50 percent. (Which is, now that I can see it, exactly what @Strawberry wrote using words).basetable.letter {< | > }overlaptable.letterother possible options arebasetable.d1 > overlaptable.d1' ,basetable.d2 < overlaptable.d2`