I have this table in MySQL, for example:
ID | Name
1  | Bob
4  | Adam
6  | Someguy

As you can see, IDs 2, 3 and 5 are missing.
How can I write a query so that MySQL returns only the missing IDs, in this case "2,3,5"?
    SELECT a.id + 1 AS start, MIN(b.id) - 1 AS end
    FROM testtable AS a, testtable AS b
    WHERE a.id < b.id
    GROUP BY a.id
    HAVING start < MIN(b.id)

Here is some additional reading on this subject: https://web.archive.org/web/20220706195255/http://www.codediesel.com/mysql/sequence-gaps-in-mysql/
A more efficient query:

    SELECT (t1.id + 1) AS gap_starts_at,
           (SELECT MIN(t3.id) - 1 FROM my_table t3 WHERE t3.id > t1.id) AS gap_ends_at
    FROM my_table t1
    WHERE NOT EXISTS (SELECT t2.id FROM my_table t2 WHERE t2.id = t1.id + 1)
    HAVING gap_ends_at IS NOT NULL

Note: an initial gap starting at id = 1 does not appear in the result, but it can easily be found by querying MIN(id).
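The range queries above return one row per gap, whereas the question asks for the individual IDs ("2,3,5"). One way to bridge that is to expand the ranges in application code. This is a minimal Python sketch, assuming the rows have already been fetched as (gap_starts_at, gap_ends_at) pairs; the function name expand_gaps and the sample rows are illustrative and not part of any answer here.

```python
def expand_gaps(ranges):
    """Expand inclusive (start, end) gap ranges into individual missing IDs."""
    missing = []
    for start, end in ranges:
        missing.extend(range(start, end + 1))
    return missing


# Hypothetical result rows for the sample table (IDs 1, 4, 6).
rows = [(2, 3), (5, 5)]
print(",".join(str(i) for i in expand_gaps(rows)))  # prints 2,3,5
```

For very large gaps it may be preferable to keep the ranges as they are, since expanding them materializes every missing ID in memory.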
If, rather than ranges of IDs, you want every missing ID on its own row, you could do the following:

    SELECT id + 1 FROM `table` WHERE id NOT IN (SELECT id - 1 FROM `table`) ORDER BY 1

The query is very efficient. However, it also returns one extra row at the end, equal to the highest ID plus 1. This last row can be ignored in your server script by checking the number of rows returned (mysqli_num_rows) and then using a for loop if the number of rows is greater than 1 (the query will always return at least one row).
Edit: I recently discovered that my original solution did not return all ID numbers that are missing, in cases where missing numbers are contiguous (i.e. right next to each other). However, the query is still useful in working out whether or not there are numbers missing at all, very quickly, and would be a time saver when used in conjunction with hagensoft's query (top answer). In other words, this query could be run first to test for missing IDs. If anything is found, then hagensoft's query could be run immediately afterwards to help identify the exact IDs that are missing (no time saved, but not much slower at all). If nothing is found, then a considerable amount of time is potentially saved, as hagensoft's query would not need to be run.
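The limitation described in this edit is easy to reproduce on the sample data from the question. The sketch below runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. SQLite is used here only as a stand-in for MySQL so that the example is self-contained; the table name users and the sample IDs are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption, for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (4,), (6,)])

rows = conn.execute(
    "SELECT id + 1 FROM users "
    "WHERE id NOT IN (SELECT id - 1 FROM users) "
    "ORDER BY 1"
).fetchall()
candidates = [r[0] for r in rows]
print(candidates)  # [2, 5, 7]

# Drop the trailing "highest ID + 1" row, as the answer suggests.
first_missing = candidates[:-1]
print(first_missing)  # [2, 5]
```

ID 3 is absent from the output even though it is missing from the table, because the query reports only the first ID of each gap.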
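    SELECT id + 1 AS id
    FROM users
    WHERE id NOT IN (SELECT id - 1 FROM users)
      -- Intended to skip the extra final row (highest ID + 1), assuming that
      -- value is the table's next AUTO_INCREMENT; the comparison is on id + 1
      -- because that is the value being returned.
      AND id + 1 != (
        SELECT `AUTO_INCREMENT`
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'users'
      )
    ORDER BY 1;

To add a little to Ivan's answer, this version also shows numbers missing at the beginning if 1 doesn't exist: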
    -- Each SELECT is parenthesized so that LIMIT 1 applies only to the first one.
    (SELECT 1 AS gap_starts_at,
            (SELECT MIN(t4.id) - 1 FROM testtable t4 WHERE t4.id > 1) AS gap_ends_at
     FROM testtable t5
     WHERE NOT EXISTS (SELECT t6.id FROM testtable t6 WHERE t6.id = 1)
     HAVING gap_ends_at IS NOT NULL
     LIMIT 1)
    UNION
    (SELECT (t1.id + 1) AS gap_starts_at,
            (SELECT MIN(t3.id) - 1 FROM testtable t3 WHERE t3.id > t1.id) AS gap_ends_at
     FROM testtable t1
     WHERE NOT EXISTS (SELECT t2.id FROM testtable t2 WHERE t2.id = t1.id + 1)
     HAVING gap_ends_at IS NOT NULL);

It would be far more efficient to get the start of each gap in one query and the end of each gap in another.
I had 18M records and it took me less than a second each to get the two results. When I tried getting them together my query timed out after an hour.
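Once the two result sets are fetched separately, they can be paired in application code. The sketch below illustrates one way to do that; it is not the answerer's own code. It runs the two queries given just below against an in-memory SQLite database via Python's sqlite3 module, which is used only as a stand-in for MySQL so the example runs on its own. The sample IDs are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption, for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequence (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO sequence (id) VALUES (?)", [(1,), (4,), (6,)])

# Every ID whose successor is missing marks the start of a gap.
starts = sorted(r[0] for r in conn.execute(
    "SELECT t1.id + 1 FROM sequence t1 "
    "WHERE NOT EXISTS (SELECT 1 FROM sequence t2 WHERE t2.id = t1.id + 1)"
))
# Every ID whose predecessor is missing marks the end of a gap.
ends = sorted(r[0] for r in conn.execute(
    "SELECT t1.id - 1 FROM sequence t1 "
    "WHERE NOT EXISTS (SELECT 1 FROM sequence t2 WHERE t2.id = t1.id - 1)"
))

# The largest "start" is MAX(id) + 1 and the smallest "end" is MIN(id) - 1;
# neither bounds a gap inside the table, so both are dropped before pairing.
gaps = list(zip(starts[:-1], ends[1:]))
print(gaps)  # [(2, 3), (5, 5)]
```

The pairing works because, once sorted, the n-th gap start and the n-th gap end belong to the same gap after the two out-of-range values are removed.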
Get the start of each gap:

    SELECT (t1.id + 1) AS MissingID
    FROM sequence t1
    WHERE NOT EXISTS (SELECT t2.id FROM sequence t2 WHERE t2.id = t1.id + 1);

Get the end of each gap:

    SELECT (t1.id - 1) AS MissingID
    FROM sequence t1
    WHERE NOT EXISTS (SELECT t2.id FROM sequence t2 WHERE t2.id = t1.id - 1);

The queries above give two columns, so you can try this to get the missing numbers in a single column:
    SELECT start FROM (
        SELECT a.id + 1 AS start, MIN(b.id) - 1 AS end
        FROM sequence AS a, sequence AS b
        WHERE a.id < b.id
        GROUP BY a.id
        HAVING start < MIN(b.id)
    ) b
    UNION
    SELECT c.end FROM (
        SELECT a.id + 1 AS start, MIN(b.id) - 1 AS end
        FROM sequence AS a, sequence AS b
        WHERE a.id < b.id
        GROUP BY a.id
        HAVING start < MIN(b.id)
    ) c
    ORDER BY start;

475, 477, 506, 508, 513 but with the two-column version, it gets me [475,475], [477,506], [508,513], which tells me I am missing numbers 475, 477-506, and 508-513.

By using window functions (available in MySQL 8), finding the gaps in the id column can be expressed as:
    WITH gaps AS (
        SELECT
            LAG(id, 1, 0) OVER (ORDER BY id) AS gap_begin,
            id AS gap_end,
            id - LAG(id, 1, 0) OVER (ORDER BY id) AS gap
        FROM test
    )
    SELECT gap_begin, gap_end
    FROM gaps
    WHERE gap > 1;

If you are on an older version of MySQL, you would have to rely on user variables (the so-called poor man's window function idiom):
    SELECT gap_begin, gap_end
    FROM (
        SELECT
            @id_previous AS gap_begin,
            id AS gap_end,
            id - @id_previous AS gap,
            @id_previous := id
        FROM (
            SELECT t.id FROM test t ORDER BY t.id
        ) AS sorted
        JOIN (
            SELECT @id_previous := 0
        ) AS init_vars
    ) AS gaps
    WHERE gap > 1;

gap_begin be LAG(...) OVER(...)+1, and gap_end be id-1 (assuming inclusive)?

If you want a lighter way to search millions of rows of data:
    SET @st = 0, @diffSt = 0, @diffEnd = 0;
    SELECT res.startID, res.endID, res.diff,
           CONCAT("SELECT * FROM lost_consumer WHERE ID BETWEEN ", res.startID + 1, " AND ", res.endID - 1) AS `query`
    FROM (
        SELECT @diffSt := (@st) `startID`,
               @diffEnd := (a.ID) `endID`,
               @st := a.ID `end`,
               @diffEnd - @diffSt - 1 `diff`
        FROM consumer a
        ORDER BY a.ID
    ) res
    WHERE res.diff > 0;

Check out this example: http://sqlfiddle.com/#!9/3ea00c/9
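The user-variable query above walks the IDs once, in order, and compares each ID with the previous one. The same single-pass idea can be sketched in application code, for example over the rows of an ordered SELECT of the ID column. In this Python sketch, find_gaps and first_expected are hypothetical names, and the bounds are reported inclusively (first and last missing ID), whereas the query reports the neighbouring existing IDs.

```python
def find_gaps(sorted_ids, first_expected=1):
    """Return inclusive (start, end) ranges of IDs absent from sorted_ids.

    sorted_ids must be in ascending order; first_expected is the lowest ID
    that is supposed to exist (assumed to be 1 here).
    """
    gaps = []
    previous = first_expected - 1
    for current in sorted_ids:
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
        previous = current
    return gaps


print(find_gaps([1, 4, 6]))  # [(2, 3), (5, 5)]
print(find_gaps([3, 4, 8]))  # [(1, 2), (5, 7)]
```

Because only the previous ID is kept between iterations, the input can be streamed from a cursor rather than loaded in full.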