Given a Postgres table foo that contains an external_id BIGINT column with some gaps: is there a way, without looping and without joining against (or excluding from) an excessively large generate_series, to get n values that are not present in that column?
Say the table has rows with external_ids 1, 3, 5 and 7, and I want 5 values back; it should return 2, 4, 6, 8, 9. The next select (assuming the previously returned ids have been used in the meantime) should return 10, 11, 12, 13, 14, and so on. So both the gaps and a continuously rising sequence of ids past the current maximum should be returned. I could do something like
```sql
WITH id_series AS (
    SELECT generate_series(1, 10) AS ids
)
SELECT id_series.ids
FROM id_series
LEFT JOIN known_ids ki ON id_series.ids = ki.id
WHERE ki.id IS NULL
LIMIT 5;
```

This, however, might not return enough unused ids if the series isn't large enough to contain 5 of them, forcing me to loop until 5 ids have been collected.
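One direction that avoids a fixed-size series would be to derive the gaps from the table itself with a window function and only expand as many candidate ids as needed, then append a series past the current maximum so the result never runs dry. A sketch, untested at this data volume, reusing the question's foo/external_id names with n = 5:

```sql
-- Find gap boundaries by comparing each id to the next one.
WITH gaps AS (
    SELECT external_id + 1 AS gap_start,
           next_id - 1     AS gap_end
    FROM (
        SELECT external_id,
               lead(external_id) OVER (ORDER BY external_id) AS next_id
        FROM foo
    ) s
    WHERE next_id - external_id > 1
),
-- Expand gaps lazily; LIMIT stops the expansion once enough
-- candidates exist, even if a single gap is millions wide.
gap_ids AS (
    SELECT g2.id
    FROM gaps g
    CROSS JOIN LATERAL generate_series(g.gap_start, g.gap_end) AS g2(id)
    LIMIT 5
),
-- Continue past the current maximum (COALESCE handles an empty table).
tail_ids AS (
    SELECT generate_series(m.max_id + 1, m.max_id + 5) AS id
    FROM (SELECT COALESCE(max(external_id), 0) AS max_id FROM foo) m
)
SELECT id
FROM (
    SELECT id FROM gap_ids
    UNION ALL
    SELECT id FROM tail_ids
) candidates
ORDER BY id
LIMIT 5;
```

The lead() pass still has to walk all existing ids in order, so an index on external_id (ideally allowing an index-only scan) would matter at 1.8 billion rows; whether that is fast enough here is something I'd have to measure.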
My table currently contains ids ranging between 50,000 and 1,800,000,000. Some gaps are a few million ids wide; other ranges have no gap for several million ids (potentially requiring many loop iterations, or a very large series).
Is there any clever & efficient way to solve this?
Here's the reason I'm doing this: the table contains the external_ids from an external data source. That data source has to be queried with individual ids, and it exposes its entities over time, so id 1 billion might become "public" before id 5. I therefore need to know which external ids I haven't received a response for yet, so I can query them periodically. As said before, this is something I can't change because I don't control the external data source; it's an actual business requirement.
external_id implies you do not control the value, so why are you manufacturing values? Further, even if you are generating the values, it is still not a problem: starting from the highest value you currently have (1,800,000,000) and generating 1,000,000 values per second, you have about 290,000 years before you exceed the maximum for a bigint.
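The comment's headroom estimate can be checked with a one-line query (a rough back-of-the-envelope: bigint max minus the current maximum, divided by the assumed rate of 1,000,000 ids per second, converted to years):

```sql
-- 9223372036854775807 is the bigint maximum; 31557600 is the number
-- of seconds in a Julian year. The result is on the order of
-- 290,000 years, consistent with the comment's claim.
SELECT (9223372036854775807 - 1800000000) / 1000000 / 31557600
       AS years_until_overflow;
```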