3

There is DB with unknown amount of corrupted unique indexes. Currently I discover them one by one when I try REINDEX DATABASE CONCURRENTLY <db_name>, handle the specific duplication violation (manipulating data, mainly delete the duplications using scripts), reindex the specific table or index, and continue to the next index (again only right after I discover it using REINDEX DATABASE CONCURRENTLY).

Not mentioning that each time I get indexes with the '_ccnew' suffix, that AFAIK are indexes that were tried to be created by a previous reindex concurrently but couldn’t be done, usually because they are violating a uniqueness check. The failed attempts to reindex concurrently will sit in there and should be dropped manually.

Concurrent reindex is used in order to prevent a shutdown.

I want to reduce those "roundtrips" of searching the next violation and wonder if there is a more efficient way to get a general data about the status of all the index corruptions or unique violations that a Postgres DB has.

1 Answer 1

4

You can use the amcheck extension to check a B-tree index for corruption. For other index types, there is no alternative to rebuilding the index.

You can run the following to check an index for corruption:

SELECT bt_index_check('indexname'::regclass); 

That can be automatized to run against all the indexes in your database:

DO $$DECLARE indid oid; msg text; BEGIN FOR indid IN SELECT i.oid FROM pg_class AS i JOIN pg_am ON pg_am.oid = i.relam WHERE pg_am.amname = 'btree' AND i.relkind = 'i' AND i.relpersistence <> 't' LOOP BEGIN PERFORM bt_index_check(indid); EXCEPTION WHEN index_corrupted THEN GET STACKED DIAGNOSTICS msg = MESSAGE_TEXT; RAISE WARNING 'Index "%" is currupted: %', indid::regclass, msg; END; END LOOP; END;$$; 

bt_index_check() has an optional second parameter; if you supply TRUE, the check will take much longer, but verify that all table rows are indexed.

Note that bt_index_check() does not cover all kinds of index corruption. There is also bt_index_parent_check(), which tests more thoroughly, but requires a SHARE lock that prevents concurrent data modification.

2
  • 1
    From official docs you could add i.relpersistence != 't' "Don't check temp tables, which may be from another session" postgresql.org/docs/17/amcheck.html Commented Aug 4 at 16:16
  • @Cepxio Thanks; edited. Commented Aug 9 at 13:54

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.