
We are running PostgreSQL 9.1. One of our tables previously held over a billion rows, all of which have since been deleted. However, \l+ still reports the database size inaccurately: it shows 568 GB, while the actual data is much, much smaller.

The proof that 568 GB is wrong is that the individual table sizes don't tally up to it: as you can see below, the top 20 relations come to 4292 MB, and each of the remaining 985 relations is well below 10 MB. In total they add up to less than about 6 GB.

Any idea why PostgreSQL is carrying so much bloat? And if that's confirmed, how can I remove it? I am not very familiar with VACUUM; is that what I need? If so, how?

Much appreciated.

pmlex=# \l+
                                                                 List of databases
      Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 pmlex           | pmlex    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 568 GB  | pg_default |
 pmlex_analytics | pmlex    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 433 MB  | pg_default |
 postgres        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 5945 kB | pg_default | default administrative connection database
 template0       | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 5841 kB | pg_default | unmodifiable empty database
                 |          |          |             |             | postgres=CTc/postgres |         |            |
 template1       | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 5841 kB | pg_default | default template for new databases
                 |          |          |             |             | postgres=CTc/postgres |         |            |
(5 rows)

pmlex=# SELECT nspname || '.' || relname AS "relation",
pmlex-#        pg_size_pretty(pg_relation_size(C.oid)) AS "size"
pmlex-# FROM pg_class C
pmlex-# LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
pmlex-# WHERE nspname NOT IN ('pg_catalog', 'information_schema')
pmlex-# ORDER BY pg_relation_size(C.oid) DESC;
              relation               |  size
-------------------------------------+---------
 public.page_page                    | 1289 MB
 public.page_pageimagehistory        | 570 MB
 pg_toast.pg_toast_158103            | 273 MB
 public.celery_taskmeta_task_id_key  | 233 MB
 public.page_page_unique_hash_uniq   | 140 MB
 public.page_page_ad_text_id         | 136 MB
 public.page_page_kn_result_id       | 125 MB
 public.page_page_seo_term_id        | 124 MB
 public.page_page_kn_search_id       | 124 MB
 public.page_page_direct_network_tag | 124 MB
 public.page_page_traffic_source_id  | 123 MB
 public.page_page_active             | 123 MB
 public.page_page_is_referrer        | 123 MB
 public.page_page_category_id        | 123 MB
 public.page_page_host_id            | 123 MB
 public.page_page_serp_id            | 121 MB
 public.page_page_domain_id          | 120 MB
 public.celery_taskmeta_pkey         | 106 MB
 public.page_pagerenderhistory       | 102 MB
 public.page_page_campaign_id        | 89 MB
 ...
 ...
 ...
 pg_toast.pg_toast_4354379           | 0 bytes
(1005 rows)
  • Does select pg_size_pretty(pg_database_size('pmlex')); show 568GB as well? Commented Aug 7, 2013 at 1:30
  • Is autovacuum enabled? You can expedite tuple reuse by issuing a manual VACUUM, but that isn't going to reclaim the space, only mark it as reusable. Did you delete the entire table? If so, why not TRUNCATE? Next point: if you deleted most of the table, it might be worth creating a copy of the table, truncating the original, then copying the data back in. That will free up space and recreate the indexes (the TRUNCATE is what frees the space). Commented Aug 7, 2013 at 2:01
  • @bma thanks for the comments. And yes, that query shows the bloat too: pmlex=# select pg_size_pretty(pg_database_size('pmlex')); gives 568 GB. As for the proper way to truncate a large table: unfortunately those deletions predate me, so we are where we are. I am just trying to find a post-mortem remediation. Commented Aug 8, 2013 at 14:34
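For completeness, one way to confirm that the missing space really lives inside the relations (i.e. table bloat) rather than stray files in the data directory is to compare pg_database_size against the sum of per-table totals. A hedged sketch using only built-in size functions; pg_total_relation_size already folds each table's indexes and TOAST data into its number, which is why the filter restricts to ordinary tables:

    SELECT pg_size_pretty(pg_database_size('pmlex'))                AS reported_db_size,
           pg_size_pretty(sum(pg_total_relation_size(oid))::bigint) AS sum_of_tables
    FROM pg_class
    WHERE relkind = 'r';  -- ordinary tables only; indexes and TOAST
                          -- are counted via pg_total_relation_size

If the two numbers roughly agree, the bloat is in the tables themselves and the rebuild options below apply.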

2 Answers


Your options include:

1) Ensuring autovacuum is enabled and set aggressively (see the tuning sketch after this list).

2) Recreating the table as I mentioned in an earlier comment (create-table-as-select + truncate + reload the original table); sketched below.

3) Running CLUSTER on the table, if you can afford to be locked out of it while the rewrite runs (it takes an exclusive lock).

4) Running VACUUM FULL, though CLUSTER is more efficient and is the recommended route.

5) Running a plain VACUUM ANALYZE a few times and leaving the table as-is, letting new data gradually reuse the dead space as it comes in.

6) Dumping and reloading the table via pg_dump.

7) pg_repack (though I haven't used it in production).
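For option 1, a minimal sketch of checking and tightening autovacuum on 9.1; big_table is a hypothetical name and the thresholds are illustrative only, not recommendations:

    -- Confirm autovacuum is on at all (it is on by default in 9.1).
    SHOW autovacuum;

    -- Per-table storage parameters make autovacuum fire earlier on one hot
    -- table: vacuum at ~5% dead rows instead of the 20% default.
    ALTER TABLE big_table SET (
        autovacuum_vacuum_scale_factor = 0.05,
        autovacuum_vacuum_threshold    = 1000
    );

Keep in mind that plain (auto)vacuum only marks dead space as reusable inside the files; it does not shrink them on disk.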
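For options 2 through 4, a sketch of the approaches that actually return space to the operating system. big_table and big_table_pkey are hypothetical names; all three take an ACCESS EXCLUSIVE lock, so stop writers and plan a maintenance window:

    -- Option 2: copy the live rows out, truncate, reload.
    -- TRUNCATE releases the old files immediately.
    BEGIN;
    CREATE TEMP TABLE big_table_copy AS SELECT * FROM big_table;
    TRUNCATE big_table;
    INSERT INTO big_table SELECT * FROM big_table_copy;
    COMMIT;

    -- Option 3: CLUSTER rewrites the table and all of its indexes in one step.
    CLUSTER big_table USING big_table_pkey;

    -- Option 4: VACUUM FULL (a full rewrite since 9.0) also compacts the table.
    VACUUM FULL big_table;

With ~6 GB of live data, any of these rewrites should be quick relative to the 568 GB currently on disk.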


1 Comment

  • Thanks @bma! I will give these a shot soon.

It will likely look different if you use pg_total_relation_size instead of pg_relation_size.

pg_relation_size doesn't give the total size of the table; it measures only the main data fork, excluding indexes, TOAST data, and the free-space and visibility maps. See:

https://www.postgresql.org/docs/9.5/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
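For instance, the per-relation query from the question could be rewritten with pg_total_relation_size, which folds indexes, TOAST, and the other forks into each table's figure. A sketch; the relkind = 'r' filter avoids listing indexes and TOAST tables as separate rows, since their sizes are already included in their parent table's total:

    SELECT nspname || '.' || relname AS relation,
           pg_size_pretty(pg_total_relation_size(C.oid)) AS total_size
    FROM pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
    WHERE nspname NOT IN ('pg_catalog', 'information_schema')
      AND C.relkind = 'r'
    ORDER BY pg_total_relation_size(C.oid) DESC
    LIMIT 20;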
