We've been tasked to restore one large database on daily basis for analytics purpose.
Here is something that we cannot change:
- Vendor provided database dump file on daily basis on S3
- We cannot control how the database get dumped. We only get the dump file.
- We don't have the option to do replication or access to the database
What we can control / do :
- We can shutdown the DB / kick the user out
- We have full control of hardware / VM for this process
Current process:
We use a VM hosted in VMC ( Vmware cloud ). When we detected a new database dump fiel in s3 - we spin up a VM with maximum resources one ESX host can take. Then fully allocate all CPU and memory to it, then run the restore. Once resource completed , we shutdown the VM and allocate a more "appropriate" resource for it and allow user in.
I managed to optimise this in less than 3 hours, however we would like to avoid this process because it affect performance on another VMs in the ESX cluster when we take one whole host
New process:
We provision a new a bare metal physical server with more powerful CPU, bigger memory and locally attached enterprise SSD disk.
Issue:
Restore 5 to 6 hours when run on the Physical server using the same configuration as the VM.
Server Spec :
Restore command:
pg_restore -vvv -U postgres -j 40 -d $DB ~/$DB.pgdump pg_dump TOC :
; Archive created at 2025-09-14 16:08:54 AEST ; dbname: database ; TOC Entries: 8221 ; Compression: none ; Dump Version: 1.14-0 ; Format: CUSTOM ; Integer: 4 bytes ; Offset: 8 bytes ; Dumped from database version: 14.15 ; Dumped by pg_dump version: 14.19 (Ubuntu 14.19-1.pgdg22.04+1) postgresql.conf (restore mode)
listen_addresses = '' max_connections = 98 ssl = off shared_buffers = 16GB work_mem = 64MB maintenance_work_mem = 64GB wal_level = minimal fsync = off synchronous_commit = off full_page_writes = off wal_buffers = 16MB checkpoint_timeout = 2h max_wal_size = 16GB archive_mode = off max_wal_senders = 0 Observation during restore :
- pg_restore processes seems to be the one using up most CPU resources. At times where multiple postgres processes use 0% CPU , and multiple pg_restore prcesses uses 100% CPU
- No storage contention - majority of the time I/O wait time is 0.
I'm at lost as to why this is happening. I tried to increase paralism but it doesn't help. Any ideas on how to speed things up ?

perf. The problem could be anywhere: hardware, operating system configuration, database. A Q&A forum is not suited for investigations like that.-j=40results in a 5 to 6 hours restore time, while-j 5results in 3 hours 42 minutes. That tells you that40is too much; it's not the best performing setting. Try 10 or 20, and monitor the progress to identify the bottleneck and determine when you achieve the best performance.