If your server is well configured, most of the storage I/O should be writes. Apart from the (big) initial read, all reads should be served from RAM, so the server should do almost nothing but write I/O to flush to disk the blocks that were modified in RAM.
Since PostgreSQL constantly issues small writes (sized according to its page size), it would not be surprising for all PostgreSQL data files to be highly fragmented.
I noticed the same kind of "issue" (more than 28% of a terabyte-sized BTRFS filesystem lost) in a setup where VMs are stored on a big BTRFS filesystem, each VM uses COW VMDK files as disks, and some of the VMs run databases (especially MariaDB / PostgreSQL).
The way I recovered most of this space was to run:

    $ sudo btrfs balance start -musage=100 -dusage=100 /mnt/vm
    $ sudo btrfs filesystem defrag -r -f -v /mnt/vm

and then to run a balance again:

    $ sudo btrfs balance start -musage=100 -dusage=100 /mnt/vm

This way I recovered most of the "lost" space.
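The three steps above can be wrapped in a small script so that the sequence stops if one step fails. This is only a sketch: the `run` helper, the `DRY_RUN` variable and the `reclaim_btrfs_space` function are names I made up for illustration, and the btrfs commands are exactly the ones shown above. Keep in mind that defragmenting unshares reflinked or snapshotted extents, so on a filesystem that relies on snapshots it can increase usage instead of reducing it.

```shell
#!/bin/sh
# Sketch of the balance / defrag / balance sequence.
# run, DRY_RUN and reclaim_btrfs_space are hypothetical names, not btrfs features.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = mount point of the BTRFS filesystem
reclaim_btrfs_space() {
    mnt=$1
    # The && chain stops at the first failing step.
    run sudo btrfs balance start -musage=100 -dusage=100 "$mnt" &&
    run sudo btrfs filesystem defrag -r -f -v "$mnt" &&
    run sudo btrfs balance start -musage=100 -dusage=100 "$mnt"
}

# Dry run (prints the three commands without executing them):
#   DRY_RUN=1 reclaim_btrfs_space /mnt/vm
```

Running it first with `DRY_RUN=1` is a cheap way to check the commands before letting them loose on a production filesystem.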
Please also note that, before doing anything else, you should read :
Here are the results in my case:
Real data (original and last state):
    $ sudo btrfs fi du -s /mnt/vm
         Total   Exclusive  Set shared  Filename
     669.62GiB   669.22GiB   401.46MiB  /mnt/vm

BTRFS filesystem usage (original state):
    $ sudo btrfs fi usage /mnt/vm
    Overall:
        Device size:                1000.00GiB
        Device allocated:            955.07GiB
        Device unallocated:           44.93GiB
        Device missing:                  0.00B
        Device slack:                    0.00B
        Used:                        950.71GiB
        Free (estimated):             46.74GiB      (min: 24.28GiB)
        Free (statfs, df):            46.74GiB
        Data ratio:                       1.00
        Metadata ratio:                   2.00
        Global reserve:              512.00MiB      (used: 0.00B)
        Multiple profiles:                  no

    Data,single: Size:949.01GiB, Used:947.19GiB (99.81%)
       /dev/mapper/vgu2nvme-lvvm     949.01GiB

    Metadata,DUP: Size:3.00GiB, Used:1.76GiB (58.56%)
       /dev/mapper/vgu2nvme-lvvm       6.00GiB

    System,DUP: Size:32.00MiB, Used:144.00KiB (0.44%)
       /dev/mapper/vgu2nvme-lvvm      64.00MiB

    Unallocated:
       /dev/mapper/vgu2nvme-lvvm      44.93GiB

Last state after all operations (the first balance only recovered 10 GiB):
    $ sudo btrfs fi usage /mnt/vm
    Overall:
        Device size:                1000.00GiB
        Device allocated:            711.07GiB
        Device unallocated:          288.93GiB
        Device missing:                  0.00B
        Device slack:                    0.00B
        Used:                        708.02GiB
        Free (estimated):            291.72GiB      (min: 147.26GiB)
        Free (statfs, df):           291.72GiB
        Data ratio:                       1.00
        Metadata ratio:                   2.00
        Global reserve:              512.00MiB      (used: 0.00B)
        Multiple profiles:                  no

    Data,single: Size:709.01GiB, Used:706.21GiB (99.61%)
       /dev/mapper/vgu2nvme-lvvm     709.01GiB

    Metadata,DUP: Size:1.00GiB, Used:926.33MiB (90.46%)
       /dev/mapper/vgu2nvme-lvvm       2.00GiB

    System,DUP: Size:32.00MiB, Used:144.00KiB (0.44%)
       /dev/mapper/vgu2nvme-lvvm      64.00MiB

    Unallocated:
       /dev/mapper/vgu2nvme-lvvm     288.93GiB

So it is not perfect (about 4% of lost space is still not recovered), but it is the best I managed to do!
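The 28% and 4% figures can be cross-checked from the outputs above by comparing "Used" (from btrfs fi usage) with "Total" (from btrfs fi du). The sketch below assumes the percentages are taken relative to the 1000 GiB device size, which is my reading of the numbers rather than something stated explicitly; `overhead_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# overhead_pct is a hypothetical helper: space used beyond the real data,
# expressed as a percentage of the device size (all arguments in GiB).
#   $1 = "Used" from btrfs fi usage
#   $2 = "Total" from btrfs fi du -s
#   $3 = "Device size" from btrfs fi usage
overhead_pct() {
    # LC_ALL=C keeps "." as the decimal separator whatever the system locale is.
    LC_ALL=C awk -v used="$1" -v data="$2" -v size="$3" \
        'BEGIN { printf "%.1f\n", (used - data) / size * 100 }'
}

overhead_pct 950.71 669.62 1000   # original state      -> 28.1
overhead_pct 708.02 669.62 1000   # after all operations -> 3.8
```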
NB: All those operations were done online, with the filesystem mounted and about 20 VMs running on it.
Maybe the only way to recover the last 4% would be a cold copy of the data to another freshly formatted BTRFS filesystem (= stopping ~20 production VMs and doing cp -a)...
So if someone knows how to recover the last 4% of lost space without copying the data to another filesystem, it would help a lot.
2025-02-12 EDIT

Using the "new cache" option of btrfs (space_cache=v2) did recover the 4% of lost space. I mounted the filesystem with the options ssd,discard=async,space_cache=v2:

    mount -o ssd,discard=async,space_cache=v2 /dev/vgu2nvme/lvvm /mnt/vm

Now the difference is less than 0.5%:
    $ df /mnt/vm/
    Filesystem                Type   Size  Used Avail Use% Mounted on
    /dev/mapper/vgu2nvme-lvvm btrfs  1,0T  776G  248G  76% /mnt/vm
    $ du -sh /mnt/vm/
    772G    /mnt/vm/
    $ sudo btrfs fi du -s /mnt/vm
         Total   Exclusive  Set shared  Filename
     771.85GiB   771.85GiB       0.00B  /mnt/vm

And the output of btrfs fi usage confirms it:
    $ sudo btrfs fi usage /mnt/vm/
    Overall:
        Device size:                   1.00TiB
        Device allocated:            780.07GiB
        Device unallocated:          243.93GiB
        Device missing:                  0.00B
        Device slack:                    0.00B
        Used:                        774.75GiB
        Free (estimated):            247.33GiB      (min: 125.37GiB)
        Free (statfs, df):           247.33GiB
        Data ratio:                       1.00
        Metadata ratio:                   2.00
        Global reserve:              512.00MiB      (used: 0.00B)
        Multiple profiles:                  no

    Data,single: Size:776.01GiB, Used:772.60GiB (99.56%)
       /dev/mapper/vgu2nvme-lvvm     776.01GiB

    Metadata,DUP: Size:2.00GiB, Used:1.07GiB (53.57%)
       /dev/mapper/vgu2nvme-lvvm       4.00GiB

    System,DUP: Size:32.00MiB, Used:128.00KiB (0.39%)
       /dev/mapper/vgu2nvme-lvvm      64.00MiB

    Unallocated:
       /dev/mapper/vgu2nvme-lvvm     243.93GiB
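If you want to verify that the filesystem is really mounted with space_cache=v2, one possible approach is to look at the active mount options. The sketch below is an illustration only: `has_mount_option` is a hypothetical helper that does plain string matching on a comma-separated option list, and on a live system the list could come from findmnt (part of util-linux).

```shell
#!/bin/sh
# has_mount_option is a hypothetical helper: it succeeds when the exact
# option $2 appears in the comma-separated mount option list $1.
has_mount_option() {
    # Wrapping both sides in commas avoids partial matches
    # (e.g. "space_cache" must not match "space_cache=v2").
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Possible usage on a live system:
#   has_mount_option "$(findmnt -no OPTIONS /mnt/vm)" space_cache=v2 \
#       && echo "space_cache=v2 is active"
```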