
All modern HDDs are "Advanced Format" drives, i.e. by default they report a logical/physical sector size of 512/4096 bytes.

By default, most Linux formatting tools use a block size of 4096 bytes (at least that's the default on Debian/EXT4).

Until today, I thought this was more or less optimal: Linux/EXT4 sends 4K chunks of data to the HDD, which can handle them optimally, even though its logical sector size is 512 bytes.

But today I read this fairly recent (2021) post. The author ran some HDD benchmarks to check whether switching his HDD's logical sector size from 512e to 4Kn would improve performance. His conclusion:

Remember: My theory going in was that the filesystem uses 4k blocks, and everything is properly aligned, so there shouldn’t be a meaningful difference.

Does that hold up? Well, no. Not at all. (...) Using 4kb blocks… there’s an awfully big difference here. This is single threaded benchmarking, but there is consistently a huge lead going to the 4k sector drive here on 4kb block transfers. (...)

Conclusions: Use 4k Sectors!
As far as I’m concerned, the conclusions here are pretty clear. If you’ve got a modern operating system that can handle 4k sectors, and your drives support operating either as 512 byte or 4k sectors, convert your drives to 4k native sectors before doing anything else. Then go on your way and let the OS deal with it.

Basically, he found a sizeable performance improvement from switching the HDD's logical sector size to 4Kn, compared with the out-of-the-box 512e:

[Image: benchmark chart from the post]

One important thing to note: that particular benchmark was single-threaded. He also ran a 4-thread benchmark, which showed no significant difference between 512e and 4Kn.

Thus my questions :

  • His conclusion holds only for single-threaded processes that read from or write to the drive. Does Linux have such single-threaded processes?
  • And thus, would you recommend setting an HDD's logical sector size to 4Kn?
  • This short but very good article leads me to believe "if you're not using 4k in 2023 you're either doing something wrong or have old systems". Commented Nov 17, 2023 at 14:28
  • Also, did you actually check the physical block size of your HDD? "even though its logical sector size is 512K" how old is this drive we're talking about? Commented Nov 17, 2023 at 14:29
  • I believe from 2017. It's a 512e HDD. Sector sizes (and not block sizes !) can be checked with smartctl -d sat -i /dev/sde | grep ^Sector, or fdisk -l /dev/sde | grep ^Sector. Commented Nov 17, 2023 at 14:55
  • "if you're not using 4k in 2023 you're either doing something wrong or have old systems" : I agree. See also that comment : unix.stackexchange.com/questions/562571/… Commented Nov 17, 2023 at 14:57
  • That being said, I'm not sure ALL block devices today come with 4Kn by default. I installed a laptop with 2 fairly recent NVMe SSDs, a WD_BLACK SN850X (on which I installed the OS) and a Crucial CT4000P3PSSD8. I just noticed (nvme id-ns -H /dev/nvmeXn1 | grep "^LBA Format") that the WD BLACK uses 512b sectors, and the Crucial uses 4096b sectors. So it seems some NVMe SSDs are shipped with the legacy 512 bytes sector size, and others already with the newer 4096 bytes one. I wonder why. nvme id-ns clearly states that 512 bytes is Good, while 4096 bytes is Better. Commented Nov 17, 2023 at 15:00

2 Answers


Following @Tomes's advice, I'm trying to answer my own question, based on my comment exchange with @user10489.

Of course I am no expert on this matter, so don't hesitate to amend or correct my statements if needed.

But first, a clarification, because on a lot of websites people confuse block size and sector size :

  • A block is the smallest amount of data a file system can handle (very often 4096 bytes by default, for example for EXT4, but it can be changed during formatting). I believe in the Windows world that's called a cluster.
  • A sector is the smallest amount of data a drive can handle. Since circa 2010, all HDDs use 4096-byte sectors (i.e., the physical sector size is 4096 bytes). But to stay compatible with older OSes that can only handle HDDs with 512-byte sectors, modern drives still present themselves as HDDs with 512-byte sectors (i.e., their logical sector size is 512 bytes). The conversion between the logical 512 bytes seen by the OS and the physical 4096 bytes of the HDD is done by the HDD's firmware. Such drives are called Advanced Format HDDs (aka 512e/4Kn HDDs, e for emulated and n for native).
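
To make the 512e mapping more concrete, here is a minimal Python sketch of the arithmetic involved. It is only a simplified model under my own assumptions (the function names are made up for illustration, and real firmware also has caches that can hide part of the cost): a write that does not cover whole physical sectors forces the drive to read the physical sector, patch it, and write it back.

```python
LOGICAL = 512    # logical sector size in bytes, as reported by a 512e drive
PHYSICAL = 4096  # physical sector size in bytes


def physical_sectors_touched(lba, count, logical=LOGICAL, physical=PHYSICAL):
    """Return the range of physical sector indices covered by `count`
    logical sectors starting at logical block address `lba` (count > 0)."""
    start_byte = lba * logical
    end_byte = (lba + count) * logical  # exclusive upper bound
    first = start_byte // physical
    last = (end_byte - 1) // physical
    return range(first, last + 1)


def needs_read_modify_write(lba, count, logical=LOGICAL, physical=PHYSICAL):
    """True when the write covers at least one physical sector only partially,
    i.e. when its start or its length is not a multiple of the physical size."""
    start_byte = lba * logical
    length = count * logical
    return start_byte % physical != 0 or length % physical != 0


if __name__ == "__main__":
    # An aligned 4K filesystem block: 8 logical sectors starting at LBA 8.
    print(list(physical_sectors_touched(8, 8)), needs_read_modify_write(8, 8))
    # The same 4K of data, but misaligned by 4 logical sectors.
    print(list(physical_sectors_touched(4, 8)), needs_read_modify_write(4, 8))
```

In this model, a properly aligned 4K filesystem block maps onto exactly one physical sector, which is why I expected 512e and 4Kn to perform the same with a 4K block size.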

So, an out-of-the-box HDD presents itself with a logical sector size of 512 bytes, because the drive's manufacturer wants it to be recognized by all OSes, including old ones. But all modern OSes can handle native 4K drives (Linux has supported them since kernel 2.6.31, released in 2009). So a legitimate question is: if you know you won't ever use pre-2010 OSes, wouldn't it make sense, prior to using a new HDD, to change its logical sector size from 512 bytes to 4096 bytes?

Someone did a benchmark to find out if there are real benefits to this, and found out that there was a real difference only in one case : single-threaded R/W tests. In multi-threaded tests, he found no significant difference.

My question is: does this specific use case translate to real life? I.e., does Linux do a lot of single-threaded R/W operations? If so, setting the HDD's logical sector size to 4096 would bring some real benefits.

I still don't have the answer to this question. But I think another way to look at it is to say that, on modern OSes, it doesn't hurt to change a drive's default 512 bytes logical sector size to 4096 bytes : best case scenario you are getting some performance improvements if the OS does single-threaded R/W operations, and worst case scenario nothing changes.

Again, the only reason a drive uses 512 bytes logical sectors out-of-the-box is to stay compatible with older pre-2010 OSes. On modern OSes, setting it to 4096 bytes won't hurt.

One last thing to note is that not all HDDs support that change. As far as I know, those that do explicitly report their supported logical sector sizes:

    # hdparm -I /dev/sdX | grep 'Sector size:'
    Logical  Sector size:                   512 bytes [ Supported: 512 4096 ]
    Physical Sector size:                  4096 bytes
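
If you want to check this from a script, here is a small Python sketch that parses the two lines quoted above. The helper names are hypothetical, the sample text is simply the output shown above, and I am assuming that the `Supported:` list is absent on drives that cannot be switched (the exact hdparm formatting may also differ between versions).

```python
import re

# Sample text: the hdparm -I lines quoted above.
SAMPLE = """\
Logical  Sector size:                   512 bytes [ Supported: 512 4096 ]
Physical Sector size:                  4096 bytes
"""


def parse_sector_sizes(text):
    """Extract logical/physical sector sizes and the supported logical sizes."""
    info = {"logical": None, "physical": None, "supported": []}
    for line in text.splitlines():
        match = re.search(r"(Logical|Physical)\s+Sector size:\s+(\d+) bytes", line)
        if not match:
            continue
        info[match.group(1).lower()] = int(match.group(2))
        # The optional "[ Supported: ... ]" suffix lists the selectable sizes.
        supported = re.search(r"\[\s*Supported:\s*([\d\s]+)\]", line)
        if supported:
            info["supported"] = [int(size) for size in supported.group(1).split()]
    return info


def can_switch_to_4kn(info):
    """True if the drive advertises 4096 as a supported logical size
    and is not already using it."""
    return 4096 in info["supported"] and info["logical"] != 4096


if __name__ == "__main__":
    sizes = parse_sector_sizes(SAMPLE)
    print(sizes, can_switch_to_4kn(sizes))
```

In practice you would feed it the real output of `hdparm -I /dev/sdX` (run as root) instead of `SAMPLE`.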

The logical sector size can then be changed with hdparm as well, or with the manufacturer's proprietary tools.

[ EDIT ]

But there's a reason why changing the logical sector size from 512 to 4K may not be such a good idea. According to Wikipedia, the OS is not the only place where 512-byte-based code can live: applications are another potential area:

[Image: excerpt from the Wikipedia article]

So, does that mean that even with a modern OS supporting 4Kn, you can get into trouble if a specific application doesn't support it ?

In that case it probably makes more sense to keep the HDD's default 512e logical sector size, unless you can be absolutely sure that all your applications can handle 4Kn.

[ EDIT 2 ]

On second thought, there's probably no big risk in switching to 4K sectors on modern hardware and software. Most software works at the filesystem level, and programs that have direct raw block access (formatting tools, cloning tools, ...) will probably support 4K sectors, unless they're outdated. See also: Switching HDD sector size to 4096 bytes.


I am not well versed in file systems, but on reading your post I immediately asked myself how the benchmarks were executed, and whether the diagram shown favors the bigger block size simply because of the bus used to transfer the data to the disk.

Maybe some links I collected on my "read later" list are of use to you. Regarding multi-threaded I/O: I assume you need separate buses in order to truly transport the data to the disk(s!) in parallel.

Anyhow, I am green but maybe you like these reads:

Regarding benchmarking and workload of disks

Comparing two file systems.. My note was "good benchmarking"

I would also like to point you to the plain man pages if you become interested in specific file systems. The man pages describe the options, and at the bottom they list the kernel versions required for each feature. But judging from your question, this may already be obvious to you. Sorry for not answering.

  • >"if the diagram shown may be in favor of the bigger block size" Nope, the block size is the same in all tests (4K). It's the logical sector size that was tested (512e vs 4Kn). Blocks and sectors are different things... Commented Nov 13, 2023 at 18:36
  • If a disk has a physical block size of 4K and you are using 512-byte logical blocks, then writing a single logical sector means reading and writing a whole physical block; i.e., reading 8 logical sectors and then writing 8 sectors back (1 changed, 7 unchanged). In some circumstances this won't impact performance, especially when you are doing sequential operations with read and write cache. For other operations, it could make a huge difference. Commented Nov 14, 2023 at 5:27
  • Parallel operation on disks is less meaningful than you might think. It might mean writing 8 bits of one byte to 8 platters. It might mean transferring 8 bits over 8 wires and reassembling them on the disk end into one byte of memory in the controller cache. The real difference between buses is whether the bus can only do one operation at a time, or whether it can accept multiple pending operations into the controller cache at once, then (optionally and optimally) reorder them, and acknowledge them as they complete while it accepts more operations... Again, sometimes this may make a difference. Commented Nov 14, 2023 at 5:32
  • I suspect that 4k matters less for multi-threaded than it does for sequential vs. random reads. But for sufficiently large numbers of sequential reads, it starts looking random. Basically, by increasing block size, you are forcing more sequential read chunks. It doesn't matter whether a particular Linux process is multi-threaded or not; what matters is how many threads in total are doing disk I/O. Commented Nov 15, 2023 at 0:00
  • @Tomes: it's done. Commented Nov 17, 2023 at 9:16
