
Why does the kernel also issue reads when I write to a raw hard disk (one without a filesystem)?

    $ sudo dd if=/dev/zero of=/dev/sda bs=32k count=1 oflag=direct status=none
    $ iostat -xc 1 /dev/sda | grep -E "Device|sda"
    Device    r/s   w/s    rkB/s  wkB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm   %util
    sda     45,54  0,99  1053,47  31,68    0,00    0,00   0,00   0,00     1,17  3071,00    3,04     23,13     32,00  66,38  308,91

Is it readahead?
Instead of dd, I wrote a C program that does the same thing. I even used posix_fadvise to hint to the kernel that I do not want readahead.

    #define _GNU_SOURCE /* must come before any include so that O_DIRECT is declared */
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCKSIZE 512
    #define COUNT 32768

    int main(void)
    {
        /* Read COUNT bytes from /dev/zero into a BLOCKSIZE-aligned buffer. */
        int fd = open("/dev/zero", O_RDONLY);
        if (fd < 0) {
            perror("open /dev/zero");
            exit(1);
        }

        void *pbuf;
        int err = posix_memalign(&pbuf, BLOCKSIZE, COUNT);
        if (err != 0) {
            fprintf(stderr, "posix_memalign: %s\n", strerror(err));
            exit(1);
        }

        ssize_t ret = read(fd, pbuf, COUNT);
        if (ret < 0) {
            perror("read error");
            exit(1);
        }
        close(fd);

        /* Write COUNT bytes to /dev/sda, bypassing the page cache. */
        int f = open("/dev/sda", O_WRONLY | O_DIRECT);
        if (f < 0) {
            perror("open /dev/sda");
            exit(1);
        }

        /* posix_fadvise returns the error number directly; it does not set errno. */
        err = posix_fadvise(f, 0, COUNT, POSIX_FADV_NOREUSE);
        if (err != 0)
            fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

        ret = write(f, pbuf, COUNT);
        if (ret < 0) {
            perror("write error");
            exit(1);
        }

        close(f);
        free(pbuf);
        return 0;
    }

But the result is the same:

    $ iostat -xc 1 /dev/sda | grep -E "Device|sda"
    Device    r/s   w/s    rkB/s  wkB/s  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
    sda     46,00  1,00  1064,00  32,00    10,78     1,00    0,43     23,13     32,00  10,55  49,60

It does not matter whether it is a spinning disk or an SSD; the result is the same.
I also tried different kernels.

    It could be actual reads, for example devices are scanned for new UUIDs, and if you dd a filesystem header instead of zeroes, it will magically appear in /dev/disk/by-uuid/... Commented Jun 18, 2024 at 11:09

1 Answer

I found an article that mentions phantom reads. It also mentions the tools blktrace and blkparse.

I have launched dd and the tools.

    sudo dd if=/dev/zero of=/dev/sda bs=4M count=30 oflag=direct status=progress
    sudo blktrace -d /dev/sda -a read -o - | blkparse -i -

It showed lots of rows with

    8,0 3 1 1266874889.708944253 25480 D N 0 [systemd-udevd]
    8,0 1 1511 0.055573485 25510 I RA 468861824 + 8 [(udev-worker)]

So my assumption was that the reads were initiated by udev: after my write to the raw disk finished, udev checked whether a new partition (or anything else) had appeared, so that it could create the corresponding "/dev/N" node.

I temporarily disabled systemd-udevd and reran the dd write. This time the "phantom" reads were gone.

I started systemd-udevd again and launched a long write with dd. The "phantom" reads appeared again, but only at the very end. So it is definitely udev.

    Device     r/s    rkB/s  rrqm/s  %rrqm  r_await  rareq-sz     w/s      wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz   %util
    sda       0.00     0.00    0.00   0.00     0.00      0.00    2.00    8192.00    0.00   0.00    10.50   4096.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.02    3.00

    Device     r/s    rkB/s  rrqm/s  %rrqm  r_await  rareq-sz     w/s      wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz   %util
    sda       0.00     0.00    0.00   0.00     0.00      0.00  109.00  446464.00    0.00   0.00     8.07   4096.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.88  100.00

    Device     r/s    rkB/s  rrqm/s  %rrqm  r_await  rareq-sz     w/s      wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz   %util
    sda       0.00     0.00    0.00   0.00     0.00      0.00  109.00  446464.00    0.00   0.00     8.04   4096.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.88  100.40

    Device     r/s    rkB/s  rrqm/s  %rrqm  r_await  rareq-sz     w/s      wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz   %util
    sda     267.00  1068.00    0.00   0.00     0.07      4.00   80.00  327680.00    0.00   0.00     8.03   4096.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.66   76.00

I think it makes sense: when the disk stops processing writes from a user process, udev checks whether a new partition or filesystem has appeared on it.
