
I understand that when I write to a file on my hard drive, a kernel thread (pdflush in kernels before 2.6.32) will at some point actually flush that data out to disk. I have been reading about networking in "Understanding the Linux Kernel, Second Edition" (sorry, there is no ebook I can link to) and it seems to suggest that from calling send(), we go right down into the kernel, ending with the data being placed on the network card's outgoing queue.

There is no mention of any other threads.

Can someone clarify that I have not misread or misunderstood, and that for each send() call I make, the thread that makes the call in my process goes right through to the point of having the kernel place it on the NIC's TX queue?

If this is the case, I am confused as to how asynchronous sends work; or does "asynchronous" here simply mean we get notification at a later point that the send has happened?
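For concreteness, here is roughly the pattern I have in mind when I say "asynchronous" (a minimal sketch in C; it assumes a connected TCP socket that has already been made non-blocking with fcntl()):

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Try to hand `len` bytes to the kernel's socket send buffer.
 * Returns the number of bytes accepted, or -1 if the buffer is
 * currently full (EAGAIN/EWOULDBLOCK), in which case we should wait
 * for an EPOLLOUT/POLLOUT readiness notification before retrying. */
ssize_t try_send(int sockfd, const void *buf, size_t len)
{
    ssize_t n = send(sockfd, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -1;  /* kernel buffer full: not an error, just "later" */
    return n;
}
```

In other words: is the "asynchronous" part only the readiness notification, while the copy into the kernel still happens on my calling thread?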

Confused.

(I previously asked this on normal SO, but was advised this was the better area for it.)

  • Not too much of an expert, but I think the situation with NICs is more complicated, because they can proactively send data and handle their RX queues themselves; and then there's the whole business of "to interrupt-coalesce or not to interrupt-coalesce", which changes how the kernel learns when the card is ready to send more data, with modern cards even switching between modes adaptively. Commented Nov 22, 2024 at 22:05

1 Answer


You're describing the technique of writing data to a device with the data being cached in memory before it's delivered to the device. In your two examples, the devices are a "hard disk" (which nowadays may actually be an SSD) and a network interface (usually wired Ethernet or Wi-Fi).

As the books/websites you have been reading describe, writes to a hard drive are cached in memory, and a background thread moves the data from the cache to the physical disk. Writes to a network device are not cached to the same extent (there is a cache, but it's much smaller). Why is this?

Hard drives are slow. This answer in a Redis mailing list thread about memory/disk speed illustrates the difference between the speeds of memory, SSD, and HD. Programs live in memory, and when they work with code, variables, and data structures in memory, they run at memory speeds. But when a program must write to disk, the slower speed of the device, without buffering, brings the program to a screeching halt. As the comparison in that mailing list answer shows, it's the difference between waiting about a second (writing to memory) and waiting several days (writing to HD) before you can do anything else. SSD is better, but still imposes waits that will cripple any program. And it's not just writes to disk; reads suffer just as badly.
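To get a feel for that scale, here is a small rescaling exercise (the latency figures below are rough ballpark assumptions of my own, not taken from the linked thread): if we stretch time so that one RAM access takes one "human" second, the device latencies become:

```c
#include <stdio.h>

int main(void)
{
    /* Ballpark latencies (assumptions): RAM access ~100 ns,
     * SSD write ~100 us, HDD seek + write ~10 ms. */
    const double ram_ns = 100.0, ssd_ns = 100e3, hdd_ns = 10e6;

    /* Rescale so one RAM access takes one "human" second. */
    const double scale = 1.0 / ram_ns;

    printf("RAM: %.0f s, SSD: ~%.0f min, HDD: ~%.1f days\n",
           ram_ns * scale,             /* 1 second    */
           ssd_ns * scale / 60.0,      /* ~17 minutes */
           hdd_ns * scale / 86400.0);  /* ~1.2 days   */
    return 0;
}
```

The exact numbers vary by hardware, but the orders of magnitude are what matter.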

It isn't tolerable to slow programs down so much, so for decades now operating systems have placed a memory buffer (cache) between disk and programs. Writes go into the memory buffer quickly, allowing the program to continue its work. Kernel thread(s) flush the buffered data to disk at the disk's slower rate, but this is done in the background where it won't hurt performance. When reading from disk, the operating system tries to read extra data into the read cache ahead of time (read-ahead) so the program that requests the data will wait less. It's not perfect, and when a program requests disk data that's not in the cache, it has to wait for it.
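You can see this split between "accepted into the cache" and "actually on disk" directly. A minimal sketch, assuming a local filesystem: write() typically returns as soon as the data is in the page cache, while fsync() blocks until the device reports the data durable, exposing the latency the cache normally hides.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char msg[] = "hello, page cache\n";

    /* Returns as soon as the bytes are in the page cache; a kernel
     * flusher thread writes them to the device later. */
    if (write(fd, msg, sizeof msg - 1) < 0)
        perror("write");

    /* Blocks until the device reports the data durable. */
    if (fsync(fd) < 0)
        perror("fsync");

    close(fd);
    return 0;
}
```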

With network devices there is also data buffering to help programs perform well, but the buffering is smaller and simpler: each socket gets its own send buffer, typically kilobytes to a few megabytes, whereas the page cache can grow to fill most of free RAM. Books and websites don't go into the same amount of detail as they do with disk buffering.
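You can inspect (and tune) the per-socket buffer with getsockopt()/setsockopt(). A minimal sketch, assuming sockfd is an already-created TCP socket:

```c
#include <stdio.h>
#include <sys/socket.h>

/* Print the kernel's send-buffer size for a socket. Note that on
 * Linux, getsockopt(SO_SNDBUF) reports double the value set with
 * setsockopt(), to account for bookkeeping overhead. */
void print_sndbuf(int sockfd)
{
    int sndbuf = 0;
    socklen_t len = sizeof sndbuf;

    if (getsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) == 0)
        printf("SO_SNDBUF: %d bytes\n", sndbuf);
    else
        perror("getsockopt");
}
```

When this buffer fills, a blocking send() sleeps and a non-blocking one returns EAGAIN; that is the network-side analogue of a program stalling on a full disk cache.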

  • Thanks. The missing info was located here. Commented Nov 24, 2024 at 16:47
