
I'm a bit confused about how the Linux kernel handles parallelism during I/O operations (if it handles it at all).

I assume it can concurrently operate on file descriptors, but does it achieve parallelism when reading files, network sockets, etc.? Or is it a suspend/resume task under the hood (async)?

What happens in these scenarios?

  1. Multiple threads reading the same file
  2. Multiple threads reading different files
  3. Multiple threads reading files and network sockets
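To make scenario 1 concrete, here is a minimal sketch (my own illustration, not from the thread) of several threads reading the same file concurrently. It uses `os.pread`, which takes an explicit offset, so the threads never touch the descriptor's shared seek position and the kernel is free to service the reads in parallel. The file name and chunk layout are made up for the demo.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Create a small test file: 4 chunks of 1 KiB each.
CHUNK = 1024
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    for i in range(4):
        f.write(bytes([i]) * CHUNK)

fd = os.open(path, os.O_RDONLY)

def read_chunk(i):
    # pread reads at an absolute offset, so no locking around a
    # shared file position is needed between the threads.
    return os.pread(fd, CHUNK, i * CHUNK)

# Issue all four reads from a pool of threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(read_chunk, range(4)))

os.close(fd)
os.unlink(path)

# Every thread saw exactly the chunk it asked for.
assert all(chunks[i] == bytes([i]) * CHUNK for i in range(4))
```

Whether the reads actually overlap at the device depends on the hardware and on page-cache hits, but nothing on the software side forces them to serialize.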
  • You might start with Wikipedia "Direct Memory Access". Concurrent hardware-assisted data transfers have been around for nearly half a century. All the kernel needs to do is set up the initial conditions, and process an interrupt when the DMA completes. Commented May 2, 2023 at 7:31
  • It gets even better than what Paul just said: modern storage device interfaces like SAS, SATA and NVMe have command queues and can and will respond out of order if a later request is quicker to answer than an earlier one - that means storage access serialization moves from the software side of things down to the hardware. All this makes parallelism the easier and default way of conceptualizing storage I/O. I don't understand your distinction with async, though, I must admit. You can do multiple parallel async requests. Commented May 2, 2023 at 7:41
  • access.redhat.com/documentation/en-us/red_hat_enterprise_linux/… ? Commented May 2, 2023 at 10:19
  • @Paul_Pedant How does the kernel set up these conditions? My question originated from a discussion with a coworker about whether a program would benefit from parallelism when performing I/O operations. I assume it boils down to the hardware bandwidth, but I still would like to understand how the kernel handles that. Commented May 2, 2023 at 14:51
  • The DMA units (and more modern equivalents) are hardware devices, either integrated on the CPU chips or separate. They are set up with parameters written into their registers, and they can steal unused data bus cycles (for example while the general-purpose CPUs are accessing CPU cache). They will have firmware loaded at boot time. Think of them as extra CPUs, but with very limited and targeted abilities, as if working under supervision. None of this is new (apart from Moore's Law): my first mainframe (1969, ICL-1901A) had multi-processing and autonomous parallel I/O channels. Commented May 3, 2023 at 10:14
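The "multiple parallel async requests" point from the comments can be sketched as follows. This is my own illustration under a stated assumption: CPython's standard library has no io_uring or POSIX AIO binding, so `asyncio` here dispatches blocking `os.pread` calls to a thread pool as a stand-in for true kernel async submission; the program launches all reads before awaiting any of them, and the kernel may overlap the underlying accesses.

```python
import asyncio
import os
import tempfile

CHUNK = 512

async def main():
    # Build a throwaway file of 3 distinct chunks.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        for i in range(3):
            f.write(bytes([i]) * CHUNK)

    fd = os.open(path, os.O_RDONLY)
    loop = asyncio.get_running_loop()
    # Submit all three reads before awaiting any of them; each runs
    # as an independent pread in the default thread pool.
    reads = [loop.run_in_executor(None, os.pread, fd, CHUNK, i * CHUNK)
             for i in range(3)]
    chunks = await asyncio.gather(*reads)
    os.close(fd)
    os.unlink(path)
    return chunks

chunks = asyncio.run(main())
assert [c[:1] for c in chunks] == [b"\x00", b"\x01", b"\x02"]
```

The distinction the comment questions largely dissolves here: "async" describes how the program submits and waits for requests, while parallelism (or not) happens below, in the kernel and the device's command queue.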
