I’m trying to understand how operating systems load a program into memory from a hardware perspective.

I know that DMA (Direct Memory Access) is used for I/O operations, allowing data transfer between a device (like a disk) and memory without CPU involvement. What I’m confused about is whether DMA is involved specifically in the process of loading a program (i.e., turning an executable on disk into a process in memory).

Here’s what I think so far:

• The OS (via the loader) initiates the loading of a program.

• The loader fetches the binary data from disk and places it into memory.

• Traditionally, the CPU could read data from a disk controller and then copy it into memory itself (programmed I/O).

• With DMA, this transfer could be done directly between the disk and memory, which should make the process more efficient.

However, I couldn’t find any explicit source confirming that DMA is used during process loading.

My question:

Is DMA actually used when the OS loads a program from disk into memory (i.e., during the process creation/loading phase)? Or does the OS still rely on CPU-based I/O for this task?

If possible, I’d also appreciate any OS documentation or academic sources that clarify this.

Thanks in advance!

  • For most OSes there is no difference between loading a program from a device into memory and loading any other data from a device into memory (the OS might reuse an already loaded program/library between several processes, but that is not really relevant to I/O). Virtually all device I/O is DMA-based, and there's virtually no PIO done nowadays. Commented Apr 9 at 5:14
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Commented Apr 9 at 10:26

1 Answer


Yes, DMA is used for this purpose. As Andrey Turkin notes in his comment, reading a program off the disk and into memory is no different than how any other kind of data is read. Rather than view DMA as a mechanism specific to loading programs, you should instead view it as a general mechanism for transferring any kind of data.

When a process is launched on an operating system, the OS creates a new address space for the process and loads the program binary into memory. The program binary must be in memory so that the CPU can read and execute the program's assembly instructions.

To initiate this load, the OS uses DMA. DMA allows attached devices (such as the disk) to transfer data (1) to/from memory and (2) to/from other devices as applicable (such as a network card). There is a piece of hardware called the DMA controller (also called the DMA engine) that manages the transfer of data. The OS tells the DMA controller that it wants to transfer the program binary from the disk to memory. The DMA controller then manages this transfer itself and notifies the OS when the transfer is complete. At this point, the program binary is in memory and so the OS starts executing the program.

As you mentioned, there is another way to transfer data between an attached device and the CPU, called programmed I/O (PIO). PIO is implemented in a couple of different ways, but as one example, a disk could expose its contents to the CPU through a range of addresses that the CPU accesses with normal load/store instructions. For example, if the disk exposes addresses 0x100 through 0x200, the CPU could read a file starting in the middle of that range by issuing a load to an address like 0x180 and then further loads up to the address where the file ends (say, 0x194). As the CPU reads bytes from the disk, it writes them to another range of addresses backed by memory. In other words, with PIO the CPU itself copies the data from the disk to memory.

Both DMA and PIO allow an OS to read a file (such as a program binary) from disk, but they have different tradeoffs:

  1. DMA is better for large transfers of data. The CPU just needs to tell the DMA controller the starting point and length of the data to transfer from disk, and then the DMA controller manages the entire transfer itself. While this transfer is taking place, the CPU can then go on and do other things, such as run other applications.

  2. PIO is better for low-latency transfers of small amounts of data. With PIO, the CPU can directly load/store a byte on a device without needing to coordinate with the DMA controller. By avoiding this coordination, the byte can be transferred with lower latency. However, if the CPU wants to transfer a lot of data (e.g., KiBs, MiBs, or GiBs), it would need to issue a long series of loads/stores. While the CPU issues these instructions, it cannot run other applications. So this is just a waste of CPU time, since the CPU could instead coordinate with the DMA controller once and then have the DMA controller manage the entire transfer instead.

Program binaries are typically big enough (KiBs to GiBs) that DMA is better for loading them from disk to memory.

