CUDA (GPGPU) programmers are taught early on to measure the memory bandwidth their programs achieve and to compare it with the theoretical maximum (which can be looked up in the GPU spec or measured using standard utilities).
This is obviously useful for programs that are memory bandwidth-bound.
Is there a similar way to calculate the theoretical maximum CPU-RAM bandwidth (considering that typical setups are multi-DIMM and multi-core, and sometimes multi-socket)?
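
For concreteness, the kind of back-of-envelope figure I have in mind is the usual "transfer rate × bus width × channels × sockets" estimate, sketched below. The function name and parameters are my own, and the sketch assumes a 64-bit data bus per memory channel, that all channels are populated, and that extra DIMMs sharing a channel add capacity rather than bandwidth. It says nothing about what a single core can actually sustain.

```python
def peak_dram_bandwidth_gbs(transfer_rate_mts, channels_per_socket,
                            sockets=1, bus_width_bits=64):
    """Rough theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumptions (not measured, just the textbook estimate):
      - transfer_rate_mts is the DDR data rate in MT/s (e.g. 3200 for DDR4-3200)
      - each channel carries bus_width_bits of data per transfer (64 by default)
      - every channel on every socket is populated and runs at the same rate
      - multiple DIMMs on one channel do not increase bandwidth
    """
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_second = (transfer_rate_mts * 1e6 * bytes_per_transfer
                        * channels_per_socket * sockets)
    return bytes_per_second / 1e9


if __name__ == "__main__":
    # Hypothetical desktop: DDR4-3200, dual channel, one socket.
    print(peak_dram_bandwidth_gbs(3200, channels_per_socket=2))
    # Hypothetical server: DDR4-3200, 8 channels per socket, two sockets.
    print(peak_dram_bandwidth_gbs(3200, channels_per_socket=8, sockets=2))
```

On a multi-socket machine the summed figure would only apply when every thread reads from memory attached to its own socket; remote accesses go over the inter-socket link, which this estimate ignores.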
