CUDA (GPGPU) programmers are taught early on to measure their programs' bandwidth to see if they come close to the theoretical maximum (which can be looked up in the GPU spec or measured using standard utilities).

This is obviously useful for programs that are memory bandwidth-bound.

Is there a way to calculate the theoretically maximal CPU-RAM bandwidth instead (considering that typical setups are multi-DIMM and multi-core, and sometimes multi-socket)?

1 Answer

You can find some relevant information on ark.intel.com. For example, for the Core i7-6700K the site lists a maximum memory bandwidth of 34.1 GB/s, which is 8 (bytes per transfer, i.e. a 64-bit bus) * 2 (memory channels) * 2.133 GT/s (transfer rate of the RAM module) = 34.128 GB/s.

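For more recent processors, not all of this information is listed, but you can work it out from the number of channels and the memory speed: the Core i9-9980XE, with four channels of DDR4-2666, has 8 * 4 * 2.666 = 85.312 GB/s peak bandwidth.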

This assumes that the fastest supported DDR4 modules are installed and that every memory channel is populated.
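The arithmetic above can be written as a small helper. This is only a sketch: the function name and parameters are made up for illustration, the 64-bit bus width per channel is the usual DDR4 value, and the `sockets` multiplier assumes each socket has its own identically populated memory controller, which the figures above do not cover.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float,
                        channels: int,
                        bus_width_bits: int = 64,
                        sockets: int = 1) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10^9 bytes).

    transfer_rate_mt_s: memory speed in megatransfers per second (e.g. 2133).
    channels:           populated memory channels per socket.
    bus_width_bits:     data bus width per channel (64 for standard DDR4).
    sockets:            ASSUMPTION: identical sockets whose bandwidth simply adds up.
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = bytes_per_transfer * channels * sockets * transfer_rate_mt_s * 1e6
    return bytes_per_second / 1e9


# The two processors discussed above.
print(f"Core i7-6700K:   {peak_bandwidth_gb_s(2133, 2):.1f} GB/s")
print(f"Core i9-9980XE:  {peak_bandwidth_gb_s(2666, 4):.1f} GB/s")
```

On a multi-socket machine the aggregate figure is only reachable if each thread reads memory attached to its own socket; remote accesses are additionally limited by the inter-socket link.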

Piriform's Speccy tool reports this hardware information in its RAM section.

In this case, DDR4-2133 (2133 MT/s) running in two channels gives 34.1 GB/s.

2 Comments

My experience has been that you need multiple threads on different cores to get anywhere close to the peak bandwidth. Depending on the CPU vendor, single core bandwidth may range from okay to very low.
The factor of 8 in the formula must be bytes per transfer, not bits: taken as 8 bits it would give only 34 Gbit/s. In bits, the calculation is 64 bits (bus width) * 2 channels * 2133 MT/s = 273 Gbit/s = 34 GB/s.
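To compare a measured figure against the theoretical peak, a rough single-threaded copy test can be sketched as below. The function name, buffer size, and repeat count are arbitrary choices for illustration, and the result only approximates what one core achieves: as the first comment notes, expect it to fall short of the peak, and a serious measurement would use several threads and a dedicated tool such as a STREAM-style benchmark.

```python
import time


def copy_bandwidth_gb_s(n_bytes: int = 256 * 1024 * 1024, repeats: int = 5) -> float:
    """Best-of-`repeats` single-threaded copy bandwidth in GB/s (1 GB = 10^9 bytes)."""
    # Fill the source with non-zero bytes so every page is really backed by RAM;
    # a freshly zeroed buffer may be served lazily by the OS.
    src = bytearray(b"\x01") * n_bytes
    dst = bytearray(n_bytes)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one bulk copy: reads n_bytes and writes n_bytes
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # the first pass also pays for page faults on dst

    # Count both the read and the write traffic.
    return 2 * n_bytes / best / 1e9


if __name__ == "__main__":
    print(f"single-threaded copy: {copy_bandwidth_gb_s():.1f} GB/s")
```

The buffers should be much larger than the last-level cache; otherwise the number reflects cache bandwidth rather than RAM bandwidth.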
